1. Workflow Selection
1). Raw data <.idat>
If the data you want to upload are the raw intensity data files (*.idat) from a microarray, select this workflow.
2). Beta value matrix <.csv>
If the data you want to upload is a methylation beta value matrix, in which each value is the ratio of the methylated signal intensity to the total (methylated plus unmethylated) intensity at a probe, select this workflow.
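For orientation, a beta value is commonly computed from the methylated (M) and unmethylated (U) intensities of a probe as M / (M + U + offset). The sketch below assumes that convention with an offset of 100; the function name and the default offset are illustrative and are not taken from this platform.

```python
def beta_value(methylated: float, unmethylated: float, offset: float = 100.0) -> float:
    """Return the methylation beta value for one probe.

    Assumption: beta = M / (M + U + offset), with negative intensities
    clipped to zero and offset > 0. The default offset of 100 is a common
    convention, not a setting documented for this platform.
    """
    m = max(methylated, 0.0)
    u = max(unmethylated, 0.0)
    return m / (m + u + offset)


if __name__ == "__main__":
    # A probe with a strong methylated signal and no unmethylated signal
    # gets a beta value close to 1.
    print(beta_value(900.0, 0.0))
```

Values produced this way lie between 0 (unmethylated) and 1 (fully methylated), which is the range expected in the uploaded matrix.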
2. Customized Parameter
· Project Information
Project name: Give your analysis a name. It will appear as the “Job Name” in “My Jobs - Results”.
· Data
1. Raw data <.idat>
In the current version, you need to add each sample one by one.
1). "Data From": "DRS", "Local", "Web"
Location of the data to be analyzed.
Web = data uploaded temporarily, for this analysis only;
DRS = data already stored in the Data Repository Storage;
Local = data already stored on HPC@SJTU.
2). "File Name"
Filename of the data to be analyzed (_Grn.idat).
The files for each sample consist of a Red and a Grn (Green) IDAT file. Enter one of the two file names here. Note: include the file suffix.
3). "File Name 2"
Filename of the data to be analyzed (_Red.idat).
The files for each sample consist of a Red and a Grn (Green) IDAT file. Enter the other file name here. Note: include the file suffix.
4). "Sample Group": "Case", "Control"
Group of each sample.
5). "Sample ID"
A custom name for each sample. It appears in the figures and result tables as the sample identifier.
6). "Data ID"
If "Local" or "Web" is selected in "Data From", fill in the corresponding ID here.
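The six fields above can be checked before submitting. The following is a minimal sketch of such a check, not the platform's own validation: the `IdatSample` class and its `problems` method are hypothetical, and it assumes that "File Name" holds the `_Grn.idat` file, that "File Name 2" holds the `_Red.idat` file, and that both share the same prefix.

```python
from dataclasses import dataclass

DATA_SOURCES = {"DRS", "Local", "Web"}
SAMPLE_GROUPS = {"Case", "Control"}


@dataclass
class IdatSample:
    """One sample entry of the "Data" form (hypothetical model)."""
    data_from: str
    file_name: str      # assumed: the *_Grn.idat file
    file_name_2: str    # assumed: the *_Red.idat file
    sample_group: str
    sample_id: str
    data_id: str = ""

    def problems(self) -> list[str]:
        """Return a list of human-readable problems; empty if none found."""
        found = []
        if self.data_from not in DATA_SOURCES:
            found.append('"Data From" must be DRS, Local or Web')
        grn_ok = self.file_name.endswith("_Grn.idat")
        red_ok = self.file_name_2.endswith("_Red.idat")
        if not grn_ok:
            found.append('"File Name" should end with _Grn.idat')
        if not red_ok:
            found.append('"File Name 2" should end with _Red.idat')
        # Assumption: both channel files of one sample share a prefix.
        if grn_ok and red_ok:
            grn_prefix = self.file_name[: -len("_Grn.idat")]
            red_prefix = self.file_name_2[: -len("_Red.idat")]
            if grn_prefix != red_prefix:
                found.append("Grn and Red files belong to different samples")
        if self.sample_group not in SAMPLE_GROUPS:
            found.append('"Sample Group" must be Case or Control')
        if not self.sample_id.strip():
            found.append('"Sample ID" is empty')
        # Rule stated in item 6): an ID is needed for Local and Web data.
        if self.data_from in {"Local", "Web"} and not self.data_id.strip():
            found.append('"Data ID" is required for Local or Web data')
        return found
```

An entry whose `problems()` list is empty satisfies the rules described in items 1) to 6); the file names used to try it out are up to you.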
An Example of Customized Parameter (.idat):
2. Beta value matrix <.csv>
In the current version, you need to provide two CSV files: the beta value matrix and the phenotype data. Add these two files one by one under "Data".
1). "Data From": "DRS", "Local", "Web"
Location of the data to be analyzed.
Web = data uploaded temporarily, for this analysis only;
DRS = data already stored in the Data Repository Storage;
Local = data already stored on HPC@SJTU.
2). "File Name"
Filename of the data to be analyzed. Note: include the file suffix.
3). "Data ID"
If "Local" or "Web" is selected in "Data From", fill in the corresponding ID here.
An Example of Customized Parameter (.csv):
· Script
1). "Core"
Number of cores requested for your task.
· Other
1). "File Format"
The workflow that was selected.
idat = Raw data <.idat>
matrix = Beta value matrix <.csv>
2). "Array Type"
450k = Infinium HumanMethylation450 BeadChip – Illumina
EPIC = Infinium HumanMethylationEPIC BeadChip – Illumina (850K)
3). "Normalization Method" (click for details)
The normalization method applied to the data, chosen from the available options.
4). "Adjust Method"
Multiple-testing correction method applied to the p-values.
5). "DMR Method" (click for details)
The method used to identify differentially methylated regions (DMRs), i.e. regions in which the methylation profile deviates from its baseline value.
6). "DMP p-value"
The significance threshold (p-value cutoff) for probes to be included in DMPs.
7). "DMR p-value"
The significance threshold (p-value cutoff) for probes to be included in DMRs.
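Items 4), 6) and 7) work together: p-values are corrected for multiple testing and then compared with a cutoff. As an illustration only, the sketch below uses the Benjamini-Hochberg procedure, assumed here because it is a common choice; the methods actually offered under "Adjust Method" are not listed in this guide. The function names and the cutoff used in the example are hypothetical, and applying the cutoff to adjusted p-values is likewise an assumption.

```python
def bh_adjust(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(pvalues)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values stay
    # monotone: each one is capped by the adjusted value ranked above it.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


def passing(pvalues, cutoff):
    """Return the positions whose adjusted p-value is below the cutoff.

    Assumption: the threshold is applied to adjusted p-values.
    """
    return [i for i, p in enumerate(bh_adjust(pvalues)) if p < cutoff]


if __name__ == "__main__":
    raw = [0.01, 0.04, 0.03, 0.005]
    print(bh_adjust(raw))
    print(passing(raw, cutoff=0.03))
```

A stricter cutoff yields fewer reported probes; a more conservative correction method has the same effect at a fixed cutoff.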
3. Upload Data
You need to compress the files entered under “Data” into a supported format, according to the following requirements.
a. If the data to be analyzed are not stored on our platform and are uploaded only temporarily (Web) for analysis, compress all the data to be analyzed into a single file. Supported compression formats: rar, zip, gz, tar.gz.
b. If the data to be analyzed are already stored on our platform, either in the Data Repository Storage (DRS) or on HPC@SJTU through the administrator (Local), upload any compressed file, whatever its content, in one of these formats: rar, zip, gz, tar.gz.
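For case a, all files have to end up in one archive. A minimal sketch using Python's standard `zipfile` module is shown below; the helper name `pack_for_upload` is hypothetical, and storing the files flat (without folders) is an assumption, since the guide does not state the expected layout inside the archive.

```python
import zipfile
from pathlib import Path


def pack_for_upload(files, archive_path):
    """Write the given files into a single zip archive and return its path.

    Assumption: files are stored flat, under their own file names.
    """
    archive_path = Path(archive_path)
    seen = set()
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for file in map(Path, files):
            if not file.is_file():
                raise FileNotFoundError(file)
            if file.name in seen:
                # Two files with the same name would collide in a flat archive.
                raise ValueError(f"duplicate file name: {file.name}")
            seen.add(file.name)
            archive.write(file, arcname=file.name)
    return archive_path
```

The names stored in the archive are the plain file names, so they match what was typed into "File Name" as long as the suffix is included there as well.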
An Example of Input Data (.idat):
The input data are IDAT files, representing two different color channels (red and green) prior to normalization. The basename of each IDAT file consists of three parts: slide (Sentrix_ID), array (Sentrix_Position), and channel.
NOTE: The format of the input files must strictly follow the example data.
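The three-part basename described above can be checked with a short script. This is a sketch under assumptions: it expects names of the form `<slide>_<array>_<channel>.idat`, with a numeric slide ID, an array position such as `R01C01`, and a channel of `Grn` or `Red`. The pattern and the file name in the example are illustrative, not taken from the platform.

```python
import re

# Assumed layout: <Sentrix_ID>_<Sentrix_Position>_<Grn|Red>.idat
IDAT_NAME = re.compile(
    r"^(?P<slide>\d+)_(?P<array>R\d{2}C\d{2})_(?P<channel>Grn|Red)\.idat$"
)


def parse_idat_name(file_name):
    """Split an IDAT file name into slide, array and channel.

    Raises ValueError if the name does not follow the assumed layout.
    """
    match = IDAT_NAME.match(file_name)
    if match is None:
        raise ValueError(f"unexpected IDAT file name: {file_name!r}")
    return match.groupdict()


if __name__ == "__main__":
    # Hypothetical file name used only to demonstrate the parsing.
    print(parse_idat_name("200134080018_R01C01_Grn.idat"))
```

Two files form a valid pair for one sample when their slide and array parts are equal and their channels differ.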
An Example of Input Data (.csv):
① beta_matrix.csv
row = cg ID;
column = samples;
no missing values
② pheno_data.csv
column 1 = “Sample_Name”;
column 2 = “Sample_Group” (at least 2 groups);
no missing values
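The requirements for the two CSV files can be verified locally before uploading. The sketch below is one possible check, not the platform's own validation. It assumes the first row of beta_matrix.csv is a header with the sample names, the first column holds the cg IDs, and an empty cell, `NA` or `NaN` counts as missing; the function names are hypothetical.

```python
import csv
import io

MISSING = {"", "NA", "NaN"}  # assumed markers for a missing value


def check_beta_matrix(handle):
    """Return (sample_names, problems) for an open beta matrix CSV."""
    rows = list(csv.reader(handle))
    if len(rows) < 2:
        return [], ["file has no data rows"]
    header, data = rows[0], rows[1:]
    samples = header[1:]
    problems = []
    for line_no, row in enumerate(data, start=2):
        if len(row) != len(header):
            problems.append(f"line {line_no}: expected {len(header)} cells, found {len(row)}")
            continue
        for sample, cell in zip(samples, row[1:]):
            if cell.strip() in MISSING:
                problems.append(f"line {line_no}: missing value for {sample}")
    return samples, problems


def check_pheno(handle):
    """Return a list of problems for an open phenotype CSV."""
    rows = list(csv.reader(handle))
    if not rows or rows[0][:2] != ["Sample_Name", "Sample_Group"]:
        return ['first two columns must be "Sample_Name" and "Sample_Group"']
    problems = []
    groups = set()
    for line_no, row in enumerate(rows[1:], start=2):
        if len(row) < 2 or any(cell.strip() in MISSING for cell in row[:2]):
            problems.append(f"line {line_no}: missing value")
            continue
        groups.add(row[1].strip())
    if len(groups) < 2:
        problems.append("Sample_Group needs at least 2 groups")
    return problems


if __name__ == "__main__":
    # Small in-memory example with made-up values.
    demo = io.StringIO("ID,S1,S2\ncg00000029,0.45,0.81\n")
    print(check_beta_matrix(demo))
```

With real files, open them with `open(path, newline="")` and pass the file object to the same functions.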