Happy April Fools' Day.
Import data and set options
hESCs vs hiPSCs
Path to the annotation file
Tell the software which platform the methylation data come from
Path to the methylation data (450K: idat files)
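Select an option profile, i.e. a set of preset values for data processing, so that the analysis runs in a reasonable amount of time.
Choose a name for the analysis
Choose which region types to analyze methylation levels for
Choose the column used as identifiers.column; each value in identifiers.column corresponds to one specific sample
Choose a color scheme
Check the option to remove probes on the sex chromosomes
Check the BMIQ normalization method
Choose the groups to compare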
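Quality control
Preprocessing
Normalization here is a different concept from normalization in methylKit: here it is applied across all sites within one sample, whereas in methylKit it is applied to coverage.
In addition, methylKit corrects for batch effects.
Intermediate file formats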
Track Hub data was generated for export to various genome browsers. Note that you need a server that is capable of serving the tracks to the genome browser via URL. Below, instructions are provided to view the tracks in the UCSC genome browser. Of course the files can also be viewed in other browsers such as the Ensembl genome browser.
Note: RnBeads generates the Track Hub data so that the results can be exported to various genome browsers.
Convert the bed/bedGraph files contained in the bed/bedGraph directories (see table below) attached to this report. You can use the UCSC tool bedToBigBed/bedGraphToBigWig for this task. More information (e.g. where to obtain the tool) on how to convert can be found here. Make sure the resulting bigBed/bigWig files are moved to the corresponding UCSC track hub directory (see table(s) below).
Note: the UCSC tools convert bed to bigBed and bedGraph to bigWig.
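The conversion step can be sketched as a small Python helper that only builds the command line for each file. The file names, the chrom.sizes file and the output directory below are hypothetical placeholders; the argument order (input, chrom.sizes, output) follows the usual usage of the UCSC tools, and the input files are assumed to be sorted by chromosome and start position already.

```python
from pathlib import Path


def conversion_command(src, chrom_sizes, out_dir):
    """Return the argv list that converts one bed/bedGraph file.

    bed      -> bigBed via bedToBigBed
    bedGraph -> bigWig via bedGraphToBigWig
    """
    src = Path(src)
    suffix = src.suffix.lower()
    if suffix == ".bed":
        tool, new_ext = "bedToBigBed", ".bb"
    elif suffix == ".bedgraph":
        tool, new_ext = "bedGraphToBigWig", ".bw"
    else:
        raise ValueError(f"unsupported input type: {src.name}")
    # Put the converted file into the track hub directory (hypothetical path).
    dest = Path(out_dir) / (src.stem + new_ext)
    return [tool, str(src), str(chrom_sizes), str(dest)]


# Hypothetical file names, for illustration only.
cmd = conversion_command("sample1.bedGraph", "hg19.chrom.sizes", "trackhub/hg19")
print(" ".join(cmd))
```

The returned list can be handed to `subprocess.run(cmd, check=True)` once the UCSC binaries are on the PATH.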
Bed files for each sample contain the locations of methylation sites and regions, together with the methylation level (in the score column).
Exploratory analysis
Dimension reduction is used to visually inspect the dataset for a strong signal in the methylation values that is related to samples' clinical or batch processing annotation. RnBeads implements two methods for dimension reduction - principal component analysis (PCA) and multidimensional scaling (MDS).
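That is, dimension reduction is a visual check for strong signals in the methylation values that track the samples' clinical annotation (cell type, etc.) or batch-processing annotation (batch effects). RnBeads offers two methods: principal component analysis (PCA) and multidimensional scaling (MDS).
Batch effects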
In this section, different properties of the dataset are tested for significant associations. The properties can include sample coordinates in the principal component space, phenotype traits and intensities of control probes. The tests used to calculate a p-value given two properties depend on the essence of the data:
If both properties contain categorical data (e.g. tissue type and sample processing date), the test of choice is a two-sided Fisher's exact test.
If both properties contain numerical data (e.g. coordinates in the first principal component and age of individual), the correlation coefficient between the traits is computed. A p-value is estimated using permutation tests with 10000 permutations.
If property A is categorical and property B contains numeric data, p-value for association is calculated by comparing the values of B for the different categories in A. The test of choice is a two-sided Wilcoxon rank sum test (when A defines two categories) or a Kruskal-Wallis one-way analysis of variance (when A separates the samples into three or more categories).
Note that the p-values presented in this report are not corrected for multiple testing.
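The rules above can be sketched in Python as a small dispatcher plus a permutation test for the numeric-numeric case. This is an illustration under assumptions, not the RnBeads implementation: the function names are hypothetical, Pearson correlation is assumed as the correlation coefficient, and the `+1` in the p-value estimate is a common convention rather than something the report states.

```python
import random


def choose_test(kind_a, kind_b, n_categories=None):
    """Pick a test name from the data kinds ('categorical' or 'numeric')."""
    if kind_a == "categorical" and kind_b == "categorical":
        return "fisher_exact_two_sided"
    if kind_a == "numeric" and kind_b == "numeric":
        return "correlation_permutation"
    # Mixed case: the categorical property splits the numeric one into groups.
    if n_categories is None or n_categories < 2:
        raise ValueError("need at least two categories")
    return "wilcoxon_rank_sum" if n_categories == 2 else "kruskal_wallis"


def pearson(x, y):
    """Pearson correlation coefficient (assumed choice of coefficient)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / (sxx * syy) ** 0.5


def permutation_p_value(x, y, n_perm=10000, seed=0):
    """Two-sided permutation p-value for the correlation between x and y."""
    rng = random.Random(seed)
    observed = abs(pearson(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the pairing between x and y
        if abs(pearson(x, shuffled)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Toy data: first principal component coordinate vs. age (made-up numbers).
pc1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
age = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 14.0, 16.0]
print(choose_test("numeric", "numeric"))
print(permutation_p_value(pc1, age, n_perm=2000))
```

Because these p-values are not corrected for multiple testing, a table with many trait and component pairs should be read with that in mind.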
Associations between Principal Components and Traits
Differential analysis
In the following analyses, p-values on the site level were computed using the limma method. That is, hierarchical linear models from the limma package were employed and fitted using an empirical Bayes approach on derived M-values.
p-values are computed with hierarchical linear models
The models are fitted with an empirical Bayes approach on derived M-values
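For reference, an M-value is conventionally the logit (base 2) of the beta value, M = log2(beta / (1 - beta)). A minimal Python sketch follows; the clipping offset `eps` is an assumption added here to keep the logarithm finite at beta = 0 or 1, and the exact offset RnBeads uses may differ.

```python
import math


def beta_to_m(beta, eps=0.01):
    """Convert a methylation beta value in [0, 1] to an M-value.

    eps is an assumed clipping offset, not a documented RnBeads setting.
    """
    clipped = min(max(beta, eps), 1.0 - eps)
    return math.log2(clipped / (1.0 - clipped))


for b in (0.0, 0.2, 0.5, 0.8, 1.0):
    print(b, round(beta_to_m(b), 3))
```

M-values are unbounded and roughly symmetric around 0, which suits linear modelling better than beta values squeezed into [0, 1].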
Differential methylation on the site level was computed based on a variety of metrics. Of particular interest for the following plots and analyses are the following quantities for each site: a) the difference in mean methylation levels of the two groups being compared, b) the quotient in mean methylation and c) a statistical test (limma or t-test depending on the settings) assessing whether the methylation values in the two groups originate from distinct distributions. Additionally each site was assigned a rank based on each of these three criteria. A combined rank is computed as the maximum (i.e. worst) rank among the three ranks. The smaller the combined rank for a site, the more evidence for differential methylation it exhibits. This section includes scatterplots of the site group means as well as volcano plots of each pairwise comparison colored according to the combined ranks or p-values of a given site.
Sites are ranked by the difference in means, the quotient of means, and the result of the limma statistical test; the smaller the rank, the more significantly differentially methylated the site.
We can also download the csv file, which contains:
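The combined rank can be sketched in Python as follows. The report states that it is the maximum (worst) of the three per-criterion ranks; the tie handling (ties broken by input order) and the use of absolute values for the difference and the log quotient are assumptions of this sketch.

```python
def ranks(values, largest_first):
    """Rank values 1..n; rank 1 is the strongest evidence."""
    order = sorted(range(len(values)), key=lambda i: values[i],
                   reverse=largest_first)
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def combined_rank(mean_diff, mean_quot_log2, p_values):
    """Worst (maximum) of the three ranks for every site."""
    r_diff = ranks([abs(d) for d in mean_diff], largest_first=True)
    r_quot = ranks([abs(q) for q in mean_quot_log2], largest_first=True)
    r_p = ranks(p_values, largest_first=False)  # small p-value = good rank
    return [max(triple) for triple in zip(r_diff, r_quot, r_p)]


# Three made-up sites.
print(combined_rank(
    mean_diff=[0.40, -0.10, 0.25],
    mean_quot_log2=[1.2, -0.3, 0.9],
    p_values=[0.03, 0.6, 0.001],
))  # → [2, 3, 2]
```

The first site has the largest effect size but only the second-best p-value, so its combined rank is 2: a site scores well only if it does well on all three criteria.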
id: site id
Chromosome: chromosome of the site
Start: start coordinate of the site
Strand: strand of the site
mean.g1, mean.g2: (where g1 and g2 are replaced by the respective group names in the table) mean methylation in each of the two groups
mean.diff: difference in methylation means between the two groups: mean.g1-mean.g2. In case of paired analysis, it is the mean of the pairwise differences.
mean.quot.log2: log2 of the quotient in methylation: log2((mean.g1+epsilon)/(mean.g2+epsilon)), where epsilon:=0.01. In case of paired analysis, it is the mean of the pairwise quotients.
diffmeth.p.val: p-value obtained from linear models employed in the limma package (or alternatively from a two-sided Welch t-test; which type of p-value is computed is specified in the differential.site.test.method option).
diffmeth.p.adj.fdr: FDR adjusted p-value of all sites
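As a worked illustration of these columns, the following Python sketch computes mean.diff and mean.quot.log2 for an unpaired comparison with epsilon = 0.01, as defined above. The FDR step uses the Benjamini-Hochberg procedure, which is an assumption here: the column description only says "FDR adjusted". The paired case is not covered.

```python
import math

EPSILON = 0.01  # offset from the column description above


def site_stats(g1, g2):
    """Per-site summary for two groups of beta values (unpaired case)."""
    m1 = sum(g1) / len(g1)
    m2 = sum(g2) / len(g2)
    return {
        "mean.g1": m1,
        "mean.g2": m2,
        "mean.diff": m1 - m2,
        "mean.quot.log2": math.log2((m1 + EPSILON) / (m2 + EPSILON)),
    }


def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted p-values (assumed FDR method)."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


# Made-up beta values for one site in two groups.
print(site_stats([0.8, 0.9], [0.2, 0.3]))
print(bh_adjust([0.01, 0.04, 0.03]))
```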
Differential methylation on the region level was computed based on a variety of metrics. Of particular interest for the following plots and analyses are the following quantities for each region: the mean difference in means across all sites in a region of the two groups being compared and the mean of quotients in mean methylation as well as a combined p-value calculated from all site p-values in the region [1]. Additionally each region was assigned a rank based on each of these three criteria. A combined rank is computed as the maximum (i.e. worst) value among the three ranks. The smaller the combined rank for a region, the more evidence for differential methylation it exhibits. Regions were defined based on the region types specified in the analysis. This section includes scatterplots of the region group means as well as volcano plots of each pairwise comparison colored according to the combined rank of a given region.
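The region-level aggregation can be sketched in Python along the same lines. The method behind the combined p-value in [1] is not described in the text, so this sketch substitutes plain Fisher's method, which assumes independent site p-values and may differ from what RnBeads actually uses. The dictionary keys of the result are hypothetical names.

```python
import math


def fisher_combined_p(p_values):
    """Fisher's method: X = -2 * sum(ln p) ~ chi-square with 2k d.o.f.

    For even degrees of freedom the chi-square survival function has the
    closed form exp(-X/2) * sum_{i<k} (X/2)^i / i!.
    """
    if not p_values or any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # this is X / 2
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return min(1.0, math.exp(-half_stat) * total)


def region_summary(sites):
    """Aggregate site-level rows (dicts) into one region-level row."""
    n = len(sites)
    return {
        "mean_diff": sum(s["mean.diff"] for s in sites) / n,
        "mean_quot_log2": sum(s["mean.quot.log2"] for s in sites) / n,
        "combined_p": fisher_combined_p([s["diffmeth.p.val"] for s in sites]),
    }


# Two made-up sites in one region.
region = region_summary([
    {"mean.diff": 0.30, "mean.quot.log2": 1.0, "diffmeth.p.val": 0.05},
    {"mean.diff": 0.10, "mean.quot.log2": 0.4, "diffmeth.p.val": 0.05},
])
print(region)
```

The three region-level quantities can then be ranked and combined by taking the worst rank, as at the site level.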
RnBeads data resources
Having taken a look, RnBeads also provides a collection of different methylation datasets and has analyzed them as well. The detailed analysis results can all be viewed.