StatQuest Notes 2: edgeR (No. 59)

Like DESeq2, edgeR does not use RPKM, TPM, etc. This is because it needs to adjust for:

  1. Sequencing depth (that’s what RPKM etc. deal with).

  2. Library composition (different samples contain different active genes).

How edgeR normalizes libraries

Step 1 Remove all untranscribed genes (remove genes with 0 read counts in all samples).
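Step 1 can be sketched in a few lines of Python. This is only an illustration on a made-up count table; the gene names, the counts, and the helper `drop_untranscribed` are hypothetical and are not part of edgeR.

```python
# Hypothetical toy count table: gene name -> read counts per sample.
counts = {
    "geneA": [0, 0, 0],     # zero reads in every sample: removed
    "geneB": [10, 0, 4],    # zero in one sample only: kept
    "geneC": [25, 30, 12],
}

def drop_untranscribed(counts):
    """Keep only genes with a non-zero read count in at least one sample."""
    return {gene: row for gene, row in counts.items() if any(c > 0 for c in row)}

filtered = drop_untranscribed(counts)
print(sorted(filtered))
```

Note that a gene is dropped only when it has 0 reads in all samples, not when it is missing from just one of them.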

Step 2 Pick one sample to be the “reference sample”, the sample against which all of the other samples are normalized.

What is a good/bad reference sample?

Step 3 Select the genes for calculating the scaling factors. This is done separately for each sample relative to the “reference sample”.

We’ll start by looking at the different types of genes to choose from.

edgeR selects the genes in the middle, with more effort put into excluding biased genes.

Now that we have a table of log ratios to identify biased genes, let’s make another table to identify genes that are highly and lowly transcribed in both samples.

To identify genes that are highly or lowly transcribed in both samples, first calculate the geometric mean for each gene across the two samples, computed here as the mean of the log2 values. The geometric mean is not easily influenced by outliers.

Now we have two tables, one to identify biased genes (log2(Reference/Sample2)), and one to identify genes that are highly and lowly transcribed in both samples (mean of logs).
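The two tables can be sketched as follows. This is a minimal illustration, assuming the per-gene values have already been scaled for sequencing depth. The function name `log_tables` and the toy numbers are hypothetical, and skipping genes with a zero in either sample is an assumption of this sketch (their logs are undefined).

```python
import math

def log_tables(reference, sample):
    """Return (log2 ratio, mean of log2 values) per gene for two samples."""
    log_ratio, mean_log = {}, {}
    for gene in reference:
        r, s = reference[gene], sample[gene]
        if r <= 0 or s <= 0:
            continue  # assumption: skip genes whose log would be undefined
        log_ratio[gene] = math.log2(r / s)                  # identifies biased genes
        mean_log[gene] = (math.log2(r) + math.log2(s)) / 2  # identifies high/low genes
    return log_ratio, mean_log

# Hypothetical scaled values for the reference sample and Sample 2.
reference = {"geneA": 8.0, "geneB": 2.0, "geneC": 0.0}
sample2 = {"geneA": 2.0, "geneB": 8.0, "geneC": 4.0}

log_ratio, mean_log = log_tables(reference, sample2)
print(log_ratio)
print(mean_log)
```

In this toy example geneA and geneB have the same mean of logs but opposite log ratios, which is why the two tables answer different questions.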

Filter out the top 30% and the bottom 30% of genes ranked by log2 ratio (the most biased genes).

Filter out the top 5% and the bottom 5% of genes ranked by mean of logs (the most highly and most lowly transcribed genes).

The genes that are still in both lists are then used to calculate the scaling factor. (Unfortunately, the genes in our example that are in both lists are “…”.)
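The trimming and the "still in both lists" step can be sketched like this. It is a simplified illustration: the helpers `trim` and `select_genes` are hypothetical, the number of genes dropped at each end is simply rounded down, and edgeR's own rank handling may differ in detail.

```python
def trim(values, fraction):
    """Return the genes left after dropping `fraction` from each end of the ranking."""
    ranked = sorted(values, key=values.get)
    k = int(len(ranked) * fraction)  # genes dropped at each end (rounded down)
    return set(ranked[k:len(ranked) - k])

def select_genes(log_ratio, mean_log, ratio_trim=0.30, mean_trim=0.05):
    """Genes that survive both the log-ratio trim and the mean-of-logs trim."""
    return trim(log_ratio, ratio_trim) & trim(mean_log, mean_trim)

# Hypothetical data: 20 genes with evenly spread log ratios.
genes = [f"g{i}" for i in range(20)]
log_ratio = {g: float(i - 10) for i, g in enumerate(genes)}
mean_log = {g: float(i) for i, g in enumerate(genes)}
mean_log["g10"] = 100.0  # very highly transcribed in both samples

kept = select_genes(log_ratio, mean_log)
print(sorted(kept))
```

Here g10 has an unremarkable log ratio but is still excluded, because it falls in the top 5% by mean of logs.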

Step 4 Calculate the weighted average of the remaining log2 ratios.

FYI, edgeR calls this the “weighted trimmed mean of the log2 ratios” (the TMM method, short for trimmed mean of M-values, where the M-values are the log2 ratios), because we “trimmed” off the most extreme genes.

By excluding the extreme genes, we avoid the effect of outliers (sort of like using the geometric mean).

Once you have selected which genes will be used to calculate the scaling factor, just calculate the weighted average of their log2 ratios. Genes with more reads mapped to them get more weight, because they are less noisy. This is because log ratios have more variance with low read counts.
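A weighted average of log2 ratios can be sketched as below. The notes do not spell out the exact weights edgeR uses, so this example simply uses each gene's total read count as a stand-in weight; that choice, the function name `weighted_mean`, and the numbers are all assumptions made for illustration.

```python
def weighted_mean(values, weights):
    """Weighted average: sum(value * weight) / sum(weight)."""
    total_weight = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total_weight

# Hypothetical log2 ratios of the selected genes and their read totals.
log_ratios = [1.0, -1.0, 0.5]
read_totals = [100, 100, 200]  # stand-in weights: more reads, more weight

avg = weighted_mean(log_ratios, read_totals)
print(avg)  # 0.25
```

An unweighted mean of the same three ratios would be about 0.17; the gene with the most reads pulls the weighted result toward its own ratio of 0.5.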

Step 5 Convert the weighted average of log2 values to “normal numbers”.
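Because the average was taken on log2 values, converting back to “normal numbers” means raising 2 to that power. A minimal sketch, with a hypothetical function name and input value:

```python
def to_scaling_factor(weighted_log2_average):
    """Undo the log2 transform: 2 raised to the weighted average."""
    return 2.0 ** weighted_log2_average

# A weighted average of 0 means "no adjustment needed" (factor 1),
# and an average of 1 means a two-fold difference (factor 2).
print(to_scaling_factor(0.0))
print(to_scaling_factor(1.0))
print(to_scaling_factor(0.25))
```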

Step 6 Center the scaling factors around 1.
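The notes do not say how the centering is done. A minimal sketch, assuming that “centering around 1” means dividing every scaling factor by the geometric mean of all the scaling factors, so that their product becomes 1 (the function name and the numbers are hypothetical):

```python
import math

def center_factors(factors):
    """Divide each factor by the geometric mean of all factors."""
    log_mean = sum(math.log(f) for f in factors) / len(factors)
    geo_mean = math.exp(log_mean)
    return [f / geo_mean for f in factors]

# Hypothetical raw scaling factors, one per sample.
raw = [0.5, 1.0, 2.0, 4.0]
centered = center_factors(raw)
print(centered)
print(math.prod(centered))  # approximately 1
```

Centering changes the absolute size of the factors but not their ratios to one another, so the relative adjustment between samples is preserved.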
