Introduction to the Transcriptome Tool Literature

Note: the following content is reposted from 360 Library.
1. Alignment tools
(Kim et al., 2015) HISAT: a fast spliced aligner with low memory requirements. Nature methods.

Aligns RNA-seq reads to a reference genome using uncompressed suffix arrays. STAR has a potential for accurately aligning long (several kilobases) reads that are emerging from the third-generation sequencing technologies.

(Dobin et al., 2013) STAR: ultrafast universal RNA-seq aligner. Bioinformatics.

Self-training Algorithm for Splice Junction Detection using RNA-seq.

(Li et al., 2013) TrueSight: a new algorithm for splice junction detection using RNA-seq. Nucleic acids research.

A toolkit for processing next-gen sequencing data. These programs were also implemented in Bioconductor R package Rsubread.

(Liao et al., 2013) The Subread aligner: fast, accurate and scalable read mapping by seed-and-vote. Nucleic acids research.

(Rogers et al., 2012) SpliceGrapher: detecting patterns of alternative splicing from RNA-Seq data in the context of gene models and EST data. Genome biology.

(Philippe et al., 2013) CRAC: an integrated approach to the analysis of RNA-seq reads. Genome biology.

A fast splice junction mapper for RNA-Seq reads. TopHat aligns RNA-Seq reads to mammalian-sized genomes using the high-throughput short read aligner Bowtie, and then analyzes the mapping results to identify splice junctions between exons.

(Kim et al., 2013) TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions. Genome Biol.

(Chu et al., 2015) SpliceJumper: a classification-based approach for calling splicing junctions from RNA-seq data. BMC bioinformatics.

(Srivastava et al., 2016) RapMap: a rapid, sensitive and accurate tool for mapping RNA-seq reads to transcriptomes. Bioinformatics.

A framework for genome-based transcript reconstruction and quantification. CIDANE is engineered not only to assemble RNA-seq reads ab initio, but also to make use of the growing annotation of known splice sites, transcription start and end sites, or even full-length transcripts, available for most model organisms. To some extent, CIDANE is able to recover splice junctions that are invisible to existing bioinformatics tools.

(Canzar et al., 2016) CIDANE: comprehensive isoform discovery and abundance estimation. Genome biology.

An open source tool for accurate genome-guided transcriptome assembly from RNA-seq reads based on the model of splice graph. An extension of our program CLASS, CLASS2 jointly optimizes read patterns and the number of supporting reads to score and prioritize transcripts, implemented in a novel, scalable and efficient dynamic programming algorithm.

(Song et al., 2016) CLASS2: accurate and efficient splice variant annotation from RNA-seq reads. Nucleic acids research.

2. Read counting
An RNA-seq read counting tool which builds upon the speed of featureCounts and implements the counting modes of HTSeq. VERSE is more than 30x faster than HTSeq when computing the same gene counts. VERSE also supports a hierarchical assignment scheme, which allows reads to be assigned uniquely and sequentially to different types of features according to user-defined priorities. It is built on top of featureCounts.

(Zhu et al., 2016) VERSE: a versatile and efficient RNA-Seq read counting tool. bioRxiv.
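
The hierarchical assignment scheme described for VERSE can be illustrated with a minimal sketch. This is not VERSE's implementation: the function names, the half-open interval convention, and the toy coordinates are all assumptions made for illustration. Each read is given to the first feature type, in user-defined priority order, that it overlaps.

```python
from typing import Dict, List, Optional, Tuple

Interval = Tuple[int, int]  # half-open [start, end); an assumed convention


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two half-open intervals share at least one position."""
    return a[0] < b[1] and b[0] < a[1]


def assign_read(read: Interval,
                features: Dict[str, List[Interval]],
                priority: List[str]) -> Optional[str]:
    """Return the first feature type, in priority order, that the read overlaps."""
    for feature_type in priority:
        if any(overlaps(read, iv) for iv in features.get(feature_type, [])):
            return feature_type
    return None  # read overlaps none of the listed feature types


# Hypothetical toy annotation and reads.
features = {
    "exon": [(100, 200), (300, 400)],
    "intron": [(200, 300)],
}
priority = ["exon", "intron"]  # exons take precedence over introns
reads = [(150, 180), (190, 220), (240, 260), (500, 520)]

assignments = [assign_read(r, features, priority) for r in reads]
print(assignments)
```

The read spanning 190 to 220 touches both an exon and an intron, and the priority list resolves it to the exon, so each read is counted once.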

A tool for RNA-Seq data analysis that counts for each gene how many aligned reads overlap its exons.

(Anders et al., 2013) Count-based differential expression analysis of RNA sequencing data using R and Bioconductor. Nature protocols.
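
Counting, for each gene, how many aligned reads overlap its exons can be sketched roughly as follows. This is a simplified illustration, not the tool's own code: it assumes one chromosome, half-open coordinates, and no strand information, and the `__ambiguous` and `__no_feature` labels and the toy data are chosen for the example.

```python
from collections import Counter
from typing import Dict, List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) on a single chromosome


def count_reads(reads: List[Interval],
                gene_exons: Dict[str, List[Interval]]) -> Counter:
    """Count reads per gene; reads hitting several genes or none get special labels."""
    counts: Counter = Counter()
    for r_start, r_end in reads:
        # Collect every gene with at least one exon overlapping the read.
        hits = {
            gene
            for gene, exons in gene_exons.items()
            if any(r_start < e_end and e_start < r_end for e_start, e_end in exons)
        }
        if len(hits) == 1:
            counts[hits.pop()] += 1
        elif hits:
            counts["__ambiguous"] += 1
        else:
            counts["__no_feature"] += 1
    return counts


# Hypothetical annotation: geneA and geneB have overlapping exons.
gene_exons = {"geneA": [(0, 100), (200, 300)], "geneB": [(250, 400)]}
reads = [(10, 60), (90, 140), (260, 290), (500, 550)]

counts = count_reads(reads, gene_exons)
print(dict(counts))
```

Setting aside reads that overlap more than one gene, rather than assigning them arbitrarily, keeps the per-gene counts conservative for downstream differential expression analysis.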

A package that provides efficient low-level and highly reusable S4 classes for storing ranges of integers, RLE vectors (Run-Length Encoding) and, more generally, data that can be organized sequentially (formally defined as Vector objects), as well as views on these Vector objects. IRanges also provides efficient list-like classes for storing big collections of instances of the basic classes. All classes in the package use consistent naming and share the same rich and consistent Vector API as much as possible.

(Lawrence et al., 2013) Software for computing and annotating genomic ranges. PLoS computational biology.
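
Run-length encoding, the idea behind the RLE vectors mentioned for IRanges, can be shown with a short Python sketch. It is illustrative only and unrelated to the package's S4 implementation in R; the function names and the toy coverage vector are assumptions.

```python
from itertools import groupby
from typing import List, Tuple


def rle_encode(values: List[int]) -> Tuple[List[int], List[int]]:
    """Compress a vector into parallel lists of run values and run lengths."""
    run_values: List[int] = []
    run_lengths: List[int] = []
    for value, group in groupby(values):
        run_values.append(value)
        run_lengths.append(sum(1 for _ in group))
    return run_values, run_lengths


def rle_decode(run_values: List[int], run_lengths: List[int]) -> List[int]:
    """Expand run values and lengths back into the original vector."""
    out: List[int] = []
    for value, length in zip(run_values, run_lengths):
        out.extend([value] * length)
    return out


# Hypothetical per-base read coverage along a short region.
coverage = [0, 0, 0, 2, 2, 5, 5, 5, 5, 0]
vals, lens = rle_encode(coverage)
print(vals, lens)
```

Per-base coverage along a genome tends to contain long stretches of identical values, which is why a run-based representation can be much smaller than the full vector.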

A read summarization program, which counts mapped reads for the genomic features such as genes and exons.

(Liao et al., 2013) featureCounts: an efficient general-purpose program for assigning sequence reads to genomic features. Bioinformatics.

3. Quantification
A fast and highly efficient assembler of RNA-Seq alignments into potential transcripts. It is primarily a genome-guided transcriptome assembler, although it can borrow algorithmic techniques from de novo genome assembly to help with transcript assembly. Its input can include not only the spliced read alignments used by reference-based assemblers, but also longer contigs that were assembled de novo from unambiguous, non-branching parts of a transcript.

(Pertea et al., 2015) StringTie enables improved reconstruction of a transcriptome from RNA-seq reads. Nature biotechnology.

A computational approach that measures changes in mature RNA and pre-mRNA reads across different experimental conditions to quantify transcriptional and post-transcriptional regulation of gene expression. EISA reveals both transcriptional and post-transcriptional contributions to expression changes, increasing the amount of information that can be gained from RNA-seq data sets.

(Gaidatzis et al., 2015) Analysis of intronic and exonic reads in RNA-seq data characterizes transcriptional and post-transcriptional regulation. Nature biotechnology.
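
One way to picture the exon-intron split analysis described above: the change in intronic reads is taken as a proxy for the transcriptional change, and the exonic change minus the intronic change as the post-transcriptional component. The sketch below rests on that reading and omits the normalization and statistical testing used in the paper; the function names, pseudocount, and counts are hypothetical.

```python
import math
from typing import Dict


def log2_fold_change(treated: float, control: float, pseudocount: float = 1.0) -> float:
    """log2 ratio of normalized counts, with a pseudocount to avoid division by zero."""
    return math.log2((treated + pseudocount) / (control + pseudocount))


def split_expression_change(exon_ctrl: float, exon_trt: float,
                            intron_ctrl: float, intron_trt: float) -> Dict[str, float]:
    """Split an expression change into transcriptional and post-transcriptional parts."""
    delta_exon = log2_fold_change(exon_trt, exon_ctrl)        # mature RNA change
    delta_intron = log2_fold_change(intron_trt, intron_ctrl)  # pre-mRNA change
    return {
        "delta_exon": delta_exon,
        "delta_intron": delta_intron,
        "post_transcriptional": delta_exon - delta_intron,
    }


# Hypothetical normalized counts for one gene in control vs. treated samples.
result = split_expression_change(exon_ctrl=99, exon_trt=399,
                                 intron_ctrl=49, intron_trt=99)
print(result)
```

In this toy case the exonic signal rises fourfold while the intronic signal only doubles, so half of the log2 change would be attributed to post-transcriptional regulation.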

Assembles transcripts, estimates their abundances, and tests for differential expression and regulation in RNA-Seq samples.

(Trapnell et al., 2010) Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nature biotechnology.

A method for transcriptome reconstruction that relies solely on RNA-Seq reads and an assembled genome to build a transcriptome ab initio. The statistical methods to estimate read coverage significance are also applicable to other sequencing data. Scripture also has modules for ChIP-Seq peak calling.

(Guttman et al., 2010) Ab initio reconstruction of cell type-specific transcriptomes in mouse reveals the conserved multi-exonic structure of lincRNAs. Nature biotechnology.

Accurate quantification of transcriptome from RNA-Seq data by effective length normalization.

(Lee et al., 2011) Accurate quantification of transcriptome from RNA-Seq data by effective length normalization. Nucleic acids research.
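
To give a feel for what normalizing by effective length means, here is a generic sketch. It does not reproduce the definition used by Lee et al.: it assumes the common convention that a transcript of length L offers L - r + 1 start positions for a read of length r, and it reports TPM-style values. All names and numbers are hypothetical.

```python
from typing import Dict


def effective_length(length: int, read_len: int) -> int:
    """Number of positions a read can start from (assumed convention), at least 1."""
    return max(length - read_len + 1, 1)


def tpm(counts: Dict[str, int], lengths: Dict[str, int], read_len: int) -> Dict[str, float]:
    """Transcripts-per-million style abundances from counts and effective lengths."""
    # Reads per effective base for each transcript.
    rates = {t: counts[t] / effective_length(lengths[t], read_len) for t in counts}
    total = sum(rates.values())
    return {t: rate / total * 1e6 for t, rate in rates.items()}


# Two hypothetical transcripts with equal read counts but different lengths.
counts = {"tx1": 100, "tx2": 100}
lengths = {"tx1": 1099, "tx2": 2099}
abundance = tpm(counts, lengths, read_len=100)
print(abundance)
```

With equal counts, the shorter transcript comes out twice as abundant, because the same number of reads is spread over half as many possible start positions.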

An integrated alignment workflow and a simple counting-based approach to derive estimates for gene, exon and exon-exon junction expression. In contrast to previous counting-based approaches, EQP takes into account only reads whose alignment pattern agrees with the splicing pattern of the features of interest. This leads to improved gene expression estimates as well as to the generation of exon counts that allow disambiguating reads between overlapping exons.

(Schuierer and Roma, 2016) The exon quantification pipeline (EQP): a comprehensive approach to the quantification of gene, exon and junction expression from RNA-seq data. Nucleic acids research.

RNA-eXpress was designed as a user-friendly solution to extract and annotate biologically important transcripts from next-generation RNA sequencing data.

(Forster et al., 2013) RNA-eXpress annotates novel transcript features in RNA-seq data. Bioinformatics.

A versatile model to account for sequence-specific bias that commonly occurs at the ends of fragments. Isolator analyzes RNA-Seq experiments using a simple Bayesian hierarchical model. Combined with aggressive bias correction, it produces estimates that are simultaneously accurate and show high agreement between samples. Isolator is uniquely able to compute posterior probabilities corresponding to arbitrarily complex questions, within the confines of the model.

(Jones et al., 2016) Isolator: accurate and stable analysis of isoform-level expression in RNA-Seq experiments. bioRxiv.

4. Normalization and differential expression
A method for differential analysis of count data, using shrinkage estimation for dispersions and fold changes to improve stability and interpretability of estimates. This enables a more quantitative analysis focused on the strength rather than the mere presence of differential expression.

(Love et al., 2014) Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome biology.

A software package designed to facilitate flexible differential expression analysis of RNA-Seq data. Ballgown can also be used to visualize the transcript assembly on a gene-by-gene basis, extract abundance estimates for exons, introns, transcripts or genes, and perform linear model–based differential expression analyses.

(Frazee et al., 2015) Ballgown bridges the gap between transcriptome assembly and expression analysis. Nature biotechnology.

A package to dampen the effect of outliers on count-based differential expression analyses. edgeR uses empirical Bayes estimation and exact tests based on the negative binomial distribution and is useful for differential signal analysis with other types of genome-scale count data. It requires a delicate tradeoff to maintain high power while at the same time achieving a decent resistance to the presence of outliers. In particular, it is difficult to know exactly what an outlier is and where the line should be drawn to identify it as such.

(Zhou et al., 2014) Robustly detecting differential expression in RNA sequencing data using observation weights. Nucleic acids research.

A differential transcript expression (DTE) analysis algorithm. SDEAP estimates the number of conditions directly from the input samples using a Dirichlet mixture model and discovers alternative splicing events using a new graph modular decomposition algorithm. Thanks to these technical improvements, SDEAP outperformed the other DTE analysis methods in extensive experiments on simulated data and real data with qPCR validation. The predictions of SDEAP also allow users to classify samples of cancer subtypes and cell-cycle phases more accurately.

(Yang and Jiang, 2016) SDEAP: a splice graph based differential transcript expression analysis tool for population data. Bioinformatics.

Enables rapid interpretation of complex gene expression studies as well as other high-throughput genomics assays. variancePartition is a statistical and visualization framework, used to prioritize drivers of variation based on a genome-wide summary, and identify genes that deviate from the genome-wide trend. This tool quantifies variation in each expression trait attributable to differences in disease status, sex, cell or tissue type, ancestry, genetic background, experimental stimulus, or technical variables.

(Hoffman and Schadt, 2016) variancePartition: interpreting drivers of variation in complex gene expression studies. BMC Bioinformatics.

A realistic framework to assess the impact of the key components of the statistical framework for differential analyses of RNA-seq data. This tool is based on real data sets and allows the exploration of various scenarios differing in the proportion of non-differentially expressed genes. Hence, it provides an evaluation of the key ingredients of the differential analysis, free of the biases associated with the simulation of data using parametric models.

(Rigaill et al., 2016) Synthetic data sets for the identification of key ingredients for RNA-seq differential analysis. Briefings in Bioinformatics.

Detects differentially expressed (DE) genes in RNA-seq data with a high level of heterogeneity, such as cancer RNA-seq data. ELTSeq is an empirical likelihood ratio test (ELT) with a mean-variance relationship constraint for the differential expression analysis of RNA sequencing (RNA-seq) data. As a distribution-free nonparametric model, ELTSeq handles individual heterogeneity by estimating an empirical probability for each observation without making any assumption about the read-count distribution. It also incorporates a constraint for the read-count overdispersion that is widely observed in RNA-seq data. ELTSeq demonstrates a significant improvement over existing methods such as edgeR, DESeq, t-tests, Wilcoxon tests and the classic empirical likelihood-ratio test when handling heterogeneous groups. According to its authors, it will significantly advance transcriptomics studies of cancers and other complex diseases.

(Xu and Chen, 2016) An empirical likelihood ratio test robust to individual heterogeneity for differential expression analysis of RNA-seq. Briefings in Bioinformatics.

A package for detecting differentially expressed (DE) genes in time course RNA-Seq data. The negative binomial mixed-effect model (NBMM) method is applied to gene expression data on a gene-by-gene basis. A parallel computing option is implemented in the timeSeq package to speed up computation. The authors report that their approach outperforms other currently available methods on both synthetic and real data.

(Sun et al., 2016) Statistical inference for time course RNA-Seq data using a negative binomial mixed-effect model. BMC Bioinformatics.

A method for facilitating DE analysis using RNA-seq read count data with multiple treatment conditions. The read count is assumed to follow a log-linear model incorporating two factors (i.e., condition and gene), where an interaction term is used to quantify the association between gene and condition. The number of degrees of freedom is reduced to one through the first-order decomposition of the interaction, leading to a dramatic improvement in power when testing for DE genes if the number of conditions is greater than two.

(Kang et al., 2016) multiDE: a dimension reduced model based statistical method for differential expression analysis using RNA-sequencing data with multiple treatment conditions. BMC bioinformatics.

(Jia et al., 2015) MetaDiff: differential isoform expression analysis using random-effects meta-regression. BMC bioinformatics.

Provides a data-driven solution to test the assumptions of global normalization methods. Group level information about each sample (such as tumor/normal status) must be provided because the test assesses if there are global differences in the distributions between the user-defined groups.

(Hicks and Irizarry, 2015) quantro: a data-driven approach to guide the choice of an appropriate normalization method. Genome biology.

A Bayesian hierarchical approach to investigate within-sample and between-sample variations in RNA-Seq data.

(Gu et al., 2014) BADGE: A novel Bayesian model for accurate abundance quantification and differential analysis of RNA-Seq data. BMC bioinformatics.

An algorithm that estimates expression at transcript-level resolution and controls for variability evident across replicate libraries.

(Trapnell et al., 2013) Differential analysis of gene regulation at transcript resolution with RNA-seq. Nature biotechnology.

(Li et al., 2012) Normalization, testing, and false discovery rate estimation for RNA-sequencing data. Biostatistics.

A package to identify differentially expressed genes or isoforms for RNA-seq data from different samples. DEGseq also encourages users to export gene expression values in a table format that can be processed directly by edgeR (Robinson, 2009), an R package implementing a method based on the negative binomial distribution to model overdispersion relative to Poisson for digital gene expression data with few replicates (Robinson and Smyth, 2007).

(Wang et al., 2010) DEGseq: an R package for identifying differentially expressed genes from RNA-seq data. Bioinformatics.

5. Gene fusion

An enhanced version with the ability to align reads across fusion points, which results from the breakage and re-joining of two different chromosomes, or from rearrangements within a chromosome.

(Kim and Salzberg, 2011) TopHat-Fusion: an algorithm for discovery of novel fusion transcripts. Genome biology.

A python package to annotate and visualize gene fusions. For a given gene fusion, AGFusion will predict the cDNA, CDS, and protein sequences resulting from fusion of all combinations of transcripts and save them to fasta files. AGFusion can also plot the protein domain architecture of the fusion transcripts.

(Murphy and Elemento, 2016) AGFusion: annotate and visualize gene fusions. bioRxiv.

A toolkit for fusion gene and chimeric transcript detection from RNA-seq data. InFusion is a computational method for the discovery of chimeric transcripts from RNA-seq data capable of detecting alternatively spliced chimeric transcripts and fusion genes involving non-coding regions. InFusion allows detection of fusions that involve intergenic regions, analyses and filters putative fusion events based on coverage depth, genomic context and strand specificity.

(Okonechnikov et al., 2016) InFusion: Advancing Discovery of Fusion Genes and Chimeric Transcripts from Deep RNA-Sequencing Data. PLoS One.

6. Alternative splicing
(Reuter et al., 2016) PreTIS: A Tool to Predict Non-canonical 5’ UTR Translational Initiation Sites in Human and Mouse. PLoS Computational Biology.

(Afsari et al., 2016) Splice Expression Variation Analysis (SEVA) for Differential Gene Isoform Usage in Cancer. bioRxiv.

The DEXSeq method is implemented as an open Bioconductor package, which facilitates data visualization and exploration. It can detect with high sensitivity genes, and in many cases exons, that are subject to differential exon usage.

(Anders et al., 2012) Detecting differential usage of exons from RNA-seq data. Genome research.

(Liu et al., 2012) Detection, annotation and visualization of alternative splicing from RNA-Seq data with SplicingViewer. Genomics.

(Ryan et al., 2012) SpliceSeq: a resource for analysis and visualization of RNA-Seq data on alternative splicing and its functional impacts. Bioinformatics.

Alternative Splicing transcriptional landscape visualization tool.

(Foissac and Sammeth, 2007) ASTALAVISTA: dynamic and flexible analysis of alternative splicing events in custom gene datasets. Nucleic acids research.

7. Allele-specific expression
(Deonovic et al., 2016) IDP-ASE: haplotyping and quantifying allele-specific expression at the gene and gene isoform level by hybrid sequencing. Nucleic Acids Research.

(Soderlund et al., 2014) Allele workbench: transcriptome pipeline and interactive graphics for allele-specific expression. PLoS One.

(Romanel et al., 2015) ASEQ: fast allele-specific studies from next-generation sequencing data. BMC medical genomics.

(Nariai et al., 2015) A Bayesian approach for estimating allele-specific expression from RNA-Seq data with diploid genomes. BMC genomics.
