https://doi.org/10.1093/bioinformatics/btaa976
Abstract
Motivation
Single-cell RNA-sequencing (scRNA-seq) offers the opportunity to dissect heterogeneous cellular compositions and interrogate the cell-type-specific gene expression patterns across diverse conditions. However, batch effects such as laboratory conditions and individual-variability hinder their usage in cross-condition designs.
Results
Here, we present a single-cell Generative Adversarial Network (scGAN) to simultaneously acquire patterns from raw data while minimizing the confounding effect driven by technical artifacts or other factors inherent to the data. Specifically, scGAN models the data likelihood of the raw scRNA-seq counts by projecting each cell onto a latent embedding. Meanwhile, scGAN attempts to minimize the correlation between the latent embeddings and the batch labels across all cells. We demonstrate scGAN on three public scRNA-seq datasets and show that our method confers superior performance over the state-of-the-art methods in forming clusters of known cell types and identifying known psychiatric genes that are associated with major depressive disorder.
key: why? Single-cell data suffer from batch effects (laboratory conditions and inter-individual variability).
how? VAE+GAN
which is generator?
which is discriminator?
why can it remove batch effect?
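The encoder network must also minimize the correlation between each cell's embedding and the confounding batch variable. The discriminator, on the other hand, learns to predict the batch variable using the encoder-generated embeddings as input.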
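One way to read the encoder/discriminator split, as a minimal NumPy sketch and not the scGAN implementation: the discriminator is scored by its cross-entropy on the batch labels, and the encoder's objective subtracts that cross-entropy from its reconstruction error, so embeddings that reveal the batch are penalized. The function names, the linear softmax discriminator, the squared-error reconstruction term (a stand-in for the count likelihood) and the weight `lam` are all assumptions made for illustration.

```python
import numpy as np

def batch_cross_entropy(z, batch, w):
    """Cross-entropy of a linear softmax discriminator predicting batch from embeddings z."""
    logits = z @ w                                         # shape (cells, batches)
    logits = logits - logits.max(axis=1, keepdims=True)    # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(batch)), batch].mean()

def encoder_objective(x, x_hat, z, batch, w, lam=1.0):
    """Reconstruction error minus lam * discriminator loss (lower is better for the encoder)."""
    reconstruction = np.mean((x - x_hat) ** 2)             # assumed stand-in for the likelihood term
    return reconstruction - lam * batch_cross_entropy(z, batch, w)
```

In an adversarial loop the discriminator weights `w` would be updated to lower `batch_cross_entropy`, while the encoder would be updated to lower `encoder_objective`, which pushes the cross-entropy up.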
key: why? To find candidate drugs for Alzheimer's disease (AD).
how?
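First, we collected 561 genes reported to be AD risk genes and performed functional enrichment analysis on them. Then, by quantifying the proximity between 5595 molecular drugs and AD in the human interactome, we screened out the 1092 drugs closest to the disease. We further performed inverted gene set enrichment analysis on these candidates, which allowed us to estimate the effect of perturbation on gene expression, and identified 24 potential candidate drugs for AD treatment.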
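The drug-disease proximity step can be sketched as follows. The notes do not say which proximity measure was used, so this sketch assumes the "closest" distance: for each drug target, take the shortest-path distance to the nearest disease gene, then average over targets. The graph, gene names and function names are hypothetical.

```python
from collections import deque

def shortest_distances(graph, source):
    """Breadth-first search distances from source in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist

def closest_proximity(graph, drug_targets, disease_genes):
    """Mean, over drug targets, of the distance to the nearest disease gene."""
    total = 0
    for target in drug_targets:
        dist = shortest_distances(graph, target)
        reachable = [dist[g] for g in disease_genes if g in dist]
        if not reachable:
            return float("inf")      # a target with no path to any disease gene
        total += min(reachable)
    return total / len(drug_targets)

# Hypothetical toy interactome: a path A - B - C - D.
toy_graph = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
```

A smaller value means the drug's targets sit closer to the disease module, so drugs can be ranked by this score. In practice the raw distance is usually compared against randomly sampled gene sets to get a z-score, which this sketch omits.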
Abstract
Drug repurposing involves the identification of new applications for existing drugs at a lower cost and in a shorter time. There are different computational drug-repurposing strategies and some of these approaches have been applied to the coronavirus disease 2019 (COVID-19) pandemic. Computational drug-repositioning approaches applied to COVID-19 can be broadly categorized into (i) network-based models, (ii) structure-based approaches and (iii) artificial intelligence (AI) approaches. Network-based approaches are divided into two categories: network-based clustering approaches and network-based propagation approaches. Both of them allowed to annotate some important patterns, to identify proteins that are functionally associated with COVID-19 and to discover novel drug–disease or drug–target relationships useful for new therapies. Structure-based approaches allowed to identify small chemical compounds able to bind macromolecular targets to evaluate how a chemical compound can interact with the biological counterpart, trying to find new applications for existing drugs. AI-based networks appear, at the moment, less relevant since they need more data for their application.
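key: 1) network-based (clustering and propagation) 2) structure-based 3) AI-based
Network-based clustering: modules
Network-based propagation: indicates that the key process is the interaction of the viral spike protein with human angiotensin-converting enzyme 2 (ACE2) and transmembrane serine protease 2 (TMPRSS2): the receptor-binding domain of the spike protein binds the peptidase domain of human ACE2. Mpro mediates viral replication and transcription.
Network proximity between drug targets and HCoV-associated proteins was computed to screen candidate repurposable drugs for HCoV under the human protein interactome model.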
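Network-based propagation is commonly illustrated with random walk with restart, sketched below in NumPy. This is a generic example, not the method of any specific paper in these notes. The restart probability, tolerance and function name are assumptions.

```python
import numpy as np

def random_walk_with_restart(adjacency, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Propagate seed scores over a network until the score vector stops changing."""
    adjacency = np.asarray(adjacency, dtype=float)
    col_sums = adjacency.sum(axis=0)
    col_sums[col_sums == 0] = 1.0                 # avoid dividing by zero for isolated nodes
    transition = adjacency / col_sums             # column-normalized transition matrix
    p0 = np.zeros(adjacency.shape[0])
    p0[list(seeds)] = 1.0 / len(seeds)            # start with all mass on the seed nodes
    p = p0.copy()
    for _ in range(max_iter):
        p_next = (1 - restart) * transition @ p + restart * p0
        if np.abs(p_next - p).sum() < tol:
            return p_next
        p = p_next
    return p
```

Seeding the walk at virus-interacting host proteins gives every other node a score that reflects how close it is to the seeds in the network. Drug targets with high scores are then candidate points of intervention.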
Abstract
A newly described coronavirus named severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), which is the causative agent of coronavirus disease 2019 (COVID-19), has infected over 2.3 million people, led to the death of more than 160,000 individuals and caused worldwide social and economic disruption. There are no antiviral drugs with proven clinical efficacy for the treatment of COVID-19, nor are there any vaccines that prevent infection with SARS-CoV-2, and efforts to develop drugs and vaccines are hampered by the limited knowledge of the molecular details of how SARS-CoV-2 infects cells. Here we cloned, tagged and expressed 26 of the 29 SARS-CoV-2 proteins in human cells and identified the human proteins that physically associated with each of the SARS-CoV-2 proteins using affinity-purification mass spectrometry, identifying 332 high-confidence protein–protein interactions between SARS-CoV-2 and human proteins. Among these, we identify 66 druggable human proteins or host factors targeted by 69 compounds (of which, 29 drugs are approved by the US Food and Drug Administration, 12 are in clinical trials and 28 are preclinical compounds). We screened a subset of these in multiple viral assays and found two sets of pharmacological agents that displayed antiviral activity: inhibitors of mRNA translation and predicted regulators of the sigma-1 and sigma-2 receptors. Further studies of these host-factor-targeting agents, including their combination with drugs that directly target viral enzymes, could lead to a therapeutic regimen to treat COVID-19.
https://www.nature.com/articles/s41586-020-2286-9
key: 26 332 66 69 2
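How was the experiment done?
Cloned and expressed 26 of the 29 SARS-CoV-2 proteins; affinity-purification mass spectrometry identified the human proteins that interact with the viral proteins, yielding 332 high-confidence protein-protein interactions. Of these, 66 proteins are drug targets, covered by 69 compounds. A subset of these was screened in multiple viral assays, revealing two sets of pharmacological agents with antiviral activity: inhibitors of mRNA translation and regulators of the sigma-1 and sigma-2 receptors.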
Abstract
The recent epidemic outbreak of a novel human coronavirus called SARS-CoV-2 causing the respiratory tract disease COVID-19 has reached worldwide resonance and a global effort is being undertaken to characterize the molecular features and evolutionary origins of this virus. In this paper, we set out to shed light on the SARS-CoV-2/host receptor recognition, a crucial factor for successful virus infection. Based on the current knowledge of the interactome between SARS-CoV-2 and host cell proteins, we performed Master Regulator Analysis to detect which parts of the human interactome are most affected by the infection. We detected, amongst others, affected apoptotic and mitochondrial mechanisms, and a downregulation of the ACE2 protein receptor, notions that can be used to develop specific therapies against this new virus.
key: 125 proteins (31 viral proteins and 94 human host proteins) and 200 unique interactions.
https://www.mdpi.com/2077-0383/9/4/982/htm#B28-jcm-09-00982
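key: during viral infection, which protein interactions are most affected by the infection
how? Map the virus-human protein interactions onto a gene regulatory network and use master regulator analysis to identify the key regulators.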
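The core idea of master regulator analysis, scoring each regulator by how strongly its targets are enriched in a gene signature, can be sketched with a hypergeometric over-representation test. Real MRA tools use other enrichment statistics, so this is a simplified stand-in. The regulon and gene names are hypothetical.

```python
from math import comb

def hypergeom_pvalue(universe_size, n_targets, n_signature, n_overlap):
    """P(overlap >= n_overlap) when n_signature genes are drawn from the universe."""
    upper = min(n_targets, n_signature)
    total = comb(universe_size, n_signature)
    tail = sum(
        comb(n_targets, k) * comb(universe_size - n_targets, n_signature - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total

def rank_regulators(regulons, signature, universe_size):
    """Sort regulators by how strongly their targets are over-represented in the signature."""
    signature = set(signature)
    scores = {
        regulator: hypergeom_pvalue(
            universe_size, len(targets), len(signature), len(set(targets) & signature)
        )
        for regulator, targets in regulons.items()
    }
    return sorted(scores.items(), key=lambda item: item[1])
```

Here the signature would be the set of host genes affected by infection, and the regulators with the smallest p-values are the candidate master regulators.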
Abstract
Coronavirus Disease-2019 (COVID-19) is an infectious disease caused by the SARS-CoV-2 virus. Various studies exist about the molecular mechanisms of viral infection. However, such information is spread across many publications and it is very time-consuming to integrate, and exploit. We develop CoVex, an interactive online platform for SARS-CoV-2 host interactome exploration and drug (target) identification. CoVex integrates virus-human protein interactions, human protein-protein interactions, and drug-target interactions. It allows visual exploration of the virus-host interactome and implements systems medicine algorithms for network-based prediction of drug candidates. Thus, CoVex is a resource to understand molecular mechanisms of pathogenicity and to prioritize candidate therapeutics. We investigate recent hypotheses on a systems biology level to explore mechanistic virus life cycle drivers, and to extract drug repurposing candidates. CoVex renders COVID-19 drug research systems-medicine-ready by giving the scientific community direct access to network medicine algorithms. It is available at https://exbio.wzw.tum.de/covex/.
https://www.nature.com/articles/s41467-020-17189-2
key: It enables visual exploration of the virus-host interactome and implements systems medicine algorithms for network-based prediction of drug candidates.
AI approaches:
https://www.sciencedirect.com/science/article/pii/S2319417020300494
Abstract
Background
The ongoing COVID-19 pandemic has caused more than 193,825 deaths during the past few months. A quick-to-be-identified cure for the disease will be a therapeutic medicine that has prior use experiences in patients in order to resolve the current pandemic situation before it could become worsening. Artificial intelligence (AI) technology is hereby applied to identify the marketed drugs with potential for treating COVID-19.
Methods
An AI platform was established to identify potential old drugs with anti-coronavirus activities by using two different learning databases; one consisted of the compounds reported or proven active against SARS-CoV, SARS-CoV-2, human immunodeficiency virus, influenza virus, and the other one containing the known 3C-like protease inhibitors. All AI predicted drugs were then tested for activities against a feline coronavirus in in vitro cell-based assay. These assay results were feedbacks to the AI system for relearning and thus to generate a modified AI model to search for old drugs again.
Results
After a few runs of AI learning and prediction processes, the AI system identified 80 marketed drugs with potential. Among them, 8 drugs (bedaquiline, brequinar, celecoxib, clofazimine, conivaptan, gemcitabine, tolcapone, and vismodegib) showed in vitro activities against the proliferation of a feline infectious peritonitis (FIP) virus in Fcwf-4 cells. In addition, 5 other drugs (boceprevir, chloroquine, homoharringtonine, tilorone, and salinomycin) were also found active during the exercises of AI approaches.
Conclusion
Having taken advantages of AI, we identified old drugs with activities against FIP coronavirus. Further studies are underway to demonstrate their activities against SARS-CoV-2 in vitro and in vivo at clinically achievable concentrations and doses. With prior use experiences in patients, these old drugs if proven active against SARS-CoV-2 can readily be applied for fighting COVID-19 pandemic.
key:
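An AI platform was established to identify potential old drugs with anti-coronavirus activity using two different learning databases: one consisting of compounds reported or proven active against SARS-CoV, SARS-CoV-2, human immunodeficiency virus, and influenza virus, and the other containing the known 3C-like protease inhibitors. All AI-predicted drugs were then tested for activity against a feline coronavirus in an in vitro cell-based assay. The assay results were fed back to the AI system for relearning, producing a modified AI model that searched for old drugs again.
why? Feline coronavirus is an alpha-coronavirus that causes enteritis in domestic and wild cats. About 5–15% of infected cats develop feline infectious peritonitis (FIP), which is fatal to cats [2]. FIP virus infection in cats shows features similar to severe acute respiratory syndrome (SARS) infection, such as the lung lesions seen in humans [3]. The nucleoside analog GS-441524 and the 3C-like protease inhibitor GC376 have both been shown to exhibit antiviral activity against FIP virus in vitro and to treat FIP in cats effectively.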
Abstract
The infection of a novel coronavirus found in Wuhan of China (SARS-CoV-2) is rapidly spreading, and the incidence rate is increasing worldwide. Due to the lack of effective treatment options for SARS-CoV-2, various strategies are being tested in China, including drug repurposing. In this study, we used our pre-trained deep learning-based drug-target interaction model called Molecule Transformer-Drug Target Interaction (MT-DTI) to identify commercially available drugs that could act on viral proteins of SARS-CoV-2. The result showed that atazanavir, an antiretroviral medication used to treat and prevent the human immunodeficiency virus (HIV), is the best chemical compound, showing an inhibitory potency with Kd of 94.94 nM against the SARS-CoV-2 3C-like proteinase, followed by remdesivir (113.13 nM), efavirenz (199.17 nM), ritonavir (204.05 nM), and dolutegravir (336.91 nM). Interestingly, lopinavir, ritonavir, and darunavir are all designed to target viral proteinases. However, in our prediction, they may also bind to the replication complex components of SARS-CoV-2 with an inhibitory potency with Kd < 1000 nM. In addition, we also found that several antiviral agents, such as Kaletra (lopinavir/ritonavir), could be used for the treatment of SARS-CoV-2. Overall, we suggest that the list of antiviral drugs identified by the MT-DTI model should be considered, when establishing effective treatment strategies for SARS-CoV-2.
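key: Deep learning prediction of drug-target interactions. Drugs are represented as SMILES strings (a one-dimensional representation of compound structure), targets as amino acid sequences; sequence model. Transformer.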
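The input representation, a SMILES string for the drug and an amino acid sequence for the target, can be sketched as integer encoding for a sequence model. This is a simplified illustration, not the MT-DTI preprocessing. It tokenizes one character at a time, which splits multi-character atoms such as `Cl`, and it maps both padding and unknown symbols to 0. All names are hypothetical.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"   # the 20 standard amino acids, one-letter codes

def build_vocab(symbols):
    """Map each symbol to an integer id, reserving 0 for padding."""
    return {symbol: index + 1 for index, symbol in enumerate(sorted(set(symbols)))}

def encode(sequence, vocab, length):
    """Convert a string to a fixed-length list of ids (unknown symbols and padding map to 0)."""
    ids = [vocab.get(symbol, 0) for symbol in sequence[:length]]
    return ids + [0] * (length - len(ids))

protein_vocab = build_vocab(AMINO_ACIDS)
# Toy vocabulary built from the characters of one SMILES string (aspirin).
smiles_vocab = build_vocab("CC(=O)OC1=CC=CC=C1C(=O)O")
```

The two id sequences would then be passed to separate embedding layers before the model predicts a binding affinity.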
Summary
We performed RNA-seq and high-resolution mass spectrometry on 128 blood samples from COVID-19-positive and COVID-19-negative patients with diverse disease severities and outcomes. Quantified transcripts, proteins, metabolites, and lipids were associated with clinical outcomes in a curated relational database, uniquely enabling systems analysis and cross-ome correlations to molecules and patient prognoses. We mapped 219 molecular features with high significance to COVID-19 status and severity, many of which were involved in complement activation, dysregulated lipid transport, and neutrophil activation. We identified sets of covarying molecules, e.g., protein gelsolin and metabolite citrate or plasmalogens and apolipoproteins, offering pathophysiological insights and therapeutic suggestions. The observed dysregulation of platelet function, blood coagulation, acute phase response, and endotheliopathy further illuminated the unique COVID-19 phenotype. We present a web-based tool (covid-omics.app) enabling interactive exploration of our compendium and illustrate its utility through a machine learning approach for prediction of COVID-19 severity.
https://www.sciencedirect.com/science/article/pii/S2405471220303719?via%3Dihub
https://covid-omics.app:8080/
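key: Dataset: RNA-seq and high-resolution mass spectrometry on 128 blood samples from COVID-19-positive and COVID-19-negative patients
Cross-ome analysis identifies molecular features associated with disease severity. The 219 molecular features indicate that key biological processes are indeed regulated under COVID-19, including complement system activation, lipid transport, vascular injury, platelet activation and degranulation, blood coagulation, and the acute phase response. The authors also provide an application example that uses this resource to build a disease severity prediction model from all the omics data.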
Abstract
Motivation
Gene network inference and master regulator analysis (MRA) have been widely adopted to define specific transcriptional perturbations from gene expression signatures. Several tools exist to perform such analyses but most require a computer cluster or large amounts of RAM to be executed.
Results
We developed corto, a fast and lightweight R package to infer gene networks and perform MRA from gene expression data, with optional corrections for copy-number variations and able to run on signatures generated from RNA-Seq or ATAC-Seq data. We extensively benchmarked it to infer context-specific gene networks in 39 human tumor and 27 normal tissue datasets.
key:
1. Gene network inference
2. Copy-number variation correction
3. MRA
4. Network enrichment visualization
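Step 1, gene network inference, can be illustrated with a plain correlation-threshold network. This is a deliberate simplification written in Python and not corto's procedure or API. The threshold value, gene names and function name are assumptions.

```python
import numpy as np

def correlation_network(expression, regulators, threshold=0.8):
    """Link each regulator to genes whose Pearson correlation magnitude passes threshold.

    expression: dict mapping gene name -> list of expression values across samples.
    """
    genes = list(expression)
    corr = np.corrcoef(np.array([expression[g] for g in genes], dtype=float))
    index = {gene: i for i, gene in enumerate(genes)}
    network = {}
    for regulator in regulators:
        row = corr[index[regulator]]
        network[regulator] = {
            gene: float(row[index[gene]])
            for gene in genes
            if gene != regulator and abs(row[index[gene]]) >= threshold
        }
    return network
```

The sign of each stored correlation separates activated targets from repressed ones, which a later MRA step can use when scoring a regulator against an expression signature.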
Abstract
An updated Lnc2Cancer 3.0 (http://www.bio-bigdata.net/lnc2cancer or http://bio-bigdata.hrbmu.edu.cn/lnc2cancer) database, which includes comprehensive data on experimentally supported long non-coding RNAs (lncRNAs) and circular RNAs (circRNAs) associated with human cancers. In addition, web tools for analyzing lncRNA expression by high-throughput RNA sequencing (RNA-seq) and single-cell RNA-seq (scRNA-seq) are described. Lnc2Cancer 3.0 was updated with several new features, including (i) Increased cancer-associated lncRNA entries over the previous version. The current release includes 9254 lncRNA-cancer associations, with 2659 lncRNAs and 216 cancer subtypes. (ii) Newly adding 1049 experimentally supported circRNA-cancer associations, with 743 circRNAs and 70 cancer subtypes. (iii) Experimentally supported regulatory mechanisms of cancer-related lncRNAs and circRNAs, involving microRNAs, transcription factors (TF), genetic variants, methylation and enhancers were included. (iv) Appending experimentally supported biological functions of cancer-related lncRNAs and circRNAs including cell growth, apoptosis, autophagy, epithelial mesenchymal transformation (EMT), immunity and coding ability. (v) Experimentally supported clinical relevance of cancer-related lncRNAs and circRNAs in metastasis, recurrence, circulation, drug resistance, and prognosis was included. Additionally, two flexible online tools, including RNA-seq and scRNA-seq web tools, were developed to enable fast and customizable analysis and visualization of lncRNAs in cancers. Lnc2Cancer 3.0 is a valuable resource for elucidating the associations between lncRNA, circRNA and cancer.
key:
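The roles of lncRNAs and circRNAs in cancer-related regulatory mechanisms, including enhancers, genetic variants, microRNA (miRNA) interactions, transcription factors (TFs), and methylation.
webtools: analyze lncRNA expression from RNA-seq and scRNA-seq data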
Abstract
There is an urgent need to better understand the pathophysiology of Coronavirus disease 2019 (COVID-19), the global pandemic caused by SARS-CoV-2, which has infected more than three million people worldwide [1]. Approximately 20% of patients with COVID-19 develop severe disease and 5% of patients require intensive care [2]. Severe disease has been associated with changes in peripheral immune activity, including increased levels of pro-inflammatory cytokines [3,4] that may be produced by a subset of inflammatory monocytes [5,6], lymphopenia [7,8] and T cell exhaustion [9,10]. To elucidate pathways in peripheral immune cells that might lead to immunopathology or protective immunity in severe COVID-19, we applied single-cell RNA sequencing (scRNA-seq) to profile peripheral blood mononuclear cells (PBMCs) from seven patients hospitalized for COVID-19, four of whom had acute respiratory distress syndrome, and six healthy controls. We identify reconfiguration of peripheral immune cell phenotype in COVID-19, including a heterogeneous interferon-stimulated gene signature, HLA class II downregulation and a developing neutrophil population that appears closely related to plasmablasts appearing in patients with acute respiratory failure requiring mechanical ventilation. Importantly, we found that peripheral monocytes and lymphocytes do not express substantial amounts of pro-inflammatory cytokines. Collectively, we provide a cell atlas of the peripheral immune response to severe COVID-19.
https://www.nature.com/articles/s41591-020-0944-y
key:
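Seven patients hospitalized for COVID-19, four of whom had acute respiratory distress syndrome, and six healthy controls; scRNA-seq data from peripheral blood mononuclear cells.
What analyses were done?
1. Single-cell transcriptional atlas of peripheral immune cells in severe COVID-19
Principal component analysis, cell clustering, phenotypic differences
2. Quantification of COVID-19-driven changes in cell-type proportions, and discovery of a novel neutrophil subpopulation
3. Further dimensionality-reduction analysis of monocytes
4. Identification of the genes driving immune-cell phenotype changes in COVID-19 samples: HLA class II downregulation, heterogeneity of interferon-stimulated genes
5. Analysis of T cells and NK cells in COVID-19 samples, and discovery of a phenotypic continuum between plasmablasts and neutrophils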
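Analysis 2, quantifying shifts in cell-type proportions, reduces to counting labelled cells per sample and comparing fractions. A minimal sketch, assuming each cell already carries a cell-type label from clustering; the labels and function names are hypothetical.

```python
from collections import Counter

def cell_type_proportions(labels):
    """Fraction of cells assigned to each cell type within one sample."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {cell_type: count / total for cell_type, count in counts.items()}

def proportion_shift(case_labels, control_labels):
    """Difference in cell-type proportions (case minus control) for every observed type."""
    case = cell_type_proportions(case_labels)
    control = cell_type_proportions(control_labels)
    return {
        cell_type: case.get(cell_type, 0.0) - control.get(cell_type, 0.0)
        for cell_type in sorted(set(case) | set(control))
    }
```

A cell type that appears only in the patient samples, such as a new neutrophil population, shows up as a positive shift with a control proportion of zero. A real analysis would compare per-donor proportions with a statistical test, not pooled counts.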
Abstract
Viruses are a constant threat to global health as highlighted by the current COVID-19 pandemic. Currently, lack of data underlying how the human host interacts with viruses, including the SARS-CoV-2 virus, limits effective therapeutic intervention. We introduce Viral-Track, a computational method that globally scans unmapped single-cell RNA sequencing (scRNA-seq) data for the presence of viral RNA, enabling transcriptional cell sorting of infected versus bystander cells. We demonstrate the sensitivity and specificity of Viral-Track to systematically detect viruses from multiple models of infection, including hepatitis B virus, in an unsupervised manner. Applying Viral-Track to bronchoalveolar-lavage samples from severe and mild COVID-19 patients reveals a dramatic impact of the virus on the immune system of severe patients compared to mild cases. Viral-Track detects an unexpected co-infection of the human metapneumovirus, present mainly in monocytes perturbed in type-I interferon (IFN)-signaling. Viral-Track provides a robust technology for dissecting the mechanisms of viral-infection and pathology.
key:
Viral-Track scans scRNA-seq data for the presence of viral RNA, thereby classifying cells as infected or uninfected (bystander).
Summary
Diabetes is associated with increased mortality from severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2). Given literature suggesting a potential association between SARS-CoV-2 infection and diabetes induction, we examined pancreatic expression of angiotensin-converting enzyme 2 (ACE2), the key entry factor for SARS-CoV-2 infection. Specifically, we analyzed five public scRNA-seq pancreas datasets and performed fluorescence in situ hybridization, western blotting, and immunolocalization for ACE2 with extensive reagent validation on normal human pancreatic tissues across the lifespan, as well as those from coronavirus disease 2019 (COVID-19) cases. These in silico and ex vivo analyses demonstrated prominent expression of ACE2 in pancreatic ductal epithelium and microvasculature, but we found rare endocrine cell expression at the mRNA level. Pancreata from individuals with COVID-19 demonstrated multiple thrombotic lesions with SARS-CoV-2 nucleocapsid protein expression that was primarily limited to ducts. These results suggest SARS-CoV-2 infection of pancreatic endocrine cells, via ACE2, is an unlikely central pathogenic feature of COVID-19-related diabetes.
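key: Diabetes is associated with high COVID-19 mortality; the study explores this association.
ACE2 is the key entry factor for SARS-CoV-2 infection.
Data: normal pancreatic tissue and pancreatic tissue from COVID-19 cases. ACE2 is prominently expressed in pancreatic ductal epithelium and microvasculature, but endocrine cell expression is rare at the mRNA level. Pancreata from individuals with COVID-19 showed multiple thrombotic lesions, with SARS-CoV-2 nucleocapsid protein expression primarily limited to ducts. These results suggest that SARS-CoV-2 infection of pancreatic endocrine cells via ACE2 is unlikely to be a central pathogenic feature of COVID-19-related diabetes.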
Abstract
Single-cell RNA sequencing (scRNA-seq) technologies allow researchers to uncover the biological states of a single cell at high resolution. For computational efficiency and easy visualization, dimensionality reduction is necessary to capture gene expression patterns in low-dimensional space. Here we propose an ensemble method for simultaneous dimensionality reduction and feature gene extraction (EDGE) of scRNA-seq data. Different from existing dimensionality reduction techniques, the proposed method implements an ensemble learning scheme that utilizes massive weak learners for an accurate similarity search. Based on the similarity matrix constructed by those weak learners, the low-dimensional embedding of the data is estimated and optimized through spectral embedding and stochastic gradient descent. Comprehensive simulation and empirical studies show that EDGE is well suited for searching for meaningful organization of cells, detecting rare cell types, and identifying essential feature genes associated with certain cell types.
key:
Dimensionality reduction and feature extraction for single-cell data
Learns the similarity between cells
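The idea of learning cell-cell similarity from many weak learners can be sketched as a voting scheme: each weak learner makes a crude binary split of the cells, and the similarity of two cells is the fraction of learners that put them on the same side. This is not the EDGE algorithm itself. The random-projection split, the parameter defaults and the function name are assumptions.

```python
import numpy as np

def ensemble_similarity(expression, n_learners=200, genes_per_learner=2, seed=0):
    """Fraction of weak learners that place each pair of cells in the same bucket.

    expression: array of shape (cells, genes).
    """
    rng = np.random.default_rng(seed)
    expression = np.asarray(expression, dtype=float)
    n_cells, n_genes = expression.shape
    votes = np.zeros((n_cells, n_cells))
    for _ in range(n_learners):
        genes = rng.choice(n_genes, size=genes_per_learner, replace=False)
        weights = rng.normal(size=genes_per_learner)
        projection = expression[:, genes] @ weights       # random 1-D projection
        bucket = projection > np.median(projection)       # weak binary split
        votes += bucket[:, None] == bucket[None, :]       # +1 where two cells agree
    return votes / n_learners
```

The resulting similarity matrix is what a spectral embedding step would take as input to produce low-dimensional coordinates.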