Genomic landscape and genetic manipulation of the black soldier fly Hermetia illucens, a natural waste recycler
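On November 25, 2019, the teams of Yongping Huang (Institute of Plant Physiology and Ecology, Shanghai Institutes for Biological Sciences) and Jibin Zhang (Huazhong Agricultural University), together with colleagues, published a research paper online in Cell Research entitled "Genomic landscape and genetic manipulation of the black soldier fly Hermetia illucens, a natural waste recycler". The study reports a high-quality genome assembly of the black soldier fly (BSF) and, using CRISPR/Cas9 gene editing, obtains a genotype that markedly increases BSF feeding capacity. It provides valuable genomic and technical resources for optimizing BSF lines for industrial use.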
Abstract
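The black soldier fly is a dipteran insect of the family Stratiomyidae that converts organic matter into resources usable as animal feed. Its genome is about 1.1 Gb and contains 16,770 protein-coding genes. Compared with other dipterans, the BSF genome shows large gene expansions in functional groups linked to septic adaptation (adaptation to decaying environments), including immune system factors, olfactory receptors, and cytochrome P450s. The midgut transcriptome is strongly enriched for pathways related to the digestive system and to defense against bacteria. The microbiome of BSF larvae fed representative organic wastes shows that Firmicutes bacteria are the most abundant gut microbes. A genotype with enhanced feeding capacity was obtained with CRISPR/Cas9-based technology.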
Data availability: NCBI under BioProjectID PRJNA547968 and SRA under SRR10158821.
Introduction
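As the human population grows rapidly, more and more organic waste is produced. It is mainly disposed of in three ways: incineration, landfill, and composting. All of these methods cause some degree of secondary environmental pollution. The black soldier fly is regarded as the only insect worldwide that can be used as a feed ingredient for aquaculture and poultry. It efficiently converts organic waste into protein, fat, and other products while reducing carbon dioxide emissions and contamination by pathogens and antibiotics. Building on advances in sequencing technology, the paper combines genomics, transcriptomics, metagenomics, and genetic manipulation to explore the genetic basis of BSF biology.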
RESULTS AND DISCUSSION
Characteristics of the BSF genome
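The sequenced sample came from a line inbred for 10 generations. Illumina sequencing at roughly 300× depth, including paired-end libraries of short inserts and mate-pair libraries of long inserts, yielded 1102 Mb of assembled scaffolds with a 1.69 Mb N50 length. The BSF genome is large because of transposons, repetitive noncoding DNA, and a large amount of other repeat sequence.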
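The N50 quoted above is a standard contiguity statistic: the length L such that scaffolds of length L or longer together cover at least half of the assembly. A minimal Python sketch of the calculation follows; the toy scaffold lengths are made up for illustration and are not taken from the BSF assembly.

```python
def n50(lengths):
    """Return the N50 of a list of sequence lengths.

    The N50 is the length L such that sequences of length >= L
    together cover at least half of the total assembly size.
    """
    if not lengths:
        raise ValueError("no sequence lengths given")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical scaffold lengths (arbitrary units), for illustration only.
toy_scaffolds = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_scaffolds))
```

In practice the lengths would be read from the scaffold FASTA file rather than typed in by hand.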
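Sequencing depth and GC content follow a normal distribution, indicating little contamination in the assembly.
A total of 16,770 protein-coding genes were predicted by combining homology alignments against six dipteran species, transcriptome data from 12 consecutive BSF developmental stages, and three ab initio gene sets.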
Comparison of the BSF genome with those of other dipterans
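The BSF genome fills a gap between the Nematocera, the earliest diverging suborder of Diptera, and more recent flies (Brachycera).
Analysis of nonsynonymous-to-synonymous substitution (dN/dS) ratios between BSF and both the house fly and the fruit fly identified 342 genes with high dN/dS ratios. These rapidly evolving genes are mainly enriched in ribosome-related functional modules that take part in protein synthesis, as well as in amino acid metabolism and immune-related pathways. This may reflect the long-term life of BSF in protein-rich and pathogen-rich environments.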
b. Identification of pathways that have rapidly evolved in BSF. dN/dS ratios were calculated independently in two parallel evolutionary lineages, M. domestica and D. melanogaster, using BSF as the common ancestor. Each dot indicates the median dN/dS ratios of all related genes in the corresponding pathway. Significantly enriched (FDR-adjusted P < 0.05), rapidly evolving genes in KEGG pathways are highlighted in red.
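BSF expresses 1798 species-specific duplicated genes, the largest number among the Brachycera. These genes are mainly expressed in the final larval stage, which may be related to its waste-converting feeding behavior.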
Expansions in gene families are related to BSF environmental interactions
Compared with other dipterans, BSF shows large expansions in genes related to detoxification enzymes, olfactory reception, immune factors, and immune pathways, which is consistent with its environmental adaptation.
Fig. 3 Expansions in gene families related to BSF environmental adaptation. a Number of gene copies in the indicated families related to environmental adaptation in dipteran species. The area size of each pie indicates the relative gene number in each family. b–e Phylogenetic relationships across three dipteran species for gene families with prominent expansions in BSF: gram-negative binding proteins (b), cecropin antimicrobial peptides (c), olfactory receptors (d), cytochrome P450s (e). Phylogenetic trees were estimated using the maximum likelihood method.
Intestinal transcriptome of BSF larvae fed on organic waste
The authors fed BSF larvae with food waste, poultry manure, dairy manure, or swine manure, and dissected midguts on days 4, 6, 8, and 12 for transcriptome analysis.
Fig. 4 Intestinal transcriptome in BSF larvae fed with organic waste. Midguts of BSF larvae fed with food waste (FW), poultry manure (PM),
dairy manure (DM), or swine manure (SM) were sampled on days 4, 6, 8, and 12 of feeding with the indicated diet. The samples were subjected
to RNA-seq. a Distributions of expressed genes (n = 9417) across 16 samples: Genes expressed at each time point under each type of diet are
labeled “All”; those expressed in 15 out of 16 samples are labeled “Almost all”; genes commonly expressed under each diet but not at every
time point are labeled “Broad”; genes only expressed in one sample are labeled “Orphan”; genes only expressed by larvae fed with manure are
labeled “Manure”; and genes only expressed in larvae fed with food waste are labeled “Waste”. b Principal component analysis of intestinal
samples based on their overall expression profiles. The first two eigenvectors that explained 34.2% and 20.4% of the variance are plotted. c
Venn diagram of the 500 most highly expressed genes (~5% of all expressed genes), selected for each type of diet based on the average
expression values across all time points. A total of 326 genes were expressed by larvae fed all four diets. d The 326 genes expressed by larvae
fed all four diets were subjected to KEGG enrichment analysis. Pathways in blue belong to digestive systems, and pathways in red indicate
those related to infectious diseases. Gene counts are presented as histograms. Hypergeometric test (FDR-adjusted): *P < 0.05, ***P < 0.005,
****P < 0.001. e A representative gene cluster specific to BSF and highly expressed in larvae fed with organic waste. Genomic organization in
BSF and the homologous region in D. melanogaster are shown. Homolog pairs between these species are linked by lines. Genes in green and
blue indicate BSF-specific genes that belong to two ortholog groups. These 14 genes do not have homology to genes of any other sequenced
invertebrate species. Note that this cluster is located in the end of an assembled BSF scaffold. The heatmap shows the expression pattern of
corresponding genes in BSF larvae fed with the other diets at each of the four time points.
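The Fig. 4d caption mentions a hypergeometric test with FDR adjustment for the KEGG enrichment. The sketch below shows one common way to do this in plain Python: an upper-tail hypergeometric p-value followed by Benjamini-Hochberg adjustment. The choice of Benjamini-Hochberg is an assumption (the caption only says "FDR-adjusted"), and the pathway size and hit count in the example are hypothetical; only the background of 9417 expressed genes and the 326 shared genes come from the caption.

```python
from math import comb


def hypergeom_sf(k, M, n, N):
    """P(X >= k): chance of seeing at least k pathway genes when N genes
    are drawn from a background of M genes, n of which are in the pathway."""
    upper = min(n, N)
    total = comb(M, N)
    hits = sum(comb(n, i) * comb(M - n, N - i) for i in range(k, upper + 1))
    return hits / total


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest to enforce monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Background and gene-list sizes from the Fig. 4 caption; the pathway size (120)
# and the number of hits (15) are hypothetical values for illustration.
p = hypergeom_sf(k=15, M=9417, n=120, N=326)
print(p)
print(benjamini_hochberg([p, 0.2, 0.8]))
```

Exact integer binomials keep the p-value accurate for a background of this size without needing an external statistics library.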
Microbiota of BSF larvae fed on organic wastes
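16S rRNA sequencing was used to profile the composition and abundance of gut microbes in BSF larvae across diets and time points. Larvae fed dairy manure or swine manure harbored a more diverse set of microbial taxa. Unlike the midgut transcriptome, whose expression profiles showed no regular pattern, the gut microbial community correlated strongly with diet. Firmicutes were the most species-rich phylum. As the paper puts it, "Firmicutes have an important role in digestion of animal manure as these bacteria secrete a variety of proteases and pectinases and are involved in degradation of indigestible carbohydrates in straw-related compost."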
Fig. 5 Microbiome of BSF larvae fed with different types of organic waste. a Within-sample diversity estimates of the bacterial communities in
larvae fed with the indicated diets. b Constrained principal coordinate analysis of between-sample diversity. Bray-Curtis distances between
samples constrained by diets plotted for the first two CPCoAs. c The dynamic landscape of OTUs across all communities at a phylum level.
OTU richness is indicated by the area of corresponding symbols. Symbols indicate counts of contained sequences. Colors indicate the fraction
of target OTUs relative to all OTUs of the corresponding sample.
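The within-sample and between-sample diversity in Fig. 5 can be illustrated with two small functions. The caption names Bray-Curtis distances for the between-sample comparison; the use of the Shannon index for within-sample diversity is an assumption, since the notes do not say which alpha-diversity metric was plotted. The OTU counts are invented for illustration.

```python
from math import log


def shannon(counts):
    """Shannon diversity index (natural log) of one sample's OTU counts."""
    total = sum(counts)
    if total == 0:
        raise ValueError("sample has no reads")
    return -sum((c / total) * log(c / total) for c in counts if c > 0)


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two samples with the same OTU order."""
    if len(a) != len(b):
        raise ValueError("samples must list the same OTUs")
    denominator = sum(a) + sum(b)
    if denominator == 0:
        raise ValueError("both samples are empty")
    return sum(abs(x - y) for x, y in zip(a, b)) / denominator


# Hypothetical OTU count vectors for two larval gut samples.
sample_fw = [120, 30, 0, 50]
sample_sm = [60, 80, 40, 20]
print(shannon(sample_fw), shannon(sample_sm))
print(bray_curtis(sample_fw, sample_sm))
```

A value of 0 means two samples have identical OTU counts, and 1 means they share no OTUs.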
Genetic manipulation to facilitate the utilization of BSF larvae
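The main idea is to make BSF eat more during the larval stage, increasing its capacity to convert organic matter, and to reduce how far adults can move, so that large populations can be built up.
First, insect metamorphosis is controlled by a series of hormones and neuropeptides, and prothoracicotropic hormone (Ptth) controls the synthesis and release of ecdysone. Knocking out Ptth effectively prolongs the time from larva to pupa. The authors used two sgRNAs, targeted to the second and fourth exons, to disrupt HiPtth substantially in vivo. The last larval instar increased from 4–5 days in controls to > 85 days in mutant larvae of any mosaic forms of disrupted HiPtth. Body size and body weight also increased markedly, probably because of the extended feeding period.
Second, a BSF homolog of Vestigial (Vg), a gene that determines wing size and shape in Drosophila, was identified by homology alignment with Drosophila wing development genes. Knocking it out produced wingless adults without affecting adult development.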
Fig. 6 Mutagenesis of Ptth leads to increased feeding capacity in BSF larvae. The CRISPR/Cas9 system was used to induce mutations at the
HiPtth locus in H. illucens. a Schematic representation of the exon/intron boundaries of the HiPtth gene. Exons are shown as boxes; thin lines
represent introns; numbers are fragment lengths in base pairs (bp). Target site (TS) locations are noted and PAM sequences are shown in red.
b Sequences of the targeted region in the HiPtth locus in the mutants. The PAM sequence is in red. The numbers of nucleotides deleted in
each line are indicated on the right. c Morphology of HiPtth mutants showing their greater size relative to wild type (WT) controls. d Average
body weights of mutants and control (n = 30; mean values ± SEM).
Fig. 7 Mutagenesis of Vg in BSF eliminates wings in adults.
a Schematic representation of the exon/intron boundaries of HiVg.
Exons are shown as boxes and thin lines represent the introns.
Target site (TS) locations are noted and PAM sequences are shown in
red. b Sequences of the targeted region in the corresponding loci of
Vg mutants. The PAM sequence is in red. The numbers of
nucleotides deleted in each line are indicated on the right.
c Phenotypic images show that Vg mutants lack wings in the
adult stage.
MATERIALS AND METHODS
Genome sequencing
DNA was extracted from a single pupa for genome sequencing. Contigs and scaffolds were built mainly from paired-end and mate-pair libraries with different insert sizes.
Genome assembly
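K-mer analysis was used to estimate genome size. Seqtk v1.0 was used to trim adaptors and low-quality bases. K-mers (21-mers) were counted with jellyfish. Heterozygosity and other genome characteristics were estimated with GenomeScope.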
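The first-order idea behind k-mer based genome size estimation can be sketched in a few lines: count k-mers in the reads, find the main coverage peak in the k-mer depth histogram, and divide the total number of k-mers by that peak depth. This is only an illustration of the principle, not what jellyfish or GenomeScope do internally (GenomeScope fits a statistical model to the histogram); the `min_depth` cutoff for discarding likely error k-mers is an assumed parameter.

```python
from collections import Counter


def count_kmers(reads, k=21):
    """Count every k-mer occurring in a collection of read strings."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def estimate_genome_size(kmer_counts, min_depth=2):
    """Estimate genome size as total k-mers divided by the peak k-mer depth.

    K-mers seen fewer than min_depth times are treated as sequencing errors
    (an assumed, simplistic filter).
    """
    # histogram: depth -> number of distinct k-mers observed at that depth
    histogram = Counter(kmer_counts.values())
    usable = {d: n for d, n in histogram.items() if d >= min_depth}
    if not usable:
        raise ValueError("no k-mers at or above min_depth")
    peak_depth = max(usable, key=lambda d: usable[d])
    total_kmers = sum(d * n for d, n in usable.items())
    return total_kmers // peak_depth


# Toy data: one short sequence "read" 10 times, with k=4 instead of 21.
toy_reads = ["ACGTTGCA"] * 10
print(estimate_genome_size(count_kmers(toy_reads, k=4)))
```

On real data the histogram would come from a dedicated counter such as jellyfish, since a Python dictionary does not scale to a 1.1 Gb genome at 300× depth.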
- MiSeq read pairs were utilized to assemble contigs using DiscovarDeNovo. Initial contigs were processed by redundans v0.11c63 to remove potential redundant sequences.
- The paired-end read information from the long libraries was used step by step from 800-bp to 13-kb insert size to join contigs into scaffolds using SSPACE.
- The remaining gaps within scaffolds were iteratively filled with paired-end reads of 250-bp and 800-bp inserts using GapCloser available in SOAPdenovo.
- CEGMA (Core Eukaryotic Genes Mapping Approach) and BUSCO (Benchmarking Universal Single-Copy Orthologs) were used to assess the quality of the genome assembly.
Genome annotation
Repeat annotation
- Tandem Repeats Finder to annotate the tandem repeats (Tandem Repeats Database)
- RepeatModeler to construct a de novo repeat library
- RepeatMasker to search similar TEs against the known Repbase TE library and de novo repeat library
- LTR_FINDER to find long terminal repeats (LTRs)
Protein-coding gene annotation
- transcriptome evidence: RNA-seq data from 12 consecutive BSF developmental stages with two biological replicates; HISAT2 to map RNA-seq reads to the reference genome and StringTie to predict exons.
- homolog alignments: GeneWise with protein inputs from six dipteran species.
- ab initio gene annotation: three independent gene predictors were applied to generate ab initio signatures, including AUGUSTUS, SNAP, and Genscan.
The outputs of these three pipelines were finally integrated with GLEAN to produce a consensus gene set.
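Functional annotation of specific gene families required manual curation. TBLASTN searches with dipteran homologs were used to locate their genomic loci, gene structures were predicted with GeneWise, and conserved domains and biological pathways were obtained from KEGG KO annotation. Gene family contractions and expansions were assessed by running a local InterProScan search over the dipteran genomes. Gene expression was quantified with salmon, using TPM as the normalized expression value.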
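TPM (transcripts per million) normalizes read counts first by transcript length and then by the sample total, so that values sum to one million in every sample. The sketch below shows the arithmetic on made-up numbers; salmon computes TPM itself from its own abundance estimates and effective lengths, so this only illustrates the definition.

```python
def tpm(read_counts, effective_lengths):
    """Convert per-transcript read counts to TPM.

    Each count is divided by its transcript's effective length, and the
    resulting rates are rescaled so that they sum to one million.
    """
    if len(read_counts) != len(effective_lengths):
        raise ValueError("counts and lengths must have the same size")
    rates = [c / l for c, l in zip(read_counts, effective_lengths)]
    total = sum(rates)
    if total == 0:
        raise ValueError("no reads assigned to any transcript")
    return [r / total * 1e6 for r in rates]


# Hypothetical counts and effective lengths (bp) for three transcripts.
print(tpm([10, 20, 30], [1000, 2000, 1000]))
```

Because each sample is rescaled to the same total, TPM values are comparable across the 16 intestinal samples in a way that raw counts are not.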
Comparative genomics
OrthoMCL was used to identify the final orthologs, inparalogs, and co-orthologs. Multiple alignments of protein sequences for each group were performed using Muscle, and Gblocks was used to identify conserved blocks. Conserved blocks were finally concatenated to 10 super genes with 255,475 amino acids, which were used to quantify the maximum likelihood phylogeny using RAxML.
Codeml from the PAML package was used to calculate dN/dS ratios under the F3X4 codon frequency.
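Under the F3X4 model, expected codon frequencies are derived from the nucleotide frequencies at each of the three codon positions rather than estimated as free parameters. A minimal sketch of that idea follows, assuming the standard genetic code and renormalization over the 61 sense codons; it illustrates the model only and is not PAML's code.

```python
from itertools import product

STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code (assumed)


def f3x4(codons):
    """Expected sense-codon frequencies from position-specific base frequencies.

    codons: list of observed codons (strings of A, C, G, T of length 3).
    """
    if not codons:
        raise ValueError("no codons given")
    n = len(codons)
    # Nucleotide frequencies at codon positions 1, 2 and 3.
    pos_freq = []
    for pos in range(3):
        counts = {base: 0 for base in "ACGT"}
        for codon in codons:
            counts[codon[pos]] += 1
        pos_freq.append({base: counts[base] / n for base in "ACGT"})
    # Product of the three position frequencies, skipping stop codons.
    raw = {}
    for triple in product("ACGT", repeat=3):
        codon = "".join(triple)
        if codon not in STOP_CODONS:
            raw[codon] = (pos_freq[0][codon[0]]
                          * pos_freq[1][codon[1]]
                          * pos_freq[2][codon[2]])
    norm = sum(raw.values())
    return {codon: value / norm for codon, value in raw.items()}


# Toy coding sequence split into codons, for illustration only.
toy = ["ATG", "GCT", "GCC", "AAA", "GGT", "CTG"]
freqs = f3x4(toy)
print(len(freqs), round(sum(freqs.values()), 6))
```

The model uses nine free frequency parameters (three per codon position) instead of 60, which is why it is a common default for dN/dS estimation.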
Functional enrichment analyses were performed via an online OMICSHARE cloud platform (http://www.omicshare.com/tools/Home/Soft/pathwaygsea).
Analysis of the BSF intestinal transcriptome
- Each sample was independently mapped to the reference genome and subjected to expression profiling using the mode “quant” of salmon with the parameter “--validateMappings”. All independent profiles were finally merged to a TPM matrix using the mode “quantmerge” of salmon.
- Expression profile-based principal component analysis was performed using the built-in R function “prcomp”.
Metagenomic analyses of BSF intestinal microbiota
16S rRNA sequencing of the gut microbiota.
- Clean read pairs were merged using the built-in command “join_paired_ends.py” from QIIME.
- OTU analyses were performed by VSEARCH. Within- and between-sample diversities were estimated by the built-in QIIME scripts “alpha_diversity.py” and “beta_diversity.py”, respectively.
- The dynamic landscape of OTUs was generated using the online platform, SILVAngs (https://www.arb-silva.de/ngs).
Mutagenesis of BSF target genes
Predicted ORFs of HiPtth and HiVg were obtained by homology alignment with other dipteran insects. With the PAM sequences in consideration, newly designed sgRNAs should follow the NNN19GG rule.
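Scanning an ORF for candidate Cas9 target sites can be sketched as a simple pattern search. The example below assumes the common SpCas9 convention of a 20-nt protospacer immediately followed by an NGG PAM, which may differ in detail from the rule the authors applied. It searches the forward strand only, and the input sequence is a made-up string, not HiPtth or HiVg.

```python
import re


def find_sgrna_sites(seq, protospacer_len=20):
    """Return (start, protospacer, PAM) for every forward-strand site where a
    protospacer of the given length is directly followed by an NGG PAM."""
    seq = seq.upper()
    # A lookahead lets overlapping candidate sites all be reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % protospacer_len)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]


# Hypothetical sequence for illustration only.
toy_orf = "ATGCATTACGGATCCGATTACGCTAGCTAGGCTTAAGGCCGATCGATTAGG"
for start, protospacer, pam in find_sgrna_sites(toy_orf):
    print(start, protospacer, pam)
```

A real design step would also scan the reverse complement and screen candidates for off-target matches elsewhere in the genome.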
Fertilized eggs were collected within 1 h and microinjection was performed within 2 h of oviposition. Cas9 protein (200 ng/μL) with the sgRNA-1 (100 ng/μL) and sgRNA-2 (100 ng/μL) molecules were co-injected into preblastoderm embryos.
First instar larvae were selected for genomic DNA preparation. Fragments covering the two targeting sites were amplified. The amplified fragments were cloned into a pJET1.2 vector (Fermentas) and sequenced on the Sanger platform.