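Project article | Lanzhou University reveals the molecular mechanisms of anthocyanin accumulation and fast growth in elephant grass at the whole-genome level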

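Recently, a genome study of elephant grass carried out by the College of Pastoral Agriculture Science and Technology, Lanzhou University, together with the Guangxi Institute of Animal Sciences and the International Livestock Research Institute (ILRI), was published online in the international journal Molecular Ecology Resources (3-year IF = 7.15) under the title “The elephant grass (Cenchrus purpureus) genome provides insights into anthocyanidin accumulation and fast growth”. Nextomics (希望组) provided the Illumina, Nanopore and Hi-C sequencing services for the study and carried out the genome assembly and annotation. The study reports the first high-quality chromosome-level genome of elephant grass, clarifies its evolutionary position, explains at the gene level how the purple cultivar accumulates the anthocyanins behind its colour, and proposes that expansion of the C4 photosynthesis and hormone signal transduction pathways may contribute to the rapid growth of elephant grass [1].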

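Elephant grass (Cenchrus purpureus Schumach), so named because elephants like to feed on it, is a large perennial herb in the tribe Paniceae of the grass family (Poaceae) and is native to Africa. Because of its high biomass, fast growth and strong adaptability, it is widely planted as an important forage crop throughout the tropics and subtropics. Its advantages for bioenergy also make it a potential energy grass. The study is a major advance in elephant grass research and provides a theoretical basis for work on its evolution, trait improvement and functional genes.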

Figure 1 Purple elephant grass

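The team used purple elephant grass (Cenchrus purpureus cv. Purple) as material. K-mer analysis indicated relatively high heterozygosity (1.5%). The genome was sequenced with Illumina, Nanopore and Hi-C. Assembly with a NextDenovo + SMARTdenovo strategy produced a 1.97 Gb genome with a contig N50 of 1.83 Mb and a longest contig of 15.1 Mb. Hi-C data, together with a genetic linkage map, were then used to anchor the assembly onto 14 chromosomes, with an anchoring rate of 96.65%. BUSCO completeness reached 97.8%, and 65,927 genes were predicted and annotated.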
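For readers unfamiliar with the contig N50 statistic quoted above, the following is a minimal sketch of how it is commonly computed: sort the contigs from longest to shortest and report the length at which the running total first reaches half of the assembly size. The function name `n50` and the toy lengths are illustrative assumptions and are not taken from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy contig lengths (made-up values, NOT from the elephant grass assembly)
toy_contigs = [10, 8, 6, 4, 2]
print(n50(toy_contigs))  # prints 8
```

In the toy example the total length is 30, so the half-way mark of 15 is first reached by the second-longest contig, giving an N50 of 8.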

Figure 2 Subgenome features of elephant grass

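Elephant grass is an allotetraploid (2n=4x=28) with two subgenomes, A’ and B. Earlier work indicated that the A genome of pearl millet (Cenchrus americanus, 2n=2x=14), a diploid in the same genus, shares higher homology with the elephant grass A’ subgenome. Using synteny analysis, the researchers separated the A’ and B subgenomes, and single-copy gene analysis confirmed that the A’ subgenome is closely related to the pearl millet A genome. The A’A’BB allotetraploid genome originated about 6.61 (4.11-10.92) MYA and has since undergone substantial chromosomal rearrangement. Transcriptome analysis of subgenome expression dominance further suggested that the two subgenomes may perform different functions.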

Figure 3 Mechanism of anthocyanin accumulation in purple elephant grass

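The leaves of the purple cultivar are purple, and the phenylpropanoid, flavonoid and anthocyanin biosynthesis pathways are generally thought to be involved in leaf pigmentation. The researchers examined the purple leaf colour at the genomic and transcriptomic levels. Comparative genomic and transcriptomic analyses showed that the key enzyme genes phenylalanine ammonia-lyase (PAL), cinnamate 4-hydroxylase (C4H), 4-coumarate-CoA ligase (4CL), chalcone synthase (CHS), dihydroflavonol 4-reductase (DFR) and flavonoid 3-O-glucosyltransferase (3GT) have expanded in elephant grass and are significantly highly expressed in leaves; 4CL and DFR have also been under positive selection during evolution.

C4 plants are usually more efficient at carbon fixation and use water more efficiently, which helps them survive in dry environments. They are divided into three subtypes according to the decarboxylation mechanism in the bundle sheath cells: NAD-ME, NADP-ME and PEPCK. The researchers analysed nine major gene families involved in C4 carbon fixation in elephant grass, covering both enzymes and metabolite transporters, and comparative genomic analysis showed that these families have expanded. Transcriptome data showed that these key enzymes and metabolite transporters are significantly highly expressed in leaves, the main photosynthetic organ, and that all three C4 subtypes coexist in elephant grass. Plant hormones are also important regulators of plant biological processes (development, signalling networks, and responses to biotic and abiotic stress). Analysis of hormone signal transduction pathways at the genomic and transcriptomic levels showed that gene families involved in cell enlargement and cell division have expanded in elephant grass and are highly expressed in stem internode tissue. These results may help explain the rapid growth and high biomass of elephant grass.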

Figure 4 C4 photosynthetic pathway in elephant grass

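Using the newly reported high-quality genome, the study dissects the mechanisms of anthocyanin synthesis and rapid growth in elephant grass, which is important for the molecular breeding of elephant grass as a forage crop and potential energy grass. It also provides an important resource for studying the evolution of the genus and for developing and using other species. Professor Zhang Jiyu of the College of Pastoral Agriculture Science and Technology, Lanzhou University, is the corresponding author; Researcher Yi Xianfeng of the Guangxi Institute of Animal Sciences and Dr. Chris Jones of the International Livestock Research Institute are co-corresponding authors. Yan Qi, a PhD student at the College of Pastoral Agriculture Science and Technology, Lanzhou University, is the first author; Wu Fan and Xu Pan, PhD students in the team, and Sun Zongyi of Nextomics (希望组) are co-first authors.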

1. Yan Q, Wu F, Xu P, Sun ZY, Li J, Gao LJ, Lu LY, Chen DD, Muktar M, Jones C, Yi XF, Zhang JY. The elephant grass (Cenchrus purpureus) genome provides insights into anthocyanidin accumulation and fast growth. Mol Ecol Resour 2020, doi:10.1111/1755-0998.13271

Extensive intraspecific gene order and gene structural variations between Mo17 and other maize genomes

Nature Genetics        Published: 30 July 2018

Abstract: Maize is an important crop with a high level of genome diversity and heterosis. The genome sequence of a typical female line, B73, was previously released. Here, we report a de novo genome assembly of a corresponding male representative line, Mo17. More than 96.4% of the 2,183 Mb assembled genome can be accounted for by 362 scaffolds in ten pseudochromosomes with 38,620 annotated protein-coding genes. Comparative analysis revealed large gene-order and gene structural variations: approximately 10% of the annotated genes were mutually nonsyntenic, and more than 20% of the predicted genes had either large-effect mutations or large structural variations, which might cause considerable protein divergence between the two inbred lines. Our study provides a high-quality reference-genome sequence of an important maize germplasm, and the intraspecific gene order and gene structural variations identified should have implications for heterosis and genome evolution.

Read the original article: https://www.nature.com/articles/s41588-018-0182-0

N6-Methyladenine DNA Modification in the Human Genome

Molecular Cell          Published: 19 July 2018

Abstract: DNA N6-methyladenine (6mA) modification is the most prevalent DNA modification in prokaryotes, but whether it exists in human cells and whether it plays a role in human diseases remain enigmatic. Here, we showed that 6mA is extensively present in the human genome, and we cataloged 881,240 6mA sites accounting for ~0.051% of the total adenines. [G/C]AGG[C/T] was the most significantly associated motif with 6mA modification. 6mA sites were enriched in the coding regions and mark actively transcribed genes in human cells. DNA 6mA and N6-demethyladenine modification in the human genome were mediated by methyltransferase N6AMT1 and demethylase ALKBH1, respectively. The abundance of 6mA was significantly lower in cancers, accompanied by decreased N6AMT1 and increased ALKBH1 levels, and downregulation of 6mA modification levels promoted tumorigenesis. Collectively, our results demonstrate that DNA 6mA modification is extensively present in human cells and the decrease of genomic DNA 6mA promotes human tumorigenesis.

Read the original article: https://www.cell.com/molecular-cell/fulltext/S1097-2765(18)30460-X
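As a small illustration of the degenerate motif quoted in the abstract above, [G/C]AGG[C/T] can be read as "G or C, then AGG, then C or T" and written as a regular expression. The sketch below is only an assumption about how one might scan a sequence for that pattern; the helper name `find_motif` and the toy sequence are hypothetical, it scans the given strand only, and it says nothing about which sites are actually methylated.

```python
import re

# [G/C]AGG[C/T] from the abstract: G or C, then AGG, then C or T.
# The lookahead makes the scan report overlapping occurrences as well.
MOTIF = re.compile(r"(?=([GC]AGG[CT]))")


def find_motif(seq):
    """Return (0-based start, matched 5-mer) for each motif hit on the given strand."""
    return [(m.start(), m.group(1)) for m in MOTIF.finditer(seq.upper())]


# Toy sequence (made up for illustration, not real data)
toy_seq = "ttGAGGCaaCAGGTccGAGGA"
print(find_motif(toy_seq))  # prints [(2, 'GAGGC'), (9, 'CAGGT')]
```

The final GAGGA in the toy sequence is not reported because its last base is A rather than C or T.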

Fern genomes elucidate land plant evolution and cyanobacterial symbioses

Nature Plants                             Published: 02 July 2018

Abstract: Ferns are the closest sister group to all seed plants, yet little is known about their genomes other than that they are generally colossal. Here, we report on the genomes of Azolla filiculoides and Salvinia cucullata (Salviniales) and present evidence for episodic whole-genome duplication in ferns, one at the base of ‘core leptosporangiates’ and one specific to Azolla. One fern-specific gene that we identified, recently shown to confer high insect resistance, seems to have been derived from bacteria through horizontal gene transfer. Azolla coexists in a unique symbiosis with N2-fixing cyanobacteria, and we demonstrate a clear pattern of cospeciation between the two partners. Furthermore, the Azolla genome lacks genes that are common to arbuscular mycorrhizal and root nodule symbioses, and we identify several putative transporter genes specific to Azolla–cyanobacterial symbiosis. These genomic resources will help in exploring the biotechnological potential of Azolla and address fundamental questions in the evolution of plant life.

Read the original article: https://www.nature.com/articles/s41477-018-0188-8

Adaptation and conservation insights from the koala genome

Nature Genetics    Published: 02 July 2018

Abstract: The koala, the only extant species of the marsupial family Phascolarctidae, is classified as ‘vulnerable’ due to habitat loss and widespread disease. We sequenced the koala genome, producing a complete and contiguous marsupial reference genome, including centromeres. We reveal that the koala’s ability to detoxify eucalypt foliage may be due to expansions within a cytochrome P450 gene family, and its ability to smell, taste and moderate ingestion of plant secondary metabolites may be due to expansions in the vomeronasal and taste receptors. We characterized novel lactation proteins that protect young in the pouch and annotated immune genes important for response to chlamydial disease. Historical demography showed a substantial population crash coincident with the decline of Australian megafauna, while contemporary populations had biogeographic boundaries and increased inbreeding in populations affected by historic translocations. We identified genetically diverse populations that require habitat corridors and instituting of translocation programs to aid the koala’s survival in the wild.

Read the original article: https://www.nature.com/articles/s41588-018-0153-5

Oak genome reveals facets of long lifespan

Nature Plants       Published: 18 June 2018

Abstract: Oaks are an important part of our natural and cultural heritage. Not only are they ubiquitous in our most common landscapes but they have also supplied human societies with invaluable services, including food and shelter, since prehistoric times. With 450 species spread throughout Asia, Europe and America, oaks constitute a critical global renewable resource. The longevity of oaks (several hundred years) probably underlies their emblematic cultural and historical importance. Such long-lived sessile organisms must persist in the face of a wide range of abiotic and biotic threats over their lifespans. We investigated the genomic features associated with such a long lifespan by sequencing, assembling and annotating the oak genome. We then used the growing number of whole-genome sequences for plants (including tree and herbaceous species) to investigate the parallel evolution of genomic characteristics potentially underpinning tree longevity. A further consequence of the long lifespan of trees is their accumulation of somatic mutations during mitotic divisions of stem cells present in the shoot apical meristems. Empirical and modelling approaches have shown that intra-organismal genetic heterogeneity can be selected for and provides direct fitness benefits in the arms race with short-lived pests and pathogens through a patchwork of intra-organismal phenotypes. However, there is no clear proof that large-statured trees consist of a genetic mosaic of clonally distinct cell lineages within and between branches. Through this case study of oak, we demonstrate the accumulation and transmission of somatic mutations and the expansion of disease-resistance gene families in trees.

Read the original article: https://www.nature.com/articles/s41477-018-0172-3

High-resolution comparative analysis of great ape genomes

Science           Published: 08 June 2018

Abstract: Genetic studies of human evolution require high-quality contiguous ape genome assemblies that are not guided by the human reference. We coupled long-read sequence assembly and full-length complementary DNA sequencing with a multiplatform scaffolding approach to produce ab initio chimpanzee and orangutan genome assemblies. By comparing these with two long-read de novo human genome assemblies and a gorilla genome assembly, we characterized lineage-specific and shared great ape genetic variation ranging from single– to mega–base pair–sized variants. We identified ~17,000 fixed human-specific structural variants identifying genic and putative regulatory changes that have emerged in humans since divergence from nonhuman apes. Interestingly, these variants are enriched near genes that are down-regulated in human compared to chimpanzee cerebral organoids, particularly in cells analogous to radial glial neural progenitors.

Read the original article: http://science.sciencemag.org/content/360/6393/eaar6343

Genomic variation in 3,010 diverse accessions of Asian cultivated rice

Nature          Published: 25 April 2018

Abstract: Here we analyse genetic variation, population structure and diversity among 3,010 diverse Asian cultivated rice (Oryza sativa L.) genomes from the 3,000 Rice Genomes Project. Our results are consistent with the five major groups previously recognized, but also suggest several unreported subpopulations that correlate with geographic location. We identified 29 million single nucleotide polymorphisms, 2.4 million small indels and over 90,000 structural variations that contribute to within- and between-population variation. Using pan-genome analyses, we identified more than 10,000 novel full-length protein-coding genes and a high number of presence–absence variations. The complex patterns of introgression observed in domestication genes are consistent with multiple independent rice domestication events. The public availability of data from the 3,000 Rice Genomes Project provides a resource for rice genomics research and breeding.

Read the original article: https://www.nature.com/articles/s41586-018-0063-9

Genome sequence of the progenitor of wheat A subgenome Triticum urartu

Nature                 Published: 09 May 2018

Abstract: Triticum urartu (diploid, AA) is the progenitor of the A subgenome of tetraploid (Triticum turgidum, AABB) and hexaploid (Triticum aestivum, AABBDD) wheat. Genomic studies of T. urartu have been useful for investigating the structure, function and evolution of polyploid wheat genomes. Here we report the generation of a high-quality genome sequence of T. urartu by combining bacterial artificial chromosome (BAC)-by-BAC sequencing, single molecule real-time whole-genome shotgun sequencing, linked reads and optical mapping. We assembled seven chromosome-scale pseudomolecules and identified protein-coding genes, and we suggest a model for the evolution of T. urartu chromosomes. Comparative analyses with genomes of other grasses showed gene loss and amplification in the numbers of transposable elements in the T. urartu genome. Population genomics analysis of 147 T. urartu accessions from across the Fertile Crescent showed clustering of three groups, with differences in altitude and biostress, such as powdery mildew disease. The T. urartu genome assembly provides a valuable resource for studying genetic variation in wheat and related grasses, and promises to facilitate the discovery of genes that could be useful for wheat improvement.

Read the original article: https://www.nature.com/articles/s41586-018-0108-0

Piercing the dark matter: bioinformatics of long-range sequencing and mapping

Nature Reviews Genetics                 Published: 29 March 2018

Abstract: Several new genomics technologies have become available that offer long-read sequencing or long-range mapping with higher throughput and higher resolution analysis than ever before. These long-range technologies are rapidly advancing the field with improved reference genomes, more comprehensive variant identification and more complete views of transcriptomes and epigenomes. However, they also require new bioinformatics approaches to take full advantage of their unique characteristics while overcoming their complex errors and modalities. Here, we discuss several of the most important applications of the new technologies, focusing on both the currently available bioinformatics tools and opportunities for future research.

Read the original article: https://www.nature.com/articles/s41576-018-0003-4