Volume 48 Issue 12
Dec.  2021

GenOrigin: A comprehensive protein-coding gene origination database on the evolutionary timescale of life

doi: 10.1016/j.jgg.2021.03.018
Funds:

We gratefully acknowledge Manyuan Long from the University of Chicago and Yong E. Zhang from the Institute of Zoology, Chinese Academy of Sciences, for their helpful suggestions. We thank Zhongliang Xue for providing the jingwei origin figure on the web page. We are also grateful to our users and all members of our lab for their valuable suggestions and comments. This work was supported by the National Natural Science Foundation of China (31871305), the HZAU-AGIS Cooperation Fund (SZYJY2021010), the Fundamental Research Funds for the Central Universities (2662019PY003, 2662020PY001), and the Huazhong Agricultural University Scientific & Technological Self-innovation Foundation (2016RC011). Funding for open access charge: National Natural Science Foundation of China.

  • Received Date: 2021-03-02
  • Accepted Date: 2021-03-29
  • Rev Recd Date: 2021-03-21
  • Publish Date: 2021-12-20
  • The origination of new genes contributes to the biological diversity of life. New genes may quickly build their own networks, exert important functions, and generate novel phenotypes. Dating gene age and inferring the origination mechanisms of new genes, such as primate-specific genes, is the basis for functional study of those genes. However, no comprehensive resource of gene age estimates across species is available. Here, we systematically date the age of 9,102,113 protein-coding genes from 565 species in the Ensembl and Ensembl Genomes databases, including 82 bacteria, 57 protists, 134 fungi, 58 plants, 56 metazoa, and 178 vertebrates, using a protein-family-based pipeline with the Wagner parsimony algorithm. We also collect gene age estimates from other studies and uniformly assign all gene age estimates to time ranges in millions of years so that they can be compared across studies. All the data are cataloged in GenOrigin (http://genorigin.chenzxlab.cn/), a new, user-friendly database of gene age estimates, where users can browse gene age estimates by species, age, and gene ontology. For each gene, GenOrigin provides the age estimate, annotation, gene ontology, orthologs, and paralogs, together with a detailed gene presence/absence view on the species tree with an evolutionary timescale that shows how the age was inferred, to help researchers explore gene functions.
  • These authors contributed equally to this work.
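The dating step named in the abstract (parsimony on a species tree, followed by mapping the inferred origin to a time range in millions of years) can be illustrated with a minimal sketch. This is not the GenOrigin pipeline: it is a simplified binary presence/absence version of Wagner parsimony solved with a standard Sankoff-style dynamic program. The toy tree, its divergence times, the gain penalty of 2, the tie-breaking toward absence, and all function names are assumptions made for illustration.

```python
from math import inf

# Hypothetical toy species tree: node -> (parent, node age in Myr ago).
# Names and ages are illustrative only, not GenOrigin data.
TREE = {
    "root":   (None, 1500),
    "animal": ("root", 800),
    "yeast":  ("root", 0),
    "fly":    ("animal", 0),
    "vert":   ("animal", 430),
    "fish":   ("vert", 0),
    "human":  ("vert", 0),
}


def child_map(tree):
    """Build node -> list of children from the parent pointers."""
    kids = {node: [] for node in tree}
    for node, (parent, _) in tree.items():
        if parent is not None:
            kids[parent].append(node)
    return kids


def reconstruct(tree, present, gain_penalty=2.0):
    """Infer 0/1 states minimising gain_penalty * gains + 1 * losses.

    `present` is the set of leaf species in which the gene family is found.
    Assumption: the family is absent before the root, so presence at the
    root costs one gain. Ties are broken toward absence.
    """
    kids = child_map(tree)
    root = next(n for n, (p, _) in tree.items() if p is None)

    def step(parent_state, child_state):
        if parent_state == child_state:
            return 0.0
        return gain_penalty if child_state == 1 else 1.0

    cost = {}

    def up(node):  # bottom-up pass: best subtree cost for each state
        if not kids[node]:
            observed = 1 if node in present else 0
            cost[node] = [inf, inf]
            cost[node][observed] = 0.0
            return
        for k in kids[node]:
            up(k)
        cost[node] = [
            sum(min(step(a, b) + cost[k][b] for b in (0, 1)) for k in kids[node])
            for a in (0, 1)
        ]

    up(root)
    state = {root: 0 if cost[root][0] <= cost[root][1] + gain_penalty else 1}

    def down(node):  # top-down pass: pick the cheapest state given the parent
        for k in kids[node]:
            opts = [step(state[node], b) + cost[k][b] for b in (0, 1)]
            state[k] = 0 if opts[0] <= opts[1] else 1
            down(k)

    down(root)
    return state


def age_range(tree, state, species):
    """Return (origin node, younger bound, older bound) in Myr for a species.

    The gene is dated to the branch leading to the oldest ancestor that is
    connected to the species by an unbroken chain of 'present' states.
    The older bound is None when the gene is already present at the root.
    """
    if state[species] != 1:
        return None
    node = species
    while tree[node][0] is not None and state[tree[node][0]] == 1:
        node = tree[node][0]
    parent = tree[node][0]
    older = tree[parent][1] if parent is not None else None
    return node, tree[node][1], older


if __name__ == "__main__":
    states = reconstruct(TREE, present={"fish", "human"})
    print(age_range(TREE, states, "human"))
```

With a family found only in the two vertebrate leaves, the sketch places the gain on the branch leading to the hypothetical `vert` node, so the age is reported as a range bounded by that node and its parent rather than as a single point estimate.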