Journal of Genetics and Genomics

Pathway analysis for genome-wide genetic variation data: Analytic principles, latest developments, and new opportunities

Micah Silberstein, Nicholas Nesbit, Jacquelyn Cai, Phil H. Lee

2021, 48(3): 173-183. doi: 10.1016/j.jgg.2021.01.007

Abstract:
Pathway analysis, also known as gene-set enrichment analysis, is a multilocus analytic strategy that integrates a priori biological knowledge into the statistical analysis of high-throughput genetics data. Originally developed for studies of gene expression data, it has become a powerful analytic procedure for in-depth mining of genome-wide genetic variation data. Astonishing discoveries were made in the past years, uncovering genes and biological mechanisms underlying common and complex disorders. However, as massive amounts of diverse functional genomics data accrue, there is a pressing need for newer generations of pathway analysis methods that can utilize multiple layers of high-throughput genomics data. In this review, we provide an intellectual foundation of this powerful analytic strategy, as well as an update of the state-of-the-art in recent method developments. The goal of this review is threefold: (1) introduce the motivation and basic steps of pathway analysis for genome-wide genetic variation data; (2) review the merits and the shortcomings of classic and newly emerging integrative pathway analysis tools; and (3) discuss remaining challenges and future directions for further method developments.

Interplay between genome organization and epigenomic alterations of pericentromeric DNA in cancer

Subhadip Kundu, M.D. Ray, Ashok Sharma

2021, 48(3): 184-197. doi: 10.1016/j.jgg.2021.02.004

Abstract:
In eukaryotic genome biology, the genomic organization inside the three-dimensional (3D) nucleus is highly complex, and whether this organization governs gene expression is poorly understood. The nuclear lamina (NL) is a filamentous meshwork of proteins lining the inner nuclear membrane that serves as an anchoring platform for genome organization. Large chromatin domains, termed lamina-associated domains (LADs), play a major role in silencing genes at the nuclear periphery. The interaction of the NL and genome is dynamic and stochastic. Furthermore, many genes change their positions during developmental processes or under disease conditions such as cancer, to activate certain sorts of genes and/or silence others. Pericentromeric heterochromatin (PCH) is mostly in the silenced region within the genome, which localizes at the nuclear periphery. Studies show that several genes located at the PCH are aberrantly expressed in cancer. The interesting question is how these genes, despite being localized in the pericentromeric region, still manage to overcome pericentromeric repression. Although epigenetic mechanisms control the expression of the pericentromeric region, recent studies about genome organization and genome-nuclear lamina interaction have shed light on a new aspect of pericentromeric gene regulation through a complex and coordinated interplay between epigenomic remodeling and genomic organization in cancer.

A genome-wide association study of facial morphology identifies novel genetic loci in Han Chinese

Yin Huang, Dan Li, Lu Qiao, Yu Liu, Qianqian Peng, Sijie Wu, Manfei Zhang, Yajun Yang, Jingze Tan, Shuhua Xu, Li Jin, Sijia Wang, Kun Tang, Stefan Grünewald

2021, 48(3): 198-207. doi: 10.1016/j.jgg.2020.10.004

Abstract:
The human face is a heritable surface with many complex sensory organs. In recent years, many genetic loci associated with facial features have been reported in different populations, yet there is a lack of studies on the Han Chinese population. Here, we report a genome-wide association study of 3D normal human faces of 2,659 Han Chinese with autosegment phenotypes of facial morphology. We identify single-nucleotide polymorphisms (SNPs) encompassing four genomic regions showing significant associations with different facial regions, including SNPs in DENND1B associated with the chin, SNPs among PISRT1 associated with eyes, SNPs between DCHS2 and SFRP2 associated with the nose, and SNPs in VPS13B associated with the nose. We replicate 24 SNPs from previously reported genetic loci in different populations, whose candidate genes are DCHS2, SUPT3H, HOXD1, SOX9, PAX3, and EDAR. These results provide a more comprehensive understanding of the genetic basis of variation in human facial morphology.

Smyd1 is essential for myosin expression and sarcomere organization in craniofacial, extraocular, and cardiac muscles

Shuang Jiao, Rui Xu, Shaojun Du

2021, 48(3): 208-218. doi: 10.1016/j.jgg.2021.03.004

Abstract:
Skeletal and cardiac muscles are striated myofibers that contain highly organized sarcomeres for muscle contraction. Recent studies revealed that Smyd1, a lysine methyltransferase, plays a key role in sarcomere assembly in heart and trunk skeletal muscles. However, Smyd1 expression and function in craniofacial muscles are not known. Here, we analyze the developmental expression and function of two smyd1 paralogous genes, smyd1a and smyd1b, in craniofacial and cardiac muscles of zebrafish embryos. Our data show that loss of smyd1a (smyd1a) or smyd1b (smyd1b) has no visible effects on myogenic commitment and expression of myod and myosin heavy-chain mRNA transcripts in craniofacial muscles. However, myosin heavy-chain protein accumulation and sarcomere organization are dramatically reduced in smyd1b single mutant, and almost completely diminished in smyd1a; smyd1b double mutant, but not in smyd1a mutant. Similar defects are also observed in cardiac muscles of smyd1b mutant. Defective craniofacial and cardiac muscle formation is associated with an upregulation of hsp90α1 and unc45b mRNA expression in smyd1b and smyd1a; smyd1b mutants. Together, our studies indicate that Smyd1b, but not Smyd1a, plays a key role in myosin heavy-chain protein expression and sarcomere organization in craniofacial and cardiac muscles. Loss of smyd1b results in muscle-specific stress response.

An instantaneous coalescent method insensitive to population structure

Zeqi Yao, Kehui Liu, Shanjun Deng, Xionglei He

2021, 48(3): 219-224. doi: 10.1016/j.jgg.2021.03.005

Abstract:
Conventional coalescent inferences of population history make the critical assumption that the population under examination is panmictic. However, most populations are structured. This complicates the prevailing coalescent analyses and sometimes leads to inaccurate estimates. To develop a coalescent method unhampered by population structure, we perform two analyses. First, we demonstrate that the coalescent probability of two randomly sampled alleles from the immediate preceding generation (one generation back) is independent of population structure. Second, motivated by this finding, we propose a new coalescent method: i-coalescent analysis. The i-coalescent analysis computes the instantaneous coalescent rate by using a phylogenetic tree of sampled alleles. Using simulated data, we broadly demonstrate the capability of i-coalescent analysis to accurately reconstruct population size dynamics of highly structured populations, although we find this method often requires larger sample sizes for structured populations than for panmictic populations. Overall, our results indicate i-coalescent analysis to be a useful tool, especially for the inference of population histories with intractable structure such as the developmental history of cell populations in the organs of complex organisms.

Genomes of 12 fig wasps provide insights into the adaptation of pollinators to fig syconia

Jinhua Xiao, Xianqin Wei, Yi Zhou, Zhaozhe Xin, Yunheng Miao, Hongxia Hou, Jiaxing Li, Dan Zhao, Jing Liu, Rui Chen, Liming Niu, Guangchang Ma, Wenquan Zhen, Shunmin He, Jianxia Wang, Xunfan Wei, Weihao Dou, Zhuoxiao Sui, Haikuan Zhang, Shilai Xing, Miao Shi, Dawei Huang

2021, 48(3): 225-236. doi: 10.1016/j.jgg.2021.02.010

Abstract:
Figs and fig pollinators are one of the few classic textbook examples of obligate pollination mutualism. The specific dependence of fig pollinators on the relatively safe living environment with sufficient food sources in the enclosed fig syconia implies that they are vulnerable to habitat changes. However, there is still no extensive genomic evidence to reveal the evolutionary footprint of this long-term mutually beneficial symbiosis in fig pollinators. In fig syconia, there are also non-pollinator species. The non-pollinator species differ in their evolutionary and life histories from pollinators. We conducted comparative analyses on 11 newly sequenced fig wasp genomes and one previously published genome. The pollinators colonized the figs approximately 66.9 million years ago, consistent with the origin of host figs. Compared with non-pollinators, many more genes in pollinators were subject to relaxed selection. Seven genes were absent in pollinators in response to environmental stress and immune activation. Pollinators had more streamlined gene repertoires in the innate immune system, chemosensory toolbox, and detoxification system. Our results provide genomic evidence for the differentiation between pollinators and non-pollinators. The data suggest that owing to the long-term adaptation to the fig, some genes related to functions no longer required are absent in pollinators.

Freeze substitution Hi-C, a convenient and cost-effective method for capturing the natural 3D chromatin conformation from frozen samples

Wu Zheng, Zhaoen Yang, Xiaoyang Ge, Yijia Feng, Ye Wang, Chengwei Liu, Yanan Luan, Kun Cai, Serhii Vakal, Feng You, Wei Guo, Wei Wang, Zhenhua Feng, Fuguang Li

2021, 48(3): 237-247. doi: 10.1016/j.jgg.2020.11.002

Abstract:
Chromatin interactions functionally affect genome architecture and gene regulation, but to date, only fresh samples can be used in high-throughput chromosome conformation capture (Hi-C) to keep the natural chromatin conformation intact. This requirement has impeded the advancement of 3D genome research by limiting sample collection and storage options for researchers and severely limiting the number of samples that can be processed in a short time. Here, we develop a freeze substitution Hi-C (FS-Hi-C) technique that overcomes the need for fresh samples. FS-Hi-C can be used with samples stored in liquid nitrogen (LN₂): the water in a vitreous form in the sample cells is replaced with ethanol via automated freeze substitution. After confirming that the FS step preserves the natural chromosome conformation during sample thawing, we tested the performance of FS-Hi-C with Drosophila melanogaster and Gossypium hirsutum. Beyond allowing the use of frozen samples and confirming that FS-Hi-C delivers robust data for generating contact heat maps and delineating A/B compartments and topologically associating domains, we found that FS-Hi-C outperforms the in situ Hi-C in terms of library quality, reproducibility, and valid interactions. Thus, FS-Hi-C will probably extend the application of 3D genome structure analysis to the vast number of experimental contexts in biological and medical research for which Hi-C methods have been unfeasible to date.

GGVD: A goat genome variation database for tracking the dynamic evolutionary process of selective signatures and ancient introgressions

Weiwei Fu, Rui Wang, Jiantao Yu, Dexiang Hu, Yudong Cai, Junjie Shao, Yu Jiang

2021, 48(3): 248-256. doi: 10.1016/j.jgg.2021.03.003

Abstract:
Understanding the evolutionary history and adaptive process depends on the knowledge that we can acquire from both ancient and modern genomic data. With the availability of a deluge of whole-genome sequencing data from ancient and modern goat samples, a user-friendly database making efficient reuse of these important resources is needed. Here, we use the genomes of 208 modern domestic goats, 24 bezoars, 46 wild ibexes, and 82 ancient goats to present a comprehensive goat genome variation database (GGVD). GGVD hosts a total of ∼41.44 million SNPs, ∼5.14 million indels, 6,193 selected loci, and 112 introgression regions. Users can freely visualize the frequency of genomic variations in geographical maps, selective sweeps in interactive tables, Manhattan plots, or line charts, as well as the heatmap patterns of the SNP genotype. Ancient data can be shown in haplotypes to track the state of genetic variants of selection and introgression events in the early, middle, and late stages. For facilitating access to sequence features, the UCSC Genome Browser, BLAT, BLAST, LiftOver, and pcadapt are also integrated into GGVD. GGVD will be a convenient tool for population genetic studies and molecular marker designing in goat breeding programs, and it is publicly available at http://animal.nwsuaf.edu.cn/GoatVar.

Identification of a novel efficient transcriptional activation domain from Chinese fir (Cunninghamia lanceolata)

Tengfei Zhu, Wenyu Tang, Delan Chen, Jian Li, Jun Su

2021, 48(3): 257-259. doi: 10.1016/j.jgg.2020.12.001

2021 Vol. 48, No. 3