Volume 42, Issue 8, Aug. 2015

SHEsisPCA: A GPU-Based Software to Correct for Population Stratification that Efficiently Accelerates the Process for Handling Genome-Wide Datasets

doi: 10.1016/j.jgg.2015.06.007
  • Corresponding author: Yongyong Shi (shiyongyong@gmail.com)
  • Received Date: 2015-03-24
  • Accepted Date: 2015-06-10
  • Revision Received Date: 2015-05-30
  • Available Online: 2015-07-09
  • Publish Date: 2015-08-20
  • Population stratification is a problem in genetic association studies because it tends to highlight loci that underlie population structure rather than disease-related loci. Principal component analysis (PCA) has proven to be an effective way to correct for population stratification. However, the conventional PCA algorithm is time-consuming when applied to large datasets. We developed a graphics processing unit (GPU)-based PCA software named SHEsisPCA (http://analysis.bio-x.cn/SHEsisMain.htm) that is highly parallel, with a peak speedup of more than 100-fold over its CPU version. A clustering algorithm based on X-means was also implemented to detect population subgroups and to obtain matched cases and controls, in order to reduce genomic inflation and increase statistical power. Tests on both simulated and real datasets showed that SHEsisPCA ran at extremely high speed with almost no loss of accuracy. Therefore, SHEsisPCA can help correct for population stratification much more efficiently than conventional CPU-based algorithms.
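The PCA step described in the abstract can be illustrated with a small CPU-only sketch. This is not SHEsisPCA's implementation: the toy genotype matrix, the function names, the per-SNP normalization by allele frequency (a common convention, assumed here), and the use of power iteration to find only the first principal component are all assumptions made for illustration. Splitting samples on the sign of the first component is a crude stand-in for the X-means clustering mentioned above.

```python
import math
import random

# Hypothetical toy data: 6 individuals x 5 SNPs, genotypes coded 0/1/2.
# Rows 0-2 and rows 3-5 are drawn from two clearly different subpopulations.
GENOTYPES = [
    [2, 2, 0, 0, 1],
    [2, 1, 0, 0, 1],
    [2, 2, 0, 1, 0],
    [0, 0, 2, 2, 1],
    [0, 0, 2, 1, 1],
    [0, 1, 2, 2, 2],
]


def normalize_genotypes(genotypes):
    """Center each SNP and scale it by sqrt(p * (1 - p)), p = allele frequency."""
    n = len(genotypes)
    columns = []
    for j in range(len(genotypes[0])):
        column = [row[j] for row in genotypes]
        mean = sum(column) / n
        p = mean / 2.0
        if p <= 0.0 or p >= 1.0:
            continue  # a monomorphic SNP carries no structure information
        scale = math.sqrt(p * (1.0 - p))
        columns.append([(g - mean) / scale for g in column])
    # Convert back to one row per individual.
    return [[col[i] for col in columns] for i in range(n)]


def sample_covariance(normalized):
    """Return the n x n individual-by-individual covariance matrix."""
    n = len(normalized)
    m = len(normalized[0])
    return [
        [sum(a * b for a, b in zip(normalized[i], normalized[k])) / m
         for k in range(n)]
        for i in range(n)
    ]


def top_component(matrix, iterations=200, seed=0):
    """Approximate the leading eigenvector by power iteration."""
    rng = random.Random(seed)
    n = len(matrix)
    v = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    for _ in range(iterations):
        w = [sum(matrix[i][k] * v[k] for k in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def split_by_sign(component):
    """Crude two-group split on the sign of a component (not X-means)."""
    positive = [i for i, x in enumerate(component) if x > 0]
    negative = [i for i, x in enumerate(component) if x <= 0]
    return positive, negative


if __name__ == "__main__":
    pc1 = top_component(sample_covariance(normalize_genotypes(GENOTYPES)))
    print("PC1 loadings per individual:", [round(x, 3) for x in pc1])
    print("Putative subgroups:", split_by_sign(pc1))
```

In an association test, the leading components would typically enter the model as covariates, or the detected subgroups would be used to match cases with controls. The individual-by-individual matrix grows quadratically with sample size, and building it costs time proportional to samples squared times SNPs, which is the kind of dense linear algebra that benefits from GPU parallelism.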