HITAC-seq enables high-throughput cost-effective sequencing of plasmids and DNA fragments with identity

Xiang Gao; Weipeng Mo; Junpeng Shi; Ning Song; Pei Liang; Jian Chen; Yiting Shi; Weilong Guo; Xinchen Li; Xiaohong Yang; Beibei Xin; Haiming Zhao; Weibin Song; Jinsheng Lai

doi:10.1016/j.jgg.2021.05.009

Volume 48 Issue 8

Aug. 2021

Journal of Genetics and Genomics, 2021, 48(8): 671-680

a State Key Laboratory of Plant Physiology and Biochemistry and National Maize Improvement Center, Department of Plant Genetics and Breeding, China Agricultural University, Beijing 100193, PR China;
b Department of Microbiology and Immunology, College of Biological Sciences, China Agricultural University, Beijing 100193, PR China;
c State Key Laboratory of Plant Physiology and Biochemistry, College of Biological Sciences, China Agricultural University, Beijing 100193, PR China;
d Key Laboratory of Crop Heterosis and Utilization, State Key Laboratory for Agrobiotechnology, Beijing Key Laboratory of Crop Genetic Improvement, China Agricultural University, Beijing 100193, PR China;
e Center for Crop Functional Genomics and Molecular Breeding, China Agricultural University, Beijing 100193, PR China

Funds:

This work was supported by grants from the National Key Research and Development Program (2016YFD0101803 - 04) and the National Natural Science Foundation of China (31421005 and 91935303).

Received Date: 2021-04-12
Revised Date: 2021-05-03
Accepted Date: 2021-05-13
Publish Date: 2021-08-20

Abstract

DNA sequencing is vital for many aspects of biological research and diagnostics. Despite the development of second- and third-generation sequencing technologies, Sanger sequencing has long been the only choice when each sequenced plasmid or DNA fragment must be tracked precisely. Here, we report a novel barcoding and assembly system, Highly-parallel Indexed Tagmentation-reads Assembled Consensus sequencing (HITAC-seq), that can sequence samples massively in parallel while tracking the identity of each individual sample. At a cost much lower than that of a single Sanger read, HITAC-seq can generate high-quality contiguous sequences of up to 10 kilobases or longer. The capability of HITAC-seq was confirmed through large-scale sequencing of thousands of plasmid clones and hundreds of amplicon fragments using approximately 100 pg of input DNA. Owing to its long synthetic read length, HITAC-seq was effective in detecting relatively large structural variations, as demonstrated by the identification of a ∼1.3 kb Copia retrotransposon insertion upstream of a likely maize domestication gene. Besides being a practical alternative to traditional Sanger sequencing, HITAC-seq is suitable for many high-throughput sequencing and genotyping applications.

Keywords:

- HITAC-seq
- Structural variation
- Sequencing technology
- Sanger sequencing
- Comparative genomics
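
The abstract describes tracking the identity of each sample through barcoding before assembly. As a rough illustration of that general idea only, the sketch below bins pooled reads by a leading sample barcode so that each bin could later be assembled on its own. The barcode sequences, the barcode length, the sample names, and the `demultiplex` function are all hypothetical; the abstract does not specify the actual HITAC-seq index design or software, and exact-match lookup is a simplification.

```python
from collections import defaultdict

# Hypothetical barcodes and sample names, for illustration only;
# these are not taken from the HITAC-seq paper.
BARCODE_TO_SAMPLE = {
    "ACGTAC": "plasmid_001",
    "TGCATG": "plasmid_002",
}
BARCODE_LEN = 6


def demultiplex(reads, barcode_to_sample=BARCODE_TO_SAMPLE, barcode_len=BARCODE_LEN):
    """Group reads by their leading barcode and strip the barcode from each read.

    Reads whose barcode is not an exact match to a known barcode are
    collected under the key "unassigned".
    """
    bins = defaultdict(list)
    for read in reads:
        barcode, insert = read[:barcode_len], read[barcode_len:]
        sample = barcode_to_sample.get(barcode, "unassigned")
        bins[sample].append(insert)
    return dict(bins)


# Toy pooled reads: barcode (first 6 bases) followed by the insert sequence.
reads = [
    "ACGTACGGATTACA",
    "TGCATGCCTAGGAT",
    "ACGTACTTGACCAG",
    "NNNNNNAAAA",
]
bins = demultiplex(reads)
for sample, inserts in sorted(bins.items()):
    print(sample, inserts)
```

In a real pipeline each per-sample bin would be passed to an assembler to build a consensus sequence, and barcode matching would usually tolerate sequencing errors; both steps are omitted here.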
