Presentation Abstract

Title: C01.4 - Mapping of two human genomes with a single molecule nanochannel array platform for genome-wide structural variation analysis and de novo sequence assembly
Keywords: genome mapping; de novo sequence assembly; genome-wide structural variation analysis
Authors: P. Y. Kwok1, Y. Y. Y. Lai1, A. C. Y. Mak1, E. T. Lam2, J. Silbert3, T. P. Kwok4, J. W. Li4, A. K. Y. Leung4, J. J. K. Wu5, A. K. Y. Yim4, A. Poon1, C. Chu1, C. Lin1, M. Requa2, A. Hastie2, T. Anantharaman2, H. VanSteenhouse2, H. Dai2, F. Trintchouk2, M. Saghbini2, M. Austin2, K. Haden2, H. Cao2, S. M. Yiu5, K. Y. Yip4, T. F. Chan4, M. Xiao3;
1University of California, San Francisco, San Francisco, CA, United States, 2BioNano Genomics, San Diego, CA, United States, 3Drexel University, Philadelphia, PA, United States, 4Chinese University of Hong Kong, Shatin, Hong Kong, 5University of Hong Kong, Hong Kong, Hong Kong.
Abstract: Despite recent advances in base-calling accuracy and read length, de novo genome assembly and structural variant analysis using short-read shotgun sequencing remain challenging. We have developed a new approach that uses highly parallel nanochannel arrays in which many thousands of very long single DNA molecules are linearized and imaged. This approach is automated on the Irys System and can scan the entire genome rapidly to generate physical maps that provide a more comprehensive view of the genome. Here we describe the genome maps of the first two diploid human genomes constructed using this approach. Two members of a CEPH-CEU trio (father and daughter, NA12891 and NA12878, respectively), genotyped and sequenced extensively as part of the International HapMap and 1000 Genomes Projects, were mapped to 50X coverage with long (150 kb to >500 kb) DNA fragments fluorescently labeled at Nt.BspQI (GCTCTTCN/) sites. The resulting sequence motif maps are used to resolve haplotypes, identify structural variations, and assist in de novo sequence assembly of these two individuals. Particularly complex genomic loci, such as the major histocompatibility complex (MHC) region, are well characterized with these maps. Our results show that the DNA sequences of these two individuals differ significantly from the reference human genome sequence and confirm the majority of the structural variants identified previously. In addition, new structural variants not detected by next-generation sequencing are easily identified. The genome mapping approach is simple and can be performed in any modern molecular genetics laboratory.
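Illustrative sketch (not part of the Irys software): the sequence motif maps described above can be compared against a reference by computing where Nt.BspQI recognition sites fall in a known sequence. The minimal Python example below assumes that a label may sit at the GCTCTTC motif on either strand, so it also searches for the reverse complement GAAGAGC. The function names `label_sites` and `label_intervals` are hypothetical, positions are reported as 0-based motif starts rather than exact nick positions, and optical resolution limits are ignored.

```python
import re

# Nt.BspQI recognition sequence (from the abstract) and its reverse complement.
MOTIF = "GCTCTTC"
MOTIF_RC = "GAAGAGC"

# A lookahead is used so that overlapping occurrences are all reported.
_SITE_PATTERN = re.compile(f"(?=({MOTIF}|{MOTIF_RC}))")


def label_sites(seq):
    """Return sorted 0-based start positions of the motif on either strand."""
    return [m.start() for m in _SITE_PATTERN.finditer(seq.upper())]


def label_intervals(sites):
    """Return distances between consecutive label sites (the map 'barcode')."""
    return [b - a for a, b in zip(sites, sites[1:])]


if __name__ == "__main__":
    toy = "AAGCTCTTCAAAAGAAGAGCTT"  # toy sequence, not real genomic data
    sites = label_sites(toy)
    print(sites, label_intervals(sites))
```

In a real comparison, the interval pattern computed from a reference sequence would be aligned against the interval pattern measured on imaged molecules; the sketch covers only the in silico half of that comparison.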