



Volume 33, Issue 6, 2015, Pages 623-630

Assembling large genomes with single-molecule sequencing and locality-sensitive hashing

Author keywords

[No Author keywords available]

Indexed keywords

CELL CULTURE; MOLECULES; YEAST;

EID: 84930851165     PISSN: 10870156     EISSN: 15461696     Source Type: Journal    
DOI: 10.1038/nbt.3238     Document Type: Article
Times cited : (760)

References (68)
  • 1
    • 77952886150
    • Assembly algorithms for next-generation sequencing data
    • Miller, J.R., Koren, S. & Sutton, G. Assembly algorithms for next-generation sequencing data. Genomics 95, 315-327 (2010).
    • (2010) Genomics , vol.95 , pp. 315-327
    • Miller, J.R.1    Koren, S.2    Sutton, G.3
  • 2
    • 84874194145
    • Sequence assembly demystified
    • Nagarajan, N. & Pop, M. Sequence assembly demystified. Nat. Rev. Genet. 14, 157-167 (2013).
    • (2013) Nat. Rev. Genet. , vol.14 , pp. 157-167
    • Nagarajan, N.1    Pop, M.2
  • 3
    • 84919629313
    • Extensive error in the number of genes inferred from draft genome assemblies
    • Denton, J.F. et al. Extensive error in the number of genes inferred from draft genome assemblies. PLOS Comput. Biol. 10, e1003998 (2014).
    • (2014) PLOS Comput. Biol. , vol.10
    • Denton, J.F.1
  • 4
    • 0027113212
    • Approximate string-matching with q-grams and maximal matches
    • Ukkonen, E. Approximate string-matching with q-grams and maximal matches. Theor. Comput. Sci. 92, 191-211 (1992).
    • (1992) Theor. Comput. Sci. , vol.92 , pp. 191-211
    • Ukkonen, E.1
  • 5
    • 77956279237
    • Assembly of large genomes using second-generation sequencing
    • Schatz, M.C., Delcher, A.L. & Salzberg, S.L. Assembly of large genomes using second-generation sequencing. Genome Res. 20, 1165-1173 (2010).
    • (2010) Genome Res. , vol.20 , pp. 1165-1173
    • Schatz, M.C.1    Delcher, A.L.2    Salzberg, S.L.3
  • 6
    • 64449088698
    • Continuous base identification for single-molecule nanopore DNA sequencing
    • Clarke, J. et al. Continuous base identification for single-molecule nanopore DNA sequencing. Nat. Nanotechnol. 4, 265-270 (2009).
    • (2009) Nat. Nanotechnol. , vol.4 , pp. 265-270
    • Clarke, J.1
  • 7
    • 58149234737
    • Real-time DNA sequencing from single polymerase molecules
    • Eid, J. et al. Real-time DNA sequencing from single polymerase molecules. Science 323, 133-138 (2009).
    • (2009) Science , vol.323 , pp. 133-138
    • Eid, J.1
  • 8
    • 84955151056
    • Error correction and assembly complexity of single molecule sequencing reads
    • Lee, H. et al. Error correction and assembly complexity of single molecule sequencing reads. bioRxiv doi:10.1101/006395 (2014).
    • (2014) bioRxiv
    • Lee, H.1
  • 9
    • 84942520038
    • A reference bacterial genome dataset generated on the MinION portable single-molecule nanopore sequencer
    • Quick, J., Quinlan, A.R. & Loman, N.J. A reference bacterial genome dataset generated on the MinION portable single-molecule nanopore sequencer. GigaScience 3, 22 (2014).
    • (2014) GigaScience , vol.3 , pp. 22
    • Quick, J.1    Quinlan, A.R.2    Loman, N.J.3
  • 10
    • 84883664726
    • Reducing assembly complexity of microbial genomes with single-molecule sequencing
    • Koren, S. et al. Reducing assembly complexity of microbial genomes with single-molecule sequencing. Genome Biol. 14, R101 (2013).
    • (2013) Genome Biol. , vol.14 , pp. R101
    • Koren, S.1
  • 11
    • 84913554630
    • One chromosome, one contig: Complete microbial genomes from long-read sequencing and assembly
    • Koren, S. & Phillippy, A.M. One chromosome, one contig: complete microbial genomes from long-read sequencing and assembly. Curr. Opin. Microbiol. 23, 110-120 (2015).
    • (2015) Curr. Opin. Microbiol. , vol.23 , pp. 110-120
    • Koren, S.1    Phillippy, A.M.2
  • 12
    • 84869814079
    • Mind the gap: Upgrading genomes with Pacific Biosciences RS long-read sequencing technology
    • English, A.C. et al. Mind the gap: upgrading genomes with Pacific Biosciences RS long-read sequencing technology. PLoS ONE 7, e47768 (2012).
    • (2012) PLoS ONE , vol.7
    • English, A.C.1
  • 13
    • 84863651532
    • Hybrid error correction and de novo assembly of single-molecule sequencing reads
    • Koren, S. et al. Hybrid error correction and de novo assembly of single-molecule sequencing reads. Nat. Biotechnol. 30, 693-700 (2012).
    • (2012) Nat. Biotechnol. , vol.30 , pp. 693-700
    • Koren, S.1
  • 14
    • 84868327508
    • Finished bacterial genomes from shotgun sequence data
    • Ribeiro, F.J. et al. Finished bacterial genomes from shotgun sequence data. Genome Res. 22, 2270-2277 (2012).
    • (2012) Genome Res. , vol.22 , pp. 2270-2277
    • Ribeiro, F.J.1
  • 15
    • 84880798154
    • Nonhybrid, finished microbial genome assemblies from long-read SMRT sequencing data
    • Chin, C.S. et al. Nonhybrid, finished microbial genome assemblies from long-read SMRT sequencing data. Nat. Methods 10, 563-569 (2013).
    • (2013) Nat. Methods , vol.10 , pp. 563-569
    • Chin, C.S.1
  • 16
    • 84878234942
    • Characterizing and measuring bias in sequence data
    • Ross, M.G. et al. Characterizing and measuring bias in sequence data. Genome Biol. 14, R51 (2013).
    • (2013) Genome Biol. , vol.14 , pp. R51
    • Ross, M.G.1
  • 17
    • 84925497196
    • Resolving the complexity of the human genome using single-molecule sequencing
    • Chaisson, M.J. et al. Resolving the complexity of the human genome using single-molecule sequencing. Nature 517, 608-611 (2014).
    • (2014) Nature , vol.517 , pp. 608-611
    • Chaisson, M.J.1
  • 18
    • 84921782284
    • Near-optimal assembly for shotgun sequencing with noisy reads
    • Lam, K.K., Khalak, A. & Tse, D. Near-optimal assembly for shotgun sequencing with noisy reads. BMC Bioinformatics 15 (suppl. 9), S4 (2014).
    • (2014) BMC Bioinformatics , vol.15 , pp. S4
    • Lam, K.K.1    Khalak, A.2    Tse, D.3
  • 21
    • 79956075292
    • Identifying and filtering near-duplicate documents
    • Broder, A.Z. Identifying and filtering near-duplicate documents. Combinatorial pattern matching 1-10 (2000).
    • (2000) Combinatorial Pattern Matching , pp. 1-10
    • Broder, A.Z.1
  • 22
    • 84898444828
    • Near duplicate image detection: Min-Hash and tf-idf weighting
    • Chum, O., Philbin, J. & Zisserman, A. Near duplicate image detection: min-Hash and tf-idf weighting. British Machine Vision Conference 810, 812-815 (2008).
    • (2008) British Machine Vision Conference , vol.810 , pp. 812-815
    • Chum, O.1    Philbin, J.2    Zisserman, A.3
  • 23
    • 0035024494
    • Efficient large-scale sequence comparison by locality-sensitive hashing
    • Buhler, J. Efficient large-scale sequence comparison by locality-sensitive hashing. Bioinformatics 17, 419-428 (2001).
    • (2001) Bioinformatics , vol.17 , pp. 419-428
    • Buhler, J.1
  • 24
    • 35048835693
    • Gapped local similarity search with provable guarantees
    • Narayanan, M. & Karp, R.M. Gapped local similarity search with provable guarantees. Algorithms Bioinform. 3240, 74-86 (2004).
    • (2004) Algorithms Bioinform. , vol.3240 , pp. 74-86
    • Narayanan, M.1    Karp, R.M.2
  • 25
    • 84866113768
    • De novo assembly of highly diverse viral populations
    • Yang, X. et al. De novo assembly of highly diverse viral populations. BMC Genomics 13, 475 (2012).
    • (2012) BMC Genomics , vol.13 , pp. 475
    • Yang, X.1
  • 27
    • 12344332552
    • Reducing storage requirements for biological sequence comparison
    • Roberts, M., Hayes, W., Hunt, B.R., Mount, S.M. & Yorke, J.A. Reducing storage requirements for biological sequence comparison. Bioinformatics 20, 3363-3369 (2004).
    • (2004) Bioinformatics , vol.20 , pp. 3363-3369
    • Roberts, M.1    Hayes, W.2    Hunt, B.R.3    Mount, S.M.4    Yorke, J.A.5
  • 28
    • 84866266717
    • Mapping single molecule sequencing reads using basic local alignment with successive refinement (BLASR): Application and theory
    • Chaisson, M.J. & Tesler, G. Mapping single molecule sequencing reads using basic local alignment with successive refinement (BLASR): application and theory. BMC Bioinformatics 13, 238 (2012).
    • (2012) BMC Bioinformatics , vol.13 , pp. 238
    • Chaisson, M.J.1    Tesler, G.2
  • 29
    • 84958554065
    • Efficient local alignment discovery amongst noisy long reads
    • Myers, G. Efficient local alignment discovery amongst noisy long reads. Algorithms Bioinform. 8701, 52-67 (2014).
    • (2014) Algorithms Bioinform. , vol.8701 , pp. 52-67
    • Myers, G.1
  • 32
    • 84870471176
    • RazerS 3: Faster, fully sensitive read mapping
    • Weese, D., Holtgrewe, M. & Reinert, K. RazerS 3: faster, fully sensitive read mapping. Bioinformatics 28, 2592-2599 (2012).
    • (2012) Bioinformatics , vol.28 , pp. 2592-2599
    • Weese, D.1    Holtgrewe, M.2    Reinert, K.3
  • 33
    • 33745128489
    • An O(ND) difference algorithm and its variations
    • Myers, E.W. An O(ND) difference algorithm and its variations. Algorithmica 1, 251-266 (1986).
    • (1986) Algorithmica , vol.1 , pp. 251-266
    • Myers, E.W.1
  • 34
    • 0034708758
    • A whole-genome assembly of Drosophila
    • Myers, E.W. et al. A whole-genome assembly of Drosophila. Science 287, 2196-2204 (2000).
    • (2000) Science , vol.287 , pp. 2196-2204
    • Myers, E.W.1
  • 35
    • 84930838411
    • Long-read, whole-genome shotgun sequence data for five model organisms
    • Kim, K.E. et al. Long-read, whole-genome shotgun sequence data for five model organisms. Scientific Data 1, 140045 (2014).
    • (2014) Scientific Data , vol.1 , pp. 140045
    • Kim, K.E.1
  • 36
    • 84867446455
    • The Saccharomyces cerevisiae W303-K6001 cross-platform genome sequence: Insights into ancestry and physiology of a laboratory mutt
    • Ralser, M. et al. The Saccharomyces cerevisiae W303-K6001 cross-platform genome sequence: insights into ancestry and physiology of a laboratory mutt. Open Biol. 2, 120093 (2012).
    • (2012) Open Biol. , vol.2
    • Ralser, M.1
  • 37
    • 0034649566
    • Analysis of the genome sequence of the flowering plant Arabidopsis thaliana
    • Arabidopsis Genome Initiative. Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature 408, 796-815 (2000).
    • (2000) Nature , vol.408 , pp. 796-815
  • 38
    • 34250873404
    • Sequence finishing and mapping of Drosophila melanogaster heterochromatin
    • Hoskins, R.A. et al. Sequence finishing and mapping of Drosophila melanogaster heterochromatin. Science 316, 1625-1628 (2007).
    • (2007) Science , vol.316 , pp. 1625-1628
    • Hoskins, R.A.1
  • 39
    • 0030950735
    • Human whole-genome shotgun sequencing
    • Weber, J.L. & Myers, E.W. Human whole-genome shotgun sequencing. Genome Res. 7, 401-409 (1997).
    • (1997) Genome Res. , vol.7 , pp. 401-409
    • Weber, J.L.1    Myers, E.W.2
  • 40
    • 2042437650
    • Initial sequencing and analysis of the human genome
    • Lander, E.S. et al. Initial sequencing and analysis of the human genome. Nature 409, 860-921 (2001).
    • (2001) Nature , vol.409 , pp. 860-921
    • Lander, E.S.1
  • 41
    • 0035895505
    • The sequence of the human genome
    • Venter, J.C. et al. The sequence of the human genome. Science 291, 1304-1351 (2001).
    • (2001) Science , vol.291 , pp. 1304-1351
    • Venter, J.C.1
  • 42
    • 84913600842
    • Single haplotype assembly of the human genome from a hydatidiform mole
    • Steinberg, K.M. et al. Single haplotype assembly of the human genome from a hydatidiform mole. Genome Res. 24, 2066-2076 (2014).
    • (2014) Genome Res. , vol.24 , pp. 2066-2076
    • Steinberg, K.M.1
  • 43
    • 0344721480
    • Complete sequence and gene map of a human major histocompatibility complex
    • The MHC Sequencing Consortium. Complete sequence and gene map of a human major histocompatibility complex. Nature 401, 921-923 (1999).
    • (1999) Nature , vol.401 , pp. 921-923
  • 44
    • 84897965254
    • Reconstructing complex regions of genomes using long-read sequencing technology
    • Huddleston, J. et al. Reconstructing complex regions of genomes using long-read sequencing technology. Genome Res. 24, 688-696 (2014).
    • (2014) Genome Res. , vol.24 , pp. 688-696
    • Huddleston, J.1
  • 45
    • 45549109750
    • Genome assembly forensics: Finding the elusive mis-assembly
    • Phillippy, A.M., Schatz, M.C. & Pop, M. Genome assembly forensics: finding the elusive mis-assembly. Genome Biol. 9, R55 (2008).
    • (2008) Genome Biol. , vol.9 , pp. R55
    • Phillippy, A.M.1    Schatz, M.C.2    Pop, M.3
  • 46
    • 0034708480
    • The genome sequence of Drosophila melanogaster
    • Adams, M.D. et al. The genome sequence of Drosophila melanogaster. Science 287, 2185-2195 (2000).
    • (2000) Science , vol.287 , pp. 2185-2195
    • Adams, M.D.1
  • 47
    • 84857893016
    • GAGE: A critical evaluation of genome assemblies and assembly algorithms
    • Salzberg, S.L. et al. GAGE: a critical evaluation of genome assemblies and assembly algorithms. Genome Res. 22, 557-567 (2012).
    • (2012) Genome Res. , vol.22 , pp. 557-567
    • Salzberg, S.L.1
  • 48
    • 0031955518
    • Base-calling of automated sequencer traces using phred. I. Accuracy assessment
    • Ewing, B., Hillier, L., Wendl, M.C. & Green, P. Base-calling of automated sequencer traces using phred. I. Accuracy assessment. Genome Res. 8, 175-185 (1998).
    • (1998) Genome Res. , vol.8 , pp. 175-185
    • Ewing, B.1    Hillier, L.2    Wendl, M.C.3    Green, P.4
  • 49
    • 0012341115
    • The transposable elements of the Drosophila melanogaster euchromatin: A genomics perspective
    • Kaminker, J.S. et al. The transposable elements of the Drosophila melanogaster euchromatin: a genomics perspective. Genome Biol. 3, research0084 (2002).
    • (2002) Genome Biol. , vol.3 , pp. research0084
    • Kaminker, J.S.1
  • 50
    • 84907087679
    • Illumina TruSeq synthetic long-reads empower de novo assembly and resolve complex, highly-repetitive transposable elements
    • McCoy, R.C. et al. Illumina TruSeq synthetic long-reads empower de novo assembly and resolve complex, highly-repetitive transposable elements. PLoS ONE 9, e106689 (2014).
    • (2014) PLoS ONE , vol.9
    • McCoy, R.C.1
  • 51
    • 8544240102
    • Overview of the yeast genome
    • Mewes, H.W. et al. Overview of the yeast genome. Nature 387, 7-65 (1997).
    • (1997) Nature , vol.387 , pp. 7-65
    • Mewes, H.W.1
  • 52
    • 22944488871
    • Telomeres and human disease: Ageing, cancer and beyond
    • Blasco, M.A. Telomeres and human disease: ageing, cancer and beyond. Nat. Rev. Genet. 6, 611-622 (2005).
    • (2005) Nat. Rev. Genet. , vol.6 , pp. 611-622
    • Blasco, M.A.1
  • 53
  • 54
    • 23844525077
    • Repbase Update, a database of eukaryotic repetitive elements
    • Jurka, J. et al. Repbase Update, a database of eukaryotic repetitive elements. Cytogenet. Genome Res. 110, 462-467 (2005).
    • (2005) Cytogenet. Genome Res. , vol.110 , pp. 462-467
    • Jurka, J.1
  • 55
    • 84901367719
    • RepARK-de novo creation of repeat libraries from whole-genome NGS reads
    • Koch, P., Platzer, M. & Downie, B.R. RepARK-de novo creation of repeat libraries from whole-genome NGS reads. Nucleic Acids Res. 42, e80 (2014).
    • (2014) Nucleic Acids Res. , vol.42 , pp. e80
    • Koch, P.1    Platzer, M.2    Downie, B.R.3
  • 56
    • 0027485082
    • Ordered restriction maps of Saccharomyces cerevisiae chromosomes constructed by optical mapping
    • Schwartz, D.C. et al. Ordered restriction maps of Saccharomyces cerevisiae chromosomes constructed by optical mapping. Science 262, 110-114 (1993).
    • (1993) Science , vol.262 , pp. 110-114
    • Schwartz, D.C.1
  • 57
    • 84890034912
    • Chromosome-scale scaffolding of de novo genome assemblies based on chromatin interactions
    • Burton, J.N. et al. Chromosome-scale scaffolding of de novo genome assemblies based on chromatin interactions. Nat. Biotechnol. 31, 1119-1125 (2013).
    • (2013) Nat. Biotechnol. , vol.31 , pp. 1119-1125
    • Burton, J.N.1
  • 58
    • 84890032321
    • High-throughput genome scaffolding from in vivo DNA interaction frequency
    • Kaplan, N. & Dekker, J. High-throughput genome scaffolding from in vivo DNA interaction frequency. Nat. Biotechnol. 31, 1143-1147 (2013).
    • (2013) Nat. Biotechnol. , vol.31 , pp. 1143-1147
    • Kaplan, N.1    Dekker, J.2
  • 60
    • 84860771820
    • SPAdes: A new genome assembly algorithm and its applications to single-cell sequencing
    • Bankevich, A. et al. SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing. J. Comput. Biol. 19, 455-477 (2012).
    • (2012) J. Comput. Biol. , vol.19 , pp. 455-477
    • Bankevich, A.1
  • 62
    • 0019887799
    • Identification of common molecular subsequences
    • Smith, T.F. & Waterman, M.S. Identification of common molecular subsequences. J. Mol. Biol. 147, 195-197 (1981).
    • (1981) J. Mol. Biol. , vol.147 , pp. 195-197
    • Smith, T.F.1    Waterman, M.S.2
  • 63
    • 33646000287
    • Efficient q-gram filters for finding all epsilon-matches over a given length
    • Rasmussen, K.R., Stoye, J. & Myers, E.W. Efficient q-gram filters for finding all epsilon-matches over a given length. J. Comput. Biol. 13, 296-308 (2006).
    • (2006) J. Comput. Biol. , vol.13 , pp. 296-308
    • Rasmussen, K.R.1    Stoye, J.2    Myers, E.W.3
  • 64
    • 0027681165
    • Suffix arrays: A new method for on-line string searches
    • Manber, U. & Myers, G. Suffix arrays: a new method for on-line string searches. SIAM J. Comput. 22, 935-948 (1993).
    • (1993) SIAM J. Comput. , vol.22 , Issue.5 , pp. 935-948
    • Manber, U.1    Myers, G.2
  • 65
    • 0001044948
    • Estimating parameters in continuous univariate distributions with a shifted origin
    • Cheng, R.C.H. & Amin, N.A.K. Estimating parameters in continuous univariate distributions with a shifted origin. J. R. Stat. Soc., B 45, 394-403 (1983).
    • (1983) J. R. Stat. Soc., B , vol.45 , pp. 394-403
    • Cheng, R.C.H.1    Amin, N.A.K.2
  • 66
    • 0036203448
    • Multiple sequence alignment using partial order graphs
    • Lee, C., Grasso, C. & Sharlow, M.F. Multiple sequence alignment using partial order graphs. Bioinformatics 18, 452-464 (2002).
    • (2002) Bioinformatics , vol.18 , pp. 452-464
    • Lee, C.1    Grasso, C.2    Sharlow, M.F.3
  • 67
    • 0030833564
    • ReAligner: A program for refining DNA sequence multi-alignments
    • Anson, E.L. & Myers, E.W. ReAligner: a program for refining DNA sequence multi-alignments. J. Comput. Biol. 4, 369-383 (1997).
    • (1997) J. Comput. Biol. , vol.4 , pp. 369-383
    • Anson, E.L.1    Myers, E.W.2
  • 68
    • 2942538300
    • Versatile and open software for comparing large genomes
    • Kurtz, S. et al. Versatile and open software for comparing large genomes. Genome Biol. 5, R12 (2004).
    • (2004) Genome Biol. , vol.5 , pp. R12
    • Kurtz, S.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.