



Volume 9, Issue 7, 2014

These are not the K-mers you are looking for: Efficient online K-mer counting using a probabilistic data structure

Author keywords

[No Author keywords available]

Indexed keywords

ACCESS TO INFORMATION; ANALYTICAL ERROR; ARTICLE; BIOINFORMATICS; COMPUTER ANALYSIS; COMPUTER INTERFACE; COMPUTER MEMORY; COMPUTER SYSTEM; CONTROLLED STUDY; DATA ANALYSIS SOFTWARE; INTERMETHOD COMPARISON; K MER COUNTING; ONLINE ANALYSIS; ONLINE SYSTEM; PROBABILITY; ALGORITHM; BIOLOGY; COMPUTER PROGRAM; DNA SEQUENCE; HUMAN;

EID: 84904876565     PISSN: None     EISSN: 19326203     Source Type: Journal    
DOI: 10.1371/journal.pone.0101271     Document Type: Article
Times cited : (77)

References (40)
  • 1
    • 79952592810
    • A fast, lock-free approach for efficient parallel counting of occurrences of k-mers
    • Marçais G, Kingsford C (2011) A fast, lock-free approach for efficient parallel counting of occurrences of k-mers. Bioinformatics 27: 764-770.
    • (2011) Bioinformatics , vol.27 , pp. 764-770
    • Marçais, G.1    Kingsford, C.2
  • 2
    • 56549086632
    • A new method to compute Kmer frequencies and its application to annotate large repetitive plant genomes
    • Kurtz S, Narechania A, Stein JC, Ware D (2008) A new method to compute Kmer frequencies and its application to annotate large repetitive plant genomes. BMC Genomics 9: 517.
    • (2008) BMC Genomics , vol.9 , pp. 517
    • Kurtz, S.1    Narechania, A.2    Stein, J.C.3    Ware, D.4
  • 3
    • 72849144434
    • Sequencing technologies - The next generation
    • Metzker M (2010) Sequencing technologies - the next generation. Nat Rev Genet 11: 31-46.
    • (2010) Nat Rev Genet , vol.11 , pp. 31-46
    • Metzker, M.1
  • 4
    • 79951526698
    • Succinct data structures for assembling large genomes
    • Conway TC, Bromage AJ (2011) Succinct data structures for assembling large genomes. Bioinformatics 27: 479-86.
    • (2011) Bioinformatics , vol.27 , pp. 479-486
    • Conway, T.C.1    Bromage, A.J.2
  • 5
    • 80455126001
    • Evaluation of genomic high-throughput sequencing data generated on illumina hiseq and genome analyzer systems
    • Minoche AE, Dohm JC, Himmelbauer H (2011) Evaluation of genomic high-throughput sequencing data generated on illumina hiseq and genome analyzer systems. Genome Biol 12: R112.
    • (2011) Genome Biol , vol.12
    • Minoche, A.E.1    Dohm, J.C.2    Himmelbauer, H.3
  • 6
    • 81155158421
    • Efficient counting of k-mers in DNA sequences using a bloom filter
    • Melsted P, Pritchard JK (2011) Efficient counting of k-mers in DNA sequences using a bloom filter. BMC bioinformatics 12: 333.
    • (2011) BMC Bioinformatics , vol.12 , pp. 333
    • Melsted, P.1    Pritchard, J.K.2
  • 7
    • 84874741072
    • Dsk: K-mer counting with very low memory usage
    • Rizk G, Lavenier D, Chikhi R (2013) Dsk: k-mer counting with very low memory usage. Bioinformatics 29: 652-3.
    • (2013) Bioinformatics , vol.29 , pp. 652-653
    • Rizk, G.1    Lavenier, D.2    Chikhi, R.3
  • 9
    • 84904013350
    • Turtle: Identifying frequent k-mers with cache-efficient algorithms
    • Advance Access published March 10, 2014: doi: 10.1093/bioinformatics/btu132
    • Roy RS, Bhattacharya D, Schliep A (2014) Turtle: Identifying frequent k-mers with cache-efficient algorithms. Bioinformatics: Advance Access published March 10, 2014: doi: 10.1093/bioinformatics/btu132.
    • (2014) Bioinformatics
    • Roy, R.S.1    Bhattacharya, D.2    Schliep, A.3
  • 10
    • 84903977587
    • Kanalyze: A fast versatile pipelined k-mer toolkit
    • Advance Access published March 18, 2014: doi: 10.1093/bioinformatics/btu152
    • Audano P, Vannberg F (2014) Kanalyze: A fast versatile pipelined k-mer toolkit. Bioinformatics: Advance Access published March 18, 2014: doi: 10.1093/bioinformatics/btu152.
    • (2014) Bioinformatics
    • Audano, P.1    Vannberg, F.2
  • 12
    • 14844367057
    • An improved data stream summary: The count-min sketch and its applications
    • DOI 10.1016/j.jalgor.2003.12.001, PII S0196677403001913
    • Cormode G, Muthukrishnan S (2005) An improved data stream summary: the count-min sketch and its applications. Journal of Algorithms 55: 58-75. (Pubitemid 40357145)
    • (2005) Journal of Algorithms , vol.55 , Issue.1 , pp. 58-75
    • Cormode, G.1    Muthukrishnan, S.2
  • 13
    • 0014814325
    • Space/time trade-offs in hash coding with allowable errors
    • Bloom BH (1970) Space/time trade-offs in hash coding with allowable errors. Commun ACM 13: 422-426.
    • (1970) Commun ACM , vol.13 , pp. 422-426
    • Bloom, B.H.1
  • 14
    • 70350656155
    • Using bloom filters for large scale gene sequence analysis in haskell
    • Gill A, Swift T, editors, PADL. Springer
    • Malde K, O'Sullivan B (2009) Using bloom filters for large scale gene sequence analysis in haskell. In: Gill A, Swift T, editors, PADL. Springer, volume 5418 of Lecture Notes in Computer Science, pp. 183-194.
    • (2009) Lecture Notes in Computer Science , vol.5418 , pp. 183-194
    • Malde, K.1    O'Sullivan, B.2
  • 16
    • 84871199924
    • Compression of next-generation sequencing reads aided by highly efficient de novo assembly
    • Jones DC, Ruzzo WL, Peng X, Katze MG (2012) Compression of next-generation sequencing reads aided by highly efficient de novo assembly. Nucleic Acids Res 40: e171.
    • (2012) Nucleic Acids Res , vol.40
    • Jones, D.C.1    Ruzzo, W.L.2    Peng, X.3    Katze, M.G.4
  • 17
    • 70450232823
    • Survey: Network applications of bloom filters: A survey
    • Broder AZ, Mitzenmacher M (2003) Survey: Network applications of bloom filters: A survey. Internet Mathematics 1: 485-509.
    • (2003) Internet Mathematics , vol.1 , pp. 485-509
    • Broder, A.Z.1    Mitzenmacher, M.2
  • 18
    • 0034206002
    • Summary cache: A scalable wide-area web cache sharing protocol
    • Fan L, Cao P, Almeida J, Broder AZ (2000) Summary cache: A scalable wide-area web cache sharing protocol. IEEE/ACM Trans Netw 8: 281-293.
    • (2000) IEEE/ACM Trans Netw , vol.8 , pp. 281-293
    • Fan, L.1    Cao, P.2    Almeida, J.3    Broder, A.Z.4
  • 19
    • 33645782745
    • New directions in traffic measurement and accounting
    • DOI 10.1145/964725.633056, Proceedings of the ACM SIGCOMM 2002 Conference - Applications, Technologies, Architectures, and Protocols for Computer Communications
    • Estan C, Varghese G (2002) New directions in traffic measurement and accounting. In: SIGCOMM. ACM, pp. 323-336. (Pubitemid 43843793)
    • (2002) Computer Communication Review , vol.32 , Issue.4 , pp. 323-336
    • Estan, C.1    Varghese, G.2
  • 20
    • 1142279462
    • Spectral bloom filters
    • Halevy AY, Ives ZG, Doan A, editors, ACM
    • Cohen S, Matias Y (2003) Spectral bloom filters. In: Halevy AY, Ives ZG, Doan A, editors, SIGMOD Conference. ACM, pp. 241-252.
    • (2003) SIGMOD Conference , pp. 241-252
    • Cohen, S.1    Matias, Y.2
  • 23
    • 84904916884
    • Working with big data in bioinformatics
    • Armstrong T, editor, lulu.com, chapter 12
    • McDonald E, Brown CT (2013) Working with big data in bioinformatics. In: Armstrong T, editor, The Performance of Open Source Applications, lulu.com, chapter 12. p. 151.
    • (2013) The Performance of Open Source Applications , pp. 151
    • McDonald, E.1    Brown, C.T.2
  • 24
    • 70450232823
    • Network applications of bloom filters: A survey
    • Broder A, Mitzenmacher M (2004) Network applications of bloom filters: A survey. Internet mathematics 1: 485-509.
    • (2004) Internet Mathematics , vol.1 , pp. 485-509
    • Broder, A.1    Mitzenmacher, M.2
  • 25
    • 84893234795
    • Hyperloglog: The analysis of a near-optimal cardinality estimation algorithm
    • Flajolet P, Fusy É, Gandouet O, Meunier F (2008) Hyperloglog: the analysis of a near-optimal cardinality estimation algorithm. DMTCS Proceedings.
    • (2008) DMTCS Proceedings
    • Flajolet, P.1    Fusy, É.2    Gandouet, O.3    Meunier, F.4
  • 26
    • 84891349005
    • Informed and automated k-mer size selection for genome assembly
    • Chikhi R, Medvedev P (2014) Informed and automated k-mer size selection for genome assembly. Bioinformatics 30: 31-7.
    • (2014) Bioinformatics , vol.30 , pp. 31-37
    • Chikhi, R.1    Medvedev, P.2
  • 27
    • 79959485321
    • Error correction of high-throughput sequencing datasets with non-uniform coverage
    • Medvedev P, Scott E, Kakaradov B, Pevzner P (2011) Error correction of high-throughput sequencing datasets with non-uniform coverage. Bioinformatics 27: i137-41.
    • (2011) Bioinformatics , vol.27
    • Medvedev, P.1    Scott, E.2    Kakaradov, B.3    Pevzner, P.4
  • 29
    • 0041523918
    • Estimating the repeat structure and length of DNA sequences using l-tuples
    • Li X, Waterman MS (2003) Estimating the repeat structure and length of dna sequences using l-tuples. Genome Res 13: 1916-22. (Pubitemid 36987198)
    • (2003) Genome Research , vol.13 , Issue.8 , pp. 1916-1922
    • Li, X.1    Waterman, M.S.2
  • 30
    • 78649358717
    • Quake: Quality-aware detection and correction of sequencing errors
    • Kelley DR, Schatz MC, Salzberg SL (2010) Quake: quality-aware detection and correction of sequencing errors. Genome Biol 11: R116.
    • (2010) Genome Biol , vol.11
    • Kelley, D.R.1    Schatz, M.C.2    Salzberg, S.L.3
  • 31
    • 80054732674
    • Efficient de novo assembly of single-cell bacterial genomes from short-read data sets
    • Chitsaz H, Yee-Greenbaum J, Tesler G, Lombardo M, Dupont C, et al. (2011) Efficient de novo assembly of single-cell bacterial genomes from short-read data sets. Nat Biotechnol 29: 915-21.
    • (2011) Nat Biotechnol , vol.29 , pp. 915-921
    • Chitsaz, H.1    Yee-Greenbaum, J.2    Tesler, G.3    Lombardo, M.4    Dupont, C.5
  • 32
    • 84880266648
    • De novo transcript sequence reconstruction from rna-seq using the trinity platform for reference generation and analysis
    • Haas BJ, Papanicolaou A, Yassour M, Grabherr M, Blood PD, et al. (2013) De novo transcript sequence reconstruction from rna-seq using the trinity platform for reference generation and analysis. Nat Protoc 8: 1494-512.
    • (2013) Nat Protoc , vol.8 , pp. 1494-1512
    • Haas, B.J.1    Papanicolaou, A.2    Yassour, M.3    Grabherr, M.4    Blood, P.D.5
  • 34
    • 43149115851
    • Velvet: Algorithms for de novo short read assembly using de Bruijn graphs
    • DOI 10.1101/gr.074492.107
    • Zerbino DR, Birney E (2008) Velvet: algorithms for de novo short read assembly using de bruijn graphs. Genome Res 18: 821-9. (Pubitemid 351645072)
    • (2008) Genome Research , vol.18 , Issue.5 , pp. 821-829
    • Zerbino, D.R.1    Birney, E.2
  • 36
    • 84880101277
    • Summarizing and mining skewed data streams
    • Kargupta H, Srivastava J, Kamath C, Goodman A, editors, SIAM
    • Cormode G, Muthukrishnan S (2005) Summarizing and mining skewed data streams. In: Kargupta H, Srivastava J, Kamath C, Goodman A, editors, SDM. SIAM, pp. 44-55.
    • (2005) SDM , pp. 44-55
    • Cormode, G.1    Muthukrishnan, S.2
  • 39
    • 34247481878
    • IPython: A system for interactive scientific computing
    • DOI 10.1109/MCSE.2007.53, 4160251
    • Pérez F, Granger B (2007) Ipython: A system for interactive scientific computing. Computing in Science Engineering 9: 21-29. (Pubitemid 46646861)
    • (2007) Computing in Science and Engineering , vol.9 , Issue.3 , pp. 21-29
    • Perez, F.1    Granger, B.E.2
  • 40
    • 77950251400
    • A human gut microbial gene catalogue established by metagenomic sequencing
    • Qin J, Li R, Raes J, Arumugam M, Burgdorf KS, et al. (2010) A human gut microbial gene catalogue established by metagenomic sequencing. Nature 464: 59-65.
    • (2010) Nature , vol.464 , pp. 59-65
    • Qin, J.1    Li, R.2    Raes, J.3    Arumugam, M.4    Burgdorf, K.S.5


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.