Volume 11, Issue 3, 2016, Pages 352-371

Hadooping the genome: The impact of big data tools on biology

Author keywords

Big data; DNA sequence; Genomics; Google; Hadoop


EID: 85011024215     PISSN: 17458552     EISSN: 17458560     Source Type: Journal    
DOI: 10.1057/s41292-016-0003-6     Document Type: Article
Times cited : (8)

References (79)
  • 1
    • 77957947562
    • Hundreds of variants clustered in genomic loci and biological pathways affect human height
    • Allen, H.L. et al (2010) Hundreds of variants clustered in genomic loci and biological pathways affect human height. Nature 467, no 7321: 832-838.
    • (2010) Nature , vol.467 , Issue.7321 , pp. 832-838
    • Allen, H.L.1
  • 2
    • 0025183708
    • Basic local alignment search tool
    • Altschul, S.F. et al (1990) Basic local alignment search tool. Journal of Molecular Biology 215: 403-410.
    • (1990) Journal of Molecular Biology , vol.215 , pp. 403-410
    • Altschul, S.F.1
  • 3
    • 0032653312
    • Algorithms for whole genome shotgun sequencing
    • Lyon
    • Anson, E. and Myers, E. (1999) Algorithms for whole genome shotgun sequencing. In: Proceedings of RECOMB'99, Lyon, pp. 1-9.
    • (1999) Proceedings of RECOMB'99 , pp. 1-9
    • Anson, E.1    Myers, E.2
  • 4
    • 85011054550
    • Encyclopedia of computer science and technology
    • New York: Marcel Dekker
    • Belzer, J. et al (eds.) (1978) Encyclopedia of Computer Science and Technology. Vol. 10. Linear and Matrix Algebra to Microorganisms. New York: Marcel Dekker.
    • (1978) Linear and Matrix Algebra to Microorganisms , vol.10
    • Belzer, J.1
  • 5
    • 85011030232
    • Analyzing human genomes with Apache Hadoop
    • 15 October, Cloudera. accessed 27 May 2015
    • Bisciglia, C. (2009) Analyzing human genomes with Apache Hadoop. Weblog, 15 October, Cloudera. http://blog.cloudera.com/blog/2009/10/analyzing-human-genomes-with-hadoop/, accessed 27 May 2015.
    • (2009) Weblog
    • Bisciglia, C.1
  • 9
    • 85011064445
    • The anatomy of a large-scale hypertextual web search engine
    • Stanford University accessed 27 May 2015
    • Brin, S. and Page, L. (2000) The anatomy of a large-scale hypertextual web search engine. Computer Science Department, Stanford University. http://infolab.stanford.edu/pub/papers/google.pdf, accessed 27 May 2015.
    • (2000) Computer Science Department
    • Brin, S.1    Page, L.2
  • 10
    • 85011070176
    • Cloudera and Mount Sinai: The structure of a big data revolution?
    • 6 July. accessed 27 May 2015
    • Brust, A. (2012) Cloudera and Mount Sinai: The structure of a big data revolution? ZDNet, 6 July. http://www.zdnet.com/article/cloudera-and-mount-sinai-the-structure-of-a-big-data-revolution/, accessed 27 May 2015.
    • (2012) ZDNet
    • Brust, A.1
  • 11
    • 0003573193
    • A block sorting lossless data compression algorithm
    • Digital Equipment Corporation. accessed 27 May 2015
    • Burrows, M. and Wheeler, D.J. (1994) A block sorting lossless data compression algorithm. Technical Report 124, Digital Equipment Corporation. http://www.hpl.hp.com/techreports/Compaq-DEC/SRC-RR-124.html, accessed 27 May 2015.
    • (1994) Technical Report 124
    • Burrows, M.1    Wheeler, D.J.2
  • 12
    • 34447311281
    • How Google works: The Google file system
    • 6 July. accessed 27 May 2015
    • Carr, D.F. (2006) How Google Works: The Google File System. Baseline, 6 July. http://www.baselinemag.com/c/a/Infrastructure/How-Google-Works-1/4, accessed 27 May 2015.
    • (2006) Baseline
    • Carr, D.F.1
  • 13
    • 85011048838
    • Press release, 20 March. accessed 18 September 2015
    • Celera (2000) Celera Genomics to Acquire Paracel Inc. Press release, 20 March. https://www.celera.com/celera/pr-1056568938, accessed 18 September 2015.
    • (2000) Celera Genomics to Acquire Paracel Inc
  • 14
    • 84930889761
    • What does a critical data studies look like, and why do we care? Seven points for a critical approach to big data
    • accessed 23 September 2015
    • Dalton, C. and Thatcher, J. (2014) What does a critical data studies look like, and why do we care? Seven points for a critical approach to big data. Society and Space. http://societyandspace.com/material/commentaries/craig-dalton-and-jim-thatcher-what-does-a-critical-data-studies-look-like-and-why-do-we-careseven-points-for-a-critical-approach-to-big-data/#comments, accessed 23 September 2015.
    • (2014) Society and Space
    • Dalton, C.1    Thatcher, J.2
  • 15
    • 77949756362
    • Genome-wide association studies in pharmacogenomics
    • Daly, A.K. (2010) Genome-wide association studies in pharmacogenomics. Nature Reviews Genetics 11: 241-246.
    • (2010) Nature Reviews Genetics , vol.11 , pp. 241-246
    • Daly, A.K.1
  • 16
    • 85011048835
    • Google Research Publications (appeared in OSDI'04: Sixth Symposium on Operating System Design and Implementation, San Francisco, California, December 2004). accessed 27 May 2015
    • Dean, J. and Ghemawat, S. (2004) MapReduce: Simplified data processing on large clusters. Google Research Publications (appeared in OSDI'04: Sixth Symposium on Operating System Design and Implementation, San Francisco, California, December 2004). http://static.googleusercontent.com/media/research.google.com/es/us/archive/mapreduce-osdi04.pdf, accessed 27 May 2015.
    • (2004) MapReduce: Simplified Data Processing on Large Clusters
    • Dean, J.1    Ghemawat, S.2
  • 17
    • 0033153375
    • Alignment of whole genomes
    • Delcher, A.L. et al (1999) Alignment of whole genomes. Nucleic Acids Research 27(11): 2369-76.
    • (1999) Nucleic Acids Research , vol.27 , Issue.11 , pp. 2369-2376
    • Delcher, A.L.1
  • 18
    • 84962235045
    • No SQL: The shifting materialities of database technology
    • accessed 18 September 2015
    • Dourish, P. (2014) No SQL: The shifting materialities of database technology. Computational Culture: A Journal of Software. http://computationalculture.net/article/no-sql-the-shifting-materialities-of-databasetechnology, accessed 18 September 2015.
    • (2014) Computational Culture: A Journal of Software
    • Dourish, P.1
  • 19
    • 85011027692
    • Blinded by big science
    • 10 September. accessed 23 September 2015
    • Eisen, M. (2012) Blinded by big science. Weblog entry, 10 September. www.michaeleisen.org/blog/?p=1179, accessed 23 September 2015.
    • (2012) Weblog Entry
    • Eisen, M.1
  • 20
    • 85011084471
    • ENCODE at UCSC. accessed 27 May 2015
    • ENCODE at UCSC (2012) ENCODE experiment matrix. http://genome.ucsc.edu/ENCODE/dataMatrix/encodeDataMatrixHuman.html, accessed 27 May 2015.
    • (2012) ENCODE Experiment Matrix
  • 21
    • 0034506014
    • Opportunistic data structures with applications. Foundations of Computer Science
    • IEEE
    • Ferragina, P. and Manzini, G. (2000) Opportunistic data structures with applications. Foundations of Computer Science. In: Proceedings, 41st Annual Symposium, pp. 390-398. IEEE.
    • (2000) Proceedings, 41st Annual Symposium , pp. 390-398
    • Ferragina, P.1    Manzini, G.2
  • 22
    • 85011097489
    • Writer and director: Alex Garland
    • Garland, A. (2015) Ex Machina (film). Writer and director: Alex Garland.
    • (2015) Ex Machina (Film)
    • Garland, A.1
  • 24
    • 84860523681
    • Readjoiner: A fast and memory efficient string graph-based sequence assembler
    • Gonella, G. and Kurtz, S. (2012) Readjoiner: A fast and memory efficient string graph-based sequence assembler. BMC Bioinformatics 13(1): 1-19.
    • (2012) BMC Bioinformatics , vol.13 , Issue.1 , pp. 1-19
    • Gonella, G.1    Kurtz, S.2
  • 26
    • 84941930524
    • Better medicine, brought to you by big data
    • 15 July. accessed 27 May 2015
    • Harris, D. (2012) Better medicine, brought to you by big data. GigaOm, 15 July. https://gigaom.com/2012/07/15/better-medicine-brought-to-you-by-big-data/, accessed 27 May 2015.
    • (2012) GigaOm
    • Harris, D.1
  • 27
    • 83355177241
    • KABOOM! A new suffix array based algorithm for clustering expression data
    • Hazelhurst, S. and Lipák, Z. (2011) KABOOM! A new suffix array based algorithm for clustering expression data. Bioinformatics 27(24): 3348-55.
    • (2011) Bioinformatics , vol.27 , Issue.24 , pp. 3348-3355
    • Hazelhurst, S.1    Lipák, Z.2
  • 28
    • 84891895694
    • The challenges, advantages and future of phenome-wide association studies
    • Hebbring, S.J. (2014) The challenges, advantages and future of phenome-wide association studies. Immunology 141(2): 157-65.
    • (2014) Immunology , vol.141 , Issue.2 , pp. 157-165
    • Hebbring, S.J.1
  • 29
    • 85011089349
    • Data crunchers ditch Hadoop for homegrown software
    • 20 February. accessed 27 May 2015
    • Hernandez, D. (2013) Data crunchers ditch Hadoop for homegrown software. Wired, 20 February. http://www.wired.com/2013/02/genetic-data-glut/, accessed 27 May 2015.
    • (2013) Wired
    • Hernandez, D.1
  • 30
    • 79551589417
    • HiTEC: Accurate error correction in high-throughput sequencing data
    • Ilie, L. et al (2011) HiTEC: Accurate error correction in high-throughput sequencing data. Bioinformatics 27(3): 295-302.
    • (2011) Bioinformatics , vol.27 , Issue.3 , pp. 295-302
    • Ilie, L.1
  • 33
    • 79952256999
    • Adaptive seeds tame genomic sequence comparison
    • Kielbasa, S.M. et al (2011) Adaptive seeds tame genomic sequence comparison. Genome Research 21: 487-93.
    • (2011) Genome Research , vol.21 , pp. 487-493
    • Kielbasa, S.M.1
  • 36
    • 0003352252
    • The art of computer programming
    • Addison-Wesley, Redwood City
    • Knuth, D.E. (1973) The Art of Computer Programming, Volume 3, "Sorting and Searching." Addison-Wesley, Redwood City.
    • (1973) Sorting and Searching , vol.3
    • Knuth, D.E.1
  • 37
    • 84884826911
    • The next-generation sequencing revolution and its impact on genomics
    • Koboldt, D.C. et al (2013) The next-generation sequencing revolution and its impact on genomics. Cell 155(1): 27-38.
    • (2013) Cell , vol.155 , Issue.1 , pp. 27-38
    • Koboldt, D.C.1
  • 38
    • 85011084473
    • A new method to compute k-mer frequencies and its application to annotate large plant genomes
    • Kurtz, S. et al (2008) A new method to compute k-mer frequencies and its application to annotate large plant genomes. BMC Genomics 9(1): 1-18.
    • (2008) BMC Genomics , vol.9 , Issue.1 , pp. 1-18
    • Kurtz, S.1
  • 39
    • 62349130698
    • Ultrafast and memory-efficient alignment of short DNA sequences to the human genome
    • Langmead, B. et al (2009) Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biology 10: R25.
    • (2009) Genome Biology , vol.10 , pp. R25
    • Langmead, B.1
  • 41
    • 77957272611
    • A survey of sequence alignment algorithms for next-generation sequencing
    • Li, H. and Homer, N. (2010) A survey of sequence alignment algorithms for next-generation sequencing. Briefings in Bioinformatics 11(5): 473-483.
    • (2010) Briefings in Bioinformatics , vol.11 , Issue.5 , pp. 473-483
    • Li, H.1    Homer, N.2
  • 42
    • 84982210765
    • On the case at Mount Sinai, It's Dr. Data
    • 7 March, BU1
    • Lohr, S. (2015) On the case at Mount Sinai, It's Dr. Data. New York Times, 7 March, BU1.
    • (2015) New York Times
    • Lohr, S.1
  • 45
    • 85011054584
    • Machine learning and genomic dimensionality
    • S. Richardson and H. Stevens (eds.) Durham and London: Duke University Press
    • Mackenzie, A. (2015b) Machine learning and genomic dimensionality. In: S. Richardson and H. Stevens (eds.) Postgenomics: Perspectives on Biology After the Genome. Durham and London: Duke University Press, pp. 73-102.
    • (2015) Postgenomics: Perspectives on Biology after the Genome , pp. 73-102
    • Mackenzie, A.1
  • 46
    • 84959327493
    • Post-archival genomics and the bulk Logistics of DNA sequences
    • Mackenzie, A. et al (2015) Post-archival genomics and the bulk Logistics of DNA sequences. Biosocieties 11(1): 82-105.
    • (2015) Biosocieties , vol.11 , Issue.1 , pp. 82-105
    • Mackenzie, A.1
  • 48
    • 70349956433
    • Finding the missing heritability of complex diseases
    • Manolio, T.A. et al (2009) Finding the missing heritability of complex diseases. Nature 461, no. 7265: 747-753.
    • (2009) Nature , vol.461 , Issue.7265 , pp. 747-753
    • Manolio, T.A.1
  • 49
    • 34547884903
    • Database as a symbolic form
    • Manovich, L. (1999) Database as a symbolic form. Millennium Film Journal 34 (Fall).
    • (1999) Millennium Film Journal , vol.34 , Issue.FALL
    • Manovich, L.1
  • 52
    • 84925948133
    • How Yahoo spawned Hadoop, the future of big data
    • 18 October. accessed 27 May 2015
    • Metz, C. (2011) How Yahoo spawned Hadoop, the future of big data. Wired, 18 October. http://www.wired.com/2011/10/how-yahoo-spawned-hadoop/, accessed 27 May 2015.
    • (2011) Wired
    • Metz, C.1
  • 53
    • 0034708758
    • Whole-genome assembly of Drosophila
    • Myers, E. et al (2000) Whole-genome assembly of Drosophila. Science 287: 2196-2204.
    • (2000) Science , vol.287 , pp. 2196-2204
    • Myers, E.1
  • 56
    • 77953886372
    • An Environment-Wide Association Study (EWAS) on Type 2 Diabetes Mellitus
    • Patel, C.J. et al (2010) An Environment-Wide Association Study (EWAS) on Type 2 Diabetes Mellitus. PLoS One DOI:10.1371/journal.pone.0010746.
    • (2010) PLoS One
    • Patel, C.J.1
  • 57
    • 85011070162
    • Technology; Supercomputers track human genome
    • 28 August
    • Pollack, A. (2000) Technology; Supercomputers Track Human Genome. New York Times, 28 August.
    • (2000) New York Times
    • Pollack, A.1
  • 59
    • 84951788583
    • Socializing big data: From concept to practice
    • The University of Manchester and Open University
    • Ruppert, E. et al (2015) Socializing big data: From concept to practice. CRESC Working Paper No. 138, The University of Manchester and Open University.
    • (2015) CRESC Working Paper No. 138
    • Ruppert, E.1
  • 60
    • 65649120715
    • Cloudburst: Highly sensitive read mapping with MapReduce
    • Schatz, M. (2009) Cloudburst: Highly sensitive read mapping with MapReduce. Bioinformatics 25(11): 1363-1369.
    • (2009) Bioinformatics , vol.25 , Issue.11 , pp. 1363-1369
    • Schatz, M.1
  • 62
    • 0000817058
    • Epigenetics
    • Science special issue
    • Science (2001) Epigenetics. Science, special issue, 293, no. 5532: 1001-1208.
    • (2001) Science , vol.293 , Issue.5532 , pp. 1001-1208
  • 63
    • 53649106195
    • Next-generation DNA sequencing
    • Shendure, J. and Ji, H. (2008) Next-generation DNA sequencing. Nature Biotechnology 26: 1135-45.
    • (2008) Nature Biotechnology , vol.26 , pp. 1135-1145
    • Shendure, J.1    Ji, H.2
  • 66
    • 54049091943
    • Next-generation sequencing update
    • 1 September. accessed 27 May 2015
    • Stein, R.A. (2008) Next-generation sequencing update. Genetic Engineering & Biotechnology News 28(15), 1 September. http://www.genengnews.com/gen-articles/next-generation-sequencing-update/2584/, accessed 27 May 2015.
    • (2008) Genetic Engineering & Biotechnology News , vol.28 , Issue.15
    • Stein, R.A.1
  • 67
    • 84858269030
    • Coding Sequences: A History of Sequence Comparison Algorithms as a Scientific Instrument
    • Stevens, H. (2011a) Coding Sequences: A History of Sequence Comparison Algorithms as a Scientific Instrument. Perspectives on Science 19(3): 263-299.
    • (2011) Perspectives on Science , vol.19 , Issue.3 , pp. 263-299
    • Stevens, H.1
  • 68
    • 84860421561
    • On the means of bioproduction: Bioinformatics and how to make knowledge in a high-throughput genomics laboratory
    • Stevens, H. (2011b) On the means of bioproduction: Bioinformatics and how to make knowledge in a high-throughput genomics laboratory. Biosocieties 6(2): 217-242.
    • (2011) Biosocieties , vol.6 , Issue.2 , pp. 217-242
    • Stevens, H.1
  • 70
    • 0001899550
    • TIGR assembler: A new tool for assembling large shotgun sequencing projects
    • Sutton et al (1995) TIGR assembler: a new tool for assembling large shotgun sequencing projects. Genome Science & Technology 1(1): 9-19.
    • (1995) Genome Science & Technology , vol.1 , Issue.1 , pp. 9-19
    • Sutton1
  • 71
    • 78650811522
    • An overview of the Hadoop/MapReduce/HBase framework and its current applications in bioinformatics
    • Taylor, R.C. (2010) An overview of the Hadoop/MapReduce/HBase framework and its current applications in bioinformatics. BMC Bioinformatics 11(Suppl 12): S1.
    • (2010) BMC Bioinformatics , vol.11 , pp. S1
    • Taylor, R.C.1
  • 73
    • 85011042379
    • Google works with ISB to evaluate life sciences as application area for new cloud infrastructure
    • 20 July. accessed 27 May 2015
    • Thomas, U.G. (2012) Google works with ISB to evaluate life sciences as application area for new cloud infrastructure. Genomeweb, 20 July. https://www.genomeweb.com/informatics/google-works-isb-evaluatelife-sciences-application-area-new-cloud-infrastructur, accessed 27 May 2015.
    • (2012) Genomeweb
    • Thomas, U.G.1
  • 75
    • 0035895505
    • The sequence of the human genome
    • Venter, J.C. et al (2001) The Sequence of the Human Genome. Science 291, no. 5507: 1304-1351.
    • (2001) Science , vol.291 , Issue.5507 , pp. 1304-1351
    • Venter, J.C.1
  • 76
    • 84860151961
    • Evidence-based psychiatric genetics, AKA the false dichotomy between the common and rare variant hypotheses
    • Visscher, P.M. et al (2012a) Evidence-based psychiatric genetics, AKA the false dichotomy between the common and rare variant hypotheses. Molecular Psychiatry 17, no. 5: 474-485.
    • (2012) Molecular Psychiatry , vol.17 , Issue.5 , pp. 474-485
    • Visscher, P.M.1
  • 78
    • 85011048899
    • Deleterious Me: Whole Genome Sequencing, 23andMe, and the Crowd-Sourced Health Care Revolution
    • Harvard Kennedy School, 18 April
    • Wojcicki, A. et al (2012) Deleterious Me: Whole Genome Sequencing, 23andMe, and the Crowd-Sourced Health Care Revolution. Science and Democracy Lecture Series, Harvard Kennedy School, 18 April. Available at https://vimeo.com/40657814.
    • (2012) Science and Democracy Lecture Series
    • Wojcicki, A.1
  • 79
    • 79953685152
    • The impact of next-generation sequencing on genomics
    • Zhang, J. et al (2011) The impact of next-generation sequencing on genomics. Journal of Genetics and Genomics 38(3): 95-109.
    • (2011) Journal of Genetics and Genomics , vol.38 , Issue.3 , pp. 95-109
    • Zhang, J.1


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.