



Volume 4, Issue 1, 2015, Pages

A quantitative assessment of the Hadoop framework for analyzing massively parallel DNA sequencing data

Author keywords

Bioinformatics; DNA seq; Hadoop; High performance computing; Massively parallel sequencing; Next generation sequencing

Indexed keywords

ARTICLE; COMPUTER INTERFACE; CONTROLLED STUDY; DATA ANALYSIS; DATA ANALYSIS SOFTWARE; DNA SEQUENCE; INFORMATION; INFORMATION SYSTEM; MEASUREMENT; PRIORITY JOURNAL; QUANTITATIVE ANALYSIS; BIOLOGY; COMPUTER PROGRAM; INTERNET; PROCEDURES;

EID: 84979619031     PISSN: None     EISSN: 2047217X     Source Type: Journal    
DOI: 10.1186/s13742-015-0058-5     Document Type: Article
Times cited : (21)

References (51)
  • 1
    • 72849144434
    • Sequencing technologies-the next generation
    • Metzker ML. Sequencing technologies-the next generation. Nat Rev Genet. 2010;11(1):31-46.
    • (2010) Nat Rev Genet. , vol.11 , Issue.1 , pp. 31-46
    • Metzker, M.L.1
  • 2
    • 84878979335
    • Biology: The big challenges of big data
    • Marx V. Biology: The big challenges of big data. Nature. 2013;498(7453):255-60.
    • (2013) Nature. , vol.498 , Issue.7453 , pp. 255-260
    • Marx, V.1
  • 3
    • 85006221417
    • Hiseq Comparison. Available from: http://www.illumina.com/systems/sequencing.ilmn.
  • 4
    • 67649884743
    • Fast and accurate short read alignment with Burrows-Wheeler transform
    • Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009;25(14):1754-60.
    • (2009) Bioinformatics. , vol.25 , Issue.14 , pp. 1754-1760
    • Li, H.1    Durbin, R.2
  • 5
    • 62349130698
    • Ultrafast and memory-efficient alignment of short DNA sequences to the human genome
    • Langmead B, Trapnell C, Pop M, Salzberg SL. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 2009;10(3):R25.
    • (2009) Genome Biol. , vol.10 , Issue.3
    • Langmead, B.1    Trapnell, C.2    Pop, M.3    Salzberg, S.L.4
  • 9
    • 84885406698
    • GNU Parallel-The Command-Line Power Tool
    • Tange O. GNU Parallel-The Command-Line Power Tool. The USENIX Magazine. 2011;36(1):42-7. Available from: http://www.gnu.org/s/parallel.
    • (2011) The USENIX Magazine. , vol.36 , Issue.1 , pp. 42-47
    • Tange, O.1
  • 10
    • 85006226477
    • The Message Passing Interface (MPI) standard. Available from: http://www.mcs.anl.gov/research/projects/mpi/.
  • 11
    • 85006196906
    • The Extended Randomized Numerical alignEr. Available from: http://erne.sourceforge.net.
  • 12
    • 85006196918
    • pMap: Parallel Sequence Mapping Tool. Available from: http://bmi.osu.edu/hpc/software/pmap/pmap.html.
  • 14
    • 77956295988
    • The genome analysis toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data
    • McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, et al. The genome analysis toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 2010;20(9):1297-303.
    • (2010) Genome Res. , vol.20 , Issue.9 , pp. 1297-1303
    • McKenna, A.1    Hanna, M.2    Banks, E.3    Sivachenko, A.4    Cibulskis, K.5    Kernytsky, A.6
  • 16
    • 85006193954
    • Hadoop Wiki-Powered By. Available from: https://wiki.apache.org/hadoop/PoweredBy.
  • 19
    • 74049113467
    • 1st ed. Sebastopol: O'Reilly
    • White T. Hadoop: The Definitive Guide. 1st ed. Sebastopol: O'Reilly; 2009. Available from: http://oreilly.com/catalog/9780596521981.
    • (2009) Hadoop: The Definitive Guide
    • White, T.1
  • 20
    • 84905178199
    • 1st ed. Sebastopol: O'Reilly Media, Inc.
    • Sammer E. Hadoop Operations. 1st ed. Sebastopol: O'Reilly Media, Inc.; 2012.
    • (2012) Hadoop Operations
    • Sammer, E.1
  • 21
    • 65649120715
    • CloudBurst: highly sensitive read mapping with MapReduce
    • Schatz MC. CloudBurst: highly sensitive read mapping with MapReduce. Bioinformatics. 2009;25(11):1363-9.
    • (2009) Bioinformatics. , vol.25 , Issue.11 , pp. 1363-1369
    • Schatz, M.C.1
  • 23
    • 77955343193
    • Cloud-scale RNA-sequencing differential expression analysis with Myrna
    • Langmead B, Hansen KD, Leek JT, et al. Cloud-scale RNA-sequencing differential expression analysis with Myrna. Genome Biol. 2010;11(8):R83.
    • (2010) Genome Biol. , vol.11 , Issue.8
    • Langmead, B.1    Hansen, K.D.2    Leek, J.T.3
  • 25
    • 78650811522
    • An overview of the Hadoop/MapReduce/HBase framework and its current applications in bioinformatics
    • Taylor R. An overview of the Hadoop/MapReduce/HBase framework and its current applications in bioinformatics. BMC Bioinformatics. 2010;11(Suppl 12):S1. Available from: http://www.biomedcentral.com/1471-2105/11/S12/S1.
    • (2010) BMC Bioinformatics. , vol.11
    • Taylor, R.1
  • 26
    • 85006196865
    • The Arabidopsis Information Resource (TAIR). Available from: www.arabidopsis.org.
  • 28
    • 85006195565
    • UPPMAX. Available from: http://uppmax.uu.se.
  • 29
    • 85006195570
    • Open Nebula. Available from: http://opennebula.org.
  • 30
    • 85006197996
    • Cloudera. http://www.cloudera.com/content/cloudera/en/whycloudera/hadoop-and-big-data.html.
  • 31
    • 70449709794
    • Virtualization with KVM
    • Feb
    • Habib I. Virtualization with KVM. Linux J. 2008 Feb;2008(166). Available from: http://dl.acm.org/citation.cfm?id=1344209.1344217.
    • (2008) Linux J , vol.2008 , Issue.166
    • Habib, I.1
  • 32
    • 84877074138
    • Single Nucleotide Polymorphism (SNP) Detection and Genotype Calling from Massively Parallel Sequencing (MPS) Data
    • Li Y, Chen W, Liu EY, Zhou YH. Single Nucleotide Polymorphism (SNP) Detection and Genotype Calling from Massively Parallel Sequencing (MPS) Data. Stat Biosci. 2013;5(1):3-25.
    • (2013) Stat Biosci. , vol.5 , Issue.1 , pp. 3-25
    • Li, Y.1    Chen, W.2    Liu, E.Y.3    Zhou, Y.H.4
  • 33
    • 85006193884
    • Short Oligonucleotide Analysis Package. Available from: http://soap.genomics.org.cn/soapsnp.html.
  • 35
  • 38
    • 85006221465
    • 1001 Genomes Project database. Available from: http://1001genomes.org/data/software/shoremap/shoremap%5C_2.0%5C%5C/data/reads/Schneeberger.2009/Schneeberger.2009.single%5C_end.gz.
    • 1001 Genomes Project database
  • 39
    • 79960410567
    • SEAL: a distributed short read mapping and duplicate removal tool
    • Pireddu L, Leo S, Zanetti G. SEAL: a distributed short read mapping and duplicate removal tool. Bioinformatics. 2011;27(15):2159-60.
    • (2011) Bioinformatics. , vol.27 , Issue.15 , pp. 2159-2160
    • Pireddu, L.1    Leo, S.2    Zanetti, G.3
  • 40
    • 85030321143
    • MapReduce: Simplified data processing on large clusters
    • San Francisco, CA. 2004.
    • Dean J, Ghemawat S. MapReduce: Simplified data processing on large clusters. In: Sixth Symposium on Operating System Design and Implementation: 2004; San Francisco, CA. 2004. https://www.usenix.org/legacy/event/osdi04/tech/full_papers/dean/dean.pdf.
    • (2004) Sixth Symposium on Operating System Design and Implementation:
    • Dean, J.1    Ghemawat, S.2
  • 45
    • 84899935089
    • MARIANE: Using MapReduce in HPC environments
    • Special Section: Intelligent Big Data Processing Special Section: Behavior Data Security Issues in Network Information Propagation Special Section: Energy-efficiency in Large Distributed Computing Architectures Special Section: eScience Infrastructure and Applications
    • Fadika Z, Dede E, Govindaraju M, Ramakrishnan L. MARIANE: Using MapReduce in HPC environments. Future Generation Comput Syst. 2014;36(0):379-88. Special Section: Intelligent Big Data Processing Special Section: Behavior Data Security Issues in Network Information Propagation Special Section: Energy-efficiency in Large Distributed Computing Architectures Special Section: eScience Infrastructure and Applications. Available from: http://www.sciencedirect.com/science/article/pii/S0167739X13002719.
    • (2014) Future Generation Comput Syst. , vol.36 , pp. 379-388
    • Fadika, Z.1    Dede, E.2    Govindaraju, M.3    Ramakrishnan, L.4
  • 46
    • 84890045371
    • BioPig: a Hadoop-based analytic toolkit for large-scale sequence data
    • Nordberg H, Bhatia K, Wang K, Wang Z. BioPig: a Hadoop-based analytic toolkit for large-scale sequence data. Bioinformatics. 2013;29(23):3014-9.
    • (2013) Bioinformatics. , vol.29 , Issue.23 , pp. 3014-3019
    • Nordberg, H.1    Bhatia, K.2    Wang, K.3    Wang, Z.4
  • 50
    • 84907494876
    • SparkSeq: fast, scalable, cloud-ready tool for the interactive genomic data analysis with nucleotide precision
    • Wiewiórka MS, Messina A, Pacholewska A, Maffioletti S, Gawrysiak P, Okoniewski MJ. SparkSeq: fast, scalable, cloud-ready tool for the interactive genomic data analysis with nucleotide precision. Bioinformatics. 2014. p. btu343 http://dx.doi.org/10.1093/bioinformatics/btu343.
    • (2014) Bioinformatics
    • Wiewiórka, M.S.1    Messina, A.2    Pacholewska, A.3    Maffioletti, S.4    Gawrysiak, P.5    Okoniewski, M.J.6


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.