



Volume 65, 2016, Pages 153-168

Scalable and efficient whole-exome data processing using workflows on the cloud

Author keywords

Cloud computing; HPC; Performance analysis; Whole exome sequencing; Workflow based application

Indexed keywords

BIOINFORMATICS; CLOUD COMPUTING; COMPUTATION THEORY; COMPUTER ARCHITECTURE; DATA FLOW ANALYSIS; DATA HANDLING; INFORMATION MANAGEMENT; PIPELINE PROCESSING SYSTEMS; PIPELINES; SOFTWARE PROTOTYPING; WINDOWS OPERATING SYSTEM; WORK SIMPLIFICATION;

EID: 84959191705     PISSN: 0167739X     EISSN: None     Source Type: Journal    
DOI: 10.1016/j.future.2016.01.001     Document Type: Article
Times cited: 21

References (38)
  • 1
    • 84936984398
    • Data analysis: Create a cloud commons
    • [1] Stein, L.D., Knoppers, B.M., Campbell, P., Getz, G., Korbel, J.O., Data analysis: Create a cloud commons. Nature 523:7559 (2015), 149–151, 10.1038/523149a.
    • (2015) Nature , vol.523 , Issue.7559 , pp. 149-151
    • Stein, L.D.1    Knoppers, B.M.2    Campbell, P.3    Getz, G.4    Korbel, J.O.5
  • 2
    • 84875909665
    • DNA sequencing costs: data from the NHGRI Genome Sequencing Program (GSP)
    • [2] K. Wetterstrand, DNA sequencing costs: data from the NHGRI Genome Sequencing Program (GSP), 2015.
    • (2015)
    • Wetterstrand, K.1
  • 3
    • 77954526823
    • The case for cloud computing in genome informatics
    • [3] Stein, L.D., The case for cloud computing in genome informatics. Genome Biol., 11(5), 2010, 207, 10.1186/gb-2010-11-5-207.
    • (2010) Genome Biol. , vol.11 , Issue.5 , pp. 207
    • Stein, L.D.1
  • 4
    • 84887611816
    • Next-generation sequencing in the clinic: promises and challenges
    • [4] Xuan, J., Yu, Y., Qing, T., Guo, L., Shi, L., Next-generation sequencing in the clinic: promises and challenges. Cancer lett. 340:2 (2013), 284–295, 10.1016/j.canlet.2012.11.025.
    • (2013) Cancer lett. , vol.340 , Issue.2 , pp. 284-295
    • Xuan, J.1    Yu, Y.2    Qing, T.3    Guo, L.4    Shi, L.5
  • 5
    • 84873721644
    • From sequencer to supercomputer: an automatic pipeline for managing and processing next generation sequencing data
    • in: AMIA Summits on Translational Science proceedings AMIA Summit on Translational Science,
    • [5] T. Camerlengo, H.G. Ozer, R. Onti-Srinivasan, P. Yan, T. Huang, J. Parvin, K. Huang, From sequencer to supercomputer: an automatic pipeline for managing and processing next generation sequencing data, in: AMIA Summits on Translational Science proceedings AMIA Summit on Translational Science, 2012, pp. 1–10.
    • (2012) , pp. 1-10
    • Camerlengo, T.1    Ozer, H.G.2    Onti-Srinivasan, R.3    Yan, P.4    Huang, T.5    Parvin, J.6    Huang, K.7
  • 6
    • 84864754844
    • SIMPLEX: cloud-enabled pipeline for the comprehensive analysis of exome sequencing data
    • [6] Fischer, M., Snajder, R., Pabinger, S., Dander, A., Schossig, A., Zschocke, J., Trajanoski, Z., Stocker, G., SIMPLEX: cloud-enabled pipeline for the comprehensive analysis of exome sequencing data. PLoS ONE, 7(8), 2012, e41948, 10.1371/journal.pone.0041948.
    • (2012) PLoS ONE , vol.7 , Issue.8 , pp. e41948
    • Fischer, M.1    Snajder, R.2    Pabinger, S.3    Dander, A.4    Schossig, A.5    Zschocke, J.6    Trajanoski, Z.7    Stocker, G.8
  • 8
    • 84892473851
    • SeqBench: integrated solution for the management and analysis of exome sequencing data
    • [8] Dander, A., Pabinger, S., Sperk, M., Fischer, M., Stocker, G., Trajanoski, Z., SeqBench: integrated solution for the management and analysis of exome sequencing data. BMC Res. Notes, 7(1), 2014, 43, 10.1186/1756-0500-7-43.
    • (2014) BMC Res. Notes , vol.7 , Issue.1 , pp. 43
    • Dander, A.1    Pabinger, S.2    Sperk, M.3    Fischer, M.4    Stocker, G.5    Trajanoski, Z.6
  • 15
    • 59849095285
    • Workflows and e-Science: An overview of workflow system features and capabilities
    • [15] Deelman, E., Gannon, D., Shields, M., Taylor, I., Workflows and e-Science: An overview of workflow system features and capabilities. Future Gener. Comput. Syst. 25:5 (2009), 528–540, 10.1016/j.future.2008.06.012.
    • (2009) Future Gener. Comput. Syst. , vol.25 , Issue.5 , pp. 528-540
    • Deelman, E.1    Gannon, D.2    Shields, M.3    Taylor, I.4
  • 18
    • 84893786296
    • Cloud computing for fast prediction of chemical activity
    • [18] Cała, J., Hiden, H., Woodman, S., Watson, P., Cloud computing for fast prediction of chemical activity. Future Gener. Comput. Syst. 29:7 (2013), 1860–1869, 10.1016/j.future.2013.01.011.
    • (2013) Future Gener. Comput. Syst. , vol.29 , Issue.7 , pp. 1860-1869
    • Cała, J.1    Hiden, H.2    Woodman, S.3    Watson, P.4
  • 19
    • 77949587649
    • Fast and accurate long-read alignment with Burrows-Wheeler transform
    • [19] Li, H., Durbin, R., Fast and accurate long-read alignment with Burrows-Wheeler transform. Bioinformatics (Oxford, England) 26:5 (2010), 589–595, 10.1093/bioinformatics/btp698.
    • (2010) Bioinformatics (Oxford, England) , vol.26 , Issue.5 , pp. 589-595
    • Li, H.1    Durbin, R.2
  • 21
    • 77956534324
    • ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data
    • e164–e164.
    • [21] Wang, K., Li, M., Hakonarson, H., ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Res., 38(16), 2010 e164–e164. http://dx.doi.org/10.1093/nar/gkq603.
    • (2010) Nucleic Acids Res. , vol.38 , Issue.16
    • Wang, K.1    Li, M.2    Hakonarson, H.3
  • 22
    • 84858659440
    • Treating shimantic web syndrome with ontologies
    • J. Domingue L. Cabral E. Motta Milton Keynes UK
    • [22] Hull, D., Stevens, R., Lord, P., Wroe, C., Goble, C., Treating shimantic web syndrome with ontologies. Domingue, J., Cabral, L., Motta, E., (eds.) AKT Workshop on Semantic Web Services, 2004, Milton Keynes, UK, 1–8.
    • (2004) AKT Workshop on Semantic Web Services , pp. 1-8
    • Hull, D.1    Stevens, R.2    Lord, P.3    Wroe, C.4    Goble, C.5
  • 24
  • 25
    • 84982983574
    • ExM: High level dataflow programming for extreme-scale systems
    • in: 4th USENIX Workshop on Hot Topics in Parallelism (HotPar), poster, Berkeley, CA
    • [25] T.G. Armstrong, J.M. Wozniak, M. Wilde, K. Maheshwari, D.S. Katz, M. Ripeanu, E.L. Lusk, I.T. Foster, ExM: High level dataflow programming for extreme-scale systems, in: 4th USENIX Workshop on Hot Topics in Parallelism (HotPar), poster, Berkeley, CA, 2012.
    • (2012)
    • Armstrong, T.G.1    Wozniak, J.M.2    Wilde, M.3    Maheshwari, K.4    Katz, D.S.5    Ripeanu, M.6    Lusk, E.L.7    Foster, I.T.8
  • 26
    • 84893817402
    • Performance evaluation of parallel strategies in public clouds: A study with phylogenomic workflows
    • [26] de Oliveira, D., Ocaña, K.A., Ogasawara, E., Dias, J., Gonçalves, J., Baião, F., Mattoso, M., Performance evaluation of parallel strategies in public clouds: A study with phylogenomic workflows. Future Gener. Comput. Syst. 29:7 (2013), 1816–1825, 10.1016/j.future.2012.12.019.
    • (2013) Future Gener. Comput. Syst. , vol.29 , Issue.7 , pp. 1816-1825
    • de Oliveira, D.1    Ocaña, K.A.2    Ogasawara, E.3    Dias, J.4    Gonçalves, J.5    Baião, F.6    Mattoso, M.7
  • 27
    • 84924339377
    • Churchill: an ultra-fast, deterministic, highly scalable and balanced parallelization strategy for the discovery of human genetic variation in clinical and population-scale genomics
    • [27] Kelly, B.J., Fitch, J.R., Hu, Y., Corsmeier, D.J., Zhong, H., Wetzel, A.N., Nordquist, R.D., Newsom, D.L., White, P., Churchill: an ultra-fast, deterministic, highly scalable and balanced parallelization strategy for the discovery of human genetic variation in clinical and population-scale genomics. Genome Biol., 16(1), 2015, 6, 10.1186/s13059-014-0577-x.
    • (2015) Genome Biol. , vol.16 , Issue.1 , pp. 6
    • Kelly, B.J.1    Fitch, J.R.2    Hu, Y.3    Corsmeier, D.J.4    Zhong, H.5    Wetzel, A.N.6    Nordquist, R.D.7    Newsom, D.L.8    White, P.9
  • 28
    • 84902553652
    • Cloud-based bioinformatics workflow platform for large-scale next-generation sequencing analyses
    • [28] Liu, B., Madduri, R.K., Sotomayor, B., Chard, K., Lacinski, L., Dave, U.J., Li, J., Liu, C., Foster, I.T., Cloud-based bioinformatics workflow platform for large-scale next-generation sequencing analyses. J. Biomed. Inf. 49 (2014), 119–133, 10.1016/j.jbi.2014.01.005.
    • (2014) J. Biomed. Inf. , vol.49 , pp. 119-133
    • Liu, B.1    Madduri, R.K.2    Sotomayor, B.3    Chard, K.4    Lacinski, L.5    Dave, U.J.6    Li, J.7    Liu, C.8    Foster, I.T.9
  • 29
    • 77955801615
    • Galaxy: a comprehensive approach for supporting accessible, reproducible, and transparent computational research in the life sciences
    • [29] Goecks, J., Nekrutenko, A., Taylor, J., Team, T.G., Galaxy: a comprehensive approach for supporting accessible, reproducible, and transparent computational research in the life sciences. Genome Biol., 11(8), 2010, R86, 10.1186/gb-2010-11-8-r86.
    • (2010) Genome Biol. , vol.11 , Issue.8 , pp. R86
    • Goecks, J.1    Nekrutenko, A.2    Taylor, J.3    Team, T.G.4
  • 30
    • 84947222502
    • SVI: a simple single-nucleotide human variant interpretation tool for clinical use
    • N. Ashish J.-L. Ambite Springer International Publishing
    • [30] Missier, P., Wijaya, E., Kirby, R., SVI: a simple single-nucleotide human variant interpretation tool for clinical use. Ashish, N., Ambite, J.-L., (eds.) Data Integration in the Life Sciences, vol. 9162, 2015, Springer International Publishing, 180–194, 10.1007/978-3-319-21843-4_14.
    • (2015) Data Integration in the Life Sciences , vol.9162 , pp. 180-194
    • Missier, P.1    Wijaya, E.2    Kirby, R.3
  • 31
    • 84883619207
    • Accelerating data-intensive genome analysis in the cloud
    • in: Proceedings of the 5th International Conference on Bioinformatics and Computational Biology, Honolulu
    • [31] N.M. Mohamed, H. Lin, W. Feng, Accelerating data-intensive genome analysis in the cloud, in: Proceedings of the 5th International Conference on Bioinformatics and Computational Biology, Honolulu, 2013.
    • (2013)
    • Mohamed, N.M.1    Lin, H.2    Feng, W.3
  • 32
    • 79955554401
    • Efficient storage of high throughput DNA sequencing data using reference-based compression
    • [32] Hsi-Yang Fritz, M., Leinonen, R., Cochrane, G., Birney, E., Efficient storage of high throughput DNA sequencing data using reference-based compression. Genome Res. 21:5 (2011), 734–740, 10.1101/gr.114819.110.
    • (2011) Genome Res. , vol.21 , Issue.5 , pp. 734-740
    • Hsi-Yang Fritz, M.1    Leinonen, R.2    Cochrane, G.3    Birney, E.4
  • 33
    • 84925865899
    • Fastq2vcf: a concise and transparent pipeline for whole-exome sequencing data analyses
    • [33] Gao, X., Xu, J., Starmer, J., Fastq2vcf: a concise and transparent pipeline for whole-exome sequencing data analyses. BMC Res. Notes, 8(1), 2015, 72, 10.1186/s13104-015-1027-x.
    • (2015) BMC Res. Notes , vol.8 , Issue.1 , pp. 72
    • Gao, X.1    Xu, J.2    Starmer, J.3
  • 34
    • 84962831643
    • The Impact of High-Performance Computing Best Practice Applied to Next-Generation Sequencing Workflows
    • bioRxiv, 017665.
    • [34] P. Carrier, B. Long, R. Walsh, J. Dawson, C.P. Sosa, B. Haas, T. Tickle, T. William, The Impact of High-Performance Computing Best Practice Applied to Next-Generation Sequencing Workflows, bioRxiv, 2015, 017665. http://dx.doi.org/10.1101/017665.
    • (2015)
    • Carrier, P.1    Long, B.2    Walsh, R.3    Dawson, J.4    Sosa, C.P.5    Haas, B.6    Tickle, T.7    William, T.8
  • 36
    • 84869862536
    • CloudMan as a platform for tool, data, and analysis distribution
    • [36] Afgan, E., Chapman, B., Taylor, J., CloudMan as a platform for tool, data, and analysis distribution. BMC Bioinformatics, 13(1), 2012, 315, 10.1186/1471-2105-13-315.
    • (2012) BMC Bioinformatics , vol.13 , Issue.1 , pp. 315
    • Afgan, E.1    Chapman, B.2    Taylor, J.3
  • 37
    • 70349753672
    • Benchmarking amazon EC2 for high-performance scientific computing
    • [37] Walker, E., Benchmarking amazon EC2 for high-performance scientific computing. login 33:5 (2008), 18–23.
    • (2008) login , vol.33 , Issue.5 , pp. 18-23
    • Walker, E.1
  • 38
    • 84885883700
    • A performance analysis of EC2 cloud computing services for scientific computing
    • Dimiter R. Avresky M. Diaz A. Bode B. Ciciani E. Dekel Springer Berlin, Heidelberg (Chapter 4)
    • [38] Ostermann, S., Iosup, A., Yigitbasi, N., Prodan, R., Fahringer, T., Epema, D., A performance analysis of EC2 cloud computing services for scientific computing. Avresky, Dimiter R., Diaz, M., Bode, A., Ciciani, B., Dekel, E., (eds.) Cloud Computing, 2010, Springer, Berlin, Heidelberg, 115–131 (Chapter 4) http://dx.doi.org/10.1007/978-3-642-12636-9_9.
    • (2010) Cloud Computing , pp. 115-131
    • Ostermann, S.1    Iosup, A.2    Yigitbasi, N.3    Prodan, R.4    Fahringer, T.5    Epema, D.6


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.