Volume 7, Issue 1, 2018, Pages 1-6

SOAPnuke: A MapReduce acceleration-supported software for integrated quality control and preprocessing of high-throughput sequencing data

Author keywords

High throughput sequencing; MapReduce; Preprocessing; Quality control

Indexed keywords

MESSENGER RNA; SMALL UNTRANSLATED RNA;

EID: 85044575558     PISSN: None     EISSN: 2047217X     Source Type: Journal    
DOI: 10.1093/gigascience/gix120     Document Type: Article
Times cited : (732)

References (51)
  • 1
    • 73949090104
    • Applications of ultra-high-throughput sequencing
    • Fox S, Filichkin S, Mockler TC. Applications of ultra-high-throughput sequencing. Methods Mol Biol 2009;553:79-108
    • (2009) Methods Mol Biol , vol.553 , pp. 79-108
    • Fox, S.1    Filichkin, S.2    Mockler, T.C.3
  • 2
    • 84873801934
    • High-throughput sequencing for biology and medicine
    • Soon WW, Hariharan M, Snyder MP. High-throughput sequencing for biology and medicine. Mol Syst Biol 2014;9(1):640
    • (2014) Mol Syst Biol , vol.9 , Issue.1 , pp. 640
    • Soon, W.W.1    Hariharan, M.2    Snyder, M.P.3
  • 3
    • 84938709678
    • Big data: astronomical or genomical?
    • Stephens ZD, Lee SY, Faghri F et al. Big data: astronomical or genomical? PLoS Biol 2015;13(7):e1002195
    • (2015) PLoS Biol , vol.13 , Issue.7
    • Stephens, Z.D.1    Lee, S.Y.2    Faghri, F.3
  • 4
    • 84901013670
    • Three-stage quality control strategies for DNA re-sequencing data
    • Guo Y, Ye F, Sheng Q et al. Three-stage quality control strategies for DNA re-sequencing data. Brief Bioinformatics 2014;15(6):879-89
    • (2014) Brief Bioinformatics , vol.15 , Issue.6 , pp. 879-889
    • Guo, Y.1    Ye, F.2    Sheng, Q.3
  • 5
    • 84897111313
    • Prevention, diagnosis and treatment of high-throughput sequencing data pathologies
    • Zhou X, Rokas A. Prevention, diagnosis and treatment of high-throughput sequencing data pathologies. Mol Ecol 2014;23(7):1679-700
    • (2014) Mol Ecol , vol.23 , Issue.7 , pp. 1679-1700
    • Zhou, X.1    Rokas, A.2
  • 6
    • 79952422304
    • Quality control and preprocessing of metagenomic datasets
    • Schmieder R, Edwards R. Quality control and preprocessing of metagenomic datasets. Bioinformatics 2011;27(6):863-4
    • (2011) Bioinformatics , vol.27 , Issue.6 , pp. 863-864
    • Schmieder, R.1    Edwards, R.2
  • 7
    • 52949088691
    • A toolkit for analysing large-scale plant small RNA datasets
    • Moxon S, Schwach F, Dalmay T et al. A toolkit for analysing large-scale plant small RNA datasets. Bioinformatics 2008;24(19):2252-3
    • (2008) Bioinformatics , vol.24 , Issue.19 , pp. 2252-2253
    • Moxon, S.1    Schwach, F.2    Dalmay, T.3
  • 8
    • 84874604550
    • FASTQ/A short-reads preprocessing tools. Accessed 1 November 2017
    • Gordon A, Hannon GJ. Fastx-toolkit. FASTQ/A short-reads preprocessing tools. http://hannonlab.cshl.edu/fastx toolkit. Accessed 1 November 2017
    • Fastx-toolkit
    • Gordon, A.1    Hannon, G.J.2
  • 9
    • 77957151956
    • SolexaQA: At-a-glance quality assessment of Illumina second-generation sequencing data
    • Cox MP, Peterson DA, Biggs PJ. SolexaQA: At-a-glance quality assessment of Illumina second-generation sequencing data. BMC Bioinformatics 2010;11(1):485
    • (2010) BMC Bioinformatics , vol.11 , Issue.1 , pp. 485
    • Cox, M.P.1    Peterson, D.A.2    Biggs, P.J.3
  • 10
    • 84863054719
    • BIGpre: a quality assessment package for next-generation sequencing data
    • Zhang T, Luo Y, Liu K et al. BIGpre: a quality assessment package for next-generation sequencing data. Genomics Proteomics Bioinformatics 2011;9(6):238-44
    • (2011) Genomics Proteomics Bioinformatics , vol.9 , Issue.6 , pp. 238-244
    • Zhang, T.1    Luo, Y.2    Liu, K.3
  • 12
    • 84873041419
    • HTQC: a fast quality control toolkit for Illumina sequencing data
    • Yang X, Liu D, Liu F et al. HTQC: a fast quality control toolkit for Illumina sequencing data. BMC Bioinformatics 2013;14(1):33
    • (2013) BMC Bioinformatics , vol.14 , Issue.1 , pp. 33
    • Yang, X.1    Liu, D.2    Liu, F.3
  • 14
    • 84875647144
    • QC-Chain: fast and holistic quality control method for next-generation sequencing data
    • Zhou Q, Su X, Wang A et al. QC-Chain: fast and holistic quality control method for next-generation sequencing data. PLoS One 2013;8(4):e60234
    • (2013) PLoS One , vol.8 , Issue.4
    • Zhou, Q.1    Su, X.2    Wang, A.3
  • 15
    • 84894254948
    • Meta-QC-Chain: comprehensive and fast quality control method for metagenomic data
    • Zhou Q, Su X, Jing G et al. Meta-QC-Chain: comprehensive and fast quality control method for metagenomic data. Genomics Proteomics Bioinformatics 2014;12(1):52-56
    • (2014) Genomics Proteomics Bioinformatics , vol.12 , Issue.1 , pp. 52-56
    • Zhou, Q.1    Su, X.2    Jing, G.3
  • 16
    • 84856468953
    • NGS QC Toolkit: a toolkit for quality control of next generation sequencing data
    • Patel RK, Jain M. NGS QC Toolkit: a toolkit for quality control of next generation sequencing data. PLoS One 2012;7(2):e30619
    • (2012) PLoS One , vol.7 , Issue.2
    • Patel, R.K.1    Jain, M.2
  • 18
    • 77953727646
    • TagCleaner: identification and removal of tag sequences from genomic and metagenomic datasets
    • Schmieder R, Lim YW, Rohwer F et al. TagCleaner: identification and removal of tag sequences from genomic and metagenomic datasets. BMC Bioinformatics 2010;11(1):341
    • (2010) BMC Bioinformatics , vol.11 , Issue.1 , pp. 341
    • Schmieder, R.1    Lim, Y.W.2    Rohwer, F.3
  • 19
    • 77949513645
    • SeqTrim: a high-throughput pipeline for preprocessing any type of sequence reads
    • Falgueras J, Lara AJ, Fernandez-Pozo N et al. SeqTrim: a high-throughput pipeline for preprocessing any type of sequence reads. BMC Bioinformatics 2010;11(1):38
    • (2010) BMC Bioinformatics , vol.11 , Issue.1 , pp. 38
    • Falgueras, J.1    Lara, A.J.2    Fernandez-Pozo, N.3
  • 21
    • 79960698975
    • Btrim: a fast, lightweight adapter and quality trimming program for next-generation sequencing technologies
    • Kong Y. Btrim: a fast, lightweight adapter and quality trimming program for next-generation sequencing technologies. Genomics 2011;98(2):152-3
    • (2011) Genomics , vol.98 , Issue.2 , pp. 152-153
    • Kong, Y.1
  • 22
    • 84864483625
    • RobiNA: a user-friendly, integrated software solution for RNA-seq-based transcriptomics
    • Lohse M, Bolger AM, Nagel A et al. RobiNA: a user-friendly, integrated software solution for RNA-seq-based transcriptomics. Nucleic Acids Res 2012;40(W1):W622-7
    • (2012) Nucleic Acids Res , vol.40 , Issue.W1 , pp. W622-W627
    • Lohse, M.1    Bolger, A.M.2    Nagel, A.3
  • 23
    • 80255127234
    • Cutadapt removes adapter sequences from high-throughput sequencing reads
    • Martin M. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet J 2011;17(1):10
    • (2011) EMBnet J , vol.17 , Issue.1 , pp. 10
    • Martin, M.1
  • 24
    • 84958153152
    • AdapterRemoval v2: rapid adapter trimming, identification, and read merging
    • Schubert M, Lindgreen S, Orlando L. AdapterRemoval v2: rapid adapter trimming, identification, and read merging. BMC Res Notes 2016;9(1):88
    • (2016) BMC Res Notes , vol.9 , Issue.1 , pp. 88
    • Schubert, M.1    Lindgreen, S.2    Orlando, L.3
  • 25
    • 84887224285
    • FLEXBAR-flexible barcode and adapter processing for next-generation sequencing platforms
    • Dodt M, Roehr JT, Ahmed R et al. FLEXBAR-flexible barcode and adapter processing for next-generation sequencing platforms. Biology (Basel) 2012;1(3):895-905
    • (2012) Biology (Basel) , vol.1 , Issue.3 , pp. 895-905
    • Dodt, M.1    Roehr, J.T.2    Ahmed, R.3
  • 26
    • 84923918674
    • PEAT: an intelligent and efficient paired-end sequencing adapter trimming algorithm
    • Li YL, Weng JC, Hsiao CC et al. PEAT: an intelligent and efficient paired-end sequencing adapter trimming algorithm. BMC Bioinformatics 2015;16(Suppl 1):S2
    • (2015) BMC Bioinformatics , vol.16 , pp. S2
    • Li, Y.L.1    Weng, J.C.2    Hsiao, C.C.3
  • 27
    • 84905049901
    • Trimmomatic: a flexible trimmer for Illumina sequence data
    • Bolger AM, Lohse M, Usadel B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 2014;30(15):2114-20
    • (2014) Bioinformatics , vol.30 , Issue.15 , pp. 2114-2120
    • Bolger, A.M.1    Lohse, M.2    Usadel, B.3
  • 28
    • 84968912174
    • SeqPurge: highly-sensitive adapter trimming for paired-end NGS data
    • Sturm M, Schroeder C, Bauer P. SeqPurge: highly-sensitive adapter trimming for paired-end NGS data. BMC Bioinformatics 2016;17(1):208
    • (2016) BMC Bioinformatics , vol.17 , Issue.1 , pp. 208
    • Sturm, M.1    Schroeder, C.2    Bauer, P.3
  • 29
    • 84903513963
    • Skewer: a fast and accurate adapter trimmer for next-generation sequencing paired-end reads
    • Jiang H, Lei R, Ding SW et al. Skewer: a fast and accurate adapter trimmer for next-generation sequencing paired-end reads. BMC Bioinformatics 2014;15(1):182
    • (2014) BMC Bioinformatics , vol.15 , Issue.1 , pp. 182
    • Jiang, H.1    Lei, R.2    Ding, S.W.3
  • 30
    • 85015226001
    • AfterQC: automatic filtering, trimming, error removing and quality control for fastq data
    • Chen S, Huang T, Zhou Y et al. AfterQC: automatic filtering, trimming, error removing and quality control for fastq data. BMC Bioinformatics 2017;18(S3):80
    • (2017) BMC Bioinformatics , vol.18 , pp. 80
    • Chen, S.1    Huang, T.2    Zhou, Y.3
  • 31
    • 85017527138
    • Berkeley, CA: Ernest Orlando Lawrence Berkeley National Laboratory
    • Bushnell B. BBMap: A Fast, Accurate, Splice-Aware Aligner. Berkeley, CA: Ernest Orlando Lawrence Berkeley National Laboratory; 2014
    • (2014) BBMap: A Fast, Accurate, Splice-Aware Aligner
  • 35
    • 84892405230
    • NextClip: an analysis and read preparation tool for Nextera long mate pair libraries
    • Leggett RM, Clavijo BJ, Clissold L et al. NextClip: an analysis and read preparation tool for Nextera long mate pair libraries. Bioinformatics 2014;30(4):566-8
    • (2014) Bioinformatics , vol.30 , Issue.4 , pp. 566-568
    • Leggett, R.M.1    Clavijo, B.J.2    Clissold, L.3
  • 36
    • 84890119453
    • AlienTrimmer: a tool to quickly and accurately trim off multiple short contaminant sequences from high-throughput sequencing reads
    • Criscuolo A, Brisse S. AlienTrimmer: a tool to quickly and accurately trim off multiple short contaminant sequences from high-throughput sequencing reads. Genomics 2013;102(5-6):500-6
    • (2013) Genomics , vol.102 , Issue.5-6 , pp. 500-506
    • Criscuolo, A.1    Brisse, S.2
  • 37
    • 77955801615
    • Galaxy: a comprehensive approach for supporting accessible, reproducible, and transparent computational research in the life sciences
    • Goecks J, Nekrutenko A, Taylor J et al. Galaxy: a comprehensive approach for supporting accessible, reproducible, and transparent computational research in the life sciences. Genome Biol 2010;11(8):R86
    • (2010) Genome Biol , vol.11 , Issue.8 , pp. R86
    • Goecks, J.1    Nekrutenko, A.2    Taylor, J.3
  • 39
    • 85044576037
    • Accessed 1 November 2017
    • Illumina. NextSeq 500 system overview. https://support. illumina.com/content/dam/illumina-support/courses/nextseq-system-overview/story content/external files/NextSeq500 System Overview narration.pdf Accessed 1 November 2017
    • NextSeq 500 system overview
  • 40
    • 85040314351
    • A reference human genome dataset of the BGISEQ-500 sequencer
    • Huang J, Liang X, Xuan Y et al. A reference human genome dataset of the BGISEQ-500 sequencer. Gigascience 2017;6(5):1-9
    • (2017) Gigascience , vol.6 , Issue.5 , pp. 1-9
    • Huang, J.1    Liang, X.2    Xuan, Y.3
  • 41
    • 84875013846
    • Digital gene expression tag profiling analysis of the gene expression patterns regulating the early stage of mouse spermatogenesis
    • Zhang X, Hao L, Meng L et al. Digital gene expression tag profiling analysis of the gene expression patterns regulating the early stage of mouse spermatogenesis. PLoS One 2013;8(3):e58680
    • (2013) PLoS One , vol.8 , Issue.3
    • Zhang, X.1    Hao, L.2    Meng, L.3
  • 42
    • 84938380221
    • Optimization of miRNA-seq data preprocessing
    • Tam S, Tsao MS, McPherson JD. Optimization of miRNA-seq data preprocessing. Brief Bioinformatics 2015;16(6):950-63
    • (2015) Brief Bioinformatics , vol.16 , Issue.6 , pp. 950-963
    • Tam, S.1    Tsao, M.S.2    McPherson, J.D.3
  • 43
    • 84976413217
    • Extensive sequencing of seven human genomes to characterize benchmark reference materials
    • Zook JM, Catoe D, McDaniel J et al. Extensive sequencing of seven human genomes to characterize benchmark reference materials. Sci Data 2016;3:160025
    • (2016) Sci Data , vol.3
    • Zook, J.M.1    Catoe, D.2    McDaniel, J.3
  • 44
    • 85044611582
    • Accessed 1 November 2017
    • GATK best practices. http://www.broadinstitute.org/gatk/guide/best-practices. Accessed 1 November 2017
  • 45
    • 85044575799
    • Accessed 1 November 2017
    • NISTv3.3.2, NA12878 high-confidence variant calls as a gold standard. GIAB. 2017. ftp://ftp-trace.ncbi.nlm.nih.gov/giab/ftp/release/NA12878 HG001/NISTv3.3.2/. Accessed 1 November 2017
    • (2017) GIAB
  • 46
    • 84875013846
    • Digital gene expression tag profiling analysis of the gene expression patterns regulating the early stage of mouse spermatogenesis
    • Zhang X, Hao L, Meng L et al. Digital gene expression tag profiling analysis of the gene expression patterns regulating the early stage of mouse spermatogenesis. PLoS One 2013;8(3):e58680
    • (2013) PLoS One , vol.8 , Issue.3
    • Zhang, X.1    Hao, L.2    Meng, L.3
  • 47
    • 78651253572
    • Integrated profiling of microRNAs and mRNAs: microRNAs located on Xq27.3 associate with clear cell renal cell carcinoma
    • Zhou L, Chen J, Li Z et al. Integrated profiling of microRNAs and mRNAs: microRNAs located on Xq27.3 associate with clear cell renal cell carcinoma. PLoS One 2010;5(12):e15224
    • (2010) PLoS One , vol.5 , Issue.12
    • Zhou, L.1    Chen, J.2    Li, Z.3
  • 48
    • 84892420248
    • The suppression of WRKY44 by GIGANTEA-miR172 pathway is involved in drought response of Arabidopsis thaliana
    • Han Y, Zhang X, Wang W et al. The suppression of WRKY44 by GIGANTEA-miR172 pathway is involved in drought response of Arabidopsis thaliana. PLoS One 2013;8(11):e73541
    • (2013) PLoS One , vol.8 , Issue.11
    • Han, Y.1    Zhang, X.2    Wang, W.3
  • 49
    • 84963525428
    • The cytoskeleton adaptor protein ankyrin-1 is upregulated by p53 following DNA damage and alters cell migration
    • Hall AE, Lu WT, Godfrey JD et al. The cytoskeleton adaptor protein ankyrin-1 is upregulated by p53 following DNA damage and alters cell migration. Cell Death Dis 2016;7(4):e2184
    • (2016) Cell Death Dis , vol.7 , Issue.4
    • Hall, A.E.1    Lu, W.T.2    Godfrey, J.D.3
  • 50
    • 84961370715
    • A highly specific microRNA-mediated mechanism silences LTR retrotransposons of strawberry
    • Surbanovski N, Brilli M, Moser M et al. A highly specific microRNA-mediated mechanism silences LTR retrotransposons of strawberry. Plant J 2016;85(1):70-82
    • (2016) Plant J , vol.85 , Issue.1 , pp. 70-82
    • Surbanovski, N.1    Brilli, M.2    Moser, M.3
  • 51
    • 85044610051
    • Supporting data for "SOAPnuke: a MapReduce acceleration-supported software for integrated quality control and preprocessing of high-throughput sequencing data."
    • Chen Y, Chen Y, Shi C et al. Supporting data for "SOAPnuke: a MapReduce acceleration-supported software for integrated quality control and preprocessing of high-throughput sequencing data." GigaScience Database 2017. http://dx.doi.org/10.5524/100373
    • (2017) GigaScience Database
    • Chen, Y.1    Chen, Y.2    Shi, C.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.