Volume 16, Issue 6, 2009, Pages 759-767

Large Datasets in Biomedicine: A Discussion of Salient Analytic Issues

Author keywords

[No Author keywords available]

Indexed keywords

ARTICLE; BIOINFORMATICS; BIOMEDICINE; DATA ANALYSIS; DATA BASE; DECISION SUPPORT SYSTEM; INFORMATION PROCESSING; STATISTICAL SIGNIFICANCE;

EID: 70350465146     PISSN: 10675027     EISSN: None     Source Type: Journal    
DOI: 10.1197/jamia.M2780     Document Type: Article
Times cited : (38)

References (72)
  • 1
    • 68949091228
    • A Perspective on Cluster Analysis
    • Kettenring J.R. A Perspective on Cluster Analysis. Stat Anal Data Min 1 (2008) 52-53
    • (2008) Stat Anal Data Min , vol.1 , pp. 52-53
    • Kettenring, J.R.1
  • 2
    • 0022891847
    • A rapid two-stage modeling technique for exploring large datasets
    • Gilks W.R. A rapid two-stage modeling technique for exploring large datasets. Appl Stat 35 2 (1986) 183-194
    • (1986) Appl Stat , vol.35 , Issue.2 , pp. 183-194
    • Gilks, W.R.1
  • 3
    • 0043114430
    • A high dimensional two sample significance test
    • Dempster A.P. A high dimensional two sample significance test. Ann Math Stat 29 4 (1958) 995-1010
    • (1958) Ann Math Stat , vol.29 , Issue.4 , pp. 995-1010
    • Dempster, A.P.1
  • 6
    • 70350519927
    • Huge datasets
    • Dutter, and Grossmann (Eds), Physica-Verlag, Heidelberg
    • Huber P.J. Huge datasets. In: Dutter, and Grossmann (Eds). Compstat Proceedings (1994), Physica-Verlag, Heidelberg 7
    • (1994) Compstat , pp. 7
    • Huber, P.J.1
  • 7
    • 0033245954
    • Massive datasets workshop: Four years after
    • Huber P.J. Massive datasets workshop: Four years after. J Comput Graph Stat 8 3 (1999) 635-652
    • (1999) J Comput Graph Stat , vol.8 , Issue.3 , pp. 635-652
    • Huber, P.J.1
  • 8
    • 0033220832
    • Meta analysis of classification algorithms for pattern recognition
    • Sohn S.Y. Meta analysis of classification algorithms for pattern recognition. IEEE T Patterns Anal 21 11 (1999) 1137-1144
    • (1999) IEEE T Patterns Anal , vol.21 , Issue.11 , pp. 1137-1144
    • Sohn, S.Y.1
  • 9
    • 33846088142
    • The molecular biology database collection: 2007 Update
    • Galperin M.Y. The molecular biology database collection: 2007 Update. Nucleic Acids Res 35 (2007) D3-D4
    • (2007) Nucleic Acids Res , vol.35
    • Galperin, M.Y.1
  • 10
    • 0345720362
    • High-dimensional data analysis: The curses and blessings of dimensionality
    • 21st Century (2000)
    • (2000) 21st Century
    • Donoho, D.L.1
  • 11
    • 0003447510
    • Shortliffe E.H., and Cimino J.J. (Eds), Springer-Verlag, New York
    • 3rd edn (2006), Springer-Verlag, New York
    • (2006) 3rd edn
  • 14
    • 0033750892
    • Overcoming the curse of dimensionality in clustering by means of the wavelet transform
    • Murtagh F., Starck J.L., and Berry M.W. Overcoming the curse of dimensionality in clustering by means of the wavelet transform. Comput J 43 (2000) 107-120
    • (2000) Comput J , vol.43 , pp. 107-120
    • Murtagh, F.1    Starck, J.L.2    Berry, M.W.3
  • 15
    • 33750688534
    • Inter-patient distance metrics using SNOMED CT defining relationships
    • Melton G.B., Parsons S., Morrison F.P., et al. Inter-patient distance metrics using SNOMED CT defining relationships. J Biomed Inform 39 6 (2006) 697-705
    • (2006) J Biomed Inform , vol.39 , Issue.6 , pp. 697-705
    • Melton, G.B.1    Parsons, S.2    Morrison, F.P.3
  • 19
    • 0002774069
    • Feature set search algorithms
    • Pattern Recognition and Signal Processing, Sijthoff and Noordhoff, Alphen aan den Rijn, Netherlands
    • Kittler J. Feature set search algorithms. In: Pattern Recognition and Signal Processing (1978), Sijthoff and Noordhoff, Alphen aan den Rijn, Netherlands 41-60
    • (1978) Pattern Recognition and Signal Processing , pp. 41-60
    • Kittler, J.1
  • 23
    • 35748932917
    • A review of feature selection techniques in bioinformatics
    • Saeys Y., Inza I., and Larranaga P. A review of feature selection techniques in bioinformatics. Bioinformatics 23 19 (2007) 2507-2517
    • (2007) Bioinformatics , vol.23 , Issue.19 , pp. 2507-2517
    • Saeys, Y.1    Inza, I.2    Larranaga, P.3
  • 24
    • 0344609892
    • An introduction to variable and feature selection
    • Guyon I., and Elisseeff A. An introduction to variable and feature selection. J Mach Learn Res 3 (2003) 1157-1182
    • (2003) J Mach Learn Res , vol.3 , pp. 1157-1182
    • Guyon, I.1    Elisseeff, A.2
  • 25
    • 84950941772
    • Projection pursuit regression
    • Friedman J.H., and Stuetzle W. Projection pursuit regression. J Am Stat Assoc 76 (1981) 817-823
    • (1981) J Am Stat Assoc , vol.76 , pp. 817-823
    • Friedman, J.H.1    Stuetzle, W.2
  • 26
    • 0033909182
    • On the geometry of similarity search: Dimensionality curse and concentration of measure
    • Pestov V. On the geometry of similarity search: Dimensionality curse and concentration of measure. Info Process Lett 73 (2000) 41-51
    • (2000) Info Process Lett , vol.73 , pp. 41-51
    • Pestov, V.1
  • 27
    • 0001677717
    • Controlling the false discovery rate: A practical and powerful approach to multiple testing
    • Benjamini Y., and Hochberg Y. Controlling the false discovery rate: A practical and powerful approach to multiple testing. J R Stat Soc B 57 (1995) 289-300
    • (1995) J R Stat Soc B , vol.57 , pp. 289-300
    • Benjamini, Y.1    Hochberg, Y.2
  • 28
    • 2142732441
    • Large-Scale simultaneous hypothesis testing: The choice of a null hypothesis
    • Efron B. Large-Scale simultaneous hypothesis testing: The choice of a null hypothesis. J Am Stat Assoc 99 465 (2004) 96-104
    • (2004) J Am Stat Assoc , vol.99 , Issue.465 , pp. 96-104
    • Efron, B.1
  • 29
  • 30
    • 33947216221
    • Correlation and Large-Scale simultaneous significance testing
    • Efron B. Correlation and Large-Scale simultaneous significance testing. J Am Stat Assoc 102 (2007) 93-103
    • (2007) J Am Stat Assoc , vol.102 , pp. 93-103
    • Efron, B.1
  • 31
    • 0036020892
    • A direct approach to false discovery rates
    • Storey J.D. A direct approach to false discovery rates. J R Stat Soc B 64 (2002) 479-498
    • (2002) J R Stat Soc B , vol.64 , pp. 479-498
    • Storey, J.D.1
  • 32
    • 0036020944
    • Operating characteristics and extensions of the false discovery rate procedure
    • Genovese C., and Wasserman L. Operating characteristics and extensions of the false discovery rate procedure. J R Stat Soc B 64 (2002) 499-517
    • (2002) J R Stat Soc B , vol.64 , pp. 499-517
    • Genovese, C.1    Wasserman, L.2
  • 33
    • 34248361892
    • A statistical methodology for analyzing co-occurrence data from a large sample
    • Cao H., Hripcsak G., and Markatou M. A statistical methodology for analyzing co-occurrence data from a large sample. J Biomed Inform 40 3 (2007) 343-352
    • (2007) J Biomed Inform , vol.40 , Issue.3 , pp. 343-352
    • Cao, H.1    Hripcsak, G.2    Markatou, M.3
  • 34
    • 0034567392
    • Considering clustering: A methodological review of clinical decision support system studies
    • Chuang J.H., Hripcsak G., and Jenders R.A. Considering clustering: A methodological review of clinical decision support system studies. Proc AMIA Symp (2000) 146-150
    • (2000) Proc AMIA Symp , pp. 146-150
    • Chuang, J.H.1    Hripcsak, G.2    Jenders, R.A.3
  • 36
    • 37549029793
    • The properties of high-dimensional data spaces: Implications for exploring gene and protein expression data
    • Clarke R., Ressom H.W., Wang A., et al. The properties of high-dimensional data spaces: Implications for exploring gene and protein expression data. Nat Rev Cancer 8 1 (2008) 37-49
    • (2008) Nat Rev Cancer , vol.8 , Issue.1 , pp. 37-49
    • Clarke, R.1    Ressom, H.W.2    Wang, A.3
  • 37
    • 44049091467
    • Diverse correlation structures in gene expression data and their utility in improving statistical inference
    • Klebanov L., and Yakovlev A. Diverse correlation structures in gene expression data and their utility in improving statistical inference. Ann Appl Stat 1 2 (2007) 538-559
    • (2007) Ann Appl Stat , vol.1 , Issue.2 , pp. 538-559
    • Klebanov, L.1    Yakovlev, A.2
  • 38
  • 39
    • 2942677611
    • Multivariate exploratory tools for microarray data analysis
    • Szabo A., Boucher K., Jones D., et al. Multivariate exploratory tools for microarray data analysis. Biostatistics 4 (2003) 555-567
    • (2003) Biostatistics , vol.4 , pp. 555-567
    • Szabo, A.1    Boucher, K.2    Jones, D.3
  • 40
    • 0348143180
    • A random variance model for detection of differential gene expression in small microarray experiments
    • Wright G.W., and Simon R.M. A random variance model for detection of differential gene expression in small microarray experiments. Bioinform 19 (2003) 2448-2455
    • (2003) Bioinform , vol.19 , pp. 2448-2455
    • Wright, G.W.1    Simon, R.M.2
  • 41
    • 0035733108
    • The control of the false discovery rate in multiple testing under dependency
    • Benjamini Y., and Yekutieli D. The control of the false discovery rate in multiple testing under dependency. Ann Stat 29 (2001) 1165-1188
    • (2001) Ann Stat , vol.29 , pp. 1165-1188
    • Benjamini, Y.1    Yekutieli, D.2
  • 42
    • 39749098114
    • Control of generalized error rates in multiple testing
    • Romano J.P., and Wolf M. Control of generalized error rates in multiple testing. Ann Stat 35 (2007) 1378-1408
    • (2007) Ann Stat , vol.35 , pp. 1378-1408
    • Romano, J.P.1    Wolf, M.2
  • 43
    • 49749111978
    • On false discovery control under dependence
    • Wu W.B. On false discovery control under dependence. Ann Stat 36 1 (2008) 364-380
    • (2008) Ann Stat , vol.36 , Issue.1 , pp. 364-380
    • Wu, W.B.1
  • 44
    • 0033472709
    • Resampling-based false discovery rate controlling multiple test procedures for correlated test statistics
    • Yekutieli D., and Benjamini Y. Resampling-based false discovery rate controlling multiple test procedures for correlated test statistics. J Stat Plan Infer 82 (1999) 171-196
    • (1999) J Stat Plan Infer , vol.82 , pp. 171-196
    • Yekutieli, D.1    Benjamini, Y.2
  • 45
    • 0345822598
    • The positive false discovery rate: A Bayesian interpretation and the Q-value
    • Storey J.D. The positive false discovery rate: A Bayesian interpretation and the Q-value. Ann Stat 31 6 (2003) 2013-2035
    • (2003) Ann Stat , vol.31 , Issue.6 , pp. 2013-2035
    • Storey, J.D.1
  • 46
    • 34547326633
    • Some results on the control of the false discovery rate under dependence
    • Farcomeni A. Some results on the control of the false discovery rate under dependence. Scand J Stat 34 (2007) 275-297
    • (2007) Scand J Stat , vol.34 , pp. 275-297
    • Farcomeni, A.1
  • 47
    • 33746257145
    • The practice of cluster analysis
    • Kettenring J.R. The practice of cluster analysis. J Classif 23 (2006) 3-30
    • (2006) J Classif , vol.23 , pp. 3-30
    • Kettenring, J.R.1
  • 48
    • 25444485444
    • The effects of normalization on the correlation structure of microarray data
    • Qiu X., Brooks A.I., Klebanov L., and Yakovlev N. The effects of normalization on the correlation structure of microarray data. BMC Bioinform 6 (2005) 120
    • (2005) BMC Bioinform , vol.6 , pp. 120
    • Qiu, X.1    Brooks, A.I.2    Klebanov, L.3    Yakovlev, N.4
  • 49
    • 38549134485
    • Revisiting adverse effects of cross-hybridization in Affymetrix gene expression data: Do they matter for correlation analysis?
    • Klebanov L., Chen L., and Yakovlev A. Revisiting adverse effects of cross-hybridization in Affymetrix gene expression data: Do they matter for correlation analysis?. Biol Direct 2 (2007) 28
    • (2007) Biol Direct , vol.2 , pp. 28
    • Klebanov, L.1    Chen, L.2    Yakovlev, A.3
  • 50
    • 0001735517
    • On the mathematical foundations of theoretical statistics
    • Fisher R.A. On the mathematical foundations of theoretical statistics. Philos Trans Roy Soc London Ser A 222 (1922) 309-368
    • (1922) Philos Trans Roy Soc London Ser A , vol.222 , pp. 309-368
    • Fisher, R.A.1
  • 51
    • 84939431353
    • Theory of statistical estimation
    • Fisher R.A. Theory of statistical estimation. Proc Cambr Philos Soc 22 (1925) 700-725
    • (1925) Proc Cambr Philos Soc , vol.22 , pp. 700-725
    • Fisher, R.A.1
  • 52
    • 84856043672
    • A mathematical theory of communication
    • and 623-56
    • Shannon C.E. A mathematical theory of communication. Bell Syst Tech J 27 (1948) 379-423 and 623-56
    • (1948) Bell Syst Tech J , vol.27 , pp. 379-423
    • Shannon, C.E.1
  • 53
    • 0001927585
    • On information and sufficiency
    • Kullback S., and Leibler R.A. On information and sufficiency. Math Stat A 22 (1951) 79-86
    • (1951) Math Stat A , vol.22 , pp. 79-86
    • Kullback, S.1    Leibler, R.A.2
  • 55
    • 70350508261
    • A review of data complexity measures and their applicability to pattern classification problems. Actas del III Taller Nacional de Mineria de Datos y Aprendizaje
    • Sotoca J.M., Sánchez J.S., and Mollineda R.A. A review of data complexity measures and their applicability to pattern classification problems. Actas del III Taller Nacional de Mineria de Datos y Aprendizaje. TAMIDA (2005) 77-83
    • (2005) TAMIDA , pp. 77-83
    • Sotoca, J.M.1    Sánchez, J.S.2    Mollineda, R.A.3
  • 58
    • 70350501227
    • Hunting of the snark; finding data glitches using data mining methods
    • Dasu T., and Johnson T. Hunting of the snark; finding data glitches using data mining methods. MIT Workshop of Information Quality (1999) 89-98
    • (1999) MIT Workshop of Information Quality , pp. 89-98
    • Dasu, T.1    Johnson, T.2
  • 61
    • 34248190149
    • Anatomy of data integration
    • Brazhnik O., and Jones J.F. Anatomy of data integration. J Biomed Inform 40 3 (2007) 252-269
    • (2007) J Biomed Inform , vol.40 , Issue.3 , pp. 252-269
    • Brazhnik, O.1    Jones, J.F.2
  • 62
    • 33748491517
    • The MicroArray Quality Control (MAQC) project shows inter- and intraplatform reproducibility of gene expression measurements
    • MAQC Consortium
    • MAQC Consortium. The MicroArray Quality Control (MAQC) project shows inter- and intraplatform reproducibility of gene expression measurements. Nat Biotechnol 24 9 (2006) 1151-1161
    • (2006) Nat Biotechnol , vol.24 , Issue.9 , pp. 1151-1161
  • 63
    • 33751359281
    • A scalable method for integration and functional analysis of multiple microarray datasets
    • Huttenhower C., Hibbs M., Myers C., and Troyanskaya O.G. A scalable method for integration and functional analysis of multiple microarray datasets. Bioinform 22 (2006) 2890-2897
    • (2006) Bioinform , vol.22 , pp. 2890-2897
    • Huttenhower, C.1    Hibbs, M.2    Myers, C.3    Troyanskaya, O.G.4
  • 64
    • 4944232237
    • Combining multiple microarray studies and modeling interstudy variation
    • i84-90
    • Choi J.K., Yu U., Kim S., and Yoo O.J. Combining multiple microarray studies and modeling interstudy variation. Bioinform 19 Suppl 1 (2003) i84-90
    • (2003) Bioinform , vol.19 , Issue.Suppl 1 , pp. i84-90
    • Choi, J.K.1    Yu, U.2    Kim, S.3    Yoo, O.J.4
  • 65
    • 3042694117
    • Large-scale meta-analysis of cancer microarray data identifies common transcriptional profiles of neoplastic transformation and progression
    • Rhodes D.R., Yu J., Shanker K., et al. Large-scale meta-analysis of cancer microarray data identifies common transcriptional profiles of neoplastic transformation and progression. Proc Natl Acad Sci USA 101 (2004) 9309-9314
    • (2004) Proc Natl Acad Sci USA , vol.101 , pp. 9309-9314
    • Rhodes, D.R.1    Yu, J.2    Shanker, K.3
  • 66
    • 25444521278
    • Integrative analysis of multiple gene expression profiles with quality-adjusted effect size models
    • Hu P., Greenwood C.M.T., and Beyene J. Integrative analysis of multiple gene expression profiles with quality-adjusted effect size models. BMC Bioinform 6 (2005) 128
    • (2005) BMC Bioinform , vol.6 , pp. 128
    • Hu, P.1    Greenwood, C.M.T.2    Beyene, J.3
  • 67
    • 33646574675
    • Toward understanding the genetics of alcohol drinking through transcriptome meta-analysis
    • Mulligan M.K., Ponomarev I., Hitzmann, et al. Toward understanding the genetics of alcohol drinking through transcriptome meta-analysis. Proc Natl Acad Sci USA 103 16 (2006) 6368-6373
    • (2006) Proc Natl Acad Sci USA , vol.103 , Issue.16 , pp. 6368-6373
    • Mulligan, M.K.1    Ponomarev, I.2    Hitzmann3
  • 68
    • 0038492417
    • A Bayesian framework for combining heterogeneous data sources for gene function prediction in Saccharomyces cerevisiae
    • Epub 2003 Jun 25
    • Troyanskaya O.G., Dolinski K., Owen A.B., Altman R.B., and Botstein D. A Bayesian framework for combining heterogeneous data sources for gene function prediction in Saccharomyces cerevisiae. Proc Natl Acad Sci U S A 100 14 (2003 Jul 8) 8348-8353 Epub 2003 Jun 25
    • (2003) Proc Natl Acad Sci U S A , vol.100 , Issue.14 , pp. 8348-8353
    • Troyanskaya, O.G.1    Dolinski, K.2    Owen, A.B.3    Altman, R.B.4    Botstein, D.5
  • 69
    • 9444239213
    • A probabilistic functional network of yeast genes
    • Lee I., Date S.V., Adai A.T., and Marcotte E.M. A probabilistic functional network of yeast genes. Science 1 (2004) 1555-1558
    • (2004) Science , vol.1 , pp. 1555-1558
    • Lee, I.1    Date, S.V.2    Adai, A.T.3    Marcotte, E.M.4
  • 70
    • 0036100116
    • Learning gene functional classifications from multiple data types
    • Pavlidis P., Weston J., Cai J., and Noble W.S. Learning gene functional classifications from multiple data types. J Comput Biol 9 2 (2002) 401-411
    • (2002) J Comput Biol , vol.9 , Issue.2 , pp. 401-411
    • Pavlidis, P.1    Weston, J.2    Cai, J.3    Noble, W.S.4
  • 71
    • 6944251719
    • Predicting gene function in Saccharomyces cerevisiae
    • ii42-9
    • Clare A., and King R.D. Predicting gene function in Saccharomyces cerevisiae. Bioinform 19 Suppl 2 (2003) ii42-9
    • (2003) Bioinform , vol.19 , Issue.Suppl 2 , pp. ii42-9
    • Clare, A.1    King, R.D.2
  • 72
    • 0003529280
    • American College of Physicians, Philadelphia, PA
    • 2nd edn (2007), American College of Physicians, Philadelphia, PA
    • (2007) 2nd edn
    • Lang, T.A.1    Secic, M.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.