Volume 38, Issue 11, 2005, Pages 1902-1912

The method of N-grams in large-scale clustering of DNA texts

Author keywords

Clustering; Compositional spectra; Genome comparisons; N grams; Strings mismatching

Indexed keywords

CLUSTERING; COMPOSITIONAL SPECTRA; GENOME COMPARISONS; N-GRAMS; STRINGS MISMATCHING;

EID: 24044518181     PISSN: 00313203     EISSN: None     Source Type: Journal    
DOI: 10.1016/j.patcog.2005.05.002     Document Type: Article
Times cited: 27

References (32)
  • 2
    • 0242662034
    • Text mining with information-theoretical clustering
    • J. Kogan, C. Nicholas, and V. Volkovich Text mining with information-theoretical clustering Comput. Sci. Eng. 5 6 2003 52 59
    • (2003) Comput. Sci. Eng. , vol.5 , Issue.6 , pp. 52-59
    • Kogan, J.1    Nicholas, C.2    Volkovich, V.3
  • 3
    • 0000014486
    • Cluster analysis of multivariate data: Efficiency vs. interpretability of classifications
    • E. Forgy Cluster analysis of multivariate data: efficiency vs. interpretability of classifications Biometrics 21 3 1965 768 769
    • (1965) Biometrics , vol.21 , Issue.3 , pp. 768-769
    • Forgy, E.1
  • 5
    • 84856043672
    • A mathematical theory of communication
    • C.E. Shannon, A mathematical theory of communication, Bell System Tech. J. 27 (1948) 379-423, 623-656.
    • (1948) Bell System Tech. J. , vol.27 , pp. 379-423
    • Shannon, C.E.1
  • 6
    • 3843083229
    • Experiments with syntactic traces in information retrieval
    • T. De Heer Experiments with syntactic traces in information retrieval Inform. Storage Retrieval 10 1974
    • (1974) Inform. Storage Retrieval , vol.10
    • De Heer, T.1
  • 7
    • 0027334710
    • Trigrams as index elements in full text retrieval, observations and experimental results
    • E.S. Adams, A.C. Meltzer, Trigrams as index elements in full text retrieval, observations and experimental results, ACM Computer Science Conference, 1993.
    • (1993) ACM Computer Science Conference
    • Adams, E.S.1    Meltzer, A.C.2
  • 8
    • 0142093775
    • Using an N-gram-based document representation with a vector processing retrieval model
    • W.B. Cavnar, Using an N-gram-based document representation with a vector processing retrieval model, The Third Text Retrieval Conference (TREC-3), 1995.
    • (1995) The Third Text Retrieval Conference (TREC-3)
    • Cavnar, W.B.1
  • 9
    • 84982395055
    • Highlights: Language- and domain-independent automatic indexing terms for abstracting
    • J.D. Cohen Highlights: language- and domain-independent automatic indexing terms for abstracting J. Am. Soc. Inform. Sci. 46 3 1995
    • (1995) J. Am. Soc. Inform. Sci. , vol.46 , Issue.3
    • Cohen, J.D.1
  • 10
    • 0028911698
    • Gauging similarity with N-grams: Language-independent categorization of text
    • M. Damashek Gauging similarity with N-grams: language-independent categorization of text Science 267 1995
    • (1995) Science , vol.267
    • Damashek, M.1
  • 16
    • 0026961606
    • Scatter/gather: A cluster-based approach to browsing large document collections
    • D.R. Cutting, D.R. Karger, J.O. Pedersen, J.W. Tukey, Scatter/gather: a cluster-based approach to browsing large document collections, SIGIR '92, 1992, pp. 318-329.
    • (1992) SIGIR '92 , pp. 318-329
    • Cutting, D.R.1    Karger, D.R.2    Pedersen, J.O.3    Tukey, J.W.4
  • 17
    • 2642530436
    • DNA sequence analysis linguistic tools
    • A. Bolshoy DNA sequence analysis linguistic tools Rev. Appl. Bioinform. 2 2003 103 112
    • (2003) Rev. Appl. Bioinform. , vol.2 , pp. 103-112
    • Bolshoy, A.1
  • 18
    • 0025359462
    • Linguistic measure of taxonomic and functional relatedness of nucleotide sequences
    • S. Pietrokovski, J. Hirshon, and E.N. Trifonov Linguistic measure of taxonomic and functional relatedness of nucleotide sequences J. Biomol. Struct. Dynamics 7 1990 1251 1268
    • (1990) J. Biomol. Struct. Dynamics , vol.7 , pp. 1251-1268
    • Pietrokovski, S.1    Hirshon, J.2    Trifonov, E.N.3
  • 20
    • 0029060923
    • Dinucleotide relative abundance extremes: A genomic signature
    • S. Karlin, and C. Burge Dinucleotide relative abundance extremes: a genomic signature Trends Genet. 11 1995 283 290
    • (1995) Trends Genet. , vol.11 , pp. 283-290
    • Karlin, S.1    Burge, C.2
  • 21
    • 0037105996
    • Compositional spectrum - Revealing patterns for genomic sequence characterization and comparison
    • V. Kirzhner, A. Korol, A. Bolshoy, and E. Nevo Compositional spectrum - revealing patterns for genomic sequence characterization and comparison Physica A 312 2002 447 457
    • (2002) Physica A , vol.312 , pp. 447-457
    • Kirzhner, V.1    Korol, A.2    Bolshoy, A.3    Nevo, E.4
  • 22
    • 0037632127
    • One promising approach to a large-scale comparison of genomic sequences
    • V. Kirzhner, E. Nevo, A. Korol, and A. Bolshoy One promising approach to a large-scale comparison of genomic sequences Acta Biotheor. 51 2 2003 73 89
    • (2003) Acta Biotheor. , vol.51 , Issue.2 , pp. 73-89
    • Kirzhner, V.1    Nevo, E.2    Korol, A.3    Bolshoy, A.4
  • 24
    • 0032728083
    • Optimal reconstruction of a sequence from its probes
    • F. Preparata, A. Frieze, and E. Upfal Optimal reconstruction of a sequence from its probes J. Comput. Biol. 6 1999 361 368
    • (1999) J. Comput. Biol. , vol.6 , pp. 361-368
    • Preparata, F.1    Frieze, A.2    Upfal, E.3
  • 25
    • 4644360472
    • Sequencing-by-hybridization revisited: The analog-spectrum proposal
    • F.P. Preparata Sequencing-by-hybridization revisited: the analog-spectrum proposal Comput. Biol. Bioinform. 1 2004 47 52
    • (2004) Comput. Biol. Bioinform. , vol.1 , pp. 47-52
    • Preparata, F.P.1
  • 28
    • 0000825481
    • A statistical method for evaluating systematic relationships
    • R.R. Sokal, and C.D. Michener A statistical method for evaluating systematic relationships Univ. Kansas Sci. Bull. 38 1958 1409 1438
    • (1958) Univ. Kansas Sci. Bull. , vol.38 , pp. 1409-1438
    • Sokal, R.R.1    Michener, C.D.2
  • 29
    • 0347918435
    • A resampling approach to cluster validation
    • V. Roth, V. Lange, M. Braun, J. Buhmann, A resampling approach to cluster validation, COMPSTAT 2002, available at http://www.cs.unibonn.De/∼braunm.
    • COMPSTAT 2002
    • Roth, V.1    Lange, V.2    Braun, M.3    Buhmann, J.4
  • 30
    • 84950632109
    • Objective criteria for the evaluation of clustering methods
    • W.M. Rand Objective criteria for the evaluation of clustering methods J. Am. Stat. Assoc. 66 1971 846 850
    • (1971) J. Am. Stat. Assoc. , vol.66 , pp. 846-850
    • Rand, W.M.1
  • 31
    • 84945923591
    • A method for comparing two hierarchical clusterings
    • E.B. Fowlkes, and C.L. Mallows A method for comparing two hierarchical clusterings J. Am. Stat. Assoc. 78 1983 553 584
    • (1983) J. Am. Stat. Assoc. , vol.78 , pp. 553-584
    • Fowlkes, E.B.1    Mallows, C.L.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.