



Volume 2, Issue 2 SPEC. ISS., 2004, Pages 137-159

Adaptive text mining: Inferring structure from sequences

Author keywords

Acronym extraction; Adaptive techniques; Compression algorithms; Generic entity extraction; Keyphrase extraction; Phrase hierarchies; Text categorization; Text mining; Word segmentation

Indexed keywords

DATA COMPRESSION; DATA MINING; FORMAL LANGUAGES; HEURISTIC METHODS; INFORMATION THEORY; METADATA; PATTERN RECOGNITION; TEXT PROCESSING

EID: 10644223670     PISSN: 15708667     EISSN: None     Source Type: Journal    
DOI: 10.1016/S1570-8667(03)00084-4     Document Type: Article
Times cited: 19

References (38)
  • 2
    • 0032686423
    • Data compression using long common strings
    • IEEE Press, Los Alamitos, CA
    • J. Bentley, D. McIlroy, Data compression using long common strings, in: Proc. Data Compression Conference, IEEE Press, Los Alamitos, CA, 1999, pp. 287-295.
    • (1999) Proc. Data Compression Conference , pp. 287-295
    • Bentley, J.1    McIlroy, D.2
  • 3
    • 0029370635
    • Automatic condensation of electronic publications by sentence selection
    • R. Brandow, K. Mitze, L.F. Rau, Automatic condensation of electronic publications by sentence selection, Information Processing and Management 31 (5) (1995) 675-685.
    • (1995) Information Processing and Management , vol.31 , Issue.5 , pp. 675-685
    • Brandow, R.1    Mitze, K.2    Rau, L.F.3
  • 4
    • 0030211964
    • Bagging predictors
    • L. Breiman, Bagging predictors, Machine Learning 24 (2) (1996) 123-140.
    • (1996) Machine Learning , vol.24 , Issue.2 , pp. 123-140
    • Breiman, L.1
  • 6
    • 0021405335
    • Data compression using adaptive coding and partial string matching
    • J.G. Cleary, I.H. Witten, Data compression using adaptive coding and partial string matching, IEEE Trans. Comm. 32 (4) (1984) 396-402.
    • (1984) IEEE Trans. Comm. , vol.32 , Issue.4 , pp. 396-402
    • Cleary, J.G.1    Witten, I.H.2
  • 10
    • 0033894701
    • Text categorization using compression models
    • (Poster Paper), IEEE Press, Los Alamitos, CA
    • E. Frank, C. Chiu, I.H. Witten, Text categorization using compression models, in: Proc. Data Compression Conference (Poster paper), IEEE Press, Los Alamitos, CA, 2000. Full version available as Working Paper 00/2, Department of Computer Science, University of Waikato.
    • (2000) Proc. Data Compression Conference
    • Frank, E.1    Chiu, C.2    Witten, I.H.3
  • 11
    • 0033894701
    • Department of Computer Science, University of Waikato
    • E. Frank, C. Chiu, I.H. Witten, Text categorization using compression models, in: Proc. Data Compression Conference (Poster paper), IEEE Press, Los Alamitos, CA, 2000. Full version available as Working Paper 00/2, Department of Computer Science, University of Waikato.
    • Working Paper , vol.2
  • 17
    • 0032647886
    • Offline dictionary-based compression
    • IEEE Press, Los Alamitos, CA
    • N.J. Larsson, A. Moffat, Offline dictionary-based compression, in: Proc. Data Compression Conference, IEEE Press, Los Alamitos, CA, 1999, pp. 296-305.
    • (1999) Proc. Data Compression Conference , pp. 296-305
    • Larsson, N.J.1    Moffat, A.2
  • 21
    • 0032010306
    • Collaborative, programmable intelligent agents
    • B.A. Nardi, J.R. Miller, D.J. Wright, Collaborative, programmable intelligent agents, Comm. ACM 41 (3) (1998) 96-104.
    • (1998) Comm. ACM , vol.41 , Issue.3 , pp. 96-104
    • Nardi, B.A.1    Miller, J.R.2    Wright, D.J.3
  • 22
    • 0002044093
    • Identifying hierarchical structure in sequences: A linear-time algorithm
    • C.G. Nevill-Manning, I.H. Witten, Identifying hierarchical structure in sequences: a linear-time algorithm, J. Artificial Intelligence Res. 7 (1997) 67-82.
    • (1997) J. Artificial Intelligence Res. , vol.7 , pp. 67-82
    • Nevill-Manning, C.G.1    Witten, I.H.2
  • 23
    • 0031702890
    • Phrase hierarchy inference and compression in bounded space
    • J.A. Storer, M. Cohn (Eds.), IEEE Press, Los Alamitos, CA
    • C.G. Nevill-Manning, I.H. Witten, Phrase hierarchy inference and compression in bounded space, in: J.A. Storer, M. Cohn (Eds.), Proc. Data Compression Conference, IEEE Press, Los Alamitos, CA, 1998, pp. 179-188.
    • (1998) Proc. Data Compression Conference , pp. 179-188
    • Nevill-Manning, C.G.1    Witten, I.H.2
  • 24
  • 25
    • 10644257127
    • Online and offline heuristics for inferring hierarchies of repetitions in sequences
    • C.G. Nevill-Manning, I.H. Witten. Online and offline heuristics for inferring hierarchies of repetitions in sequences, Proc. IEEE 88 (11) (2000) 1745-1755.
    • (2000) Proc. IEEE , vol.88 , Issue.11 , pp. 1745-1755
    • Nevill-Manning, C.G.1    Witten, I.H.2
  • 27
  • 28
    • 0001277731
    • A compression-based algorithm for Chinese word segmentation
    • W.J. Teahan, Y. Wen, R. McNab, I.H. Witten, A compression-based algorithm for Chinese word segmentation, Comput. Linguistics 26 (3) (2000) 375-393.
    • (2000) Comput. Linguistics , vol.26 , Issue.3 , pp. 375-393
    • Teahan, W.J.1    Wen, Y.2    McNab, R.3    Witten, I.H.4
  • 29
    • 0009151655
    • Text mining technology: Turning information into knowledge
    • D. Tkach, Text mining technology: Turning information into knowledge, IBM White paper, 1997.
    • (1997) IBM White Paper
    • Tkach, D.1
  • 30
    • 21844478478
    • Learning algorithms for keyphrase extraction
    • P. Turney, Learning algorithms for keyphrase extraction, Information Retrieval 2 (4) (2000) 303-336.
    • (2000) Information Retrieval , vol.2 , Issue.4 , pp. 303-336
    • Turney, P.1
  • 31
    • 84935113569
    • Error bounds for convolutional codes and an asymptotically optimum decoding algorithm
    • A.J. Viterbi, Error bounds for convolutional codes and an asymptotically optimum decoding algorithm, IEEE Trans. Inform. Theory (1967) 260-269.
    • (1967) IEEE Trans. Inform. Theory , pp. 260-269
    • Viterbi, A.J.1
  • 32
    • 0032650194
    • Text mining: A new frontier for lossless compression
    • IEEE Press, Los Alamitos, CA
    • I.H. Witten, Z. Bray, M. Mahoui, W.J. Teahan, Text mining: a new frontier for lossless compression, in: Proc. Data Compression Conference, IEEE Press, Los Alamitos, CA, 1999, pp. 198-207.
    • (1999) Proc. Data Compression Conference , pp. 198-207
    • Witten, I.H.1    Bray, Z.2    Mahoui, M.3    Teahan, W.J.4
  • 34
    • 84982036565
    • An algorithm for the segmentation of an artificial language analogue
    • J.G. Wolff, An algorithm for the segmentation of an artificial language analogue, British J. Psychol. 66 (1975) 79-90.
    • (1975) British J. Psychol. , vol.66 , pp. 79-90
    • Wolff, J.G.1
  • 35
    • 0033891710
    • Using compression to identify acronyms in text
    • (Poster Paper), IEEE Press, Los Alamitos, CA, 2000
    • S. Yeates, D. Bainbridge, I.H. Witten, Using compression to identify acronyms in text, in: Proc. Data Compression Conference (Poster paper), IEEE Press, Los Alamitos, CA, 2000. Full version available as Working Paper 00/1, Department of Computer Science, University of Waikato.
    • Proc. Data Compression Conference
    • Yeates, S.1    Bainbridge, D.2    Witten, I.H.3
  • 36
    • 0033891710
    • Department of Computer Science, University of Waikato
    • S. Yeates, D. Bainbridge, I.H. Witten, Using compression to identify acronyms in text, in: Proc. Data Compression Conference (Poster paper), IEEE Press, Los Alamitos, CA, 2000. Full version available as Working Paper 00/1, Department of Computer Science, University of Waikato.
    • Working Paper , vol.1
  • 37
    • 0017493286
    • A universal algorithm for sequential data compression
    • J. Ziv, A. Lempel, A universal algorithm for sequential data compression, IEEE Trans. Inform. Theory IT-23 (3) (1977) 337-343.
    • (1977) IEEE Trans. Inform. Theory , vol.IT-23 , Issue.3 , pp. 337-343
    • Ziv, J.1    Lempel, A.2
  • 38
    • 0018019231
    • Compression of individual sequences via variable-rate coding
    • J. Ziv, A. Lempel, Compression of individual sequences via variable-rate coding, IEEE Trans. Inform. Theory IT-24 (5) (1978) 530-536.
    • (1978) IEEE Trans. Inform. Theory , vol.IT-24 , Issue.5 , pp. 530-536
    • Ziv, J.1    Lempel, A.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.