



Volume 20, Issue 3, 2002, Pages 329-355

A general-purpose compression scheme for large collections

Author keywords

Phrase based compression; Random access; Sampling

Indexed keywords

COMPRESS SCHEMES; PHRASE BASED COMPRESSIONS; QUERY EVALUATION; SINGLE PASS COMPRESSION

EID: 33645851469     PISSN: 10468188     EISSN: 10468188     Source Type: Journal    
DOI: 10.1145/568727.568730     Document Type: Article
Times cited : (13)

References (50)
  • 1
    • 0034708480
    • The genome sequence of Drosophila melanogaster
    • ADAMS, M., CELNIKER, S., ET AL. 2000. The genome sequence of Drosophila melanogaster. Science 287, 5461 (Mar.), 2185-2195. (See http://www.ncbi.nlm.nih.gov:80/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=10731132&dopt=Abstract for complete list of authors.)
    • (2000) Science , vol.287 , Issue.5461 MAR , pp. 2185-2195
    • Adams, M.1    Celniker, S.2
  • 2
    • 85009167014
    • Efficient two-dimensional compressed matching
    • (Snowbird, Utah), J. Storer and M. Cohn, Eds., IEEE Computer Society Press, Los Alamitos, Calif.
    • AMIR, A. AND BENSON, G. 1992. Efficient two-dimensional compressed matching. In Proceedings of the IEEE Data Compression Conference (Snowbird, Utah), J. Storer and M. Cohn, Eds., IEEE Computer Society Press, Los Alamitos, Calif., 279-288.
    • (1992) Proceedings of the IEEE Data Compression Conference , pp. 279-288
    • Amir, A.1    Benson, G.2
  • 3
    • 0031675019
    • Some theory and practice of greedy off-line textual substitution
    • (Snowbird, Utah), J. Storer and M. Cohn, Eds., IEEE Computer Society Press, Los Alamitos, Calif.
    • APOSTOLICO, A. AND LONARDI, S. 1998. Some theory and practice of greedy off-line textual substitution. In Proceedings of the IEEE Data Compression Conference (Snowbird, Utah), J. Storer and M. Cohn, Eds., IEEE Computer Society Press, Los Alamitos, Calif., 119-128.
    • (1998) Proceedings of the IEEE Data Compression Conference , pp. 119-128
    • Apostolico, A.1    Lonardi, S.2
  • 4
    • 2642584062
    • Off-line compression by greedy textual substitution
    • APOSTOLICO, A. AND LONARDI, S. 2000. Off-line compression by greedy textual substitution. Proc. IEEE 88, 11, 1733-1744.
    • (2000) Proc. IEEE , vol.88 , Issue.11 , pp. 1733-1744
    • Apostolico, A.1    Lonardi, S.2
  • 7
    • 0032686423
    • Data compression using long common strings
    • (Snowbird, Utah), J. Storer and M. Cohn, Eds., IEEE Computer Society Press, Los Alamitos, Calif.
    • BENTLEY, J. AND MCILROY, D. 1999. Data compression using long common strings. In Proceedings of the IEEE Data Compression Conference (Snowbird, Utah), J. Storer and M. Cohn, Eds., IEEE Computer Society Press, Los Alamitos, Calif., 287-295.
    • (1999) Proceedings of the IEEE Data Compression Conference , pp. 287-295
    • Bentley, J.1    McIlroy, D.2
  • 8
    • 0035367637
    • Data compression with long repeated strings
    • BENTLEY, J. AND MCILROY, D. 2001. Data compression with long repeated strings. Inf. Sci. 135, 1-2 (June), 1-11.
    • (2001) Inf. Sci. , vol.135 , Issue.1-2 JUNE , pp. 1-11
    • Bentley, J.1    McIlroy, D.2
  • 10
    • 0005665720
    • A compression scheme for large databases
    • (Canberra), M. Orlowska, Ed., IEEE Computer Society Press, Los Alamitos, Calif.
    • CANNANE, A. AND WILLIAMS, H. 2000. A compression scheme for large databases. In Proceedings of the Australasian Database Conference (Canberra), M. Orlowska, Ed., Vol. 22, IEEE Computer Society Press, Los Alamitos, Calif., 6-11.
    • (2000) Proceedings of the Australasian Database Conference , vol.22 , pp. 6-11
    • Cannane, A.1    Williams, H.2
  • 11
    • 0035282105
    • General-purpose compression for efficient retrieval
    • CANNANE, A. AND WILLIAMS, H. 2001. General-purpose compression for efficient retrieval. J. Am. Soc. Inf. Sci. Tech. 52, 5 (Apr.), 430-437.
    • (2001) J. Am. Soc. Inf. Sci. Tech. , vol.52 , Issue.5 APR , pp. 430-437
    • Cannane, A.1    Williams, H.2
  • 12
    • 0032663939
    • A general-purpose compression scheme for databases
    • (Snowbird, Utah), J. Storer and M. Cohn, Eds., IEEE Computer Society Press, Los Alamitos, Calif., (refereed poster).
    • CANNANE, A., WILLIAMS, H., AND ZOBEL, J. 1999. A general-purpose compression scheme for databases. In Proceedings of the IEEE Data Compression Conference (Snowbird, Utah), J. Storer and M. Cohn, Eds., IEEE Computer Society Press, Los Alamitos, Calif., 519 (refereed poster).
    • (1999) Proceedings of the IEEE Data Compression Conference , pp. 519
    • Cannane, A.1    Williams, H.2    Zobel, J.3
  • 14
    • 33645718097
    • Available by anonymous ftp from prep.ai.mit.edu:/pub/gnu/gzip-*.tar
    • GAILLY, J. 1993. Gzip program and documentation. Available by anonymous ftp from prep.ai.mit.edu:/pub/gnu/gzip-*.tar.
    • (1993) Gzip Program and Documentation
    • Gailly, J.1
  • 16
    • 0000526256
    • Overview of the second text retrieval conference (TREC-2)
    • HARMAN, D. 1995. Overview of the second text retrieval conference (TREC-2). Inf. Process. Manage. 31, 3, 271-289.
    • (1995) Inf. Process. Manage. , vol.31 , Issue.3 , pp. 271-289
    • Harman, D.1
  • 17
    • 0003338029
    • Overview of TREC-7 very large collection track
    • E. Voorhees and D. Harman, Eds., National Institute of Standards and Technology Special Publication 500-242, Washington, D.C.
    • HAWKING, D., CRESWELL, N., AND THISTLEWAITE, P. 1999. Overview of TREC-7 very large collection track. In Proceedings of the Text Retrieval Conference (TREC), E. Voorhees and D. Harman, Eds., National Institute of Standards and Technology Special Publication 500-242, Washington, D.C., 91-104.
    • (1999) Proceedings of the Text Retrieval Conference (TREC) , pp. 91-104
    • Hawking, D.1    Creswell, N.2    Thistlewaite, P.3
  • 18
    • 33747868548
    • Improving LZW
    • (Snowbird, Utah), J. Storer and J. Reif, Eds., IEEE Computer Society Press, Los Alamitos, Calif.
    • HORSPOOL, R. 1991. Improving LZW. In Proceedings of the IEEE Data Compression Conference (Snowbird, Utah), J. Storer and J. Reif, Eds., IEEE Computer Society Press, Los Alamitos, Calif., 332-341.
    • (1991) Proceedings of the IEEE Data Compression Conference , pp. 332-341
    • Horspool, R.1
  • 19
    • 19944392360
    • Offline dictionary-based compression
    • LARSSON, N. J. AND MOFFAT, A. 2000. Offline dictionary-based compression. Proc. IEEE 88, 11 (Nov.), 1722-1732.
    • (2000) Proc. IEEE , vol.88 , Issue.11 NOV , pp. 1722-1732
    • Larsson, N.J.1    Moffat, A.2
  • 20
    • 84976741299
    • Data compression
    • LELEWER, D. A. AND HIRSCHBERG, D. S. 1987. Data compression. Comput. Surv. 19, 3 (Sept.), 261-296.
    • (1987) Comput. Surv. , vol.19 , Issue.3 SEPT , pp. 261-296
    • Lelewer, D.A.1    Hirschberg, D.S.2
  • 21
    • 0015617524
    • Compression of bibliographic files using an adaption of run-length coding
    • LYNCH, M. 1973. Compression of bibliographic files using an adaption of run-length coding. Inf. Storage Retrieval 9, 207-214.
    • (1973) Inf. Storage Retrieval , vol.9 , pp. 207-214
    • Lynch, M.1
  • 22
    • 0016508073
    • Information compression by factoring common strings
    • MAYNE, A. AND JAMES, E. B. 1975. Information compression by factoring common strings. Comput. J. 18, 157-160.
    • (1975) Comput. J. , vol.18 , pp. 157-160
    • Mayne, A.1    James, E.B.2
  • 23
    • 0024606846
    • Word based text compression
    • MOFFAT, A. 1989. Word based text compression. Softw. Pract. Exper. 19, 2 (Feb.), 185-198.
    • (1989) Softw. Pract. Exper. , vol.19 , Issue.2 FEB , pp. 185-198
    • Moffat, A.1
  • 24
    • 0029228574
    • Arithmetic coding revisited
    • (Snowbird, Utah), J. Storer and M. Cohn, Eds., IEEE Computer Society Press, Los Alamitos, Calif.
    • MOFFAT, A., NEAL, R., AND WITTEN, I. 1995. Arithmetic coding revisited. In Proceedings of the IEEE Data Compression Conference (Snowbird, Utah), J. Storer and M. Cohn, Eds., IEEE Computer Society Press, Los Alamitos, Calif., 202-211.
    • (1995) Proceedings of the IEEE Data Compression Conference , pp. 202-211
    • Moffat, A.1    Neal, R.2    Witten, I.3
  • 26
    • 0031102365
    • Text compression for dynamic document databases
    • MOFFAT, A., ZOBEL, J., AND SHARMAN, N. 1997. Text compression for dynamic document databases. IEEE Trans. Knowl. Data Eng. 9, 2, 302-313.
    • (1997) IEEE Trans. Knowl. Data Eng. , vol.9 , Issue.2 , pp. 302-313
    • Moffat, A.1    Zobel, J.2    Sharman, N.3
  • 28
    • 0000523223
    • Compression and explanation using hierarchical grammars
    • NEVILL-MANNING, C. AND WITTEN, I. 1997. Compression and explanation using hierarchical grammars. Comput. J. 40, 2/3, 103-116.
    • (1997) Comput. J. , vol.40 , Issue.2-3 , pp. 103-116
    • Nevill-Manning, C.1    Witten, I.2
  • 29
    • 10644257127
    • Online and offline heuristics for inferring hierarchies of repetitions in sequences
    • NEVILL-MANNING, C. AND WITTEN, I. 2000. Online and offline heuristics for inferring hierarchies of repetitions in sequences. Proc. IEEE 88, 11, 1745-1755.
    • (2000) Proc. IEEE , vol.88 , Issue.11 , pp. 1745-1755
    • Nevill-Manning, C.1    Witten, I.2
  • 30
    • 0017017225
    • Experiments in text file compression
    • RUBIN, F. 1976. Experiments in text file compression. Commun. ACM 19, 11 (Nov.), 617-623.
    • (1976) Commun. ACM , vol.19 , Issue.11 NOV , pp. 617-623
    • Rubin, F.1
  • 31
    • 0004743466
    • A comparison of algorithms for data base compression by use of fragments as language elements
    • SCHUEGRAF, E. AND HEAPS, H. 1974. A comparison of algorithms for data base compression by use of fragments as language elements. Inf. Storage Retrieval 10, 309-319.
    • (1974) Inf. Storage Retrieval , vol.10 , pp. 309-319
    • Schuegraf, E.1    Heaps, H.2
  • 32
    • 33750382299
    • The bzip2 and libbzip2 home page
    • SEWARD, J. 2000. The bzip2 and libbzip2 home page. Available by anonymous ftp from sourceware.cygnus.com:/pub/bzip2/v100/bzip2-*.tar.gz.
  • 35
    • 0020190931
    • Data compression via textual substitution
    • STORER, J. AND SZYMANSKI, T. 1982. Data compression via textual substitution. J. ACM 29, 928-951.
    • (1982) J. ACM , vol.29 , pp. 928-951
    • Storer, J.1    Szymanski, T.2
  • 36
    • 0029716119
    • The entropy of English using PPM-based models
    • (Snowbird, Utah), J. Storer and M. Cohn, Eds., IEEE Computer Society Press, Los Alamitos, Calif.
    • TEAHAN, W. AND CLEARY, J. 1996. The entropy of English using PPM-based models. In Proceedings of the IEEE Data Compression Conference (Snowbird, Utah), J. Storer and M. Cohn, Eds., IEEE Computer Society Press, Los Alamitos, Calif., 53-62.
    • (1996) Proceedings of the IEEE Data Compression Conference , pp. 53-62
    • Teahan, W.1    Cleary, J.2
  • 38
    • 0005726584
    • PhD thesis, The University of Melbourne
    • TURPIN, A. 1999. Efficient prefix coding. PhD thesis, The University of Melbourne.
    • (1999) Efficient Prefix Coding
    • Turpin, A.1
  • 39
    • 0039002149
    • Fast file search using text compression
    • M. Patel, Ed., Australian Computer Science Communications, Sydney
    • TURPIN, A. AND MOFFAT, A. 1997. Fast file search using text compression. In Proceedings of the Australasian Computer Science Conference, M. Patel, Ed., Vol. 19, Australian Computer Science Communications, Sydney, 1-8.
    • (1997) Proceedings of the Australasian Computer Science Conference , vol.19 , pp. 1-8
    • Turpin, A.1    Moffat, A.2
  • 40
    • 0015600497
    • Common phrases and minimum-space text storage
    • WAGNER, R. A. 1973. Common phrases and minimum-space text storage. Commun. ACM 16, 3 (Mar.), 148-152.
    • (1973) Commun. ACM , vol.16 , Issue.3 MAR , pp. 148-152
    • Wagner, R.A.1
  • 41
    • 0021439618
    • A technique for high performance data compression
    • WELCH, T. 1984. A technique for high performance data compression. IEEE Comput. 17, 8-20.
    • (1984) IEEE Comput. , vol.17 , pp. 8-20
    • Welch, T.1
  • 42
    • 0032654288
    • Compressing integers for fast file access
    • WILLIAMS, H. AND ZOBEL, J. 1999. Compressing integers for fast file access. Comput. J. 42, 3, 193-201.
    • (1999) Comput. J. , vol.42 , Issue.3 , pp. 193-201
    • Williams, H.1    Zobel, J.2
  • 45
    • 0017931873
    • Recoding of natural language for economy of transmission or storage
    • WOLFF, J. 1978. Recoding of natural language for economy of transmission or storage. Comput. J. 21, 1, 42-44.
    • (1978) Comput. J. , vol.21 , Issue.1 , pp. 42-44
    • Wolff, J.1
  • 46
    • 0017493286
    • A universal algorithm for sequential data compression
    • ZIV, J. AND LEMPEL, A. 1977. A universal algorithm for sequential data compression. IEEE Trans. Inf. Theor. IT-23, 3, 337-343.
    • (1977) IEEE Trans. Inf. Theor. , vol.IT-23 , Issue.3 , pp. 337-343
    • Ziv, J.1    Lempel, A.2
  • 47
    • 0018019231
    • Compression of individual sequences via variable rate coding
    • ZIV, J. AND LEMPEL, A. 1978. Compression of individual sequences via variable rate coding. IEEE Trans. Inf. Theor. IT-24, 5, 530-536.
    • (1978) IEEE Trans. Inf. Theor. , vol.IT-24 , Issue.5 , pp. 530-536
    • Ziv, J.1    Lempel, A.2
  • 48
    • 0029359786
    • Adding compression to a full-text retrieval system
    • ZOBEL, J. AND MOFFAT, A. 1995. Adding compression to a full-text retrieval system. Softw. Pract. Exper. 25, 8 (Aug.), 891-903.
    • (1995) Softw. Pract. Exper. , vol.25 , Issue.8 AUG , pp. 891-903
    • Zobel, J.1    Moffat, A.2
  • 49
    • 84963895259
    • Combined models for high-performance compression of large text collections
    • (Cancun), R. Baeza-Yates, E. Chávez, and J. Favela, Eds., IEEE Computer Society Press, Los Alamitos, Calif.
    • ZOBEL, J. AND WILLIAMS, H. 1999. Combined models for high-performance compression of large text collections. In Proceedings of the Sixth String Processing and Information Retrieval Conference (SPIRE'99) (Cancun), R. Baeza-Yates, E. Chávez, and J. Favela, Eds., IEEE Computer Society Press, Los Alamitos, Calif., 224-231.
    • (1999) Proceedings of the Sixth String Processing and Information Retrieval Conference (SPIRE'99) , pp. 224-231
    • Zobel, J.1    Williams, H.2
  • 50
    • 0035980876
    • In-memory hash tables for accumulating text vocabularies
    • ZOBEL, J., HEINZ, S., AND WILLIAMS, H. 2001. In-memory hash tables for accumulating text vocabularies. Inf. Proc. Lett. 80, 271-277.
    • (2001) Inf. Proc. Lett. , vol.80 , pp. 271-277
    • Zobel, J.1    Heinz, S.2    Williams, H.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.