



Volume , Issue , 2014, Pages 283-292

Principled dictionary pruning for low-memory corpus compression

Author keywords

Corpus compression; Optimization; Retrieval efficiency; String algorithms

Indexed keywords

COMPRESSION RATIO (MACHINERY); EFFICIENCY; INFORMATION RETRIEVAL; OPTIMIZATION;

EID: 84904582651     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1145/2600428.2609576     Document Type: Conference Paper
Times cited: 11

References (32)
  • 3
    • 0035282105
    • General-purpose compression for efficient retrieval
    • A. Cannane and H. E. Williams. General-purpose compression for efficient retrieval. JASIST, 52(5):430-437, 2001.
    • (2001) JASIST , vol.52 , Issue.5 , pp. 430-437
    • Cannane, A.1    Williams, H.E.2
  • 4
    • 33645851469
    • A general-purpose compression scheme for large collections
    • A. Cannane and H. E. Williams. A general-purpose compression scheme for large collections. ACM Transactions on Information Systems, 20(3):329-355, 2002.
    • (2002) ACM Transactions on Information Systems , vol.20 , Issue.3 , pp. 329-355
    • Cannane, A.1    Williams, H.E.2
  • 6
    • 70549099442
    • Overview of the TREC 2004 Terabyte Track
    • C. Clarke, N. Craswell, and I. Soboroff. Overview of the TREC 2004 Terabyte Track. In TREC, 2004.
    • (2004) TREC
    • Clarke, C.1    Craswell, N.2    Soboroff, I.3
  • 8
    • 77950946921
    • On compressing the textual web
    • P. Ferragina and G. Manzini. On compressing the textual web. In WSDM, pages 391-400, 2010.
    • (2010) WSDM , pp. 391-400
    • Ferragina, P.1    Manzini, G.2
  • 9
    • 84863731168
    • Relative Lempel-Ziv factorization for efficient storage and retrieval of web collections
    • C. Hoobin, S. J. Puglisi, and J. Zobel. Relative Lempel-Ziv factorization for efficient storage and retrieval of web collections. PVLDB, 5(3):265-273, 2011.
    • (2011) PVLDB , vol.5 , Issue.3 , pp. 265-273
    • Hoobin, C.1    Puglisi, S.J.2    Zobel, J.3
  • 10
    • 80052129821
    • Sample selection for dictionary-based corpus compression
    • C. Hoobin, S. J. Puglisi, and J. Zobel. Sample selection for dictionary-based corpus compression. In SIGIR, pages 1137-1138, 2011.
    • (2011) SIGIR , pp. 1137-1138
    • Hoobin, C.1    Puglisi, S.J.2    Zobel, J.3
  • 15
    • 78449295543
    • Relative Lempel-Ziv compression of genomes for large-scale storage and retrieval
    • S. Kuruppu, S. J. Puglisi, and J. Zobel. Relative Lempel-Ziv compression of genomes for large-scale storage and retrieval. In SPIRE, pages 201-206, 2010.
    • (2010) SPIRE , pp. 201-206
    • Kuruppu, S.1    Puglisi, S.J.2    Zobel, J.3
  • 16
    • 84857880849
    • Optimized relative Lempel-Ziv compression of genomes
    • S. Kuruppu, S. J. Puglisi, and J. Zobel. Optimized relative Lempel-Ziv compression of genomes. In ACSC, pages 91-98, 2011.
    • (2011) ACSC , pp. 91-98
    • Kuruppu, S.1    Puglisi, S.J.2    Zobel, J.3
  • 17
    • 80054000499
    • Reference sequence construction for relative compression of genomes
    • S. Kuruppu, S. J. Puglisi, and J. Zobel. Reference sequence construction for relative compression of genomes. In SPIRE, pages 420-425, 2011.
    • (2011) SPIRE , pp. 420-425
    • Kuruppu, S.1    Puglisi, S.J.2    Zobel, J.3
  • 18
    • 0032647886
    • Offline dictionary-based compression
    • N. J. Larsson and A. Moffat. Offline dictionary-based compression. In DCC, pages 296-305, 1999.
    • (1999) DCC , pp. 296-305
    • Larsson, N.J.1    Moffat, A.2
  • 20
  • 21
    • 84961214036
    • Cluster-based delta compression of a collection of files
    • Z. Ouyang, N. D. Memon, T. Suel, and D. Trendafilov. Cluster-based delta compression of a collection of files. In WISE, pages 257-268, 2002.
    • (2002) WISE , pp. 257-268
    • Ouyang, Z.1    Memon, N.D.2    Suel, T.3    Trendafilov, D.4
  • 22
    • 83055197113
    • Collection-based compression using discovered long matching strings
    • A. Peel, A. Wirth, and J. Zobel. Collection-based compression using discovered long matching strings. In CIKM, pages 2361-2364, 2011.
    • (2011) CIKM , pp. 2361-2364
    • Peel, A.1    Wirth, A.2    Zobel, J.3
  • 23
    • 84871214454
    • WAN-optimized replication of backup datasets using stream-informed delta compression
    • P. Shilane, M. Huang, G. Wallace, and W. Hsu. WAN-optimized replication of backup datasets using stream-informed delta compression. ACM Transactions on Storage, 8(4):1-26, 2012.
    • (2012) ACM Transactions on Storage , vol.8 , Issue.4 , pp. 1-26
    • Shilane, P.1    Huang, M.2    Wallace, G.3    Hsu, W.4
  • 25
    • 0020190931
    • Data compression via textual substitution
    • J. A. Storer and T. G. Szymanski. Data compression via textual substitution. Journal of the ACM, 29(4):928-951, 1982.
    • (1982) Journal of the ACM , vol.29 , Issue.4 , pp. 928-951
    • Storer, J.A.1    Szymanski, T.G.2
  • 26
    • 2442563450
    • Improved file synchronization techniques for maintaining large replicated collections over slow networks
    • T. Suel, P. Noel, and D. Trendafilov. Improved file synchronization techniques for maintaining large replicated collections over slow networks. In ICDE, pages 153-164, 2004.
    • (2004) ICDE , pp. 153-164
    • Suel, T.1    Noel, P.2    Trendafilov, D.3
  • 27
    • 0032654288
    • Compressing integers for fast file access
    • H. E. Williams and J. Zobel. Compressing integers for fast file access. The Computer Journal, 42(3):193-201, 1999.
    • (1999) The Computer Journal , vol.42 , Issue.3 , pp. 193-201
    • Williams, H.E.1    Zobel, J.2
  • 29
    • 0017493286
    • A universal algorithm for sequential data compression
    • J. Ziv and A. Lempel. A universal algorithm for sequential data compression. IEEE Transactions on Information Theory, 23(3):337-343, 1977.
    • (1977) IEEE Transactions on Information Theory , vol.23 , Issue.3 , pp. 337-343
    • Ziv, J.1    Lempel, A.2
  • 30
    • 0027632406
    • A measure of relative entropy between individual sequences with application to universal classification
    • J. Ziv and N. Merhav. A measure of relative entropy between individual sequences with application to universal classification. IEEE Transactions on Information Theory, 39(4):1270-1279, 1993.
    • (1993) IEEE Transactions on Information Theory , vol.39 , Issue.4 , pp. 1270-1279
    • Ziv, J.1    Merhav, N.2
  • 31
    • 0342521304
    • Compression: A key for next-generation text retrieval systems
    • N. Ziviani, E. Silva de Moura, G. Navarro, and R. A. Baeza-Yates. Compression: A key for next-generation text retrieval systems. IEEE Computer, 33(11):37-44, 2000.
    • (2000) IEEE Computer , vol.33 , Issue.11 , pp. 37-44
    • Ziviani, N.1    Silva de Moura, E.2    Navarro, G.3    Baeza-Yates, R.A.4
  • 32
    • 33747729581
    • Inverted files for text search engines
    • J. Zobel and A. Moffat. Inverted files for text search engines. ACM Computing Surveys, 38(2), 2006.
    • (2006) ACM Computing Surveys , vol.38 , Issue.2
    • Zobel, J.1    Moffat, A.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.