Volume 17, Issue 4, 1999, Pages 288-336

Quantifying Loop Nest Locality Using SPEC'95 and the Perfect Benchmarks

Author keywords

C.4 Computer Systems Organization: Performance of Systems: Performance attributes; Measurement techniques; Measurement; Performance


EID: 0003665539     PISSN: 07342071     EISSN: None     Source Type: Journal    
DOI: 10.1145/329466.329484     Document Type: Article
Times cited: 37

References (45)
  • 1
    • 0028055525
    • Predictability of load/store instruction latencies
    • (MICRO 26, Austin, TX, Dec. 1-3), A. Wolfe and W. Mangione-Smith, Eds. IEEE Computer Society Press, Los Alamitos, CA
    • ABRAHAM, S. G., SUGUMAR, R. A., WINDHEISER, D., RAU, B. R., AND GUPTA, R. 1993. Predictability of load/store instruction latencies. In Proceedings of the 26th Annual International Symposium on Microarchitecture (MICRO 26, Austin, TX, Dec. 1-3), A. Wolfe and W. Mangione-Smith, Eds. IEEE Computer Society Press, Los Alamitos, CA, 139-152.
    • (1993) Proceedings of the 26th Annual International Symposium on Microarchitecture , pp. 139-152
    • Abraham, S.G.1    Sugumar, R.A.2    Windheiser, D.3    Rau, B.R.4    Gupta, R.5
  • 2
    • 0027192667
    • Column-associative caches: A technique for reducing the miss rate of direct-mapped caches
    • AGARWAL, A. AND PUDAR, S. D. 1993. Column-associative caches: A technique for reducing the miss rate of direct-mapped caches. SIGARCH Comput. Arch. News 21, 2 (May), 179-190.
    • (1993) SIGARCH Comput. Arch. News , vol.21 , Issue.2 MAY , pp. 179-190
    • Agarwal, A.1    Pudar, S.D.2
  • 3
    • 0026267802
    • An effective on-chip preloading scheme to reduce data access penalty
    • (Albuquerque, NM, Nov. 18-22), J. L. Martin, Ed. ACM Press, New York, NY
    • BAER, J.-L. AND CHEN, T.-F. 1991. An effective on-chip preloading scheme to reduce data access penalty. In Proceedings of the 1991 Conference on Supercomputing (Albuquerque, NM, Nov. 18-22), J. L. Martin, Ed. ACM Press, New York, NY, 176-186.
    • (1991) Proceedings of the 1991 Conference on Supercomputing , pp. 176-186
    • Baer, J.-L.1    Chen, T.-F.2
  • 4
    • 0003003638
    • A study of replacement algorithms for a virtual-storage computer
    • BELADY, L. A. 1966. A study of replacement algorithms for a virtual-storage computer. IBM Syst. J. 5, 2, 79-101.
    • (1966) IBM Syst. J. , vol.5 , Issue.2 , pp. 79-101
    • Belady, L.A.1
  • 6
    • 0029666646
    • Memory bandwidth limitations of future microprocessors
    • BURGER, D., GOODMAN, J. R., AND KAGI, A. 1996. Memory bandwidth limitations of future microprocessors. SIGARCH Comput. Arch. News 24, 2, 78-89.
    • (1996) SIGARCH Comput. Arch. News , vol.24 , Issue.2 , pp. 78-89
    • Burger, D.1    Goodman, J.R.2    Kagi, A.3
  • 7
    • 0025447908
    • Improving register allocation for subscripted variables
    • CALLAHAN, D., CARR, S., AND KENNEDY, K. 1990. Improving register allocation for subscripted variables. SIGPLAN Not. 25, 6 (June), 53-65.
    • (1990) SIGPLAN Not. , vol.25 , Issue.6 JUNE , pp. 53-65
    • Callahan, D.1    Carr, S.2    Kennedy, K.3
  • 9
    • 0028549474
    • Improving the ratio of memory operations to floating-point operations in loops
    • CARR, S. AND KENNEDY, K. 1994. Improving the ratio of memory operations to floating-point operations in loops. ACM Trans. Program. Lang. Syst. 16, 6 (Nov.), 1768-1810.
    • (1994) ACM Trans. Program. Lang. Syst. , vol.16 , Issue.6 NOV , pp. 1768-1810
    • Carr, S.1    Kennedy, K.2
  • 10
    • 0029308368
    • Effective hardware-based data prefetching for high-performance processors
    • CHEN, T. F. AND BAER, J. L. 1995. Effective hardware-based data prefetching for high-performance processors. IEEE Trans. Comput. 44, 5 (May), 609-623.
    • (1995) IEEE Trans. Comput. , vol.44 , Issue.5 MAY , pp. 609-623
    • Chen, T.F.1    Baer, J.L.2
  • 11
    • 84976745804
    • Tile size selection using cache organization and data layout
    • COLEMAN, S. AND MCKINLEY, K. S. 1995. Tile size selection using cache organization and data layout. SIGPLAN Not. 30, 6 (June 1995), 279-290.
    • (1995) SIGPLAN Not. , vol.30 , Issue.6 JUNE 1995 , pp. 279-290
    • Coleman, S.1    McKinley, K.S.2
  • 12
    • 0442285964
    • An empirical study of cross-loop reuse in the NAS benchmarks
    • Center for Research on Parallel Computation, Rice University, Houston, TX
    • COOPER, K., KENNEDY, K., AND MCINTOSH, N. 1995. An empirical study of cross-loop reuse in the NAS benchmarks. Tech. Rep. CRPC-TR95519-S. Center for Research on Parallel Computation, Rice University, Houston, TX.
    • (1995) Tech. Rep. CRPC-TR95519-S
    • Cooper, K.1    Kennedy, K.2    McIntosh, N.3
  • 14
    • 0025022825
    • Supercomputer performance evaluation and the Perfect Benchmarks
    • CYBENKO, G., KIPP, L., POINTER, L., AND KUCK, D. 1990. Supercomputer performance evaluation and the Perfect Benchmarks. SIGARCH Comput. Arch. News 18, 3, 254-266.
    • (1990) SIGARCH Comput. Arch. News , vol.18 , Issue.3 , pp. 254-266
    • Cybenko, G.1    Kipp, L.2    Pointer, L.3    Kuck, D.4
  • 16
    • 0029194450
    • Hardware implementation issues of data prefetching
    • (ICS '95, Barcelona, Spain, July 3-7, 1995), M. Valero, Ed. ACM Press, New York, NY
    • DRACH, N. 1995. Hardware implementation issues of data prefetching. In Proceedings of the 9th ACM International Conference on Supercomputing (ICS '95, Barcelona, Spain, July 3-7, 1995), M. Valero, Ed. ACM Press, New York, NY, 245-254.
    • (1995) Proceedings of the 9th ACM International Conference on Supercomputing , pp. 245-254
    • Drach, N.1
  • 17
    • 0001366267
    • Strategies for cache and local memory management by global program transformation
    • GANNON, D., JALBY, W., AND GALLIVAN, K. 1988. Strategies for cache and local memory management by global program transformation. J. Parallel Distrib. Comput. 5, 5 (Oct. 1988), 587-616.
    • (1988) J. Parallel Distrib. Comput. , vol.5 , Issue.5 OCT. 1988 , pp. 587-616
    • Gannon, D.1    Jalby, W.2    Gallivan, K.3
  • 18
    • 0027640963
    • Cache performance of the SPEC92 benchmark suite
    • GEE, J. D., HILL, M. D., AND PNEVMATIKATOS, D. N. 1993. Cache performance of the SPEC92 benchmark suite. IEEE Micro 13, 4 (Aug.), 17-27.
    • (1993) IEEE Micro , vol.13 , Issue.4 AUG , pp. 17-27
    • Gee, J.D.1    Hill, M.D.2    Pnevmatikatos, D.N.3
  • 19
    • 0347468637
    • Precise miss analysis for program transformations with caches of arbitrary associativity
    • GHOSH, S., MARTONOSI, M., AND MALIK, S. 1998. Precise miss analysis for program transformations with caches of arbitrary associativity. SIGPLAN Not. 33, 11, 228-239.
    • (1998) SIGPLAN Not. , vol.33 , Issue.11 , pp. 228-239
    • Ghosh, S.1    Martonosi, M.2    Malik, S.3
  • 21
    • 0003789873
    • Ph.D. Dissertation. Computer Science Department, University of California at Berkeley, Berkeley, CA
    • HILL, M. D. 1987. Aspects of cache memory and instruction buffer performance. Ph.D. Dissertation. Computer Science Department, University of California at Berkeley, Berkeley, CA.
    • (1987) Aspects of Cache Memory and Instruction Buffer Performance
    • Hill, M.D.1
  • 22
    • 0024173488
    • A case for direct-mapped caches
    • HILL, M. D. 1988. A case for direct-mapped caches. IEEE Computer 21, 12 (Dec. 1988), 25-40.
    • (1988) IEEE Computer , vol.21 , Issue.12 DEC. 1988 , pp. 25-40
    • Hill, M.D.1
  • 23
    • 0024903997
    • Evaluating associativity in CPU caches
    • HILL, M. D. AND SMITH, A. J. 1989. Evaluating associativity in CPU caches. IEEE Trans. Comput. 38, 12 (Dec. 1989), 1612-1631.
    • (1989) IEEE Trans. Comput. , vol.38 , Issue.12 DEC. 1989 , pp. 1612-1631
    • Hill, M.D.1    Smith, A.J.2
  • 24
    • 85133561916
    • Improving direct-mapped cache performance by the addition of a small fully-associative cache and prefetch buffers
    • G. S. Sohi, Ed. ACM Press, New York, NY
    • JOUPPI, N. P. 1998. Improving direct-mapped cache performance by the addition of a small fully-associative cache and prefetch buffers. In Computer Architecture (ISCA '98), G. S. Sohi, Ed. ACM Press, New York, NY, 388-397.
    • (1998) Computer Architecture (ISCA '98) , pp. 388-397
    • Jouppi, N.P.1
  • 25
    • 0442270363
    • Cache based computer systems
    • KAPLAN, K. R. AND WINDER, R. O. 1973. Cache based computer systems. IEEE Computer 6, 3, 30-36.
    • (1973) IEEE Computer , vol.6 , Issue.3 , pp. 30-36
    • Kaplan, K.R.1    Winder, R.O.2
  • 26
    • 0026153646
    • An architecture for software-controlled data prefetching
    • KLAIBER, A. C. AND LEVY, H. M. 1991. An architecture for software-controlled data prefetching. SIGARCH Comput. Arch. News 19, 3 (May 1991), 43-53.
    • (1991) SIGARCH Comput. Arch. News , vol.19 , Issue.3 MAY 1991 , pp. 43-53
    • Klaiber, A.C.1    Levy, H.M.2
  • 28
    • 0026918388
    • Access normalization: Loop restructuring for NUMA compilers
    • LI, W. AND PINGALI, K. 1992. Access normalization: Loop restructuring for NUMA compilers. SIGPLAN Not. 27, 9 (Sept. 1992), 285-295.
    • (1992) SIGPLAN Not. , vol.27 , Issue.9 SEPT. 1992 , pp. 285-295
    • Li, W.1    Pingali, K.2
  • 30
    • 0442285975
    • A quantitative analysis of loop nest locality
    • MCKINLEY, K. S. AND TEMAM, O. 1996. A quantitative analysis of loop nest locality. ACM SIGOPS Oper. Syst. Rev. 30, 5, 94-104.
    • (1996) ACM SIGOPS Oper. Syst. Rev. , vol.30 , Issue.5 , pp. 94-104
    • McKinley, K.S.1    Temam, O.2
  • 31
    • 0030190854
    • Improving data locality with loop transformations
    • MCKINLEY, K. S., CARR, S., AND TSENG, C.-W. 1996. Improving data locality with loop transformations. ACM Trans. Program. Lang. Syst. 18, 4 (July), 424-453.
    • (1996) ACM Trans. Program. Lang. Syst. , vol.18 , Issue.4 JULY , pp. 424-453
    • McKinley, K.S.1    Carr, S.2    Tseng, C.-W.3
  • 33
    • 0023708930
    • Performance tradeoffs in cache design
    • (ISCA '88, Honolulu, HI, May 30-June 2), H. J. Siegel, Ed. IEEE Computer Society Press, Los Alamitos, CA
    • PRZYBYLSKI, S., HOROWITZ, M., AND HENNESSY, J. 1988. Performance tradeoffs in cache design. In The 15th Annual International Symposium on Computer Architecture (ISCA '88, Honolulu, HI, May 30-June 2), H. J. Siegel, Ed. IEEE Computer Society Press, Los Alamitos, CA, 290-298.
    • (1988) The 15th Annual International Symposium on Computer Architecture , pp. 290-298
    • Przybylski, S.1    Horowitz, M.2    Hennessy, J.3
  • 34
    • 0008815559
    • SPEC describes SPEC'95 product and benchmarks
    • Sept.
    • REILLY, J. 1995. SPEC describes SPEC'95 product and benchmarks. SPEC Newslett. (Sept.). Available via http://www.spec.org/osg/news/articles/news9509/cpu95descr.html.
    • (1995) SPEC Newslett.
    • Reilly, J.1
  • 35
    • 0020177251
    • Cache memories
    • SMITH, A. J. 1982. Cache memories. ACM Comput. Surv. 14, 3 (Sept.), 473-530.
    • (1982) ACM Comput. Surv. , vol.14 , Issue.3 SEPT , pp. 473-530
    • Smith, A.J.1
  • 36
    • 0004802504
    • Bibliography and readings on CPU cache memories and related topics
    • SMITH, A. J. 1986. Bibliography and readings on CPU cache memories and related topics. SIGARCH Comput. Arch. News 14, 1 (Jan. 1986), 22-42.
    • (1986) SIGARCH Comput. Arch. News , vol.14 , Issue.1 JAN. 1986 , pp. 22-42
    • Smith, A.J.1
  • 37
    • 84939323181
    • Line (block) size choice for CPU cache memories
    • SMITH, A. J. 1987. Line (block) size choice for CPU cache memories. IEEE Trans. Comput. C-36, 9 (Sept. 1987), 1063-1076.
    • (1987) IEEE Trans. Comput. , vol.C-36 , Issue.9 SEPT. 1987 , pp. 1063-1076
    • Smith, A.J.1
  • 38
    • 0042028057
    • Second bibliography on cache memories
    • SMITH, A. J. 1991. Second bibliography on cache memories. SIGARCH Comput. Arch. News 19, 4 (June 1991), 154-182.
    • (1991) SIGARCH Comput. Arch. News , vol.19 , Issue.4 JUNE 1991 , pp. 154-182
    • Smith, A.J.1
  • 39
    • 0028132513
    • ATOM: A system for building customized program analysis tools
    • (PLDI '94, Orlando, FL, June 20-24, 1994), V. Sarkar, B. Ryder, and M. L. Soffa, Eds. ACM Press, New York, NY
    • SRIVASTAVA, A. AND EUSTACE, A. 1994. ATOM: a system for building customized program analysis tools. In Proceedings of the ACM SIGPLAN '94 Conference on Programming Language, Design and Implementation (PLDI '94, Orlando, FL, June 20-24, 1994), V. Sarkar, B. Ryder, and M. L. Soffa, Eds. ACM Press, New York, NY, 196-205.
    • (1994) Proceedings of the ACM SIGPLAN '94 Conference on Programming Language, Design and Implementation , pp. 196-205
    • Srivastava, A.1    Eustace, A.2
  • 40
    • 85008189411
    • Efficient simulation of caches under optimal replacement with applications to miss characterization
    • SUGUMAR, R. A. AND ABRAHAM, S. G. 1993. Efficient simulation of caches under optimal replacement with applications to miss characterization. SIGMETRICS Perform. Eval. Rev. 21, 1 (June 1993), 24-35.
    • (1993) SIGMETRICS Perform. Eval. Rev. , vol.21 , Issue.1 JUNE 1993 , pp. 24-35
    • Sugumar, R.A.1    Abraham, S.G.2
  • 41
    • 0027764718
    • To copy or not to copy: A compile-time technique for assessing when data copying should be used to eliminate cache conflicts
    • (Supercomputing '93, Portland, OR, Nov. 15-19), B. Borchers and D. Crawford, Eds. IEEE Computer Society Press, Los Alamitos, CA
    • TEMAM, O., GRANSTON, E. D., AND JALBY, W. 1993. To copy or not to copy: a compile-time technique for assessing when data copying should be used to eliminate cache conflicts. In Proceedings of the Conference on Supercomputing (Supercomputing '93, Portland, OR, Nov. 15-19), B. Borchers and D. Crawford, Eds. IEEE Computer Society Press, Los Alamitos, CA, 410-419.
    • (1993) Proceedings of the Conference on Supercomputing , pp. 410-419
    • Temam, O.1    Granston, E.D.2    Jalby, W.3
  • 42
    • 0029508817
    • A modified approach to data cache management
    • (Ann Arbor, MI, Nov. 29 - Dec. 1, 1995), T. Mudge and K. Ebcioǧlu, Eds. IEEE Computer Society Press, Los Alamitos, CA
    • TYSON, G., FARRENS, M., MATTHEWS, J., AND PLESZKUN, A. R. 1995. A modified approach to data cache management. In Proceedings of the 28th annual international symposium on Microarchitecture (Ann Arbor, MI, Nov. 29 - Dec. 1, 1995), T. Mudge and K. Ebcioǧlu, Eds. IEEE Computer Society Press, Los Alamitos, CA, 93-103.
    • (1995) Proceedings of the 28th Annual International Symposium on Microarchitecture , pp. 93-103
    • Tyson, G.1    Farrens, M.2    Matthews, J.3    Pleszkun, A.R.4
  • 43
    • 84976827033
    • A data locality optimization algorithm
    • (SIGPLAN '91, Toronto, Ontario, Canada, June 26-28), D. S. Wise, Ed. ACM Press, New York, NY
    • WOLF, M. E. AND LAM, M. S. 1991. A data locality optimization algorithm. In Proceedings of the ACM Conference on Programming Language Design and Implementation (SIGPLAN '91, Toronto, Ontario, Canada, June 26-28), D. S. Wise, Ed. ACM Press, New York, NY, 30-44.
    • (1991) Proceedings of the ACM Conference on Programming Language Design and Implementation , pp. 30-44
    • Wolf, M.E.1    Lam, M.S.2
  • 45
    • 84910652234
    • A model for estimating trace-sample miss ratios
    • WOOD, D. A., HILL, M. D., AND KESSLER, R. E. 1991. A model for estimating trace-sample miss ratios. SIGMETRICS Perform. Eval. Rev. 19, 1 (May 1991), 79-89.
    • (1991) SIGMETRICS Perform. Eval. Rev. , vol.19 , Issue.1 MAY 1991 , pp. 79-89
    • Wood, D.A.1    Hill, M.D.2    Kessler, R.E.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.