



Volume 47, Issue 2, 2009, Pages 171-197

Matrix-based streamization approach for improving locality and parallelism on FT64 stream processor

Author keywords

D&C Matrix; FT64; Program transformation; Stream organization; Streamization

Indexed keywords

APPLICATIONS; FOURIER TRANSFORMS; NATURAL SCIENCES COMPUTING; SYSTEMS ANALYSIS

EID: 59549093333     PISSN: 09208542     EISSN: 15730484     Source Type: Journal    
DOI: 10.1007/s11227-008-0186-0     Document Type: Article
Times cited: 5

References (39)
  • 3
    • 0036505033
    • The RAW microprocessor: A computational fabric for software circuits and general purpose programs
    • 2
    • M Taylor J Kim J Miller 2002 The RAW microprocessor: a computational fabric for software circuits and general purpose programs IEEE Micro 22 2 25 35
    • (2002) IEEE Micro , vol.22 , pp. 25-35
    • Taylor, M.1    Kim, J.2    Miller, J.3
  • 4
    • 3242815471
    • Scaling to the end of silicon with EDGE architectures
    • 7
    • D Burger SW Keckler KS McKinley 2004 Scaling to the end of silicon with EDGE architectures Computer 37 7 44 55
    • (2004) Computer , vol.37 , pp. 44-55
    • Burger, D.1    Keckler, S.W.2    McKinley, K.S.3
  • 5
    • 34547423880
    • Exploiting coarse-grained task, data, and pipeline parallelism in stream programs
    • California, USA
    • Gordon MI, Thies W, Amarasinghe S (2006) Exploiting coarse-grained task, data, and pipeline parallelism in stream programs. In: Proceedings of ASPLOS'06, California, USA
    • (2006) Proceedings of ASPLOS'06
    • Gordon, M.I.1    Thies, W.2    Amarasinghe, S.3
  • 16
    • 0026232450
    • A loop transformation theory and an algorithm to maximize parallelism
    • 4
    • ME Wolf M Lam 1991 A loop transformation theory and an algorithm to maximize parallelism IEEE Trans Parallel Distrib Syst 2 4 452 471
    • (1991) IEEE Trans Parallel Distrib Syst , vol.2 , pp. 452-471
    • Wolf, M.E.1    Lam, M.2
  • 22
    • 77953998137
    • Sparse matrix solvers on the GPU: Conjugate gradients and multigrid
    • 3
    • J Bolz I Farmer E Grinspun P Schröder 2003 Sparse matrix solvers on the GPU: conjugate gradients and multigrid ACM Trans Graph 22 3 917 924
    • (2003) ACM Trans Graph , vol.22 , pp. 917-924
    • Bolz, J.1    Farmer, I.2    Grinspun, E.3    Schröder, P.4
  • 24
    • 84934343786
    • Analysis and performance results of a molecular modeling application on Merrimac
    • Erez M, Ahn J et al (2004) Analysis and performance results of a molecular modeling application on Merrimac. In: Proceedings of supercomputing conference 2004
    • (2004) Proceedings of Supercomputing Conference 2004
    • Erez, M.1    Ahn, J.2
  • 29
    • 31844444712
    • Cache aware optimization of stream programs
    • Chicago, Illinois, USA
    • Sermulins J, Thies W et al (2005) Cache aware optimization of stream programs. In: Proceedings of LCTES'05, Chicago, Illinois, USA
    • (2005) Proceedings of LCTES'05
    • Sermulins, J.1    Thies, W.2
  • 33
    • 0033077834
    • A linear algebra framework for automatic determination of optimal data layouts
    • 2
    • M Kandemir A Choudhary 1999 A linear algebra framework for automatic determination of optimal data layouts IEEE Trans Parallel Distrib Syst 10 2 115 135
    • (1999) IEEE Trans Parallel Distrib Syst , vol.10 , pp. 115-135
    • Kandemir, M.1    Choudhary, A.2
  • 34
    • 84976859799
    • Unifying Data and control transformations for distributed shared memory machines
    • Cierniak M, Li W (1995) Unifying Data and control transformations for distributed shared memory machines. In: ACM SIGPLAN IPDPS, pp 205-217
    • (1995) ACM SIGPLAN IPDPS , pp. 205-217
    • Cierniak, M.1    Li, W.2
  • 36
    • 0035439109
    • Static and dynamic locality optimizations using integer linear programming
    • 9
    • M Kandemir P Banerjee 2001 Static and dynamic locality optimizations using integer linear programming IEEE Trans Parallel Distrib Syst 12 9 922 940
    • (2001) IEEE Trans Parallel Distrib Syst , vol.12 , pp. 922-940
    • Kandemir, M.1    Banerjee, P.2
  • 37
    • 0032656320
    • A graph based framework to detect optimal memory layouts for improving data locality
    • San Juan, Puerto Rico
    • Kandemir M et al (1999) A graph based framework to detect optimal memory layouts for improving data locality. In: Proceedings of the 13th international parallel processing symposium, San Juan, Puerto Rico, pp 738-743
    • (1999) Proceedings of the 13th International Parallel Processing Symposium , pp. 738-743
    • Kandemir, M.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.