



Volume , Issue , 2014, Pages

HAMLeT: Hardware accelerated memory layout transform within 3D-stacked DRAM

Author keywords

[No Author keywords available]

Indexed keywords

COMPUTATION THEORY; ENERGY EFFICIENCY; HARDWARE; MATRIX ALGEBRA; MEMORY ARCHITECTURE; THREE DIMENSIONAL INTEGRATED CIRCUITS

EID: 84946692636     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/HPEC.2014.7040954     Document Type: Conference Paper
Times cited : (23)

References (27)
  • 2
    • 0030190854
    • Improving data locality with loop transformations
    • K. S. McKinley et al., "Improving data locality with loop transformations," ACM Trans. Prog. Lang. Syst., vol. 18, no. 4, pp. 424-453, 1996.
    • (1996) ACM Trans. Prog. Lang. Syst. , vol.18 , Issue.4 , pp. 424-453
    • McKinley, K.S.1
  • 4
    • 77952283542
    • Micro-pages: Increasing DRAM efficiency with locality-aware data placement
    • K. Sudan et al., "Micro-pages: Increasing DRAM efficiency with locality-aware data placement," in Proc. of Arch. Sup. for Prog. Lang. and OS, ser. ASPLOS XV, 2010, pp. 219-230.
    • (2010) Proc. of Arch. Sup. for Prog. Lang. and OS, Ser. ASPLOS XV , pp. 219-230
    • Sudan, K.1
  • 5
    • 84876588873
    • Hybrid memory cube (HMC)
    • J. T. Pawlowski, "Hybrid memory cube (HMC)," in Hotchips, 2011.
    • (2011) Hotchips
    • Pawlowski, J.T.1
  • 8
    • 84893898462
    • A 3D-stacked logic-in-memory accelerator for application-specific data intensive computing
    • Oct
    • Q. Zhu et al., "A 3D-stacked logic-in-memory accelerator for application-specific data intensive computing," in 3D Systems Integration Conference (3DIC), 2013 IEEE International, Oct 2013, pp. 1-7.
    • (2013) 3D Systems Integration Conference (3DIC), 2013 IEEE International , pp. 1-7
    • Zhu, Q.1
  • 9
    • 84862084382
    • CACTI-3DD: Architecture-level modeling for 3D die-stacked DRAM main memory
    • K. Chen et al., "CACTI-3DD: Architecture-level modeling for 3D die-stacked DRAM main memory," in Design, Automation Test in Europe (DATE), 2012, pp. 33-38.
    • (2012) Design, Automation Test in Europe (DATE) , pp. 33-38
    • Chen, K.1
  • 10
    • 84866544858
    • Hybrid memory cube new DRAM architecture increases density and performance
    • June
    • J. Jeddeloh et al., "Hybrid memory cube new DRAM architecture increases density and performance," in VLSI Technology (VLSIT), 2012 Symposium on, June 2012, pp. 87-88.
    • (2012) VLSI Technology (VLSIT), 2012 Symposium on , pp. 87-88
    • Jeddeloh, J.1
  • 11
    • 70349972511
    • Permuting streaming data using RAMs
    • M. Püschel et al., "Permuting streaming data using RAMs," Journal of the ACM, vol. 56, no. 2, pp. 10:1-10:34, 2009.
    • (2009) Journal of the ACM , vol.56 , Issue.2 , pp. 10:1-10:34
    • Püschel, M.1
  • 13
    • 84924476773
    • "Gromacs," http://www.gromacs.org, 2008.
    • (2008)
  • 14
    • 0000011164
    • A fast computer method for matrix transposing
    • July
    • J. O. Eklundh, "A fast computer method for matrix transposing," IEEE Transactions on Computers, vol. C-21, no. 7, pp. 801-803, July 1972.
    • (1972) IEEE Transactions on Computers , vol.C-21 , Issue.7 , pp. 801-803
    • Eklundh, J.O.1
  • 15
    • 0042235298
    • Tiling, block data layout, and memory hierarchy performance
    • July
    • N. Park et al., "Tiling, block data layout, and memory hierarchy performance," IEEE Transactions on Parallel and Distributed Systems, vol. 14, no. 7, pp. 640-654, July 2003.
    • (2003) IEEE Transactions on Parallel and Distributed Systems , vol.14 , Issue.7 , pp. 640-654
    • Park, N.1
  • 16
    • 84947031567
    • Parallel matrix transpose algorithms on distributed memory concurrent computers
    • Oct
    • J. Choi et al., "Parallel matrix transpose algorithms on distributed memory concurrent computers," in Proceedings of the Scalable Parallel Libraries Conference, Oct 1993, pp. 245-252.
    • (1993) Proceedings of the Scalable Parallel Libraries Conference , pp. 245-252
    • Choi, J.1
  • 17
    • 84864952164
    • Memory bandwidth efficient two-dimensional fast Fourier transform algorithm and implementation for large problem sizes
    • B. Akin et al., "Memory bandwidth efficient two-dimensional fast Fourier transform algorithm and implementation for large problem sizes," in Proc. of the IEEE Symp. on FCCM, 2012, pp. 188-191.
    • (2012) Proc. of the IEEE Symp. on FCCM , pp. 188-191
    • Akin, B.1
  • 18
    • 78650833009
    • Simple but effective heterogeneous main memory with on-chip memory controller support
    • Nov
    • X. Dong et al., "Simple but effective heterogeneous main memory with on-chip memory controller support," in Intl. Conf. for High Perf. Comp., Networking, Storage and Analysis (SC), Nov 2010, pp. 1-11.
    • (2010) Intl. Conf. for High Perf. Comp., Networking, Storage and Analysis (SC) , pp. 1-11
    • Dong, X.1
  • 20
    • 77952265152
    • Optimizing matrix transpose in CUDA
    • Jan
    • G. Ruetsch et al., "Optimizing matrix transpose in CUDA," Nvidia Tech. Report, Jan 2009.
    • (2009) Nvidia Tech. Report
    • Ruetsch, G.1
  • 21
    • 84924476772
    • CACTI 6.5, HP labs
    • "CACTI 6.5, HP labs," http://www.hpl.hp.com/research/cacti/.
  • 22
    • 3142665556
    • Dynamic data layouts for cache-conscious implementation of a class of signal transforms
    • July
    • N. Park et al., "Dynamic data layouts for cache-conscious implementation of a class of signal transforms," IEEE Transactions on Signal Processing, vol. 52, no. 7, pp. 2120-2134, July 2004.
    • (2004) IEEE Transactions on Signal Processing , vol.52 , Issue.7 , pp. 2120-2134
    • Park, N.1
  • 23
    • 33748543231
    • Hardware support for bulk data movement in server platforms
    • Oct
    • L. Zhao et al., "Hardware support for bulk data movement in server platforms," in Proc. of IEEE Intl. Conf. on Computer Design, (ICCD), Oct 2005, pp. 53-60.
    • (2005) Proc. of IEEE Intl. Conf. on Computer Design, (ICCD) , pp. 53-60
    • Zhao, L.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.