SCOPUS Information Retrieval Platform

Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT

Volume , Issue , 2011, Pages 243-252

Enhancing data locality for dynamic simulations through asynchronous data transformations and adaptive control

(3) Wu, Bo (a); Zhang, Eddy Z. (a); Shen, Xipeng (a)

(a) The College of William and Mary (United States)

Author keywords

[No Author keywords available]

Indexed keywords

ADAPTIVE CONTROL; ASYNCHRONOUS DATA; CRITICAL PATHS; DATA LOCALITY; DATA REORDERING; DATA TRANSFORMATION; DYNAMIC ADAPTATIONS; HETEROGENEOUS CHIP MULTIPROCESSOR; MEMORY REFERENCES; PERFORMANCE IMPROVEMENTS; PROGRAM STATE; RUNTIME OPTIMIZATION; TRADITIONAL TECHNIQUES;

BENCHMARKING; COMPUTER SIMULATION; OPTIMIZATION; PARALLEL ARCHITECTURES;

METADATA;

EID: 84856544146; PISSN: 1089795X; EISSN: None; Source Type: Conference Proceeding
DOI: 10.1109/PACT.2011.56; Document Type: Conference Paper

Times cited : (16)

References (30)

1
- 84856515925
- libpfm4
- libpfm4. http://perfmon2.sourceforge.net/docs.html.

2
- 84856519788
- NVIDIA CUDA. http://www.nvidia.com/cuda.

3
- 57349180412
- A compiler framework for optimization of affine loop nests for GPGPUs
- M. M. Baskaran, U. Bondhugula, S. Krishnamoorthy, J. Ramanujam, A. Rountev, and P. Sadayappan. A compiler framework for optimization of affine loop nests for GPGPUs. In Proceedings of ICS, 2008.
- (2008) Proceedings of ICS
- Baskaran, M.M.¹ Bondhugula, U.² Krishnamoorthy, S.³ Ramanujam, J.⁴ Rountev, A.⁵ Sadayappan, P.⁶

4
- 0023346636
- Partitioning strategy for nonuniform problems on multiprocessors
- M. Berger and S. Bokhari. A partitioning strategy for non-uniform problems on multiprocessors. IEEE Trans. Computers, 37(12):570-580, 1987. (Pubitemid 17582501)
- (1987) IEEE Transactions on Computers , vol.C-36 , Issue.5 , pp. 570-580
- Berger, M.J.¹ Bokhari, S.H.²

5
- 77958483977
- Running unstructured grid based cfd solvers on modern graphics hardware
- A. Corrigan, F. Camelli, R. Lohner, and J. Wallin. Running unstructured grid based cfd solvers on modern graphics hardware. In Proceedings of the 19th AIAA Computational Fluid Dynamics, 2009.
- (2009) Proceedings of the 19th AIAA Computational Fluid Dynamics
- Corrigan, A.¹ Camelli, F.² Lohner, R.³ Wallin, J.⁴

6
- 0037883031
- The design and implementation of a parallel unstructured euler solver using software primitives
- R. Das, D. Mavriplis, J. Saltz, S. Gupta, and R. Ponnusamy. The design and implementation of a parallel unstructured euler solver using software primitives. In Proceedings of the 30th Aerospace Science Meeting, 1992.
- (1992) Proceedings of the 30th Aerospace Science Meeting
- Das, R.¹ Mavriplis, D.² Saltz, J.³ Gupta, S.⁴ Ponnusamy, R.⁵

7
- 0001483604
- Communication optimizations for irregular scientific computations on distributed memory architectures
- R. Das, M. Uysal, J. Saltz, and Y.-S. Hwang. Communication optimizations for irregular scientific computations on distributed memory architectures. Journal of Parallel and Distributed Computing, 22(3):462-479, 1994.
- (1994) Journal of Parallel and Distributed Computing , vol.22 , Issue.3 , pp. 462-479
- Das, R.¹ Uysal, M.² Saltz, J.³ Hwang, Y.-S.⁴

8
- 1642502420
- Improving effective bandwidth through compiler enhancement of global cache reuse
- DOI 10.1016/j.jpdc.2003.09.005
- C. Ding and K. Kennedy. Improving effective bandwidth through compiler enhancement of global cache reuse. Journal of Parallel and Distributed Computing, 64(1):108-134, 2004. (Pubitemid 38117742)
- (2004) Journal of Parallel and Distributed Computing , vol.64 , Issue.1 , pp. 108-134
- Ding, C.¹ Kennedy, K.²

9
- 84864039589
- Fine-grained treatment to synchronizations in gpu-to-cpu translation
- Z. Guo and X. Shen. Fine-grained treatment to synchronizations in gpu-to-cpu translation. In Proc. of the Workshop on Languages and Compilers for Parallel Computing, 2011.
- (2011) Proc. of the Workshop on Languages and Compilers for Parallel Computing
- Guo, Z.¹ Shen, X.²

10
- 84856512446
- Correctly treating synchronizations in compiling fine-grained spmd-threaded programs for cpu
- Z. Guo, E. Zhang, and X. Shen. Correctly treating synchronizations in compiling fine-grained spmd-threaded programs for cpu. In Proceedings of the International Conference on Parallel Architecture and Compilation Techniques (PACT), 2011.
- (2011) Proceedings of the International Conference on Parallel Architecture and Compilation Techniques (PACT)
- Guo, Z.¹ Zhang, E.² Shen, X.³

11
- 12344323872
- Improving locality for adaptive irregular scientific codes
- White Plains, NY, August
- H. Han and C. W. Tseng. Improving locality for adaptive irregular scientific codes. In Proceedings of Workshop on Languages and Compilers for High-Performance Computing (LCPC'00), White Plains, NY, August 2000.
- (2000) Proceedings of Workshop on Languages and Compilers for High-Performance Computing (LCPC'00)
- Han, H.¹ Tseng, C.W.²

12
- 33745715056
- Exploiting locality for irregular scientific codes
- DOI 10.1109/TPDS.2006.88
- H. Han and C.-W. Tseng. Exploiting locality for irregular scientific codes. IEEE Transactions on Parallel and Distributed Systems, 17(7):606-618, 2006. (Pubitemid 43997184)
- (2006) IEEE Transactions on Parallel and Distributed Systems , vol.17 , Issue.7 , pp. 606-618
- Han, H.¹ Tseng, C.-W.²

13
- 0003684449
- Springer
- T. Hastie, R. Tibshirani, and J. Friedman. The elements of statistical learning. Springer, 2001.
- (2001) The Elements of Statistical Learning
- Hastie, T.¹ Tibshirani, R.² Friedman, J.³

14
- 0009406160
- A fast and high quality multilevel scheme for partitioning irregular graphs
- G. Karypis and V. Kumar. A fast and high quality multilevel scheme for partitioning irregular graphs. In Proceedings of ICPP, 1995.
- (1995) Proceedings of ICPP
- Karypis, G.¹ Kumar, V.²

15
- 79958785075
- Region-based parallelization of irregular reductions on explicitly managed memory hierarchies
- S. Kim, H. Han, and K. Choe. Region-based parallelization of irregular reductions on explicitly managed memory hierarchies. Journal of Supercomputing, 2009.
- (2009) Journal of Supercomputing
- Kim, S.¹ Han, H.² Choe, K.³

16
- 77957808385
- Optimistic parallelism benefits from data partitioning
- DOI 10.1145/1346281.1346311, ASPLOS XIII - Thirteenth International Conference on Architectural Support for Programming Languages and Operating Systems
- M. Kulkarni, K. Pingali, G. Ramanarayanan, B. Walter, K. Bala, and L. P. Chew. Optimistic parallelism benefits from data partitioning. In Proceedings of ASPLOS, pages 233-243, 2008. (Pubitemid 351585410)
- (2008) International Conference on Architectural Support for Programming Languages and Operating Systems - ASPLOS , pp. 233-243
- Kulkarni, M.¹ Pingali, K.² Ramanarayanan, G.³ Walter, B.⁴ Bala, K.⁵ Chew, L.P.⁶

17
- 67650081010
- OpenMP to GPGPU: A compiler framework for automatic translation and optimization
- S. Lee, S. Min, and R. Eigenmann. OpenMP to GPGPU: A compiler framework for automatic translation and optimization. In Proceedings of PPoPP, 2009.
- (2009) Proceedings of PPoPP
- Lee, S.¹ Min, S.² Eigenmann, R.³

18
- 0016940739
- Comparative analysis of the cuthill-mckee and the reverse cuthill-mckee ordering algorithms for sparse matrices
- April
- W. Liu and A. Sherman. Comparative analysis of the cuthill-mckee and the reverse cuthill-mckee ordering algorithms for sparse matrices. SIAM J. Numerical Analysis, 13(2), April 1976.
- (1976) SIAM J. Numerical Analysis , vol.13 , Issue.2
- Liu, W.¹ Sherman, A.²

19
- 70450103746
- A cross-input adaptive framework for gpu programs optimization
- Y. Liu, E. Z. Zhang, and X. Shen. A cross-input adaptive framework for gpu programs optimization. In Proceedings of International Parallel and Distributed Processing Symposium (IPDPS), pages 1-10, 2009.
- (2009) Proceedings of International Parallel and Distributed Processing Symposium (IPDPS) , pp. 1-10
- Liu, Y.¹ Zhang, E.Z.² Shen, X.³

20
- 84856559314
- G. Marin, G. Jin, and J. Mellor-Crummey. Managing locality in grand challenge applications: a case study of the gyrokinetic toroidal code. 2008.
- (2008) Managing Locality in Grand Challenge Applications: A Case Study of the Gyrokinetic Toroidal Code
- Marin, G.¹ Jin, G.² Mellor-Crummey, J.³

21
- 0032684978
- Improving memory hierarchy performance for irregular applications
- J. Mellor-Crummey, D. Whalley, and K. Kennedy. Improving memory hierarchy performance for irregular applications. In Proceedings of SC, 1999.
- (1999) Proceedings of SC
- Mellor-Crummey, J.¹ Whalley, D.² Kennedy, K.³

22
- 0033362479
- Localizing non-affine array references
- N. Mitchell, L. Carter, and J. Ferrante. Localizing non-affine array references. In Proceedings of PACT, 1999.
- (1999) Proceedings of PACT
- Mitchell, N.¹ Carter, L.² Ferrante, J.³

23
- 0036040497
- The hardness of cache conscious data placement
- Portland, Oregon, January
- E. Petrank and D. Rawitz. The hardness of cache conscious data placement. In Proceedings of ACM Symposium on Principles of Programming Languages, Portland, Oregon, January 2002.
- (2002) Proceedings of ACM Symposium on Principles of Programming Languages
- Petrank, E.¹ Rawitz, D.²

24
- 77954709868
- Compiler and runtime support for enabling generalized reduction computations on heterogeneous parallel configurations
- V. Ravi, W. Ma, D. Chiu, and G. Agrawal. Compiler and runtime support for enabling generalized reduction computations on heterogeneous parallel configurations. In Proceedings of ICS, 2010.
- (2010) Proceedings of ICS
- Ravi, V.¹ Ma, W.² Chiu, D.³ Agrawal, G.⁴

25
- 77953978573
- Efficient compilation of fine-grained spmd-threaded programs for multicore cpus
- J. Stratton, V. Grover, J. Marathe, B. Aarts, M. Murphy, Z. Hu, and W. Hwu. Efficient compilation of fine-grained spmd-threaded programs for multicore cpus. In CGO '10: Proceedings of the International Symposium on Code Generation and Optimization, 2010.
- (2010) CGO '10: Proceedings of the International Symposium on Code Generation and Optimization
- Stratton, J.¹ Grover, V.² Marathe, J.³ Aarts, B.⁴ Murphy, M.⁵ Hu, Z.⁶ Hwu, W.⁷

26
- 0038039924
- Compile-time composition of run-time data and iteration reorderings
- June
- M. M. Strout, L. Carter, and J. Ferrante. Compile-time composition of run-time data and iteration reorderings. In Proceedings of the 2003 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), June 2003.
- (2003) Proceedings of the 2003 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI)
- Strout, M.M.¹ Carter, L.² Ferrante, J.³

27
- 77954691442
- A gpgpu compiler for memory optimization and parallelism management
- Y. Yang, P. Xiang, J. Kong, and H. Zhou. A gpgpu compiler for memory optimization and parallelism management. In PLDI, 2010.
- (2010) PLDI
- Yang, Y.¹ Xiang, P.² Kong, J.³ Zhou, H.⁴

28
- 79953126288
- On-the-fly elimination of dynamic irregularities for gpu computing
- E. Zhang, Y. Jiang, Z. Guo, K. Tian, and X. Shen. On-the-fly elimination of dynamic irregularities for gpu computing. In Proceedings of the International Conference on Architectural Support for Programming Languages and Operating Systems, pages 369-380, 2011.
- (2011) Proceedings of the International Conference on Architectural Support for Programming Languages and Operating Systems , pp. 369-380
- Zhang, E.¹ Jiang, Y.² Guo, Z.³ Tian, K.⁴ Shen, X.⁵

29
- 77954724148
- Streamlining gpu applications on the fly
- E. Z. Zhang, Y. Jiang, Z. Guo, and X. Shen. Streamlining gpu applications on the fly. In Proceedings of ICS, 2010.
- (2010) Proceedings of ICS
- Zhang, E.Z.¹ Jiang, Y.² Guo, Z.³ Shen, X.⁴

30
- 8344272049
- Array regrouping and structure splitting using whole-program reference affinity
- June
- Y. Zhong, M. Orlovich, X. Shen, and C. Ding. Array regrouping and structure splitting using whole-program reference affinity. In Proceedings of ACM SIGPLAN Conference on Programming Language Design and Implementation, pages 255-266, June 2004.
- (2004) Proceedings of ACM SIGPLAN Conference on Programming Language Design and Implementation , pp. 255-266
- Zhong, Y.¹ Orlovich, M.² Shen, X.³ Ding, C.⁴

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.