SCOPUS Information Search Platform

Journal of Computational Physics

Volume 230, Issue 13, 2011, Pages 5383-5398

Importance of explicit vectorization for CPU and GPU software performance

(3) Dickson, Neil G. (a); Karimi, Kamran (a); Hamze, Firas (a)

(a) D-Wave Systems Inc. (Canada)

Author keywords

GPU; Ising model; Monte Carlo; Optimization; Performance; Vectorization

Indexed keywords

COMPUTER GRAPHICS; COMPUTER GRAPHICS EQUIPMENT; FLOCCULATION; ISING MODEL; MONTE CARLO METHODS; PROGRAM PROCESSORS; CURRENT; MONTE CARLO; MULTI-THREADING; OPTIMISATIONS; OPTIMIZATION TECHNIQUES; PARALLEL OPTIMIZATION; PERFORMANCE; PERFORMANCE COMPUTING; SOFTWARE PERFORMANCE; VECTORIZATION; GRAPHICS PROCESSING UNIT

EID: 79955479184; PISSN: 0021-9991; EISSN: 1090-2716; Source Type: Journal
DOI: 10.1016/j.jcp.2011.03.041; Document Type: Article

Times cited: 22

References (25)

1
- 33947588048
- ISSN: 0167-7055
- Owens J.D., Luebke D., et al. Computer Graphics Forum 2007, 26(1):80-113. ISSN: 0167-7055.
- (2007) Computer Graphics Forum , vol.26 , Issue.1 , pp. 80-113
- Owens, J.D.¹ Luebke, D.²

2
- 85198247001
- Princeton University Press
- Scott L.R., Clark T., Bagheri B. Scientific Parallel Computing 2005, Princeton University Press.
- (2005) Scientific Parallel Computing
- Scott, L.R.¹ Clark, T.² Bagheri, B.³

3
- 77951157944
- Morgan Kaufmann
- Kirk D., Hwu W. Programming Massively Parallel Processors: A Hands-on Approach 2010, Morgan Kaufmann.
- (2010) Programming Massively Parallel Processors: A Hands-on Approach
- Kirk, D.¹ Hwu, W.²

4
- 48349093945
- High performance computing for deformable image registration: towards a new paradigm in adaptive radiotherapy
- Samant S.S., Xia J., Muyan-Özçelik P., Owens J.D. High performance computing for deformable image registration: towards a new paradigm in adaptive radiotherapy. Medical Physics 2008, 35(8):3546-3553.
- (2008) Medical Physics , vol.35 , Issue.8 , pp. 3546-3553
- Samant, S.S.¹ Xia, J.² Muyan-Özçelik, P.³ Owens, J.D.⁴

5
- 67349267818
- GPU accelerated Monte Carlo simulation of the 2D and 3D Ising model
- Preis T., Virnau P., et al. GPU accelerated Monte Carlo simulation of the 2D and 3D Ising model. Journal of Computational Physics 2009, 228:4468-4477.
- (2009) Journal of Computational Physics , vol.228 , pp. 4468-4477
- Preis, T.¹ Virnau, P.²

6
- 10644279605
- Benchmarking and implementation of probability-based simulations on programmable graphics cards
- Tomov S., McGuigan M., et al. Benchmarking and implementation of probability-based simulations on programmable graphics cards. Computers & Graphics 2005, 29:71-80.
- (2005) Computers & Graphics , vol.29 , pp. 71-80
- Tomov, S.¹ McGuigan, M.²

7
- 79959466764
- Optimization principles and application performance of a multithreaded GPU using CUDA, in: Proceedings of the 13th ACM SIGPLAN Symposium on Principles and practice of parallel programming
- S. Ryoo, C. Rodrigues, et al., Optimization principles and application performance of a multithreaded GPU using CUDA, in: Proceedings of the 13th ACM SIGPLAN Symposium on Principles and practice of parallel programming, 2008, pp. 73-82.
- (2008) , pp. 73-82
- Ryoo, S.¹ Rodrigues, C.²

8
- 0037952146
- Morgan Kaufmann Publishers, Academic Press
- Allen R., Kennedy K. Optimizing Compilers for Modern Architectures 2002, Morgan Kaufmann Publishers, Academic Press.
- (2002) Optimizing Compilers for Modern Architectures
- Allen, R.¹ Kennedy, K.²

9
- 33646558229
- Using advanced compiler technology to exploit the performance of the cell broadband engine architecture
- Eichenberger A.E., O'Brien J.K., et al. Using advanced compiler technology to exploit the performance of the cell broadband engine architecture. IBM Systems Journal 2006, 45(1):59-84.
- (2006) IBM Systems Journal , vol.45 , Issue.1 , pp. 59-84
- Eichenberger, A.E.¹ O'Brien, J.K.²

10
- 85190229132
- Intel 64 and IA-32 Architectures Software Developer's Manual, Intel Corporation Order Numbers 253665 through 253669
- Intel 64 and IA-32 Architectures Software Developer's Manual, Intel Corporation Order Numbers 253665 through 253669, 2011.
- (2011)

11
- 0031599142
- Mersenne twister: A 623-Dimensionally equidistributed uniform pseudo-random number generator
- Matsumoto M., Nishimura T. Mersenne twister: A 623-Dimensionally equidistributed uniform pseudo-random number generator. ACM Transactions on Modeling and Computer Simulation 1998, 8(1):3-30.
- (1998) ACM Transactions on Modeling and Computer Simulation , vol.8 , Issue.1 , pp. 3-30
- Matsumoto, M.¹ Nishimura, T.²

12
- 0003657590
- The art of computer programming
- Addison Wesley Longman Publishing
- Knuth D.E. The art of computer programming. Sorting and Searching 1998, vol. 3. Addison Wesley Longman Publishing.
- (1998) Sorting and Searching , vol.3
- Knuth, D.E.¹

13
- 5744249209
- Equation of state calculations by fast computing machines
- Metropolis N., Rosenbluth A.W., et al. Equation of state calculations by fast computing machines. Journal of Chemical Physics 1953, 21(6).
- (1953) Journal of Chemical Physics , vol.21 , Issue.6
- Metropolis, N.¹ Rosenbluth, A.W.²

14
- 14644408446
- World Scientific Publishing
- Berg B.A. Markov chain Monte Carlo simulations and their statistical analysis 2004, World Scientific Publishing.
- (2004) Markov chain Monte Carlo simulations and their statistical analysis
- Berg, B.A.¹

15
- 34548731458
- Oxford University Press, US
- Anderson J.B. Quantum Monte Carlo: Origins, Development, Applications 2007, Oxford University Press, US.
- (2007) Quantum Monte Carlo: Origins, Development, Applications
- Anderson, J.B.¹

16
- 79951638366
- High-performance physics simulations using multi-core CPUs and GPGPUs in a volunteer computing context
- Karimi K., Dickson N., Hamze F. High-performance physics simulations using multi-core CPUs and GPGPUs in a volunteer computing context. International Journal of High-Performance Applications 2011, 25(1).
- (2011) International Journal of High-Performance Applications , vol.25 , Issue.1
- Karimi, K.¹ Dickson, N.² Hamze, F.³

17
- 77952849732
- Robust parameter selection for parallel tempering
- Hamze F., Dickson N., Karimi K. Robust parameter selection for parallel tempering. International Journal of Modern Physics C 2010, 21(5).
- (2010) International Journal of Modern Physics C , vol.21 , Issue.5
- Hamze, F.¹ Dickson, N.² Karimi, K.³

18
- 85190223871
- Investigating the performance of an adiabatic quantum optimization processor
- Karimi K., Dickson N., Hamze F., et al. Investigating the performance of an adiabatic quantum optimization processor. Quantum Information Processing 2011.
- (2011) Quantum Information Processing
- Karimi, K.¹ Dickson, N.² Hamze, F.³

19
- 0001064590
- Generalized Trotter's formula and systematic approximants of exponential operators and inner derivations with applications to many-body problems
- Suzuki M. Generalized Trotter's formula and systematic approximants of exponential operators and inner derivations with applications to many-body problems. Communications in Mathematical Physics 1976, 51(2):183-190.
- (1976) Communications in Mathematical Physics , vol.51 , Issue.2 , pp. 183-190
- Suzuki, M.¹

20
- 85190199130
- Improving Particle Filter Performance Using SSE Instructions, in: International Conference on Intelligent Robotics and Systems, USA
- P. Djeu, M. Quinlan, P. Stone, Improving Particle Filter Performance Using SSE Instructions, in: International Conference on Intelligent Robotics and Systems, USA, 2009.
- (2009)
- Djeu, P.¹ Quinlan, M.² Stone, P.³

21
- 85190179298
- Efficient SIMD numerical interpolation, in: International Conference on High Performance Computing and Communications (HPCC), Italy
- H. Ahmadi, M. Moslemi-Naeini, H. Sarbazi-Azad, Efficient SIMD numerical interpolation, in: International Conference on High Performance Computing and Communications (HPCC), Italy, 2005.
- (2005)
- Ahmadi, H.¹ Moslemi-Naeini, M.² Sarbazi-Azad, H.³

22
- 85190200116
- GPU-CPU multi-core for real-time signal processing, in: 2009 Digest of Technical Papers International Conference on Consumer Electronics
- S.P. Mohanty, GPU-CPU multi-core for real-time signal processing, in: 2009 Digest of Technical Papers International Conference on Consumer Electronics, 2009, pp. 1-2.
- (2009) , pp. 1-2
- Mohanty, S.P.¹

23
- 85190209175
- Intel 64 and IA-32 Architectures Optimization Reference Manual, Intel Corporation Order Number 248966, November
- Intel 64 and IA-32 Architectures Optimization Reference Manual, Intel Corporation Order Number 248966, November 2007.
- (2007)

24
- 4544235762
- Xorshift RNGs
- Marsaglia G. Xorshift RNGs. Journal of Statistical Software 2003, 8(14):1-6.
- (2003) Journal of Statistical Software , vol.8 , Issue.14 , pp. 1-6
- Marsaglia, G.¹

25
- 85190203290
- NVIDIA OpenCL Best Practices Guide, Version 1.0, NVIDIA Corporation
- NVIDIA OpenCL Best Practices Guide, Version 1.0, NVIDIA Corporation, 2009.
- (2009)

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.