SCOPUS Information Retrieval Platform

Scientific Computing with Multicore and Accelerators

Volume , Issue , 2010, Pages 83-110

Sparse matrix-vector multiplication on multicore and accelerators

Authors (6): Williams, Samuel (a); Bell, Nathan (b); Choi, Jee Whan (c); Garland, Michael (b); Oliker, Leonid (a); Vuduc, Richard (c)

a LAWRENCE BERKELEY NATIONAL LABORATORY (United States)

b NVIDIA (United States)

c GEORGIA INSTITUTE OF TECHNOLOGY (United States)

Author keywords

[No Author keywords available]

Indexed keywords

COMPUTER AIDED ENGINEERING; SEARCH ENGINES;

COMPUTATIONAL INTENSITY; ECONOMIC MODELING; ENGINEERING COMPUTING; HIGH PERFORMANCE; MEMORY ACCESS; MULTI CORE; SPARSE MATRIX-VECTOR MULTIPLICATION; STATE-OF-THE-ART SYSTEM;

MATRIX ALGEBRA;

EID: 85017244495 PISSN: None EISSN: None Source Type: Book
DOI: 10.1201/b10376 Document Type: Chapter

Times cited: 28

References (16)

1
- 85054422862
- POSIX Threads (pthread.h), IEEE Std 1003.1
- The Open Group Base Specifications, Issue 6: POSIX Threads (pthread.h). IEEE Std 1003.1, 2004. http://www.opengroup.org/onlinepubs/009695399/basedefs/pthread.h.html.
- (2004) The Open Group Base Specifications , Issue.6

2
- 67650694407
- December
- NVIDIA CUDA (Compute Unified Device Architecture): Programming Guide, Version 2.1. http://developer.download.nvidia.com/compute/cuda/2_1/toolkit/docs/NVIDIA_CUDA_Programming_Guide_2.1.pdf, December 2008.
- (2008) NVIDIA CUDA (Compute Unified Device Architecture): Programming Guide, Version 2.1

3
- 33745612838
- May
- OpenMP: Application Program Interface, version 3.0, May 2008. http://www.openmp.org/mp-documents/spec30.pdf.
- (2008) OpenMP: Application Program Interface, version 3.0

4
- 72849129747
- Technical Report RC24704 (W0812-047), IBM T.J. Watson Research Center, Yorktown Heights, NY, USA, December
- Muthu Manikandan Baskaran and Rajesh Bordawekar. Optimizing sparse matrix-vector multiplication on GPUs using compile-time and run-time strategies. Technical Report RC24704 (W0812-047), IBM T.J. Watson Research Center, Yorktown Heights, NY, USA, December 2008.
- (2008) Optimizing sparse matrix-vector multiplication on GPUs using compile-time and run-time strategies
- Baskaran, M.M.¹ Bordawekar, R.²

5
- 74049143158
- Implementing sparse matrix-vector multiplication on throughput-oriented processors
- Portland, OR, USA, November
- Nathan Bell and Michael Garland. Implementing sparse matrix-vector multiplication on throughput-oriented processors. In Proc. ACM/IEEE Conf. Supercomputing (SC), Portland, OR, USA, November 2009.
- (2009) Proc. ACM/IEEE Conf. Supercomputing (SC)
- Bell, N.¹ Garland, M.²

6
- 0002924006
- Technical Report, Carnegie Mellon University, Department of Computer Science, Pittsburgh, PA, USA, August
- Guy E. Blelloch, Michael A. Heroux, and Marco Zagha. Segmented operations for sparse matrix computations on vector multiprocessors. Technical Report, Carnegie Mellon University, Department of Computer Science, Pittsburgh, PA, USA, August 1993.
- (1993) Segmented operations for sparse matrix computations on vector multiprocessors
- Blelloch, G.E.¹ Heroux, M.A.² Zagha, M.³

7
- 77749340082
- Model-driven autotuning of sparse matrix-vector multiply on GPUs
- Bangalore, India, January
- Jee Whan Choi, Amik Singh, and Richard W. Vuduc. Model-driven autotuning of sparse matrix-vector multiply on GPUs. In Proc. ACM SIGPLAN Symp. Principles and Practice of Parallel Programming (PPoPP), Bangalore, India, January 2010.
- (2010) Proc. ACM SIGPLAN Symp. Principles and Practice of Parallel Programming (PPoPP)
- Choi, J.W.¹ Singh, A.² Vuduc, R.W.³

8
- 1542501019
- Sparsity: Optimization framework for sparse matrix kernels
- February
- Eun-Jin Im, Katherine Yelick, and Richard Vuduc. Sparsity: Optimization framework for sparse matrix kernels. Int’l J. of High Performance Computing Applications (IJHPCA), 18(1): 135-158, February 2004.
- (2004) Int’l J. of High Performance Computing Applications (IJHPCA) , vol.18 , Issue.1 , pp. 135-158
- Im, E.-J.¹ Yelick, K.² Vuduc, R.³

9
- 77951900491
- Whitepaper (electronic), September
- NVIDIA. NVIDIA’s next generation CUDA compute architecture: Fermi™, v1.1. Whitepaper (electronic), September 2009. http://www.nvidia.com/content/PDF/fermi_white_papers/NVIDIA_Fermi_Compute_Architecture_Whitepaper.pdf.
- (2009) NVIDIA’s next generation CUDA compute architecture: Fermi™, v1.1

10
- 0003877041
- Springer-Verlag
- John R. Rice and Ronald F. Boisvert. Solving Elliptic Problems Using ELLPACK. Springer-Verlag, 1984.
- (1984) Solving Elliptic Problems Using ELLPACK
- Rice, J.R.¹ Boisvert, R.F.²

11
- 1842829625
- Society for Industrial and Applied Mathematics, April
- Yousef Saad. Iterative Methods for Sparse Linear Systems, Second Edition. Society for Industrial and Applied Mathematics, April 2003.
- (2003) Iterative Methods for Sparse Linear Systems, Second Edition
- Saad, Y.¹

12
- 78651284120
- Scan primitives for GPU computing
- San Diego, CA, USA
- Shubhabrata Sengupta, Mark Harris, Yao Zhang, and John D. Owens. Scan primitives for GPU computing. In Proc. ACM SIGGRAPH/EUROGRAPHICS Symp. Graphics Hardware, San Diego, CA, USA, 2007.
- (2007) Proc. ACM SIGGRAPH/EUROGRAPHICS Symp. Graphics Hardware
- Sengupta, S.¹ Harris, M.² Zhang, Y.³ Owens, J.D.⁴

13
- 10044233808
- PhD thesis, University of California, Berkeley, CA, USA, January
- Richard W. Vuduc. Automatic performance tuning of sparse matrix kernels. PhD thesis, University of California, Berkeley, CA, USA, January 2004.
- (2004) Automatic performance tuning of sparse matrix kernels
- Vuduc, R.W.¹

14
- 56749158843
- Optimization of sparse matrix-vector multiplication on emerging multicore platforms
- Sam Williams, Leonid Oliker, Richard Vuduc, John Shalf, Katherine Yelick, and James Demmel. Optimization of sparse matrix-vector multiplication on emerging multicore platforms. In Proc. ACM/IEEE Conf. Supercomputing (SC), 2007.
- (2007) Proc. ACM/IEEE Conf. Supercomputing (SC)
- Williams, S.¹ Oliker, L.² Vuduc, R.³ Shalf, J.⁴ Yelick, K.⁵ Demmel, J.⁶

15
- 56749158843
- Optimizing sparse matrix-vector multiply on emerging multicore platforms
- March, Extends conference version
- Sam Williams, Richard Vuduc, Leonid Oliker, John Shalf, Katherine Yelick, and James Demmel. Optimizing sparse matrix-vector multiply on emerging multicore platforms. Parallel Computing (ParCo), 35(3): 178-194, March 2009. Extends conference version: http://dx.doi.org/10.1145/1362622.1362674.
- (2009) Parallel Computing (ParCo) , vol.35 , Issue.3 , pp. 178-194
- Williams, S.¹ Vuduc, R.² Oliker, L.³ Shalf, J.⁴ Yelick, K.⁵ Demmel, J.⁶

16
- 65649090648
- UCB/EECS-2008-164, University of California, Berkeley, CA, USA, December
- Samuel Webb Williams. Auto-tuning performance on multicore computers. UCB/EECS-2008-164, University of California, Berkeley, CA, USA, December 2008.
- (2008) Auto-tuning performance on multicore computers
- Williams, S.W.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.