SCOPUS Information Search Platform

Procedia Computer Science

Volume 9, 2012, Pages 1900-1909

Benchmarking data and compute intensive applications on modern CPU and GPU architectures

(5) Ciznicki, Miłosz (a); Kierzynka, Michał (a); Kopta, Piotr (a); Kurowski, Krzysztof (a); Gepner, Paweł (b)

(a) Poznan Supercomputing and Networking Center (Poland)

(b) Intel Corporation (United Kingdom)

Author keywords

Benchmarks; GPU; JPEG 2000; Multi core CPU; Signal processing

EID: 84890826444 PISSN: 18770509 EISSN: None Source Type: Conference Proceeding
DOI: 10.1016/j.procs.2012.04.208 Document Type: Conference Paper

Times cited: 12

References (38)

1
- 0042674307
- The LINPACK benchmark: Past, present and future
- J. Dongarra, P. Luszczek, A. Petitet, The LINPACK Benchmark: past, present and future, Concurrency and Computation: Practice and Experience 15 (9) (2003) 803-820.
- (2003) Concurrency and Computation: Practice and Experience , vol.15 , Issue.9 , pp. 803-820
- Dongarra, J.¹ Luszczek, P.² Petitet, A.³

2
- 35348885705
- LAPACK users' guide
- E. Anderson, Z. Bai, C. Bischof, LAPACK Users' guide, Vol. 9, Society for Industrial Mathematics, 1999.
- (1999) Society for Industrial Mathematics , vol.9
- Anderson, E.¹ Bai, Z.² Bischof, C.³

3
- 0026294379
- The NAS parallel benchmarks summary and preliminary results
- Proceedings of the 1991 ACM/IEEE Conference on, IEEE
- D. Bailey, E. Barszcz, J. Barton, D. Browning, R. Carter, L. Dagum, R. Fatoohi, P. Frederickson, T. Lasinski, R. Schreiber, et al, The NAS parallel benchmarks summary and preliminary results, in: Supercomputing, 1991. Supercomputing'91. Proceedings of the 1991 ACM/IEEE Conference on, IEEE, 1991, pp. 158-165.
- (1991) Supercomputing, 1991. Supercomputing'91 , pp. 158-165
- Bailey, D.¹ Barszcz, E.² Barton, J.³ Browning, D.⁴ Carter, R.⁵ Dagum, L.⁶ Fatoohi, R.⁷ Frederickson, P.⁸ Lasinski, T.⁹ Schreiber, R.¹⁰

4
- 83755206878
- CaKernel - A parallel application programming framework for heterogenous computing architectures
- M. Blazewicz, S. Brandt, M. Kierzynka, K. Kurowski, B. Ludwiczak, J. Tao, J. Weglarz, CaKernel - a parallel application programming framework for heterogenous computing architectures, Scientific Programming 19 (4) (2011) 185-197.
- (2011) Scientific Programming , vol.19 , Issue.4 , pp. 185-197
- Blazewicz, M.¹ Brandt, S.² Kierzynka, M.³ Kurowski, K.⁴ Ludwiczak, B.⁵ Tao, J.⁶ Weglarz, J.⁷

5
- 85030961442
- LNCS, To appear
- M. Ciznicki, M. Kierzynka, K. Kurowski, B. Ludwiczak, K. Napierala, J. Palczynski, Efficient isosurface extraction using marching tetrahedra and histogram pyramids on multiple GPUs, LNCS, To appear.
- Efficient Isosurface Extraction Using Marching Tetrahedra and Histogram Pyramids on Multiple GPUs
- Ciznicki, M.¹ Kierzynka, M.² Kurowski, K.³ Ludwiczak, B.⁴ Napierala, K.⁵ Palczynski, J.⁶

6
- 79956218804
- Protein alignment algorithms with an efficient backtracking routine on multiple GPUs
- (181)
- J. Blazewicz, W. Frohmberg, M. Kierzynka, E. Pesch, P. Wojciechowski, Protein alignment algorithms with an efficient backtracking routine on multiple GPUs, BMC Bioinformatics 12:181 (181).
- BMC Bioinformatics , vol.12 , pp. 181
- Blazewicz, J.¹ Frohmberg, W.² Kierzynka, M.³ Pesch, E.⁴ Wojciechowski, P.⁵

7
- 85030960469
- G-MSA - GPU-based, fast and accurate algorithm for multiple sequence alignment
- To appear
- J. Blazewicz, W. Frohmberg, M. Kierzynka, P. Wojciechowski, G-MSA - GPU-based, fast and accurate algorithm for multiple sequence alignment, Journal of Parallel and Distributed Computing, To appear.
- Journal of Parallel and Distributed Computing
- Blazewicz, J.¹ Frohmberg, W.² Kierzynka, M.³ Wojciechowski, P.⁴

8
- 79958266939
- Parallel application benchmarks and performance evaluation of the Intel Xeon 7500 family processors
- P. Kopta, M. Kulczewski, K. Kurowski, T. Piontek, P. Gepner, M. Puchalski, J. Komasa, Parallel application benchmarks and performance evaluation of the Intel Xeon 7500 family processors, Procedia Computer Science 4 (2011) 372-381.
- (2011) Procedia Computer Science , vol.4 , pp. 372-381
- Kopta, P.¹ Kulczewski, M.² Kurowski, K.³ Piontek, T.⁴ Gepner, P.⁵ Puchalski, M.⁶ Komasa, J.⁷

9
- 79958284905
- Innovative Computing Laboratory, University of Tennessee, Tech. Rep.
- R. Nath, S. Tomov, J. Dongarra, An improved MAGMA GEMM for Fermi GPUs, Innovative Computing Laboratory, University of Tennessee, Tech. Rep.
- An Improved MAGMA GEMM for Fermi GPUs
- Nath, R.¹ Tomov, S.² Dongarra, J.³

10
- 77954701719
- FAST: Fast architecture sensitive tree search on modern CPUs and GPUs
- ACM
- C. Kim, J. Chhugani, N. Satish, E. Sedlar, A. Nguyen, T. Kaldewey, V. Lee, S. Brandt, P. Dubey, FAST: fast architecture sensitive tree search on modern CPUs and GPUs, in: Proceedings of the 2010 international conference on Management of data, ACM, 2010, pp. 339-350.
- (2010) Proceedings of the 2010 International Conference on Management of Data , pp. 339-350
- Kim, C.¹ Chhugani, J.² Satish, N.³ Sedlar, E.⁴ Nguyen, A.⁵ Kaldewey, T.⁶ Lee, V.⁷ Brandt, S.⁸ Dubey, P.⁹

11
- 77954743119
- Fast sort on CPUs and GPUs: A case for bandwidth oblivious SIMD sort
- ACM
- N. Satish, C. Kim, J. Chhugani, A. Nguyen, V. Lee, D. Kim, P. Dubey, Fast sort on CPUs and GPUs: a case for bandwidth oblivious SIMD sort, in: Proceedings of the 2010 international conference on Management of data, ACM, 2010, pp. 351-362.
- (2010) Proceedings of the 2010 International Conference on Management of Data , pp. 351-362
- Satish, N.¹ Kim, C.² Chhugani, J.³ Nguyen, A.⁴ Lee, V.⁵ Kim, D.⁶ Dubey, P.⁷

12
- 74049114159
- Auto-tuning 3-D FFT library for CUDA GPUs
- ACM
- A. Nukada, S. Matsuoka, Auto-tuning 3-D FFT library for CUDA GPUs, in: Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis, ACM, 2009, p. 30.
- (2009) Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis , pp. 30
- Nukada, A.¹ Matsuoka, S.²

13
- 79955741773
- GPU-based DWT acceleration for JPEG2000
- J. Matela, GPU-based DWT acceleration for JPEG2000, in: Annual Doctoral Workshop on Mathematical and Engineering Methods in Computer Science, 2009, pp. 136-143.
- (2009) Annual Doctoral Workshop on Mathematical and Engineering Methods in Computer Science , pp. 136-143
- Matela, J.¹

14
- 78649781713
- Accelerating wavelet lifting on graphics hardware using CUDA
- W. van der Laan, A. Jalba, J. Roerdink, Accelerating wavelet lifting on graphics hardware using CUDA, IEEE Trans. Parallel Distrib. Syst 22 (1) (2011) 132-146.
- (2011) IEEE Trans. Parallel Distrib. Syst , vol.22 , Issue.1 , pp. 132-146
- van der Laan, W.¹ Jalba, A.² Roerdink, J.³

15
- 60649110203
- Computing discrete transforms on the cell broadband engine
- D. Bader, V. Agarwal, S. Kang, Computing discrete transforms on the Cell Broadband Engine, Parallel Computing 35 (3) (2009) 119-137.
- (2009) Parallel Computing , vol.35 , Issue.3 , pp. 119-137
- Bader, D.¹ Agarwal, V.² Kang, S.³

16
- 70349160422
- A parallel implementation of the 2D wavelet transform using CUDA
- 2009 17th Euromicro International Conference on, IEEE
- J. Franco, G. Bernabé, J. Fernández, M. Acacio, A parallel implementation of the 2D wavelet transform using CUDA, in: Parallel, Distributed and Network-based Processing, 2009 17th Euromicro International Conference on, IEEE, 2009, pp. 111-118.
- (2009) Parallel, Distributed and Network-based Processing , pp. 111-118
- Franco, J.¹ Bernabé, G.² Fernández, J.³ Acacio, M.⁴

17
- 84895561443
- Considerations when evaluating microprocessor platforms
- M. Anderson, B. Catanzaro, J. Chong, E. Gonina, K. Keutzer, C. Lai, M. Murphy, D. Sheffield, B. Su, N. Sundaram, Considerations when evaluating microprocessor platforms, in: Proceedings of the 3rd USENIX conference on Hot topics in parallelism, USENIX Association, 2011, pp. 1-1.
- (2011) Proceedings of the 3rd USENIX Conference on Hot Topics in Parallelism, USENIX Association , pp. 1-1
- Anderson, M.¹ Catanzaro, B.² Chong, J.³ Gonina, E.⁴ Keutzer, K.⁵ Lai, C.⁶ Murphy, M.⁷ Sheffield, D.⁸ Su, B.⁹ Sundaram, N.¹⁰

18
- 77954995885
- Debunking the 100x GPU vs. CPU myth: An evaluation of throughput computing on CPU and GPU
- ACM
- V. Lee, C. Kim, J. Chhugani, M. Deisher, D. Kim, A. Nguyen, N. Satish, M. Smelyanskiy, S. Chennupaty, P. Hammarlund, et al, Debunking the 100x GPU vs. CPU myth: an evaluation of throughput computing on CPU and GPU, in: ACM SIGARCH Computer Architecture News, Vol. 38, ACM, 2010, pp. 451-460.
- (2010) ACM SIGARCH Computer Architecture News , vol.38 , pp. 451-460
- Lee, V.¹ Kim, C.² Chhugani, J.³ Deisher, M.⁴ Kim, D.⁵ Nguyen, A.⁶ Satish, N.⁷ Smelyanskiy, M.⁸ Chennupaty, S.⁹ Hammarlund, P.¹⁰

19
- 85092761228
- On the limits of GPU acceleration
- R. Vuduc, A. Chandramowlishwaran, J. Choi, M. Guney, A. Shringarpure, On the limits of GPU acceleration, in: Proceedings of the 2nd USENIX conference on Hot topics in parallelism, USENIX Association, 2010, pp. 13-13.
- (2010) Proceedings of the 2nd USENIX Conference on Hot Topics in Parallelism, USENIX Association , pp. 13-13
- Vuduc, R.¹ Chandramowlishwaran, A.² Choi, J.³ Guney, M.⁴ Shringarpure, A.⁵

20
- 80052336166
- Lessons learned from exploring the backtracking paradigm on the GPU
- J. Jenkins, I. Arkatkar, J. Owens, A. Choudhary, N. Samatova, Lessons learned from exploring the backtracking paradigm on the GPU, Euro-Par 2011 Parallel Processing (2011) 425-437.
- (2011) Euro-Par 2011 Parallel Processing , pp. 425-437
- Jenkins, J.¹ Arkatkar, I.² Owens, J.³ Choudhary, A.⁴ Samatova, N.⁵

21
- 0003984680
- Information technology - JPEG 2000 image coding system: Core coding system (2004).
- (2004) Information Technology - JPEG 2000 Image Coding System: Core Coding System

22
- 78649975505
- JPEG 2000 compression of medical imagery
- D. Foos, E. Muka, R. Slone, B. Erickson, M. Flynn, D. Clunie, K. Lloyd Hildebrand, S. Young, JPEG 2000 compression of medical imagery, in: Proc. of SPIE, Vol. 3980, 2002.
- (2002) Proc. of SPIE , vol.3980
- Foos, D.¹ Muka, E.² Slone, R.³ Erickson, B.⁴ Flynn, M.⁵ Clunie, D.⁶ Lloyd Hildebrand, K.⁷ Young, S.⁸

23
- 81155126075
- GPU implementation of JPEG2000 for hyperspectral image compression
- M. Ciznicki, K. Kurowski, A. Plaza, GPU implementation of JPEG2000 for hyperspectral image compression, in: Society of Photo-Optical Instrumentation Engineers (SPIE) Conference Series, Vol. 8183, 2011, p. 12.
- (2011) Society of Photo-Optical Instrumentation Engineers (SPIE) Conference Series , vol.8183 , pp. 12
- Ciznicki, M.¹ Kurowski, K.² Plaza, A.³

24
- 0034229499
- High performance scalable image compression with EBCOT
- IEEE transactions on
- D. Taubman, High performance scalable image compression with EBCOT, Image Processing, IEEE transactions on 9 (7) (2000) 1158-1170.
- (2000) Image Processing , vol.9 , Issue.7 , pp. 1158-1170
- Taubman, D.¹

25
- 34247544326
- Transform coding techniques for lossy hyperspectral data compression
- IEEE Transactions on
- B. Penna, T. Tillo, E. Magli, G. Olmo, Transform coding techniques for lossy hyperspectral data compression, Geoscience and Remote Sensing, IEEE Transactions on 45 (5) (2007) 1408-1421.
- (2007) Geoscience and Remote Sensing , vol.45 , Issue.5 , pp. 1408-1421
- Penna, B.¹ Tillo, T.² Magli, E.³ Olmo, G.⁴

26
- 68149112974
- Intel Press
- S. Taylor, Optimizing Applications for Multi-core Processors: Using the Intel Integrated Performance Primitives, Intel Press, 2007.
- (2007) Optimizing Applications for Multi-core Processors: Using the Intel Integrated Performance Primitives
- Taylor, S.¹

27
- 57649208491
- A performance evaluation of the Nehalem quad-core processor for scientific computing
- K. Barker, K. Davis, A. Hoisie, D. Kerbyson, M. Lang, S. Pakin, J. Sancho, A performance evaluation of the Nehalem quad-core processor for scientific computing, Parallel Processing Letters 18 (4).
- Parallel Processing Letters , vol.18 , Issue.4
- Barker, K.¹ Davis, K.² Hoisie, A.³ Kerbyson, D.⁴ Lang, M.⁵ Pakin, S.⁶ Sancho, J.⁷

28
- 77956439996
- Early performance evaluation of new six-core Intel® Xeon® 5600 family processors for HPC
- P. Gepner, M. Kowalik, D. Fraser, K. Wackowski, Early performance evaluation of new Six-Core Intel® Xeon® 5600 family processors for HPC, in: Parallel and Distributed Computing (ISPDC), 2010 Ninth International Symposium on, IEEE, 2010, pp. 117-124.
- (2010) Parallel and Distributed Computing (ISPDC), 2010 Ninth International Symposium On, IEEE , pp. 117-124
- Gepner, P.¹ Kowalik, M.² Fraser, D.³ Wackowski, K.⁴

29
- 84856653383
- July
- Intel Advanced Vector Extensions Programming Reference (July 2009).
- (2009) Intel Advanced Vector Extensions Programming Reference

30
- 84856647887
- Evaluation of executing DGEMM algorithms on modern multicore CPU
- P. Gepner, V. Gamayunov, D. L. Fraser, Evaluation of Executing DGEMM Algorithms on modern Multicore CPU, in: Proceedings of the Parallel and Distributed Computing and Systems 2011 Conference, 2011.
- (2011) Proceedings of the Parallel and Distributed Computing and Systems 2011 Conference
- Gepner, P.¹ Gamayunov, V.² Fraser, D.L.³

31
- 77952295055
- Parallel GPU implementation of iterative PCA algorithms
- M. Andrecut, Parallel GPU implementation of iterative PCA algorithms, Journal of Computational Biology 16 (11) (2009) 1593-1599.
- (2009) Journal of Computational Biology , vol.16 , Issue.11 , pp. 1593-1599
- Andrecut, M.¹

32
- 85030958315
- GPU JPEG2K, https://apps.man.poznan.pl/trac/jpeg2k/.
- GPU JPEG2K

33
- 85030968611
- A. Weiß, M. Heide, S. Papandreou, N. Fürst, CUJ2K.
- CUJ2K
- Weiß, A.¹ Heide, M.² Papandreou, S.³ Fürst, N.⁴

34
- 84875579589
- Test images, http://www.imagecompression.info/testimages/.
- Test Images

35
- 33746684281
- High efficiency EBCOT with parallel coding architecture for JPEG2000
- (2006)
- J. Chiang, C. Chang, C. Hsieh, C. Hsia, High efficiency EBCOT with parallel coding architecture for JPEG2000, EURASIP journal on applied signal processing 2006 (2006) 17-17.
- (2006) EURASIP Journal on Applied Signal Processing , pp. 17-17
- Chiang, J.¹ Chang, C.² Hsieh, C.³ Hsia, C.⁴

36
- 77950498517
- Parallelizing motion JPEG 2000 with CUDA
- ICCEE'09. Second International Conference on IEEE
- S. Datla, N. Gidijala, Parallelizing motion JPEG 2000 with CUDA, in: Computer and Electrical Engineering, 2009. ICCEE'09. Second International Conference on, Vol. 1, IEEE, 2009, pp. 630-634.
- (2009) Computer and Electrical Engineering, 2009 , vol.1 , pp. 630-634
- Datla, S.¹ Gidijala, N.²

37
- 85127697308
- GPU-based sample-parallel context modeling for EBCOT in JPEG2000
- Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik
- J. Matela, V. Rusnak, P. Holub, GPU-Based Sample-Parallel Context Modeling for EBCOT in JPEG2000, in: Sixth Doctoral Workshop on Mathematical and Engineering Methods in Computer Science (MEMICS'10)-Selected Papers, Vol. 16, Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik, pp. 77-84.
- Sixth Doctoral Workshop on Mathematical and Engineering Methods in Computer Science (MEMICS'10)-Selected Papers , vol.16 , pp. 77-84
- Matela, J.¹ Rusnak, V.² Holub, P.³

38
- 84906349576
- N. Whitehead, A. Fit-Florea, Precision & Performance: Floating Point and IEEE 754 Compliance for NVIDIA GPUs, NVIDIA white paper, 2011.
- (2011) Precision & Performance: Floating Point and IEEE 754 Compliance for NVIDIA GPUs
- Whitehead, N.¹ Fit-Florea, A.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.