SCOPUS Information Retrieval Platform

ICPE 2014 - Proceedings of the 5th ACM/SPEC International Conference on Performance Engineering

2014, Pages 137-148

Test-driving Intel Xeon Phi

(6) Fang, Jianbin a; Xu, Chuanfu b; Sips, Henk a; Che, Yonggang b; Zhang, Lilun b; Varbanescu, Ana Lucia c

a DELFT UNIVERSITY OF TECHNOLOGY (Netherlands)

b NATIONAL UNIVERSITY OF DEFENSE TECHNOLOGY (China)

c UNIVERSITY OF AMSTERDAM (Netherlands)

Author keywords

Experience with Xeon Phi; Microbenchmarking; Optimization; Performance analysis

Indexed keywords

COMPUTER ARCHITECTURE; MEDICAL IMAGING; OPTIMIZATION; PROGRAM PROCESSORS;

APPLICATION LEVEL; EXPERIENCE WITH XEON PHI; FUNCTIONAL CODES; IMAGING APPLICATIONS; MASSIVE PARALLELISM; MICRO-BENCHMARKING; PARALLELIZATIONS; PERFORMANCE ANALYSIS;

BENCHMARKING;

EID: 84899675168 PISSN: None EISSN: None Source Type: Conference Proceeding
DOI: 10.1145/2568088.2576799 Document Type: Conference Paper

Times cited : (79)

References (29)

1
- 0029191296
- Cilk: An efficient multithreaded runtime system
- R. D. Blumofe et al., Aug.
- R. D. Blumofe et al. Cilk: an efficient multithreaded runtime system. SIGPLAN Not., 30(8):207-216, Aug. 1995.
- (1995) SIGPLAN Not. , vol.30 , Issue.8 , pp. 207-216

2
- 84863918549
- Technical report, July
- O. A. R. Board. OpenMP application program interface (version 4.0). Technical report, July 2013.
- (2013) OpenMP Application Program Interface (version 4.0)
- Board, O.A.R.¹

3
- 70450059008
- Accelerating leukocyte tracking using CUDA: A case study in leveraging manycore coprocessors
- May
- M. Boyer et al. Accelerating leukocyte tracking using CUDA: A case study in leveraging manycore coprocessors. In IPDPS'09, May 2009.
- (2009) IPDPS'09
- Boyer, M.¹

4
- 59749100826
- Optimization and performance modeling of stencil computations on modern microprocessors
- Feb.
- K. Datta et al. Optimization and performance modeling of stencil computations on modern microprocessors. SIAM Rev., 51(1):129-159, Feb. 2009.
- (2009) SIAM Rev. , vol.51 , Issue.1 , pp. 129-159
- Datta, K.¹

5
- 0004224686
- May
- D. R. Butenhof. Programming with POSIX Threads. May 1997.
- (1997) Programming with POSIX Threads
- Butenhof, D.R.¹

6
- 84899692751
- Benchmarking intel xeon phi to guide kernel design
- Apr.
- J. Fang et al. Benchmarking intel xeon phi to guide kernel design. Technical Report PDS-2013-005, Delft University of Technology, Apr. 2013.
- (2013) Technical Report PDS-2013-005, Delft University of Technology
- Fang, J.¹

7
- 84899688523
- Lists of instruction latencies, throughputs and micro-operation breakdowns
- Feb.
- A. Fog. Lists of instruction latencies, throughputs and micro-operation breakdowns. Technical report, Copenhagen University, Feb. 2012.
- (2012) Technical Report, Copenhagen University
- Fog, A.¹

8
- 84880053798
- Modeling communication in cache-coherent SMP systems - A case-study with xeon phi
- S. R. Garea and T. Hoefler. Modeling Communication in Cache-Coherent SMP Systems - A Case-Study with Xeon Phi. 2013. HPDC'13.
- (2013) HPDC'13
- Garea, S.R.¹ Hoefler, T.²

9
- 33746763319
- Instruction latencies and throughput for AMD and intel x86 processors
- Feb.
- T. Granlund. Instruction latencies and throughput for AMD and intel x86 processors. Technical report, KTH, Feb. 2012.
- (2012) Technical Report, KTH
- Granlund, T.¹

10
- 2342441476
- Morgan Kaufmann, 5 edition, Sept.
- J. L. Hennessy and D. A. Patterson. Computer Architecture, Fifth Edition: A Quantitative Approach. Morgan Kaufmann, 5 edition, Sept. 2011.
- (2011) Computer Architecture, Fifth Edition: A Quantitative Approach
- Hennessy, J.L.¹ Patterson, D.A.²

11
- 84880553500
- Intel Sept.
- Intel. Intel Xeon Phi Coprocessor Instruction Set Architecture Reference Manual, Sept. 2012.
- (2012) Intel Xeon Phi Coprocessor Instruction Set Architecture Reference Manual

12
- 84875664258
- Intel Nov.
- Intel. Intel Xeon Phi Coprocessor System Software Development Guide, Nov. 2012.
- (2012) Intel Xeon Phi Coprocessor System Software Development Guide

13
- 84887232898
- Intel Oct.
- Intel. An Overview of Programming for Intel Xeon Processors and Intel Xeon Phi Coprocessors, Oct. 2012.
- (2012) An Overview of Programming for Intel Xeon Processors and Intel Xeon Phi Coprocessors

14
- 84899704684
- Intel
- Intel. Streaming Store Instructions in the Intel Xeon Phi coprocessor, 2012.
- (2012) Streaming Store Instructions in the Intel Xeon Phi Coprocessor

15
- 84875664258
- Intel April
- Intel. Intel Xeon Phi Coprocessor. http://software.intel.com/mic-developer, April 2013.
- (2013) Intel Xeon Phi Coprocessor

16
- 0345025793
- April
- John D. McCalpin. STREAM: Sustainable Memory Bandwidth With High Performance Computers, April 2013.
- (2013) STREAM: Sustainable Memory Bandwidth with High Performance Computers
- McCalpin, J.D.¹

17
- 84865353129
- Debunking the 100X GPU vs. CPU myth: An evaluation of throughput computing on CPU and GPU
- June
- V. W. Lee et al. Debunking the 100X GPU vs. CPU myth: an evaluation of throughput computing on CPU and GPU. SIGARCH Comput. Archit. News, 38(3), June 2010.
- (2010) SIGARCH Comput. Archit. News , vol.38 , Issue.3
- Lee, V.W.¹

18
- 85084160699
- Lmbench: Portable tools for performance analysis
- L. McVoy et al. lmbench: portable tools for performance analysis. In USENIX ATEC'96, 1996.
- (1996) USENIX ATEC'96
- McVoy, L.¹

19
- 70449643566
- Memory performance and cache coherency effects on an intel nehalem multiprocessor system
- Sept.
- D. Molka et al. Memory performance and cache coherency effects on an intel nehalem multiprocessor system. In PACT'09., Sept. 2009.
- (2009) PACT'09
- Molka, D.¹

20
- 48149094931
- Memory hierarchy performance measurement of commercial dual-core desktop processors
- Aug.
- L. Peng et al. Memory hierarchy performance measurement of commercial dual-core desktop processors. Journal of Systems Architecture, 54(8):816-828, Aug. 2008.
- (2008) Journal of Systems Architecture , vol.54 , Issue.8 , pp. 816-828
- Peng, L.¹

21
- 10044237712
- Motion gradient vector flow: An external force for tracking rolling leukocytes with shape and size constrained active contours
- IEEE Transactions on, Dec.
- N. Ray et al. Motion gradient vector flow: an external force for tracking rolling leukocytes with shape and size constrained active contours. Medical Imaging, IEEE Transactions on, Dec. 2004.
- (2004) Medical Imaging
- Ray, N.¹

22
- 84866875424
- Radio astronomy beam forming on many-core architectures
- A. Sclocco et al. Radio astronomy beam forming on Many-Core architectures. In IPDPS, 2012.
- (2012) IPDPS
- Sclocco, A.¹

23
- 0000718681
- Measuring cache and TLB performance and their effect on benchmark runtimes
- Oct.
- A. J. Smith et al. Measuring cache and TLB performance and their effect on benchmark runtimes. IEEE Trans. Comput., (10), Oct. 1995.
- (1995) IEEE Trans. Comput. , Issue.10
- Smith, A.J.¹

24
- 77952162137
- OpenCL: A parallel programming standard for heterogeneous computing systems
- May
- J. E. Stone et al. OpenCL: A parallel programming standard for heterogeneous computing systems. Computing in science & engineering, 12(3):66-72, May 2010.
- (2010) Computing in Science & Engineering , vol.12 , Issue.3 , pp. 66-72
- Stone, J.E.¹

25
- 84899692405
- Technical University of Dresden August
- Technical University of Dresden. BenchIT: Performance Measurement for Scientific Applications, August 2013.
- (2013) BenchIT: Performance Measurement for Scientific Applications

26
- 84899670707
- Automatic OpenCL device characterization: Guiding optimized kernel design
- P. Thoman et al. Automatic OpenCL device characterization: Guiding optimized kernel design. In Euro-Par'11. 2011.
- (2011) Euro-Par'11
- Thoman, P.¹

27
- 60649117768
- Building high-resolution sky images using the Cell/B.E.
- A. L. Varbanescu, A. S. van Amesfoort, T. Cornwell, G. van Diepen, R. van Nieuwpoort, B. G. Elmegreen, and H. J. Sips. Building high-resolution sky images using the Cell/B.E. Scientific Programming, 17(1-2):113-134, 2009.
- (2009) Scientific Programming , vol.17 , Issue.1-2 , pp. 113-134
- Varbanescu, A.L.¹ Van Amesfoort, A.S.² Cornwell, T.³ Van Diepen, G.⁴ Van Nieuwpoort, R.⁵ Elmegreen, B.G.⁶ Sips, H.J.⁷

28
- 70350771131
- Benchmarking GPUs to tune dense linear algebra
- SC 2008. International Conference for IEEE, Nov.
- V. Volkov and J. W. Demmel. Benchmarking GPUs to tune dense linear algebra. In High Performance Computing, Networking, Storage and Analysis, 2008. SC 2008. International Conference for, pages 1-11. IEEE, Nov. 2008.
- (2008) High Performance Computing, Networking, Storage and Analysis, 2008 , pp. 1-11
- Volkov, V.¹ Demmel, J.W.²

29
- 77952579552
- Demystifying GPU microarchitecture through microbenchmarking
- IEEE, Mar.
- H. Wong, M.-M. Papadopoulou, M. Sadooghi-Alvandi, and A. Moshovos. Demystifying GPU microarchitecture through microbenchmarking. In 2010 IEEE International Symposium on Performance Analysis of Systems & Software (ISPASS), pages 235-246. IEEE, Mar. 2010.
- (2010) 2010 IEEE International Symposium on Performance Analysis of Systems & Software (ISPASS) , pp. 235-246
- Wong, H.¹ Papadopoulou, M.-M.² Sadooghi-Alvandi, M.³ Moshovos, A.⁴

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.