SCOPUS Information Search Platform

International Conference on Architectural Support for Programming Languages and Operating Systems - ASPLOS

2010, Pages 347-358

An asymmetric distributed shared memory model for heterogeneous parallel systems

Authors (6): Gelado, Isaac (a); Cabezas, Javier (a); Navarro, Nacho (a); Stone, John E. (b); Patel, Sanjay (b); Hwu, Wen Mei W. (b)

(a) Universitat Politècnica de Catalunya (Spain)

(b) University of Illinois at Urbana-Champaign (United States)

Author keywords

Asymmetric distributed shared memory; Data centric programming models; Heterogeneous systems

Indexed keywords

APPLICATION PERFORMANCE; APPLICATION PORTABILITY; ARCHITECTURAL SUPPORT; CPU SYSTEMS; DATA CENTRIC; DATA OBJECTS; DATA PARALLEL; DISTRIBUTED SHARED MEMORY; DISTRIBUTED SHARED MEMORY SYSTEMS; GENERAL PURPOSE CPUS; GNU/LINUX; HETEROGENEOUS COMPUTING; HETEROGENEOUS COMPUTING SYSTEM; HETEROGENEOUS PARALLEL SYSTEMS; HETEROGENEOUS SYSTEMS; LIGHT WEIGHT; MEMORY SPACE; PHYSICAL MEMORY; PROGRAMMING MODELS; SEQUENTIAL CONTROL; SOFTWARE IMPLEMENTATION;

ACCELERATION; COMPUTER OPERATING SYSTEMS; COMPUTER SOFTWARE PORTABILITY; DATA TRANSFER; LINGUISTICS; PROGRAM PROCESSORS;

COMPUTER SYSTEMS PROGRAMMING;

EID: 77952251540
PISSN: None
EISSN: None
Source Type: Conference Proceeding
DOI: 10.1145/1736020.1736059
Document Type: Conference Paper

Times cited: 118

References (47)

1
- 70349100958
- The OpenCL Specification, 2009.
- (2009) The OpenCL Specification

2
- 0029180378
- The MIT alewife machine: Architecture and performance
- New York, NY, USA, ACM
- A. Agarwal, R. Bianchini, D. Chaiken, K. L. Johnson, D. Kranz, J. Kubiatowicz, B.-H. Lim, K. Mackenzie, and D. Yeung. The MIT alewife machine: architecture and performance. In ISCA '95, pages 2-13, New York, NY, USA, 1995. ACM.
- (1995) ISCA '95 , pp. 2-13
- Agarwal, A.¹ Bianchini, R.² Chaiken, D.³ Johnson, K.L.⁴ Kranz, D.⁵ Kubiatowicz, J.⁶ Lim, B.-H.⁷ Mackenzie, K.⁸ Yeung, D.⁹

3
- 0022767619
- Linda and friends
- Aug.
- S. Ahuja, N. Carriero, and D. Gelernter. Linda and friends. IEEE Trans. on Computers, 19(8):26-34, Aug. 1986.
- (1986) IEEE Trans. on Computers , vol.19 , Issue.8 , pp. 26-34
- Ahuja, S.¹ Carriero, N.² Gelernter, D.³

4
- 35248832108
- STAPL: An adaptive, generic parallel C++ library
- P. An, A. Jula, S. Rus, S. Saunders, T. Smith, G. Tanase, N. Thomas, N. Amato, and L. Rauchwerger. STAPL: An adaptive, generic parallel C++ library. LNCS, pages 193-208, 2003.
- (2003) LNCS , pp. 193-208
- An, P.¹ Jula, A.² Rus, S.³ Saunders, S.⁴ Smith, T.⁵ Tanase, G.⁶ Thomas, N.⁷ Amato, N.⁸ Rauchwerger, L.⁹

5
- 0024131247
- Distributed programming with shared data
- Oct
- H. Bal and A. Tanenbaum. Distributed programming with shared data. In ICCL '88, pages 82-91, Oct 1988.
- (1988) ICCL '88 , pp. 82-91
- Bal, H.¹ Tanenbaum, A.²

6
- 70449467862
- Entering the petaflop era: The architecture and performance of roadrunner
- Piscataway, NJ, USA, IEEE Press
- K. J. Barker, K. Davis, A. Hoisie, D. J. Kerbyson, M. Lang, S. Pakin, and J. C. Sancho. Entering the petaflop era: the architecture and performance of roadrunner. In SC'08, pages 1-11, Piscataway, NJ, USA, 2008. IEEE Press.
- (2008) SC'08 , pp. 1-11
- Barker, K.J.¹ Davis, K.² Hoisie, A.³ Kerbyson, D.J.⁴ Lang, M.⁵ Pakin, S.⁶ Sancho, J.C.⁷

7
- 34548265764
- Cellss: A programming model for the cell be architecture
- New York, NY, USA, ACM
- P. Bellens, J. M. Perez, R. M. Badia, and J. Labarta. Cellss: a programming model for the cell be architecture. In SC'06, page 86, New York, NY, USA, 2006. ACM.
- (2006) SC'06 , pp. 86
- Bellens, P.¹ Perez, J.M.² Badia, R.M.³ Labarta, J.⁴

8
- 0027307267
- The midway distributed shared memory system
- Feb
- B. Bershad, M. Zekauskas, and W. Sawdon. The midway distributed shared memory system. In Compcon Spring '93, pages 528-537, Feb 1993.
- (1993) Compcon Spring '93 , pp. 528-537
- Bershad, B.¹ Zekauskas, M.² Sawdon, W.³

9
- 0024055867
- Multilanguage parallel programming of heterogeneous machines
- Aug
- R. Bisiani and A. Forin. Multilanguage parallel programming of heterogeneous machines. IEEE Trans. on Computers, 37(8):930-945, Aug 1988.
- (1988) IEEE Trans. on Computers , vol.37 , Issue.8 , pp. 930-945
- Bisiani, R.¹ Forin, A.²

10
- 0025433314
- PLUS: A distributed shared-memory system
- R. Bisiani and M. Ravishankar. PLUS: a distributed shared-memory system. SIGARCH Comput. Archit. News, 18(3a):115-124, 1990.
- (1990) SIGARCH Comput. Archit. News , vol.18 , Issue.3 A , pp. 115-124
- Bisiani, R.¹ Ravishankar, M.²

11
- 85088003777
- GPU computing with NVIDIA CUDA
- New York, NY, USA, ACM
- I. Buck. GPU computing with NVIDIA CUDA. In SIGGRAPH '07, page 6, New York, NY, USA, 2007. ACM.
- (2007) SIGGRAPH '07 , pp. 6
- Buck, I.¹

12
- 84883300486
- Implementation and performance of munin
- New York, NY, USA, ACM
- J. B. Carter, J. K. Bennett, and W. Zwaenepoel. Implementation and performance of munin. In SOSP '91, pages 152-164, New York, NY, USA, 1991. ACM.
- (1991) SOSP '91 , pp. 152-164
- Carter, J.B.¹ Bennett, J.K.² Zwaenepoel, W.³

13
- 17144409441
- Modular interprocedural pointer analysis using access paths: Design, implementation, and evaluation
- New York, NY, USA, ACM
- B.-C. Cheng and W. W. Hwu. Modular interprocedural pointer analysis using access paths: design, implementation, and evaluation. In PLDI '00, pages 57-69, New York, NY, USA, 2000. ACM.
- (2000) PLDI '00 , pp. 57-69
- Cheng, B.-C.¹ Hwu, W.W.²

14
- 84992015947
- Parallel programming using skeleton functions
- London, UK, Springer-Verlag
- J. Darlington, A. J. Field, P. G. Harrison, P. H. J. Kelly, D. W. N. Sharp, and Q. Wu. Parallel programming using skeleton functions. In PARLE'93, pages 146-160, London, UK, 1993. Springer-Verlag.
- (1993) PARLE'93 , pp. 146-160
- Darlington, J.¹ Field, A.J.² Harrison, P.G.³ Kelly, P.H.J.⁴ Sharp, D.W.N.⁵ Wu, Q.⁶

15
- 0043207371
- The clouds distributed operating system
- Nov
- P. Dasgupta, R. J. LeBlanc, Jr., M. Ahamad, and U. Ramachandran. The clouds distributed operating system. IEEE Trans. on Computers, 24(11):34-44, Nov 1991.
- (1991) IEEE Trans. on Computers , vol.24 , Issue.11 , pp. 34-44
- Dasgupta, P.¹ LeBlanc Jr., R.J.² Ahamad, M.³ Ramachandran, U.⁴

16
- 84947663399
- An analysis of memnet - An experiment in high-speed shared-memory local networking
- New York, NY, USA, ACM
- G. Delp, A. Sethi, and D. Farber. An analysis of memnet - an experiment in high-speed shared-memory local networking. In SIGCOMM '88, pages 165-174, New York, NY, USA, 1988. ACM.
- (1988) SIGCOMM '88 , pp. 165-174
- Delp, G.¹ Sethi, A.² Farber, D.³

17
- 0024936732
- Mirage: A coherent distributed shared memory design
- New York, NY, USA, ACM
- B. Fleisch and G. Popek. Mirage: a coherent distributed shared memory design. In SOSP '89, pages 211-223, New York, NY, USA, 1989. ACM.
- (1989) SOSP '89 , pp. 211-223
- Fleisch, B.¹ Popek, G.²

18
- 0027148844
- The KSR 1: Bridging the gap between shared memory and MPPs
- Feb
- S. Frank, H. Burkhardt III, and J. Rothnie. The KSR 1: bridging the gap between shared memory and MPPs. In Compcon Spring '93, pages 285-294, Feb 1993.
- (1993) Compcon Spring '93 , pp. 285-294
- Frank, S.¹ Burkhardt III, H.² Rothnie, J.³

19
- 57349092386
- CUBA: An architecture for efficient cpu/co-processor data communication
- New York, NY, USA, ACM
- I. Gelado, J. H. Kelm, S. Ryoo, S. S. Lumetta, N. Navarro, and W.W. Hwu. CUBA: an architecture for efficient cpu/co-processor data communication. In ICS '08, pages 299-308, New York, NY, USA, 2008. ACM.
- (2008) ICS '08 , pp. 299-308
- Gelado, I.¹ Kelm, J.H.² Ryoo, S.³ Lumetta, S.S.⁴ Navarro, N.⁵ Hwu, W.W.⁶

20
- 0026818115
- The scalable coherent interface and related standards projects
- D. B. Gustavson. The scalable coherent interface and related standards projects. IEEE Micro, 12(1):10-22, 1992.
- (1992) IEEE Micro , vol.12 , Issue.1 , pp. 10-22
- Gustavson, D.B.¹

21
- 1642364107
- The chimaera reconfigurable functional unit
- Feb.
- S. H. Hauck, T. W. Fry, M. M. Hosler, and J. P. Kao. The chimaera reconfigurable functional unit. IEEE Trans. on VLSI, 12(2):206-217, Feb. 2004.
- (2004) IEEE Trans. on VLSI , vol.12 , Issue.2 , pp. 206-217
- Hauck, S.H.¹ Fry, T.W.² Hosler, M.M.³ Kao, J.P.⁴

22
- 0031360911
- Garp: A MIPS processor with a reconfigurable coprocessor
- Apr
- J. R. Hauser and J. Wawrzynek. Garp: a MIPS processor with a reconfigurable coprocessor. In FCCM '97, pages 12-21, Apr 1997.
- (1997) FCCM '97 , pp. 12-21
- Hauser, J.R.¹ Wawrzynek, J.²

23
- 84976707130
- The performance impact of flexibility in the Stanford FLASH multiprocessor
- New York, NY, USA, ACM
- M. Heinrich, J. Kuskin, D. Ofelt, J. Heinlein, J. Baxter, J. P. Singh, R. Simoni, K. Gharachorloo, D. Nakahira, M. Horowitz, A. Gupta, M. Rosenblum, and J. Hennessy. The performance impact of flexibility in the Stanford FLASH multiprocessor. In ASPLOS '94, pages 274-285, New York, NY, USA, 1994. ACM.
- (1994) ASPLOS '94 , pp. 274-285
- Heinrich, M.¹ Kuskin, J.² Ofelt, D.³ Heinlein, J.⁴ Baxter, J.⁵ Singh, J.P.⁶ Simoni, R.⁷ Gharachorloo, K.⁸ Nakahira, D.⁹ Horowitz, M.¹⁰ Gupta, A.¹¹ Rosenblum, M.¹² Hennessy, J.¹³

24
- 85081500215
- White paper, University of Illinois
- W. W. Hwu and J. Stone. A programmers view of the new GPU computing capabilities in the Fermi architecture and cuda 3.0. White paper, University of Illinois, 2009.
- (2009) A Programmers View of the New GPU Computing Capabilities in the Fermi Architecture and Cuda 3.0
- Hwu, W.W.¹ Stone, J.²

25
- 70350602376
- IBM Staff
- IBM Staff. SPE Runtime Management Library, 2007.
- (2007) SPE Runtime Management Library

26
- 67650692011
- IMPACT Group. Parboil benchmark suite. http://impact.crhc.illinois.edu/parboil.php.
- Parboil Benchmark Suite

27
- 54549089525
- Intel Staff
- Intel Staff. Intel 945G Express Chipset Product Brief, 2005.
- (2005) Intel 945G Express Chipset Product Brief

28
- 78649260526
- Intel Staff
- Intel Staff. Intel Xeon Processor 7400 Series Specification, 2008.
- (2008) Intel Xeon Processor 7400 Series Specification

29
- 59049085159
- Predictive runtime code scheduling for heterogeneous architectures
- Berlin, Heidelberg, Springer-Verlag
- V. Jiménez, L. Vilanova, I. Gelado, M. Gil, G. Fursin, and N. Navarro. Predictive runtime code scheduling for heterogeneous architectures. In HiPEAC '09, pages 19-33, Berlin, Heidelberg, 2009. Springer-Verlag.
- (2009) HiPEAC '09 , pp. 19-33
- Jiménez, V.¹ Vilanova, L.² Gelado, I.³ Gil, M.⁴ Fursin, G.⁵ Navarro, N.⁶

30
- 25844503119
- Introduction to the cell multiprocessor
- J. A. Kahle, M. N. Day, H. P. Hofstee, C. R. Johns, T. R. Maeurer, and D. Shippy. Introduction to the cell multiprocessor. IBM J. Res. Dev., 49(4/5):589-604, 2005.
- (2005) IBM J. Res. Dev. , vol.49 , Issue.4-5 , pp. 589-604
- Kahle, J.A.¹ Day, M.N.² Hofstee, H.P.³ Johns, C.R.⁴ Maeurer, T.R.⁵ Shippy, D.⁶

31
- 81455130002
- Treadmarks: Distributed shared memory on standard workstations and operating systems
- Berkeley, CA, USA, USENIX Association
- P. Keleher, A. L. Cox, S. Dwarkadas, and W. Zwaenepoel. Treadmarks: distributed shared memory on standard workstations and operating systems. In WTEC'94, pages 10-10, Berkeley, CA, USA, 1994. USENIX Association.
- (1994) WTEC'94 , pp. 10-10
- Keleher, P.¹ Cox, A.L.² Dwarkadas, S.³ Zwaenepoel, W.⁴

32
- 70450237431
- Rigel: An architecture and scalable programming interface for a 1000-core accelerator
- New York, NY, USA, ACM
- J. H. Kelm, D. R. Johnson, M. R. Johnson, N. C. Crago, W. Tuohy, A. Mahesri, S. S. Lumetta, M. I. Frank, and S. Patel. Rigel: an architecture and scalable programming interface for a 1000-core accelerator. In ISCA '09, pages 140-151, New York, NY, USA, 2009. ACM.
- (2009) ISCA '09 , pp. 140-151
- Kelm, J.H.¹ Johnson, D.R.² Johnson, M.R.³ Crago, N.C.⁴ Tuohy, W.⁵ Mahesri, A.⁶ Lumetta, S.S.⁷ Frank, M.I.⁸ Patel, S.⁹

33
- 0025429467
- The directory-based cache coherence protocol for the DASH multiprocessor
- New York, NY, USA, ACM
- D. Lenoski, J. Laudon, K. Gharachorloo, A. Gupta, and J. Hennessy. The directory-based cache coherence protocol for the DASH multiprocessor. In ISCA '90, pages 148-159, New York, NY, USA, 1990. ACM.
- (1990) ISCA '90 , pp. 148-159
- Lenoski, D.¹ Laudon, J.² Gharachorloo, K.³ Gupta, A.⁴ Hennessy, J.⁵

34
- 0024771302
- Memory coherence in shared virtual memory systems
- K. Li and P. Hudak. Memory coherence in shared virtual memory systems. ACM Trans. Comput. Syst., 7(4):321-359, 1989.
- (1989) ACM Trans. Comput. Syst. , vol.7 , Issue.4 , pp. 321-359
- Li, K.¹ Hudak, P.²

35
- 44849137198
- NVIDIA tesla: A unified graphics and computing architecture
- March-April
- E. Lindholm, J. Nickolls, S. Oberman, and J. Montrym. NVIDIA tesla: A unified graphics and computing architecture. IEEE Micro, 28(2):39-55, March-April 2008.
- (2008) IEEE Micro , vol.28 , Issue.2 , pp. 39-55
- Lindholm, E.¹ Nickolls, J.² Oberman, S.³ Montrym, J.⁴

36
- 0025627049
- Merlin: A superglue for multicomputer systems
- C. Maples and L. Wittie. Merlin: A superglue for multicomputer systems. In Compcon Spring '90, volume 90, pages 73-81, 1990.
- (1990) Compcon Spring '90 , vol.90 , pp. 73-81
- Maples, C.¹ Wittie, L.²

37
- 0028732614
- Global arrays: A portable "shared-memory" programming model for distributed memory computers
- New York, NY, USA, ACM
- J. Nieplocha, R. J. Harrison, and R. J. Littlefield. Global arrays: a portable "shared-memory" programming model for distributed memory computers. In SC'94, pages 340-349, New York, NY, USA, 1994. ACM.
- (1994) SC'94 , pp. 340-349
- Nieplocha, J.¹ Harrison, R.J.² Littlefield, R.J.³

38
- 35948991669
- NVIDIA Staff
- NVIDIA Staff. NVIDIA CUDA Programming Guide 2.2, 2009.
- (2009) NVIDIA CUDA Programming Guide 2.2

39
- 53749108455
- Accelerator architectures
- July-Aug.
- S. Patel and W. W. Hwu. Accelerator architectures. IEEE Micro, 28(4):4-12, July-Aug. 2008.
- (2008) IEEE Micro , vol.28 , Issue.4 , pp. 4-12
- Patel, S.¹ Hwu, W.W.²

40
- 49249086142
- Larrabee: A many-core x86 architecture for visual computing
- L. Seiler, D. Carmean, E. Sprangle, T. Forsyth, M. Abrash, P. Dubey, S. Junkins, A. Lake, J. Sugerman, R. Cavin, R. Espasa, E. Grochowski, T. Juan, and P. Hanrahan. Larrabee: a many-core x86 architecture for visual computing. ACM Trans. Graph., 27(3):1-15, 2008.
- (2008) ACM Trans. Graph. , vol.27 , Issue.3 , pp. 1-15
- Seiler, L.¹ Carmean, D.² Sprangle, E.³ Forsyth, T.⁴ Abrash, M.⁵ Dubey, P.⁶ Junkins, S.⁷ Lake, A.⁸ Sugerman, J.⁹ Cavin, R.¹⁰ Espasa, R.¹¹ Grochowski, E.¹² Juan, T.¹³ Hanrahan, P.¹⁴

41
- 0034187952
- MorphoSys: An integrated reconfigurable system for data-parallel and computation-intensive applications
- May
- H. Singh, M.-H. Lee, G. Lu, F. J. Kurdahi, N. Bagherzadeh, and E. M. C. Filho. MorphoSys: an integrated reconfigurable system for data-parallel and computation-intensive applications. IEEE Trans. on Computers, 49(5):465-481, May 2000.
- (2000) IEEE Trans. on Computers , vol.49 , Issue.5 , pp. 465-481
- Singh, H.¹ Lee, M.-H.² Lu, G.³ Kurdahi, F.J.⁴ Bagherzadeh, N.⁵ Filho, E.M.C.⁶

42
- 0036892941
- The programming model of ASSIST, an environment for parallel and distributed portable applications
- DOI 10.1016/S0167-8191(02)00188-6, PII S0167819102001886
- M. Vanneschi. The programming model of ASSIST, an environment for parallel and distributed portable applications. Parallel Comput., 28(12):1709-1732, 2002. (Pubitemid 35412373)
- (2002) Parallel Computing , vol.28 , Issue.12 , pp. 1709-1732
- Vanneschi, M.¹

43
- 8744241430
- The MOLEN polymorphic processor
- S. Vassiliadis, S. Wong, G. Gaydadjiev, K. Bertels, G. Kuzmanov, and E. M. Panainte. The MOLEN polymorphic processor. IEEE Trans. on Computers, 53(11):1363-1375, 2004.
- (2004) IEEE Trans. on Computers , vol.53 , Issue.11 , pp. 1363-1375
- Vassiliadis, S.¹ Wong, S.² Gaydadjiev, G.³ Bertels, K.⁴ Kuzmanov, G.⁵ Panainte, E.M.⁶

44
- 0009725006
- Data Diffusion Machine-a scalable shared virtual memory multiprocessor
- Springer-Verlag
- D. Warren and S. Haridi. Data Diffusion Machine-a scalable shared virtual memory multiprocessor. In Fifth Generation Computer Systems 1988, page 943. Springer-Verlag, 1988.
- (1988) Fifth Generation Computer Systems 1988 , pp. 943
- Warren, D.¹ Haridi, S.²

45
- 0027228907
- Hardware assist for distributed shared memory
- May
- A. W. Wilson, Jr., R. P. LaRowe, Jr., and M. Teller. Hardware assist for distributed shared memory. In DCS '93, pages 246-255, May 1993.
- (1993) DCS '93 , pp. 246-255
- Wilson Jr., A.W.¹ LaRowe Jr., R.P.² Teller, M.³

46
- 62949240224
- Xilinx Staff. Feb
- Xilinx Staff. Virtex-5 Family Overview, Feb 2009.
- (2009) Virtex-5 Family Overview

47
- 0025532322
- Extending distributed shared memory to heterogeneous environments
- May
- S. Zhou, M. Stumm, and T. McInerney. Extending distributed shared memory to heterogeneous environments. In DCS '90, pages 30-37, May 1990.
- (1990) DCS '90 , pp. 30-37
- Zhou, S.¹ Stumm, M.² McInerney, T.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.