SCOPUS Information Search Platform

Proceedings - International Symposium on Computer Architecture

Volume , Issue , 2014, Pages 205-216

Single-graph multiple flows: Energy efficient design alternative for GPGPUs

(2) Voitsechov, Dani a Etsion, Yoav a

a TECHNION ISRAEL INSTITUTE OF TECHNOLOGY (Israel)

Author keywords

[No Author keywords available]

Indexed keywords

DATA FLOW ANALYSIS; ENERGY EFFICIENCY; PROGRAM PROCESSORS; RECONFIGURABLE ARCHITECTURES;

COARSE-GRAIN RECONFIGURABLE; DATAFLOW; ENERGY-EFFICIENT DESIGN; MULTIPLE FLOWS; THREAD LEVEL PARALLELISM;

FLOW GRAPHS;

EID: 84905455447 PISSN: 10636897 EISSN: None Source Type: Conference Proceeding
DOI: 10.1109/ISCA.2014.6853234 Document Type: Conference Paper

Times cited : (71)

References (43)

1
- 0025404493
- Executing a program on the MIT tagged-token dataflow architecture
- Mar
- Arvind and R. Nikhil, "Executing a program on the MIT tagged-token dataflow architecture," IEEE Trans. on Computers, vol. 39, no. 3, pp. 300-318, Mar 1990
- (1990) IEEE Trans. on Computers , vol.39 , Issue.3 , pp. 300-318
- Arvind¹ Nikhil, R.²

2
- 70349169075
- Analyzing CUDA workloads using a detailed GPU simulator
- A. Bakhoda, G. L. Yuan, W. W. L. Fung, H. Wong, and T. M. Aamodt, "Analyzing CUDA workloads using a detailed GPU simulator," in ISPASS. IEEE, 2009, pp. 163-174
- (2009) ISPASS IEEE , pp. 163-174
- Bakhoda, A.¹ Yuan, G.L.² Fung, W.W.L.³ Wong, H.⁴ Aamodt, T.M.⁵

3
- 0034592554
- Adapting software pipelining for reconfigurable computing
- T. J. Callahan and J. Wawrzynek, "Adapting software pipelining for reconfigurable computing," in Intl. Conf. on Compilers, Architecture, and Synthesis for Embedded Systems, 2000, pp. 57-64
- (2000) Intl. Conf. on Compilers, Architecture, and Synthesis for Embedded Systems , pp. 57-64
- Callahan, T.J.¹ Wawrzynek, J.²

4
- 70649092154
- Rodinia: A benchmark suite for heterogeneous computing
- S. Che, M. Boyer, J. Meng, D. Tarjan, J. W. Sheaffer, S.-H. Lee, and K. Skadron, "Rodinia: A benchmark suite for heterogeneous computing," in IEEE Intl. Symp. on Workload Characterization (IISWC), ser. IISWC 09, 2009, pp. 44-54
- (2009) IEEE Intl. Symp. on Workload Characterization (IISWC), Ser. IISWC 09 , pp. 44-54
- Che, S.¹ Boyer, M.² Meng, J.³ Tarjan, D.⁴ Sheaffer, J.W.⁵ Lee, S.-H.⁶ Skadron, K.⁷

5
- 0027740584
- Two fundamental limits on dataflow multiprocessing
- D. E. Culler, K. E. Schauser, and T. von Eicken, "Two fundamental limits on dataflow multiprocessing," in Intl. Conf. on Parallel Arch. and Compilation Techniques (PACT), 1993, pp. 153-164
- (1993) Intl. Conf. on Parallel Arch. and Compilation Techniques (PACT) , pp. 153-164
- Culler, D.E.¹ Schauser, K.E.² Von Eicken, T.³

6
- 0026243790
- Efficiently computing static single assignment form and the control dependence graph
- R. Cytron, J. Ferrante, B. K. Rosen, M. N. Wegman, and F. K. Zadeck, "Efficiently computing static single assignment form and the control dependence graph," ACM Trans. on Programming Languages and Systems, vol. 13, no. 4, pp. 451-490, 1991
- (1991) ACM Trans. on Programming Languages and Systems , vol.13 , Issue.4 , pp. 451-490
- Cytron, R.¹ Ferrante, J.² Rosen, B.K.³ Wegman, M.N.⁴ Zadeck, F.K.⁵

7
- 0016434955
- A preliminary architecture for a basic data flow processor
- J. B. Dennis and D. Misunas, "A preliminary architecture for a basic data flow processor," in Intl. Symp. on Computer Architecture (ISCA), 1975, pp. 126-132
- (1975) Intl. Symp. on Computer Architecture (ISCA) , pp. 126-132
- Dennis, J.B.¹ Misunas, D.²

8
- 0025750643
- Properties and performance of folded hypercubes
- Jan
- A. El-Amawy and S. Latifi, "Properties and performance of folded hypercubes," IEEE Trans. on Parallel and Distributed Systems, vol. 2, no. 1, pp. 31-42, Jan. 1991
- (1991) IEEE Trans. on Parallel and Distributed Systems , vol.2 , Issue.1 , pp. 31-42
- El-Amawy, A.¹ Latifi, S.²

9
- 0007997616
- ARB: A hardware mechanism for dynamic reordering of memory references
- M. Franklin and G. S. Sohi, "ARB: A hardware mechanism for dynamic reordering of memory references," IEEE Trans. on Computers, vol. 45, no. 5, pp. 552-571, 1996
- (1996) IEEE Trans. on Computers , vol.45 , Issue.5 , pp. 552-571
- Franklin, M.¹ Sohi, G.S.²

10
- 0034174187
- PipeRench: A reconfigurable architecture and compiler
- Apr
- S. C. Goldstein, H. Schmit, M. Budiu, S. Cadambi, M. Moe, and R. R. Taylor, "PipeRench: A reconfigurable architecture and compiler," IEEE Computer, vol. 33, no. 4, pp. 70-77, Apr. 2000
- (2000) IEEE Computer , vol.33 , Issue.4 , pp. 70-77
- Goldstein, S.C.¹ Schmit, H.² Budiu, M.³ Cadambi, S.⁴ Moe, M.⁵ Taylor, R.R.⁶

11
- 79955890625
- Dynamically specialized datapaths for energy efficient computing
- Feb
- V. Govindaraju, C.-H. Ho, and K. Sankaralingam, "Dynamically specialized datapaths for energy efficient computing," in Symp. on High-Performance Computer Architecture (HPCA), Feb 2011, pp. 503-514
- (2011) Symp. on High-Performance Computer Architecture (HPCA) , pp. 503-514
- Govindaraju, V.¹ Ho, C.-H.² Sankaralingam, K.³

12
- 84863374615
- Bundled execution of recurring traces for energy-efficient general purpose processing
- S. Gupta, S. Feng, A. Ansari, S. Mahlke, and D. August, "Bundled execution of recurring traces for energy-efficient general purpose processing," in Intl. Symp. on Microarchitecture (MICRO), 2011, pp. 12-23
- (2011) Intl. Symp. on Microarchitecture (MICRO) , pp. 12-23
- Gupta, S.¹ Feng, S.² Ansari, A.³ Mahlke, S.⁴ August, D.⁵

13
- 0021831531
- The Manchester prototype dataflow computer
- J. R. Gurd, C. C. Kirkham, and I. Watson, "The Manchester prototype dataflow computer," Comm. ACM, vol. 28, no. 1, pp. 34-52, 1985
- (1985) Comm ACM , vol.28 , Issue.1 , pp. 34-52
- Gurd, J.R.¹ Kirkham, C.C.² Watson, I.³

14
- 67650635164
- Many-core vs many-thread machines: Stay away from the valley
- Jan
- Z. Guz, E. Bolotin, I. Keidar, A. Kolodny, A. Mendelson, and U. Weiser, "Many-core vs. many-thread machines: Stay away from the valley," IEEE Computer Architecture Letters, vol. 8, no. 1, pp. 25-28, Jan 2009
- (2009) IEEE Computer Architecture Letters , vol.8 , Issue.1 , pp. 25-28
- Guz, Z.¹ Bolotin, E.² Keidar, I.³ Kolodny, A.⁴ Mendelson, A.⁵ Weiser, U.⁶

15
- 77954995378
- Understanding sources of inefficiency in general-purpose chips
- R. Hameed, W. Qadeer, M. Wachs, O. Azizi, A. Solomatnikov, B. C. Lee, S. Richardson, C. Kozyrakis, and M. Horowitz, "Understanding sources of inefficiency in general-purpose chips," in Intl. Symp. on Computer Architecture (ISCA), 2010, pp. 37-47
- (2010) Intl. Symp. on Computer Architecture (ISCA) , pp. 37-47
- Hameed, R.¹ Qadeer, W.² Wachs, M.³ Azizi, O.⁴ Solomatnikov, A.⁵ Lee, B.C.⁶ Richardson, S.⁷ Kozyrakis, C.⁸ Horowitz, M.⁹

16
- 77954994853
- An integrated GPU power and performance model
- S. Hong and H. Kim, "An integrated GPU power and performance model," in Intl. Symp. on Computer Architecture (ISCA), 2010, pp. 280-289
- (2010) Intl. Symp. on Computer Architecture (ISCA) , pp. 280-289
- Hong, S.¹ Kim, H.²

17
- 79953071805
- Sponge: Portable stream programming on graphics engines
- A. H. Hormati, M. Samadi, M. Woh, T. Mudge, and S. Mahlke, "Sponge: portable stream programming on graphics engines," in Intl. Conf. on Arch. Support for Prog. Lang. & Operating Systems (ASPLOS), 2011, pp. 381-392
- (2011) Intl. Conf. on Arch. Support for Prog. Lang. & Operating Systems (ASPLOS) , pp. 381-392
- Hormati, A.H.¹ Samadi, M.² Woh, M.³ Mudge, T.⁴ Mahlke, S.⁵

18
- 84874510087
- Elastic CGRAs
- Y. Huang, P. Ienne, O. Temam, Y. Chen, and C. Wu, "Elastic CGRAs," in Intl. Symp. on Field Programmable Gate Arrays, 2013, pp. 171-180
- (2013) Intl. Symp. on Field Programmable Gate Arrays , pp. 171-180
- Huang, Y.¹ Ienne, P.² Temam, O.³ Chen, Y.⁴ Wu, C.⁵

19
- 84944414165
- Runtime power monitoring in high-end processors: Methodology and empirical data
- C. Isci and M. Martonosi, "Runtime power monitoring in high-end processors: Methodology and empirical data," in Intl. Symp. on Microarchitecture (MICRO), 2003, pp. 93-104
- (2003) Intl. Symp. on Microarchitecture (MICRO) , pp. 93-104
- Isci, C.¹ Martonosi, M.²

20
- 80054875176
- GPUs and the future of parallel computing
- S. W. Keckler, W. J. Dally, B. Khailany, M. Garland, and D. Glasco, "GPUs and the future of parallel computing," IEEE Micro, vol. 31, pp. 7-17, 2011
- (2011) IEEE Micro , vol.31 , pp. 7-17
- Keckler, S.W.¹ Dally, W.J.² Khailany, B.³ Garland, M.⁴ Glasco, D.⁵

21
- 84862328133
- Life after Dennard and how I learned to love the picojoule
- (keynote)
- S. Keckler, "Life after Dennard and how I learned to love the picojoule," Intl. Symp. on Microarchitecture (MICRO), 2012, (keynote)
- (2012) Intl. Symp. on Microarchitecture (MICRO)
- Keckler, S.¹

22
- 0035271572
- Imagine: Media processing with streams
- B. Khailany, W. J. Dally, U. J. Kapasi, P. Mattson, J. Namkoong, J. D. Owens, B. Towles, A. Chang, and S. Rixner, "Imagine: Media processing with streams," IEEE Micro, vol. 21, pp. 35-46, 2001
- (2001) IEEE Micro , vol.21 , pp. 35-46
- Khailany, B.¹ Dally, W.J.² Kapasi, U.J.³ Mattson, P.⁴ Namkoong, J.⁵ Owens, J.D.⁶ Towles, B.⁷ Chang, A.⁸ Rixner, S.⁹

23
- 3042658703
- LLVM: A compilation framework for lifelong program analysis & transformation
- C. Lattner and V. Adve, "LLVM: A compilation framework for lifelong program analysis & transformation," in Intl. Symp. on Code Generation & Optimization (CGO), 2004, pp. 75
- (2004) Intl. Symp. on Code Generation & Optimization (CGO) , pp. 75
- Lattner, C.¹ Adve, V.²

24
- 0031599788
- Space-time scheduling of instruction-level parallelism on a raw machine
- W. Lee, R. Barua, M. Frank, D. Srikrishna, J. Babb, V. Sarkar, and S. Amarasinghe, "Space-time scheduling of instruction-level parallelism on a raw machine," in Intl. Conf. on Arch. Support for Prog. Lang. & Operating Systems (ASPLOS), 1998, pp. 46-57
- (1998) Intl. Conf. on Arch. Support for Prog. Lang. & Operating Systems (ASPLOS) , pp. 46-57
- Lee, W.¹ Barua, R.² Frank, M.³ Srikrishna, D.⁴ Babb, J.⁵ Sarkar, V.⁶ Amarasinghe, S.⁷

25
- 84881151222
- GPUWattch: Enabling energy optimizations in GPGPUs
- J. Leng, T. Hetherington, A. ElTantawy, S. Gilani, N. S. Kim, T. M. Aamodt, and V. J. Reddi, "GPUWattch: enabling energy optimizations in GPGPUs," in Intl. Symp. on Computer Architecture (ISCA), 2013, pp. 487-498
- (2013) Intl. Symp. on Computer Architecture (ISCA) , pp. 487-498
- Leng, J.¹ Hetherington, T.² Eltantawy, A.³ Gilani, S.⁴ Kim, N.S.⁵ Aamodt, T.M.⁶ Reddi, V.J.⁷

26
- 44849137198
- NVIDIA Tesla: A unified graphics and computing architecture
- E. Lindholm, J. Nickolls, S. Oberman, and J. Montrym, "NVIDIA Tesla: A unified graphics and computing architecture," IEEE Micro, vol. 28, no. 2, pp. 39-55, 2008
- (2008) IEEE Micro , vol.28 , Issue.2 , pp. 39-55
- Lindholm, E.¹ Nickolls, J.² Oberman, S.³ Montrym, J.⁴

27
- 34547456544
- Tartan: Evaluating spatial computation for whole program execution
- Oct
- M. Mishra, T. J. Callahan, T. Chelcea, G. Venkataramani, M. Budiu, and S. C. Goldstein, "Tartan: Evaluating spatial computation for whole program execution," in Intl. Conf. on Arch. Support for Prog. Lang. & Operating Systems (ASPLOS), Oct 2006, pp. 163-174
- (2006) Intl. Conf. on Arch. Support for Prog. Lang. & Operating Systems (ASPLOS) , pp. 163-174
- Mishra, M.¹ Callahan, T.J.² Chelcea, T.³ Venkataramani, G.⁴ Budiu, M.⁵ Goldstein, S.C.⁶

28
- 77951154340
- The GPU computing era
- J. Nickolls and W. Dally, "The GPU computing era," IEEE Micro, vol. 30, no. 2, pp. 56-69, 2010
- (2010) IEEE Micro , vol.30 , Issue.2 , pp. 56-69
- Nickolls, J.¹ Dally, W.²

29
- 78651550268
- Scalable parallel programming with CUDA
- J. Nickolls, I. Buck, M. Garland, and K. Skadron, "Scalable parallel programming with CUDA," ACM Queue, vol. 6, no. 2, pp. 40-53, 2008
- (2008) ACM Queue , vol.6 , Issue.2 , pp. 40-53
- Nickolls, J.¹ Buck, I.² Garland, M.³ Skadron, K.⁴

30
- 84905492307
- Nvidia, Fermi Compute Architecture Whitepaper
- Nvidia, Fermi Compute Architecture Whitepaper

31
- 84905504780
- NVIDIA Tegra 4 family CPU architecture: 4-PLUS-1 quad core
- NVIDIA, "NVIDIA Tegra 4 family CPU architecture: 4-PLUS-1 quad core," 2013 [Online]. Available: http://www.nvidia.com/docs/IO/116757/NVIDIA-Quad-A15-whitepaper-FINALv2.pdf
- (2013) NVIDIA

32
- 70349100958
- The OpenCL specification
- OpenCL Working Group, "The OpenCL specification," www.khronos.org/opencl, Oct 2009, ver. 1.0
- (2009) OpenCL Working Group, ver. 1.0

33
- 84963624364
- The program dependence web: A representation supporting control-, data-, and demand-driven interpretation of imperative languages
- K. J. Ottenstein, R. A. Ballance, and A. B. MacCabe, "The program dependence web: a representation supporting control-, data-, and demand-driven interpretation of imperative languages," in Intl. Conf. on Programming Language Design and Impl. (PLDI), 1990, pp. 257-271
- (1990) Intl. Conf. on Programming Language Design and Impl. (PLDI) , pp. 257-271
- Ottenstein, K.J.¹ Ballance, R.A.² Maccabe, A.B.³

34
- 0025431466
- Monsoon: An explicit token-store architecture
- G. M. Papadopoulos and D. E. Culler, "Monsoon: an explicit token-store architecture," in Intl. Symp. on Computer Architecture (ISCA), 1990, pp. 82-91
- (1990) Intl. Symp. on Computer Architecture (ISCA) , pp. 82-91
- Papadopoulos, G.M.¹ Culler, D.E.²

35
- 0017922490
- The CRAY-1 computer system
- Jan
- R. M. Russell, "The CRAY-1 computer system," Comm. ACM, vol. 21, no. 1, pp. 63-72, Jan. 1978
- (1978) Comm ACM , vol.21 , Issue.1 , pp. 63-72
- Russell, R.M.¹

36
- 0037669851
- Exploiting ILP, TLP, and DLP with the polymorphous TRIPS architecture
- K. Sankaralingam, R. Nagarajan, H. Liu, C. Kim, J. Huh, D. Burger, S. W. Keckler, and C. R. Moore, "Exploiting ILP, TLP, and DLP with the polymorphous TRIPS architecture," in Intl. Symp. on Computer Architecture (ISCA), 2003, pp. 422-433
- (2003) Intl. Symp. on Computer Architecture (ISCA) , pp. 422-433
- Sankaralingam, K.¹ Nagarajan, R.² Liu, H.³ Kim, C.⁴ Huh, J.⁵ Burger, D.⁶ Keckler, S.W.⁷ Moore, C.R.⁸

37
- 40349095135
- Dataflow predication
- A. Smith, R. Nagarajan, K. Sankaralingam, R. McDonald, D. Burger, S. W. Keckler, and K. S. McKinley, "Dataflow predication," in Intl. Symp. on Microarchitecture (MICRO), 2006, pp. 89-102
- (2006) Intl. Symp. on Microarchitecture (MICRO) , pp. 89-102
- Smith, A.¹ Nagarajan, R.² Sankaralingam, K.³ McDonald, R.⁴ Burger, D.⁵ Keckler, S.W.⁶ McKinley, K.S.⁷

38
- 84905492308
- Threads on the cheap: Multithreaded execution in a WaveCache processor
- S. Swanson, A. Schwerin, A. Petersen, M. Oskin, and S. Eggers, "Threads on the cheap: Multithreaded execution in a WaveCache processor," in Workshop on Complexity-effective Design (WCED), 2004
- (2004) Workshop on Complexity-effective Design (WCED)
- Swanson, S.¹ Schwerin, A.² Petersen, A.³ Oskin, M.⁴ Eggers, S.⁵

39
- 84944392428
- WaveScalar
- Dec
- S. Swanson, K. Michelson, A. Schwerin, and M. Oskin, "WaveScalar," in Intl. Symp. on Microarchitecture (MICRO), Dec 2003, p. 291
- (2003) Intl. Symp. on Microarchitecture (MICRO) , p. 291
- Swanson, S.¹ Michelson, K.² Schwerin, A.³ Oskin, M.⁴

40
- 0036505033
- The Raw microprocessor: A computational fabric for software circuits and general-purpose programs
- M. Taylor, J. Kim, J. Miller, D. Wentzlaff, F. Ghodrat, B. Greenwald, H. Hoffman, P. Johnson, J.-W. Lee, W. Lee, A. Ma, A. Saraf, M. Seneski, N. Shnidman, V. Strumpen, M. Frank, S. Amarasinghe, and A. Agarwal, "The Raw microprocessor: a computational fabric for software circuits and general-purpose programs," IEEE Micro, vol. 22, no. 2, pp. 25-35, 2002
- (2002) IEEE Micro , vol.22 , Issue.2 , pp. 25-35
- Taylor, M.¹ Kim, J.² Miller, J.³ Wentzlaff, D.⁴ Ghodrat, F.⁵ Greenwald, B.⁶ Hoffman, H.⁷ Johnson, P.⁸ Lee, J.-W.⁹ Lee, W.¹⁰ Ma, A.¹¹ Saraf, A.¹² Seneski, M.¹³ Shnidman, N.¹⁴ Strumpen, V.¹⁵ Frank, M.¹⁶ Amarasinghe, S.¹⁷ Agarwal, A.¹⁸

41
- 84959045524
- StreamIt: A language for streaming applications
- Apr
- W. Thies, M. Karczmarek, and S. P. Amarasinghe, "StreamIt: A language for streaming applications," in Intl. Conf. on Compiler Construction, Apr 2002, pp. 179-196
- (2002) Intl. Conf. on Compiler Construction , pp. 179-196
- Thies, W.¹ Karczmarek, M.² Amarasinghe, S.P.³

42
- 0029200683
- Simultaneous multithreading: Maximizing on-chip parallelism
- Jun
- D. Tullsen, S. Eggers, and H. Levy, "Simultaneous multithreading: Maximizing on-chip parallelism," in Intl. Symp. on Computer Architecture (ISCA), Jun 1995, pp. 392-403
- (1995) Intl. Symp. on Computer Architecture (ISCA) , pp. 392-403
- Tullsen, D.¹ Eggers, S.² Levy, H.³

43
- 0028087519
- Performance estimation of multistreamed, superscalar processors
- W. Yamamoto, M. Serrano, A. Talcott, R. Wood, and M. Nemirovsky, "Performance estimation of multistreamed, superscalar processors," in Hawaii Intl. Conf. on System Sciences, vol. 1, 1994, pp. 195-204.
- (1994) Hawaii Intl. Conf. on System Sciences , vol.1 , pp. 195-204
- Yamamoto, W.¹ Serrano, M.² Talcott, A.³ Wood, R.⁴ Nemirovsky, M.⁵

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.