-
1
-
-
0025545476
-
VCODE: A data-parallel intermediate language
-
BC90
-
[BC90] Blelloch, G. and S. Chatterjee. VCODE: A data-parallel intermediate language. In FOMPC3, 1990, pp. 471-480.
-
(1990)
FOMPC3
, pp. 471-480
-
-
Blelloch, G.1
Chatterjee, S.2
-
3
-
-
43949161602
-
Implementation of a portable nested data-parallel language
-
BCH+94
-
[BCH+94] Blelloch, G. E., S. Chatterjee, J. C. Hardwick, J. Sipelstein, and M. Zagha. Implementation of a portable nested data-parallel language. JPDC, 21(1), 1994, pp. 4-14.
-
(1994)
JPDC
, vol.21
, Issue.1
, pp. 4-14
-
-
Blelloch, G.E.1
Chatterjee, S.2
Hardwick, J.C.3
Sipelstein, J.4
Zagha, M.5
-
4
-
-
0030381077
-
The quickhull algorithm for convex hulls
-
BDH96
-
[BDH96] Barber, C. B., D. P. Dobkin, and H. Huhdanpaa. The quickhull algorithm for convex hulls. ACM TOMS, 22(4), 1996, pp. 469-483.
-
(1996)
ACM TOMS
, vol.22
, Issue.4
, pp. 469-483
-
-
Barber, C.B.1
Dobkin, D.P.2
Huhdanpaa, H.3
-
5
-
-
78249233242
-
Lazy tree splitting
-
[BFR+10]
-
[BFR+10] Bergstrom, L., M. Fluet, M. Rainey, J. Reppy, and A. Shaw. Lazy tree splitting. In ICFP '10. ACM, September 2010, pp. 93-104.
-
(2010)
ICFP '10
, pp. 93-104
-
-
Bergstrom, L.1
Fluet, M.2
Rainey, M.3
Reppy, J.4
Shaw, A.5
-
6
-
-
33846349887
-
A hierarchical O(N log N) force calculation algorithm
-
[BH86]
-
[BH86] Barnes, J. and P. Hut. A hierarchical O(N log N) force calculation algorithm. Nature, 324, December 1986, pp. 446-449.
-
(1986)
Nature
, pp. 446-449
-
-
Barnes, J.1
Hut, P.2
-
7
-
-
0030105185
-
Programming parallel algorithms
-
[Ble96]
-
[Ble96] Blelloch, G. E. Programming parallel algorithms. CACM, 39(3), March 1996, pp. 85-97.
-
(1996)
CACM
, vol.39
, Issue.3
, pp. 85-97
-
-
Blelloch, G.E.1
-
8
-
-
84858427151
-
An efficient CUDA implementation of the tree-based Barnes Hut n-body algorithm
-
[BP11]
-
[BP11] Burtscher, M. and K. Pingali. An efficient CUDA implementation of the tree-based Barnes Hut n-body algorithm. In GPU Computing Gems Emerald Edition, chapter 6, pp. 75-92. Elsevier Science Publishers, New York, NY, 2011.
-
(2011)
GPU Computing Gems Emerald Edition
-
-
Burtscher, M.1
Pingali, K.2
-
9
-
-
85015692260
-
The pricing of options and corporate liabilities
-
[BS73]
-
[BS73] Black, F. and M. Scholes. The pricing of options and corporate liabilities. JPE, 81(3), 1973, pp. 637-654.
-
(1973)
JPE
, vol.81
, Issue.3
, pp. 637-654
-
-
Black, F.1
Scholes, M.2
-
10
-
-
0025380943
-
Compiling collection-oriented languages onto massively parallel computers
-
BS90
-
[BS90] Blelloch, G. E. and G.W. Sabot. Compiling collection-oriented languages onto massively parallel computers. JPDC, 8(2), 1990, pp. 119-134.
-
(1990)
JPDC
, vol.8
, Issue.2
, pp. 119-134
-
-
Blelloch, G.E.1
Sabot, G.W.2
-
11
-
-
84862632175
-
GPU programming in a high level language compiling X10 to CUDA
-
[CBS11]
-
[CBS11] Cunningham, D., R. Bordawekar, and V. Saraswat. GPU programming in a high level language compiling X10 to CUDA. In X10'11, San Jose, CA, May 2011. Available from http://x10-lang.org/.
-
(2011)
X10'11
-
-
Cunningham, D.1
Bordawekar, R.2
Saraswat, V.3
-
12
-
-
79952784184
-
Copperhead: Compiling an embedded data parallel language
-
[CGK11]
-
[CGK11] Catanzaro, B., M. Garland, and K. Keutzer. Copperhead: compiling an embedded data parallel language. In PPoPP '11, San Antonio, TX, February 2011. ACM, pp. 47-56.
-
(2011)
PPoPP '11
, pp. 47-56
-
-
Catanzaro, B.1
Garland, M.2
Keutzer, K.3
-
13
-
-
0027632582
-
Compiling nested data-parallel programs for shared-memory multiprocessors
-
[Cha93]
-
[Cha93] Chatterjee, S. Compiling nested data-parallel programs for shared-memory multiprocessors. ACM TOPLAS, 15(3), July 1993, pp. 400-462.
-
(1993)
ACM TOPLAS
, vol.15
, Issue.3
, pp. 400-462
-
-
Chatterjee, S.1
-
14
-
-
79952136178
-
Accelerating Haskell array codes with multicore GPUs
-
[CKL+11]
-
[CKL+11] Chakravarty, M. M., G. Keller, S. Lee, T. L. McDonell, and V. Grover. Accelerating Haskell array codes with multicore GPUs. In DAMP '11, Austin, January 2011. ACM, pp. 3-14.
-
(2011)
DAMP '11
, pp. 3-14
-
-
Chakravarty, M.M.1
Keller, G.2
Lee, S.3
McDonell, T.L.4
Grover, V.5
-
15
-
-
84937389888
-
Nepal - Nested data parallelism in Haskell
-
[CKLP01]
-
[CKLP01] Chakravarty, M. M. T., G. Keller, R. Leshchinskiy, and W. Pfannenstiel. Nepal - nested data parallelism in Haskell. In Euro-Par '01, vol. 2150 of LNCS. Springer-Verlag, August 2001, pp. 524-534.
-
(2001)
Euro-Par '01
, vol.2150
, pp. 524-534
-
-
Chakravarty, M.M.T.1
Keller, G.2
Leshchinskiy, R.3
Pfannenstiel, W.4
-
16
-
-
79551658111
-
Partial vectorisation of Haskell programs
-
[CLPK08]
-
[CLPK08] Chakravarty, M. M. T., R. Leshchinskiy, S. Peyton Jones, and G. Keller. Partial vectorisation of Haskell programs. In DAMP '08. ACM, January 2008, pp. 2-16. Available from http://clip.dia.fi.upm.es/Conferences/DAMP08/.
-
(2008)
DAMP '08
, pp. 2-16
-
-
Chakravarty, M.M.T.1
Leshchinskiy, R.2
Peyton Jones, S.3
Keller, G.4
-
17
-
-
84872376298
-
A new method for GPU based irregular reductions and its application to k-means clustering
-
[DR11]
-
[DR11] Dhanasekaran, B. and N. Rubin. A new method for GPU based irregular reductions and its application to k-means clustering. In GPGPU-4, Newport Beach, California, March 2011. ACM.
-
(2011)
GPGPU-4
-
-
Dhanasekaran, B.1
Rubin, N.2
-
18
-
-
12744262557
-
Threaded code variations and optimizations
-
[Ert01]
-
[Ert01] Ertl, M. A. Threaded code variations and optimizations. In EuroForth 2001, Schloss Dagstuhl, Germany, November 2001, pp. 49-55. Available from http://www.complang.tuwien.ac.at/papers/.
-
(2001)
EuroForth 2001
, pp. 49-55
-
-
Ertl, M.A.1
-
19
-
-
84867517229
-
-
[GCN+12]
-
[GCN+12] Gao, M., T.-T. Cao, A. Nanjappa, T.-S. Tan, and Z. Huang. A GPU Algorithm for Convex Hull. Technical Report TRA1/12, National University of Singapore, School of Computing, January 2012.
-
(2012)
A GPU Algorithm for Convex Hull
-
-
Gao, M.1
Cao, T.-T.2
Nanjappa, A.3
Tan, T.-S.4
Huang, Z.5
-
21
-
-
33747508171
-
SAC - A Functional Array Language for Efficient Multi-threaded Execution
-
[GS06]
-
[GS06] Grelck, C. and S.-B. Scholz. SAC - A Functional Array Language for Efficient Multi-threaded Execution. IJPP, 34(4), August 2006, pp. 383-427.
-
(2006)
IJPP
, vol.34
, Issue.4
, pp. 383-427
-
-
Grelck, C.1
Scholz, S.-B.2
-
22
-
-
79952162843
-
Breaking the GPU programming barrier with the auto-parallelising SAC compiler
-
[GTS11]
-
[GTS11] Guo, J., J. Thiyagalingam, and S.-B. Scholz. Breaking the GPU programming barrier with the auto-parallelising SAC compiler. In DAMP '11, Austin, January 2011. ACM, pp. 15-24.
-
(2011)
DAMP '11
, pp. 15-24
-
-
Guo, J.1
Thiyagalingam, J.2
Scholz, S.-B.3
-
23
-
-
84882564541
-
Thrust: A productivity-oriented library for CUDA
-
[HB11]
-
[HB11] Hoberock, J. and N. Bell. Thrust: A productivity-oriented library for CUDA. In W. W. Hwu (ed.), GPU Computing Gems, Jade Edition, chapter 26, pp. 359-372. Morgan Kaufmann Publishers, October 2011.
-
(2011)
GPU Computing Gems, Jade Edition
-
-
Hoberock, J.1
Bell, N.2
-
25
-
-
84870456255
-
Khronos OpenCL Working Group
-
[Khr11]
-
[Khr11] Khronos OpenCL Working Group. OpenCL 1.2 Specification, November 2011. Available from http://www.khronos.org/registry/cl/specs/opencl-1.2.pdf.
-
(2011)
OpenCL 1.2 Specification
-
-
-
26
-
-
79952182078
-
Simple optimizations for an applicative array language for graphics processors
-
[Lar11]
-
[Lar11] Larsen, B. Simple optimizations for an applicative array language for graphics processors. In DAMP '11, Austin, January 2011. ACM, pp. 25-34.
-
(2011)
DAMP '11
, pp. 25-34
-
-
Larsen, B.1
-
27
-
-
33746637093
-
Higher order flattening
-
[LCK06]
-
[LCK06] Leshchinskiy, R., M. M. T. Chakravarty, and G. Keller. Higher order flattening. In V. Alexandrov, D. van Albada, P. Sloot, and J. Dongarra (eds.), ICCS '06, number 3992 in LNCS. Springer-Verlag, May 2006, pp. 920-928.
-
(2006)
ICCS '06
, pp. 920-928
-
-
Leshchinskiy, R.1
Chakravarty, M.M.T.2
Keller, G.3
-
29
-
-
84858391043
-
Scalable GPU graph traversal
-
[MGG12]
-
[MGG12] Merrill, D., M. Garland, and A. Grimshaw. Scalable GPU graph traversal. In PPoPP '12, New Orleans, LA, February 2012. ACM, pp. 117-128.
-
(2012)
PPoPP '12
, pp. 117-128
-
-
Merrill, D.1
Garland, M.2
Grimshaw, A.3
-
30
-
-
84858374841
-
A GPU implementation of inclusion-based points-to analysis
-
[MLBP12]
-
[MLBP12] Mendez-Lojo, M., M. Burtscher, and K. Pingali. A GPU implementation of inclusion-based points-to analysis. In PPoPP '12, New Orleans, LA, February 2012. ACM, pp. 107-116.
-
(2012)
PPoPP '12
, pp. 107-116
-
-
Mendez-Lojo, M.1
Burtscher, M.2
Pingali, K.3
-
31
-
-
78249272964
-
Nikola: Embedding compiled GPU functions in Haskell
-
[MM10]
-
[MM10] Mainland, G. and G. Morrisett. Nikola: Embedding compiled GPU functions in Haskell. In HASKELL '10, Baltimore, MD, September 2010. ACM, pp. 67-78.
-
(2010)
HASKELL '10
, pp. 67-78
-
-
Mainland, G.1
Morrisett, G.2
-
33
-
-
35948991669
-
-
[NVI11b]
-
[NVI11b] NVIDIA. NVIDIA CUDA C Programming Guide, 2011. Available from http://developer.nvidia.com/category/zone/cuda-zone.
-
(2011)
NVIDIA. NVIDIA CUDA C Programming Guide
-
-
-
34
-
-
77956373685
-
OptiX: A general purpose ray tracing engine
-
[PBD+10]
-
[PBD+10] Parker, S. G., J. Bigler, A. Dietrich, H. Friedrich, J. Hoberock, D. Luebke, D. McAllister, M. McGuire, K. Morley, A. Robison, and M. Stich. OptiX: a general purpose ray tracing engine. ACM TOG, 29, July 2010.
-
(2010)
ACM TOG
-
-
Parker, S.G.1
Bigler, J.2
Dietrich, A.3
Friedrich, H.4
Hoberock, J.5
Luebke, D.6
McAllister, D.7
McGuire, M.8
Morley, K.9
Robison, A.10
Stich, M.11
-
35
-
-
0029196596
-
Work-efficient nested data-parallelism
-
[PPW95]
-
[PPW95] Palmer, D. W., J. F. Prins, and S. Westfold. Work-efficient nested data-parallelism. In FoMPP5. IEEE Computer Society Press, 1995, pp. 186-193.
-
(1995)
FoMPP5
, pp. 186-193
-
-
Palmer, D.W.1
Prins, J.F.2
Westfold, S.3
-
36
-
-
0029204372
-
Optimizing an ANSI C interpreter with superoperators
-
[Pro95]
-
[Pro95] Proebsting, T. A. Optimizing an ANSI C interpreter with superoperators. In POPL '95, San Francisco, January 1995. ACM, pp. 322-332.
-
(1995)
POPL '95
, pp. 322-332
-
-
Proebsting, T.A.1
-
37
-
-
78651284120
-
Scan primitives for GPU computing
-
[SHZO07]
-
[SHZO07] Sengupta, S., M. Harris, Y. Zhang, and J. D. Owens. Scan primitives for GPU computing. In GH '07, San Diego, CA, August 2007. Eurographics Association, pp. 97-106.
-
(2007)
GH '07
, pp. 97-106
-
-
Sengupta, S.1
Harris, M.2
Zhang, Y.3
Owens, J.D.4
-
38
-
-
67650065270
-
Stack-based parallel recursion on graphics processors
-
[YHL+09]
-
[YHL+09] Yang, K., B. He, Q. Luo, P. V. Sander, and J. Shi. Stack-based parallel recursion on graphics processors. In PPoPP '09, Raleigh, NC, February 2009. ACM, pp. 299-300.
-
(2009)
PPoPP '09
, pp. 299-300
-
-
Yang, K.1
He, B.2
Luo, Q.3
Sander, P.V.4
Shi, J.5