-
2
-
-
85027615245
-
Automatic decomposition of scientific programs for parallel execution
-
New York, ACM Press
-
R. Allen, D. Callahan, and K. Kennedy, Automatic decomposition of scientific programs for parallel execution. In POPL ’87: Proceedings of the 14th ACM SIGACT-SIGPLAN Symposium on Principles of Programming Languages, pages 63–76, New York, 1987. ACM Press.
-
(1987)
POPL ’87: Proceedings of the 14th ACM SIGACT-SIGPLAN Symposium on Principles of Programming Languages
, pp. 63-76
-
-
Allen, R.1
Callahan, D.2
Kennedy, K.3
-
6
-
-
33745125067
-
On the architectural requirements for efficient execution of graph algorithms
-
Washington, DC, IEEE Computer Society
-
D. A. Bader, G. Cong, and J. Feo, On the architectural requirements for efficient execution of graph algorithms. In ICPP ’05: Proceedings of the 2005 International Conference on Parallel Processing (ICPP’05), pages 547–556, Washington, DC, 2005. IEEE Computer Society.
-
(2005)
ICPP ’05: Proceedings of the 2005 International Conference on Parallel Processing (ICPP’05)
, pp. 547-556
-
-
Bader, D.A.1
Cong, G.2
Feo, J.3
-
7
-
-
34547478190
-
Mesh-of-trees and alternative interconnection networks for single-chip parallel processing
-
Steamboat Springs, Colorado, Best Paper Award
-
A. Balkan, G. Qu, and U. Vishkin, Mesh-of-trees and alternative interconnection networks for single-chip parallel processing. In ASAP 2006: 17th IEEE International Conference on Application-Specific Systems, Architectures and Processors, pages 73–80, Steamboat Springs, Colorado, 2006. Best Paper Award.
-
(2006)
ASAP 2006: 17th IEEE International Conference on Application-Specific Systems, Architectures and Processors
, pp. 73-80
-
-
Balkan, A.1
Qu, G.2
Vishkin, U.3
-
8
-
-
85056066073
-
-
XMTC compiler and XMT simulator. Technical Report UMIACS-TR 2005-45, University of Maryland Institute for Advanced Computer Studies (UMIACS), February
-
A. O. Balkan and U. Vishkin, Programmer’s manual for XMTC language, XMTC compiler and XMT simulator. Technical Report UMIACS-TR 2005-45, University of Maryland Institute for Advanced Computer Studies (UMIACS), February 2006.
-
(2006)
Programmer’s Manual for XMTC Language
-
-
Balkan, A.O.1
Vishkin, U.2
-
9
-
-
0024646909
-
Adaptive bitonic sorting: An optimal parallel algorithm for shared-memory machines
-
G. Bilardi and A. Nicolau, Adaptive bitonic sorting: an optimal parallel algorithm for shared-memory machines. SIAM J. Comput., 18(2):216–228, 1989.
-
(1989)
SIAM J. Comput
, vol.18
, Issue.2
, pp. 216-228
-
-
Bilardi, G.1
Nicolau, A.2
-
10
-
-
0030105185
-
Programming parallel algorithms
-
G. E. Blelloch, Programming parallel algorithms. Commun. ACM, 39(3):85–97, 1996.
-
(1996)
Commun. ACM
, vol.39
, Issue.3
, pp. 85-97
-
-
Blelloch, G.E.1
-
11
-
-
85037420457
-
A comparison of sorting algorithms for the Connection Machine CM-2
-
New York, ACM Press
-
G. E. Blelloch, C. E. Leiserson, B. M. Maggs, C. G. Plaxton, S. J. Smith, and M. Zagha, A comparison of sorting algorithms for the Connection Machine CM-2. In SPAA ’91: Proceedings of the Third Annual ACM Symposium on Parallel Algorithms and Architectures, pages 3–16, New York, 1991. ACM Press.
-
(1991)
SPAA ’91: Proceedings of the Third Annual ACM Symposium on Parallel Algorithms and Architectures
, pp. 3-16
-
-
Blelloch, G.E.1
Leiserson, C.E.2
Maggs, B.M.3
Plaxton, C.G.4
Smith, S.J.5
Zagha, M.6
-
13
-
-
0016046965
-
The parallel evaluation of general arithmetic expressions
-
R. P. Brent, The parallel evaluation of general arithmetic expressions. J. ACM, 21(2):201–206, 1974.
-
(1974)
J. ACM
, vol.21
, Issue.2
, pp. 201-206
-
-
Brent, R.P.1
-
14
-
-
0002986475
-
-
D. Burger and T. M. Austin, The SimpleScalar tool set, version 2.0. SIGARCH Comput. Archit. News, 25(3):13–25, 1997.
-
(1997)
The SimpleScalar Tool Set, Version 2.0. SIGARCH Comput. Archit. News
, vol.25
, Issue.3
, pp. 13-25
-
-
Burger, D.1
Austin, T.M.2
-
16
-
-
0002854989
-
Deterministic coin tossing and accelerating cascades: Micro and macro techniques for designing parallel algorithms
-
New York, ACM Press
-
R. Cole and U. Vishkin, Deterministic coin tossing and accelerating cascades: micro and macro techniques for designing parallel algorithms. In STOC ’86: Proceedings of the Eighteenth Annual ACM Symposium on Theory of Computing, pages 206–219, New York, 1986. ACM Press.
-
(1986)
STOC ’86: Proceedings of the Eighteenth Annual ACM Symposium on Theory of Computing
, pp. 206-219
-
-
Cole, R.1
Vishkin, U.2
-
18
-
-
0004116989
-
-
MIT Press, Cambridge, MA
-
T. H. Cormen, C. E. Leiserson, and R. L. Rivest, Introduction to Algorithms. 1st ed., MIT Press, Cambridge, MA, 1990.
-
(1990)
Introduction to Algorithms. 1st Ed
-
-
Cormen, T.H.1
Leiserson, C.E.2
Rivest, R.L.3
-
20
-
-
84858665952
-
Experiments with list ranking for explicit multi-threaded (XMT) instruction parallelism
-
London, UK, July
-
S. Dascal and U. Vishkin, Experiments with list ranking for explicit multi-threaded (XMT) instruction parallelism. J. Exp. Algorithmics, 5:10, 2000. Special issue for the 3rd Workshop on Algorithms Engineering (WAE’99), London, UK, July 1999.
-
(1999)
J. Exp. Algorithmics, 5:10, 2000. Special Issue for the 3rd Workshop on Algorithms Engineering (WAE’99)
-
-
Dascal, S.1
Vishkin, U.2
-
21
-
-
84956865627
-
Performance of MP3D on the SB-PRAM prototype (research note)
-
London, UK, Springer-Verlag
-
R. Dementiev, M. Klein, and W. J. Paul, Performance of MP3D on the SB-PRAM prototype (research note). In Euro-Par ’02: Proceedings of the 8th International Euro-Par Conference on Parallel Processing, pages 132–136, London, UK, 2002. Springer-Verlag.
-
(2002)
Euro-Par ’02: Proceedings of the 8th International Euro-Par Conference on Parallel Processing
, pp. 132-136
-
-
Dementiev, R.1
Klein, M.2
Paul, W.J.3
-
22
-
-
0030216116
-
Fast parallel sorting under LogP: Experience with the CM-5
-
A. C. Dusseau, D. E. Culler, K. E. Schauser, and R. P. Martin, Fast parallel sorting under LogP: Experience with the CM-5. IEEE Trans. Parallel Distrib. Syst., 7(8):791–805, 1996.
-
(1996)
IEEE Trans. Parallel Distrib. Syst
, vol.7
, Issue.8
, pp. 791-805
-
-
Dusseau, A.C.1
Culler, D.E.2
Schauser, K.E.3
Martin, R.P.4
-
23
-
-
0000529418
-
Parallel algorithmic techniques for combinatorial computation
-
D. Eppstein and Z. Galil, Parallel algorithmic techniques for combinatorial computation. Ann. Rev. of Comput. Sci., 3:233–283, 1988.
-
(1988)
Ann. Rev. of Comput. Sci
, vol.3
, pp. 233-283
-
-
Eppstein, D.1
Galil, Z.2
-
24
-
-
33644651957
-
-
New York, ACM Press
-
J. Feo, D. Harper, S. Kahan, and P. Konecny, Eldorado. In CF ’05: Proceedings of the 2nd Conference on Computing Frontiers, pages 28–34, New York, 2005. ACM Press.
-
(2005)
Eldorado. In CF ’05: Proceedings of the 2nd Conference on Computing Frontiers
, pp. 28-34
-
-
Feo, J.1
Harper, D.2
Kahan, S.3
Konecny, P.4
-
25
-
-
84963575852
-
Hardware prefetching in bus-based multiprocessors: Pattern characterization and cost-effective hardware
-
M. J. Garzaran, J. L. Briz, P. Ibanez, and V. Vinals, Hardware prefetching in bus-based multiprocessors: pattern characterization and cost-effective hardware. In Proceedings of the Euromicro Workshop on Parallel and Distributed Processing, pages 345–354, 2001.
-
(2001)
Proceedings of the Euromicro Workshop on Parallel and Distributed Processing
, pp. 345-354
-
-
Garzaran, M.J.1
Briz, J.L.2
Ibanez, P.3
Vinals, V.4
-
27
-
-
0008757016
-
The queue-read queue-write asynchronous PRAM model
-
P. B. Gibbons, Y. Matias, and V. Ramachandran, The queue-read queue-write asynchronous PRAM model. Theor. Comput. Sci., 196(1-2):3–29, 1998.
-
(1998)
Theor. Comput. Sci
, vol.196
, Issue.1-2
, pp. 3-29
-
-
Gibbons, P.B.1
Matias, Y.2
Ramachandran, V.3
-
28
-
-
85057191630
-
The NYU Ultracomputer: Designing a MIMD, shared-memory parallel machine (extended abstract)
-
Los Alamitos, CA, IEEE Computer Society Press
-
A. Gottlieb, R. Grishman, C. P. Kruskal, K. P. McAuliffe, L. Rudolph, and M. Snir, The NYU Ultracomputer: designing a MIMD, shared-memory parallel machine (extended abstract). In ISCA ’82: Proceedings of the 9th Annual Symposium on Computer Architecture, pages 27–42, Los Alamitos, CA, 1982. IEEE Computer Society Press.
-
(1982)
ISCA ’82: Proceedings of the 9th Annual Symposium on Computer Architecture
, pp. 27-42
-
-
Gottlieb, A.1
Grishman, R.2
Kruskal, C.P.3
McAuliffe, K.P.4
Rudolph, L.5
Snir, M.6
-
31
-
-
78651284090
-
Simulation of cloud dynamics on graphics hardware
-
Aire-la-Ville, Switzerland, Eurographics Association
-
M. J. Harris, W. V. Baxter, T. Scheuermann, and A. Lastra, Simulation of cloud dynamics on graphics hardware. In HWWS ’03: Proceedings of the ACM SIGGRAPH/EUROGRAPHICS Conference on Graphics Hardware, pages 92–101, Aire-la-Ville, Switzerland, 2003. Eurographics Association.
-
(2003)
HWWS ’03: Proceedings of the ACM SIGGRAPH/EUROGRAPHICS Conference on Graphics Hardware
, pp. 92-101
-
-
Harris, M.J.1
Baxter, W.V.2
Scheuermann, T.3
Lastra, A.4
-
37
-
-
0003819667
-
-
Addison Wesley Longman Publishing Co., Inc., Redwood City, CA
-
J. JáJá, An Introduction to Parallel Algorithms. Addison Wesley Longman Publishing Co., Inc., Redwood City, CA, 1992.
-
(1992)
An Introduction to Parallel Algorithms
-
-
JáJá, J.1
-
40
-
-
78650965796
-
UberFlow: A GPU-based particle engine
-
New York, ACM Press
-
P. Kipfer, M. Segal, and R. Westermann, UberFlow: a GPU-based particle engine. In HWWS ’04: Proceedings of the ACM SIGGRAPH/EUROGRAPHICS Conference on Graphics Hardware, pages 115–122, New York, 2004. ACM Press.
-
(2004)
HWWS ’04: Proceedings of the ACM SIGGRAPH/EUROGRAPHICS Conference on Graphics Hardware
, pp. 115-122
-
-
Kipfer, P.1
Segal, M.2
Westermann, R.3
-
41
-
-
84976772007
-
Parallel prefix computation
-
R. E. Ladner and M. J. Fischer, Parallel prefix computation. J. ACM, 27(4):831–838, 1980.
-
(1980)
J. ACM
, vol.27
, Issue.4
, pp. 831-838
-
-
Ladner, R.E.1
Fischer, M.J.2
-
44
-
-
35048828869
-
-
Aire-la-Ville, Switzerland, Eurographics Association
-
K. Moreland and E. Angel, The FFT on a GPU. In HWWS ’03: Proceedings of the ACM SIGGRAPH/EUROGRAPHICS Conference on Graphics Hardware, pages 112–119, Aire-la-Ville, Switzerland, 2003. Eurographics Association.
-
(2003)
The FFT on a GPU. In HWWS ’03: Proceedings of the ACM SIGGRAPH/EUROGRAPHICS Conference on Graphics Hardware
, pp. 112-119
-
-
Moreland, K.1
Angel, E.2
-
45
-
-
85056034618
-
Evaluating multi-threading in the prototype XMT environment
-
D. Naishlos, J. Nuzman, C. W. Tseng, and U. Vishkin, Evaluating multi-threading in the prototype XMT environment. In Proceedings 4th Workshop on Multithreaded Execution, Architecture, and Compilation, 2000.
-
(2000)
Proceedings 4th Workshop on Multithreaded Execution, Architecture, and Compilation
-
-
Naishlos, D.1
Nuzman, J.2
Tseng, C.W.3
Vishkin, U.4
-
46
-
-
84959055201
-
Evaluating the XMT programming model
-
D. Naishlos, J. Nuzman, C. W. Tseng, and U. Vishkin, Evaluating the XMT programming model. In Proceedings of the 6th Workshop on High-Level Parallel Programming Models and Supportive Environments, pages 95–108, 2001.
-
(2001)
Proceedings of the 6th Workshop on High-Level Parallel Programming Models and Supportive Environments
, pp. 95-108
-
-
Naishlos, D.1
Nuzman, J.2
Tseng, C.W.3
Vishkin, U.4
-
47
-
-
0142217464
-
Towards a first vertical prototyping of an extremely fine-grained parallel programming approach
-
New York, Springer-Verlag
-
D. Naishlos, J. Nuzman, C. W. Tseng, and U. Vishkin, Towards a first vertical prototyping of an extremely fine-grained parallel programming approach. In Invited Special Issue for ACM-SPAA’01: TOCS 36, 5, pages 521–552, New York, 2003. Springer-Verlag.
-
(2003)
Invited Special Issue for ACM-SPAA’01: TOCS 36
, vol.5
, pp. 521-552
-
-
Naishlos, D.1
Nuzman, J.2
Tseng, C.W.3
Vishkin, U.4
-
48
-
-
0035182169
-
Comparing and combining read miss clustering and software prefetching
-
Washington, DC, IEEE Computer Society
-
V. S. Pai and S. V. Adve, Comparing and combining read miss clustering and software prefetching. In PACT ’01: Proceedings of the 2001 International Conference on Parallel Architectures and Compilation Techniques, page 292, Washington, DC, 2001. IEEE Computer Society.
-
(2001)
PACT ’01: Proceedings of the 2001 International Conference on Parallel Architectures and Compilation Techniques
, pp. 292
-
-
Pai, V.S.1
Adve, S.V.2
-
49
-
-
0023213138
-
A logarithmic time sort for linear size networks
-
J. H. Reif and L. G. Valiant, A logarithmic time sort for linear size networks. J. ACM, 34(1):60–76, 1987.
-
(1987)
J. ACM
, vol.34
, Issue.1
, pp. 60-76
-
-
Reif, J.H.1
Valiant, L.G.2
-
51
-
-
24344447939
-
Sparse matrices in MATLAB: Design and implementation
-
V. Shah and J. R. Gilbert, Sparse matrices in MATLAB: Design and implementation. In HiPC, pages 144–155, 2004.
-
(2004)
HiPC
, pp. 144-155
-
-
Shah, V.1
Gilbert, J.R.2
-
52
-
-
0346863310
-
An O(n^2 log n) parallel max-flow algorithm
-
Y. Shiloach and U. Vishkin, An O(n^2 log n) parallel max-flow algorithm. J. Algorithms, 3(2):128–146, 1982.
-
(1982)
J. Algorithms
, vol.3
, Issue.2
, pp. 128-146
-
-
Shiloach, Y.1
Vishkin, U.2
-
54
-
-
84877021547
-
Multi-processor performance on the Tera MTA
-
Washington, DC, IEEE Computer Society
-
A. Snavely, L. Carter, J. Boisseau, A. Majumdar, K. S. Gatlin, N. Mitchell, J. Feo, and B. Koblenz, Multi-processor performance on the Tera MTA. In Supercomputing ’98: Proceedings of the 1998 ACM/IEEE Conference on Supercomputing (CDROM), pages 1–8, Washington, DC, 1998. IEEE Computer Society.
-
(1998)
Supercomputing ’98: Proceedings of the 1998 ACM/IEEE Conference on Supercomputing (CDROM)
, pp. 1-8
-
-
Snavely, A.1
Carter, L.2
Boisseau, J.3
Majumdar, A.4
Gatlin, K.S.5
Mitchell, N.6
Feo, J.7
Koblenz, B.8
-
55
-
-
33746291130
-
Impact of compiler-based data-prefetching techniques on SPEC OMP application performance
-
Washington, DC, IEEE Computer Society
-
X. Tian, R. Krishnaiyer, H. Saito, M. Girkar, and W. Li, Impact of compiler-based data-prefetching techniques on SPEC OMP application performance. In IPDPS ’05: Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS’05)—Papers, page 53.1, Washington, DC, 2005. IEEE Computer Society.
-
(2005)
IPDPS ’05: Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS’05)—Papers
, pp. 53
-
-
Tian, X.1
Krishnaiyer, R.2
Saito, H.3
Girkar, M.4
Li, W.5
-
60
-
-
0031629796
-
Explicit multi-threading (XMT) bridging models for instruction parallelism (extended abstract)
-
New York, ACM Press
-
U. Vishkin, S. Dascal, E. Berkovich, and J. Nuzman, Explicit multi-threading (XMT) bridging models for instruction parallelism (extended abstract). In SPAA ’98: Proceedings of the Tenth Annual ACM Symposium on Parallel Algorithms and Architectures, pages 140–151, New York, 1998. ACM Press.
-
(1998)
SPAA ’98: Proceedings of the Tenth Annual ACM Symposium on Parallel Algorithms and Architectures
, pp. 140-151
-
-
Vishkin, U.1
Dascal, S.2
Berkovich, E.3
Nuzman, J.4
-
62
-
-
46449113366
-
Layout-accurate design and implementation of a high-throughput interconnection network for single-chip parallel processing
-
Stanford, CA, August 22–24, IEEE
-
Balkan, M. Horak, G. Qu, and U. Vishkin, Layout-accurate design and implementation of a high-throughput interconnection network for single-chip parallel processing. In Proceedings of Hot Interconnects 15, pages 21–28, Stanford, CA, August 22–24, 2007. IEEE.
-
(2007)
Proceedings of Hot Interconnects
, vol.15
, pp. 21-28
-
-
Balkan1
Horak, M.2
Qu, G.3
Vishkin, U.4