-
1
-
-
0031988272
-
Tolerating latency in multiprocessors through compiler-inserted prefetching
-
T. Mowry, "Tolerating latency in multiprocessors through compiler-inserted prefetching," ACM Trans. Computer Systems, vol. 16, no. 1, pp. 55-92, 1998.
-
(1998)
ACM Trans. Computer Systems
, vol.16
, Issue.1
, pp. 55-92
-
-
Mowry, T.1
-
2
-
-
0004070086
-
-
Ph.D. thesis, Dept. of Comp. Sci. and Engr., University of Washington, Wash, USA
-
T.F. Chen, Data prefetching for high-performance processors, Ph.D. thesis, Dept. of Comp. Sci. and Engr., University of Washington, Wash, USA.
-
Data Prefetching for High-Performance Processors
-
-
Chen, T.F.1
-
3
-
-
0029341212
-
Sequential hardware prefetching in shared-memory multiprocessors
-
F. Dahlgren and M. Dubois, "Sequential hardware prefetching in shared-memory multiprocessors," IEEE Trans. on Parallel and Distributed Systems, vol. 6, no. 7, pp. 733-746, 1995.
-
(1995)
IEEE Trans. on Parallel and Distributed Systems
, vol.6
, Issue.7
, pp. 733-746
-
-
Dahlgren, F.1
Dubois, M.2
-
4
-
-
0030662811
-
Combining loop fusion with prefetching on shared-memory multiprocessors
-
Bloomingdale, Ill, USA, August
-
N. Manjikian, "Combining loop fusion with prefetching on shared-memory multiprocessors," in Proc. International Conference on Parallel Processing, pp. 78-82, Bloomingdale, Ill, USA, August 1997.
-
(1997)
Proc. International Conference on Parallel Processing
, pp. 78-82
-
-
Manjikian, N.1
-
5
-
-
0030661018
-
An adaptive sequential prefetching scheme in shared-memory multiprocessors
-
Bloomingdale, Ill, USA, August
-
M. K. Tcheun, H. Yoon, and S. R. Maeng, "An adaptive sequential prefetching scheme in shared-memory multiprocessors," in Proc. International Conference on Parallel Processing, pp. 306-313, Bloomingdale, Ill, USA, August 1997.
-
(1997)
Proc. International Conference on Parallel Processing
, pp. 306-313
-
-
Tcheun, M.K.1
Yoon, H.2
Maeng, S.R.3
-
6
-
-
0032297226
-
Scheduling of uniform multidimensional systems under resource constraints
-
N. Passos and E. H.-M. Sha, "Scheduling of uniform multidimensional systems under resource constraints," IEEE Trans. on VLSI Systems, vol. 6, no. 4, pp. 719-730, 1998.
-
(1998)
IEEE Trans. on VLSI Systems
, vol.6
, Issue.4
, pp. 719-730
-
-
Passos, N.1
Sha, E.H.-M.2
-
7
-
-
0000888309
-
Register requirements of pipelined processors
-
Washington, DC, USA, July
-
W. Mangione-Smith, S. G. Abraham, and E. S. Davidson, "Register requirements of pipelined processors," in Proc. International Conference on Supercomputing, pp. 260-271, Washington, DC, USA, July 1992.
-
(1992)
Proc. International Conference on Supercomputing
, pp. 260-271
-
-
Mangione-Smith, W.1
Abraham, S.G.2
Davidson, E.S.3
-
8
-
-
0028768013
-
Iterative modulo scheduling: An algorithm for software pipelining loops
-
San Jose, Calif, USA, November
-
B. R. Rau, "Iterative modulo scheduling: an algorithm for software pipelining loops," in Proc. 27th Annual International Symposium on Microarchitecture, pp. 63-74, San Jose, Calif, USA, November 1994.
-
(1994)
Proc. 27th Annual International Symposium on Microarchitecture
, pp. 63-74
-
-
Rau, B.R.1
-
9
-
-
0035280424
-
Minimizing average schedule length under memory constraints by optimal partitioning and prefetching
-
Z. Wang, T. W. O'Neil, and E. H.-M. Sha, "Minimizing average schedule length under memory constraints by optimal partitioning and prefetching," Journal of VLSI Signal Processing, vol. 27, no. 3, pp. 215-233, 2001.
-
(2001)
Journal of VLSI Signal Processing
, vol.27
, Issue.3
, pp. 215-233
-
-
Wang, Z.1
O'Neil, T.W.2
Sha, E.H.-M.3
-
10
-
-
0028591436
-
(Pen)-ultimate tiling
-
Knoxville, Tenn, USA, May
-
P. Boulet, A. Darte, T. Risset, and Y. Robert, "(pen)-ultimate tiling," in Scalable High-Performance Computing Conference, pp. 568-576, Knoxville, Tenn, USA, May 1994.
-
(1994)
Scalable High-Performance Computing Conference
, pp. 568-576
-
-
Boulet, P.1
Darte, A.2
Risset, T.3
Robert, Y.4
-
11
-
-
0032676178
-
A tile selection algorithm for data locality and cache interference
-
Rhodes, Greece, June
-
J. Chame and S. Moon, "A tile selection algorithm for data locality and cache interference," in Proc. 13th ACM International Conference on Supercomputing, pp. 492-499, Rhodes, Greece, June 1999.
-
(1999)
Proc. 13th ACM International Conference on Supercomputing
, pp. 492-499
-
-
Chame, J.1
Moon, S.2
-
12
-
-
0029373981
-
Automatic partitioning of parallel loops and data arrays for distributed shared-memory multiprocessors
-
A. Agarwal, D. A. Kranz, and V. Natarajan, "Automatic partitioning of parallel loops and data arrays for distributed shared-memory multiprocessors," IEEE Trans. on Parallel and Distributed Systems, vol. 6, no. 9, pp. 943-962, 1995.
-
(1995)
IEEE Trans. on Parallel and Distributed Systems
, vol.6
, Issue.9
, pp. 943-962
-
-
Agarwal, A.1
Kranz, D.A.2
Natarajan, V.3
-
13
-
-
73449135115
-
Loop scheduling and partitions for hiding memory latencies
-
San Jose, Calif, USA, November
-
F. Chen and E. H.-M. Sha, "Loop scheduling and partitions for hiding memory latencies," in Proc. IEEE 12th International Symposium on System Synthesis, pp. 64-70, San Jose, Calif, USA, November 1999.
-
(1999)
Proc. IEEE 12th International Symposium on System Synthesis
, pp. 64-70
-
-
Chen, F.1
Sha, E.H.-M.2
-
14
-
-
0024128157
-
Uniformization of linear recurrence equations: A step towards the automatic synthesis of systolic array
-
San Diego, Calif, USA, May
-
V. Van Dongen and P. Quinton, "Uniformization of linear recurrence equations: a step towards the automatic synthesis of systolic array," in International Conference on Systolic Arrays, pp. 473-482, San Diego, Calif, USA, May 1988.
-
(1988)
International Conference on Systolic Arrays
, pp. 473-482
-
-
Van Dongen, V.1
Quinton, P.2
-
15
-
-
0028583166
-
Automatic data layout using 0-1 integer programming
-
Montreal, Canada, August
-
R. Bixby, K. Kennedy, and U. Kremer, "Automatic data layout using 0-1 integer programming," in Proc. International Conference on Parallel Architectures and Compilation Techniques, pp. 111-122, Montreal, Canada, August 1994.
-
(1994)
Proc. International Conference on Parallel Architectures and Compilation Techniques
, pp. 111-122
-
-
Bixby, R.1
Kennedy, K.2
Kremer, U.3
-
16
-
-
0031631997
-
Eliminating conflict misses for high performance architectures
-
Melbourne, Australia, July
-
G. Rivera and C. W. Tseng, "Eliminating conflict misses for high performance architectures," in Proc. 1998 ACM International Conference on Supercomputing, pp. 353-360, Melbourne, Australia, July 1998.
-
(1998)
Proc. 1998 ACM International Conference on Supercomputing
, pp. 353-360
-
-
Rivera, G.1
Tseng, C.W.2
-
17
-
-
0029755361
-
Schedule-based multi-dimensional retiming on data flow graphs
-
N. L. Passos, E. H.-M. Sha, and S. C. Bass, "Schedule-based multi-dimensional retiming on data flow graphs," IEEE Trans. Signal Processing, vol. 44, no. 1, pp. 150-156, 1996.
-
(1996)
IEEE Trans. Signal Processing
, vol.44
, Issue.1
, pp. 150-156
-
-
Passos, N.L.1
Sha, E.H.-M.2
Bass, S.C.3
-
18
-
-
0026267802
-
An effective on-chip preloading scheme to reduce data access penalty
-
Albuquerque, NM, USA, November
-
J. L. Baer and T. F. Chen, "An effective on-chip preloading scheme to reduce data access penalty," in Proc. Supercomputing '91, pp. 176-186, Albuquerque, NM, USA, November 1991.
-
(1991)
Proc. Supercomputing '91
, pp. 176-186
-
-
Baer, J.L.1
Chen, T.F.2