-
1
-
-
2442598143
-
FPGAs vs. CPUs: Trends in peak floating-point performance
-
Monterey, CA, February
-
Underwood, K.: ' FPGAs vs. CPUs: trends in peak floating-point performance ', 12th ACM/SIGDA Int. Symp. on Field Programmable Gate Arrays, Monterey, CA, February 2004, p. 171-180
-
(2004)
ACM/SIGDA Int. Symp. on Field Programmable Gate Arrays
, pp. 171-180
-
-
Underwood, K.1
-
2
-
-
12444290912
-
Scalable and modular algorithms for floating-point matrix multiplication on FPGAs
-
April
-
Zhuo, L., and Prasanna, V.K.: ' Scalable and modular algorithms for floating-point matrix multiplication on FPGAs ', 18th Int. Parallel and Distributed Processing Symp., April 2004, p. 92-101
-
(2004)
Int. Parallel and Distributed Processing Symp.
, pp. 92-101
-
-
Zhuo, L.1
Prasanna, V.K.2
-
3
-
-
84942932890
-
Floating point unit generation and evaluation for FPGAs
-
April
-
Liang, J., Tessier, R., and Mencer, O.: ' Floating point unit generation and evaluation for FPGAs ', 11th Annual IEEE Symp. on Field-Programmable Custom Computing Machines, April 2003, p. 185-194
-
(2003)
Annual IEEE Symp. on Field-Programmable Custom Computing Machines
, pp. 185-194
-
-
Liang, J.1
Tessier, R.2
Mencer, O.3
-
4
-
-
4544332232
-
Regular mapping for coarse-grained reconfigurable architectures
-
Montréal, Canada, May
-
Hannig, F., Dutta, H., and Teich, J.: ' Regular mapping for coarse-grained reconfigurable architectures ', 2004 IEEE Int. Conf. Acoustics, Speech, and Signal Processing, Montréal, Canada, May 2004, V, p. 57-60
-
(2004)
2004 IEEE Int. Conf. Acoustics, Speech, and Signal Processing
, vol.5
, pp. 57-60
-
-
Hannig, F.1
Dutta, H.2
Teich, J.3
-
5
-
-
0012561327
-
Coming challenges in microarchitecture and architecture
-
Ronen, R., Mendelson, A., Lai, K., Lu, S.-L., Pollack, F., and Shen, J.: ' Coming challenges in microarchitecture and architecture ', Proc. IEEE, 2001, 89, (3), p. 325-340
-
(2001)
Proc. IEEE
, vol.89
, Issue.3
, pp. 325-340
-
-
Ronen, R.1
Mendelson, A.2
Lai, K.3
Lu, S.-L.4
Pollack, F.5
Shen, J.6
-
6
-
-
4644337990
-
The vector-thread architecture
-
Munich, Germany, June
-
Krashinsky, R., Batten, C., Hampton, M., Gerding, S., Pharris, B., Casper, J., and Asanovic, K.: ' The vector-thread architecture ', IEEE 31st Int. Symp. on Computer Architecture, Munich, Germany, June 2004, p. 52-63
-
(2004)
IEEE 31st Int. Symp. on Computer Architecture
, pp. 52-63
-
-
Krashinsky, R.1
Batten, C.2
Hampton, M.3
Gerding, S.4
Pharris, B.5
Casper, J.6
Asanovic, K.7
-
7
-
-
85008521306
-
Are single-chip multiprocessors in reach?
-
Bergamaschi, R., Bolsens, I., Gupta, R., Harr, R., Jerraya, A., Keutzer, K., Olukotun, K., and Vissers, K.: ' Are single-chip multiprocessors in reach? ', IEEE Des. Test Comput., 2001, 18, (1), p. 82-89
-
(2001)
IEEE Des. Test Comput.
, vol.18
, Issue.1
, pp. 82-89
-
-
Bergamaschi, R.1
Bolsens, I.2
Gupta, R.3
Harr, R.4
Jerraya, A.5
Keutzer, K.6
Olukotun, K.7
Vissers, K.8
-
8
-
-
0000227930
-
Reconfigurable computing: A survey of systems and software
-
Compton, K., and Hauck, S.: ' Reconfigurable computing: a survey of systems and software ', ACM Comput. Surv., 2002, 34, (2), p. 171-210
-
(2002)
ACM Comput. Surv.
, vol.34
, Issue.2
, pp. 171-210
-
-
Compton, K.1
Hauck, S.2
-
9
-
-
33745715748
-
Synthesizable reconfigurable array targeting distributed arithmetic for system-on-chip applications
-
Khawam, S., Arslan, T., and Westall, F.: ' Synthesizable reconfigurable array targeting distributed arithmetic for system-on-chip applications ', 12th Reconfigurable Architectures Workshop, 2004
-
(2004)
Reconfigurable Architectures Workshop
-
-
Khawam, S.1
Arslan, T.2
Westall, F.3
-
10
-
-
3543135955
-
Evolutionary algorithm for the promotion of evolvable hardware
-
Tyrrell, A.M., Krohling, R.A., and Zhou, Y.: ' Evolutionary algorithm for the promotion of evolvable hardware ', IEE Proc., Comput. Digit. Tech., 2004, 151, (4), p. 267-275
-
(2004)
IEE Proc., Comput. Digit. Tech.
, vol.151
, Issue.4
, pp. 267-275
-
-
Tyrrell, A.M.1
Krohling, R.A.2
Zhou, Y.3
-
11
-
-
0346443386
-
SIMD machines: Do they have a significant future?
-
McLean, VA, February
-
Parhami, B.: ' SIMD machines: do they have a significant future? ', Report on a Panel Discussion, 5th Symp. Frontiers Massively Parallel Computation, McLean, VA, February 1995, p. 19-22
-
(1995)
Report on a Panel Discussion, 5th Symp. Frontiers Massively Parallel Computation
, pp. 19-22
-
-
Parhami, B.1
-
12
-
-
84947231548
-
Importance of SIMD computation reconsidered
-
April
-
Meilander, W.C., Baker, J.W., and Jin, M.: ' Importance of SIMD computation reconsidered ', 17th IEEE Int. Parallel Distributed Processing Symp. (IPDPS2003), April 2003, p. 266-273
-
(2003)
IEEE Int. Parallel Distributed Processing Symp. (IPDPS2003)
, pp. 266-273
-
-
Meilander, W.C.1
Baker, J.W.2
Jin, M.3
-
13
-
-
0040674218
-
Mixed-mode system heterogeneous computing
-
Eshaghian, M.M., Heterogeneous computing, Artech House, Norwood, MA
-
Siegel, H.J., Maheswaran, M., Watson, D.W., Antonio, J.K., and Atallah, M.J.: ' Mixed-mode system heterogeneous computing ', Eshaghian, M.M., Heterogeneous computing, (Artech House, Norwood, MA 1996), p. 19-65
-
(1996)
Heterogeneous Computing
, pp. 19-65
-
-
Siegel, H.J.1
Maheswaran, M.2
Watson, D.W.3
Antonio, J.K.4
Atallah, M.J.5
-
14
-
-
0003554095
-
-
Oxford University Press, Oxford, England
-
Duff, I.S., Erisman, A.M., and Reid, J.K.: ' Direct methods for sparse matrices ', (Oxford University Press, Oxford, England 1990)
-
(1990)
Direct Methods for Sparse Matrices
-
-
Duff, I.S.1
Erisman, A.M.2
Reid, J.K.3
-
16
-
-
1842533207
-
Parallel LU factorization of sparse matrices on FPGA-based configurable computing engines
-
Wang, X., and Ziavras, S.G.: ' Parallel LU factorization of sparse matrices on FPGA-based configurable computing engines ', Concurrency Comput. Pract. Exp., 2004, 16, (4), p. 319-343
-
(2004)
Concurrency Comput. Pract. Exp.
, vol.16
, Issue.4
, pp. 319-343
-
-
Wang, X.1
Ziavras, S.G.2
-
17
-
-
84947267561
-
Parallel direct solution of linear equations on FPGA-based machines
-
Nice, France, April
-
Wang, X., and Ziavras, S.G.: ' Parallel direct solution of linear equations on FPGA-based machines ', 11th IEEE Int. Workshop on Parallel and Distributed Real-Time Systems (Proc. 17th IEEE International Parallel and Distributed Processing Symp.), Nice, France, April 22-26 2003
-
(2003)
IEEE Int. Workshop on Parallel and Distributed Real-Time Systems (Proc. 17th IEEE International Parallel and Distributed Processing Symp.)
-
-
Wang, X.1
Ziavras, S.G.2
-
18
-
-
11844292749
-
HERA: A reconfigurable and mixed-mode parallel computing engine on platform FPGAs
-
Boston, Massachusetts, November
-
Wang, X., and Ziavras, S.G.: ' HERA: a reconfigurable and mixed-mode parallel computing engine on platform FPGAs ', 16th Int. Conf. Parallel and Distributed Computing and Systems, Boston, Massachusetts, November 9-11 2004, p. 374-379
-
(2004)
Int. Conf. Parallel and Distributed Computing and Systems
, pp. 374-379
-
-
Wang, X.1
Ziavras, S.G.2
-
19
-
-
0035341885
-
Reconfigurable computing and digital signal processing: A survey
-
Tessier, R., and Burleson, W.: ' Reconfigurable computing and digital signal processing: a survey ', J. VLSI Signal Process., 2001, 28, (1-2), p. 7-27
-
(2001)
J. VLSI Signal Process.
, vol.28
, Issue.1-2
, pp. 7-27
-
-
Tessier, R.1
Burleson, W.2
-
20
-
-
84946014243
-
An FPGA based coprocessor for large matrix product implementation
-
December
-
Bensaali, F., Amira, A., and Bouridane, A.: ' An FPGA based coprocessor for large matrix product implementation ', 2003 IEEE Int. Conf. Field-Programmable Technology, December 2003, p. 292-295
-
(2003)
2003 IEEE Int. Conf. Field-Programmable Technology
, pp. 292-295
-
-
Bensaali, F.1
Amira, A.2
Bouridane, A.3
-
21
-
-
4143084151
-
Hierarchical synthesis of complex DSP functions on FPGAs
-
November
-
Yi, Y., Woods, R., and McCanny, J.V.: ' Hierarchical synthesis of complex DSP functions on FPGAs ', 37th Asilomar Conf. Signals, Systems and Computers, November 2003, 2, p. 1421-1425
-
(2003)
Asilomar Conf. Signals, Systems and Computers
, vol.2
, pp. 1421-1425
-
-
Yi, Y.1
Woods, R.2
McCanny, J.V.3
-
23
-
-
33745830023
-
A high-performance and energy-efficient architecture for floating-point based LU decomposition on FPGAs
-
April
-
Govindu, G., Choi, S., Prasanna, V.K., Daga, V., Gangadharpalli, S., and Sridhar, V.: ' A high-performance and energy-efficient architecture for floating-point based LU decomposition on FPGAs ', 12th Reconfigurable Architectures Workshop, April 2004
-
(2004)
Reconfigurable Architectures Workshop
-
-
Govindu, G.1
Choi, S.2
Prasanna, V.K.3
Daga, V.4
Gangadharpalli, S.5
Sridhar, V.6
-
24
-
-
20344376214
-
64-bit floating-point FPGA matrix multiplication
-
Monterey, CA, February
-
Dou, Y., Vassiliadis, S., Kuzmanov, G.K., and Gaydadjiev, G.N.: ' 64-bit floating-point FPGA matrix multiplication ', ACM/SIGDA Int. Symp. on Field Programmable Gate Arrays, Monterey, CA, February 2005, p. 86-95
-
(2005)
ACM/SIGDA Int. Symp. on Field Programmable Gate Arrays
, pp. 86-95
-
-
Dou, Y.1
Vassiliadis, S.2
Kuzmanov, G.K.3
Gaydadjiev, G.N.4
-
25
-
-
0034187952
-
MorphoSys: An integrated reconfigurable system for data-parallel and computation-intensive applications
-
Singh, H., Lee, M.-H., Lu, G., Kurdahi, F.J., Bagherzadeh, N., and Filho, E.M.C.: ' MorphoSys: an integrated reconfigurable system for data-parallel and computation-intensive applications ', IEEE Trans. Comput., 2000, 49, (5), p. 465-481
-
(2000)
IEEE Trans. Comput.
, vol.49
, Issue.5
, pp. 465-481
-
-
Singh, H.1
Lee, M.-H.2
Lu, G.3
Kurdahi, F.J.4
Bagherzadeh, N.5
Filho, E.M.C.6
-
26
-
-
0036505033
-
The RAW microprocessor: A computational fabric for software circuits and general purpose programs
-
Taylor, M.B., Kim, J., Miller, J., Wentzlaff, D., Ghodrat, F., Greenwald, B., Hoffmann, H., Johnson, P., Lee, J.-W., Lee, W., Ma, A., Saraf, A., Seneski, M., Shnidman, N., Strumpen, V., Frank, M., Amarasinghe, S., and Agarwal, A.: ' The RAW microprocessor: a computational fabric for software circuits and general purpose programs ', IEEE Micro, 2002, 22, (2), p. 25-35
-
(2002)
IEEE Micro
, vol.22
, Issue.2
, pp. 25-35
-
-
Taylor, M.B.1
Kim, J.2
Miller, J.3
Wentzlaff, D.4
Ghodrat, F.5
Greenwald, B.6
Hoffmann, H.7
Johnson, P.8
Lee, J.-W.9
Lee, W.10
Ma, A.11
Saraf, A.12
Seneski, M.13
Shnidman, N.14
Strumpen, V.15
Frank, M.16
Amarasinghe, S.17
Agarwal, A.18
-
27
-
-
0030394522
-
MATRIX: A reconfigurable computing architecture with configurable instruction distribution and deployable resources
-
Mirsky, E., and DeHon, A.: ' MATRIX: a reconfigurable computing architecture with configurable instruction distribution and deployable resources ', 1996 IEEE Symp. FPGAs for Custom Computing Machines, 1996, p. 157-166
-
(1996)
1996 IEEE Symp. FPGAs for Custom Computing Machines
, pp. 157-166
-
-
Mirsky, E.1
DeHon, A.2
-
28
-
-
2842571467
-
The case for a single-chip multiprocessor
-
October
-
Olukotun, K., Nayfeh, B.A., Hammond, L., Wilson, K., and Chang, K.: ' The case for a single-chip multiprocessor ', Seventh Int. Symp. Architectural Support for Programming Languages and Operating Systems, October 1996, p. 2-11
-
(1996)
Seventh Int. Symp. Architectural Support for Programming Languages and Operating Systems
, pp. 2-11
-
-
Olukotun, K.1
Nayfeh, B.A.2
Hammond, L.3
Wilson, K.4
Chang, K.5
-
29
-
-
11844284318
-
SCMP: A single-chip message-passing parallel computer
-
Las Vegas, NV, June
-
Baker, J.M., Bennett, S., Bucciero, M., Gold, B., and Mahajan, R.: ' SCMP: a single-chip message-passing parallel computer ', The 2002 Int. Conf. on Parallel and Distributed Processing Techniques and Applications, Las Vegas, NV, June 2002, p. 1485-1491
-
(2002)
The 2002 Int. Conf. on Parallel and Distributed Processing Techniques and Applications
, pp. 1485-1491
-
-
Baker, J.M.1
Bennett, S.2
Bucciero, M.3
Gold, B.4
Mahajan, R.5
-
30
-
-
85086484200
-
A multiprocessor-on-a-programmable-chip reconfigurable system for matrix operations with power-grid case studies
-
Special Issue on Parallel and Distrib. Sci. Eng. Comput., in Press
-
Wang, X., and Ziavras, S.G.: ' A multiprocessor-on-a-programmable-chip reconfigurable system for matrix operations with power-grid case studies ', Int. J. Comput. Sci. Eng., Special Issue on Parallel and Distrib. Sci. Eng. Comput., in press
-
Int. J. Comput. Sci. Eng.
-
-
Wang, X.1
Ziavras, S.G.2
-
31
-
-
33745687464
-
-
Annapolis Micro Systems, Inc., Available at http://www.annapmicro.com/
-
-
-
-
32
-
-
33745687465
-
-
Codito Technologies Pvt. Ltd. Available at: http://www.codito.com/prodtech_framework.html
-
-
-
-
33
-
-
33745715747
-
-
OpenMP. Available at: http://www.openmp.org
-
-
-
-
34
-
-
33947401902
-
Coprocessor design to support MPI primitives in configurable multiprocessors
-
in Press
-
Ziavras, S.G., Gerbessiotis, A., and Bafna, R.: ' Coprocessor design to support MPI primitives in configurable multiprocessors ', Integr. VLSI J., in press
-
Integr. VLSI J.
-
-
Ziavras, S.G.1
Gerbessiotis, A.2
Bafna, R.3
-
35
-
-
33745687448
-
A framework for dynamic resource management and scheduling on reconfigurable mixed-mode multiprocessors
-
Singapore, December
-
Wang, X., and Ziavras, S.G.: ' A framework for dynamic resource management and scheduling on reconfigurable mixed-mode multiprocessors ', IEEE Int. Conf. on Field-Programmable Technology, Singapore, December 2005, p. 51-58
-
(2005)
IEEE Int. Conf. on Field-Programmable Technology
, pp. 51-58
-
-
Wang, X.1
Ziavras, S.G.2
-
36
-
-
33745701672
-
-
Tensilica. Available at: http://tensilica.com
-
-
-
-
37
-
-
0036709503
-
Reconfigurable instruction set processors from a hardware/software perspective
-
Barat, F., Lauwereins, R., and Deconinck, G.: ' Reconfigurable instruction set processors from a hardware/software perspective ', IEEE Trans. Softw. Eng., 2002, 28, (9), p. 847-862
-
(2002)
IEEE Trans. Softw. Eng.
, vol.28
, Issue.9
, pp. 847-862
-
-
Barat, F.1
Lauwereins, R.2
Deconinck, G.3
-
38
-
-
33745701666
-
Automatic performance tuning of linear algebra kernels
-
http://bebop.cs.berkeley.edu/pubs/SciDAC_250102.pdf
-
Demmel, J., and Yelick, K.: ' Automatic performance tuning of linear algebra kernels ', TOPS-SciDAC (http://www.tops-scidac.org), January 2002, Available at: http://bebop.cs.berkeley.edu/pubs/SciDAC_250102.pdf
-
(2002)
TOPS-SciDAC
-
-
Demmel, J.1
Yelick, K.2
-
39
-
-
33745701670
-
-
Intel Math Kernel Library (MKL) 8.0. Available at: http://www.intel.com/cd/software/products/asmo-na/eng/perflib/mkl/219823.htm
-
-
-
-
40
-
-
84858903302
-
Accelerating blocked matrix-matrix multiplication using a software-managed memory hierarchy with DMA
-
Wunderlich, R., Püschel, M., and Hoe, J.: ' Accelerating blocked matrix-matrix multiplication using a software-managed memory hierarchy with DMA ', High Performance Embedded Computing Workshop, 2005, MIT
-
(2005)
High Performance Embedded Computing Workshop
-
-
Wunderlich, R.1
Püschel, M.2
Hoe, J.3
-
41
-
-
0017636064
-
An efficient heuristic cluster algorithm for tearing large-scale networks
-
Sangiovanni-Vincentelli, A., Chen, L.K., and Chua, L.O.: ' An efficient heuristic cluster algorithm for tearing large-scale networks ', IEEE Trans. Circuits Syst., 1977, 24, (12), p. 709-717
-
(1977)
IEEE Trans. Circuits Syst.
, vol.24
, Issue.12
, pp. 709-717
-
-
Sangiovanni-Vincentelli, A.1
Chen, L.K.2
Chua, L.O.3
-
42
-
-
33745687456
-
-
Matrix Market, Available at: http://math.nist.gov/MatrixMarket/
-
-
-
-
43
-
-
70349183865
-
-
TMS320C6711/11B/11C/11D
-
TMS320C6711/11B/11C/11D Floating-Point Digital Signal Processors. Available at: http://focus.ti.com/docs/prod/folders/print/tms320c6711.html
-
Floating-Point Digital Signal Processors
-
-