SCOPUS Information Retrieval Platform

IEEE Transactions on Parallel and Distributed Systems

Volume 23, Issue 2, 2012, Pages 202-210

Accelerating matrix operations with improved deeply pipelined vector reduction

(3) Tai, Yi Gang (b); Lo, Chia Tien Dan (a); Psarris, Kleanthis (b)

(b) University of Texas at San Antonio (United States)

Author keywords

algorithm design and analysis; parallel algorithms; parallel and vector implementations; pipeline processors; reconfigurable hardware

Indexed keywords

ALGORITHM DESIGN AND ANALYSIS; COMMON OPERATIONS; DATA HAZARDS; DATA SETS; ENGINEERING APPLICATIONS; INPUT DATAS; LOW LATENCY; MATRIX OPERATIONS; MULTIPLE DATA; PARALLEL AND VECTOR IMPLEMENTATIONS; PIPELINE PROCESSORS; Q R DECOMPOSITION; RE-CONFIGURABLE; REDUCTION METHOD;

ALGORITHMS; DESIGN; PARALLEL ALGORITHMS; PIPE; RECONFIGURABLE HARDWARE;

DATA REDUCTION;

EID: 84855352110 PISSN: 10459219 EISSN: None Source Type: Journal
DOI: 10.1109/TPDS.2011.141 Document Type: Article

Times cited : (13)

References (20)

1
- 35448984859
- Sept Xilinx, Inc
- Xilinx Floating-Point Operator v3.0, Xilinx, Inc., http://www.xilinx.com/support/documentation/ip-documentation/floating-point-ds335.pdf, Sept. 2006.
- (2006) Xilinx Floating-Point Operator v3.0

2
- 74849104613
- An improved reduction algorithm with deeply pipelined operators
- Oct
- Y.-G. Tai, C.-T. D. Lo, and K. Psarris, "An Improved Reduction Algorithm with Deeply Pipelined Operators," Proc. IEEE Int'l Conf. Systems, Man and Cybernetics (SMC '09), pp. 3060-3065, Oct. 2009.
- (2009) Proc. IEEE Int'l Conf. Systems, Man and Cybernetics (SMC '09) , pp. 3060-3065
- Tai, Y.-G.¹ Lo, C.-T.D.² Psarris, K.³

3
- 79551570275
- Multiple data set reduction on FPGAs
- Dec
- Y.-G. Tai, C.-T. D. Lo, and K. Psarris, "Multiple Data Set Reduction on FPGAs," Proc. Int'l Conf. Field-Programmable Technology (FPT '10), Dec. 2010.
- (2010) Proc. Int'l Conf. Field-Programmable Technology (FPT '10)
- Tai, Y.-G.¹ Lo, C.-T.D.² Psarris, K.³

4
- 0004172748
- McGraw-Hill
- P.M. Kogge, The Architecture of Pipelined Computers. McGraw-Hill, 1981.
- (1981) The Architecture of Pipelined Computers
- Kogge, P.M.¹

5
- 0022054523
- Vector-reduction techniques for arithmetic pipelines
- May
- L.M. Ni and K. Hwang, "Vector-Reduction Techniques for Arithmetic Pipelines," IEEE Trans. Computer, vol. C-34, no. 5, pp. 404-411, May 1985.
- (1985) IEEE Trans. Computer , vol.C-34 , Issue.5 , pp. 404-411
- Ni, L.M.¹ Hwang, K.²

6
- 0026104540
- An improved vector-reduction method
- Feb
- H. Sips and H. Lin, "An Improved Vector-Reduction Method," IEEE Trans. Computer, vol. 40, no. 2, pp. 214-217, Feb. 1991.
- (1991) IEEE Trans. Computer , vol.40 , Issue.2 , pp. 214-217
- Sips, H.¹ Lin, H.²

7
- 34147131364
- A hybrid approach for mapping conjugate gradient onto an FPGA-augmented reconfigurable supercomputer
- DOI 10.1109/FCCM.2006.8, 4020890, Proceedings - 14th Annual IEEE Symposium on Field-Programmable Custom Computing Machines, FCCM 2006
- G.R. Morris, V.K. Prasanna, and R.D. Anderson, "A Hybrid Approach for Mapping Conjugate Gradient onto an FPGA-Augmented Reconfigurable Supercomputer," Proc. 14th Ann. IEEE Symp. Field-Programmable Custom Computing Machines (FCCM '06), pp. 3-12, 2006. (Pubitemid 47159821)
- (2006) Proceedings - 14th Annual IEEE Symposium on Field-Programmable Custom Computing Machines, FCCM 2006 , pp. 3-12
- Morris, G.R.¹ Prasanna, V.K.² Anderson, R.D.³

8
- 34547415470
- An FPGA-based application-specific processor for efficient reduction of multiple variable-length floating-point data sets
- DOI 10.1109/ASAP.2006.11, 4019536, Proceedings - IEEE 17th International Conference on Application-specific Systems, Architectures and Processors, ASAP 2006
- G.R. Morris, V.K. Prasanna, and R.D. Anderson, "An FPGA-Based Application-Specific Processor for Efficient Reduction of Multiple Variable-Length Floating-Point Data Sets," Proc. 17th IEEE Int'l Conf. Application-Specific Systems, Architectures and Processors (ASAP '06), pp. 323-330, 2006. (Pubitemid 47158351)
- (2006) Proceedings of the International Conference on Application-Specific Systems, Architectures and Processors , pp. 323-330
- Morris, G.R.¹ Prasanna, V.K.² Anderson, R.D.³

9
- 33746284911
- Designing scalable FPGA-based reduction circuits using pipelined floating-point cores
- L. Zhuo, G.R. Morris, and V.K. Prasanna, "Designing Scalable FPGA-Based Reduction Circuits Using Pipelined Floating-Point Cores," Proc. 19th IEEE Int'l Parallel and Distributed Processing Symp. (IPDPS '05), p. 147a, 2005.
- (2005) Proc. 19th IEEE Int'l Parallel and Distributed Processing Symp. (IPDPS '05)
- Zhuo, L.¹ Morris, G.R.² Prasanna, V.K.³

10
- 33847218692
- High-performance and area-efficient reduction circuits on FPGAs
- Oct
- L. Zhuo and V.K. Prasanna, "High-Performance and Area-Efficient Reduction Circuits on FPGAs," Proc. 17th Int'l Symp. Computer Architecture and High Performance Computing, Oct. 2005.
- (2005) Proc. 17th Int'l Symp. Computer Architecture and High Performance Computing
- Zhuo, L.¹ Prasanna, V.K.²

11
- 33746169464
- High-performance FPGA-based general reduction methods
- Apr
- G.R. Morris, L. Zhuo, and V.K. Prasanna, "High-Performance FPGA-Based General Reduction Methods," Proc. 10th IEEE Symp. Field-Programmable Custom Computing Machines (FCCM '05), Apr. 2005.
- (2005) Proc. 10th IEEE Symp. Field-Programmable Custom Computing Machines (FCCM '05)
- Morris, G.R.¹ Zhuo, L.² Prasanna, V.K.³

12
- 34648814129
- High-performance reduction circuits using deeply pipelined operators on FPGAs
- DOI 10.1109/TPDS.2007.1068
- L. Zhuo, G.R. Morris, and V.K. Prasanna, "High-Performance Reduction Circuits Using Deeply Pipelined Operators on FPGAs," IEEE Trans. Parallel Distributed Systems, vol. 18, no. 10, pp. 1377-1392, Oct. 2007. (Pubitemid 47456003)
- (2007) IEEE Transactions on Parallel and Distributed Systems , vol.18 , Issue.10 , pp. 1377-1392
- Zhuo, L.¹ Morris, G.R.² Prasanna, V.K.³

13
- 48149098623
- Applying out-of-core QR decomposition algorithms on FPGA-based systems
- Y.-G. Tai, C.-T. D. Lo, and K. Psarris, "Applying Out-of-Core QR Decomposition Algorithms on FPGA-Based Systems," Proc. 17th Int'l Conf. Field Programmable Logic and Applications (FPL '07), 2007.
- (2007) Proc. 17th Int'l Conf. Field Programmable Logic and Applications (FPL '07)
- Tai, Y.-G.¹ Lo, C.-T.D.² Psarris, K.³

14
- 84855350819
- Accelerating matrix decomposition with replications
- Y.-G. Tai, C.-T. D. Lo, and K. Psarris, "Accelerating Matrix Decomposition with Replications," Proc. 15th Reconfigurable Architectures Workshop (RAW '08), 2008.
- (2008) Proc. 15th Reconfigurable Architectures Workshop (RAW '08)
- Tai, Y.-G.¹ Lo, C.-T.D.² Psarris, K.³

15
- 17644368925
- Parallel out-of-core computation and updating of the QR factorization
- DOI 10.1145/1055531.1055534
- B.C. Gunter and R.A. van de Geijn, "Parallel Out-of-Core Computation and Updating of the QR Factorization," ACM Trans. Math. Software, vol. 31, no. 1, pp. 60-78, 2005. (Pubitemid 40557862)
- (2005) ACM Transactions on Mathematical Software , vol.31 , Issue.1 , pp. 60-78
- Gunter, B.C.¹ Van De Geijn, R.A.²

16
- 51049083291
- technical report, LAPack Working Notes #190
- A. Buttari, J. Langou, J. Kurzak, and J. Dongarra, "Parallel Tiled QR Factorization For Multicore Architectures," technical report, LAPack Working Notes #190, http://www.netlib.org/lapack/lawnspdf/lawn190.pdf, 2007.
- (2007) Parallel Tiled QR Factorization for Multicore Architectures
- Buttari, A.¹ Langou, J.² Kurzak, J.³ Dongarra, J.⁴

17
- 79551512963
- technical report, LAPack Working Notes #222, Innovative Computing Laboratory, Univ. of Tennessee
- B. Hadri, H. Ltaief, E. Agullo, and J. Dongarra, "Enhancing Parallelism of Tile QR Factorization for Multicore Architectures," technical report, LAPack Working Notes #222, Innovative Computing Laboratory, Univ. of Tennessee, http://www.netlib.org/lapack/lawnspdf/lawn222.pdf, 2009.
- (2009) Enhancing Parallelism of Tile QR Factorization for Multicore Architectures
- Hadri, B.¹ Ltaief, H.² Agullo, E.³ Dongarra, J.⁴

18
- 51049095781
- Xilinx, Inc
- Virtex-II Pro /Virtex-II Pro X Complete Data Sheet, Xilinx, Inc., http://direct.xilinx.com/bvdocs/publications/ds083.pdf, 2007.
- (2007) Virtex-II Pro /Virtex-II Pro X Complete Data Sheet

19
- 30344436225
- Xilinx, Inc
- Virtex-4 Family Overview, Xilinx, Inc., http://www.xilinx.com/support/documentation/data-sheets/ds112.pdf, 2007.
- (2007) Virtex-4 Family Overview

20
- 62949240224
- Xilinx, Inc
- Virtex-5 Family Overview, Xilinx, Inc., http://www.xilinx.com/support/documentation/data-sheets/ds100.pdf, 2009.
- (2009) Virtex-5 Family Overview

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.