-
1
-
-
10744232785
-
A comparison of empirical and model-driven optimization
-
K. Yotov, X. Li, G. Ren, M. Cibulskis, G. DeJong, M. Garzaran, D. Padua, K. Pingali, P. Stodghill, and P. Wu, "A comparison of empirical and model-driven optimization," in Proc. ACM SIGPLAN Conf. Programming Language Design and Implementation (PLDI), 2003, pp. 63-76.
-
(2003)
Proc. ACM SIGPLAN Conf. Programming Language Design and Implementation (PLDI)
, pp. 63-76
-
-
Yotov, K.1
Li, X.2
Ren, G.3
Cibulskis, M.4
Dejong, G.5
Garzaran, M.6
Padua, D.7
Pingali, K.8
Stodghill, P.9
Wu, P.10
-
2
-
-
0034512401
-
Combined selection of tile sizes and unroll factors using iterative compilation
-
T. Kisuki, P. Knijnenburg, and M. O'Boyle, "Combined selection of tile sizes and unroll factors using iterative compilation," in Proc. Parallel Architectures and Compilation Techniques (PACT), 2000, pp. 237-246.
-
(2000)
Proc. Parallel Architectures and Compilation Techniques (PACT)
, pp. 237-246
-
-
Kisuki, T.1
Knijnenburg, P.2
O'Boyle, M.3
-
3
-
-
85072516160
-
Automatic program transformations for virtual memory computers
-
W. Abu-Sufah, D. J. Kuck, and D. H. Lawrie, "Automatic program transformations for virtual memory computers," in Proc. Nat. Computer Conf., 1979, pp. 969-974.
-
(1979)
Proc. Nat. Computer Conf.
, pp. 969-974
-
-
Abu-Sufah, W.1
Kuck, D.J.2
Lawrie, D.H.3
-
4
-
-
85027612984
-
Dependence graphs and compiler optimizations
-
D. J. Kuck, R. H. Kuhn, D. A. Padua, B. Leasure, and M. Wolfe, "Dependence graphs and compiler optimizations," in Proc. 8th ACM SIGPLAN-SIGACT Symp. Principles of Programming Languages, 1981, pp. 207-218.
-
(1981)
Proc. 8th ACM SIGPLAN-SIGACT Symp. Principles of Programming Languages
, pp. 207-218
-
-
Kuck, D.J.1
Kuhn, R.H.2
Padua, D.A.3
Leasure, B.4
Wolfe, M.5
-
5
-
-
0001775038
-
A catalogue of optimizing transformations
-
R. Rustin, Ed. Englewood Cliffs, NJ: Prentice-Hall
-
F. Allen and J. Cocke, "A catalogue of optimizing transformations," in Design and Optimization of Compilers, R. Rustin, Ed. Englewood Cliffs, NJ: Prentice-Hall, 1972, pp. 1-30.
-
(1972)
Design and Optimization of Compilers
, pp. 1-30
-
-
Allen, F.1
Cocke, J.2
-
7
-
-
0030685988
-
Data-centric multi-level blocking
-
I. Kodukula, N. Ahmed, and K. Pingali, "Data-centric multi-level blocking," in Proc. ACM SIGPLAN Conf. Programming Language Design and Implementation (PLDI), 1997, pp. 346-357.
-
(1997)
Proc. ACM SIGPLAN Conf. Programming Language Design and Implementation (PLDI)
, pp. 346-357
-
-
Kodukula, I.1
Ahmed, N.2
Pingali, K.3
-
10
-
-
84956865893
-
On the equivalence of two systems of affine recurrence equations
-
Heidelberg, Germany: Springer-Verlag
-
D. Barthou, P. Feautrier, and X. Redon, "On the equivalence of two systems of affine recurrence equations," in Lecture Notes in Computer Science, Euro-Par 2002. Heidelberg, Germany: Springer-Verlag, 2002, vol. 2400, pp. 309-313.
-
(2002)
Lecture Notes in Computer Science, Euro-Par 2002
, vol.2400
, pp. 309-313
-
-
Barthou, D.1
Feautrier, P.2
Redon, X.3
-
11
-
-
0343462141
-
Automated empirical optimization of software and the ATLAS project
-
R. C. Whaley, A. Petitet, and J. J. Dongarra, "Automated empirical optimization of software and the ATLAS project," Parallel Comput., vol. 27, no. 1-2, pp. 3-35, 2001.
-
(2001)
Parallel Comput.
, vol.27
, Issue.1-2
, pp. 3-35
-
-
Whaley, R.C.1
Petitet, A.2
Dongarra, J.J.3
-
12
-
-
20744452904
-
Self-adapting linear algebra algorithms and software
-
Feb.
-
J. Demmel, J. Dongarra, V. Eijkhout, E. Fuentes, A. Petitet, R. Vuduc, C. Whaley, and K. Yelick, "Self-adapting linear algebra algorithms and software," Proc. IEEE, vol. 93, no. 2, pp. 293-312, Feb. 2005.
-
(2005)
Proc. IEEE
, vol.93
, Issue.2
, pp. 293-312
-
-
Demmel, J.1
Dongarra, J.2
Eijkhout, V.3
Fuentes, E.4
Petitet, A.5
Vuduc, R.6
Whaley, C.7
Yelick, K.8
-
13
-
-
0003706460
-
-
Philadelphia, PA: Society for Industrial and Applied Mathematics
-
E. Anderson, Z. Bai, C. Bischof, S. Blackford, J. Demmel, J. Dongarra, J. Du Croz, A. Greenbaum, S. Hammarling, A. McKenney, and D. Sorensen, LAPACK Users' Guide, 3rd ed. Philadelphia, PA: Society for Industrial and Applied Mathematics, 1999.
-
(1999)
LAPACK Users' Guide, 3rd Ed.
-
-
Anderson, E.1
Bai, Z.2
Bischof, C.3
Blackford, S.4
Demmel, J.5
Dongarra, J.6
Du Croz, J.7
Greenbaum, A.8
Hammarling, S.9
McKenney, A.10
Sorensen, D.11
-
14
-
-
20744459570
-
A comparison of empirical and model-driven optimization
-
Feb.
-
K. Yotov, X. Li, G. Ren, M. Garzaran, D. Padua, K. Pingali, and P. Stodghill, "A comparison of empirical and model-driven optimization," Proc. IEEE, vol. 93, no. 2, pp. 358-386, Feb. 2005.
-
(2005)
Proc. IEEE
, vol.93
, Issue.2
, pp. 358-386
-
-
Yotov, K.1
Li, X.2
Ren, G.3
Garzaran, M.4
Padua, D.5
Pingali, K.6
Stodghill, P.7
-
15
-
-
1542501019
-
Sparsity: Optimization framework for sparse matrix kernels
-
E.-I. Im, K. Yelick, and R. Vuduc, "Sparsity: Optimization framework for sparse matrix kernels," Int. J. High Perform. Comput. Appl., vol. 18, no. 1, pp. 135-158, 2004.
-
(2004)
Int. J. High Perform. Comput. Appl.
, vol.18
, Issue.1
, pp. 135-158
-
-
Im, E.-I.1
Yelick, K.2
Vuduc, R.3
-
16
-
-
20744453223
-
Synthesis of high-performance parallel programs for a class of ab initio quantum chemistry models
-
Feb.
-
G. Baumgartner, A. Auer, D. E. Bernholdt, A. Bibireata, V. Choppella, D. Cociorva, X. Gao, R. J. Harrison, S. Hirata, S. Krishnamoorthy, S. Krishnan, C.-C. Lam, Q. Lu, M. Nooijen, R. M. Pitzer, J. Ramanujam, P. Sadayappan, and A. Sibiryakov, "Synthesis of high-performance parallel programs for a class of ab initio quantum chemistry models," Proc. IEEE, vol. 93, no. 2, pp. 276-292, Feb. 2005.
-
(2005)
Proc. IEEE
, vol.93
, Issue.2
, pp. 276-292
-
-
Baumgartner, G.1
Auer, A.2
Bernholdt, D.E.3
Bibireata, A.4
Choppella, V.5
Cociorva, D.6
Gao, X.7
Harrison, R.J.8
Hirata, S.9
Krishnamoorthy, S.10
Krishnan, S.11
Lam, C.-C.12
Lu, Q.13
Nooijen, M.14
Pitzer, R.M.15
Ramanujam, J.16
Sadayappan, P.17
Sibiryakov, A.18
-
17
-
-
84966641594
-
A performance optimization framework for compilation of tensor contraction expressions into parallel programs
-
G. Baumgartner, D. Bernholdt, D. Cociorva, R. Harrison, M. Nooijen, J. Ramanujam, and P. Sadayappan, "A performance optimization framework for compilation of tensor contraction expressions into parallel programs," in Proc. Int. Workshop High-Level Parallel Programming Models and Supportive Environments [Held in Conjunction With IEEE Int. Parallel and Distributed Processing Symp. (IPDPS)], 2002, pp. 106-114.
-
(2002)
Proc. Int. Workshop High-level Parallel Programming Models and Supportive Environments [Held in Conjunction with IEEE Int. Parallel and Distributed Processing Symp. (IPDPS)]
, pp. 106-114
-
-
Baumgartner, G.1
Bernholdt, D.2
Cociorva, D.3
Harrison, R.4
Nooijen, M.5
Ramanujam, J.6
Sadayappan, P.7
-
18
-
-
20744449792
-
The design and implementation of FFTW3
-
Feb.
-
M. Frigo and S. G. Johnson, "The design and implementation of FFTW3," Proc. IEEE, vol. 93, no. 2, pp. 216-231, Feb. 2005.
-
(2005)
Proc. IEEE
, vol.93
, Issue.2
, pp. 216-231
-
-
Frigo, M.1
Johnson, S.G.2
-
19
-
-
0031636309
-
FFTW: An adaptive software architecture for the FFT
-
[Online]
-
M. Frigo and S. G. Johnson, "FFTW: An adaptive software architecture for the FFT," in Proc. IEEE Int. Conf. Acoustics, Speech, and Signal Processing (ICASSP), vol. 3, 1998, pp. 1381-1384. [Online]. Available: http://www.fftw.org.
-
(1998)
Proc. IEEE Int. Conf. Acoustics, Speech, and Signal Processing (ICASSP)
, vol.3
, pp. 1381-1384
-
-
-
21
-
-
84949653778
-
Automatic performance tuning in the UHFFT library
-
Heidelberg, Germany: Springer-Verlag
-
D. Mirković and S. L. Johnsson, "Automatic performance tuning in the UHFFT library," in Lecture Notes in Computer Science, Computational Science - ICCS 2001. Heidelberg, Germany: Springer-Verlag, 2001, vol. 2073, pp. 71-80.
-
(2001)
Lecture Notes in Computer Science, Computational Science - ICCS 2001
, vol.2073
, pp. 71-80
-
-
Mirković, D.1
Johnsson, S.L.2
-
22
-
-
0003569514
-
-
Ph.D. dissertation, Institut für Informatik, Univ. Karlsruhe, Karlsruhe, Germany
-
S. Egner, "Zur Algorithmischen Zerlegungstheorie Linearer Transformationen Mit Symmetrie (On the algorithmic decomposition theory of linear transforms with symmetry)," Ph.D. dissertation, Institut für Informatik, Univ. Karlsruhe, Karlsruhe, Germany, 1997.
-
(1997)
Zur Algorithmischen Zerlegungstheorie Linearer Transformationen Mit Symmetrie (On the Algorithmic Decomposition Theory of Linear Transforms with Symmetry)
-
-
Egner, S.1
-
24
-
-
20744444976
-
Architecture-cognizant divide and conquer algorithms
-
Portland, OR
-
_, "Architecture-cognizant divide and conquer algorithms," presented at the Conf. Supercomputing, Portland, OR, 1999.
-
(1999)
Conf. Supercomputing
-
-
-
25
-
-
0029322264
-
Unfavorable strides in cache memory systems
-
D. H. Bailey, "Unfavorable strides in cache memory systems," Sci. Program., vol. 4, pp. 53-58, 1995.
-
(1995)
Sci. Program.
, vol.4
, pp. 53-58
-
-
Bailey, D.H.1
-
26
-
-
0033894726
-
Dynamic data layouts for cache-conscious factorization of DFT
-
N. Park, D. Kang, K. Bondalapati, and V. K. Prasanna, "Dynamic data layouts for cache-conscious factorization of DFT," in Proc. IEEE Int. Parallel and Distributed Processing Symp. (IPDPS), 2000, pp. 693-701.
-
(2000)
Proc. IEEE Int. Parallel and Distributed Processing Symp. (IPDPS)
, pp. 693-701
-
-
Park, N.1
Kang, D.2
Bondalapati, K.3
Prasanna, V.K.4
-
27
-
-
0033694274
-
In search for the optimal Walsh-Hadamard transform
-
J. Johnson and M. Püschel, "In search for the optimal Walsh-Hadamard transform," in Proc. IEEE Int. Conf. Acoustics, Speech, and Signal Processing (ICASSP), vol. 4, 2000, pp. 3347-3350.
-
(2000)
Proc. IEEE Int. Conf. Acoustics, Speech, and Signal Processing (ICASSP)
, vol.4
, pp. 3347-3350
-
-
Johnson, J.1
Püschel, M.2
-
28
-
-
20744446138
-
Parallel VSIPL++: An open standard software library for high-performance parallel signal processing
-
Feb.
-
J. Lebak, J. Kepner, H. Hoffmann, and E. Rutledge, "Parallel VSIPL++: An open standard software library for high-performance parallel signal processing," Proc. IEEE, vol. 93, no. 2, pp. 313-330, Feb. 2005.
-
(2005)
Proc. IEEE
, vol.93
, Issue.2
, pp. 313-330
-
-
Lebak, J.1
Kepner, J.2
Hoffmann, H.3
Rutledge, E.4
-
29
-
-
0010947977
-
A comparative study of static and profile-based heuristics for inlining
-
M. Arnold, S. Fink, V. Sarkar, and P. F. Sweeney, "A comparative study of static and profile-based heuristics for inlining," in Proc. ACM SIGPLAN Workshop Dynamic and Adaptive Compilation and Optimization, 2000, pp. 52-64.
-
(2000)
Proc. ACM SIGPLAN Workshop Dynamic and Adaptive Compilation and Optimization
, pp. 52-64
-
-
Arnold, M.1
Fink, S.2
Sarkar, V.3
Sweeney, P.F.4
-
30
-
-
0026866013
-
Profile-guided automatic inline expansion for C programs
-
P. P. Chang, S. A. Mahlke, W. Y. Chen, and W. M. W. Hwu, "Profile-guided automatic inline expansion for C programs," Softw. Pract. Exper., vol. 22, no. 5, pp. 349-369, 1992.
-
(1992)
Softw. Pract. Exper.
, vol.22
, Issue.5
, pp. 349-369
-
-
Chang, P.P.1
Mahlke, S.A.2
Chen, W.Y.3
Hwu, W.M.W.4
-
32
-
-
20744451215
-
Information technology - JPEG 2000 image coding system - Part 1: Core coding system
-
ISO/IEC 15444-1
-
"Information Technology - JPEG 2000 Image Coding System - Part 1: Core Coding System," Int. Org. Standardization/Int. Electrotech. Comm., ISO/IEC 15444-1:2000.
-
(2000)
Int. Org. Standardization/Int. Electrotech. Comm.
-
-
-
33
-
-
0025600627
-
A methodology for designing, modifying, and implementing Fourier transform algorithms on various architectures
-
J. R. Johnson, R. W. Johnson, D. Rodriguez, and R. Tolimieri, "A methodology for designing, modifying, and implementing Fourier transform algorithms on various architectures," Circuits, Syst., Signal Process., vol. 9, no. 4, pp. 449-500, 1990.
-
(1990)
Circuits, Syst., Signal Process.
, vol.9
, Issue.4
, pp. 449-500
-
-
Johnson, J.R.1
Johnson, R.W.2
Rodriguez, D.3
Tolimieri, R.4
-
37
-
-
0141565306
-
Cooley-Tukey FFT like algorithms for the DCT
-
M. Püschel, "Cooley-Tukey FFT like algorithms for the DCT," in Proc. IEEE Int. Conf. Acoustics, Speech, and Signal Processing (ICASSP), vol. 2, 2003, pp. 501-504.
-
(2003)
Proc. IEEE Int. Conf. Acoustics, Speech, and Signal Processing (ICASSP)
, vol.2
, pp. 501-504
-
-
Püschel, M.1
-
40
-
-
18344410543
-
Factoring wavelet transforms into lifting steps
-
I. Daubechies and W. Sweldens, "Factoring wavelet transforms into lifting steps," J. Fourier Anal. Appl., vol. 4, no. 3, pp. 247-269, 1998.
-
(1998)
J. Fourier Anal. Appl.
, vol.4
, Issue.3
, pp. 247-269
-
-
Daubechies, I.1
Sweldens, W.2
-
42
-
-
0000459334
-
Rewriting
-
A. Robinson and A. Voronkov, Eds. New York: Elsevier, ch. 9
-
N. Dershowitz and D. A. Plaisted, "Rewriting," in Handbook of Automated Reasoning, A. Robinson and A. Voronkov, Eds. New York: Elsevier, 2001, vol. 1, ch. 9, pp. 535-610.
-
(2001)
Handbook of Automated Reasoning
, vol.1
, pp. 535-610
-
-
Dershowitz, N.1
Plaisted, D.A.2
-
43
-
-
84949679103
-
Fast automatic generation of DSP algorithms
-
Heidelberg, Germany: Springer-Verlag
-
M. Püschel, B. Singer, M. Veloso, and J. M. F. Moura, "Fast automatic generation of DSP algorithms," in Lecture Notes in Computer Science, Computational Science - ICCS 2001. Heidelberg, Germany: Springer-Verlag, 2001, vol. 2073, pp. 97-106.
-
(2001)
Lecture Notes in Computer Science, Computational Science - ICCS 2001
, vol.2073
, pp. 97-106
-
-
Püschel, M.1
Singer, B.2
Veloso, M.3
Moura, J.M.F.4
-
44
-
-
1542396679
-
SPIRAL: A generator for platform-adapted libraries of signal processing algorithms
-
M. Püschel, B. Singer, J. Xiong, J. M. F. Moura, J. Johnson, D. Padua, M. Veloso, and R. W. Johnson, "SPIRAL: A generator for platform-adapted libraries of signal processing algorithms," Int. J. High Perform. Comput. Appl., vol. 18, no. 1, pp. 21-45, 2004.
-
(2004)
Int. J. High Perform. Comput. Appl.
, vol.18
, Issue.1
, pp. 21-45
-
-
Püschel, M.1
Singer, B.2
Xiong, J.3
Moura, J.M.F.4
Johnson, J.5
Padua, D.6
Veloso, M.7
Johnson, R.W.8
-
45
-
-
20744433926
-
-
GAP - Groups, algorithms, and programming. GAP Team, Univ. St. Andrews, St. Andrews, U.K. [Online]
-
(1997) GAP - Groups, algorithms, and programming. GAP Team, Univ. St. Andrews, St. Andrews, U.K. [Online]. Available: http://www-gap.dcs.st-and.ac.uk/~gap/
-
(1997)
-
-
-
47
-
-
0034826555
-
SPL: A language and compiler for DSP algorithms
-
J. Xiong, J. Johnson, R. Johnson, and D. Padua, "SPL: A language and compiler for DSP algorithms," in Proc. ACM SIGPLAN Conf. Programming Language Design and Implementation (PLDI), 2001, pp. 298-308.
-
(2001)
Proc. ACM SIGPLAN Conf. Programming Language Design and Implementation (PLDI)
, pp. 298-308
-
-
Xiong, J.1
Johnson, J.2
Johnson, R.3
Padua, D.4
-
49
-
-
4544287691
-
Automatic generation of implementations for DSP transforms on fused multiply-add architectures
-
Y. Voronenko and M. Püschel, "Automatic generation of implementations for DSP transforms on fused multiply-add architectures," in Proc. IEEE Int. Conf. Acoustics, Speech, and Signal Processing (ICASSP), vol. 5, 2004, pp. V-101-V-104.
-
(2004)
Proc. IEEE Int. Conf. Acoustics, Speech, and Signal Processing (ICASSP)
, vol.5
-
-
Voronenko, Y.1
Püschel, M.2
-
50
-
-
0026916192
-
Engineering a simple, efficient code-generator generator
-
C. W. Fraser, D. R. Hanson, and T. A. Proebsting, "Engineering a simple, efficient code-generator generator," ACM Lett. Program. Lang. Syst., vol. 1, no. 3, pp. 213-226, 1992.
-
(1992)
ACM Lett. Program. Lang. Syst.
, vol.1
, Issue.3
, pp. 213-226
-
-
Fraser, C.W.1
Hanson, D.R.2
Proebsting, T.A.3
-
51
-
-
0027274408
-
Implementation of efficient FFT algorithms on fused multiply-add architectures
-
Jan.
-
E. Linzer and E. Feig, "Implementation of efficient FFT algorithms on fused multiply-add architectures," IEEE Trans. Signal Process., vol. 41, no. 1, p. 93, Jan. 1993.
-
(1993)
IEEE Trans. Signal Process.
, vol.41
, Issue.1
, p. 93
-
-
Linzer, E.1
Feig, E.2
-
52
-
-
0026396564
-
Implementation of multiply-add FFT algorithms for complex and real data sequences
-
C. Lu, "Implementation of multiply-add FFT algorithms for complex and real data sequences," in Proc. Int. Symp. Circuits and Systems (ISCAS), vol. 1, 1991, pp. 480-483.
-
(1991)
Proc. Int. Symp. Circuits and Systems (ISCAS)
, vol.1
, pp. 480-483
-
-
Lu, C.1
-
55
-
-
19344368498
-
-
Ph.D. dissertation, Inst. Appl. Math. Numer. Anal., Vienna Univ. Technol.
-
F. Franchetti, "Performance portable short vector transforms," Ph.D. dissertation, Inst. Appl. Math. Numer. Anal., Vienna Univ. Technol., 2003.
-
(2003)
Performance Portable Short Vector Transforms
-
-
Franchetti, F.1
-
56
-
-
19344363982
-
Efficient utilization of SIMD extensions
-
Feb.
-
F. Franchetti, S. Kral, J. Lorenz, and C. Ueberhuber, "Efficient utilization of SIMD extensions," Proc. IEEE, vol. 93, no. 2, pp. 409-425, Feb. 2005.
-
(2005)
Proc. IEEE
, vol.93
, Issue.2
, pp. 409-425
-
-
Franchetti, F.1
Kral, S.2
Lorenz, J.3
Ueberhuber, C.4
-
57
-
-
85013593108
-
Experience in the automatic parallelization of four perfect benchmark programs
-
Heidelberg, Germany
-
R. Eigenmann, J. Hoeflinger, Z. Li, and D. Padua, "Experience in the automatic parallelization of four perfect benchmark programs," in Lecture Notes in Computer Science, Languages and Compilers for Parallel Computing, vol. 589. Heidelberg, Germany, 1992, pp. 65-83.
-
(1992)
Lecture Notes in Computer Science, Languages and Compilers for Parallel Computing
, vol.589
, pp. 65-83
-
-
Eigenmann, R.1
Hoeflinger, J.2
Li, Z.3
Padua, D.4
-
58
-
-
0031699606
-
On the automatic parallelization of the perfect benchmarks
-
Jan.
-
R. Eigenmann, J. Hoeflinger, and D. Padua, "On the automatic parallelization of the perfect benchmarks," IEEE Trans. Parallel Distrib. Syst., vol. 9, no. 1, pp. 5-23, Jan. 1998.
-
(1998)
IEEE Trans. Parallel Distrib. Syst.
, vol.9
, Issue.1
, pp. 5-23
-
-
Eigenmann, R.1
Hoeflinger, J.2
Padua, D.3
-
62
-
-
0003710739
-
-
Cambridge, MA: MIT Press
-
W. Gropp, S. Huss-Lederman, A. Lumsdaine, E. Lusk, B. Nitzberg, W. Saphir, and M. Snir, MPI: The Complete Reference, 2nd ed. Cambridge, MA: MIT Press, 1998.
-
(1998)
MPI: The Complete Reference, 2nd Ed.
-
-
Gropp, W.1
Huss-Lederman, S.2
Lumsdaine, A.3
Lusk, E.4
Nitzberg, B.5
Saphir, W.6
Snir, M.7
-
63
-
-
20744457032
-
-
Ph.D. dissertation, Dept. Elect. Comput. Eng., Drexel Univ., Philadelphia, PA
-
P. Kumhom, "Design, optimization, and implementation of a universal FFT processor," Ph.D. dissertation, Dept. Elect. Comput. Eng., Drexel Univ., Philadelphia, PA, 2001.
-
(2001)
Design, Optimization, and Implementation of a Universal FFT Processor
-
-
Kumhom, P.1
-
64
-
-
0343394643
-
Testing multivariate linear functions: Overcoming the generator bottleneck
-
F. Ergün, "Testing multivariate linear functions: Overcoming the generator bottleneck," in Proc. ACM Symp. Theory of Computing (STOC), vol. 2, 1995, pp. 407-416.
-
(1995)
Proc. ACM Symp. Theory of Computing (STOC)
, vol.2
, pp. 407-416
-
-
Ergün, F.1
-
65
-
-
20744441642
-
Verification of linear programs
-
London, ON, Canada
-
J. Johnson, M. Püschel, and Y. Voronenko, "Verification of linear programs," presented at the Int. Symp. Symbolic and Algebraic Computation (ISSAC), London, ON, Canada, 2001.
-
(2001)
Int. Symp. Symbolic and Algebraic Computation (ISSAC)
-
-
Johnson, J.1
Püschel, M.2
Voronenko, Y.3
-
66
-
-
20744431703
-
-
ser. CBMS-NSF Regional Conf. Ser. Appl. Math. Philadelphia, PA: SIAM
-
S. Winograd, Arithmetic Complexity of Computations, ser. CBMS-NSF Regional Conf. Ser. Appl. Math. Philadelphia, PA: SIAM, 1980.
-
(1980)
Arithmetic Complexity of Computations
-
-
Winograd, S.1
-
67
-
-
1942477516
-
Automatic derivation and implementation of fast convolution algorithms
-
J. R. Johnson and A. F. Breitzman, "Automatic derivation and implementation of fast convolution algorithms," J. Symbol. Comput., vol. 37, no. 2, pp. 261-293, 2004.
-
(2004)
J. Symbol. Comput.
, vol.37
, Issue.2
, pp. 261-293
-
-
Johnson, J.R.1
Breitzman, A.F.2
-
68
-
-
0026299779
-
New scaled DCT algorithms for fused multiply/add architectures
-
E. Linzer and E. Feig, "New scaled DCT algorithms for fused multiply/add architectures," in Proc. IEEE Int. Conf. Acoustics, Speech, and Signal Processing (ICASSP), vol. 3, 1991, pp. 2201-2204.
-
(1991)
Proc. IEEE Int. Conf. Acoustics, Speech, and Signal Processing (ICASSP)
, vol.3
, pp. 2201-2204
-
-
Linzer, E.1
Feig, E.2
-
70
-
-
0021513104
-
Some complexity issues in digital signal processing
-
Oct.
-
P. R. Cappello and K. Steiglitz, "Some complexity issues in digital signal processing," IEEE Trans. Acoust., Speech, Signal Process., vol. ASSP-32, no. 5, pp. 1037-1041, Oct. 1984.
-
(1984)
IEEE Trans. Acoust., Speech, Signal Process.
, vol.ASSP-32
, Issue.5
, pp. 1037-1041
-
-
Cappello, P.R.1
Steiglitz, K.2
-
71
-
-
0036283460
-
Extended results for minimum-adder constant integer multipliers
-
O. Gustafsson, A. Dempster, and L. Wanhammar, "Extended results for minimum-adder constant integer multipliers," in IEEE Int. Symp. Circuits and Systems, vol. 1, 2002, pp. I-73-I-76.
-
(2002)
IEEE Int. Symp. Circuits and Systems
, vol.1
-
-
Gustafsson, O.1
Dempster, A.2
Wanhammar, L.3
-
72
-
-
4544278669
-
Automatic cost minimization for multiplierless implementations of discrete signal transforms
-
A. C. Zelinski, M. Püschel, S. Misra, and J. C. Hoe, "Automatic cost minimization for multiplierless implementations of discrete signal transforms," in Proc. IEEE Int. Conf. Acoustics, Speech, and Signal Processing (ICASSP), vol. 5, 2004, pp. V-221-V-224.
-
(2004)
Proc. IEEE Int. Conf. Acoustics, Speech, and Signal Processing (ICASSP)
, vol.5
-
-
Zelinski, A.C.1
Püschel, M.2
Misra, S.3
Hoe, J.C.4
-
73
-
-
16244420727
-
Custom-optimized multiplierless implementations of DSP algorithms
-
San Jose, CA
-
M. Püschel, A. Zelinski, and J. C. Hoe, "Custom-optimized multiplierless implementations of DSP algorithms," presented at the Int. Conf. Computer Aided Design (ICCAD), San Jose, CA, 2004.
-
(2004)
Int. Conf. Computer Aided Design (ICCAD)
-
-
Püschel, M.1
Zelinski, A.2
Hoe, J.C.3
-
74
-
-
20744458364
-
Information technology-coding of moving pictures and associated audio for digital storage media at up to about 1.5 Mbits/s
-
ISO/IEC 11172
-
"Information technology-coding of moving pictures and associated audio for digital storage media at up to about 1.5 Mbits/s," Int. Org. Standardization/Int. Electrotech. Comm., ISO/IEC 11172, 1995.
-
(1995)
Int. Org. Standardization/Int. Electrotech. Comm.
-
-
-
75
-
-
20744439951
-
-
M.S. thesis, Dept. Comput. Sci., Drexel Univ., Philadelphia, PA
-
H.-J. Huang, "Performance analysis of an adaptive algorithm for the Walsh-Hadamard transform," M.S. thesis, Dept. Comput. Sci., Drexel Univ., Philadelphia, PA, 2002.
-
(2002)
Performance Analysis of an Adaptive Algorithm for the Walsh-Hadamard Transform
-
-
Huang, H.-J.1
-
76
-
-
20744457985
-
-
M.S. thesis, Dept. Comput. Sci., Drexel Univ., Philadelphia, PA
-
M. Furis, "Cache miss analysis of Walsh-Hadamard transform algorithms," M.S. thesis, Dept. Comput. Sci., Drexel Univ., Philadelphia, PA, 2003.
-
(2003)
Cache Miss Analysis of Walsh-Hadamard Transform Algorithms
-
-
Furis, M.1
-
77
-
-
20744438614
-
Dataflow analysis of the FFT
-
Dept. Comput. Sci., Drexel Univ., Philadelphia, PA
-
A. Parekh and J. R. Johnson, "Dataflow analysis of the FFT," Dept. Comput. Sci., Drexel Univ., Philadelphia, PA, Tech. Rep. DU-CS-2004-01, 2004.
-
(2004)
Tech. Rep.
, vol.DU-CS-2004-01
-
-
Parekh, A.1
Johnson, J.R.2
-
78
-
-
20744447865
-
Distribution of a class of divide and conquer recurrences arising from the computation of the Walsh-Hadamard transform
-
Vienna, Austria
-
J. Johnson, P. Hitczenko, and H.-J. Huang, "Distribution of a class of divide and conquer recurrences arising from the computation of the Walsh-Hadamard transform," presented at the 3rd Colloq. Mathematics and Computer Science: Algorithms, Trees, Combinatorics and Probabilities, Vienna, Austria, 2004.
-
(2004)
3rd Colloq. Mathematics and Computer Science: Algorithms, Trees, Combinatorics and Probabilities
-
-
Johnson, J.1
Hitczenko, P.2
Huang, H.-J.3
-
79
-
-
20744449261
-
Distribution of a class of divide and conquer recurrences arising from the computation of the Walsh-Hadamard transform
-
submitted for publication
-
P. Hitczenko, H.-J. Huang, and J. R. Johnson, "Distribution of a class of divide and conquer recurrences arising from the computation of the Walsh-Hadamard transform," Theor. Comput. Sci., 2003, submitted for publication.
-
(2003)
Theor. Comput. Sci.
-
-
Hitczenko, P.1
Huang, H.-J.2
Johnson, J.R.3
-
82
-
-
0141696394
-
Stochastic search for signal processing algorithm optimization
-
B. Singer and M. Veloso, "Stochastic search for signal processing algorithm optimization," Proc. Supercomputing, 2001.
-
(2001)
Proc. Supercomputing
-
-
Singer, B.1
Veloso, M.2
-
83
-
-
0013103910
-
-
Ph.D. dissertation, Dept. Comput. Sci., Faculty Sci., Univ. Porto, Porto, Portugal
-
L. Torgo, "Inductive learning of tree-based regression models," Ph.D. dissertation, Dept. Comput. Sci., Faculty Sci., Univ. Porto, Porto, Portugal, 1999.
-
(1999)
Inductive Learning of Tree-based Regression Models
-
-
Torgo, L.1
-
84
-
-
0141496142
-
Learning to construct fast signal processing implementations
-
B. Singer and M. Veloso, "Learning to construct fast signal processing implementations," J. Mach. Learn. Res., vol. 3, pp. 887-919, 2002.
-
(2002)
J. Mach. Learn. Res.
, vol.3
, pp. 887-919
-
-
Singer, B.1
Veloso, M.2
-
85
-
-
20744432440
-
Learning to generate fast signal processing implementations
-
B. Singer and M. Veloso, "Learning to generate fast signal processing implementations," in Proc. Int. Conf. Machine Learning, 2001, pp. 529-536.
-
(2001)
Proc. Int. Conf. Machine Learning
, pp. 529-536
-
-
-
86
-
-
0036684741
-
Automating the modeling and optimization of the performance of signal transforms
-
Aug.
-
B. Singer and M. M. Veloso, "Automating the modeling and optimization of the performance of signal transforms," IEEE Trans. Signal Process., vol. 50, no. 8, pp. 2003-2014, Aug. 2002.
-
(2002)
IEEE Trans. Signal Process.
, vol.50
, Issue.8
, pp. 2003-2014
-
-
Singer, B.1
Veloso, M.M.2
-
87
-
-
0004161838
-
-
Cambridge, U.K.: Cambridge Univ. Press
-
W. H. Press, B. P. Flannery, S. A. Teukolsky, and W. T. Vetterling, Numerical Recipes in C: The Art of Scientific Computing, 2nd ed. Cambridge, U.K.: Cambridge Univ. Press, 1992.
-
(1992)
Numerical Recipes in C: The Art of Scientific Computing, 2nd Ed.
-
-
Press, W.H.1
Flannery, B.P.2
Teukolsky, S.A.3
Vetterling, W.T.4
-
88
-
-
20744435264
-
-
Ph.D. dissertation, Dept. Elect. Comput. Eng., Carnegie Mellon Univ., Pittsburgh, PA
-
Ph.D. dissertation, Dept. Elect. Comput. Eng., Carnegie Mellon Univ., Pittsburgh, PA.
-
-
-
-
91
-
-
0026918402
-
Design and evaluation of a compiler algorithm for prefetching
-
T. C. Mowry, M. S. Lam, and A. Gupta, "Design and evaluation of a compiler algorithm for prefetching," in Proc. Int. Conf. Architectural Support for Programming Languages and Operating Systems, 1992, pp. 62-73.
-
(1992)
Proc. Int. Conf. Architectural Support for Programming Languages and Operating Systems
, pp. 62-73
-
-
Mowry, T.C.1
Lam, M.S.2
Gupta, A.3
-
92
-
-
20744444518
-
Adaptive mapping of linear DSP algorithms to fixed-point arithmetic
-
Lexington, MA
-
L. J. Chang, I. Hong, Y. Voronenko, and M. Püschel, "Adaptive mapping of linear DSP algorithms to fixed-point arithmetic," presented at the Workshop High Performance Embedded Computing (HPEC), Lexington, MA, 2004.
-
(2004)
Workshop High Performance Embedded Computing (HPEC)
-
-
Chang, L.J.1
Hong, I.2
Voronenko, Y.3
Püschel, M.4