-
3
-
-
79952514685
-
Algorithmic parameter optimization of the DFO method with the OPAL framework
-
K. Naono, K. Teranishi, J. Cavazos, R. Suda (Eds.) Springer
-
C. Audet, C.-K. Dang, D. Orban, Algorithmic parameter optimization of the DFO method with the OPAL framework, in: K. Naono, K. Teranishi, J. Cavazos, R. Suda (Eds.), Software Automatic Tuning: From Concepts to State-of-the-Art Results, Springer, 2010, pp. 255-274.
-
(2010)
Software Automatic Tuning: From Concepts to State-of-the-Art Results
, pp. 255-274
-
-
Audet, C.1
Dang, C.-K.2
Orban, D.3
-
4
-
-
84976845336
-
Testing unconstrained optimization software
-
J. J. Moré, B. S. Garbow, K. E. Hillstrom, Testing unconstrained optimization software, ACM Transactions on Mathematical Software 7 (1) (1981) 17-41. doi: http://doi.acm.org/10.1145/355934.355936.
-
(1981)
ACM Transactions on Mathematical Software
, vol.7
, Issue.1
, pp. 17-41
-
-
Moré, J.J.1
Garbow, B.S.2
Hillstrom, K.E.3
-
5
-
-
2442598318
-
CUTEr and SifDec: A constrained and unconstrained testing environment, revisited
-
N. I. M. Gould, D. Orban, P. L. Toint, CUTEr and SifDec: A constrained and unconstrained testing environment, revisited, ACM Transactions on Mathematical Software 29 (4) (2003) 373-394. doi: http://doi.acm.org/10.1145/962437.962439.
-
(2003)
ACM Transactions on Mathematical Software
, vol.29
, Issue.4
, pp. 373-394
-
-
Gould, N.I.M.1
Orban, D.2
Toint, P.L.3
-
6
-
-
68949099095
-
Benchmarking derivative-free optimization algorithms
-
J. J. Moré, S. M. Wild, Benchmarking derivative-free optimization algorithms, SIAM Journal on Optimization 20 (1) (2009) 172-191.
-
(2009)
SIAM Journal on Optimization
, vol.20
, Issue.1
, pp. 172-191
-
-
Moré, J.J.1
Wild, S.M.2
-
7
-
-
79958251086
-
-
Chapman & Hall CRC Press, Taylor and Francis Group
-
B. Norris, A. Hartono, W. Gropp, Annotations for Productivity and Performance Portability, Computational Science, Chapman & Hall CRC Press, Taylor and Francis Group, 2007, pp. 443-461.
-
(2007)
Annotations for Productivity and Performance Portability, Computational Science
, pp. 443-461
-
-
Norris, B.1
Hartono, A.2
Gropp, W.3
-
8
-
-
0034512401
-
Combined selection of tile sizes and unroll factors using iterative compilation
-
Washington, DC
-
T. Kisuki, P. M. W. Knijnenburg, M. F. P. O'Boyle, Combined selection of tile sizes and unroll factors using iterative compilation, in: Proc. of the 2000 International Conference on Parallel Architectures and Compilation Techniques, Washington, DC, 2000.
-
(2000)
Proc. of the 2000 International Conference on Parallel Architectures and Compilation Techniques
-
-
Kisuki, T.1
Knijnenburg, P.M.W.2
O'Boyle, M.F.P.3
-
9
-
-
33646676076
-
Automatic tuning of whole applications using direct search and a performance-based transformation system
-
A. Qasem, K. Kennedy, J. Mellor-Crummey, Automatic tuning of whole applications using direct search and a performance-based transformation system, The Journal of Supercomputing 36 (2) (2006) 183-196.
-
(2006)
The Journal of Supercomputing
, vol.36
, Issue.2
, pp. 183-196
-
-
Qasem, A.1
Kennedy, K.2
Mellor-Crummey, J.3
-
10
-
-
57949106903
-
A comparison of search heuristics for empirical code optimization
-
K. Seymour, H. You, J. Dongarra, A comparison of search heuristics for empirical code optimization, in: Proc. of the 2008 IEEE International Conference on Cluster Computing, 2008, pp. 421-429.
-
(2008)
Proc. of the 2008 IEEE International Conference on Cluster Computing
, pp. 421-429
-
-
Seymour, K.1
You, H.2
Dongarra, J.3
-
11
-
-
79958257802
-
Autotuning and specialization: Speeding up matrix multiply for small matrices with compiler technology
-
J. Shin, M. W. Hall, J. Chame, C. Chen, P. D. Hovland, Autotuning and specialization: Speeding up matrix multiply for small matrices with compiler technology, in: Proc. of the Fourth International Workshop on Automatic Performance Tuning, Japan, 2009.
-
(2009)
Proc. of the Fourth International Workshop on Automatic Performance Tuning, Japan
-
-
Shin, J.1
Hall, M.W.2
Chame, J.3
Chen, C.4
Hovland, P.D.5
-
12
-
-
70449844310
-
A scalable auto-tuning framework for compiler optimization
-
Washington, DC
-
A. Tiwari, C. Chen, J. Chame, M. Hall, J. K. Hollingsworth, A scalable auto-tuning framework for compiler optimization, in: Proc. of the 2009 IEEE International Symposium on Parallel & Distributed Processing, Washington, DC, 2009, pp. 1-12.
-
(2009)
Proc. of the 2009 IEEE International Symposium on Parallel & Distributed Processing
, pp. 1-12
-
-
Tiwari, A.1
Chen, C.2
Chame, J.3
Hall, M.4
Hollingsworth, J.K.5
-
14
-
-
84857186534
-
TORCH computational reference kernels: A testbed for computer science research
-
EECS Department, University of California, Berkeley
-
A. Kaiser, S. Williams, K. Madduri, K. Ibrahim, D. Bailey, J. Demmel, E. Strohmaier, TORCH computational reference kernels: A testbed for computer science research, Tech. Rep. UCB/EECS-2010-144, EECS Department, University of California, Berkeley (December 2010). URL http://www.eecs.berkeley.edu/Pubs/TechRpts/2010/EECS-2010-144.html
-
(2010)
Tech. Rep. UCB/EECS-2010-144
-
-
Kaiser, A.1
Williams, S.2
Madduri, K.3
Ibrahim, K.4
Bailey, D.5
Demmel, J.6
Strohmaier, E.7
-
18
-
-
85030969841
-
-
SPEC benchmarks, http://www.spec.org/benchmarks.html.
-
-
-
-
19
-
-
84973836157
-
The NAS parallel benchmarks
-
D. Bailey, E. Barszcz, J. Barton, D. Browning, R. Carter, L. Dagum, R. Fatoohi, P. Frederickson, T. Lasinski, R. Schreiber, H. Simon, V. Venkatakrishnan, S. Weeratunga, The NAS Parallel Benchmarks, International Journal of High Performance Computing Applications 5 (3) (1991) 63-73.
-
(1991)
International Journal of High Performance Computing Applications
, vol.5
, Issue.3
, pp. 63-73
-
-
Bailey, D.1
Barszcz, E.2
Barton, J.3
Browning, D.4
Carter, R.5
Dagum, L.6
Fatoohi, R.7
Frederickson, P.8
Lasinski, T.9
Schreiber, R.10
Simon, H.11
Venkatakrishnan, V.12
Weeratunga, S.13
-
20
-
-
63549095070
-
The PARSEC benchmark suite: Characterization and architectural implications
-
ACM
-
C. Bienia, S. Kumar, J. P. Singh, K. Li, The PARSEC benchmark suite: Characterization and architectural implications, in: Proceedings of the 17th international conference on Parallel architectures and compilation techniques, PACT '08, ACM, 2008, pp. 72-81. URL http://doi.acm.org/10.1145/1454115.1454128
-
(2008)
Proceedings of the 17th International Conference on Parallel Architectures and Compilation Techniques, PACT '08
, pp. 72-81
-
-
Bienia, C.1
Kumar, S.2
Singh, J.P.3
Li, K.4
-
21
-
-
70649092154
-
Rodinia: A benchmark suite for heterogeneous computing
-
IISWC 2009
-
S. Che, M. Boyer, J. Meng, D. Tarjan, J. W. Sheaffer, S.-H. Lee, K. Skadron, Rodinia: A benchmark suite for heterogeneous computing, in: IEEE International Symposium on Workload Characterization (IISWC 2009), 2009, pp. 44-54.
-
(2009)
IEEE International Symposium on Workload Characterization, 2009
, pp. 44-54
-
-
Che, S.1
Boyer, M.2
Meng, J.3
Tarjan, D.4
Sheaffer, J.W.5
Lee, S.-H.6
Skadron, K.7
-
22
-
-
0042674307
-
The LINPACK benchmark: Past, present and future
-
J. Dongarra, P. Luszczek, A. Petitet, The LINPACK benchmark: Past, present and future, Concurrency and Computation: Practice and Experience 15 (9) (2003) 803-820.
-
(2003)
Concurrency and Computation: Practice and Experience
, vol.15
, Issue.9
, pp. 803-820
-
-
Dongarra, J.1
Luszczek, P.2
Petitet, A.3
-
24
-
-
56449127224
-
STAMP: Stanford transactional applications for multi-processing
-
Seattle, WA
-
C. C. Minh, J. Chung, C. Kozyrakis, K. Olukotun, STAMP: Stanford Transactional Applications for Multi-Processing, in: IISWC '08: Proc. of the IEEE International Symposium on Workload Characterization, Seattle, WA, 2008, pp. 35-46.
-
(2008)
IISWC '08: Proc. of the IEEE International Symposium on Workload Characterization
, pp. 35-46
-
-
Minh, C.C.1
Chung, J.2
Kozyrakis, C.3
Olukotun, K.4
-
27
-
-
84893457109
-
Accelerating time-to-solution for computational science and engineering
-
J. Demmel, J. Dongarra, A. Fox, S. Williams, V. Volkov, K. Yelick, Accelerating time-to-solution for computational science and engineering, SciDAC Review (15).
-
SciDAC Review
, Issue.15
-
-
Demmel, J.1
Dongarra, J.2
Fox, A.3
Williams, S.4
Volkov, V.5
Yelick, K.6
-
28
-
-
77954022347
-
An auto-tuning framework for parallel multicore stencil computations
-
S. Kamil, C. Chan, L. Oliker, J. Shalf, S. Williams, An auto-tuning framework for parallel multicore stencil computations, in: 2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS), 2010, pp. 1-12.
-
(2010)
2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS)
, pp. 1-12
-
-
Kamil, S.1
Chan, C.2
Oliker, L.3
Shalf, J.4
Williams, S.5
-
30
-
-
84896983433
-
An experimental study of global and local search algorithms in empirical performance tuning
-
Argonne National Laboratory
-
P. Balaprakash, S. Wild, P. Hovland, An experimental study of global and local search algorithms in empirical performance tuning, Tech. Rep. ANL/MCS-P1995-0112, Argonne National Laboratory (2012).
-
(2012)
Tech. Rep. ANL/MCS-P1995-0112
-
-
Balaprakash, P.1
Wild, S.2
Hovland, P.3