-
1
-
-
48749141209
-
Adaptive mesh refinement for hyperbolic partial differential equations
-
M. Berger and J. Oliger, "Adaptive mesh refinement for hyperbolic partial differential equations," Journal of Computational Physics, vol. 53, no. 1, pp. 484-512, 1984.
-
(1984)
Journal of Computational Physics
, vol.53
, Issue.1
, pp. 484-512
-
-
Berger, M.1
Oliger, J.2
-
2
-
-
0000148916
-
Salinity-driven thermocline transients in a wind- and thermohaline-forced isopycnic coordinate model of the North Atlantic
-
R. Bleck, C. Rooth, D. Hu, and L. T. Smith, "Salinity-driven thermocline transients in a wind- and thermohaline-forced isopycnic coordinate model of the North Atlantic," Journal of Physical Oceanography, vol. 22, no. 12, pp. 1486-1505, 1992.
-
(1992)
Journal of Physical Oceanography
, vol.22
, Issue.12
, pp. 1486-1505
-
-
Bleck, R.1
Rooth, C.2
Hu, D.3
Smith, L.T.4
-
3
-
-
70350630432
-
A multilevel parallelization framework for high-order stencil computations
-
H. Dursun, K.-I. Nomura, L. Peng, R. Seymour, W. Wang, R. K. Kalia, A. Nakano, and P. Vashishta, "A multilevel parallelization framework for high-order stencil computations," in Euro-Par, 2009, pp. 642-653.
-
(2009)
Euro-Par
, pp. 642-653
-
-
Dursun, H.1
Nomura, K.I.2
Peng, L.3
Seymour, R.4
Wang, W.5
Kalia, R.K.6
Nakano, A.7
Vashishta, P.8
-
4
-
-
0028714453
-
Multiresolution molecular dynamics for realistic materials modeling on parallel computers
-
A. Nakano, P. Vashishta, and R. K. Kalia, "Multiresolution molecular dynamics for realistic materials modeling on parallel computers," Computer Physics Communications, vol. 83, no. 1, pp. 197-214, 1994.
-
(1994)
Computer Physics Communications
, vol.83
, Issue.1
, pp. 197-214
-
-
Nakano, A.1
Vashishta, P.2
Kalia, R.K.3
-
5
-
-
34548752231
-
Towards optimal multi-level tiling for stencil computations
-
L. Renganarayanan, M. Harthikote-Matha, R. Dewri, and S. V. Rajopadhye, "Towards optimal multi-level tiling for stencil computations," in IPDPS, 2007, pp. 1-10.
-
(2007)
IPDPS
, pp. 1-10
-
-
Renganarayanan, L.1
Harthikote-Matha, M.2
Dewri, R.3
Rajopadhye, S.V.4
-
6
-
-
38849206150
-
Divide-and-conquer density functional theory on hierarchical real-space grids: Parallel implementation and applications
-
F. Shimojo, R. K. Kalia, A. Nakano, and P. Vashishta, "Divide-and-conquer density functional theory on hierarchical real-space grids: Parallel implementation and applications," Physical Review B, vol. 77, pp. 1-12, 2008.
-
(2008)
Physical Review B
, vol.77
, pp. 1-12
-
-
Shimojo, F.1
Kalia, R.K.2
Nakano, A.3
Vashishta, P.4
-
8
-
-
49249086142
-
Larrabee: A many-core x86 architecture for visual computing
-
August
-
L. Seiler, D. Carmean, E. Sprangle, T. Forsyth, M. Abrash, P. Dubey, S. Junkins, A. Lake, J. Sugerman, R. Cavin, R. Espasa, E. Grochowski, T. Juan, and P. Hanrahan, "Larrabee: A many-core x86 architecture for visual computing," ACM Trans. Graph., vol. 27, no. 3, pp. 1-15, August 2008.
-
(2008)
ACM Trans. Graph.
, vol.27
, Issue.3
, pp. 1-15
-
-
Seiler, L.1
Carmean, D.2
Sprangle, E.3
Forsyth, T.4
Abrash, M.5
Dubey, P.6
Junkins, S.7
Lake, A.8
Sugerman, J.9
Cavin, R.10
Espasa, R.11
Grochowski, E.12
Juan, T.13
Hanrahan, P.14
-
10
-
-
77953972043
-
-
Ph.D. dissertation, University of California, Berkeley, Dec, [Online]. Available
-
K. Datta, "Auto-tuning stencil codes for cache-based multicore platforms," Ph.D. dissertation, EECS Department, University of California, Berkeley, Dec 2009. [Online]. Available: http://www.eecs.berkeley.edu/Pubs/TechRpts/2009/EECS-2009-177.html.
-
(2009)
Auto-tuning Stencil Codes for Cache-based Multicore Platforms
-
-
Datta, K.1
-
11
-
-
70350771127
-
Stencil computation optimization and auto-tuning on state-of-the-art multicore architectures
-
Piscataway, NJ, USA: IEEE Press
-
K. Datta, M. Murphy, V. Volkov, S. Williams, J. Carter, L. Oliker, D. Patterson, J. Shalf, and K. Yelick, "Stencil computation optimization and auto-tuning on state-of-the-art multicore architectures," in SC'08: Proceedings of the 2008 ACM/IEEE Conference on Supercomputing. Piscataway, NJ, USA: IEEE Press, 2008, pp. 1-12.
-
(2008)
SC'08: Proceedings of the 2008 ACM/IEEE Conference on Supercomputing
, pp. 1-12
-
-
Datta, K.1
Murphy, M.2
Volkov, V.3
Williams, S.4
Carter, J.5
Oliker, L.6
Patterson, D.7
Shalf, J.8
Yelick, K.9
-
12
-
-
33947307610
-
The memory behavior of cache oblivious stencil computations
-
M. Frigo and V. Strumpen, "The memory behavior of cache oblivious stencil computations," J. Supercomput., vol. 39, no. 2, pp. 93-112, 2007.
-
(2007)
J. Supercomput.
, vol.39
, Issue.2
, pp. 93-112
-
-
Frigo, M.1
Strumpen, V.2
-
13
-
-
78650849839
-
Enabling temporal blocking for a lattice Boltzmann flow solver through multicore-aware wavefront parallelization
-
J. Habich, T. Zeiser, G. Hager, and G. Wellein, "Enabling temporal blocking for a lattice Boltzmann flow solver through multicore-aware wavefront parallelization," 21st International Conference on Parallel Computational Fluid Dynamics, pp. 178-182, 2009.
-
(2009)
21st International Conference on Parallel Computational Fluid Dynamics
, pp. 178-182
-
-
Habich, J.1
Zeiser, T.2
Hager, G.3
Wellein, G.4
-
14
-
-
34547500808
-
Implicit and explicit optimizations for stencil computations
-
New York, NY, USA: ACM
-
S. Kamil, K. Datta, S. Williams, L. Oliker, J. Shalf, and K. Yelick, "Implicit and explicit optimizations for stencil computations," in MSPC'06: Proceedings of the 2006 Workshop on Memory System Performance and Correctness. New York, NY, USA: ACM, 2006, pp. 51-60.
-
(2006)
MSPC'06: Proceedings of the 2006 Workshop on Memory System Performance and Correctness
, pp. 51-60
-
-
Kamil, S.1
Datta, K.2
Williams, S.3
Oliker, L.4
Shalf, J.5
Yelick, K.6
-
15
-
-
67650671606
-
3D finite difference computation on GPUs using CUDA
-
New York, NY, USA: ACM
-
P. Micikevicius, "3D finite difference computation on GPUs using CUDA," in GPGPU-2: Proceedings of 2nd Workshop on General Purpose Processing on Graphics Processing Units. New York, NY, USA: ACM, 2009, pp. 79-84.
-
(2009)
GPGPU-2: Proceedings of 2nd Workshop on General Purpose Processing on Graphics Processing Units
, pp. 79-84
-
-
Micikevicius, P.1
-
16
-
-
78649765479
-
Tiling optimizations for 3D scientific computations
-
Washington, DC, USA: IEEE Computer Society
-
G. Rivera and C.-W. Tseng, "Tiling optimizations for 3D scientific computations," in Supercomputing'00: Proceedings of the 2000 ACM/IEEE Conference on Supercomputing (CDROM). Washington, DC, USA: IEEE Computer Society, 2000, p. 32.
-
(2000)
Supercomputing'00: Proceedings of the 2000 ACM/IEEE Conference on Supercomputing (CDROM)
, pp. 32
-
-
Rivera, G.1
Tseng, C.-W.2
-
17
-
-
70449657442
-
Efficient temporal blocking for stencil computations by multicore-aware wavefront parallelization
-
Washington, DC, USA: IEEE Computer Society
-
G. Wellein, G. Hager, T. Zeiser, M. Wittmann, and H. Fehske, "Efficient temporal blocking for stencil computations by multicore-aware wavefront parallelization," in COMPSAC'09: Proceedings of the 2009 33rd Annual IEEE International Computer Software and Applications Conference. Washington, DC, USA: IEEE Computer Society, 2009, pp. 579-586.
-
(2009)
COMPSAC'09: Proceedings of the 2009 33rd Annual IEEE International Computer Software and Applications Conference
, pp. 579-586
-
-
Wellein, G.1
Hager, G.2
Zeiser, T.3
Wittmann, M.4
Fehske, H.5
-
18
-
-
67650998701
-
Optimization of a lattice Boltzmann computation on state-of-the-art multicore platforms
-
S. Williams, J. Carter, L. Oliker, J. Shalf, and K. Yelick, "Optimization of a lattice Boltzmann computation on state-of-the-art multicore platforms," J. Parallel Distrib. Comput., vol. 69, no. 9, pp. 762-777, 2009.
-
(2009)
J. Parallel Distrib. Comput.
, vol.69
, Issue.9
, pp. 762-777
-
-
Williams, S.1
Carter, J.2
Oliker, L.3
Shalf, J.4
Yelick, K.5
-
19
-
-
59749100826
-
Optimization and performance modeling of stencil computations on modern microprocessors
-
K. Datta, S. Kamil, S. Williams, L. Oliker, J. Shalf, and K. Yelick, "Optimization and performance modeling of stencil computations on modern microprocessors," SIAM Rev., vol. 51, no. 1, pp. 129-159, 2009.
-
(2009)
SIAM Rev.
, vol.51
, Issue.1
, pp. 129-159
-
-
Datta, K.1
Kamil, S.2
Williams, S.3
Oliker, L.4
Shalf, J.5
Yelick, K.6
-
20
-
-
1242352441
-
Optimization and profiling of the cache performance of parallel lattice Boltzmann codes in 2D and 3D
-
T. Pohl, M. Kowarschik, J. Wilke, K. Iglberger, and U. Rüde, "Optimization and profiling of the cache performance of parallel lattice Boltzmann codes in 2D and 3D," Parallel Processing Letters, vol. 13, no. 4, pp. 549-560, 2003.
-
(2003)
Parallel Processing Letters
, vol.13
, Issue.4
, pp. 549-560
-
-
Pohl, T.1
Kowarschik, M.2
Wilke, J.3
Iglberger, K.4
Rüde, U.5
-
21
-
-
34250216007
-
Scientific computing kernels on the Cell processor
-
S. Williams, J. Shalf, L. Oliker, S. Kamil, P. Husbands, and K. Yelick, "Scientific computing kernels on the Cell processor," International Journal of Parallel Programming, vol. 35, p. 2007, 2007.
-
(2007)
International Journal of Parallel Programming
, vol.35
, pp. 2007
-
-
Williams, S.1
Shalf, J.2
Oliker, L.3
Kamil, S.4
Husbands, P.5
Yelick, K.6
-
23
-
-
79953269601
-
Efficient multicore-aware parallelization strategies for iterative stencil computations
-
Submitted to, vol. abs/1004.1741
-
J. Treibig, G. Wellein, and G. Hager, "Efficient multicore-aware parallelization strategies for iterative stencil computations," Submitted to Computing Research Repository (CoRR), vol. abs/1004.1741, 2010.
-
(2010)
Computing Research Repository (CoRR)
-
-
Treibig, J.1
Wellein, G.2
Hager, G.3
-
24
-
-
77951435761
-
Accelerating lattice Boltzmann fluid flow simulations using graphics processors
-
Vienna, Austria
-
P. Bailey, J. Myre, S. Walsh, D. Lilja, and M. Saar, "Accelerating lattice Boltzmann fluid flow simulations using graphics processors," in ICPP-2009: 38th International Conference on Parallel Processing, Vienna, Austria, 2009.
-
(2009)
ICPP-2009: 38th International Conference on Parallel Processing
-
-
Bailey, P.1
Myre, J.2
Walsh, S.3
Lilja, D.4
Saar, M.5
-
25
-
-
70449378728
-
Implementing the lattice Boltzmann model on commodity graphics hardware
-
June
-
A. Kaufman, Z. Fan, and K. Petkov, "Implementing the lattice Boltzmann model on commodity graphics hardware," Journal of Statistical Mechanics: Theory and Experiment, vol. 2009, June 2009.
-
(2009)
Journal of Statistical Mechanics: Theory and Experiment
, vol.2009
-
-
Kaufman, A.1
Fan, Z.2
Petkov, K.3
-
26
-
-
77949484883
-
LBM based flow simulation using GPU computing processor
-
mesoscopic Methods in Engineering and Science, International Conferences on Mesoscopic Methods in Engineering and Science. [Online]. Available
-
F. Kuznik, C. Obrecht, G. Rusaouen, and J.-J. Roux, "LBM based flow simulation using GPU computing processor," Computers & Mathematics with Applications, vol. 59, no. 7, pp. 2380-2392, 2010, Mesoscopic Methods in Engineering and Science, International Conferences on Mesoscopic Methods in Engineering and Science. [Online]. Available: http://www.sciencedirect.com/science/article/B6TYJ-4X9D5D0-3/2/9e7676667251dd6bdc7ea63fbc0232a8.
-
(2010)
Computers & Mathematics with Applications
, vol.59
, Issue.7
, pp. 2380-2392
-
-
Kuznik, F.1
Obrecht, C.2
Rusaouen, G.3
Roux, J.-J.4
-
27
-
-
51849160421
-
Parallel lattice Boltzmann flow simulation on emerging multi-core platforms
-
Berlin, Heidelberg: Springer-Verlag
-
L. Peng, K.-I. Nomura, T. Oyakawa, R. K. Kalia, A. Nakano, and P. Vashishta, "Parallel lattice Boltzmann flow simulation on emerging multi-core platforms," in Euro-Par'08: Proceedings of the 14th International Euro-Par Conference on Parallel Processing. Berlin, Heidelberg: Springer-Verlag, 2008, pp. 763-777.
-
(2008)
Euro-Par'08: Proceedings of the 14th International Euro-Par Conference on Parallel Processing
, pp. 763-777
-
-
Peng, L.1
Nomura, K.-I.2
Oyakawa, T.3
Kalia, R.K.4
Nakano, A.5
Vashishta, P.6
-
28
-
-
67349120241
-
Implementation of a lattice-Boltzmann method for numerical fluid mechanics using the NVIDIA CUDA technology
-
E. Riegel, T. Indinger, and N. A. Adams, "Implementation of a lattice-Boltzmann method for numerical fluid mechanics using the NVIDIA CUDA technology," Computer Science - Research and Development, vol. 23, no. 3-4, pp. 241-247, 2009.
-
(2009)
Computer Science - Research and Development
, vol.23
, Issue.3-4
, pp. 241-247
-
-
Riegel, E.1
Indinger, T.2
Adams, N.A.3
-
29
-
-
72149122150
-
Implementation of a lattice Boltzmann kernel using the compute unified device architecture developed by NVIDIA
-
J. Tölke, "Implementation of a lattice Boltzmann kernel using the compute unified device architecture developed by NVIDIA," Comput. Vis. Sci., vol. 13, no. 1, pp. 29-39, 2009.
-
(2009)
Comput. Vis. Sci.
, vol.13
, Issue.1
, pp. 29-39
-
-
Tölke, J.1
-
30
-
-
80052538275
-
When multicore isn't enough: Trends and the future for multi-multicore systems
-
M. Reilly, "When multicore isn't enough: Trends and the future for multi-multicore systems," in HPEC, 2008.
-
(2008)
HPEC
-
-
Reilly, M.1
-
32
-
-
35948991669
-
-
NVIDIA, [Online]. Available
-
NVIDIA, "NVIDIA CUDA TM Programming Guide, Version 3.0," 2010. [Online]. Available: http://download.intel.com/pressroom/kits/32nm/westmere/Intel32nmOverview.pdf.
-
(2010)
NVIDIA CUDA TM Programming Guide, Version 3.0
-
-
-
33
-
-
84976718540
-
Algorithms for scalable synchronization on shared-memory multiprocessors
-
J. M. Mellor-Crummey and M. L. Scott, "Algorithms for scalable synchronization on shared-memory multiprocessors," ACM Trans. Comput. Syst., vol. 9, no. 1, pp. 21-65, 1991.
-
(1991)
ACM Trans. Comput. Syst.
, vol.9
, Issue.1
, pp. 21-65
-
-
Mellor-Crummey, J.M.1
Scott, M.L.2