|Computer Graphics and Gaming|
Smoothed Particle Hydrodynamics
I wrote a fluid simulation based on Smoothed Particle Hydrodynamics using PipelineKit to implement it in OpenCL and DirectCompute. Visualization is by Saif Ali. Here it is running on an AMD/ATI graphics card with 32,768 particles.
The Association for Computing Machinery has convened an annual symposium on video games to study their design, technology, and impact. I had the honor of co-chairing the program for the first two years.
When modern shader programs are compiled to run on Graphics Processing Units (GPUs) they may exceed the resources (registers, instruction count, texture requests) available in the native hardware. This can be resolved by virtualizing the resources over multiple passes of the algorithm. The problem of partitioning a shader optimally into separate passes is the Multi-pass Shader Partitioning Problem. The first approaches to this problem [1,2] depended on the heuristic of minimizing the number of shader passes. This heuristic was correct for PCs of the time but not appropriate to the high-speed multipass architecture we studied. These approaches were also non-scalable, with runtimes of O(n^3) and O(n^4).
Subsequent approaches to this problem [3,4] treated it as an instance of the Job Shop Scheduling Problem. Riffel et al. [3] used a List Scheduling algorithm which is scalable and produces optimal results for the case of linear objective functions. However, this can produce suboptimal results in the case of nonlinear objective functions such as are necessary for modeling the behavior of caches. My work [4] uses a Dynamic Programming algorithm which finds optimal solutions in the presence of nonlinear objective functions. Empirical results on a small but representative test set indicate that performance is scalable for typical cases. This should not be surprising, since the Multi-pass Shader Partitioning Problem is simply a subset of the problem of instruction selection. Since Dynamic Programming is a popular solution for instruction selection, it would be surprising if it failed to work for this problem.
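To make the role of a nonlinear objective concrete, here is a minimal sketch in Python. It is not the algorithm from any of the cited papers. It assumes a heavily simplified model in which the shader is a straight-line instruction list (real shaders are DAGs), each pass is a contiguous run of instructions, and a single resource limit applies per pass. The names `partition_passes`, `resource_use`, `limit`, and `pass_cost` are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple


def partition_passes(
    resource_use: Sequence[int],
    limit: int,
    pass_cost: Callable[[int, int], float],
) -> Tuple[float, List[Tuple[int, int]]]:
    """Split instructions 0..n-1 into contiguous passes of minimum total cost.

    resource_use[i] is a hypothetical, non-negative resource demand of
    instruction i. A pass [start, end) is legal if its summed demand does not
    exceed `limit`. pass_cost(start, end) may be any function, linear or not.
    """
    n = len(resource_use)
    inf = float("inf")
    best = [inf] * (n + 1)  # best[j]: cheapest schedule for the first j instructions
    cut = [-1] * (n + 1)    # cut[j]: start index of the last pass in that schedule
    best[0] = 0.0
    for end in range(1, n + 1):
        used = 0
        # Grow the last pass backwards until it no longer fits in one pass.
        for start in range(end - 1, -1, -1):
            used += resource_use[start]
            if used > limit:
                break
            candidate = best[start] + pass_cost(start, end)
            if candidate < best[end]:
                best[end] = candidate
                cut[end] = start
    if best[n] == inf:
        raise ValueError("some instruction exceeds the per-pass limit")
    # Walk the recorded cuts backwards to recover the passes.
    passes: List[Tuple[int, int]] = []
    end = n
    while end > 0:
        passes.append((cut[end], end))
        end = cut[end]
    passes.reverse()
    return best[n], passes


# Made-up cost models: a fixed per-pass overhead plus a length-dependent term.
linear = lambda s, e: 10.0 + (e - s)
nonlinear = lambda s, e: 1.0 + (e - s) ** 2

few_passes = partition_passes([2, 2, 2, 2], 4, linear)
many_passes = partition_passes([2, 2, 2, 2], 4, nonlinear)
```

With the linear cost the sketch packs instructions into two passes. With the quadratic cost it prefers four single-instruction passes, which illustrates why minimizing the pass count is not always the right objective.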
- [1] Proceedings of Graphics Hardware (2002). Efficient Partitioning of Fragment Shaders for Multipass Rendering on Programmable Graphics Hardware. Eric Chan, Ren Ng, Pradeep Sen, Kekoa Proudfoot, and Pat Hanrahan.
- [2] Proceedings of Graphics Hardware (2004). Efficient Partitioning of Fragment Shaders for Multiple-Output Hardware. Tim Foley, Mike Houston and Pat Hanrahan.
- [3] Proceedings of Graphics Hardware (2004). Mio: Fast Multipass Partitioning via Priority-Based Instruction Scheduling. Andrew Riffel, Aaron E. Lefohn, Kiril Vidimce, Mark Leone, and John D. Owens.
- [4] Proceedings of Graphics Hardware (2005). Optimal Automatic Multi-pass Shader Partitioning by Dynamic Programming. A. Heirich.
Click here for the slides of my presentation.
The Eurographics Association holds this symposium on parallel graphics and visualization in even-numbered years. It is a companion to the IEEE symposium with a similar name that is held in the USA in odd-numbered years. (See, for example, the IEEE PVG2001 symposium.) This year demonstrated the relentless march of visualization clusters and was dominated by applications in science and engineering. I served as program co-chair along with Bruno Raffin and Luis Paul Santos.
I used to work in Research and Development at Sony Computer Entertainment. There I worked on the PlayStation 3 computer entertainment system which contains the revolutionary Cell processor. The Cell is a streaming architecture which in the PS3 is coupled to an NVIDIA GPU. Here is a paper about one piece of applications research that I did along with Louis Bavoil.
- Sony North American Technical Symposium (2006). Deferred Pixel Shading on the PLAYSTATION 3. A. Heirich and L. Bavoil. Awarded "Top paper".
Composited Real-time Soft Shadows
This is a technique for distributing the lighting in a scene across rendering nodes of a graphics cluster. There is no limit to the number of lights and this technique can generate complex soft shadows and other effects. Images by Michael Isard.
- Parallel Computing (2003). Distributed Rendering of Interactive Soft Shadows. M. Isard, M. Shand and A. Heirich. Parallel Computing, vol. 29, no. 3, March 2003, pp. 311-323.
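The idea of distributing lights across nodes rests on the fact that illumination from separate lights is additive. The following is a minimal sketch in Python, not the method of the paper above. It assumes each light's contribution (shadows included) is already available as a grayscale image, that lights are assigned to nodes round-robin, and that the compositor simply sums the partial images. The names `assign_lights`, `add_images`, and `composite` are hypothetical.

```python
from typing import List, Sequence

Image = List[List[float]]  # grayscale intensities, indexed [row][column]


def assign_lights(num_lights: int, num_nodes: int) -> List[List[int]]:
    """Round-robin assignment of light indices to rendering nodes."""
    return [list(range(node, num_lights, num_nodes)) for node in range(num_nodes)]


def add_images(images: Sequence[Image], height: int, width: int) -> Image:
    """Pixel-wise sum of equally sized images (all zeros if none are given)."""
    total = [[0.0] * width for _ in range(height)]
    for image in images:
        for y in range(height):
            for x in range(width):
                total[y][x] += image[y][x]
    return total


def composite(per_light: Sequence[Image], num_nodes: int) -> Image:
    """Each node sums its own lights; the compositor sums the node results."""
    height, width = len(per_light[0]), len(per_light[0][0])
    partials = [
        add_images([per_light[i] for i in lights], height, width)
        for lights in assign_lights(len(per_light), num_nodes)
    ]
    return add_images(partials, height, width)


# Three made-up single-row light images composited over two nodes.
lights = [[[0.25, 0.0]], [[0.5, 0.25]], [[0.0, 0.5]]]
final_image = composite(lights, num_nodes=2)
```

Because the sum is associative, the result in this sketch does not depend on how the lights are divided among the nodes, which is what lets the number of lights grow with the number of nodes.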
Image Based Real-time Soft Shadows
This is a pair of fast image based techniques (one real time, the other near real time) to capture soft shadowing, including self shadowing, for complex geometries. Images by Ravi Ramamoorthi.
- Proceedings of ACM SIGGRAPH (2000). Efficient image-based methods for rendering soft shadows. R. Ramamoorthi, M. Agrawala, L. Moll and A. Heirich.
Parallel Ray Tracing with PC clusters
These papers addressed workload distribution and scaling issues in parallel Monte Carlo ray tracing on large and small clusters. At the time the work achieved breakthrough scalability as a result of dynamic load balancing. The papers were byproducts of my PhD research and are co-authored with my advisor Jim Arvo. This work was supported by the Cornell Program of Computer Graphics and the Caltech Center for Advanced Computing Research.
- The Journal of Supercomputing. A Competitive Analysis of Load Balancing Strategies for Parallel Ray Tracing. A. Heirich and J. Arvo, vol. 12, no. 1/2, pp. 57-68 (1998).
- The International Journal for Advances in Engineering Software (1998). Parallel Radiometric Image Synthesis. A. Heirich and J. Arvo, vol. 29, no. 3-6, July 1998, pp. 283-288.
- Parallel Computing. Scalable Monte Carlo Image Synthesis. A. Heirich and J. Arvo, vol. 23 no. 7, pp. 845-859 (1997).
- Proceedings of the 6th Eurographics Workshop on Programming Paradigms in Computer Graphics, F. Arbab & Ph. Slusallek (ed.) Parallel Rendering with an Actor Model. A. Heirich and J. Arvo, pp. 115-125 (1997).
- Proceedings of Eurographics workshop: Parallel Graphics and Visualization. Scalable Photo-Realistic Rendering of Complex Scenes. A. Heirich and J. Arvo (1996).
|Parallel Computing and Scientific Visualization|
Sepia Scalable Graphics Cluster
(aka HP Scalable Visualization Array)
I architected (with some help from Bob Horst) the world's most scalable compositor-based commodity graphics cluster. Hardware and initial software were developed by Laurent Moll and Mark Shand. Santiago Lombeyda developed a proof-of-concept scalable volume renderer that produced these images of the Rayleigh-Taylor instability. In 2002 this project was awarded $5M by the U.S. Department of Energy ASC program to develop a commercial product for HP. Subsequent developments in GPU technology have rendered this hardware-based approach uneconomical, although it provided higher performance than software-based alternatives.
- IEEE Visualization 2002. Workshop on commodity-based visualization clusters (presentation October 27, 2002). Alpha/Depth Acquisition Through DVI. A. Heirich, M. Shand, E. Oertli, G. Lupton and P. Ezolt.
- IEEE Parallel and Large-Data Visualization and Graphics Symposium (2001). Scalable Interactive Volume Rendering Using Off-the-Shelf Components. S. Lombeyda, L. Moll, M. Shand, D. Breen and A. Heirich.
- IEEE Parallel Visualization and Graphics Symposium (1999). Scalable Distributed Visualization Using Off-the-Shelf Components. A. Heirich and L. Moll.
- IEEE Symposium on Field Programmable Custom Computing Machines (1999). Sepia: Scalable 3D Compositing Using PCI Pamette. L. Moll, A. Heirich, and M. Shand.
|IEEE PVG2001 symposium|
Scaling to New Heights
In 2002 the US National Science Foundation held a workshop on computational science at the Pittsburgh Supercomputing Center. I spoke on the subject of Scalability in Scientific Visualization. You can download the accompanying whitepaper here. (Image by Santiago Lombeyda).
Some early publications
- The International Journal for Foundations of Computer Science (1997). “A Scalable Diffusion Algorithm for Dynamic Mapping and Load Balancing on Networks of Arbitrary Topology.” A. Heirich, vol. 8, no. 3, September 1997, pp. 329-346.
- In proceedings of the International Conference on Parallel Processing (1995). “A Parabolic Load Balancing Method.” A. Heirich and S. Taylor, vol. III, pp. 192-202. Winner of "outstanding paper of the year".
- In Connectionist Models (proceedings of the 1990 summer school), eds. Touretzky, Elman, Sejnowski and Hinton. “Neuronal Signal Strength Is Enhanced By Rhythmic Firing.” A. Heirich and C. Koch, pp. 369-378.
- Proceedings of ACM Computer Science Conference (1986). “UIG: a User Interface Generator.” A. Heirich
- Second ACM Sandbox Symposium on videogames, San Diego, USA, August 2007 (program co-chair).
- First ACM Sandbox Symposium on videogames, Boston, USA, August 2006 (program co-chair).
- Eurographics Sixth Symposium on Parallel Graphics and Visualization, Braga, Portugal, May 2006 (symposium co-chair).
- Eurographics Fifth Symposium on Parallel Graphics and Visualization, Grenoble, France, 2004.
- Eurographics Fourth Workshop on Parallel Graphics and Visualization, Blaubeuren, Germany, 2002.
- IEEE Symposium on Parallel and Large-Data Visualization and Graphics (PVG'01), San Diego, CA, October 22-23 2001 (symposium co-chair).
- Eurographics Third Workshop on Parallel Graphics and Visualization, Girona, Spain, 2000.
- IEEE Symposium on Parallel Visualization and Graphics (PVG'99), San Francisco, California, 1999.
- IEEE Parallel Rendering Symposium, Phoenix, Arizona, 1997.
- NASA fourth national symposium on large-scale analysis and design on high-performance computers and workstations, Williamsburg, Virginia, 1997.
- ISATA dedicated conference on simulation, diagnosis and virtual reality applications in the automotive industry, Florence, Italy, June 1997.
- ISATA dedicated conference on computational fluid dynamics and supercomputing in the automotive industry, Florence, Italy, June 1996.
- Intel Supercomputer User Group, Albuquerque, NM, 1995.