Arastra
|
|
I am excited to be
working at Arastra, a networking
startup founded by Sun co-founder
Andreas
Bechtolsheim and Stanford University professor
David Cheriton.
We are making the next generation of switches for the 10-gigabit
Ethernet market.
|
|

|
Stream Processing |
|
Stream Processors Incorporated
is a startup founded by Stanford
University professor and computer architecture pioneer
Bill Dally. Stream processing is a reaction to the growth of
the memory wall as processors have increasingly outpaced the latency and
bandwidth of memory systems. It optimizes performance by applying
large-scale
stream
parallelism to data-parallel computations.
|
|

The
Association for Computing Machinery has convened an annual symposium on
video games to study their design, technology, and impact. I had
the honor of co-chairing the program for the first two years.
|
|
Multi-pass Shader Partitioning
When modern
shader programs are compiled to run on Graphics Processing Units (GPUs)
they may exceed the resources (registers, instruction count, texture
requests) available in the native hardware. This can be resolved
by virtualizing the resources over multiple passes of the algorithm.
The problem of partitioning a shader optimally into separate passes is
the Multi-pass Shader Partitioning Problem. The first
approaches to this problem [2,3] used a greedy strategy but this
suffered from attraction to local minima and tended to produce unsatisfactory
results. The results could sometimes be improved by using the
heuristic of minimizing the number of shader passes. This
heuristic was useful when compiling for PCs but unfortunately it was not
relevant to the high performance architecture that we studied. These approaches were non-scalable with runtimes of O(n^3)
and O(n^4). Subsequent approaches [4,1] formulated the
problem as an instance of the Job Shop Scheduling Problem.
[4] used a List Scheduling algorithm which is scalable and
produces optimal results for the case of linear objective functions.
However this can produce suboptimal results in the case of non-linear
objective functions such as are necessary for modeling the behavior of
caches. [1] uses a Dynamic Programming algorithm which finds
optimal solutions in the presence of nonlinear objective functions, and
is scalable for typical cases but has the potential for bad worst-case
complexity. It remains to be determined whether worst-case
instances will actually occur when processing DAGs produced by shading
language compilers, and there are strong reasons to believe that they
may not occur.
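
To illustrate why a nonlinear objective can defeat a greedy strategy, here is a minimal sketch. It is not the algorithm of [1], [2,3], or [4]. It assumes the instructions are already in a fixed linear order (real shaders are DAGs), that each instruction has a single hypothetical resource cost, and that the per-pass objective is an invented quadratic function `toy_pass_cost`. All names and numbers are assumptions made for illustration.

```python
import math


def toy_pass_cost(load):
    # Hypothetical nonlinear objective: fixed per-pass overhead plus a
    # quadratic penalty standing in for cache-like effects.
    return load * load + 20


def greedy_partition(costs, limit):
    """Fill each pass until the next instruction would exceed the limit."""
    if any(c > limit for c in costs):
        raise ValueError("an instruction exceeds the per-pass limit")
    passes, current, used = [], [], 0
    for i, c in enumerate(costs):
        if used + c > limit:
            passes.append(current)
            current, used = [], 0
        current.append(i)
        used += c
    if current:
        passes.append(current)
    return passes


def optimal_partition(costs, limit, pass_cost):
    """Dynamic program over split points of a fixed instruction order."""
    if any(c > limit for c in costs):
        raise ValueError("an instruction exceeds the per-pass limit")
    n = len(costs)
    prefix = [0]
    for c in costs:
        prefix.append(prefix[-1] + c)
    best = [0] + [math.inf] * n   # best[j]: cheapest way to cover costs[:j]
    split = [0] * (n + 1)         # split[j]: start index of the last pass
    for j in range(1, n + 1):
        for i in range(j):
            load = prefix[j] - prefix[i]
            if load > limit:
                continue
            total = best[i] + pass_cost(load)
            if total < best[j]:
                best[j], split[j] = total, i
    passes, j = [], n
    while j > 0:                  # walk the split points backwards
        passes.append(list(range(split[j], j)))
        j = split[j]
    passes.reverse()
    return best[n], passes


def total_cost(passes, costs, pass_cost):
    return sum(pass_cost(sum(costs[i] for i in p)) for p in passes)


costs, limit = [3, 3, 3, 3], 9
greedy = greedy_partition(costs, limit)
print(greedy, total_cost(greedy, costs, toy_pass_cost))   # [[0, 1, 2], [3]] 130
print(optimal_partition(costs, limit, toy_pass_cost))     # (112, [[0, 1], [2, 3]])
```

In this toy case the greedy split packs the first pass as full as possible and pays a large quadratic penalty, while the dynamic program balances the two passes. The double loop is O(n^2) for a fixed order; partitioning a general DAG is harder, which is where the worst-case concern mentioned above comes from.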
- [1] Proceedings of Graphics Hardware (2005). Optimal Automatic Multi-pass Shader Partitioning by Dynamic Programming. A. Heirich. Click here for the slides of my presentation.
- [2] Proceedings of Graphics Hardware (2002). Efficient Partitioning of Fragment Shaders for Multipass Rendering on Programmable Graphics Hardware. Eric Chan, Ren Ng, Pradeep Sen, Kekoa Proudfoot, and Pat Hanrahan.
- [3]
- [4] Proceedings of Graphics Hardware (2004). Mio: Fast Multipass Partitioning via Priority-Based Instruction Scheduling. Andrew Riffel, Aaron E. Lefohn, Kiril Vidimce, Mark Leone, and John D. Owens.
|
 |
Eurographics
PGV'06 symposium
The
Eurographics Association holds this symposium on parallel graphics and
visualization in even numbered years. It is a companion to the
IEEE symposium with a similar name that is held in the USA in odd
numbered years. (See for example IEEE PVG2001
symposium). This year demonstrated the relentless march of
visualization clusters and was dominated by applications in science and
engineering. I served as program co-chair along with
Bruno
Raffin and
Luis Paul
Santos. |
 |
PlayStation
I used to work in
Research and Development at Sony Computer Entertainment. There I
worked on the PlayStation 3 computer entertainment system which contains
the revolutionary Cell processor. The Cell is a streaming
architecture which in the PS3 is coupled to an NVIDIA GPU. Here is
a paper about one piece of applications research that I did along with
Louis Bavoil. |
|
|
|
Composited
Real-time Soft Shadows
This is a technique
for distributing the lighting in a scene across rendering nodes of a
graphics cluster. There is no limit to the number of lights and this
technique can generate complex soft shadows and other effects.
Images by
Michael Isard. |
 |
-
Parallel Computing (2003). Distributed Rendering of
Interactive Soft Shadows. M. Isard, M. Shand and A. Heirich.
Parallel Computing, vol. 29, no. 3, March 2003, pp. 311-323.
|
|
Image Based
Real-time Soft Shadows
This is a pair of fast
image based
techniques (one real time, the other near real time) to capture soft
shadowing, including self shadowing, for complex geometries. Images by
Ravi
Ramamoorthi. |
 |
- Proceedings of
ACM SIGGRAPH (2000). Efficient image-based
methods for rendering soft shadows. R. Ramamoorthi, M.
Agrawala, L. Moll and A. Heirich.
|
|
|
Parallel
Ray Tracing with PC clusters
These papers
addressed workload distribution and scaling issues in parallel Monte
Carlo ray
tracing on large and small clusters. At the time the work achieved
breakthrough scalability as a result of dynamic load balancing. They were byproducts of my PhD research and
are co-authored with my advisor
Jim Arvo.
This work was supported by the
Cornell
program of computer graphics and the
Caltech Center
for Advanced Computing Research. |
 |
-
The Journal of Supercomputing. A Competitive Analysis of
Load Balancing Strategies for Parallel Ray Tracing. A.
Heirich and J. Arvo, vol. 12, no. 1/2, pp. 57-68 (1998).
-
The International Journal for Advances in Engineering Software
(1998). Parallel Radiometric Image
Synthesis. A. Heirich and J. Arvo, vol. 29, no. 3-6, July
1998, pp. 283-288.
-
Parallel Computing.
Scalable Monte Carlo Image Synthesis. A. Heirich and J.
Arvo, vol. 23 no. 7, pp. 845-859 (1997).
-
Proceedings of
the 6th
Eurographics Workshop
on Programming Paradigms in Computer Graphics, F. Arbab & Ph.
Slusallek (ed.) Parallel Rendering with an Actor Model.
A. Heirich and J. Arvo, pp. 115-125 (1997).
-
Proceedings of
Eurographics workshop:
Parallel Graphics and Visualization. Scalable Photo-Realistic
Rendering of Complex Scenes. A. Heirich and J. Arvo (1996).
|
|
Sepia Scalable Graphics
Cluster
I
architected (with some help from Bob Horst) the world's most scalable
compositor-based commodity graphics cluster. Hardware and initial
software were developed by
Laurent Moll
and
Mark Shand.
Santiago Lombeyda developed a proof-of-concept scalable volume
renderer that produced
these images of the
Rayleigh-Taylor instability. In 2002
this project was
awarded
$5M by the
U.S.
Department of Energy ASC program to develop a commercial
product for HP. Subsequent developments in GPU technology have
rendered this hardware-based approach uneconomical; however, it provided
higher performance than software-based alternatives. |
- IEEE
Visualization 2002. Workshop on
commodity-based visualization clusters (presentation October 27,
2002). Alpha/Depth Acquisition Through DVI. A.
Heirich, M. Shand, E. Oertli, G. Lupton and P. Ezolt.
-
IEEE Parallel and Large-Data Visualization and Graphics Symposium
(2001). Scalable Interactive Volume Rendering Using
Off-the-Shelf Components. S. Lombeyda, L. Moll, M. Shand,
D. Breen and A. Heirich
- IEEE
Parallel Visualization and Graphics Symposium
(1999). Scalable Distributed Visualization Using
Off-the-Shelf Components. A. Heirich and L. Moll.
- IEEE
Symposium on Field Programmable Custom Computing Machines
(1999). Sepia: Scalable 3D Compositing Using PCI Pamette.
L. Moll, A. Heirich, and M. Shand.
|
 |
IEEE PVG2001 symposium
In
2001 I co-chaired (with
David Breen and Anton Koning) this biennial IEEE
symposium on parallel visualization and graphics. The proceedings
have 17 very good papers with a keynote by Celera Genomics. The
papers lean toward US DOE terascale visualization and represent a lively
interacting community of researchers. The proceedings include a paper that
describes the Sepia graphics cluster applied to scalable volume
rendering of regular grids. |
Some early publications |
-
The International Journal for Foundations of
Computer Science (1997). “A
Scalable Diffusion Algorithm for Dynamic Mapping and Load Balancing on
Networks of Arbitrary Topology.” A. Heirich, vol. 8, no. 3,
September 1997, pp. 329-346.
-
In proceedings of the International Conference
on Parallel Processing (1995). “A
Parabolic Load Balancing Method.” A. Heirich and S. Taylor, vol.
III, pp. 192-202. Winner of
"outstanding paper of the year".
-
Proceedings of ACM Computer Science Conference
(1986). “UIG: a User Interface
Generator.” A. Heirich
|
Here is a current list of my patents granted by the United States patent office.
 |
Second ACM Sandbox
Symposium on videogames, San Diego, USA, August 2007 (program co-chair). |
 |
First ACM Sandbox Symposium
on videogames, Boston, USA, August 2006 (program co-chair). |
 |
Eurographics Sixth
Symposium on Parallel Graphics and Visualization, Braga, Portugal, May 2006
(symposium co-chair). |
 |
Eurographics Fifth
Symposium on Parallel Graphics and Visualization, Grenoble, France,
2004. |
 |
,
Blaubeuren, Germany, 2002.
|
 |
(PVG'01), San Diego, CA, October 22-23 2001 (symposium co-chair).
|
 |
,
Girona, Spain, 2000.
|
 |
(PVG’99), San Francisco, California, 1999.
|
 |
IEEE Parallel Rendering Symposium, Phoenix, Arizona,
1997.
|
 |
NASA fourth national symposium on large-scale
analysis and design on high-performance computers and workstations,
Williamsburg, Virginia, 1997.
|
 |
ISATA dedicated conference on simulation, diagnosis
and virtual reality applications in the automotive industry, Florence,
Italy, June 1997.
|
 |
ISATA dedicated conference on computational fluid
dynamics and supercomputing in the automotive industry, Florence, Italy,
June 1996.
|
 |
Intel Supercomputer User Group, Albuquerque, NM,
1995.
|
|