Biography

Native of Berkeley, California, raised in Ann Arbor, Michigan. Educated at California Institute of Technology (PhD, MS), University of California, San Diego (MS), and University of Michigan (BA with Honors in Computer Science). 

My Erdos number is four (via Christof Koch). I was born in Berkeley, California during the Free Speech Movement, and grew up in Ann Arbor, Michigan where I attended high school and college. I spent many years in graduate education where I had remarkable experiences with some amazing people. In between I worked in the computer industry where I have held senior positions at the largest and smallest of companies and worked as a consultant. 

I am a Graduate Fellow of the U.S. National Science Foundation and a reviewer for Mathematical Reviews. I was lucky enough to first encounter computers in 1973 at the legendary People’s Computer Company in Menlo Park, California. Although I have often been interested in other things I have never gotten away from computers. I’m credited here on one of the earliest online multiplayer games, SPACE for the time shared HP 2000 series minicomputer (1975). When I was 15 I appeared as a guest magician on the Bozo the Clown show on CKLW-TV in Windsor, Ontario, Canada. I participated in developing an early commercial scientific supercomputer (1980), a research machine with thousands of integrated processors (1993), and I architected the worlds most scalable and powerful graphics and visualization cluster which became a product for HP (2002). 

Today I work on artificial inrelligence and on exascale computers with hundreds of thousands of cores. I’ve been a Linux user since 1994.  

You may reach me by email: heirich at alumni dot caltech dot edu

 

 

 

A brief conversation with Donald Knuth

 

March 26, 2024 

Alan Heirich 2024 Touraine Lane Half Moon Bay, CA 94019 

Dear Professor Knuth, 

Thank you for your kindness in responding to my letters in the past. Some of them were a bit silly. This letter is not silly and I would appreciate your feedback. FYI my PhD is from Caltech in computer science. Professor Joel Franklin told me that I reminded him of you. I believe the reason was my spiritual focus. 

I am having a discussion with my friend Professor Alex Aiken of Stanford CS on the subject of bifurcation theory and automatic programming. My contention is that bifurcation theory (the qualitative theory of differential equations) yields, in principle, a mechanism for automatic programming based on specification of fixed points of an algorithm. It has become popular with the advent of deep learning to regard computations as gradient descent procedures in an energy landscape. This paradigm can be considered a sort of Universal Turing Machine, with each energy landscape corresponding to a different Turing Machine. The correctness of a Turning Machine depends on whether the fixed points meet the requirements of the problem. 

In the case of deep learning this is the question of whether a neural network makes a correct classification. But this is a very general idea, and we can consider the values of the fixed points to directly represent problem solutions. In algorithm development we typically use human intuition to formulate a procedure, prove that it converges to a fixed point, and then prove that the fixed point satisfies the requirements of the problem. Wouldn’t it be great if we could skip the first two parts of this process and only specify the properties of the fixed points? 

I believe that bifurcation theory in principle provides the knowledge needed for this. Bifurcation theory is the study of how small parameter changes in a system lead to topological changes in the stable points of the system. In term of UTMs and energy landscapes this is the study of how new fixed points appear in an energy landscape. With the theory of codimension-1 bifurcations of equilibria in ordinary differential equations it is possible to precisely engineer a desired energy landscape. This makes it possible, in principle, to engineer a TM that solves a desired problem by formulating a definition of the fixed points of the problem.

 Bifurcation theory has been automated by such people as Professor Herbert Keller with whom I worked at Caltech on automated bifurcation analysis of Navier-Stokes equations. This is not restricted to natural systems, bifurcation analysis can be applied to artificial systems as constructed by computer scientists. Bifurcation theory is incomplete and is not a popular research topic today. For example it is currently unknown whether a codimension-N bifurcation is equivalent to a sequence of codimention-1 bifurcations. This question is important if we want to automate the process of searching through bifurcations on a digital computer. 

Thank you for listening to these ideas. If you feel they have a kernel of merit could you please: (1) drop a note of support to Alex Aiken (2) tell me if you know anyone among your vast contacts who might be interested in advising me in research in this area? I am a computer scientist who has spent a lot of time working in applied mathematics. I am now semi-retired. 

Sincerely yours, 

Alan Heirich 


Hi Alan, 

Thanks for your flattering letter. But I'm afraid you have a hyperinflated impression of my knowledge/expertise. The fact is that I have just about zero intuition for topology, differential equations, and suchlike. Including neural networks. 

The people who could advise you might be friends of friends of mine; but I don't have any idea who those missing links might be. Thus I can't help you. 

But at least I can resonate with your comment about currently unfashionable parts of mathematics being potentially much more important than the currently trendy stuff. 

Best wishes! -- 

Don 

This was not quite what I was hoping for. But as with any communication from one of the Gods of computer science we have to look for a teaching. In this case what I learn is that certain mathematical subjects are not of interest to the Gods of computer science. 

This could mean that those topics are not relevant to computer science. But I prefer to believe that these topics are rich sources of opportunity for fundamental discovery and are under appreciated. 

This reminds me of an experience I had with Edsger Dijkstra. I completed an MS thesis at Caltech that used differential equations to solve classical optimization problems in parallel computing. The algorithm used diffusion and due to a prior paper by Dijkstra (“Termination detection of diffusing computations”) I thought he might be interested. I wrote to him and received the reply that he did not understand the necessary mathematics to understand my paper. 

So: (topology, bifurcation theory, differential equations) irrelevant to computer science? Or a bold new frontier?  I think the latter,

 

Some challenges for modern AI

 Modern AI in the form of convolutional neural networks has been very successful at learning maps between high dimensional spaces. Maps can have impressive dynamics, such as are exhibited by real time robotic controllers trained by reinforcement learning.   Note that these remarks may be somewhat dated in the era of generative AI.

Our brains and conscious experience have dynamics too but these are not captured by the current generation of machine learning models. One of the best accounts of these internal dynamics comes from the writings of William James. William James was an influential 19th century philosopher and psychologist. He wrote without the benefit of modern neuroscience but his introspective descriptions still outpace both neuroscientific understanding and AI models. 

Here then are some characteristics of our human consciousness taken from “Psychology: The Briefer Course” first published in 1892. 

 

1. Associative Memory The ability to retrieve memories based on similarity or arbitrary association is central to our flow of ideas. This is outside the scope of current AI models. Here is James’ lucid description of his own internal flow of associations (from Association: Partial Recall): “...after looking at my clock just now (1879), I found myself thinking of a recent resolution in the Senate about our legal-tender notes. The clock called up the image of the man who had repaired its gong. He suggested the jeweler’s shop where I had last seen him; that shop, some shirt-studs which I had bought there; they, the value of gold and its recent decline; the latter, the equal value of greenbacks, and this, naturally, the question of how long they were to last, and of the Bayard proposition.” 

This is something that you can observe in yourself, with practice. Allow your old thoughts to linger as your attention turns to new ones. With time you will be able to recreate your own chain of associations. You will be surprised at what you learn about yourself. 

2. Attention  (Note this was written before trasnsformers) It’s clear that living beings use an attentional mechanism to narrow the scope of their mental processes. Convolutional networks have no such concept. “It is a feeling which everyone knows, but which most people would call quite indescribable. We get it in the sensorial sphere whenever we seek to catch an impression of extreme faintness, be it of sight, hearing, taste, smell, or touch; we get it whenever we seek to discriminate a sensation merged in a mass of others that are similar; we get it whenever we resist the attractions of more potent stimuli and keep our mind occupied with some object that is naturally unimpressive. We get it in the intellectual sphere under exactly similar conditions: as when we strive to sharpen and make distinct an idea which we but vaguely seem to have; or painfully discriminate a shade of meaning from its similars; or resolutely hold fast to a thought so discordant with our impulses that, if left unaided, it would quickly yield place to images of an exciting and impassioned kind.” (from Attention: Voluntary Attention) 

3. Ego There must be some sense of “I”. Our minds are constantly evaluating, unconsciously and consciously, how our environment affects us. “Each mind, to begin with, must have a certain minimum of selfishness in the shape of instincts of bodily self-seeking in order to exist. The minimum must be there as a basis for all farther conscious acts, whether of self-negation or of a selfishness more subtle still. All minds must have come, by way of survival of the fittest, if by no directer path, to take an intense interest in the bodies to which they are yoked, altogether apart from any interest in the pure Ego which they also possess.” (From The Self: Teleological Uses of Self Interest) 

4. The concept of free will James believes our thought process is a result of cerebral activity but he identifies a special capability of choosing to apply attention (from Attention: Attention and Free Will): “No object can catch our attention except by the neural machinery. But the amount of attention which an object receives after it has caught our mental eye is another question. It often takes an effort to keep the mind upon it. We feel that we can make more or less of the effort as we choose... it will deepen and prolong the stay in consciousness of innumerable ideas which else would fade more quickly away.” Despite this mechanistic understanding James felt the question of free will was undecidable by science.

 

Isaac Newton

Data aequatione quotcunque fluentes quantitae involvente fluxiones invenire et vice versa. (Isaac Newton) 

In contemporary mathematical language this means “It is useful to solve differential equations” (Arnold & Levi [1]). Sir Isaac Newton considered this his most fundamental discovery, one so explosive that he published it only in the form of an anagram. 

([1] Geometrical Methods in the Theory of Ordinary Differential Equations (1983) by V.I. Arnold, Mark Levi) 

In the 21st century we routinely solve differential and integral equations on computers for problems in science, engineering and entertainment. Newton thought this concept was explosive (yet he didn’t mind publishing on theology and alchemy). Why? Maybe it was the potential power that comes from being able to assemble a working system from a lot of small pieces. Today we know that molecules assemble into DNA which gives rise to life, and ultimately to consciousness and intelligence. Which gives rise to culture, technology and systematized knowledge, which in turn influence DNA by affecting natural selection, leading to a creative cycle. (Or perhaps a destructive cycle, the story is still being written).  

 

 

Projects and patents 

 

smoothedparticlehydrodynamics2

Smoothed Particle Hydrodynamics Programmed in OpenCL running on an AMD graphics card with 32K particles. Visualization by Saif Ali. 

Method and System for Feasibility-based Operation of an Autonomous Agent 5 Inventors: Alan Heirich, May Mobility Sajan Patel, May Mobility Mitesh Agrawal, May Mobility Ahmed El Shaarany, May Mobility Edwin Olson, May Mobility 

The method can include: receiving a set of inputs; determining a set of policies based on the set of inputs; determining a set of scores associated with the set of environmental policies; and evaluating the set of policies. Additionally or alternatively, the method can include operating the ego agent according to a selected policy and/or any other processes. The method functions to facilitate scoring of policies based on 'feasibility' for agents in an environment. Additionally or alternatively, the method can function to facilitate autonomous operation of a vehicle (e.g. based on policy-feasibility of agents in the environment). Additionally or alternatively, the method can function to facilitate intention estimation for agents in an environment. 

Render graph compilation method and use thereof for low-latency execution. United states 9,223,551 Issued December 29, 2015 

A graph is compiled that defines a data flow from input(s) to output(s) for images. The data flow includes one or more filters to be applied to the images. 

Compiling the graph includes forming an assemblage of kernel invocations for the data flow and forming a mapping between kernel invocations in code for the one or more filters and the assemblage of kernel invocations. For multiple ones of a number of frames of images, code in the one or more filters is executed, data is passed into the assemblage to indicate which execution path in the assemblage should be chosen from among a plurality of possible execution paths for one of the filters, wherein the data is determined using at least the mapping and the executing code, and kernel invocations in the indicated execution path are executed. 

Methods, apparatus, and computer program products are disclosed. 

Fast automatic image filter tuning using machine learning algorithm United States 9,524,541 Issued December 20, 2016 

Disclosed herein is a method. A set of pre-adjusted images including source images of the pre-adjusted images is provided. The pre-adjusted images include image filter parameters. Histogram information for each of the source images is computed. A learning algorithm is applied to the set. The learning algorithm is configured to map the histogram information to the image filter parameters. A... 

Render Tree Caching United States US 8,970,613 Issued March 3, 2015 

GPU fragment programs can be used to render images in a computer system. These fragment programs are generated from render trees, which specify one or more filters or functions to be applied to an input image to render an output image. It is not uncommon for successive frames to require application of substantially the same filters. Therefore, rather than regenerate and recompile new fragment... 

Cone-culled soft shadows United States US 12/793,357 Issued July 17, 2012 

Soft shadows in computer graphics images are created by rendering the scene from the camera viewpoint and at least one light viewpoint. The positions of scene fragments and light fragments in the scene are stored. For each scene fragment, a frustum is defined between the position of the scene fragment and the light source. Light fragments are evaluated with respect to the frustum to select light... 

 

Streaming programming generator United States US 12/911,952 Issued April 26, 2012 

A device receives input that includes definitions of components of a computational pipeline, where the components include one or more buffers, one or more kernels, and one or more stages within a control graph. The device generates, based on the input, kernel signatures for a graphics processor, where the kernel signatures compile into an executable streaming program for the computational... 

Interactive debugging and monitoring of shader programs executing on a graphics processor United States US 12/467,796 Issued March 15, 2011 

A development application leverages the programmability of shader execution units in the graphics processing subsystem to make graphics processing subsystem state data accessible to applications executed outside the graphics processing subsystem. The development application modifies shaders to include state output instructions adapted to direct a shader execution unit to copy graphics processing... 

Cone-culled soft shadows United States US 11/418,415 Issued July 13, 2010 

Soft shadows in computer graphics images are created by rendering the scene from the camera viewpoint and at least one light viewpoint. The positions of scene fragments and light fragments in the scene are stored. For each scene fragment, a frustum is defined between the position of the scene fragment and the light source. Light fragments are evaluated with respect to the frustum to select light... 

 

Distributed rendering of interactive soft shadows United States US 10/418,502 Issued May 11, 2010 

The disclosed embodiments relate to a rendering cluster that renders an image of a scene object. The rendering cluster may comprise an illumination node that produces illumination output based on lighting properties of the scene object and a material node that produces material output based on material properties of the scene object. The illumination output is combined with the material output to... 

 

Technique for processing a computer program United States PCT/US2006/027361 Issued May 2, 2007 

The present invention is directed to a method for processing, in a computer system, a computer program having a plurality of operations. The method features calling a dynamic programming routine to generate a schedule for executing a subgroup of the plurality of operations by modeling operations of a computational processor associated with the computer system to minimize a computational cost of... 

Statistical rendering acceleration United States PCT/US2006/018145 Issued January 17, 2007 

Different rendering techniques are selected for portions of a scene based on statistical estimates of the portions' rendering costs. Geometry representing a scene, including static geometry (105) and dynamic geometry (110), is supplied to bounding volume hierarchy generator (120). Bounding volume hierarchy generator (120) generates a bounding volume hierarchy representing the approximate... 

 

Methods and systems for graphically navigating within a debugger program United States PCT/US2005/031701 Issued March 23, 2006 

A method for graphically navigating within a debugger program that enables examination of processing by components of a computing system is provided. In this method, a display of the components is generated that enables navigation to debugging windows of the components. The display and navigation methods may reflect a taxonomic organization of the components with hierarchical and peer... 

Parallel pipelined merge engines United States US 09/264,347 Issued June 22, 2004 

An image generator is organized into a plurality of rendering engines, each of which renders an image of a part scene and provides the part image to a merge engine associated with that rendering engine. The image is a part image in that it usually contains less than all of the objects in the image to be rendered. The merge engine merges the part image from its associated rendering engine with the... 

 

First-order difference compression for interleaved image data in a high-speed image compositor United States US 09/264,348 Issued February 4, 2003 

An encoder accepts an N byte set of values for each of a plurality of image components, with N being greater than one and, for each N byte set of values, identifies a compressed symbol length, K, wherein K is the smallest integer such that the difference between any two adjacent bytes is expressible in K bits or less, outputs an indication of K and outputs a K bit difference between the byte and... 

 

Using irradiance textures for photorealistic image generation United States US 09/264,369 Issued March 19, 2002 An image generator computes a set of light sample points, wherein each light sample point is a point on a light source from a geometric model, and an irradiance image is computed for each light sample point, wherein an irradiance image is a view-dependent image taken with the light sample point being the view point and the light source for the irradiance image. From the irradiance images, the...  

 

Published papers

 Multinode Multi-GPU Two-Electron Integrals: Code Generation Using the Regent Language K.G. Johnson, S. Mirchandaney, E. Hoag, A. Heirich, A. Aiken and T.J. Martinez J. Chem. Theory Comput. 2022 

The computation of two-electron repulsion integrals (ERIs) is often the most expensive step of integral-direct self-consistent field methods. Formally it scales as O(N4), where N is the number of Gaussian basis functions used to represent the molecular wave function. In practice, this scaling can be reduced to O(N2) or less by neglecting small integrals with screening methods. The contributions of the ERIs to the Fock matrix are of Coulomb (J) and exchange (K) type and require separate algorithms to compute matrix elements efficiently. We previously implemented highly efficient GPU-accelerated J-matrix and K-matrix algorithms in the electronic structure code TeraChem. Although these implementations supported the use of multiple GPUs on a node, they did not support the use of multiple nodes. This presents a key bottleneck to cutting-edge ab initio simulations of large systems, e.g., excited state dynamics of photoactive proteins. We present our implementation of multinode multi-GPU J- and K-matrix algorithms in TeraChem using the Regent programming language. Regent directly supports distributed computation in a task-based model and can generate code for a variety of architectures, including NVIDIA GPUs. We demonstrate multinode scaling up to 45 GPUs (3 nodes) and benchmark against hand-coded TeraChem integral code. We also outline our metaprogrammed Regent implementation, which enables flexible code generation for integrals of different angular momenta. In situ Visualization with Task-based Parallelism (In Proceedings of In Situ Infrastructures for Enabling Extreme-Scale Analysis and Visualization, Denver, CO, USA, November 12–17, 2017 (ISAV’17), This short paper describes an experimental prototype of in situ visualization in a task-based parallel programming framework. A set of reusable visualization tasks were composed with an existing simulation. The visualization tasks include a local OpenGL renderer, a parallel image compositor, and a display task. These tasks were added to an existing fluid-particle-radiation simulation and weak scaling tests were run on up to 512 nodes of the Piz Daint supercomputer. Benchmarks showed that the visualization components scaled and did not reduce the simulation throughput. The compositor latency increased logarithmically with increasing node count. 

Multipass Shader Partitioning by Dynamic Programming (Graphics Hardware 2005) 

The multi pass shader partitioning problem was defined in a pair of papers by Chan et al. Unfortunately these solutions were not scalable and were intended to address large scale problems. The first successful solution to this problem was presented by Riffel et al. The current paper presents another solution based on dynamic programming and argues that it is scalable. After publication of the paper errors were found in the data presented to support this argument. Nonetheless the algorithm is semi-scalable as claimed. Dynamic Programming is commonly used for instruction selection and this problem is an instance of instruction selection. If scalability failed on this problem it would also fail on instruction selection.

[1] Proceedings of Graphics Hardware (2002). Efficient Partitioning of Fragment Shaders for Multipass Rendering on Programmable Graphics Hardware. Eric Chan, Ren Ng, Pradeep Sen, Kekoa Proudfoot, and Pat Hanrahan.

[2] Proceedings of Graphics Hardware (2004). Efficient Partitioning of Fragment Shaders for Multiple-Output Hardware. Tim Foley, Mike Houston and Pat Hanrahan.

[3] Proceedings of Graphics Hardware (2004). Mio: Fast Multipass Partitioning via Priority-Based Instruction Scheduling. Andrew Riffel, Aaron E. Lefohn, Kiril Vidimce, Mark Leone, and John D. Owens.

[4] Proceedings of Graphics Hardware (2005). 

Optimal Automatic Multi-pass Shader Partitioning by Dynamic Programming. A. Heirich.  

Competitive Analysis of Load Balancing Strategies for Parallel Ray Tracing, with James Arvo. The Journal of Supercomputing.

This paper gives a fundamental formula for predicting workload imbalance as a result of static load balancing strategies like tiling and randomization in parallel ray tracing. The results can equally be applied to any parallel algorithm for graphics or image processing. It predicts that these strategies will fail at large numbers of computers, and for NTSC resolution images this was true at 128-way parallelism and above. The solution to this failure is dynamic load balancing such as the diffusion strategy in (papers below).  

The International Journal for Foundations of Computer Science (1997). A Scalable Diffusion Algorithm for Dynamic Mapping and Load Balancing on Networks of Arbitrary Topology. A. Heirich, vol. 8, no. 3, September 1997, pp. 329-346. 

In proceedings of the International Conference on Parallel Processing (1995). A Parabolic Load Balancing Method. A. Heirich and S. Taylor, vol. III, pp. 192-202. 

Winner of "outstanding paper of the year". With special thanks to Andrew Conley. 

 Analysis of scalable algorithms for dynamic load balancing and mapping with application to photo-realistic rendering. (Dissertation)  

Interactive Volume Rendering Scalable Graphics Cluster Supercomputer Architecture 

These papers describe work at HP/Compaq to build a commodity based scalable graphics architecture. The results include world record setting performance and scalability on volume rendering and a commercial product the HP Scalable Visualization Array. A similar project was developed by Stoll et al called Lightning-2. Although it was intended as a sort-last architecture Lightning-2 was not scalable and could not support volume rendering or applications that require ordered blending.

 

 Parallel Computing (2003). Distributed Rendering of Interactive Soft Shadows. M. Isard, M. Shand and A. Heirich. Parallel Computing, vol. 29, no. 3, March 2003, pp. 311-323.  

IEEE Visualization 2002. Workshop on commodity-based visualization clusters (presentation October 27, 2002). Alpha/Depth Acquisition Through DVI. A. Heirich, M. Shand, E. Oertli, G. Lupton and P. Ezolt. IEEE Parallel and Large-Data Visualization and Graphics Symposium (2001). 

Scalable Interactive Volume Rendering Using Off-the-Shelf Components. S. Lombeyda, L. Moll, M. Shand, D. Breen and A. Heirich. IEEE Parallel Visualization and Graphics Symposium (1999). 

Scalable Distributed Visualization Using Off-the-Shelf Components. A. Heirich and L. Moll.  

IEEE Symposium on Field Programmable Custom Computing Machines (1999). Sepia: Scalable 3D Compositing Using PCI Pamette. L. Moll, A. Heirich, and M. Shand.

Scalability in large-data scientific visualization. A. Heirich. Pittsburgh Supercomputing Center (2002).

Interactive soft shadows  

Maneesh Agrawala, Ravi Ramamoorthi and Laurent Moll. Proceedings of ACM SIGGRAPH (2000). Efficient image-based methods for rendering soft shadows. M. Agrawala, R. Ramamoorthi, L. Moll and A. Heirich. 

Neuronal Simulations

In Connectionist Models: Proceedings of the 1990 Summer School (Eds. Touretzky, Elman, Sejnowski & Hinton, pp. 369-374). Neuronal Signal Strength Is Enhanced By Rhythmic Firing. A. Heirich & C. Koch.

 

Editorship

 

 

 

A fast Fielder vector method via spectral partitioning

 

Alan Heirich, May 7, 2021 

In the Perception stage of the May autonomy pipeline we need to partition an affinity graph of LIDAR points. Points with high affinity should be grouped together into individual objects. Points with low affinity should be separated into different objects. 

We do this by recursively partitioning the affinity graph and computing a minimum cut at each step of the recursion. You can read about minimum graph cuts here: wikipedia Minimum cut One method of computing minimum cuts is to compute the Fielder vector of the graph. You can read about Fiedler vectors here: wikipedia Algebraic connectivity. 

In brief, the Fielder vector of a graph G is a particular eigenvector of L(G) where L(G) is the Laplacian matrix of G. The Laplacian matrix is a simple transformation of the adjacency matrix. You can read about Laplacian matrices here: wikipedia Laplacian matrix Specifically the Fiedler vector is the eigenvector that corresponds to the smallest (in magnitude) nonzero eigenvalue of L(G). 

A standard method of computing eigenpairs is the Lanczos algorithm (pronounced Lants-sosh) which you can read about here: wikipedia Lanczos algorithm. We tried using the Lanczos algorithm implemented in the EVSL library (https://www-users.cs.umn.edu/~saad/software/EVSL/index.html). On our matrices of size 5000x5000 partitioning one step using CUDA took many milliseconds. This is much too slow for real time applications. 

Profiling showed that most of the time was spent in one CUDA kernel running for 3 microseconds per instance. We concluded that the EVSL library is intended for much larger matrices where the execution time can effectively amortize the overheads of CUDA kernel invocation. We also believe that other implementations of the Lanczos algorithm would have similar performance issues so a different library would not be likely to give us a real-time method. 

The solution we found was to implement a fast Fiedler vector method based mainly on Power Iteration. You can read about Power Iteration here: wikipedia Power iteration. Power Iteration is based on matrix-vector multiplication which executes very efficiently in CUDA. 

In addition to Power Iteration we need Shift-Invert and Matrix Deflation. You can read about Shift-Invert here: (http://www.netlib.org/utk/people/JackDongarra/etemplates/node286.html). You can read about Matrix Deflation here: (http://www.chem.helsinki.fi/~manninen/psc/materials/lectures/lecture-8.pdf). 

We start by constructing the Laplacian Matrix of the affinity graph L(G). L(G) is symmetric-positive-definite which means its eigenvalue spectrum ranges from 0 to the l2-norm of L(G). See (https://math.stackexchange.com/questions/603375/norm-of-a-symmetric-matrix-equals-spectral-radius). Call this value L2. 

We want to find the smallest (in magnitude) nonzero eigenpair of L(G). The Power Iteration finds the largest, not the smallest, eigenpairs. At first blush it seems that the Power Iteration cannot solve our problem. 

Our solution will be to construct a series of matrices L, L', and L'' which each has a different eigenvalue spectrum. We will apply the Power Iteration to L''(G) and correct the result to get the desired eigenpair of L(G). 

To construct L'(G) we shift the spectrum of L(G) to the left by L2 so that the largest eigenvalue of L(G) becomes zero and the smallest (zero) eigenvalue of L(G) becomes -L2. This is accomplished by Shift-Invert, which amounts to subtracting L2 from the diagonal of L(G) to yield a new matrix L'(G). The eigenvalue spectrum of L'(G) is from -L2 to 0. 

The Fiedler vector requires computing the smallest (in magnitude) nonzero eigenpair of L(G). The smallest eigenvalue of L(G) is zero whenever G is a connected graph. This has the corresponding eigenvector of 1^n. We need to remove the corresponding eigenpair from L'(G) in order to use a Power Iteration. In L'(G) this eigenpair has eigenvalue -L2. 

We use Matrix Deflation to eliminate this eigenpair from L'(G). Specifically we compute L''(G) = L'(G) - (lambda) qT q where lambda = -L2 and q is a vector of 1s. This yields a new matrix L''(G). We can now use Power Iteration to converge to the extremal eigenpair of L''(G) which corresponds to the smallest nonzero eigenpair of L(G). 

Since eigenvalues of L''(G) are negative we have to perform two iterations to get a positive result. We have to add L2 to the resulting eigenvalue of L''(G) to get the corresponding eigenvalue of L(G). 

We experimented with sparse matrix libraries for both the CPU (Eigen) and GPU (cuSparse). We found that matrix-vector multiplications with Eigen took 16 microseconds and with cuSparse took 3 microseconds. Since 3 microseconds is roughly the CUDA kernel invocation overhead we conclude that our problem is taking almost no compute time in cuSparse. 

The signs of the eigenvector components converged within two double-iterations. Since we only care about the signs of the eigenvector components for matrix partitioning, and not their magnitudes, we can stop the iteration at this point because the signs of the eigenvectors define a minimum graph cut. The algorithm for a single partition runs in under 20 microseconds in our tests. 

With thanks to professor Alistair Spence, Math Department, University of Bath, England and Sajan Patel for helpful discussions.

Conclusion: we defined an efficient algorithm to compute a minimum cut of a connected graph based mainly on sparse matrix-vector multiplication. We found this algorithm ran orders of magnitude faster on a GPU than a comparable Lanczos iteration.  

 

In memorium: Max Arthur Heirich

This site is a memorial to the late Max Heirich of Ann Arbor, Michigan. Max was an inspired spiritual seeker and dedicated his life to helping people. During the American Civil Rights movement he worked with Ella Baker and Dr Martin Luther King and was present at the founding of SNCC. He documented the Berkeley Free Speech Movement. Later in life he explored eastern healing methods including Polarity Therapy. In some traditions he would be considered a Tzadik or Siddha. 

Memorial Guestbook  

Student Nonviolent Coordinating Committee (1960) 

Yesterday’s Discord (1965) 

The Beginning: Berkeley 1964 

Spiral of Conflict: Berkeley 1964 

Papers relating to the Free Speech Movement 

Change of Heart: a test of some widely held theories about religious conversion (1977) 

Rethinking Health Care: Innovation and Change in America 

Obituary (2017) 

Memorial Service (2017) 

Pictures including trip to India and the Himalayas

University of Michigan Faculty History Project 

University of California Berkeley Sociology Department obituary  

 

In memorium: Herbert Bishop Keller

Herbert Keller was a mystic 

When I first started working with Caltech Applied Math professor Herbert Keller on bifurcation analysis of partial differential equations we had a remarkable experience. At one point when we were talking I perceived the space between us to be filled with a beautiful golden light. Herb got excited and after our meeting he walked over to the computer science department and told them that I was a special student. I don’t know what he perceived but he perceived something extraordinary as did I. 

Fast forward four years, I had been attending his research group meetings every week. One day he decided to lecture on solving Magnetohydrodynamics equations. At one point in the lecture he said something about matrix preconditioners that blew my mind. I can’t recall exactly what he said but it opened new dimensions of understanding. 

Suddenly I was looking down at the room from the perspective of the ceiling. It was an out-of-body experience. I saw Herb, the other students, and myself in my chair. Over my head was a slowly rotating image of the Sahasra Chakra which is known in yoga as the thousand-petaled-lotus. This is a classic yogic meditation experience. 

After a few moments of this I was back in my body and felt a strong energetic connection to Herb at the front of the room. Herb stopped speaking and stared at me. I tried to hold as still as possible to not give anything away. I am sorry that I never discussed these experiences with Herb. 

Eventually I came to learn that Herb had spent time visiting the Himalayas in India which means only one thing: visiting gurus. So Herb was mystical although he did not talk openly about that. Yoga says that there are several paths to enlightenment, and one is the path of knowledge. I think Herb’s mathematical work led him toward enlightenment and that he was in a high spiritual state. 

This is somewhat paradoxical because he was not always nice to people and was not a fount of love and acceptance. I never discussed these experiences with anyone as I had learned not to be open about my spiritual experiences and that many people do not understand them. Herb was a good friend and mentor to me, and I also have to say that his presence uplifted me spiritually. This shows that intellectual work at Caltech can lead to spiritual transcendence. 

 

In memorium: Jan L. A. van de Snepscheut

 Death of the computer science chair 

In 1994 Jan van de Snepscheut was the executive officer of the Caltech computer science department. A native of The Netherlands, Jan was a successful academic and did his graduate work under the direction of Edsger Dijkstra who was one of the fathers of computer science. Dijkstra wrote him a letter of recommendation to Caltech that predicted Jan would be one of the greatest computer scientists. As sophomore Anthony Molinaro said later “his professional life really seemed together”. 

Stephen Taylor was an assistant professor in the computer science department. A native of England, Steve was challenged in his position and had trouble fundraising and publishing papers. He received a barely passing review from the department that was delivered by Jan. Steve had a large DARPA grant that he received based on a recommendation from the previous department chair Chuck Seitz. Unfortunately after Steve turned in his first annual report DARPA wrote back that he would never receive another grant from them. 

Steve was trying to work in computational fluid dynamics but he didn’t have the necessary mathematical background. As a result he relied heavily on his collaborators and had a hard time getting his own papers published. Steve was a hard worker, spent many all-nighters on campus, and spent nearly all of his waking hours in the lab. 

He also drank prodigious amounts of coffee which caused him to have an explosive temper. He bullied everyone around him, his graduate students, research staff, administrative staff, and even other professors. He was probably the most disliked professor on campus. Steve’s motto was “there are no ethics in academia” and he was true to those words. 

To be fair to him it must be noted that he had a major project sabotaged by a rival, Eric van de Velde of Applied Math. Although Eric was ultimately forced out of Applied Math by this scandal there was no way to undo the damage that was done and this enflamed Steve’s paranoia. 

Steve tried to steal papers multiple times from students in order to publish them under his own name. He played dirty tricks on other professors. Steve wrote a confidential report to the CIA about the activities of his colleagues in the computational fluid dynamics community. He was very proud of this report. 

I was Steve’s graduate student and became alarmed when he ensnared me in a scheme to defraud graduate student Andrew Conley by stealing his paper and publishing it under our names. This would have been submitted as part of a DARPA report which would have constituted fraud on a DARPA contract. I objected to one of the senior members of the department. The news quickly got to Jan and to the Caltech provost Paul Jennings who were very concerned. 

Jan decided to confront Steve about this and when he did Steve responded “if you interfere with my student again I’ll burn your fucking house down!” 

I was used to having friendly conversations with Jan in the hallway but after this I found that Jan literally ran away from me when I approached him. I was concerned about this and talked to one of the department administrators. She told me what had happened and also that the senior faculty were meeting in secret to consider terminating Steve’s contract for reasons of mental illness. 

She told me that Steve and Jan were battling and that things were “totally out of control”. After a few weeks of this Jan sent a message asking me to meet with him. I showed up at his office and he proceeded to offer to be my PhD advisor. He knew that my relationship with Steve was on the edge of collapse. He told me that I was a good student and deserved to complete my PhD. We talked about Steve and agreed that he was a pathological liar. I opined that Steve was “a mind that is cracking under pressure.” When I said this Jan started weeping openly. 

At the time I didn’t understand what this meant but later I realized it was foreshadowing events to come. A few days later I heard that Jan had died in a house fire. I heard it from Steve who was giddy and laughing. Steve said to me “it shows it could happen to you at any time”. Since I thought Steve was a murderer this scared me and I interpreted it as a direct threat. I fled from the campus and moved my house for protection from Steve. 

I holed up for three days waiting to read the news that Steve had been arrested for murder. When this didn’t come I realized that I had been in denial and that Jan had committed suicide. My world came crashing down at that point. 

After the death it was found that Jan had a file drawer in his office that was full of information about Steve. The Pasadena police department took an interest in this and photographed every piece of paper in the drawer. They did not bring any charges. 

For the next many weeks I was depressed and traumatized. I was angry and so I contacted the author Michael Meloan and told him my story. In return Michael wrote a fictionalized version of these events called “The Cutting Edge” which he published in WIRED magazine in 1996. The fictionalized story was set at Stanford and created a backlash from the Stanford faculty. The story used some artistic license and exaggerated some things but did a masterful job of conveying the interpersonal dynamics and the ethics of the situation. 

The story appeared in the same month that Steve’s tenure committee met to start evaluating his tenure. Suffice it to say that Steve did not get tenure at Caltech, and in fact was unable to get any faculty job offer for the next year. This was largely due to the fact that his research had not made any impact on the field. Eventually Steve got a job offer from Syracuse University which he had derided as a “backwater”. He went there quickly but eventually lost all of his funding and was terminated after three years. 

While all of this was going on Steve stole my paper and published it without my knowledge. He tried to take credit for my work. But he did not succeed.

Despite his evil I have forgiven Steve Taylor.

references: https://www.latimes.com/archives/la-xpm-1994-02-24-me-26659-story.html 

https://www.wired.com/1996/03/cutting-edge/ 

https://en.wikipedia.org/wiki/Jan_L._A._van_de_Snepscheut