A scalability theorem proves that, for a purely local disturbance, the rate of convergence of algorithm 1 is constant on computer systems of arbitrarily large scale. This property also holds for disturbances that are not purely local as is shown through discussion and simulation. A model of the worst-case disturbance is presented for which algorithm 1 is ineffective. This worst-case is shown to have a vanishing probability of occurrence with increasing problem scale. In addition, it is shown that if the worst-case characteristics are not present in a problem at small scale, then they will not be present at large scales either. These two characteristics are in sharp contrast to problems that commonly arise in solving differential equations by iterative algorithms and that give rise to research in multigrid methods. In such problems the worst-case behavior has an increasing probability of occurrence with increasing problem scale, and problems that converge rapidly at small scales may converge slowly or not at all when the scale is increased.
Algorithm 4 offers a scalable solution to the dynamic mapping problem based on the computational kernel of equation (3.5). This is one of the first algorithms ever proposed for the dynamic mapping problem on general interconnection topologies. The local nature of the algorithm ensures that it will produce better solutions than random placement algorithms. The similarity of algorithms 1 and 4 serves to both unify and highlight the differences between the problems of load balancing and mapping.
The analysis of randomization strategies in section 4.4 presents a model to calculate the expected workload imbalance on any number of computers as a function of statistical properties of an image. This analysis shows that the difficulty of achieving a balanced workload increases with problem scale, and that tiling strategies commonly used for load balancing in image synthesis have a detrimental effect. It illustrates the difficulty of achieving a balanced workload by static methods and shows the necessity for dynamic load balancing in distributed image synthesis computations. This section presents predicted and simulated data for load imbalance of representative images on up to 1,024 computers. This data may be useful to future studies in scalable image synthesis.
Problems 1 and 2 define the dynamic load balancing problem in a form that is amenable to analysis. These definitions make explicit the relationship between the problems of load balancing and mapping in a form that has not appeared previously. This form makes it possible to present a unified discussion of these two problems, and to analyze them both within a common formal framework.
Section 2.4.3 ("diffusion'') presents the most complete survey to date of a class of load balancing algorithms that are based on the informal concept of diffusion. This survey demonstrates that over the past several years a consensus has emerged about the preferred local load balancing algorithms for distributed applications. The emergence of this consensus has not been widely recognized.
Algorithm 2 describes a path tracing iteration to sample the photo-realistic values of image pixels in a Monte Carlo procedure. It is unconventional to compute these values using an iteration, and this computation is typically formulated as a tail recursion. The advantage of formulating the computation as an iteration rather than as a recursion is that it becomes easy to redistribute work among computers as a result of dynamic load balancing.
Algorithm 3 shows one way to use algorithm 1 to provide dynamic load balancing for a Monte Carlo image synthesis application based on algorithm 2. The discussion in section 4.3 ("a concurrent implementation'') brings to light practical issues of implementation that may be relevant to load balancing in support of other applications.
An empirical comparison of the performance of four different load balancing strategies on systems of up to 64 computers appears in table 4.2. These results reinforce the need for dynamic load balancing for problems in image synthesis. They show that algorithm 1 was consistently as good as or superior to one of the most popular load balancing algorithms for ray-tracing on parallel computers.