Introduction
Occasionally I come across a question on the Internet about the use of matrices in computer graphics. Either the poster is not clear on what goes into the matrices, or doesn't quite understand homogeneous coordinate systems. Instead of answering those posts individually, I decided to put up this web page.
This is in no way an authoritative guide to the subject; it is intended only as an overview. Feel free to link to this site.
Quick plug: If you need a freelance software developer who is knowledgeable about 3D computer graphics, please e-mail me at woody@alumni.caltech.edu. Also e-mail me if you have any comments or questions about this document.
Table Of Contents
Homogeneous Coordinates
Transforming Homogeneous Coordinates
Perspective
Clipping Polygons And Lines
Transforming Texture Maps
Homogeneous Coordinates
For reasons that hopefully will become clear in a moment, it's useful to represent 3D points in computer graphics using a 4-vector coordinate system, known as homogeneous coordinates.
To represent a point (x,y,z) in homogeneous coordinates, we add a 1 in the fourth column:
1. (x,y,z) -> (x,y,z,1)
To map an arbitrary point (x,y,z,w) in homogeneous coordinates back to a 3D point, we divide the first three terms by the fourth (w) term, which must be non-zero. Thus:
2. (x,y,z,w) -> (x/w, y/w, z/w)
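As a minimal sketch of equations 1 and 2, assuming points are stored as plain Python tuples (the function names `to_homogeneous` and `from_homogeneous` are mine, chosen for illustration):

```python
def to_homogeneous(point):
    """Equation 1: append w = 1 to a 3D point (x, y, z)."""
    x, y, z = point
    return (x, y, z, 1.0)


def from_homogeneous(point):
    """Equation 2: divide x, y and z by w to get back a 3D point."""
    x, y, z, w = point
    if w == 0:
        # No 3D point corresponds to w == 0, so refuse rather than divide.
        raise ValueError("cannot convert a homogeneous point with w == 0")
    return (x / w, y / w, z / w)
```

Note that (2, 4, 6, 2) and (1, 2, 3, 1) map back to the same 3D point, since only the ratios to w matter.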
This sort of transformation has a few uses. For example, recall that one equation for determining points on a plane is the equation:
3. A point is on a plane if the point satisfies the relationship 0 == A*x + B*y + C*z + D
We can use this to our advantage by representing a plane as the 4-vector L = (A,B,C,D). With the point written as P = (x,y,z,1), the left-hand side of equation 3 is just a four-term dot product, so a point is on the plane L if
4. P dot L == 0
What makes this relationship interesting is that if we have a "normalized" homogeneous point P and a "normalized" plane L, defined as:
A homogeneous point P = (x,y,z,w) is normalized iff w == 1. Likewise, a homogeneous plane L = (A,B,C,D) is normalized iff sqrt(A*A+B*B+C*C) == 1.
then the dot product is the "signed" distance of the point P from the plane L. This can be a useful relationship for hit detection or collision detection, when we wish to determine where a path from P1 to P2 intersects L. In that case, we can easily calculate the intersection point P by:
a1 = P1 dot L
a2 = P2 dot L
a = a1 / (a1 - a2)
P = (1-a)*P1 + a*P2
This is useful when we need to do clipping of lines and polygons to fit inside the screen, as well as in performing collision detection.
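The signed distance and the intersection formula can be sketched as follows. This is only an illustration: the helper names are hypothetical, points are assumed to be normalized (w == 1), and the check that the intersection lies between P1 and P2 is my addition, not part of the formula above.

```python
import math


def dot4(p, q):
    """Four-term dot product of two 4-tuples."""
    return sum(a * b for a, b in zip(p, q))


def normalize_plane(plane):
    """Scale (A, B, C, D) so that sqrt(A*A + B*B + C*C) == 1."""
    a, b, c, d = plane
    length = math.sqrt(a * a + b * b + c * c)
    return (a / length, b / length, c / length, d / length)


def intersect_segment_plane(p1, p2, plane):
    """Return the point where the path from p1 to p2 crosses the plane,
    or None if it does not cross it."""
    a1 = dot4(p1, plane)  # signed distance of p1 from the plane
    a2 = dot4(p2, plane)  # signed distance of p2 from the plane
    if a1 == a2:
        return None  # path is parallel to the plane (or lies in it)
    a = a1 / (a1 - a2)
    if not 0.0 <= a <= 1.0:
        return None  # the crossing is outside the segment (added check)
    return tuple((1 - a) * u + a * v for u, v in zip(p1, p2))


# The plane z == 2, given un-normalized as 2*z - 4 == 0.
plane = normalize_plane((0, 0, 2, -4))
hit = intersect_segment_plane((0, 0, 0, 1), (0, 0, 4, 1), plane)
```

Here P1 is 2 units behind the plane and P2 is 2 units in front of it, so the crossing is halfway along the path.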
Transforming Homogeneous Coordinates
We can represent rotation, scaling, and translation of homogeneous coordinates using a 4x4 matrix. Note that if we restricted ourselves to a 3x3 matrix, we could not perform translations; we would have to add the offset explicitly as a separate step. With a full 4x4 matrix, not only can we represent a translation, but we can derive all sorts of interesting properties, including easily converting back from screen coordinates to world coordinates.
The standard transformation matrices used in computer graphics are:
Translation: T(x,y,z) =
    | 1 0 0 0 |
    | 0 1 0 0 |
    | 0 0 1 0 |
    | x y z 1 |
Scaling: S(x,y,z) =
    | x 0 0 0 |
    | 0 y 0 0 |
    | 0 0 z 0 |
    | 0 0 0 1 |
Rotation about X axis: Rx(angle) =
    | 1  0  0  0 |
    | 0  c  s  0 |
    | 0 -s  c  0 |
    | 0  0  0  1 |
where c = cosine(angle), s = sine(angle)
Rotation about Y axis: Ry(angle) =
    | c  0 -s  0 |
    | 0  1  0  0 |
    | s  0  c  0 |
    | 0  0  0  1 |
where c = cosine(angle), s = sine(angle)
Rotation about Z axis: Rz(angle) =
    | c  s  0  0 |
    | -s c  0  0 |
    | 0  0  1  0 |
    | 0  0  0  1 |
where c = cosine(angle), s = sine(angle)
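For illustration, here is one way these matrices might be built in code. It is a sketch under a few assumptions of mine: each matrix is a list of four rows, angles are in radians, and the function names are made up for this example.

```python
import math


def translation(x, y, z):
    """T(x,y,z): the offsets sit in the bottom row."""
    return [[1, 0, 0, 0],
            [0, 1, 0, 0],
            [0, 0, 1, 0],
            [x, y, z, 1]]


def scaling(x, y, z):
    """S(x,y,z): the scale factors sit on the diagonal."""
    return [[x, 0, 0, 0],
            [0, y, 0, 0],
            [0, 0, z, 0],
            [0, 0, 0, 1]]


def rotation_x(angle):
    """Rx(angle), angle in radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [[1, 0, 0, 0],
            [0, c, s, 0],
            [0, -s, c, 0],
            [0, 0, 0, 1]]


def rotation_y(angle):
    """Ry(angle), angle in radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, 0, -s, 0],
            [0, 1, 0, 0],
            [s, 0, c, 0],
            [0, 0, 0, 1]]


def rotation_z(angle):
    """Rz(angle), angle in radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, s, 0, 0],
            [-s, c, 0, 0],
            [0, 0, 1, 0],
            [0, 0, 0, 1]]
```

A rotation by an angle of zero gives the identity matrix for all three axes, which is a quick sanity check on the layout.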
To use these matrices, you would post-multiply the homogeneous point P by the matrix. Thus, to rotate the point around all three axes, and then translate to a new location, you would do:
P' = P * Rx * Ry * Rz * T
(Note: to post-multiply, you would perform the following operation:
P * M = (x y z w) * | a b c d | = (x' y' z' w')
                    | e f g h |
                    | i j k l |
                    | m n p q |
where
    x' = x*a + y*e + z*i + w*m
    y' = x*b + y*f + z*j + w*n
    z' = x*c + y*g + z*k + w*p
    w' = x*d + y*h + z*l + w*q
Normally this is implemented using a for loop; it is expanded explicitly here just so it's clear in what order the rows and columns are evaluated.)
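The for-loop version might look something like this. It is a sketch only, assuming the point is a 4-tuple and the matrix is a list of four rows; the name `transform` is mine.

```python
def transform(point, matrix):
    """Post-multiply the row vector (x, y, z, w) by a 4x4 matrix."""
    result = []
    for col in range(4):
        total = 0
        for row in range(4):
            # Each output term walks down one column of the matrix.
            total += point[row] * matrix[row][col]
        result.append(total)
    return tuple(result)


# Translate the point (1, 2, 3) by (10, 20, 30).
T = [[1, 0, 0, 0],
     [0, 1, 0, 0],
     [0, 0, 1, 0],
     [10, 20, 30, 1]]
moved = transform((1, 2, 3, 1), T)
```

Because w is 1, the bottom row of T is added straight onto x, y and z, which is exactly why the translation needs the fourth coordinate.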
What makes this interesting is that, because matrix multiplication is associative, instead of performing four point/matrix multiplies for each point we want to transform, we can multiply the four matrices together into a single matrix, and perform one point/matrix multiply per point. This is especially useful when we don't know a priori the number of transformations that need to be applied to a collection of points.
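To make the associativity argument concrete, here is a small sketch that scales a point and then translates it, once step by step and once with a single combined matrix. The helper names `transform` and `multiply` are hypothetical, and matrices are assumed to be lists of four rows.

```python
def transform(point, matrix):
    """Post-multiply the row vector (x, y, z, w) by a 4x4 matrix."""
    return tuple(sum(point[r] * matrix[r][c] for r in range(4))
                 for c in range(4))


def multiply(a, b):
    """4x4 matrix product a * b."""
    return [[sum(a[r][k] * b[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]


S = [[2, 0, 0, 0],   # scale by 2 on every axis
     [0, 2, 0, 0],
     [0, 0, 2, 0],
     [0, 0, 0, 1]]
T = [[1, 0, 0, 0],   # translate by 1 along x
     [0, 1, 0, 0],
     [0, 0, 1, 0],
     [1, 0, 0, 1]]

p = (1, 1, 1, 1)
one_at_a_time = transform(transform(p, S), T)  # (P * S) * T
combined = multiply(S, T)                      # built once, reused per point
all_at_once = transform(p, combined)           # P * (S * T)
```

Both routes give the same point. Note that the order of the matrices in the product still matters: `multiply(T, S)` would translate first and then scale, which is a different transformation.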
Perspective
There are a number of perspective matrices floating around out there, depending on the field of view desired, and the near and far clipping that we wish to perform on transformed points. (More on that later.) But the essential idea is the same: a perspective matrix moves the depth value z into the fourth column, where it will be used to divide through the x and y values when the final homogeneous coordinate is converted back into a 3D point (known as screen coordinates), with x and y the pixel location, and z the depth of the point (often passed as the depth to the z-buffer).
Aside: Why do we want to divide by z? The best way to see this is to look at what happens when we want to find the location on a screen of a point out in space. If we want to know the location x' on the screen of a point at horizontal position x and depth z, assuming the screen is 1 unit from the eye, it's easy to see (by similar triangles, x'/1 = x/z) that
5. x' = x / z
We'll use a simple perspective matrix to illustrate some points.
Perspective: V =
    | 1  0  0  0 |
    | 0  1  0  0 |
    | 0  0  1  1 |
    | 0  0 -1  0 |
With this perspective matrix, we have the nice relationship:
P' = P * V = (x,y,z-w,z)
which in 3D is (x/z,y/z,1-w/z)
This has the nice property that, for normalized points (w == 1), points on the screen (z == 1) end up at depth coordinate 0, and far-away points converge to depth 1.
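Here is a small sketch of the simple perspective matrix V in action, assuming the eye is at the origin looking down +z with the screen 1 unit away, as in the aside. The function name `project` is mine, and the matrix is stored as a list of four rows.

```python
V = [[1, 0, 0, 0],
     [0, 1, 0, 0],
     [0, 0, 1, 1],
     [0, 0, -1, 0]]


def project(point):
    """Post-multiply (x, y, z, w) by V, then divide through by the new w."""
    hx, hy, hz, hw = (sum(point[r] * V[r][c] for r in range(4))
                      for c in range(4))
    if hw == 0:
        # The point is at depth z == 0, level with the eye; no projection.
        raise ValueError("cannot project a point with z == 0")
    return (hx / hw, hy / hw, hz / hw)


on_screen = project((3.0, 4.0, 1.0, 1.0))    # a point on the screen, z == 1
farther = project((2.0, 4.0, 2.0, 1.0))      # twice as far from the eye
far_away = project((0.0, 0.0, 1000.0, 1.0))  # a long way off
```

The point on the screen keeps its x and y and gets depth 0; the point at z == 2 has its x and y halved and gets depth 0.5; the distant point's depth is just under 1.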
One interesting thing that we may want to do is to clip the objects in a scene before we render them. That way, we can cull out objects that are not visible, and
###
In general, a texture map across a polygonal surface is a mapping from points (x,y,z) on the polygon to points (u,v), which are pixel coordinates
###