The Inner Product, July 2003
Jonathan Blow (jon@number-none.com)
Last updated 16 January 2004
 
Unified Rendering Level-of-Detail, Part 5 (The Last!)
Related articles:
Rendering Level-of-Detail Forecast (August 2002)
Unified Rendering LOD, Part 1 (March 2003)
Unified Rendering LOD, Part 2 (April 2003)
Unified Rendering LOD, Part 3 (May 2003)
Unified Rendering LOD, Part 4 (June 2003)
Note: This article is currently missing Figure 2.  
We will attempt to remedy this situation in the future.
This series of articles is about LOD for large triangle soup environments. Such 
environments must be divided into chunks; the biggest issue in this kind of 
system is filling in the holes between neighboring chunks at different levels of 
detail. I fill these holes with small precomputed triangle lists called "seams", 
and earlier in this series I showed how to construct those seams.
In part 3 we saw how to split a single chunk of geometry into two, and how to 
generate the initial seam between them. Last month, in part 4, we saw how to 
split a chunk that is connected via a seam to another chunk, while maintaining 
existing manifolds. I mentioned that this is sufficient to build a chunked BSP 
tree of the input geometry; I showed some code that split the Stanford Bunny by 
arbitrary planes and played with the LOD of the various pieces.
This month I'll begin with that basic premise of building a chunked BSP tree; 
before long we'll be rendering a large environment with this system (and not 
just a bunny).
We'll focus on three major parts of the algorithm: (1) dividing the environment 
into initial chunks, (2) putting those chunks together into a detail-reduced 
hierarchy, (3) choosing which levels of the hierarchy to display at runtime.
(1) Dividing the environment
We don't want to assume anything about the shape of the environment. Maybe it's 
a big localized block like an office building, or maybe it's a loose set of 
tunnels that curve crazily through all three dimensions. An axis-aligned 
approach to segmenting the geometry, like a grid or octree, would work well for 
the office building but not for the tunnels. The chunked BSP treatment is nice 
because it doesn't much care how pieces of the environment are oriented.
I needed a method of computing BSP splitting planes. Most game-related BSP 
algorithms divide meshes at a 1-polygon granularity, so to acquire a splitting 
plane they pick a polygon from the mesh and compute its plane equation. But 
that's the kind of thing I am trying to get away from -- thinking too much about 
single triangles is a violation of the spirit of the algorithm. I want a more 
bulk-oriented approach, so I statistically compute a plane that divides the 
input into two sub-chunks of approximately equal triangle counts.
There are other things I might wish to do, besides just minimize the difference 
in triangle count between the two sub-chunks. I might wish to produce sub-chunks 
that are as round as possible, so that we don't end up with a scene full of long 
sliver-chunks. I might wish to balance the number of materials used in each 
chunk. Most probably, I'd whip up a heuristic that combines many parameters and 
makes trade-offs to achieve a happy medium.
To compute the splitting plane in a quick and dirty way, I treat the chunk as a 
bunch of point masses, one mass located at the barycenter of each triangle. I 
then compute the vector-valued mean and variance of these masses (see "My 
Friend, the Covariance Body", Game Developer magazine, September 2002). This 
gives me an ellipsoid that summarizes the mass of the chunk, as though that mass 
were normally distributed. I then split that ellipsoid in half, using a 
splitting plane that passes through its center and spans the ellipsoid's two 
minor axes. I use the minor axes because this produces halves of minimum 
eccentricity -- in other words, pieces that are "the most round". (We might 
produce pieces that are more nicely-shaped by choosing not to split the 
ellipsoid through the center). See Figure 1 for a 2D illustration of the 
splitting-plane computation.
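
To make this concrete, here's a rough C++ sketch of the splitting-plane 
computation. The Vector3 type and the power-iteration eigensolver are 
illustrative stand-ins, not necessarily what the sample code does:

#include <cmath>
#include <vector>

struct Vector3 { float x, y, z; };

static Vector3 sub(Vector3 a, Vector3 b) { return {a.x-b.x, a.y-b.y, a.z-b.z}; }
static float dot(Vector3 a, Vector3 b) { return a.x*b.x + a.y*b.y + a.z*b.z; }

// A splitting plane: the set of points p with dot(normal, p) + d == 0.
struct Plane { Vector3 normal; float d; };

// Treat each triangle as a point mass at its barycenter; split through
// the mean of the masses, perpendicular to the major axis of their
// covariance ellipsoid (so the plane spans the two minor axes). Because
// the plane passes through the mean, the two sides get roughly equal
// triangle counts.
Plane compute_splitting_plane(const std::vector<Vector3> &barycenters) {
    int n = (int)barycenters.size();  // assume n >= 2 for a real chunk

    // Vector-valued mean of the point masses.
    Vector3 mean = {0, 0, 0};
    for (const Vector3 &p : barycenters) {
        mean.x += p.x; mean.y += p.y; mean.z += p.z;
    }
    mean.x /= n; mean.y /= n; mean.z /= n;

    // 3x3 covariance matrix of the point masses.
    float c[3][3] = {};
    for (const Vector3 &p : barycenters) {
        Vector3 dp = sub(p, mean);
        float v[3] = {dp.x, dp.y, dp.z};
        for (int i = 0; i < 3; i++)
            for (int j = 0; j < 3; j++)
                c[i][j] += v[i] * v[j] / n;
    }

    // Major axis = dominant eigenvector; power iteration is plenty for
    // a preprocess (a degenerate chunk would want a guard on len).
    Vector3 axis = {1, 1, 1};
    for (int iter = 0; iter < 50; iter++) {
        Vector3 next = {
            c[0][0]*axis.x + c[0][1]*axis.y + c[0][2]*axis.z,
            c[1][0]*axis.x + c[1][1]*axis.y + c[1][2]*axis.z,
            c[2][0]*axis.x + c[2][1]*axis.y + c[2][2]*axis.z,
        };
        float len = std::sqrt(dot(next, next));
        axis = {next.x/len, next.y/len, next.z/len};
    }

    return {axis, -dot(axis, mean)};
}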
 
(2) Combining chunks into a hierarchy
Our LOD rendering algorithm doesn't place many constraints on the way we 
organize our chunks. It just wants an n-ary tree with hierarchical bounding 
volumes that help us choose detail levels. We'll talk about computing those 
volumes in part (3); for now we'll just discuss putting the chunks together.
First, we need to decide on a policy about which chunks to combine. The easiest 
choice is just to use the hierarchy that was created by the BSP splits. We 
combine two chunks into one, detail-reducing to a given triangle budget at each 
level, until we once again reach the root of the BSP tree.
This is easy because we don't have to think much, and in fact it's the first 
thing I implemented. Clearly, though, it may not produce the best results. For 
one thing, in any tree structure like this we end up with nodes that are 
neighbors spatially, but that are far apart in the hierarchy. In terms of scene 
quality it may be best to combine these nodes early, but they won't be combined 
until the level of detail becomes very low. Secondly, we probably want the 
freedom to combine more than two chunks at once. The pre-ordained BSP hierarchy 
does not give us much freedom.
So instead, I just take the chunk geometry stored on the leaves of the BSP tree, 
throwing away the tree structure itself. Then I cluster the chunks using a 
heuristic that is independent of the original chunking process. This lets me 
match any number of chunks together, building a new n-ary tree from the leaves 
to the root. At each level of the n-ary tree, I combine the participating chunks 
and seams into one mesh, and then run a static LOD routine on that mesh. This is 
the same thing I did with the terrain hierarchy from Parts 1 and 2, but now the 
environment representation is less restricted.
You can imagine many different policies for choosing the triangle budget for 
each node in this geometric hierarchy. In the sample code, my heuristic is that 
each node should have approximately the same number of triangles (for example, 
4000). So the BSP splits occur until each leaf has about 4000 triangles; then I 
discard the BSP tree and begin combining leaves 4 at a time, LOD'ing each 
combination back down to 4000 triangles. So in the end, I have a 4000-triangle 
mesh at the root that represents the entire world, with subtrees containing 
higher-resolution data.
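
In pseudocode-flavored C++, the bottom-up build looks something like this. 
Chunk, pick_cluster, merge_with_seams, and simplify_to are hypothetical names 
standing in for the real clustering, seam-merging, and static-LOD routines:

std::vector<Chunk *> level = leaves;   // BSP leaves, each ~4000 triangles
while (level.size() > 1) {
    std::vector<Chunk *> next_level;
    while (!level.empty()) {
        // Pull out up to 4 chunks that the clustering heuristic says
        // belong together (spatially close, round when combined, ...).
        std::vector<Chunk *> group = pick_cluster(&level, 4);

        // Merge the chunks and the seams between them into one mesh,
        // then detail-reduce back down to the per-node budget.
        Chunk *parent = merge_with_seams(group);
        simplify_to(parent, 4000);

        for (Chunk *c : group) parent->children.push_back(c);
        next_level.push_back(parent);
    }
    level = next_level;
}
// level[0] is now a ~4000-triangle approximation of the entire world.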
(3) Choosing which resolution to display at runtime
So we've got these hierarchical chunks representing world geometry; at runtime 
we need to choose which chunks to display. Typically, LOD systems estimate the 
amount of world-space error that would be incurred by using a particular level 
of detail; they project this error onto the screen to estimate how many pixels 
large it is, then use that size-in-pixels to accept or reject the approximation.
Algorithms from the Continuous LOD (CLOD) family tend to perform this operation 
for each triangle in the rendered scene. We want to operate at a larger 
granularity than that. We could do that as follows: for each chunk, find the 
longest distance by which the LOD'd surface displaces from the original, pretend 
this error happens at the point in the chunk that is nearest to the camera, and 
use the corresponding z-distance to compute the error in pixels.
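
The projection itself is just a perspective divide. As a sketch (the names 
here are mine, not from the sample code):

// Conservative per-chunk error projection: the chunk's maximum
// world-space displacement, assumed to occur at the chunk's point
// nearest the camera, mapped to a size in pixels.
float error_in_pixels(float world_error, float nearest_z, float screen_factor) {
    // screen_factor = viewport_height_in_pixels / (2 * tan(fov_y / 2))
    return world_error * screen_factor / nearest_z;
}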
It's customary in graphics research to take an approach like this, since it's 
conservative: you can achieve a quality guarantee, for example, that no part of 
the chosen approximation is rendered more than 1 pixel away from its ideal 
screen position.
But with graphics LOD, this kind of conservatism is not as fruitful as one 
might assume. When choosing LODs based on the distance by which geometry pops, 
we ignore warping of texture parameterizations, changes in tangent frame 
direction, and transitions between shaders. The popping distance can be vaguely 
related to some of these things, but in general it only controls a portion of 
the rendering artifacts caused by LOD transition. Even with a strict bound on 
popping distance, we're not achieving a guarantee about the quality of the 
rendered result.
In addition, this conservative approach can be harmful to performance. To 
provide that somewhat illusory guarantee, the algorithm must behave quite 
pessimistically (for example, assuming that the maximum popping error occurs at 
the position in the chunk closest to the camera at an angle that is maximally 
visible). This causes the algorithm to transition LODs earlier than it should 
need to.
I suggest we acknowledge the fact that, when tuning LOD in games, we just tweak 
some quality parameters until we find a sweet spot that gives us a good balance 
between running speed and visual quality. So it doesn't matter how many pixels 
of error a particular LOD parameter represents: we don't set the parameter by 
saying "I want no more than 0.7 pixels of error"; we set it by tweaking the 
pixel-error threshold until we decide that the result looks good. Whatever 
number of pixels is produced by that tweaking, that's the answer. Conservative 
bounds are not especially meaningful in this situation.
In this month's sample code, I approximate the spatial error for a given LOD 
without using a conservative worst case. I do this by finding the 
root-mean-square of all the spatial errors incurred during detail reduction. 
Recall that I'm using error quadric simplification, so squared spatial errors 
are easy to query for. I just take each vertex's quadric Q and zero out the 
dimensions having to do with texture coordinates, which gives me Q'. Then I 
find v^T Q' v, where v is the XYZ-position of that vertex in the reduced mesh. 
That product yields the squared error; I find the mean of all those errors and 
take the square root of that mean.
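
As a sketch, assuming each Vertex carries q_prime, its 4x4 homogeneous 
positional quadric with the texture-coordinate dimensions already zeroed out 
(the actual quadric storage in the sample code may differ):

struct Vertex {
    Vector3 pos;            // Vector3 from the earlier sketch
    float q_prime[4][4];    // quadric Q' (texture dims zeroed out)
};

float chunk_rms_error(const std::vector<Vertex> &verts) {
    double sum = 0;
    for (const Vertex &vert : verts) {
        float v[4] = { vert.pos.x, vert.pos.y, vert.pos.z, 1.0f };
        double e = 0;
        for (int i = 0; i < 4; i++)
            for (int j = 0; j < 4; j++)
                e += v[i] * vert.q_prime[i][j] * v[j];   // v^T Q' v
        sum += e;   // squared spatial error at this vertex
    }
    return (float)std::sqrt(sum / verts.size());
}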
This gives me an error distance in world-space units. I use this distance to 
compute a bounding sphere centered on the chunk, whose radius is proportional to 
the error distance. If the viewpoint is outside the sphere, we are far enough 
from the chunk that we don't need more detail, so we draw the chunk. If the 
viewpoint is inside the sphere, we need more detail, so we descend to that 
chunk's children.
The radius is proportional to the error distance because perspective involves 
dividing by view-space z. For chunks within the field of view, view-space z is 
approximately equal to the world-space distance from the camera to the chunk, 
which is why world-space bounding spheres make sense. The constant of 
proportionality is the LOD quality level, which you can adjust at will. You just 
need to make sure that the spheres are fully nested in the chunk hierarchy, 
which means potentially expanding each node's sphere to enclose its children's 
spheres.
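
The runtime side then reduces to a short recursive walk. Again a sketch, 
reusing the Vector3 helpers from earlier; the Chunk fields and draw_chunk are 
placeholders:

// Each chunk's sphere_radius is (lod_quality * error_distance),
// pre-expanded so every node's sphere encloses its children's spheres.
void select_lod(const Chunk *chunk, const Vector3 &eye) {
    Vector3 d = sub(eye, chunk->sphere_center);
    bool outside = dot(d, d) >= chunk->sphere_radius * chunk->sphere_radius;

    if (outside || chunk->children.empty()) {
        draw_chunk(chunk);            // far enough away; this LOD suffices
    } else {
        for (const Chunk *child : chunk->children)
            select_lod(child, eye);   // too close; descend for more detail
    }
}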
These spheres can be viewed as isosurfaces of continuous-LOD-style pixel error 
projection functions. For more in the way of explanation, see my slides from 
Siggraph 2000, as well as Peter Lindstrom's paper (For More Information). Both 
references describe Continuous LOD schemes, so I don't recommend you implement 
either, but the discussion of isosurfaces can be useful. 
Trash talk time: Lindstrom disses my CLOD algorithm from 2000 in his 
paper, but his criticism is misguided. Most of the attributes that he scorns are 
actually important design decisions, chosen to create an algorithm that behaves 
reasonably under heavy loads. Not seeing this, he champions features that I 
intentionally discarded, features that inherently make for a poorly-performing 
system. (I mean come on, Peter -- given anything like real-time requirements, a 
persistent orientation-sensitive LOD metric is just a fundamentally bad idea. So 
why make your algorithm more complicated, and slower, in order to support that 
capability?)
I shouldn't single out Lindstrom, because this general problem pervades much of 
academic computer graphics. Be careful when you read those Siggraph papers; the 
authors don't have the same design priorities that you do. In general these 
authors don't do proper cost/benefit analyses of their algorithms. The "previous 
work" section is just about convincing you that the authors have done some 
homework; it says nothing about whether the authors have honestly weighed the 
relative advantages and drawbacks of their system against that previous work. The 
performance numbers featured in "results" sections are a sales device; they say 
little about how an algorithm will really perform, since the authors rarely 
develop their systems for real-world data sets.
Okay, enough trash talk. Back to talking about nested spheres. To keep things 
simple, I have so far pretended that we perform an LOD transition exactly when 
the viewpoint crosses one of those spheres. But as you know from Part 2, we 
actually want to fade slowly between LODs. In the sample code this is done by 
computing two radii, one which is greater than the instant-transition radius, 
and one which is less than it. LOD fading begins when the viewpoint crosses one 
sphere, and ends when it crosses the other.
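
Computing the blend factor from those two radii is one line of arithmetic; 
something like this (names invented for illustration):

// 0 means draw this chunk fully; 1 means fully faded to the children.
// r_outer > r_inner bracket the instant-transition radius: fading
// begins as the viewpoint crosses r_outer and ends at r_inner.
float fade_fraction(float dist_to_center, float r_inner, float r_outer) {
    float t = (r_outer - dist_to_center) / (r_outer - r_inner);
    if (t < 0) t = 0;
    if (t > 1) t = 1;
    return t;
}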
You can easily imagine more sophisticated methods of computing LOD sphere radii. 
Perhaps we want to render approximately the same number of triangles every 
frame, no matter where the player is standing in the environment. We could run 
an energy minimization system at preprocess time, expanding and contracting the 
spheres until we achieve something close to the desired result. Or instead of 
tweaking radii, we could leave radii fixed and adjust the actual number of 
triangles used in each LOD chunk.
Sample Code and Future Work:
This month's sample code preprocesses and renders Quake 3 .bsp files. You can 
see the results for one such environment in Figure 2.
The code is not yet a plug-and-play tool for general use. However, I plan to 
keep working on this system in the future. Even though this is the last article 
about this algorithm (for now?), I will post new versions of the code on the web 
as improvements occur. If you have any input on how to make this LOD system 
maximally useful, let me know!
Because we're storing triangle soup meshes, the file size for the environment 
might be very large. It takes a lot more space to store a mesh of a heightfield 
terrain than it does just to store the height samples! To alleviate this 
problem, we may wish to investigate mesh compression. Perhaps that will be a 
good topic for some future articles....
References
J. Blow, "Terrain Rendering Research for Games", slides from Siggraph 2000. 
Available at http://www.red3d.com/siggraph/2000/course39/S2000Course39SlidesBlow.pdf
P. Lindstrom and V. Pascucci, "Visualization of Large Terrains Made Easy", 
Proceedings of IEEE Visualization 2001. Available at http://www.cc.gatech.edu/gvu/people/peter.lindstrom/
S. Gottschalk, M. C. Lin and D. Manocha, "OBBTree: A Hierarchical Structure for 
Rapid Interference Detection", Siggraph 1996. Available at
http://citeseer.nj.nec.com/gottschalk96obbtree.html
 
Figure 2 (missing currently!)
2a: Quake 3 level "hal9000_b_ta.bsp", by John "Hal9000" Schuch, rendered using 
the LOD system. 2b: Level chunks have been artificially moved away from each 
other so you can see the seams (red bands). 2c: Chunks have been drawn with 
color-coding that shows their resolution level.