Complex rendering

Ever since the dawn of programmable pipelines, the gaming industry has aimed to implement real-time rendering techniques for complex materials. Games have strived to make lighting, reflections and materials look as realistic as is feasible in real-time game engines. We can talk all day about pre-rendered cutscenes and how gorgeous they look, but the real test is what we can do with lighting in-game. This blog investigates methods used to adapt and simplify complex skin rendering models to fit current game engines.

Information in this blog is credited to and referenced from Tobias Kruseborn's Master's Thesis, 'Lighting and materials for real-time game engines,' written at the School of Computer Science and Engineering, Royal Institute of Technology, in 2009. A link to his thesis can be found here.

The theories and applications in this blog will be quite involved and detail-oriented. It is important to keep this in mind going forward. With that said, as always, I will try to make it as fun as possible - Enjoy the blog!


The aim of this blog is to examine the implementation of advanced real-time rendering techniques for complex materials such as human skin. It also looks at whether spherical harmonics and wavelets can be used to represent both diffuse and specular reflection for environment lighting in a real-time game engine.

The problem being addressed here is how to light materials with several layers in a physically correct manner and in real-time. To understand the goals behind this exploration, let's take a look at NVIDIA's Doug Jones demo, which attempted to realistically model the actor's head in real-time.

The preceding demo sets a very clear standard for high quality shading in real-time. However, "real-time" for a tech demo is very different from being fast enough to use in a game, which is where most people would need this technology. Keep in mind that this demo fully taxes a high-end graphics card, whereas most games run on much less powerful consoles. Commercial games must also render entire worlds, which leaves only a small fraction of the frame budget for skin shading.

Now let's talk about theory.

Diffuse and specular reflection

Specular light on human skin is admittedly easier to represent than diffuse reflection, because specular light reflects directly off the surface and isn't absorbed into it. Only about 6% of the incoming light, across the entire spectrum, is reflected this way, and because the oily outer layer of the skin is rough, the reflection is not mirror-like. The Blinn-Phong model gives an incorrect estimate since it can output more energy than it receives, and it fails to capture the increased specularity at grazing angles. Using a more precise, physically based reflectance model improves quality in exchange for a few more shader instructions.
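To illustrate the grazing-angle behaviour that Blinn-Phong misses, here is a minimal sketch of Schlick's Fresnel approximation, a term commonly used inside physically based specular models. It is not taken from the thesis: the function name and the base reflectance of 0.028 (a value often quoted for skin) are assumptions made for illustration only.

```python
def schlick_fresnel(cos_theta: float, f0: float = 0.028) -> float:
    """Schlick's approximation of Fresnel reflectance.

    cos_theta: cosine of the angle between the view direction and the half vector.
    f0: reflectance at normal incidence (assumed value, not from the thesis).
    """
    cos_theta = min(max(cos_theta, 0.0), 1.0)  # clamp to the valid range
    return f0 + (1.0 - f0) * (1.0 - cos_theta) ** 5


if __name__ == "__main__":
    # Reflectance rises sharply as the viewing angle becomes more grazing.
    for cos_theta in (1.0, 0.5, 0.1, 0.0):
        print(f"cos(theta) = {cos_theta:.1f} -> reflectance = {schlick_fresnel(cos_theta):.3f}")
```

Head-on, the surface reflects only the small base amount, while at grazing angles the reflectance approaches 1, which is the effect a fixed-exponent Blinn-Phong lobe cannot reproduce on its own.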

Representing diffuse reflection is fundamentally more difficult because, in order to calculate the diffuse light at one point on the skin, we need to know the incoming light intensity at nearby points. So, we use a diffusion profile, which approximates the manner in which light scatters underneath the surface of a highly scattering translucent material. When rendering materials with a diffusion profile, all incoming light is treated as entering at a single point on the surface before dispersing in the exact shape of the profile. Luckily, finding a suitable scattering dipole is largely a solved problem. The dipole curve plotted for the diffusion profile can be approximated by summing a number of Gaussian functions.
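As a rough sketch of the sum-of-Gaussians idea, the snippet below evaluates a profile for a single color channel. The weights and variances are hypothetical placeholders chosen for illustration, not the fitted values from the thesis, and the function names are my own.

```python
import math

# Hypothetical (weight, variance) pairs for ONE color channel.
# Placeholder numbers for illustration only; the weights sum to 1.
EXAMPLE_TERMS = [(0.25, 0.01), (0.35, 0.1), (0.40, 1.0)]


def gaussian_2d(variance: float, r: float) -> float:
    """Radially symmetric 2D Gaussian that integrates to 1 over the plane."""
    return math.exp(-(r * r) / (2.0 * variance)) / (2.0 * math.pi * variance)


def diffusion_profile(r: float, terms=EXAMPLE_TERMS) -> float:
    """Approximate the profile R(r) as a weighted sum of Gaussians."""
    return sum(weight * gaussian_2d(variance, r) for weight, variance in terms)


if __name__ == "__main__":
    for r in (0.0, 0.5, 1.0, 2.0):
        print(f"r = {r:.1f} -> R(r) = {diffusion_profile(r):.5f}")
```

Narrow Gaussians capture the sharp peak near the entry point and wide ones capture the long tail. Because each Gaussian is normalised, weights that sum to 1 keep the total scattered energy unchanged.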

To implement this, we render the diffuse illumination of the geometry into a light map, convolve that map several times with Gaussian blurs of different widths, and then combine the blurred results.
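The blur-and-combine step can be sketched on the CPU as follows. This is only an illustration under simplifying assumptions: the light map is a small grid of scalar irradiance values, borders are clamped, and the (weight, sigma) pairs passed in are placeholders rather than values from the thesis. A real implementation would run as shader passes on the GPU.

```python
import math


def gaussian_kernel(sigma, radius):
    """Normalised 1D Gaussian kernel with 2 * radius + 1 taps."""
    taps = [math.exp(-(i * i) / (2.0 * sigma * sigma)) for i in range(-radius, radius + 1)]
    total = sum(taps)
    return [t / total for t in taps]


def blur_rows(image, kernel):
    """Convolve every row with the kernel, clamping reads at the borders."""
    radius = len(kernel) // 2
    width = len(image[0])
    return [
        [
            sum(k * row[min(max(x + i - radius, 0), width - 1)] for i, k in enumerate(kernel))
            for x in range(width)
        ]
        for row in image
    ]


def transpose(image):
    return [list(column) for column in zip(*image)]


def gaussian_blur(image, sigma):
    """Separable blur: one horizontal pass followed by one vertical pass."""
    kernel = gaussian_kernel(sigma, radius=max(1, int(3 * sigma)))
    horizontal = blur_rows(image, kernel)
    return transpose(blur_rows(transpose(horizontal), kernel))


def diffuse_from_light_map(light_map, terms):
    """Weighted sum of blurred copies of the light map; terms are (weight, sigma) pairs."""
    height, width = len(light_map), len(light_map[0])
    result = [[0.0] * width for _ in range(height)]
    for weight, sigma in terms:
        blurred = gaussian_blur(light_map, sigma)
        for y in range(height):
            for x in range(width):
                result[y][x] += weight * blurred[y][x]
    return result
```

Splitting each 2D blur into a horizontal and a vertical pass is what keeps the cost manageable, since a separable Gaussian needs far fewer texture reads than a full 2D kernel.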

By carefully choosing a sampling pattern, we can then represent each Gaussian blur with a 12-sample jittered kernel. The first sample represents the directly incoming and outgoing light, and the following six samples correspond to mid-level scattering. The last six samples represent the high-level scattering and are mainly used for red light. The result is a different blur for each color channel, all computed in a single pass.

Modified translucent shadow maps

In texture-space diffusion, some regions that are near one another in Euclidean space can be far apart in texture space. For example, thin features such as ears and noses may not account for light transmitted from the opposite side, so scattering is only observed on the part facing the light. A translucent shadow map (TSM) renders depth, irradiance and the surface normal, and stores these quantities for the surface nearest the light at each pixel of the texture. At run-time, each point in shadow can look up the texture to find the distance through the object toward the light and access the convolved irradiance on the light-facing surface.
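A minimal sketch of the run-time lookup might look like the following. The function names, the Gaussian falloff and the variance parameter are assumptions for illustration; the thesis may weight the transmitted light differently.

```python
import math


def thickness_through_object(depth_in_shadow_map: float, distance_to_light: float) -> float:
    """Distance the light travels inside the object before reaching the shaded point.

    depth_in_shadow_map: distance from the light to the nearest surface along the ray,
                         as stored in the TSM.
    distance_to_light:   distance from the light to the point being shaded.
    """
    return max(0.0, distance_to_light - depth_in_shadow_map)


def transmitted_irradiance(front_irradiance: float, thickness: float, variance: float = 1.0) -> float:
    """Attenuate the irradiance stored for the lit side with a Gaussian falloff.

    The falloff shape and the variance are illustrative assumptions.
    """
    return front_irradiance * math.exp(-(thickness * thickness) / (2.0 * variance))


if __name__ == "__main__":
    thin = thickness_through_object(depth_in_shadow_map=2.0, distance_to_light=2.2)
    thick = thickness_through_object(depth_in_shadow_map=2.0, distance_to_light=4.0)
    print(transmitted_irradiance(1.0, thin), transmitted_irradiance(1.0, thick))
```

Thin regions such as an ear let most of the stored irradiance through, while thick regions let almost none through, which is the backlit glow the TSM is meant to add.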

Environment lighting

The goal behind environment lighting is to be able to represent both diffuse and specular reflection from an environment map in real-time. When representing light with an environment map, every texel becomes a light source, so we need to integrate light from all directions. This is not trivial to do, and it is important to keep in mind that specular reflection is also view-dependent.

The median cut algorithm converts an HDR light probe image to a set of light sources by dividing the light probe image into a number of regions and then creating a light source for each region that matches the direction, size, color and intensity of the total incoming light in that region. With the help of a summed-area table, a binary search per region finds the longitude or latitude at which to split the region so that both halves hold the same light energy, and this is repeated for n iterations. This is a quick and accurate way to find regions of equal intensity. The final algorithm successfully represents complex environment lighting with a few point lights.
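The splitting procedure described above can be sketched like this. It is a simplified illustration, not the thesis implementation: the image is a grid of scalar luminance values, the solid-angle weighting of a latitude-longitude map is ignored, and all function names are my own.

```python
def build_sat(image):
    """Summed-area table with an extra zero row and column."""
    height, width = len(image), len(image[0])
    sat = [[0.0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        row_total = 0.0
        for x in range(width):
            row_total += image[y][x]
            sat[y + 1][x + 1] = sat[y][x + 1] + row_total
    return sat


def region_sum(sat, x0, y0, x1, y1):
    """Total energy in the half-open rectangle [x0, x1) x [y0, y1)."""
    return sat[y1][x1] - sat[y0][x1] - sat[y1][x0] + sat[y0][x0]


def split_region(sat, region):
    """Split along the longer axis where both halves hold roughly equal energy."""
    x0, y0, x1, y1 = region
    half = region_sum(sat, x0, y0, x1, y1) / 2.0
    along_x = (x1 - x0) >= (y1 - y0)
    lo, hi = (x0 + 1, x1 - 1) if along_x else (y0 + 1, y1 - 1)
    if lo > hi:
        return [region]  # a single texel cannot be split further
    while lo < hi:  # binary search for the first cut whose first half reaches `half`
        mid = (lo + hi) // 2
        first = region_sum(sat, x0, y0, mid, y1) if along_x else region_sum(sat, x0, y0, x1, mid)
        if first < half:
            lo = mid + 1
        else:
            hi = mid
    if along_x:
        return [(x0, y0, lo, y1), (lo, y0, x1, y1)]
    return [(x0, y0, x1, lo), (x0, lo, x1, y1)]


def median_cut(image, iterations):
    """Return one light per non-empty region: energy-weighted centroid and total energy."""
    sat = build_sat(image)
    regions = [(0, 0, len(image[0]), len(image))]
    for _ in range(iterations):
        regions = [part for region in regions for part in split_region(sat, region)]
    lights = []
    for x0, y0, x1, y1 in regions:
        energy = region_sum(sat, x0, y0, x1, y1)
        if energy <= 0.0:
            continue  # a region without light contributes no light source
        texels = [(x, y) for y in range(y0, y1) for x in range(x0, x1)]
        cx = sum(image[y][x] * (x + 0.5) for x, y in texels) / energy
        cy = sum(image[y][x] * (y + 0.5) for x, y in texels) / energy
        lights.append({"x": cx, "y": cy, "energy": energy})
    return lights
```

The summed-area table makes every region sum a four-lookup operation, which is what allows the split position to be found by binary search instead of a linear scan.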

Spherical Harmonics (SH) are the angular portion of the solution to the Laplace equation in spherical coordinates. The SH basis is a set of orthogonal functions on the surface of a sphere. It is similar to the canonical basis of R^3, but differs in the sense that each SH coefficient does not correspond to a single direction, but to values of an entire function over the whole sphere. SH basis functions are small pieces of a signal that can be combined into an approximation of the original signal. To create an approximation of a signal using the SH basis, we must have a scalar value for each basis function that represents how similar the original function is to that basis function.
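To make the projection step concrete, here is a small sketch that computes the first nine real SH coefficients (bands 0 to 2) of a function on the sphere by numerical integration. The sampling resolution and the function names are arbitrary choices for illustration, not something taken from the thesis.

```python
import math


def sh_basis(x, y, z):
    """First nine real spherical harmonics, evaluated at a unit direction (x, y, z)."""
    c0 = 0.5 * math.sqrt(1.0 / math.pi)
    c1 = math.sqrt(3.0 / (4.0 * math.pi))
    c2 = 0.5 * math.sqrt(15.0 / math.pi)
    c3 = 0.25 * math.sqrt(5.0 / math.pi)
    c4 = 0.25 * math.sqrt(15.0 / math.pi)
    return [
        c0,
        c1 * y, c1 * z, c1 * x,
        c2 * x * y, c2 * y * z, c3 * (3.0 * z * z - 1.0), c2 * x * z, c4 * (x * x - y * y),
    ]


def project(func, n_theta=64, n_phi=128):
    """Project func(x, y, z) onto the basis with a midpoint-rule integral over the sphere."""
    coeffs = [0.0] * 9
    d_theta = math.pi / n_theta
    d_phi = 2.0 * math.pi / n_phi
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            x = math.sin(theta) * math.cos(phi)
            y = math.sin(theta) * math.sin(phi)
            z = math.cos(theta)
            weight = math.sin(theta) * d_theta * d_phi  # solid angle of this sample
            value = func(x, y, z)
            for k, basis in enumerate(sh_basis(x, y, z)):
                coeffs[k] += value * basis * weight
    return coeffs


def reconstruct(coeffs, x, y, z):
    """Evaluate the SH approximation in direction (x, y, z)."""
    return sum(c * b for c, b in zip(coeffs, sh_basis(x, y, z)))
```

Each coefficient is the integral of the function multiplied by one basis function, which is the "how similar is the signal to this basis function" scalar described above. Nine coefficients per color channel are enough for smooth, low frequency lighting.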

There are three components to the SH-based lighting model: diffuse, area specular and analytical specular. We must first separate the material into a diffuse part and low and high frequency glossy parts. We then use an SH irradiance environment map for diffuse reflection and a new area specular model for low frequency gloss. Finally, the Bidirectional Reflectance Distribution Function (BRDF) is evaluated directly with point lights for high frequency gloss.

Spherical harmonics are good for representing low frequency light, but not high frequency light. Wavelets can capture both low and high frequency light in a compact manner. Wavelets lend themselves to non-linear approximation: when projecting a function onto wavelets, the basis functions that are kept are chosen according to the function being approximated.

Some wavelets are small and can represent just a pixel, while other, bigger wavelets can capture light from the whole environment. This way, we can represent an environment map with only a few wavelet coefficients. The Haar wavelet lies in the planar domain and leads to distortion when used for functions in other domains, but the SOHO wavelet lies in the spherical domain, so it is used more often.
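The idea of keeping only a few coefficients can be sketched with the simplest case, a 1D orthonormal Haar transform. This is a stand-in for illustration: the planar 2D Haar and spherical SOHO bases are more involved, and the function names here are my own.

```python
import math


def haar_forward(signal):
    """Orthonormal 1D Haar transform; len(signal) must be a power of two."""
    coeffs = list(signal)
    n = len(coeffs)
    while n > 1:
        half = n // 2
        averages = [(coeffs[2 * i] + coeffs[2 * i + 1]) / math.sqrt(2.0) for i in range(half)]
        details = [(coeffs[2 * i] - coeffs[2 * i + 1]) / math.sqrt(2.0) for i in range(half)]
        coeffs[:n] = averages + details  # coarse part first, details after it
        n = half
    return coeffs


def haar_inverse(coeffs):
    """Invert haar_forward by merging averages and details level by level."""
    signal = list(coeffs)
    n = 1
    while n < len(signal):
        merged = []
        for average, detail in zip(signal[:n], signal[n:2 * n]):
            merged.append((average + detail) / math.sqrt(2.0))
            merged.append((average - detail) / math.sqrt(2.0))
        signal[:2 * n] = merged
        n *= 2
    return signal


def keep_largest(coeffs, count):
    """Non-linear approximation: zero all but the `count` largest-magnitude coefficients."""
    order = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]), reverse=True)
    keep = set(order[:count])
    return [c if i in keep else 0.0 for i, c in enumerate(coeffs)]


if __name__ == "__main__":
    # A dim, uniform environment with a single bright texel.
    environment = [1.0, 1.0, 1.0, 1.0, 9.0, 1.0, 1.0, 1.0]
    compressed = keep_largest(haar_forward(environment), 4)
    print(haar_inverse(compressed))
```

In this example one coefficient carries the overall brightness and a few detail coefficients localise the bright texel, so most coefficients are zero and can be dropped without changing the result.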

The following is the rendering of environment lighting with diffuse and specular reflection along with SOHO wavelets. From (a) to (e), the number of lights in the scene is 8, 16, 32, 256 and 512.

Final Algorithm

Combining all the methods and techniques previously discussed, the final algorithm used in skin rendering looks something like this:

  • Use the median cut algorithm to get our light source positions.
  • Convert the diffuse light from an environment map to spherical harmonics.
  • Do this for each light.
  • Render a shadow map and apply Gaussian blur filter to it. 
  • Render the shadows and the diffuse light to a light map. 
  • Apply Separable Subsurface Scattering (SSS) to the light map. 
  • Apply SSS for translucency.
  • Read the diffuse light and shadow from the light map.
  • Add the rest of the mesh texture to the diffuse light.
  • Calculate the specular light from the same positions as the shadows.
  • Combine the specular and diffuse light to a final color. 

At the end of the day, here is the kind of result we're looking at, with the Kelemen-Szirmay-Kalos model on the left versus the Phong model on the right:

In conclusion

Some notable observations from the results of the skin rendering algorithms mentioned in this blog: diffuse reflection makes the skin look more natural, and the implemented method is ten times faster than the Doug Jones demo. The method can be used in real-time game engines, especially for cutscenes. In terms of specular reflection, the Kelemen-Szirmay-Kalos model gave better results and made the specular reflection look more realistic than the Phong model. It was also able to capture reflections at grazing angles. Translucent shadow mapping yielded unsatisfying results when implemented with a jitter kernel, and when those results are weighed against the extra cost of computing the TSM, the conclusion is that the modified TSM is not worth using in game engines.

Shadows were distinctly improved when applying a Gaussian shadow map filter instead of a uniform shadow map filter. The combination of Gaussian shadow maps and subsurface scattering provided soft shadows without artifacts. The median cut algorithm proved accurate and versatile, since the light sources it produces can be used as positions for shadow maps and point specular reflections. The approximation method for SOHO wavelets is quite expensive, but it can run in real-time with light coming from a whole environment map. Finally, representing the BRDF model in spherical harmonics was seen to be an efficient and low storage technique for environment lighting.

After having explored the rich and vast world of lighting and materials, it is plain to see that with a few clever techniques and algorithms we can represent realistic light in real-time game engines. It is a fascinating process and one not to be taken for granted. I hope all the theory in this blog has not put you off trying it yourself. I would advise you to read Mr. Kruseborn's thesis for a much more detailed explanation of the algorithms and techniques before you dive right into real-time lighting.

We are on the precipice of creating games that look like movies; it is time to take the leap.

