With the launch of the 20-series RTX GPUs from Nvidia, the world was introduced to the idea of dedicated hardware to accelerate real-time ray tracing. Paired with dedicated machine learning accelerators, the RTX GPUs have shaken up the graphics world in a big way. The 30-series GPUs offer second-generation RT cores, but how do these newer RT cores differ from those that came before?
Ray-tracing in a Nutshell
If you’re not clear on what ray tracing is in general, let’s explain it briefly. Ray tracing is a method of rendering a computer-generated scene by simulating light rays bouncing off the objects in the scene.
In real life, light bounces off the objects around you before entering your eye. Ray tracing simulates this, but actually does it in reverse. Rays of simulated light are shot from the “camera” in the rendered scene and pick up color information with every surface they bounce off. The result is the most photo-realistic way of creating computer graphics.
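To make the “rays shot from the camera” idea concrete, here’s a minimal sketch in pure Python. The scene (a single sphere at a made-up position) and the camera setup are our own illustrative assumptions; a real renderer would trace many bounces and gather color, where this toy only records hit or miss:

```python
import math

# One ray per character cell, fired from a camera at the origin toward
# a sphere. Prints '#' where the ray hits the sphere, '.' where it misses.
CENTER, RADIUS = (0.0, 0.0, 4.0), 1.0  # made-up scene values

def hits_sphere(direction):
    # Ray origin is (0, 0, 0), so the usual ray-sphere quadratic simplifies:
    # solve |t*direction - CENTER|^2 = RADIUS^2 and check the discriminant.
    a = sum(d * d for d in direction)
    b = -2.0 * sum(d * c for d, c in zip(direction, CENTER))
    c = sum(v * v for v in CENTER) - RADIUS * RADIUS
    return b * b - 4 * a * c >= 0

for row in range(11):
    y = 1.0 - row * 0.2          # screen-space y in [-1, 1]
    line = ""
    for col in range(21):
        x = -1.0 + col * 0.1     # screen-space x in [-1, 1]
        line += "#" if hits_sphere((x, y, 1.0)) else "."
    print(line)
```

Running this prints a small circle of `#` characters: the silhouette of the sphere as seen through the virtual camera.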
It’s the industry standard for CG films and professional-grade CG, but it’s incredibly compute-intensive. For years, ray tracing was limited to offline rendering, where a single frame could take many hours to produce.
Traditional real-time computer graphics techniques try to create realistic imagery without actually simulating how light works. After decades of refinement, we’ve gotten very good at it, but ray-traced scenes still blow traditional real-time rendering out of the water.
What Do RT Cores Do?
That’s where RT cores come in. These specialized processors are built for the primary calculation ray tracing requires: checking for intersections between virtual light rays and the objects in the scene. RT cores make it possible to run billions of these intersection tests every second.
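In practice, GPUs organize scene geometry into bounding-volume hierarchies (BVHs), and much of an RT core’s work is testing rays against axis-aligned bounding boxes while traversing that hierarchy. Here’s a rough Python sketch of that kind of test (the classic “slab” method); the function name and example values are our own, not any GPU API:

```python
import math

def ray_hits_box(origin, direction, box_min, box_max):
    """Slab test: does a ray hit an axis-aligned bounding box?"""
    t_near, t_far = -math.inf, math.inf
    for o, d, lo, hi in zip(origin, direction, box_min, box_max):
        if d == 0.0:
            # Ray is parallel to this pair of slabs: it must start between them.
            if not lo <= o <= hi:
                return False
            continue
        t1, t2 = (lo - o) / d, (hi - o) / d
        t_near = max(t_near, min(t1, t2))  # latest entry into any slab
        t_far = min(t_far, max(t1, t2))    # earliest exit from any slab
    # Hit if the entry point comes before the exit, and the box isn't behind us.
    return t_near <= t_far and t_far >= 0.0
```

A dedicated RT core performs tests like this (plus ray-triangle intersections) in fixed-function hardware, which is why it is so much faster than running the same logic on general-purpose shader cores.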
Even with that acceleration, the resulting image is still quite grainy, filled with “noise” from the relatively low level of ray-tracing fidelity. Nvidia got past this problem by employing the machine learning processors in their cards to apply a de-noising algorithm to each frame. The result is a convincing ray-traced image that runs fast enough for real-time interaction.
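Nvidia’s actual denoiser is a trained neural network, but the basic idea — smoothing noisy per-pixel samples using nearby information — can be sketched with a crude 3×3 averaging filter. This is purely illustrative; real denoisers are far smarter about preserving edges and detail:

```python
def box_denoise(image):
    """Crude denoiser: replace each pixel with the mean of its 3x3 neighborhood.

    `image` is a 2D list of brightness values (floats).
    """
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Gather the pixel and its in-bounds neighbors, then average them.
            vals = [image[ny][nx]
                    for ny in range(max(0, y - 1), min(h, y + 2))
                    for nx in range(max(0, x - 1), min(w, x + 2))]
            out[y][x] = sum(vals) / len(vals)
    return out

# A single bright "noise" sample gets spread out and suppressed:
noisy = [[0.0] * 3 for _ in range(3)]
noisy[1][1] = 9.0
print(box_denoise(noisy)[1][1])  # 1.0
```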
Those machine learning cores help even more by letting the scene render at a lower resolution and then upscaling it with a technique known as Deep Learning Super Sampling (DLSS). This further lightens the load on the RT cores without compromising image quality.
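The saving is easy to quantify, since per-pixel rendering work scales roughly with the number of pixels drawn. Assuming, purely for illustration, a mode that renders at 1080p and upscales to 4K:

```python
def pixel_savings(render_res, output_res):
    """Fraction of per-pixel rendering work saved by rendering
    below the output resolution and upscaling the result."""
    rw, rh = render_res
    ow, oh = output_res
    return 1 - (rw * rh) / (ow * oh)

# Render at 1920x1080, upscale to 3840x2160: only a quarter of the
# pixels are actually ray traced and shaded.
print(pixel_savings((1920, 1080), (3840, 2160)))  # 0.75
```

The exact internal resolutions depend on the DLSS quality mode chosen, but the arithmetic shows why upscaling takes so much pressure off the RT cores.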
How Are Second-Generation RT Cores Better?
The second-generation RT cores are better in two simple ways: speed and efficiency. Hardware improvements give them twice the throughput of the original design. The render pipeline has also been cleaned up: ray tracing can now run concurrently with shading and GPU compute work, so bottlenecks are less likely.
The third-generation tensor cores that handle machine learning have also been upgraded, with up to twice the throughput of the second-generation design. This benefits ray tracing as well, since de-noising and upscaling can happen more quickly and deliver a cleaner final image.
Third-Generation RT Cores Are Close!
As we write this, the rumors for the 40-series GPUs are coming thick and fast. The upcoming cards purportedly offer almost double the overall performance per card tier compared to the 30-series. The RT cores themselves aren’t rumored to see quite such a dramatic jump, but an improvement anywhere between 30% and 50% wouldn’t surprise us!