Last March we shared an important round of information about the GeForce RTX 40 that gave us almost all the keys to this new generation of graphics cards, which NVIDIA could launch in the second half of this year. Now, thanks to a new round of leaks, we have new details that delve deeper into the configuration of the AD102 core, the top-of-the-line GPU that NVIDIA will use in Ada Lovelace.
In the attached image we can see that, in theory, the AD102 GPU will come with a configuration of 12 GPCs (Graphics Processing Clusters), 6 TPCs (Texture Processing Clusters) per GPC, two SMs (Streaming Multiprocessors) per TPC, 128 shaders or FP32 units per SM, 192 FP32+INT32 units per SM and 64 warps per SM. In total, it would have 2,048 threads per SM, 192 KB of L1 cache per SM, 32 ROPs for each active GPC and 96 MB of L2 cache.
From all this data we can draw some very important conclusions, as long as the information is confirmed. If it uses a full AD102 GPU, the top model of the GeForce RTX 40 series would have about 70% more GPCs than the GA102 chip (12 versus 7), the top of the range of the consumer Ampere series, which was most recently used in the GeForce RTX 3090 Ti. It would also have a different configuration of FP32 and INT32 units, which would add up to 128 and 64 cores per SM, respectively. This translates to a 50% increase over the GA102 core.
Other important changes compared to the GA102 core would be a 33% increase in warps, a 33% increase in threads, a 50% increase in L1 cache, a sixteen-fold increase in L2 cache (from 6 MB on GA102 to 96 MB) and a 100% increase in ROPs (rasterization units) per GPC. The latter is very interesting, since if it is confirmed it would mean that the GeForce RTX 4090 could have a whopping 384 ROPs. To give you an idea, the RTX 3090 Ti has 112 ROPs.
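The totals implied by these per-unit figures are easy to check with a little arithmetic. What follows is only a minimal sketch: it assumes a fully enabled AD102 with exactly the rumored configuration, and the names `GpuConfig`, `AD102_LEAK` and `percent_increase` are made up for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GpuConfig:
    """Per-unit figures for a GPU. The values used below are rumored, not confirmed."""
    gpcs: int
    tpcs_per_gpc: int
    sms_per_tpc: int
    fp32_per_sm: int
    rops_per_gpc: int

    @property
    def sms(self) -> int:
        # Total SMs = GPCs x TPCs per GPC x SMs per TPC
        return self.gpcs * self.tpcs_per_gpc * self.sms_per_tpc

    @property
    def shaders(self) -> int:
        # Total FP32 shaders = total SMs x FP32 units per SM
        return self.sms * self.fp32_per_sm

    @property
    def rops(self) -> int:
        # Total ROPs = GPCs x ROPs per GPC
        return self.gpcs * self.rops_per_gpc


def percent_increase(new: float, old: float) -> float:
    """Relative increase of `new` over `old`, in percent."""
    return (new - old) / old * 100.0


# Leaked (unconfirmed) configuration of a full AD102, as quoted in the article.
AD102_LEAK = GpuConfig(gpcs=12, tpcs_per_gpc=6, sms_per_tpc=2,
                       fp32_per_sm=128, rops_per_gpc=32)

# ROP count of the RTX 3090 Ti, as quoted in the article.
RTX_3090_TI_ROPS = 112

if __name__ == "__main__":
    print(f"SMs:     {AD102_LEAK.sms}")
    print(f"Shaders: {AD102_LEAK.shaders}")
    print(f"ROPs:    {AD102_LEAK.rops}")
    gain = percent_increase(AD102_LEAK.rops, RTX_3090_TI_ROPS)
    print(f"ROPs vs. RTX 3090 Ti: +{gain:.0f}%")
```

With the leaked numbers this works out to 144 SMs, 18,432 shaders and 384 ROPs. Any model built on a cut-down AD102 would land below these totals.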
GeForce RTX 40 vs. GeForce RTX 30: A Concise and Clear Summary
From all the information we have seen so far, we can prepare a complete and direct list of the changes and improvements that the GeForce RTX 40 will bring compared to the GeForce RTX 30. Keep in mind that this list is based on rumors and leaks, which means that some of the data below may turn out not to be true.
- A huge increase in the number of active shaders (up to 18,432 in the AD102 core).
- The rasterization units per GPC are doubled, which should represent a significant performance improvement in rasterization. These units take care of “assembling” all the work of the rest of the GPU elements to create the image that will go to the frame buffer and that we will see on the monitor.
- Increased amount of L1 and L2 cache.
- 4th generation tensor cores and 3rd generation RT cores, which should improve performance in AI and DLSS, as well as in ray tracing.
- Jump to the improved 5nm node (N4), which in theory could help improve efficiency and working frequencies.
With all this in mind, it is quite clear to me that the generational leap that we are going to see with the GeForce RTX 40 is going to be enormous, both in rasterization and ray tracing, although something tells me that it is going to be especially great with the latter. In terms of power consumption, a notable increase has also been rumored, especially in the most powerful models of the GeForce RTX 40 series, but I think we should take these rumors with caution, since some ridiculously high figures have been floated.