@goldgin2 True, although he does add that, like in the Starfield example, the second shot is later in the animation. All in all though, I will absolutely say the tech is far from perfect. Personally my gripe with it is not so much the faces but what it's doing to the scenery. I find that much more problematic, since it strips out atmospheric factors like the dreamy haze in Hogwarts and the overexposed look of the structures in Oblivion Remaster. Even though the lighting might be more accurate, the scene loses the gloomy feel it originally had.
I think that's both a problem with the tech itself not having matured enough and the fact that it's injected into existing titles. You kind of saw the same issues with path tracing being added to Cyberpunk: it would elevate one scene but then hurt another, where the bouncing lights lit up a dark, moody area too much and it lost its charm.
So it seems DLSS 5 really is just some post-processing slop filter. It's all based on the actual image, so while it doesn't alter the geometry (there's no data for this) it will still make models and textures change radically once applied, which is why the changes can look not only awful, but also entirely inconsistent from one scene to the next. Hoping DF picks up on this and covers it accordingly.
It doesn't alter geometry, and it also doesn't add details that aren't already in the textures. The problem is, some people don't understand how normal maps work, and no, it's not only operating on a 2D image and motion vectors. It also has depth buffers and can see the normal maps, specularity maps, etc., and raytrace samples will be separate as well. All the different render layers a game engine can output can feed into this, just like with all DLSS versions. While yes, this is all 2D information, it's a lot of information from which the model can learn about the composition of the scene, what kind of objects are in it, what kind of materials they're made of, etc. The model is also designed to be temporally stable, meaning it has a memory of the scene even as things move.
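To make that concrete, here's a rough sketch of the kind of stacked per-pixel inputs a G-buffer-aware model could consume, plus the warped history that gives it temporal memory. This is my own illustration; the buffer names, shapes, and channel counts are assumptions, not Nvidia's actual interface.

```python
# Hypothetical sketch of stacked per-pixel inputs for a G-buffer-aware model.
# Buffer names, shapes, and channel counts are illustrative assumptions,
# not Nvidia's actual interface.
import numpy as np

H, W = 1080, 1920

frame_inputs = {
    "albedo":    np.zeros((H, W, 3), dtype=np.float32),  # base color before lighting
    "color":     np.zeros((H, W, 3), dtype=np.float32),  # rendered color
    "depth":     np.zeros((H, W, 1), dtype=np.float32),  # depth buffer
    "motion":    np.zeros((H, W, 2), dtype=np.float32),  # per-pixel motion vectors
    "normals":   np.zeros((H, W, 3), dtype=np.float32),  # surface normals
    "roughness": np.zeros((H, W, 1), dtype=np.float32),  # specularity/roughness
}

# Temporal memory: the previous output, warped by the motion vectors,
# is fed back in so the model stays consistent as things move.
warped_history = np.zeros((H, W, 3), dtype=np.float32)

model_input = np.concatenate(
    list(frame_inputs.values()) + [warped_history], axis=-1
)
print(model_input.shape)  # (1080, 1920, 16) channels per pixel
```

The point being: even though every layer is "2D", stacked together they describe depth, orientation, and material per pixel, not just a flat picture.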
Some things can look like new details at a glance, but every example I've seen can be explained by the model understanding something deeper about the kind of material that surface/texture is on, which allows new light interactions that weren't previously possible. Lips can look fuller because the self-shadowing/AO around them is more pronounced. Hair can look darker because there's a lot more depth to the shadow processing between the strands, and the translucency of the hair reveals shadows underneath, like in real life. These are the kinds of effects you'd normally need extremely high-fidelity raytracing with extremely high-polygon models to achieve.
The only thing it's "generating" is light interaction, not any new actual objects or detail that isn't based on detail that already exists in the textures.
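As a toy illustration of what "generating light interaction" means (numbers entirely made up, not the actual model): the albedo texture stays identical, only an inferred occlusion term changes, and that alone is enough to make a crease read as deeper.

```python
# Toy example: the same albedo texture shaded with a stronger ambient-occlusion
# term around inferred creases. No new texture detail is created; only the
# lighting changes, which is enough to make lips read fuller or hair darker.
import numpy as np

albedo = np.full((4, 4, 3), 0.8, dtype=np.float32)   # flat skin-tone patch
crease_ao = np.ones((4, 4), dtype=np.float32)
crease_ao[1:3, 1:3] = 0.4                             # pretend the model infers deep creases here

shaded_before = albedo * 1.0                          # original flat shading
shaded_after = albedo * crease_ao[..., None]          # occlusion-aware shading

print(shaded_before[1, 1], shaded_after[1, 1])        # same texel, just darker after
```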
The guy is making a lot of very wrong assumptions in that video. It's not a 2D image in the same sense that we view it. There is a lot of information about depth and material provided to the model. It's also not an image generator model at all. That would require far more horsepower and RAM and would not be able to achieve this level of consistency with the geometry, or across movement. It's a completely different kind of model than what image generators do.
@AndyGilleand Do you have some insider knowledge, or is it all wishful thinking? Because some of the things you say contradict even Nvidia's own comments on the subject (only the color and motion vectors of the 2D frame, so really just an image and motion vectors) and DF's analysis of this. Ideally it would work with more inputs, but the model is already so big with just these, so... it is what it is as of now. And everyone is looking very closely at the frames, I can assure you. I think you're being very generous calling what are obvious hallucinations a new 'understanding'. One thing often forgotten about AI is that it does not understand anything. It is a probabilistic prediction based on patterns inferred from the training data.
Isn't color based on an understanding of materials as well? So how can a model predict lighting, or shading, without understanding how light interacts with a given material? Of course it needs that kind of knowledge to predict the better-looking option for a given pixel. I am still hyped about DLSS 5.
I love most of what I've seen about DLSS 5 so far, apart from two issues. 1. They should have preserved the correct tonemapping of the original footage and screenshots; not doing so is a poor way to show respect for the original art direction. 2. They should have lined the screenshots up properly.
It has the image, so for example if it sees skin it can recognize it as such, because it has already seen plenty of 'skin' (this isn't new at all), and then predict how that material will interact with light based on its training data set.
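Roughly that idea, as a toy sketch. The classifier rule and the material response values below are invented for illustration; a real model learns all of this end to end rather than through hand-written rules.

```python
# Toy sketch: recognize the material from the image, then apply a light response
# associated with that material. The classifier and response values are made up
# for illustration; a real model learns both from training data.
import numpy as np

# Pretend per-material light responses "learned" from training data:
# (diffuse strength, specular strength, subsurface softness)
MATERIAL_RESPONSE = {
    "skin":  (0.9, 0.2, 0.6),
    "metal": (0.1, 1.0, 0.0),
    "cloth": (0.8, 0.05, 0.1),
}

def classify_material(pixel_rgb):
    # Stand-in for a learned classifier: warm, mid-bright pixels -> "skin"
    r, g, b = pixel_rgb
    if r > g > b and 0.3 < r < 0.95:
        return "skin"
    return "cloth"

def shade(pixel_rgb, n_dot_l):
    diffuse, specular, sss = MATERIAL_RESPONSE[classify_material(pixel_rgb)]
    lit = np.array(pixel_rgb) * (diffuse * n_dot_l + sss * 0.1) + specular * n_dot_l ** 8
    return np.clip(lit, 0.0, 1.0)

print(shade((0.8, 0.6, 0.5), n_dot_l=0.7))  # skin-like pixel gets soft, diffuse-heavy lighting
```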
The claim that AI "does not understand anything" and is just "a probabilistic prediction" is wrong, and a common misconception. AI has nothing to do with probabilities at all. That came from a misunderstanding of the kind of output that LLMs specifically produce (DLSS is not an LLM), and that output is not actually probability-based at all, though it can look that way to someone unfamiliar with how the model works. What an LLM is actually doing is a simulation of how our brain decides the next word to speak. It forms a more complete thought about what we want to say, and then chooses the next word that brings us closer to saying that complete thought. People thought it was probability-based because the model produces a vector that can then be mathematically compared with the other vectors in the dictionary, and the software then chooses one of the top 10 or so at random. That's not the probability of a word being the right one. That's a brain deciding which words would be most appropriate to form the larger thought it already has about what it wants to say, and the software picking from those options.
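For what it's worth, the mechanism described above would look something like this minimal sketch (the vocabulary and vectors are made up and tiny; real models are vastly larger): the output vector gets compared against every word vector in the dictionary, and the software picks from the ~10 closest matches.

```python
# Minimal sketch of the mechanism described above: the model emits a vector,
# it's compared against every word vector in the dictionary, and the software
# picks one of the ~10 closest matches. Vocabulary and vectors are illustrative.
import numpy as np

rng = np.random.default_rng(0)
vocab = ["the", "cat", "sat", "on", "mat", "dog", "ran", "fast", "slow", "roof",
         "chair", "table"]
word_vectors = rng.normal(size=(len(vocab), 8))   # dictionary of word vectors
model_output = rng.normal(size=8)                 # the vector the model produced

scores = word_vectors @ model_output              # how close each word is to that vector
top10 = np.argsort(scores)[::-1][:10]             # the 10 closest words
next_word = vocab[rng.choice(top10)]              # software picks one of them
print(next_word)
```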
Neural networks are a digital simulation of how neurons process signals in the brain. Yes, AI does in fact understand things at a deep level and uses that understanding to inform the decisions it makes. There are no probabilities involved, unless you want to say that what our brain does is based on probability.
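At its simplest, the digital "neuron" being simulated is just this (a bare-bones sketch of the general idea, not any specific DLSS layer):

```python
# A single artificial "neuron": a weighted sum of incoming signals passed
# through a nonlinear activation. Networks stack huge numbers of these.
import numpy as np

def neuron(inputs, weights, bias):
    # weighted sum of incoming signals, then a nonlinear "firing" response (ReLU)
    return np.maximum(0.0, np.dot(inputs, weights) + bias)

signals = np.array([0.2, 0.8, 0.5])
weights = np.array([0.4, -0.3, 0.9])   # learned connection strengths
print(neuron(signals, weights, bias=0.1))
```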
Every version of DLSS has access to all of those other render layers I mentioned, not just color and motion vectors. DLSS 5 must inherently require access to the same layers all previous versions of DLSS have access to, because it's doing the supersampling and ray reconstruction alongside the new stuff.
There are no "obvious hallucinations". I have yet to see one single example that can't easily be explained by a new layer of physical light and material simulation.
Take 15:08 in the video ('probabilistic computing'). I'm a simple man; he probably had a lot of advice from his team to use the right words.
That's not a brain; that's an algorithm doing computation, all code and mathematical operations. It has no notion of what is appropriate to say, nor does it care; it decides on a mathematical basis. It has been designed to imitate life/thoughts/whatever, but it is not the real thing.
It might be called DLSS, but it's very different tech from your SR DLSS. It's described in Nvidia's blog post about it. Just because it carries the DLSS marketing name doesn't imply anything about what it does.
And finally, if you don't see the mascara, I'm not going to argue with that. It might be personal bias, the same way some people care more than others about stutters, or about high frame rates, etc.
At this point it's not far-fetched to suspect that the two images that look the most AI-generated (the made-up Grace, the salon-hair Starfield NPC) are deliberately taken from different points in time by NVIDIA, so that fewer people figure out it's not just lighting (on those two specifically).
The rest of the images actually look more like just lighting, maybe a bit of displacement mapping. Even so, the lighting is wrong in many cases; it looks more like a streamer's ring light when they face the camera (us). That's why Starfield looks better: people are used to ring lights when streamers look at the camera.
Nvidia could've just described it as a real-time post-processing AI filter from the start (one that requires a dedicated 5090, no less) and been done with it. But they didn't, and neither did DF have actual knowledge about it to arrive at that conclusion, blatant as it may be from the resulting output. They know all they have to show at this time is something completely underbaked (which, at worst, destroys the original artistic intent behind the content it's filtering over), so they decided to do some marketing double-speak and call it a day.