A dirty rectangles system is a rendering optimization that tracks which parts of the screen have changed since the previous frame and redraws only those parts.
Implementing such a system is part of my last task for Google Summer of Code, and it's probably the biggest and most difficult task I have worked on so far.
To implement this inside TinyGL, a mechanism that defers draw calls is required. Once that is in place, dirty rectangles become as easy as comparing the draw calls of the current frame with those of the previous frame and deciding which parts to render based on that information.
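The comparison step could be sketched roughly as follows. This is only an illustration under simplifying assumptions: `Rect`, `RecordedCall` and `findDirtyRects` are hypothetical names, each deferred call is reduced to a bounding rectangle plus an opaque payload, and calls are matched purely by their position in the frame.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical axis-aligned rectangle; right/bottom are exclusive.
struct Rect {
    int left, top, right, bottom;
    bool operator==(const Rect &o) const {
        return left == o.left && top == o.top &&
               right == o.right && bottom == o.bottom;
    }
};

// Hypothetical stand-in for a deferred draw call: the area it touches
// plus an opaque description of everything needed to replay it.
struct RecordedCall {
    Rect bounds;
    std::string payload;
    bool operator==(const RecordedCall &o) const {
        return bounds == o.bounds && payload == o.payload;
    }
};

// Compare two frames call by call and collect the regions to redraw.
std::vector<Rect> findDirtyRects(const std::vector<RecordedCall> &previous,
                                 const std::vector<RecordedCall> &current) {
    std::vector<Rect> dirty;
    const std::size_t common = std::min(previous.size(), current.size());
    for (std::size_t i = 0; i < common; ++i) {
        if (!(previous[i] == current[i])) {
            // Both the old and the new area must be redrawn.
            dirty.push_back(previous[i].bounds);
            dirty.push_back(current[i].bounds);
        }
    }
    // Calls that exist in only one of the two frames dirty their whole area.
    for (std::size_t i = common; i < previous.size(); ++i)
        dirty.push_back(previous[i].bounds);
    for (std::size_t i = common; i < current.size(); ++i)
        dirty.push_back(current[i].bounds);
    return dirty;
}
```

Once the dirty regions are known, every stored call that overlaps one of them would be replayed with rendering restricted to that region. Matching by position is the simplest possible policy; a call that merely moves to a different slot in the sequence is treated as changed.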
Every draw call needs to be stored along with all the information required to execute it, so the best way to implement this is to use polymorphism and let every subclass of DrawCall store only what it needs. This saves space at the cost of a minimal performance impact due to virtual functions.
This would be the interface of the class DrawCall:
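The sketch below is a reconstruction rather than the exact listing: `compare` and `execute` are the method names used in this post and its discussion, while `Rect`, `DrawCallType`, `getType` and the `BlitDrawCall` subclass are assumptions added purely for illustration.

```cpp
// Hypothetical rectangle used to restrict rendering; right/bottom exclusive.
struct Rect {
    int left, top, right, bottom;
};

// Illustrative tag so that compare() can reject calls of a different kind.
enum class DrawCallType { Blit, Rasterization };

class DrawCall {
public:
    explicit DrawCall(DrawCallType type) : _type(type) {}
    virtual ~DrawCall() {}

    // True when both calls would produce exactly the same pixels.
    virtual bool compare(const DrawCall &other) const = 0;
    // Replay the call over its whole area.
    virtual void execute() const = 0;
    // Replay the call, touching only pixels inside clippingRect.
    virtual void execute(const Rect &clippingRect) const = 0;

    DrawCallType getType() const { return _type; }

private:
    DrawCallType _type;
};

// Illustrative subclass: a blit only remembers what is copied and where.
class BlitDrawCall : public DrawCall {
public:
    BlitDrawCall(int imageId, int x, int y)
        : DrawCall(DrawCallType::Blit), _imageId(imageId), _x(x), _y(y) {}

    bool compare(const DrawCall &other) const override {
        if (other.getType() != DrawCallType::Blit)
            return false;
        const BlitDrawCall &o = static_cast<const BlitDrawCall &>(other);
        return _imageId == o._imageId && _x == o._x && _y == o._y;
    }

    // Stubs: a real implementation would copy pixels to the target buffer.
    void execute() const override {}
    void execute(const Rect &) const override {}

private:
    int _imageId, _x, _y;
};
```

The type tag is one way to make `compare` safe across subclasses without relying on RTTI; whether the real code does it this way is not stated here.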
As you can see, the class exposes the basic functionality required to implement dirty rects: you need to be able to check whether two draw calls are equal, and you need to be able to perform the draw call (with or without a restricting rectangle).
At the moment only a few operations would be encapsulated inside this class, namely blitting (to the framebuffer or the zbuffer) and triangle and line rasterization.
That's all for the first part of this series of posts. The next one will describe in more depth the implementation of the DrawCall subclasses and the problems that arise when encapsulating blitting and 3D rendering operations!
I wonder about a few things here:
1: What _IS_ a drawCall, is it any tgl-function (or blit-api-function), or is it some subset of them?
2: Why use "compare", when C++ has perfectly good operator overloading available?
3: Is the virtual overhead really that low? This question really relates back to 1, as if you need to do virtual calls for every single tinygl-call, then you get a rather big overhead that will have to happen for every tinygl-call, regardless of whether the frame is completely clean or not.
4: What is "the relevant information"? Consider that you're working with a state-machine here, and might want to avoid having to rerun too many of those functions.
5: When do you calculate the dirty area?
1: a draw call is a function that writes pixels to the screenbuffer (be it color or zbuffer), so tglBlit, and all the fillTriangle* family of functions inside tinyGL basically (no tglVertex or tglColor etc)
2: That was just a stub and a virtual function called compare looked more straightforward and understandable to me, nothing wrong with overloading the operator
3: Just read answer 1, it is going to be low because a draw call will be issued only after a few tgl* functions, that means it shouldn't really be that heavy
4: Relevant information depends on the operation type: blits need only a little information from the tgl state machine, whereas 3D rendering needs more
5: I wanted to cover this in part 2 but a function that "flips" the buffer is required somewhere in tinyGL, it should be something similar to what DirectX does with its function Present.
1: Won't catching the calls at that low level mean that you'll still have to do ALL the math for the 3D for every frame? So basically it's just the final blitting that you'll avoid?
1a: How do you know at the fillTriangle-level whether anything else has changed (blend-mode, rotation-matrices, textures etc)?
4: Doesn't the 3D rendering also just consider the state of the TGL state machine?
5: Well, wouldn't it be easier to keep a running account of this, considering the draw calls done up to this point last frame to the current draw call? (This is what I do in WME, but it MIGHT not be suitable here, idk). As you do have the information available at the point when you store the call, whether or not it was the same last frame.
1: True, but profiling says that the rasterization is the most performance intensive part, so that's where we should be optimizing
1a: fillTriangle becomes the core of "execute" of that DrawCall type, which means you store all the information of the state that will affect that call inside the instance of DrawCall and then compare that as well, if that makes sense.
4: Sure, but if we can only store the final result we'll be saving up lots of space
5: This would also be possible of course, though I'm not sure which approach is faster, we still need a way to tell tinyGL that a new frame is starting
1: Yes, but consider that every high-level call might result in a ton of work "under the hood".
1a: How do you know what information affects that call? In theory that's more or less all of the TinyGL-state, no?
4: Well, that depends really, you have to catch stuff like texture-updates etc, which happen at a higher level of abstraction, and then dirty everything that connects to that texture-id. Consider that in practice there are very few GL-calls that actually trigger drawing, and a ton that doesn't trigger drawing, but just change state.
5: Well, in theory you can have a linear lookup when doing the check while creating the frame, since you can just move the "checkpoint" one step for every drawCall, and compare it to the previous one. (This is how the ticket-system is supposed to work in WME, checking the order of calls gets rather tricky though, as you MIGHT get the same draw call later than last frame, and thus have to dirty it, and a bunch more, since the guarantee you're providing is that the end result is supposed to be the same as without Dirty Rects). You do need some kind of "frame done"-marker to work with dirty rects, no matter how you look at it, yes.
1: That's pretty much the point of this work, if we can avoid those high-level calls then we're saving up a lot of work
1a: More on this in the next post
4: The only thing that might affect this really is a texture update, other states affect what gets computed before geometry to viewport transformation for the majority of the functions so this shouldn't be an issue
5: True, but I don't really see the point if you can just compare everything at the end (you need to cycle through the draw calls anyways to render everything)