What to do when you have the dirty regions?
For those that have followed the weekly updates might know that my branch now can generate somewhat correct dirty regions for many controls. This means that most of the GUI is usable in a dirty region rendered way.
I searched quite a lot on the internet for how to handle the regions created in a good way and I found very little information beyond the simple case, so I decided to show my solution to spur some discussion and perhaps to help someone that might be faced with the same problem I was.
While we wander down the tree of controls each control marks a dirty region if they have changed, this means that when we are ready to render we might have lots of regions available which must be rendered. To render all these regions a simple solution is to take the union of them all, this allows us to render the parts that have changed. While this surely is more effective than rendering the entire screen it is still likely we render an area which has not been changed. Say you change one pixel in each screen corner, the union would envelope the entire screen which would lead to unnecessary bandwidth use. On the other hand rendering each and every region makes overhead more noticeable. Each render pass will while iterating through the tree of controls do a lot of matrix operations. Each matrix operations leads to a lot of floating point calculations which is something CPU’s normally aren’t suited for. Intuitively one would say that a better solution would be a middle ground, create unions of those regions which are close and adding rendering passes on those that are far away as seen in figure 1. The question is, how should one solve this problem?
Figure 1 – Two ways one could create the render passes
To create a greedy algorithm that solves this perticular problem we first need to define it. The problem we are faced with is that we have X regions, does there exist a solution with Y regions which has a lower cost of rendering than the X regions.
When the problem is defined one can see that the unification of the regions will solve the problem, and as a matter of fact many applications solve it this way due to its simplicity. A well known example is the Android operating system which render in one pass and as such the union of the changed regions.
The union solution is great when you have a limited amount of pixels or a well structured layout when its unlikely a change isn’t confined, which is the case on Android. XBMC on the other hand wants changes at many places throughout the screen, an example is RSS scrolling while a button is pulsating and in PM3.HD this would mean needing to render half the screen if the union solution is used.
To find a more optimal solution we first define 2 costs, one cost is per area needing to be rendered and another cost is the cost for doing one rendering pass no matter the size of the area. This reduces the problem to a cost reduction problem which can be solved rather simple. While it can be solved with trying all possible solutions its far from optimized and as such I have composed a greedy algorithm to solve it fast and hopefully often optimally.
The psuedo code would be
For every marked dirty region:
Find the cheapest unions cost against the already created rendering passes:
If the cheapest union is cheaper than adding the marked region as a new rendering pass, use the union, otherwise create a new rendering pass.
This solution can either become one unified region or several small ones depending on the costs, this makes it a very flexible solution which takes almost the same time to solve as a plain union. One thing one might note is that the order of the marked dirty region might affect the found solution. Since I am no algorithm expert I can’t mathematically show how well the solution will perform so I have tried it instead. To see how often it finds the cheapest I tried randomizing a number of regions then use the algorithm on all available orders to find the one with the lowest cost. It turns out that it depends quite much on the number of regions and there spread. If I used 10-20 regions 30% of the cases the first tried is the one with the lowest cost while 5 regions that number was 70%. While these numbers might sound small, especially with more regions, the first one found was usually 108% of the lowest region in my tests. When having only 5 regions dirty it went down to 103%. So while it might not be the smallest it is still producing a rather good approximation. Noteworthy is that the union solution produces a region about 112% when 20 regions and 150% when 5 regions.
Figure 3 – Greedy algorithm of cost reduction against cheapest solution (blue is cheapest and red is the greedy solution)