Working with dirty regions
Some of you may remember that back in 2010, Tobias (topfs2) began working on a GSoC project to improve XBMC performance on the BeagleBoard. Many optimizations came out of this project, but the most ambitious feature was dirty region rendering.
The short of it is that a major chunk of dirty-region rendering has finally been merged into XBMC’s bleeding-edge code, though it is disabled by default while we continue polishing the rough edges. The change produces impressive performance gains on low-powered hardware, and is the groundwork for many changes to come that will further reduce CPU and GPU consumption for all XBMC users.
For those interested in a more detailed explanation, read on.
As you may remember, XBMC started as a project for the original XBOX where things were very different. After the move to desktops and embedded environments, many of the legacy procedures remained. One of these procedures, and a long-time thorn in XBMC’s side, is its rendering model, in which every frame is rendered by the GPU in its entirety, typically at 60fps. As you can imagine, this is incredibly intensive and very unfriendly to low-power platforms. This is where dirty-region rendering comes in. Thanks to the work of Tobias and Jonathan (jmarshall), XBMC now has the ability to only render what has changed.
Marking the dirty regions
So how is this accomplished? Let’s use an example.
![]()
In this screen, the user has moved from the Browse button to Add. Previously, XBMC’s renderer would’ve happily uploaded the entire screen to the GPU for each frame (remember that almost every movement in XBMC comes with an animation, so there’s rarely a single-frame change). So in this example, assuming we’re running at 1080p, we have uploaded an entire 1920×1080 scene 30 times just to change the selection. In addition, each pixel is likely rendered more than once due to the layering of dialogs and blending of translucent textures – in the above case each pixel is rendered around 4 times. That’s roughly 1GB of data sent to the GPU for a half-second animation! But not only that, even after the animation is finished, data continues flowing at the same rate – even if nothing on screen is changing at all!
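For the curious, the 1GB figure can be reproduced with a quick back-of-the-envelope calculation. The sketch below assumes 4 bytes per pixel (32-bit colour), which is not stated above; the frame count and the overdraw factor are the ones from the example.

```python
# Rough estimate of the data rendered during the half-second animation.
# Assumption (not stated in the post): 4 bytes per pixel (32-bit colour).
WIDTH, HEIGHT = 1920, 1080
BYTES_PER_PIXEL = 4   # assumed
FRAMES = 30           # half a second at 60fps
OVERDRAW = 4          # each pixel is rendered around 4 times


def bytes_rendered(width, height, frames, overdraw, bytes_per_pixel=BYTES_PER_PIXEL):
    """Total bytes touched when a width x height area is redrawn every frame."""
    return width * height * bytes_per_pixel * frames * overdraw


full_screen_bytes = bytes_rendered(WIDTH, HEIGHT, FRAMES, OVERDRAW)
print(full_screen_bytes)  # 995328000 bytes, i.e. roughly 1GB
```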
The obvious solution is to send only the data that have changed, and this is exactly how dirty-regions work. With each pass of the rendering loop, we now have the ability to mark controls as dirty. In the example above, the current and next buttons are marked dirty for the length of their animation. We then create a rectangle that contains all dirty controls and send it out for display. During this animation, the data transfer drops to just 16MB. When the animation is complete, nothing is dirty so nothing is uploaded at all.
Clearly the savings here are massive.
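The "rectangle that contains all dirty controls" idea can be sketched in a few lines. This is only an illustration of the union approach, not XBMC’s actual code: the `Rect` class, the `union` helper and the button coordinates are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle in screen pixels (hypothetical helper type)."""
    x1: int
    y1: int
    x2: int
    y2: int

    def area(self):
        return max(0, self.x2 - self.x1) * max(0, self.y2 - self.y1)


def union(dirty_rects):
    """Smallest rectangle containing every dirty rect, or None if nothing is dirty."""
    rects = list(dirty_rects)
    if not rects:
        return None  # nothing changed, so nothing needs to be redrawn
    return Rect(
        min(r.x1 for r in rects),
        min(r.y1 for r in rects),
        max(r.x2 for r in rects),
        max(r.y2 for r in rects),
    )


# Made-up positions for the two buttons involved in the selection change.
browse_button = Rect(1200, 400, 1460, 464)
add_button = Rect(1200, 480, 1460, 544)

region = union([browse_button, add_button])
screen = Rect(0, 0, 1920, 1080)
print(region, f"{region.area() / screen.area():.1%} of the screen")
```

With these made-up coordinates the redrawn region is a small fraction of the full frame, and once the animation ends the list of dirty rectangles is empty, so `union` returns `None` and nothing is redrawn.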
The impact
For those of you running XBMC on a desktop, beyond knowing that your GPU is working much less, you may also notice a drop in CPU usage. For those on low-power x86 machines like IONs, it is quite possible that there will be some speed-ups along with the drop in CPU usage. The biggest impact, however, will be seen on low-power embedded devices; hardware like the beagle/panda boards is now much more interesting. Additionally, the atv2 and iPad ports should see a nice benefit from this, though there is currently a bug that prevents correct rendering when dirty-region rendering is enabled.
Because we now know which controls are dirty, several future changes can build on this to reduce CPU usage further. We hope that we can finally reduce XBMC’s idle CPU and GPU utilization to where they should be, especially when it is minimized or has lost focus.
Try it out
Obviously you will need a bleeding-edge build to see the new functionality, and the usual caveats about running unstable builds apply. Currently there are a few bugs that are blocking dirty-region rendering from being enabled by default, so for now you’ll need to explicitly enable it in advancedsettings.xml. See here for the settings.
If you are interested in seeing what is going on behind the scenes, you can enable the visualizer (used in the example above) that paints a rectangle over marked regions. Use the <visualizedirtyregions> setting from the wiki link to enable.
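As a rough sketch of what the file might contain, the snippet below generates an advancedsettings.xml with the union algorithm enabled and the visualizer switched off. The two setting names are the ones mentioned in this post; nesting them under a `gui` element follows the layout readers posted in the comments and should be treated as an assumption, so check the wiki page before relying on it. The `build_settings` helper itself is hypothetical.

```python
import xml.etree.ElementTree as ET


def build_settings(algorithm=1, visualize=False):
    """Return advancedsettings.xml content as a string.

    algorithm: 0 = off, 1 = union, 2 = cost reduction
    visualize: paint rectangles over marked regions (debugging aid)
    """
    root = ET.Element("advancedsettings")
    gui = ET.SubElement(root, "gui")  # nesting under <gui> is an assumption
    ET.SubElement(gui, "algorithmdirtyregions").text = str(algorithm)
    ET.SubElement(gui, "visualizedirtyregions").text = "true" if visualize else "false"
    ET.indent(root)  # pretty-print; requires Python 3.9 or newer
    return ET.tostring(root, encoding="unicode")


xml_text = build_settings(algorithm=1, visualize=False)
print(xml_text)
```

Save the printed text as advancedsettings.xml in your userdata folder, and pass `visualize=True` if you want to see the marked regions.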
Thank you so much for your great effort to bring this feature in. It’s going to make a much better path for the small CPU.
The wiki “true” example for algorithmdirtyregions does not look right?
Are any complete nightly bleeding-edge builds of XBMC Live planned?
Enable dirty-region processing.
0: off
1: Union
2: Cost reduction
Enable dirty-region visualization. Paints a rectangle over marked controls.
true: off
false: on
Example:
1
true
This is what I try…
I mean this one..
1
true
Wow!!!!!!!!! This is awesome news great job guys can’t wait to test out on the atv2 just awesome
weird, the cpu usage is actually less when moving in the gui than when idling. Should the regions be green or red when it’s working?
Fixed. Thanks for the heads up.
@ watzen, the different colors just indicate different regions….
Wiki doesn’t seem to be fixed, this setting worked for me (to see the affected regions):
true
while the wiki reports
true
btw, great features, hope to see xbmc in car av equipment soon :)
Will this work for any skin?
Eg. Aeon MQ2?
Will it speed up performance of flicking screens or single buttons for any skin?
Or is it only for skins that Dirty knows each button?
So, is this right? because it does not seem to work for me
2
false
ok, looks like angle brackets get dropped in comments, I’ve swapped them for square ones, lets try one more time :)
[advancedsettings]
[gui]
[algorithmdirtyregions]2[/algorithmdirtyregions]
[visualizedirtyregions]false[/visualizedirtyregions/]
[/gui]
[/advancedsettings]
THANK YOU! I’m the guy in our family trying to bring as many low-end (affordable) machines to the job of playing videos from our repo in two houses. It gets harder every day! P4, 1g+ AND Nvidia? That’s no longer a cheap machine!
Thank you for working on things like this; it means a lot more than you know!
Great! I love these technical reports of what’s happening.
Brian: P4 machines are free or almost free at your nearest recycling point. Or $50 used. or $150+ new.
Great news though. Probably very useful for OEMs that are pondering a xbmc box. And we WANT those!
Interesting! Sounds like a major improvement, going to try it tonight.
Just wow – innovation continues to amaze.
Nice one!
@watzen: Red are the marked dirty regions – any control that’s dirty will be red. We then combine these in various ways (the differing algorithms set via advancedsettings.xml) which gives one or more regions that we re-draw: the green region.
Sometimes the green and red region will align exactly (eg rss on Confluence home) whereby you get a yellow-ish rectangle.
For skinners, the goal is to minimize the red regions primarily. Not all controls are working yet – you’ll notice some will always be red (textbox for instance.)
Note that if you have dirty region visualisation enabled then we re-render the entire scene every frame (i.e. it’s the same as having algorithm set to 0) – it’s primarily for debugging.
so does this only work in confluence or any skin?
Add usage of OpenMP multiprocessing API to that and XBMC should fly on most multi-core processors!
Are there any plans to implement asynchronous and synchronous multithreading support via OpenMP?
this is really awesome improvement to xbmc.
Up the XBMC’rers ! Finally we can see people looking for retrieval of what used to be a developer’s Nirvana: Optimized Code !
In times where hardware is cheaper than available time, associated to the eager to provide new releases to the community it is great news you are able to focus on pieces left behind.
Proud regards from a Brazilian supporter since 2004 !
Now running XBMC on several devices all over the house :)
One note: for all testing, I would suggest using algorithm 1 (union). It’s a very cheap algorithm and is most likely the best one for normal hardware. The cost reduction algorithm (2) was made mostly as a test, and its costs are tuned for the very slow GPU of the beagleboard c4, hence it will most likely not be ideal for more modern GPUs (and may in fact be worse).
We will most likely revisit the cost reduction algorithm at a later stage and make it more dynamic. Overall I’m not happy with the algorithm and will investigate doing an A* search instead, as it’s sure to find optimal regions. But before that we need a way to find the costs, which involves benchmarking the hardware. This is _not_ implemented yet and it will take a lot of time before we are there, so consider algorithm 2 academic.
This is awesome, actually my mac mini is working harder to render the interface than to render the movie itself (accelerated by crystalhd). I really notice the fan speed changing to take-off speed when browsing movies/tv-series.
Great job! XBMC team is always on the edge! this is why this project is on gold steps.
Very spiffy. I wondered why XBMC was hogging my GPU even in the background. OpenGL apps/games through Launcher plugin tended to be VERY slow… I hope this speeds things up.
@jmarshall ah, that might explain why I saw that constant cpu usage. :) wow, just tried it without the visualisation, dramatically lower cpu usage
Thanks for making a cleaner product with dirty regions ;-)
Thanks, this worked great!
Before I had between 4-8% cpu usage when idling in the main menu. Now it dropped to 0-2% (with the latest bleeding edge 2011-06-20).
I’m running xbmc on a i7, 2600k, sandy bridge (not overclocked).
@albinoman
It is completely skin independent, so will benefit all skins. Note that some skins will benefit more than others of course – it all depends on how much is changing on screen (eg if there are background animations, then we’ll be re-rendering the interface anyway).
@Harley XBMC is of course already reasonably multi-threaded. Some stuff should certainly be made more asynchronous – we use a lot of background threads, but in some cases end up waiting on them to do their thing (eg directory retrieval). The goal is moving towards an event based system rather than a render loop – this allows the render to be done on a separate thread altogether, thus ensuring the UI keeps ticking over regardless of what else is going on.
Great news!
Thanks for these regular insights to XBMC progress!!
Works like a charm! Thank you very much!
Awesome work!
Thanks! Just watching the dirty regions visualization is great fun, and knowing that it will actually stop my ION from constantly idle-heating makes it even more awesome!
Now I’m very happy!! Just FYI ;D
Unfortunately it doesn’t work for me. I have tested it in my laptop (GM965/GL960 Integrated Graphics Controller) and in the Tegra2 devboard and the CPU performance was 400% worse than previous version in both cases (measured with top and htop). If I disable the option the performance is the same.
well this looks to be great news.
how long before we see it in a standard package to use?
Does this improve the performance of the now playing/video overlay? That is the only real fault I have with my ION, navigating the XBMC GUI while a video is playing (it works but it’s choppy).
@live4ever
No difference in video. While a video is playing, the area it covers is always marked dirty, so that area is rendered every frame, the same as before.
Thanks for this great work.
Will it be included in Eden?
This feature is Awesome!
It makes the Gui really smooth on my 1.8Ghz T43 Thinkpad.
I’ve been using XBMC since the very beginning and upgrade frequently for all the new fun stuff.
When I run XBMC, I can literally hear my GPU fans run faster. Despite the fact that my GPU is quite fast (GTX460) and my case has soundproofing (Fractal R3). I cannot believe that a simple graphical interface uses so much resources.
Looking forward to this feature.
Any news on the Sigma Designs port or similar cheap embedded hardware that can do 1080p?
I can not find advancedsettings.xml under the xbmc\userdata folder. How do I enable the dirty region processing?
forgive me if im being daft but is this going to be released as a patch when complete or in a new xbmc release?
XBMC is amazing!
cheers in advance :)
I created an advancedsettings.xml file and entered the following from Matt’s post. I replaced all of the square brackets with angle brackets. However, when I run it with the June 26, 2011 build, nothing happens. Am I missing something?
[advancedsettings]
[gui]
[algorithmdirtyregions]2[/algorithmdirtyregions]
[visualizedirtyregions]false[/visualizedirtyregions/]
[/gui]
[/advancedsettings]
@jacky89
Did you fix the indentations in the xml file? Or does it just look like your post (all lined up on the left)?
Oh, and there shouldn’t be a slash at the end of the second “visualizedirtyregions”
It is fixed and works great. Thanks. Can’t wait until they can lower the idle cpu usage in xbmc down to windows 7 level.
Also for users like me with a low power ION box, set up a screensaver in XBMC to go to a black screen. It will drop your CPU WAY down when it’s not in use and in turn keeps things cooler. Before I had to shut down my box, now it just goes to the screensaver at night and stays cool