Working with dirty regions

June 19th, 2011 theuni

Some of you may remember that back in 2010, Tobias (topfs2) began working on a GSOC project to improve XBMC performance on the BeagleBoard. Many optimizations came out of this project, but the most ambitious feature was dirty region rendering.

The short of it is that a major chunk of dirty-region rendering has finally been merged into XBMC’s bleeding-edge code, though it is disabled by default while we continue polishing the rough edges. The change produces impressive performance gains on low-powered hardware, and is the groundwork for many changes to come that will further reduce CPU and GPU consumption for all XBMC users.

For those interested in a more detailed explanation, read on.

As you may remember, XBMC started as a project for the original XBOX, where things were very different. After the move to desktops and embedded environments, many of the legacy procedures remained. One of these procedures, and a long-time thorn in XBMC’s side, has been its rendering model, in which every frame is rendered by the GPU in its entirety, typically at 60fps. As you can imagine, this is incredibly intensive and very unfriendly to low-power platforms. This is where dirty-region rendering comes in. Thanks to the work of Tobias and Jonathan (jmarshall), XBMC now has the ability to render only what has changed.

Marking the dirty regions

So how is this accomplished? Let’s use an example.

In this screen, the user has moved from the Browse button to Add. Previously, XBMC’s renderer would’ve happily uploaded the entire screen to the GPU for each frame (remember that almost every movement in XBMC comes with an animation, so there’s rarely a single-frame change). So in this example, assuming we’re running at 1080p, we have uploaded an entire 1920×1080 scene 30 times just to change the selection. In addition, each pixel is likely rendered more than once due to the layering of dialogs and blending of translucent textures; in the above case each pixel is rendered around 4 times. That’s roughly 1GB of data sent to the GPU for a half-second animation! And it doesn’t stop there: even after the animation is finished, data continues flowing at the same rate, even if nothing on screen is changing at all!
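For anyone who wants to check the 1GB figure, here is a back-of-the-envelope sketch. The resolution, frame count and overdraw factor come from the paragraph above; the 4 bytes per pixel (32-bit colour) is an assumption, since the post does not state the pixel format.

```python
# Rough estimate of the data pushed to the GPU for the half-second animation.
WIDTH, HEIGHT = 1920, 1080   # 1080p, from the post
FRAMES = 30                  # full-screen redraws during the animation, from the post
OVERDRAW = 4                 # each pixel drawn about 4 times, from the post
BYTES_PER_PIXEL = 4          # ASSUMPTION: 32-bit colour, not stated in the post

bytes_per_frame = WIDTH * HEIGHT * BYTES_PER_PIXEL * OVERDRAW
total_bytes = bytes_per_frame * FRAMES

print(f"per frame: {bytes_per_frame / 1e6:.1f} MB")
print(f"animation: {total_bytes / 1e9:.2f} GB")
```

Under that assumption the total comes to 995,328,000 bytes, which is where "roughly 1GB" comes from.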

The obvious solution is to send only the data that has changed, and this is exactly how dirty-region rendering works. With each pass of the rendering loop, we now have the ability to mark controls as dirty. In the example above, the current and next buttons are marked dirty for the length of their animation. We then create a rectangle that contains all dirty controls and send only that rectangle out for display. During this animation, the data transfer drops to just 16MB. When the animation is complete, nothing is dirty so nothing is uploaded at all.
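The "rectangle that contains all dirty controls" idea can be sketched in a few lines. This is only an illustration, not XBMC's actual code: the Rect type, the union_region function and the button coordinates are all made up for the example.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle in screen pixels (hypothetical helper type)."""
    x1: int
    y1: int
    x2: int
    y2: int

    def area(self) -> int:
        return max(0, self.x2 - self.x1) * max(0, self.y2 - self.y1)


def union_region(dirty: Iterable[Rect]) -> Optional[Rect]:
    """Return the smallest rectangle enclosing every dirty rect, or None if nothing is dirty."""
    rects = list(dirty)
    if not rects:
        return None  # nothing changed, so nothing needs to be redrawn
    return Rect(
        min(r.x1 for r in rects),
        min(r.y1 for r in rects),
        max(r.x2 for r in rects),
        max(r.y2 for r in rects),
    )


# Made-up positions for the two buttons that animate in the example.
browse = Rect(1500, 300, 1800, 360)
add = Rect(1500, 380, 1800, 440)

region = union_region([browse, add])
full_screen = Rect(0, 0, 1920, 1080)
print(region, f"{region.area() / full_screen.area():.1%} of the screen")
```

A single enclosing rectangle is cheap to compute, but it can cover a lot of unchanged pixels when the dirty controls are far apart.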

Clearly the savings here are massive.

The impact

For those of you running XBMC on a desktop, beyond knowing that your GPU is working much less, you may also notice a drop in CPU usage. For those on low-power x86 machines like IONs, it is quite possible that there will be some speed-ups along with the drop in CPU usage. The greatest impact, however, will be seen on low-power embedded devices; hardware like the Beagle/Panda boards is now much more interesting. Additionally, the atv2 and iPad ports should see a nice benefit from this, though there is currently a bug that prevents correct rendering when dirty-region rendering is enabled.

Because we now know which controls are dirty, several future changes can reduce CPU usage further. We hope that we can finally reduce XBMC’s idle CPU and GPU utilization to where it should be, especially when XBMC is minimized or has lost focus.

Try it out

Obviously you will need a bleeding-edge build to see the new functionality, and the usual caveats about running unstable builds apply. Currently there are a few bugs blocking dirty-region rendering from being enabled by default, so for now you’ll need to explicitly enable it in advancedsettings.xml. See here for the settings.

If you are interested in seeing what is going on behind the scenes, you can enable the visualizer (used in the example above) that paints a rectangle over marked regions. Use the <visualizedirtyregions> setting from the wiki link to enable.
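As a rough guide to what the file should contain, the sketch below builds the XML with Python's standard library. The element names are taken from the snippets readers posted in the comments, not from the wiki itself, and build_settings is a hypothetical helper. It assumes that algorithm 1 is the union algorithm and that "true" switches the visualizer on, which is what commenters reported working.

```python
import xml.etree.ElementTree as ET


def build_settings(algorithm: int = 1, visualize: bool = False) -> str:
    """Return advancedsettings.xml content enabling dirty-region rendering.

    Element names follow the reader-posted examples; treat them as an
    assumption and check the wiki for the authoritative list.
    """
    root = ET.Element("advancedsettings")
    gui = ET.SubElement(root, "gui")
    # 0 = off, 1 = union, 2 = cost reduction (per the comments)
    ET.SubElement(gui, "algorithmdirtyregions").text = str(algorithm)
    # ASSUMPTION: "true" turns the debugging overlay on
    ET.SubElement(gui, "visualizedirtyregions").text = "true" if visualize else "false"
    return ET.tostring(root, encoding="unicode")


xml_text = build_settings(algorithm=1, visualize=False)
print(xml_text)
```

Save the printed text as advancedsettings.xml in your userdata folder.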

  1. mintra
    June 20th, 2011 at 01:18 | #1

    Thank you so much for your great effort to bring this feature in. It’s going to make a much better path for the small CPU.

  2. Pauly
    June 20th, 2011 at 01:44 | #2

    The wiki “true” example for algorithmdirtyregions does not look right?

    Are any complete nightly bleeding-edge builds of XBMC Live planned?

  3. mintra
    June 20th, 2011 at 01:59 | #3

    Enable dirty-region processing.

    0: off
    1: Union
    2: Cost reduction

    Enable dirty-region visualization. Paints a rectangle over marked controls.

    true: off
    false: on

    Example:

    1
    true

    This is what I try…

  4. mintra
    June 20th, 2011 at 02:02 | #4

    I mean this one..

    1
    true

  5. ZombieRobot
    June 20th, 2011 at 02:31 | #5

    Wow!!!!!!!!! This is awesome news great job guys can’t wait to test out on the atv2 just awesome

  6. watzen
    June 20th, 2011 at 02:33 | #6

    weird, the cpu usage is actually less when moving in the gui than when idling. Should the regions be green or red when it’s working?

  7. June 20th, 2011 at 02:47 | #7

    Pauly :
    The wiki “true” example for algorithmdirtyregions does not look right?

    Fixed. Thanks for the heads up.

  8. June 20th, 2011 at 02:51 | #8

    @ watzen, the different colors just indicate different regions….

  9. gulp
    June 20th, 2011 at 03:18 | #9

    Wiki doesn’t seem to be fixed, this setting worked for me (to see the affected regions):

    true

    while the wiki reports

    true

    btw, great features, hope to see xbmc in car av equipment soon :)

  10. Sav
    June 20th, 2011 at 03:34 | #10

    Will this work for any skin?
    Eg. Aeon MQ2?

    Will it speed up performance of flicking screens or single buttons for any skin?
    Or is it only for skins that Dirty knows each button?

  11. Mat
    June 20th, 2011 at 03:37 | #11

    So, is this right? because it does not seem to work for me

    2
    false

  12. Mat
    June 20th, 2011 at 03:39 | #12

    ok, looks like angle brackets get dropped in comments, I’ve swapped them for square ones, lets try one more time :)

    [advancedsettings]
    [gui]
    [algorithmdirtyregions]2[/algorithmdirtyregions]
    [visualizedirtyregions]false[/visualizedirtyregions/]
    [/gui]
    [/advancedsettings]

  13. June 20th, 2011 at 03:41 | #13

    THANK YOU! I’m the guy in our family trying to bring as many low-end (affordable) machines to the job of playing videos from our repo in two houses. It gets harder every day! P4, 1g+ AND Nvidia? That’s no longer a cheap machine!

    Thank you for working on things like this; it means a lot more than you know!

  14. sebak
    June 20th, 2011 at 04:14 | #14

    Great! I love these technical reports of what’s happening.

  15. Gert
    June 20th, 2011 at 04:34 | #15

    Brian: P4 machines are free or almost free at your nearest recycling point. Or $50 used. or $150+ new.

    Great news though. Probably very useful for OEMs that are pondering a xbmc box. And we WANT those!

  16. Jeroen
    June 20th, 2011 at 04:51 | #16

    Interesting! Sounds like a major improvement, going to try it tonight.

  17. Me
    June 20th, 2011 at 05:11 | #17

    Just wow – innovation continues to amaze.
    Nice one!

  18. June 20th, 2011 at 05:43 | #18

    @watzen: Red are the marked dirty regions – any control that’s dirty will be red. We then combine these in various ways (the differing algorithms set via advancedsettings.xml) which gives one or more regions that we re-draw: the green region.

    Sometimes the green and red region will align exactly (eg rss on Confluence home) whereby you get a yellow-ish rectangle.

    For skinners, the goal is to minimize the red regions primarily. Not all controls are working yet – you’ll notice some will always be red (textbox for instance.)

    Note that if you have dirty region visualisation enabled then we re-render the entire scene every frame (i.e. it’s the same as having algorithm set to 0) – it’s primarily for debugging.

  19. albinoman
    June 20th, 2011 at 05:44 | #19

    so does this only work in confluence or any skin?

  20. June 20th, 2011 at 05:50 | #20

    Add usage of OpenMP multiprocessing API to that and XBMC should fly on most multi-core processors!

    Are there any plans to implement asynchronous and synchronous multithreading support via OpenMP?

  21. MrDude
    June 20th, 2011 at 06:37 | #21

    this is really awesome improvement to xbmc.

  22. Adriano Leal
    June 20th, 2011 at 06:45 | #22

    Up the XBMC’rers ! Finally we can see people looking for retrieval of what uses to be a developers Nirvana: Optimized Code !
    In times where hardware is cheaper than available time, associated to the eager to provide new releases to the community it is great news you are able to focus on pieces left behind.
    Proud regards from a Brazilian supporter since 2004 !
    Now running XBMC on several devices all over the house :)

  23. topfs2
    June 20th, 2011 at 07:54 | #23

    One note I’d like to bring is for all testing, I would like to suggest you to use algorithm 1 (union), its a very cheap algorithm and its the one which is most likely best for normal hardware. The cost reduction algorithm (2) was made mostly as a test and the costs are made for the very slow gpu of the beagleboard c4, hence it will most likely not be ideal for more modern GPUs (may in fact be worse).

    We will most likely revisit and make the cost reduction at a later state be more dynamic and overall I’m not happy with the algorithm and will investigate doing an A* search for solving instead, as its sure to find optimal regions. But before that we need to have a way to find the costs, which involves benchmarks of the hardware, this is _not_ implemented yet and will take a lot of time before we are there, as such consider algorithm 2 as academic.

  24. Brainfrz
    June 20th, 2011 at 09:26 | #24

    This is awesome, actually my mac mini is working harder to render the interface than to render the movie itself (accelerated by crystalhd). I really notice the fan speed changing to take-off speed when browsing movies/tv-series.

  25. June 20th, 2011 at 10:04 | #25

    Great job! XBMC team is always on the edge! this is why this project is on gold steps.

  26. Stephen Baker
    June 20th, 2011 at 12:31 | #26

    Very spiffy. I wondered why XBMC was hogging my GPU even in the background. OpenGL apps/games through Launcher plugin tended to be VERY slow… I hope this speeds things up.

  27. watzen
    June 20th, 2011 at 12:42 | #27

    @jmarshall ah, that might explain why I saw that constant cpu usage. :) wow, just tried it without the vizualisation, dramatically lower cpu usage

  28. Marianodt
    June 20th, 2011 at 12:47 | #28

    Thanks for making a cleaner product with dirty regions ;-)

  29. doze
    June 20th, 2011 at 13:52 | #29

    Thanks, this worked great!

    Before I had between 4-8% cpu usage when idling in the main menu. Now it dropped to 0-2% (with the latest bleeding edge 2011-06-20).
    I’m running xbmc on a i7, 2600k, sandy bridge (not overclocked).

  30. June 20th, 2011 at 16:32 | #30

    @albinoman
    It is completely skin independent, so will benefit all skins. Note that some skins will benefit more than others ofcourse – it all depends on how much is changing on screen (eg if there’s background animations, then we’ll be re-rendering the interface anyway).

    @Harley XBMC is ofcourse already reasonably multi-threaded. Some stuff should certainly be made more asynchronous – we use a lot of background threads, but in some cases end up waiting on them to do their thing (eg directory retrieval). The goal is moving towards an event based system rather than a render loop – this allows the render to be done on a separate thread altogether, thus ensuring the UI keeps ticking over regardless of what else is going on.

  31. philneko
    June 21st, 2011 at 04:27 | #31

    Great news!

    Thanks for these regular insights to XBMC progress!!

  32. Vern
    June 21st, 2011 at 06:00 | #32

    Works like a charm! Thank you very much!

  33. Bardun
    June 21st, 2011 at 08:18 | #33

    Awesome work!

  34. Zarbis
    June 21st, 2011 at 11:45 | #34

    Thanks! Just watching the dirty regions visualization is great fun, and knowing that it actually will stop my ION from constantly idle-heating makes it even more awesome!

  35. apanloco
    June 21st, 2011 at 14:45 | #35

    Now I’m, very happy!! Just FYI ;D

  36. xbmcuser2000
    June 21st, 2011 at 15:31 | #36

    Unfortunately it doesn’t work for me. I have tested it in my laptop (GM965/GL960 Integrated Graphics Controller) and in the Tegra2 devboard and the CPU performance was 400% worse than previous version in both cases (measured with top and htop). If I disable the option the performance is the same.

  37. chris
    June 22nd, 2011 at 09:21 | #37

    well this looks to be great news.

    how long before we see it in a standard package to use?

  38. live4ever
    June 22nd, 2011 at 11:08 | #38

    Does this improve the performance of the now playing/video overlay? That is the only real fault I have with my ION, navigating the XBMC GUI while a video is playing (it works but it’s choppy).

  39. topfs2
    June 22nd, 2011 at 19:09 | #39

    @live4ever
    No difference in video. When playing video it will mark the area the video covers always, as such it will render always, so same as before.

  40. miljbee
    June 23rd, 2011 at 05:20 | #40

    Thanks for this great work.
    Will it be included in Eden?

  41. Tommy
    June 23rd, 2011 at 13:50 | #41

    This feature is Awesome!

    It makes the Gui really smooth on my 1.8Ghz T43 Thinkpad.

    I’ve been using XBMC since the very beginning and upgrade frequently for all the new fun stuff.

  42. voodu
    June 24th, 2011 at 07:29 | #42

    When I run XBMC, I can literally hear my GPU fans run faster. Despite the fact that my GPU is quite fast (GTX460) and my case has soundproofing (Fractal R3). I cannot believe that a simple graphical interface uses so much resources.

    Looking forward to this feature.

  43. June 25th, 2011 at 05:01 | #43

    Any news on the Sigma Designs port or similar cheap embedded hardware that can do 1080p?

  44. jacky89
    June 26th, 2011 at 13:39 | #44

    I can not find advancedsettings.xml under the xbmc\userdata folder. How do I enable the dirty region processing?

  45. Rizzo
    June 26th, 2011 at 16:00 | #45

    forgive me if im being daft but is this going to be released as a patch when complete or in a new xbmc release?
    XBMC is amazing!
    cheers in advance :)

  46. jacky89
    June 26th, 2011 at 16:31 | #46

    I created an advancedsettings.xml file and entered the following from Matt’s post. I replaced all of the square brackets with angle brackets. However, when I run it with the June 26, 2011 build, nothing happens. Am I missing something?

    [advancedsettings]
    [gui]
    [algorithmdirtyregions]2[/algorithmdirtyregions]
    [visualizedirtyregions]false[/visualizedirtyregions/]
    [/gui]
    [/advancedsettings]

  47. Anonymous
    June 30th, 2011 at 05:37 | #47

    @jacky89

    Did you fix the indentations in the xml file? Or does it just look like your post (all lined up on the left)?

  48. Anonymous
    June 30th, 2011 at 05:40 | #48

    jacky89 :
    I created an advancedsettings.xml file and entered the following from Matt’s post. I replaced all of the square brackets with angle brackets. However, when I run it with the June 26, 2011 build, nothing happens. Am I missing something?
    [advancedsettings]
    [gui]
    [algorithmdirtyregions]2[/algorithmdirtyregions]
    [visualizedirtyregions]false[/visualizedirtyregions/]
    [/gui]
    [/advancedsettings]

    Oh, and there shouldn’t be a slash at the end of the second “visualizedirtyregions”

  49. jacky89
    July 2nd, 2011 at 15:12 | #49

    It is fixed and works great. Thanks. Can’t wait until they can lower the idle cpu usage in xbmc down to windows 7 level.

  50. July 2nd, 2011 at 22:36 | #50

    Also for users like me with a low power ION box, set up a screensaver in XBMC to go to a black screen. It will drop your CPU WAY down when it’s not in use and in turn keeps things cooler. Before I had to shutdown my box, now it just goes to the screensaver at night and stays cool
