How do I troubleshoot a "severe" lockup?


xbmcJJ
2009-03-10, 21:03
First, a disclaimer: I'm running this on Jaunty because that was the easiest way to get HDMI audio out on my Intel X4500HD video. I know it's not an officially supported platform.

Kernel: 2.6.28
XBMC: pre-9.04 r18324 (compiled Mar 7 2009) - installed via the svn ppas
Intel 2.6.1 display driver

The GNOME desktop seems to work fine; Firefox, etc. launch and run indefinitely. glxinfo reports "direct rendering: yes" and glxgears gives ~1200 fps.

I can access my movie collection via XBMC and play .iso .wmv and .avi test files with no problems.

I've been running with XBMC windowed for setup & testing purposes.

On certain plugins, for example Apple Movie Trailers, I'll drill down, the list will populate with trailers and then the system locks up.

I can still move the cursor, but clicks don't work. The screen doesn't update (gnome clock stops, cpu graph on gnome desktop stops, etc.).

Switching to virtual consoles doesn't work.

The system does respond to pings, though, and I can ssh in. Commands like ls, du, df, man, lspci, ps, top all seem to work fine, and top reports 0-1% CPU utilization on both cores.

Here's where it gets weird: "ps a" hangs the ssh session completely. ctrl+c does nothing. I can open as many ssh sessions as I want, but as soon as I issue the ps a command, the session hangs and I end up having to close the terminal.
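One possible reading of this symptom (an assumption, not something confirmed in this thread): top keeps working because it only needs the per-process stat files under /proc, while "ps a" also asks for each process's command line, and that read can block when the target process is wedged inside the kernel. A minimal Python sketch that lists tasks in uninterruptible sleep (state D) using only the stat and wchan files might look like this; the function names are made up for illustration.

```python
import glob


def parse_stat(text):
    """Split one /proc/<pid>/stat line into (pid, comm, state).

    comm is wrapped in parentheses and may itself contain spaces or ')',
    so split on the LAST ')' instead of splitting on whitespace.
    """
    head, _, tail = text.rpartition(")")
    pid, _, comm = head.partition(" (")
    state = tail.split()[0]
    return int(pid), comm, state


def blocked_tasks(proc_root="/proc"):
    """Return (tid, comm, wchan) for every thread in state D."""
    found = []
    # Walk threads, not just processes: the stuck thread may not be the main one.
    for path in glob.glob(proc_root + "/[0-9]*/task/[0-9]*/stat"):
        try:
            with open(path) as f:
                tid, comm, state = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # thread exited or entry unreadable
        if state != "D":
            continue
        try:
            # wchan names the kernel function the thread is sleeping in, if available
            with open(path[:-len("stat")] + "wchan") as f:
                wchan = f.read().strip()
        except OSError:
            wchan = "?"
        found.append((tid, comm, wchan))
    return found


for tid, comm, wchan in blocked_tasks():
    print(tid, comm, wchan)
```

Run over ssh while the desktop is frozen; an xbmc.bin thread showing up here with a driver-related wchan would point toward the kernel or graphics stack instead of XBMC itself.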

If I run xbmc via gdb per this post http://xbmc.org/forum/showthread.php?t=30230 , I can SSH in to the locked machine, ps a works, and I can kill the gdb and xbmc.bin processes and regain control of the desktop. But I never get the chance to get a backtrace from gdb.

Neither the X log, system log nor xbmc logs seem to show anything glaringly wrong - I wonder whether there's even a chance for the process to write to the log before the lockup.

As a test, I also installed boxee. There, if I drill down to Friends activity, and pick a recommended video, the system locks up in an identical way.

Here's the tail of my xbmc log:

xbmc@mediator:~$ tail -n 50 /var/tmp/xbmc-xbmc.log
11:47:13 T:2713713552 M:2536112128 INFO: Creating thumb from: http://images.apple.com/moviesxml/s/independent/posters/alientrespass_xl200902251509.jpg as: special://masterprofile/Thumbnails/Video/1/17b047cc.tbn
11:47:13 T:2729786256 M:2536112128 INFO: Creating thumb from: http://images.apple.com/moviesxml/s/independent/posters/americanviolet_xl200902251441.jpg as: special://masterprofile/Thumbnails/Video/e/ec0df7ac.tbn
11:47:13 T:2696928144 M:2536112128 INFO: Creating thumb from: http://images.apple.com/moviesxml/s/sony_pictures/posters/angelsdemons_xl200811061144.jpg as: special://masterprofile/Thumbnails/Video/b/b324a854.tbn
11:47:13 T:2811222928 M:2535088128 DEBUG: FileCurl::Open(0xa78fba9c) http://images.apple.com/moviesxml/s/sony/posters/12_xl200811041428.jpg
11:47:13 T:2705320848 M:2535088128 DEBUG: FileCurl::Open(0xa13fca9c) http://images.apple.com/moviesxml/s/sony_pictures/posters/adoration_xl200812171620.jpg
11:47:13 T:2713713552 M:2535088128 DEBUG: FileCurl::Open(0xa1bfda9c) http://images.apple.com/moviesxml/s/independent/posters/alientrespass_xl200902251509.jpg
11:47:13 T:2811222928 M:2535088128 INFO: easy_aquire - Created session to http://images.apple.com
11:47:13 T:2729786256 M:2533924864 DEBUG: FileCurl::Open(0xa2b51a9c) http://images.apple.com/moviesxml/s/independent/posters/americanviolet_xl200902251441.jpg
11:47:13 T:2705320848 M:2533924864 INFO: easy_aquire - Created session to http://images.apple.com
11:47:13 T:2696928144 M:2533720064 DEBUG: FileCurl::Open(0xa0bfba9c) http://images.apple.com/moviesxml/s/sony_pictures/posters/angelsdemons_xl200811061144.jpg
11:47:13 T:2713713552 M:2533412864 INFO: easy_aquire - Created session to http://images.apple.com
11:47:13 T:2729786256 M:2531467264 INFO: easy_aquire - Created session to http://images.apple.com
11:47:13 T:2696928144 M:2531057664 INFO: easy_aquire - Created session to http://images.apple.com
11:47:13 T:3065562992 M:2528321536 DEBUG: Load special://masterprofile/Thumbnails/Video/2/25fc2aba.tbn: 106.3ms
11:47:13 T:3065562992 M:2529038336 DEBUG: Load DefaultVideoBig.png: 1.3ms (bundled)
11:47:13 T:3065562992 M:2524561408 DEBUG: Load special://masterprofile/Thumbnails/Video/e/ee4eac93.tbn: 49.2ms
11:47:13 T:3065562992 M:2524049408 DEBUG: Load special://masterprofile/Thumbnails/Video/2/2aa59422.tbn: 17.9ms
11:47:13 T:3065562992 M:2521387008 DEBUG: Load special://masterprofile/Thumbnails/Video/8/8a613d6c.tbn: 63.6ms
11:47:13 T:3065562992 M:2521387008 DEBUG: Load list-focus.png: 0.0ms (bundled)
11:47:13 T:3065562992 M:2521284608 DEBUG: Load defaultFolderBackBig.png: 0.9ms (bundled)
11:47:14 T:2705320848 M:2523762688 DEBUG: Curl::Debug About to connect() to images.apple.com port 80 (#0)
11:47:14 T:2705320848 M:2523762688 DEBUG: Curl::Debug Trying 65.32.34.82...
11:47:14 T:2729786256 M:2523762688 DEBUG: Curl::Debug About to connect() to images.apple.com port 80 (#0)
11:47:14 T:2729786256 M:2523762688 DEBUG: Curl::Debug Trying 65.32.34.56...
11:47:14 T:2794945424 M:2523762688 INFO: Python script stopped
11:47:14 T:2794945424 M:2523762688 DEBUG: staticThread, deleting thread graphic context
11:47:14 T:2794945424 M:2523762688 DEBUG: Thread 2794945424 terminating
11:47:14 T:2705320848 M:2523762688 DEBUG: Curl::Debug Connected to images.apple.com (65.32.34.82) port 80 (#0)
11:47:14 T:2705320848 M:2523762688 DEBUG: Curl::Debug GET /moviesxml/s/sony_pictures/posters/adoration_xl200812171620.jpg HTTP/1.1
11:47:14 T:2705320848 M:2523762688 DEBUG: Curl::Debug User-Agent: XBMC/pre-9.04 r18324 (Linux; Ubuntu jaunty (development branch); Linux 2.6.28-8-generic; http://www.xbmc.org)
11:47:14 T:2705320848 M:2523762688 DEBUG: Curl::Debug Host: images.apple.com
11:47:14 T:2705320848 M:2523762688 DEBUG: Curl::Debug Accept: */*
11:47:14 T:2705320848 M:2523762688 DEBUG: Curl::Debug Connection: keep-alive
11:47:14 T:2705320848 M:2523762688 DEBUG: Curl::Debug Expire cleared
11:47:14 T:3065562992 M:2523820032 DEBUG: python thread 1 destructed
11:47:14 T:3065562992 M:2523820032 INFO: Python, unloading python24.dll because no scripts are running anymore
11:47:14 T:3065562992 M:2523615232 DEBUG: UnloadExtensionLibs, clearing python extension libraries
11:47:14 T:3065562992 M:2523615232 DEBUG: Unloading: time.so
11:47:14 T:3065562992 M:2523615232 DEBUG: Unloading: strop.so
11:47:14 T:3065562992 M:2523615232 DEBUG: Unloading: _socket.so
11:47:14 T:3065562992 M:2523615232 DEBUG: Unloading: _ssl.so
11:47:14 T:3065562992 M:2523615232 DEBUG: Unloading: datetime.so
11:47:14 T:3065562992 M:2523615232 DEBUG: Unloading: python24-i486-linux.so
11:47:14 T:2729786256 M:2523717632 DEBUG: Curl::Debug Connected to images.apple.com (65.32.34.56) port 80 (#0)
11:47:14 T:2729786256 M:2523717632 DEBUG: Curl::Debug GET /moviesxml/s/independent/posters/americanviolet_xl200902251441.jpg HTTP/1.1
11:47:14 T:2729786256 M:2523717632 DEBUG: Curl::Debug User-Agent: XBMC/pre-9.04 r18324 (Linux; Ubuntu jaunty (development branch); Linux 2.6.28-8-generic; http://www.xbmc.org)
11:47:14 T:2729786256 M:2523717632 DEBUG: Curl::Debug Host: images.apple.com
11:47:14 T:2729786256 M:2523717632 DEBUG: Curl::Debug Accept: */*
11:47:14 T:2729786256 M:2523717632 DEBUG: Curl::Debug Connection: keep-alive
11:47:14 T:2729786256 M:2523717632 DEBUG: Curl::Debug Expire cleared

So my question is, where do I go from here? How can I troubleshoot this issue?

Maxim
2009-03-11, 19:48
Wow. I'm going to bump this for the sake of curiosity. You did a great job on the description too. I understand your situation and can't think of anything that you can do. You even went as far as trying to get a backtrace for the devs, kudos to you. The devs are going to have to step up to bat for this one. I can only assume that it's something with CPU scheduling, where something happens and everything in the OS gets stuck because XBMC is holding on to the CPU cycles (just a guess, I'm no programmer). I've seen this type of stuff happen on different OSes with different applications over the years.

The only thing I would ask of you is to post the full debug log, and not just a tail. Pastebin is a common location to post up these types of things.

Also, you're not the only jaunty user due to Intel drivers. I'm using alpha3 and not experiencing anything like this.

althekiller
2009-03-11, 20:21
You can kill xbmc with "killall -SEGV xbmc.bin" to generate a core file then examine it for deadlock (obviously the problem here) with gdb. You may need to launch XBMC from the console for the core to be generated properly.

CapnBry
2009-03-11, 21:08
You will also probably have to turn on cores before launching xbmc from that terminal:

ulimit -c unlimited
xbmc

Run it till it locks, then do the killall above. You should see "Segmentation fault (core dumped)" in the terminal window and there should be a core.XXXX file in that directory.
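The steps above can also be scripted. A rough Python sketch, assuming gdb is installed; the helper names are made up, and the binary path in the usage comment is a placeholder, not a path confirmed in this thread.

```python
import resource
import subprocess


def enable_core_dumps():
    """Raise the soft core-size limit to the hard limit.

    Equivalent to `ulimit -c unlimited` when the hard limit is unlimited.
    Child processes started afterwards inherit the new limit.
    """
    _soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))
    return resource.getrlimit(resource.RLIMIT_CORE)


def launch(cmd):
    """Start a program with core dumps enabled and wait for it to exit."""
    enable_core_dumps()
    return subprocess.call(cmd)


def gdb_backtrace_cmd(binary, core):
    """Build a gdb command that prints a backtrace of every thread in a core file."""
    return ["gdb", "-batch", "-ex", "thread apply all bt", binary, core]


# Usage sketch (not executed here):
#   launch(["xbmc"])                      # run until it locks up
#   ...then from ssh: killall -SEGV xbmc.bin
#   subprocess.call(gdb_backtrace_cmd("/path/to/xbmc.bin", "core.XXXX"))
```

Looking at all threads matters for a suspected deadlock, since the interesting part is usually two threads each waiting on a lock the other holds.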

xbmcJJ
2009-03-12, 01:32
Thanks everybody for your suggestions. I may tackle this again in the future, but for now, I give up.

I launched XBMC like this:

ulimit -c unlimited
xbmc

Browsed plugins until it locked up, then ssh'd in and tried:

killall -SEGV xbmc.bin

Machine stayed locked, X unusable, and "ps a" from an ssh session hangs indefinitely.

Here's the last log:

http://pastebin.com/m626daa2b

The good news is that I had a spare partition on the disk, and installing Intrepid + Intel xserver from the xorg-edgers PPA + 2.6.28 kernel from Jaunty + libpulse-dev + compiling XBMC from SVN yields a working, stable configuration.

GL performance is crappy compared to the Jaunty install, but good enough for XBMC (I won't be playing games on it) and for watching movies and internet streams.

Redth
2009-10-09, 16:04
I'm having this exact issue now too...

For me, it seems like the xbmc.log always shows the last thing that happened was a call to FileCurl to download some type of file. It happens mostly when scanning content (which makes sense since there's a lot of FileCurl calls in that process), and it does not hang at the same point each time. For example the system will hang while scanning a tv show, when I restart and launch xbmc, and start scanning again, it picks up where it left off, scanning the tv show that made it hang, and carrying on, then hanging a bit later again...
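To check that pattern across several hangs, the log can be reduced to the last line each thread wrote before the lockup. A small Python sketch, assuming the line layout seen in the log excerpt earlier in this thread (time, T:<thread id>, M:<number>, level, message); the function name is made up, and as the devs point out elsewhere in the thread, the log alone is no substitute for a stack trace.

```python
import re

# Matches lines such as:
# 11:47:14 T:2705320848 M:2523762688 DEBUG: Curl::Debug Expire cleared
LINE = re.compile(r"^(\d\d:\d\d:\d\d) T:(\d+)\s+M:\s*(\d+)\s+(\w+): (.*)$")


def last_line_per_thread(lines):
    """Map thread id -> (time, level, message) of that thread's final log entry."""
    last = {}
    for raw in lines:
        m = LINE.match(raw.rstrip("\n"))
        if m:
            when, tid, _mem, level, msg = m.groups()
            last[int(tid)] = (when, level, msg)  # later lines overwrite earlier ones
    return last


# Usage sketch (log path as shown in the first post):
#   with open("/var/tmp/xbmc-xbmc.log") as f:
#       for tid, entry in last_line_per_thread(f).items():
#           print(tid, *entry)
```

If the same kind of final entry shows up for some thread in every hang, that narrows down where to look once a backtrace is available.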

I have an ASUS P5E-VM, and am running Jaunty with all the latest updates. I'm also running Revision 23535 (I configured with --enable-external-libraries).

The next thing I'm going to try tonight is installing some newer Intel drivers from this post: http://ubuntuforums.org/showthread.php?t=1130582

If xbmcJJ was able to get intrepid working with the xorg stuff from xorg-edgers ppa, maybe this will be of some help to me.

Any other ideas? I wouldn't think this is a graphics issue, but who knows. I've used xbmc to play videos back fine for hours without issues, it seems as if it only happens when I'm scanning for content, or I suppose, whenever xbmc is fetching a file from the net via FileCurl.

Help is much appreciated!

Here is my xbmc.log: http://pastebin.com/f42462c2a

jverdeyen
2009-10-09, 17:10
Same hardware, same problem.

I solved it by going back to Hardy; now everything runs smoothly!

Redth
2009-10-09, 17:16
So you downgraded all the way to Hardy?

That seems to be the trend. If this is a bit more widespread than anyone thought, hopefully we can get it working right in Jaunty. Obviously something changed between Intrepid and Jaunty that's causing this. Downgrading to older software doesn't seem like a great solution :s

jverdeyen
2009-10-09, 17:19
I know, but I had everything working (remote, LCD, ...); the only thing was the freezing-on-scan issue, so I went back to Hardy! No problems, no issues (except some vsync problems while playing videos, which I should investigate later on).

I backed up the .xbmc dir and did a clean install. Fixed in less than one hour.

CrashX
2009-10-09, 18:22
I have the same issue. Sometimes it hangs on a database scan and basically locks up my machine. I can't get back to the Ubuntu desktop; Ctrl+Alt+F1 doesn't even work after it hangs.

I upgraded from Hardy to Jaunty instead of doing a clean install. Did you guys do the same?

I never had this problem on Hardy ..

Redth
2009-10-09, 18:25
I did a clean install of Jaunty, so I don't think that upgrading is the issue...

When I get home I'm going to try it again.. I don't want to lock up my machine from here since I can't physically reboot it, and I need it for ssh :)

But I did upgrade to the X-Updates PPA for Jaunty. I'm hoping that might help fix things... Doesn't seem like it should be a Graphics driver issue though...

jverdeyen
2009-10-09, 18:26
I did a clean install with Hardy.

I migrated from MediaPortal (Windows) to XBMC (Linux) last week. It runs 100x faster and smoother and it's just better, but the freezing was the one thing that made me consider going back to MediaPortal.

I didn't want to give up so I tried a clean Hardy install, which resulted in a success.

It seems like other people are also having these freezes. Just try going back to Hardy and wait until this problem is fixed?

Redth
2009-10-09, 18:29
It doesn't seem like anyone's working on the problem, so I hesitate to go back to Hardy or Intrepid and avoid the problem.

I was a Mediaportal user too, but mainly for the dvb aspect, which I'm not using anymore... I've always liked XBMC better, and it's leagues ahead these days... Plus, Windows requires a lot more overhead than linux, and is less configurable... I also have an old XBOX running xbmc in my bedroom that streams from my main HTPC (the system having the issues with freezing).

I'm willing to try and help figure this out, but I'm not sure what else to do!

jverdeyen
2009-10-09, 18:30
I've been looking into all kinds of possible causes:
- network
- samba
- nfs
- display driver
- ..

there are so many types of freezes in ubuntu reported on the internet... dead end :)

CrashX
2009-10-09, 18:31
The only reason I upgraded to Jaunty was this issue: http://xbmc.org/forum/showthread.php?t=56693

Has this been resolved in Hardy ?

Anyone able to get a crash dump ?

althekiller
2009-10-09, 18:51
The log is absolutely useless in these cases. We need a stacktrace for there to be any hope in fixing this.

Redth
2009-10-09, 18:58
OK, the "Getting a Stacktrace" section of the wiki is not very informative... sorry, I'm a newb to this...

How exactly do I go about getting a stack trace? I'm now using the SVN PPA repo and have installed xbmc-common-dbg... I have version 9.04.3+svn23539-jaunty1 installed right now..

I'd be more than happy to submit a stack trace, but I'm not sure where I would find it...

CrashX
2009-10-09, 19:01
Currently xbmc is locking up the system. Killing it from the shell doesn't work.

Any way to get the stack trace in this state?
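If the process is stuck inside the kernel in uninterruptible sleep, signals stay pending until it wakes up, which would explain why killall has no effect. In that state a user-space backtrace may be out of reach, but the kernel can be asked to log the stacks of all blocked tasks through SysRq "w". A hedged Python sketch, assuming root access over ssh and a kernel built with SysRq support; the helper names are made up, and nobody in this thread reports having tried this.

```python
import subprocess


def request_blocked_task_dump(trigger="/proc/sysrq-trigger"):
    """Write 'w' to the SysRq trigger so the kernel logs blocked (state D) tasks."""
    with open(trigger, "w") as f:
        f.write("w")


def kernel_log_tail(lines=200):
    """Return the last lines of the kernel ring buffer as printed by dmesg."""
    out = subprocess.run(["dmesg"], capture_output=True, text=True).stdout
    return out.splitlines()[-lines:]


# Usage sketch, as root on the locked machine (not executed here):
#   request_blocked_task_dump()
#   print("\n".join(kernel_log_tail()))
```

The resulting kernel stack for the stuck xbmc.bin thread is the kind of output that would be worth attaching to a bug report alongside the debug log.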

CrashX
2009-10-09, 19:06
See CapnBry's and althekiller's posts above: turn on core dumps with "ulimit -c unlimited" before launching xbmc from a terminal, run it until it locks, then run "killall -SEGV xbmc.bin" to generate a core file and examine it with gdb.

Redth
2009-10-09, 19:10
Well, as you said, the system gets locked up... doing the killall -SEGV from a ssh session does NOT work....

So, not sure how we're going to diagnose this..

CrashX
2009-10-09, 19:11
Can you try it as well since mine is an upgrade ... I am going to try again when I get home as well with latest svn release ..

Maxim
2009-10-09, 19:31
Try pkill -SIGSEGV xbmc.bin instead.

Redth
2009-10-09, 19:43
Will give that a try tonight...

CrashX
2009-10-10, 01:56
Tried both commands and it still doesn't kill it. It hangs while executing the command.

Redth
2009-10-10, 21:14
Ok, so I have things working stable now on Jaunty.

Here's what I did:
1) Installed the XBMC SVN PPA for Jaunty so that I have a pretty recent SVN build, but without the hassle of checking out the source and compiling it myself :)

2) Followed the steps for the 'Optimal' configuration in this forum post: http://ubuntuforums.org/showthread.php?t=1130582. This basically involved getting kernel 2.6.30 as well as getting the "latest stable Xorg drivers (via the X-Updates PPA), enable UXA acceleration and create a workaround for the MTRR bug."

I now have an up to date Jaunty configuration that is working fine, haven't had it hang yet, and it's successfully scanned in all my TV Shows, Movies, and Music just fine!

Hopefully this helps someone else, and maybe gets you guys to upgrade from Hardy or Intrepid to the otherwise awesome Jaunty :) Thanks for all the suggestions. I have no idea what 'changed' to fix it (either it was the kernel or the xorg drivers or both), but it's working.

CrashX
2009-10-10, 23:06
That solved my problem as well. Thanks for investigating it. I have an Intel video card (Atom 330).

jverdeyen
2009-10-12, 15:21
any reason why I should go back from Hardy to Jaunty?

Redth
2009-10-12, 19:13
Keeping up with the times I guess... I like to run the latest stable always, eventually Hardy will no longer be supported...

Redth
2009-10-12, 23:07
I wanted to add that I have been doing more testing and it turns out Kernel 2.6.30 is NOT required to get things stable... Simply installing the newer xorg driver from the x-edgers ppa should do the trick.

This will be helpful since if you install 2.6.30, you'd need to mess around with the lirc kernel modules which is a bit of a pain, otherwise you'd have broken remote support...

Found this out the hard way: before I went and mucked around with lirc, I decided to try booting back to 2.6.28-15, and voila, it works fine...

So, the xorg driver does the trick!

CrashX
2009-10-12, 23:15
I wonder if the next version of Ubuntu fixes this issue ? It is scheduled for October 29, 2009.

dc2447
2009-10-20, 20:50
I'm on Karmic, on kernel 2.6.31 with xserver-xorg-video-intel 2:2.9, and I am getting lockups all the time.

My machine is still accessible over ssh and killing xbmc gets me back working

Redth
2009-10-20, 20:58
I've seen a few lockups here and there... I updated xbmc from the svn ppa as well as the xserver-xorg-video, but i'm still on Jaunty, and things for the most part are working well...

dc2447
2009-10-20, 21:01
I ended up on Karmic because no matter what PPAs I used, Intel graphics were much, much worse than on Hardy.

Jaunty was a mistake