TV scrapers not returning top result


snappz
2007-10-12, 09:49
I've been playing with the Linux port for a few weeks now and have enjoyed following the progress.
I finally nailed my video driver setup today and sat down to scan the network shares I have set up.
If I try to scan a TV series, e.g. Chuck, with tv.com, I get 9 results beginning with Chuck Finn. I don't get a plain Chuck, however.

The same search via the Xbox version of XBMC (T3CH 12 Oct) gives 10 results starting with Chuck.

This is happening with every TV show I am attempting. The movie searches appear OK. I quickly tried with another scraper and I think it was doing the same, but I had to leave to go to work.

I realize this isn't really important, but thought there may be an easy solution or workaround.

Cheers on some really fine work thus far with the port. It is sensational!

snappz

snappz
2007-10-12, 11:44
Oops, I meant the 7 Oct T3CH release.

Also using rev 10513 of the Linux port.

skunkm0nkee
2007-10-12, 11:59
Try using the tvdb scraper instead?

snappz
2007-10-12, 12:02
I did quickly try before I had to leave for work and I THINK it was doing the same thing. I can't be sure because I only searched one show and it didn't work.

If no one else can see this I will try a fresh build.

cheers

prichards
2007-10-12, 18:26
I can confirm this.

It happened to me yesterday. I made a clean build today (rev 10514) and it is still happening.

I tried a lot of shows with the tv.com and tvdb scrapers, and it seems the first show is not being returned.

I was trying to get "Stargate Atlantis" and it returned "Stargate SG-1" and "Stargate: Infinity". However, if I manually search for "Stargate", it returns "Stargate Atlantis" and "Stargate Infinity". The same search on the web page returns "Stargate SG-1", "Stargate Atlantis" and "Stargate Infinity", in that exact order.

snappz
2007-10-12, 20:43
Yep, just built rev 10515 and it is still happening.
It seems that all the TV scrapers are doing a similar thing (tv.com, tvrage, tvdb).
I just tried a couple of movies with the IMDB scraper and got some weird results. For example,
when querying for the movie "In Her Shoes", the only result returned is "The King of Queens: Mild Bunch (#9.9)" (2007).

Freddo
2007-10-13, 16:25
I've had this too, the most glaring example being "Lost": instead of choosing Lost, it chooses lost moments of NFL history or something.

Maybe this thread should be in the bugs forum?

Rand Al Thor
2007-10-15, 09:46
I am experiencing the same issues with the Linux build rev 10530.

Freddo
2007-10-15, 20:28
This thread should really be in bug discussion since it affects Xbox AND Linux builds.

sho
2007-10-15, 20:44
Someone experiencing the problem on an Xbox, please open it there.
Spiff (who else would fix it?) is unlikely to notice this here.

Rand Al Thor
2007-10-17, 18:36
Made a post in the bug discussion forums. Doesn't seem to be getting any replies.

C-Quel
2007-10-18, 00:25
Must admit I've not experienced it on Xbox, but on Linux, yes, I have...

Also try thetvdb, not tv.com :)

Rand Al Thor
2007-10-22, 05:48
Hmm, I'm still experiencing this issue with build 10581. Both the TV and movie scrapers are refusing to display the top result. I am also no longer getting the high-quality posters from the IMDB scraper. I am aware that not all movies have high-quality posters, but right now none of them are showing. Any thoughts?

Rand Al Thor
2007-10-25, 01:07
I tried the new IMDB scraper from SVN; it now shows impa thumbs again. However, it still has issues with some movies. The TV scrapers (tv.com and thetvdb) are both still skipping the top result. I might try reverting builds until I can discover where the issue crept in. Has anyone found a cure for this issue?

jmarshall
2007-10-25, 01:13
I believe d4rk fixed the top result issue in SVN in the last 12 hours.

Rand Al Thor
2007-10-25, 01:21
Yeah, just saw that. Thanks to d4rk for the fix. Will give it a go when I get home. Cheers to all XBMC contributors :)

snappz
2007-10-25, 03:53
Looks like it's working for me. Thank you d4rk and all concerned...

bripeace
2007-11-29, 23:54
This seems similar to a problem I'm having right now, where the first EPISODE result returned from the TVDB scraper is always ignored.

I.e., if you scan in season 1 of, say, Venture Brothers, it ignores the special A Very Venture Christmas, but if you scan in season 1 of, say, Firefly (which has no specials), it will ignore episode 1.

Scanning in something like season 2 of anything will be fine, since it only ignores whatever the first entry is, and tvdb's GetEpisodes returns all episodes of a series at once.

TV.com and the other scrapers scan no episodes in at all, so maybe it's a similar problem that shows up differently due to interface differences.

This is of course really annoying, as season 1 of every show without specials gets off by an episode, and shows with specials do not scan in the first entry.

The result of all this is that you get info for episode 2 when you are looking at episode 1 (it says Season 1 Episode 1 but shows info from Season 1 Episode 2; you then play the episode and you're watching S1E1). Also, the last episode of the affected seasons cannot be scanned in.

I'm trying to figure out how to fix this. It seems like some sort of loop/recursion error, but I don't quite understand how the scrape system works; for someone who does, this seems like it would be an easy fix.

Jezz_X
2007-11-30, 00:15
bripeace, I had the same issue with Heroes. No matter what I did, it always got it wrong. I then deleted my database file for videos, tried again, and it was perfect. I think stuff deleted from the database doesn't actually get fully deleted, and it still uses the same info from when the original bug was there.

bripeace
2007-11-30, 00:27
Thanks for the tip, Jezz. I destroyed the old database and went at it fresh. No dice: the 2 series I tested that were having the problem still are.

BTW, I'm using rev 10871, the most recent one as best I can tell.

bripeace
2007-11-30, 00:58
Okay, found the problem. It was in IMDB.cpp, much like the first-result-on-titles issue, which was a little more apparent.

Same sort of fix (from while -> do/while ) this time in.

Posted the patch to the patch tracker:
http://sourceforge.net/tracker/index.php?func=detail&aid=1841302&group_id=87054&atid=581840
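
To illustrate the kind of loop bug being described, here is a minimal sketch. It is not the actual IMDB.cpp code: `Element` is a hypothetical stand-in for a TinyXML-style node with a `NextSiblingElement()` method, and the three function names are made up for this example. The first function assumes the bug looked like "advance in the loop condition", which would explain why the first sibling is never visited; the other two visit every sibling.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a TinyXML-style element: a title plus a link
// to the next sibling. Not the real XBMC or TinyXML type.
struct Element {
    std::string title;
    Element* next;
    Element* NextSiblingElement() const { return next; }
};

// Assumed shape of the bug: the loop condition advances to the next sibling
// before the body runs, so the first element is never collected.
std::vector<std::string> CollectSkippingFirst(Element* movie) {
    std::vector<std::string> titles;
    while (movie && (movie = movie->NextSiblingElement()))
        titles.push_back(movie->title);
    return titles;
}

// do/while form: the body runs for the first element before advancing.
// It dereferences the first element unconditionally, so the caller must
// not pass a null pointer (documented here with an assert).
std::vector<std::string> CollectDoWhile(Element* movie) {
    assert(movie != nullptr);
    std::vector<std::string> titles;
    do {
        titles.push_back(movie->title);
    } while ((movie = movie->NextSiblingElement()));
    return titles;
}

// while (movie) form: test first, process, then advance at the end of the
// body. Visits every element and copes with an empty list.
std::vector<std::string> CollectWhile(Element* movie) {
    std::vector<std::string> titles;
    while (movie) {
        titles.push_back(movie->title);
        movie = movie->NextSiblingElement();
    }
    return titles;
}
```

With a three-element list, the first function returns only the last two titles, which matches the symptom reported in this thread. The `while (movie)` form also handles an empty list without a special case.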

Rand Al Thor
2007-11-30, 03:52
Nice, I'll give it a shot and let you know how it goes. Cheers.

jmarshall
2007-11-30, 06:18
The above doesn't affect trunk which uses for(movie; movie; movie = movie->NextSiblingElement()) type logic.

A (possibly better) flow would be:

while (movie)
{
  ....
  movie = movie->NextSiblingElement();
}

Cheers,
Jonathan

bripeace
2007-11-30, 07:52
I just went for a quick fix that was similar to the fix accepted for the same problem described earlier in the thread. Just trying to be helpful.

However it's fixed in the end is cool by me.

jmarshall
2007-11-30, 08:00
Sure - and we are very grateful - it saves us trying to track it down :)

I'm simply offering my opinion on the code for whoever commits it to SVN so that we can make it as clear as possible (and hopefully so we can have the same code in trunk and the branch).

Cheers,
Jonathan

bripeace
2007-11-30, 08:16
Awesome, I hope my next contribution is as helpful. Thanks.

bripeace
2007-11-30, 21:44
I took a much deeper look at the code today and uncovered a few more things.

I see what you're saying about the code in trunk and the Linux branch being different. They definitely should be using the same code.

I also found another bug. CIMDBUrl::Parse was skipping the first <url> element in episode guides due to the same faulty loop logic. This causes scrapers which use the <url> format to not scrape the first URL. For instance, the TV.com scraper will not scan season 1 of shows.

This brings up another readability issue: there is code to handle two different cases for episode guides returned by scrape interfaces, multiple URLs and a single URL. TV.com returns Bleach like so:
<episodeguide><url>Season1url</url><url>Season2url</url>etc..</episodeguide>

TVDB returns Bleach like so:
<episodeguide>allseasonurl</episodeguide>

This means CIMDBUrl::Parse has code for the two different cases. Requiring scrapers to use <episodeguide><url></url></episodeguide> regardless of how many URLs they return would simplify the code and make more pragmatic sense.
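
As a rough sketch of what handling both shapes involves, the function below accepts either form and always produces a list of URLs. It uses plain string searching rather than a real XML parser, and `NextElementText` and `EpisodeGuideUrls` are hypothetical names for this example, not functions from the XBMC source.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Finds the next <tag>...</tag> pair in `s` at or after `pos`. On success,
// stores the enclosed text in `text`, moves `pos` past the closing tag and
// returns true. Illustration only: no attributes, nesting or escaping.
static bool NextElementText(const std::string& s, const std::string& tag,
                            std::size_t& pos, std::string& text) {
    const std::string open = "<" + tag + ">";
    const std::string close = "</" + tag + ">";
    const std::size_t start = s.find(open, pos);
    if (start == std::string::npos) return false;
    const std::size_t begin = start + open.size();
    const std::size_t end = s.find(close, begin);
    if (end == std::string::npos) return false;
    text = s.substr(begin, end - begin);
    pos = end + close.size();
    return true;
}

// Returns every episode guide URL, whichever of the two shapes was used.
std::vector<std::string> EpisodeGuideUrls(const std::string& xml) {
    std::vector<std::string> urls;
    std::size_t pos = 0;
    std::string guide;
    if (!NextElementText(xml, "episodeguide", pos, guide)) return urls;

    // Case 1: one or more <url> children (the shape described for TV.com).
    std::size_t urlPos = 0;
    std::string url;
    while (NextElementText(guide, "url", urlPos, url))
        urls.push_back(url);

    // Case 2: a bare URL directly inside <episodeguide> (the shape
    // described for TVDB).
    if (urls.empty() && !guide.empty())
        urls.push_back(guide);
    return urls;
}
```

If every scraper emitted the `<url>` form, the second case could be dropped, which is the simplification proposed above.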

In any case, I can submit a patch that cleans up both trunk and the Linux branch to the clearer while (x) { x = nextx } format, bringing them to use the same code and fixing the remaining Linux TV scrape bugs. Also, if there's agreement on standardizing <episodeguide> and the use of <url>, I can submit a patch which fixes that and all the scrapers.

Thoughts?

C-Quel
2007-11-30, 21:54
Post patches anyway... once tested and checked, if the devs are content with the patch then it will go into trunk. It benefits the whole community, plus credit where credit's due ;)

Thank you for your time and effort.

jmarshall
2007-11-30, 23:09
Indeed - thanks very much.

I'll discuss with C-Quel (our leading scraper man) and we'll get it into trunk and linuxport sometime this weekend.

Cheers,
Jonathan

Rand Al Thor
2007-12-06, 01:11
Any updates on getting this patch worked into the linuxport? I just put Tin Man on my box, and episode 1 - Into the Storm comes back as episode 1 - Search for the Emerald. It's just a little frustrating that season 1 of each show seems to be off by one episode. Cheers.

Rand Al Thor
2007-12-11, 03:48
Just compiled a new build, and it looks like it is working flawlessly. Thanks to all who contributed to this great project.

jmarshall
2007-12-11, 04:08
There's actually still a bug in the linuxport version at the moment that you'll hit if you have years in the name of the file (or the returned results). It's fixed in trunk, and I'll do a merge shortly if vulkanr doesn't beat me to it.

Cheers,
Jonathan

bripeace
2007-12-11, 08:43
There's one problem I just tracked down with my patch.

Because I cleaned up how <episodeguide> works with tvdb, tvdb-fr, and movie-xml, database entries for shows that use those scrapers need their episodeguide field updated. I think the only real way to do that is to refresh the series (say no to "Refresh info for all episodes?"). That will update the field to include the proper formatting.

So really, if you're having trouble with it finding newly added episodes, just do a quick refresh. Sorry for the small problem.

bripeace
2007-12-11, 09:04
Or, more simply, if you know how to edit the database, you could run this SQL:

update tvshow set c10 = '<episodeguide><url>' || substr(c10,15,length(c10)-29) || '</url></episodeguide>' where substr(c10,15,1) <> '<'
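
For anyone who would rather see what that statement does to a single value, here is a sketch of the same transformation in C++. `<episodeguide>` is 14 characters and `</episodeguide>` is 15, which is where the 15 (the first character after the opening tag, counting from 1) and the 29 come from. `WrapEpisodeGuideUrl` is a hypothetical name, and the length guard is an addition that the SQL does not have.

```cpp
#include <string>

// Mirrors the SQL update for one c10 value. Like the SQL, it assumes the
// value starts with <episodeguide> and ends with </episodeguide>.
std::string WrapEpisodeGuideUrl(const std::string& c10) {
    static const std::string kOpen = "<episodeguide>";    // 14 characters
    static const std::string kClose = "</episodeguide>";  // 15 characters

    // Guard added for this sketch only: values shorter than both tags
    // together (29 characters) are returned unchanged.
    if (c10.size() < kOpen.size() + kClose.size()) return c10;

    // SQL: where substr(c10,15,1) <> '<'
    // SQL substr counts from 1, so position 15 is index 14 here. A '<' at
    // that position means the content is already wrapped in a tag (or empty).
    if (c10[kOpen.size()] == '<') return c10;

    // SQL: substr(c10,15,length(c10)-29), the text between the two tags.
    const std::string inner =
        c10.substr(kOpen.size(), c10.size() - kOpen.size() - kClose.size());
    return kOpen + "<url>" + inner + "</url>" + kClose;
}
```

A value already in the `<url>` form passes through unchanged, so the transformation can safely be applied more than once.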

majorheadache
2008-01-26, 23:07
Hi,
Just got my first Linux XBMC setup working last night. Woot! I'll post some specs and stats in a bit. In Video, I added my usual source, an SMB shared folder with my movies. I set the content to Movies and chose IMDB. The scan went super quick and, instead of the usual DVD cover type images, it just gave me screenshots. I tried the movie poster DB too, but again, a quick scan with no results.

Any suggestions?

PS. Big ups to those who help in the IRC channels!