Something about scraping
Hi!
Can anybody explain the scraping behaviour?
I've started using Media Companion and/or Movie Info Plus to scrape some content. nfo files and tbns have been created. When I tell XBMC to refresh the library, it reads these files (in my case only after I removed the movie/TV show and rescanned for new content, but that's just a side note). So far so good.
But other files that haven't been scraped by MIP or MC yet don't get scraped by the XBMC internal scraper. Is that correct?
I wish the internal scraper would kick in when there are no nfo files. Bug or feature? :confused2:
I'm using Jester's 17560 (and, before that, other versions >17500).
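To see which files the internal scraper would have to handle on its own, it can help to list the videos that have no nfo file yet. The following is only a minimal sketch: the function name, the extension list and the "same name, .nfo suffix" rule are assumptions made for illustration, and it does not check folder-level files such as movie.nfo or tvshow.nfo.

```python
from pathlib import Path

# Assumed set of video extensions; adjust it to match your own library.
VIDEO_EXTENSIONS = {".avi", ".mkv", ".mp4", ".iso", ".mpg"}


def videos_without_nfo(root):
    """Return video files under root that have no same-named .nfo beside them."""
    missing = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix.lower() in VIDEO_EXTENSIONS:
            # Hypothetical rule: "Lost.s01.e01.iso" pairs with "Lost.s01.e01.nfo".
            if not path.with_suffix(".nfo").exists():
                missing.append(path)
    return missing
```

Calling videos_without_nfo() on a source folder gives the list of files that MIP or MC has not covered, which makes it easier to tell whether XBMC is skipping exactly those files.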
imdb scraper has been disabled so you'll have to select a different one.
Ultra lame. For every step forward XBMC makes, it takes three steps back.
Then don't use XBMC. Simple solution right?
xanadu1979
2009-02-04, 20:31
IMDB doesn't give permission for applications to scrape their site. Frankly, every application that does it should stop unless IMDB explicitly says that it's ok.
XBMC is totally doing the right thing with this move.
I know that IMDB scraping doesn't work anymore. That's not my point. I'm having this trouble(?) with every scraper, including thetvdb.com (sorry, I didn't mention that before). So is it just me? Do I have a bug, or is this behaviour normal?
WhatMonkey
2009-02-05, 01:14
I don't think it is just you. For the last couple of days I have not been able to get new TV shows into my library either. I was still trying to figure out whether it was something I screwed up or not. I did revert to an older build (17422 I think?) that I knew worked previously, but even then it still did not import. A clean install with an empty library imported only the show names, no episodes. I was thinking it was just something on my side, but if you are having the same problem, maybe not.
Personally I took all the guess work out and went to Movie Info Plus. I figured I could scrape then upload so everything is in place and I know it works. Movie Info Plus scrapes from the regular places anyways.
I know it's one more step and one more app, but it works for me at least.
Something is definitely wrong with scraping. I installed Jester's latest build last night and set it to import my movies at about midnight. I got back tonight after going out, and it still said it was importing. I checked the progress, and no movies had been imported. It gives the name of each movie and the title of each series, but no episodes (for TV series), and no cover art or details for movies. All there is is movie titles and series titles.
And this is after using Media Companion.
Now I tried a new version (Jester 17586). Sometimes some tv shows are scraped correctly, some others aren't.
I tried scraping "Lost" by choosing "Scan for new content" - it worked. Tried scraping new Episodes of "Fringe" or "Nip Tuck", it didn't work.
Additionally, there was a time when that was done automatically on startup. That doesn't seem to work either, and it also doesn't save the setting "Run automated scan". Every time I choose "Set content", the checkbox "Run automated scan" isn't checked.
Here's my debug.log:
http://pastebin.com/m27eb4bd
Does anybody have an idea?
Thanks in advance!!
I've just scraped several hundred shows using the latest Jester build and it worked perfectly fine. This was a clean install with no existing userdata.
As for IMDB, whatever it takes, it should be brought back. The XBMC library is crippled without it; no other source comes close. If it takes a funding drive to buy 100 pro accounts, then let's do it :)
mitul103
2009-02-06, 01:10
No it shouldn't be brought back. IMDB doesn't want people scraping their website and that should be respected. That said, I don't think there is anything stopping you from grabbing the scrapers from an older build and using them.
The situation with IMDB is way more complicated than a simple "yes, they allow it" or "no, they don't". I'm sure they would be most upset if every XBMC user started downloading a complete copy of their database, yet that is allowed. Or how about using shell scripts? That's been allowed too, for 18 years.
We're going OT here. TV scraping using tvdb as a source works as expected. I retested again a few hours ago.
IMDB does allow people or projects to scrape their site WITH PERMISSION.
As long as the project has permission, which is being worked on AFAIK, there should be no reason not to bring it back.
Hey guys, let's try pulling the thread back to its origin. :nod:
Perhaps there's some problem in the combination of nfo-files and internal scraped information?
But nonetheless, scraping doesn't work that reliably on my system. Another idea: some kind of timeout on tvdb? Browsing the website is really slow, too!?
WhatMonkey
2009-02-06, 22:12
I did a fresh install of REV17592 with an empty library. I pointed to my TV Shows folder and ran a scan. It imported the series name and information but no episodes. My shows are named as Lost.s01.e01.iso.
Here is the debug log http://pastebin.com/ma1cec60
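Whether episodes are picked up can depend on whether the file name matches the season/episode pattern the scanner expects. As a rough way to test names like Lost.s01.e01.iso offline, here is a small sketch. The regular expression is an illustrative assumption, not XBMC's actual matching expression, and parse_episode is a hypothetical helper name.

```python
import re

# Illustrative pattern only; this is NOT XBMC's actual matching expression.
# It accepts an optional separator (space, dot, underscore, dash) between
# the season part and the episode part, e.g. "s01e01" or "s01.e01".
EPISODE_PATTERN = re.compile(r"[Ss](\d+)[ ._-]*[Ee](\d+)")


def parse_episode(filename):
    """Return (season, episode) as ints, or None if the name does not match."""
    match = EPISODE_PATTERN.search(filename)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


print(parse_episode("Lost.s01.e01.iso"))  # (1, 1)
print(parse_episode("Lost.iso"))          # None
```

A stricter pattern that allows no separator between the season and episode parts would miss s01.e01 while still matching s01e01, so renaming one test file is a cheap way to rule naming in or out.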
natethomas
2009-02-06, 22:23
For those of you using nfo files, did you make sure to 'set content' on your folders to their appropriate content types and then refresh? I actually haven't been having this problem with the very most recent install, but it was a problem a few revisions ago for me.
WhatMonkey
2009-02-07, 00:05
Well, I am happily confused, because it seems to be mostly working again for no apparent reason. Most of the shows that I added over the last few days, when the scraper was not working for me, are now in the library. A few did not get in from "update library", but I was able to get those by using "scan for new content". I don't believe I made any changes; I just tried it again and it worked. I would think that if it was something with thetvdb.com, a lot more people would be having problems? I will just be happy it worked.
I did a brand new install, even with a new profile and library.
Now the scraping seems to work, but saving the setting "run automated scans" still doesn't. So it's not enough to update the library; I also have to select "scan for new content".
I would be glad if someone else could confirm that behaviour/bug.
I have this trouble after updating to the newest Jester build too. Movies/music works great.
TV Shows:
Update Library = Scans folder, recognises new shows but doesn't scan episodes
Scan for new content = Success
So to get a new show with eps to add you've got to do both. It's no biggie though - it's still less work than loading an external program to scan. Anyone work this out?
I've had all the above problems at various points too. Like several people here, I'm in the habit of downloading a new SVN build every couple of days. I'm wondering if changes to the workings of the userdata folder are responsible for the database files not being written to correctly? The auto scan progress bar will flicker for me as if it's finding new episodes, but then nothing is added.
#11 has got me thinking I need a clean install and to stop worrying that I'm missing kewl new features and stick with a build that works ;)