AdultDVDEmpire Scraper


Bleckshire
2008-12-04, 00:59
After helping artik with his Excalibur Films scraper, I learned a lot more about regexp coding and scrapers in general so I was able to finish my AdultDVDEmpire scraper. This scraper will retrieve the following info:

- Film Title along with box cover.
- Production year and film studio.
- Rating (which should always be XXX, but I set it to pull anyway).
- Film Director
- Film Genres/Categories (All categories are pulled if a film fits into multiple ones).
- Film Actors/Actress along with thumbnails for each star if available.
- Film runtime.
- Film plot/tagline. *

* About the plot and tagline: some films have just a plot, some have just a tagline, and some have both. Since the tagline comes first in the page code, the scraper first tries to pull a tagline on its own, then a plot on its own, then a plot that has a tagline before it. This should work most of the time, but there are a few films it fails on. It will also fail to pull the complete plot if the plot itself contains a '<' bracket. For instance: "This here is a plot about an <b>AWESOME</b> movie!" There are a few plots like that, and in that case it will pull everything up to 'an'. I tried to figure out a way around this but couldn't, and finally settled on having it pull as much as it can in a case like that. If it fails otherwise, it's most likely because that particular movie's page has weird coding (which I've also run into during testing).
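
The truncation described above can be reproduced outside XBMC. This is a minimal Python sketch, assuming a made-up HTML fragment (the real ADE markup may differ) and using Python's `re` module as a stand-in for the scraper's regex engine. The second pattern is only one possible workaround, not what the scraper does, and whether XBMC's engine accepts the lazy `.*?` form is not verified here.

```python
import re

# Hypothetical page fragment; the real ADE markup may differ.
html = ('<div class="Item_InfoContainer">'
        'This here is a plot about an <b>AWESOME</b> movie!</div>')

# Same idea as the scraper's plot expression: capture until the first '<'.
truncated = re.search(r'Item_InfoContainer">([^<]*)', html).group(1)

# Possible workaround (an assumption): capture lazily up to the closing
# </div>, then strip any inline tags from the captured text.
raw = re.search(r'Item_InfoContainer">(.*?)</div>', html, re.S).group(1)
full = re.sub(r'<[^>]+>', '', raw)

print(repr(truncated))
print(repr(full))
```

The first capture stops right before `<b>`, which matches the "everything up to 'an'" behaviour; the second keeps the whole sentence with the tags removed.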

This script pulled a good majority of my collection on the first try. I do have a few low-budget films that it couldn't find, but those were hard to find even using Google, so I'm happy either way. Here are some shots taken off my Xbox with the MediaStream skin and the script:

http://www.bleckshire.com/screenshot007.jpg http://www.bleckshire.com/screenshot008.jpg
http://www.bleckshire.com/screenshot010.jpg http://www.bleckshire.com/screenshot011.jpg
http://www.bleckshire.com/screenshot012.jpg http://www.bleckshire.com/screenshot013.jpg



<scraper name="Adult DVD Empire" content="movies" thumb="adultdvdempire.jpg">
<NfoUrl dest="3">
<RegExp input="$$1" output="&lt;url&gt;http://www.adultdvdempire.com/itempage.aspx?item_id=\1&lt;/url&gt;" dest="3">
<expression noclean="1">adultdvdempire.com/itempage.aspx\?item_id=([0-9]*)</expression>
</RegExp>
</NfoUrl>

<CreateSearchUrl dest="3">
<RegExp input="$$1" output="&lt;url&gt;http://www.adultdvdempire.com/SearchTitlesPage.aspx?SearchString=\1&lt;/url&gt;" dest="3">
<expression noclean="1"></expression>
</RegExp>

</CreateSearchUrl>

<GetSearchResults dest="6">
<RegExp input="$$5" output="&lt;?xml version=&quot;1.0&quot; encoding=&quot;iso-8859-1&quot; standalone=&quot;yes&quot;?&gt;&lt;results&gt;\1&lt;/results&gt;" dest="6">
<RegExp input="$$1" output="\1" dest="4">
<expression>&lt;a href=&quot;itempage.aspx\?item_id=([0-9]*)[^&gt;]*&gt;</expression>
</RegExp>
<RegExp input="$$1" output="&lt;entity&gt;&lt;title&gt;\2&lt;/title&gt;&lt;url&gt;http://www.adultdvdempire.com/itempage.aspx?item_id=\1&lt;/url&gt;&lt;/entity&gt;" dest="5">
<expression repeat="yes">ListItem_ItemTitle&quot;&gt;&lt;a href=[^=]*=([0-9]*)[^&gt;]*&gt;([^&lt;]*)</expression>
</RegExp>
<expression noclean="1"></expression>
</RegExp>
</GetSearchResults>

<GetDetails dest="3">
<RegExp input="$$5" output="&lt;details&gt;\1&lt;/details&gt;" dest="3">
<RegExp input="$$1" output="&lt;thumb&gt;http://images2.dvdempire.com/res/movies/\1h.jpg&lt;/thumb&gt;" dest="5">
<expression>BoxCover_Container&quot;&gt;[^&gt;]*&gt;&lt;img src=&quot;http://images2.dvdempire.com/res/movies/([^m]*)</expression>
</RegExp>

<RegExp input="$$1" output="&lt;title&gt;\1&lt;/title&gt;" dest="5+">
<expression>Item_Title&quot;&gt;([^&lt;]*)</expression>
</RegExp>

<RegExp input="$$1" output="&lt;studio&gt;\1&lt;/studio&gt;" dest="5+">
<expression>StudioProductionRating&quot;&gt;([^&lt;]*)</expression>
</RegExp>

<RegExp input="$$1" output="&lt;year&gt;\1&lt;/year&gt;" dest="5+">

<expression>Year: ([0-9]*)</expression>
</RegExp>

<RegExp input="$$1" output="&lt;tagline&gt;\1&lt;/tagline&gt;" dest="5+">
<expression>InfoTagLine&quot;&gt;([^&lt;]*)</expression>
</RegExp>

<RegExp input="$$1" output="&lt;plot&gt;\1&lt;/plot&gt;" dest="7">
<expression clear="yes">Item_InfoContainer&quot;&gt;[^ ]*([^&lt;]*)&lt;</expression>
</RegExp>

<RegExp input="$$1" output="&lt;plot&gt;\1&lt;/plot&gt;" dest="5+">
<expression>Item_InfoContainer&quot;&gt;[^&gt;]*&gt;[^&lt;]*&lt;/span&gt;[^ ]*([^&lt;]*)&lt;</expression>
</RegExp>

<RegExp input="$$1" output="&lt;actor&gt;&lt;name&gt;\2&lt;/name&gt;&lt;thumb&gt;http://images.dvdempire.com/pornstar/actors/\1.jpg&lt;/thumb&gt;&lt;/actor&gt;" dest="5+">
<expression repeat="yes">cast_id=([0-9]*)[^t]*type=1&quot;[^&gt;]*&gt;([^&lt;]*)</expression>
</RegExp>


<RegExp input="$$1" output="&lt;genre&gt;\1&lt;/genre&gt;" dest="5+">
<expression repeat="yes">media_id=[^i]*item_id=[^&gt;]*&gt;([^&lt;]*)</expression>
</RegExp>


<RegExp input="$$1" output="&lt;runtime&gt;\1&lt;/runtime&gt;" dest="5+">
<expression>&gt;Length: ([^&lt;]*)&lt;</expression>
</RegExp>

<RegExp input="$$1" output="&lt;mpaa&gt;\1&lt;/mpaa&gt;" dest="5+">
<expression>&gt;Rating: ([^&lt;]*)</expression>
</RegExp>

<RegExp input="$$1" output="&lt;director&gt;\1&lt;/director&gt;" dest="5+">
<expression repeat="yes">type=4&quot;&gt;([^&lt;]*)</expression>
</RegExp>
<expression noclean="1"></expression>
</RegExp>
</GetDetails>
</scraper>
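
As a rough illustration of what the GetSearchResults block above does, here is a Python sketch of the repeating title expression with the XML entities decoded. The HTML fragment is made up (the real ADE search page may differ), `re.findall` stands in for `repeat="yes"`, and the XML declaration that the scraper prepends is left out for brevity.

```python
import re

# Hypothetical search-results fragment; the real ADE markup may differ.
html = (
    '<span class="ListItem_ItemTitle">'
    '<a href="itempage.aspx?item_id=111">First Title</a></span>'
    '<span class="ListItem_ItemTitle">'
    '<a href="itempage.aspx?item_id=222">Second Title</a></span>'
)

# The repeat="yes" search expression, with &quot;, &gt; and &lt; decoded.
pattern = re.compile(r'ListItem_ItemTitle"><a href=[^=]*=([0-9]*)[^>]*>([^<]*)')

# Each match becomes one <entity>: \1 is the item id, \2 is the title.
entities = [
    f"<entity><title>{title}</title>"
    f"<url>http://www.adultdvdempire.com/itempage.aspx?item_id={item_id}</url>"
    f"</entity>"
    for item_id, title in pattern.findall(html)
]
results = "<results>" + "".join(entities) + "</results>"
print(results)
```

Each search hit ends up as one `<entity>` with a title and a details-page URL, which is the list XBMC shows when it asks which film to pick.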

To spiff, if you read this: I submitted a ticket for this already.

lovedaddy
2008-12-06, 03:09
Nice!

NotShorty
2009-01-24, 18:55
Thanks man! JadedVideo scraper lacked plots the last time I used it. And yes, plots/summaries can be a very useful feature of an adult library (though not as cool as cross-referencing by actress). :D

NS

Anacotic
2009-02-09, 12:31
I don't get any info from this scraper: no images, no plot, no actors, nothing...

gongloo
2009-03-09, 20:39
I found the same as well. I've fixed the issue on my end and submitted a patch[1]. Hopefully this will make it to SVN soon!

[1] http://xbmc.org/trac/ticket/6047

vdrfan
2009-03-09, 20:54
It's in r18354. Cheers!

Bleckshire
2009-03-30, 21:29
ADE must have altered their layout a bit and I didn't notice (not to mention I haven't been on the forums for a while). Thanks for the fix, gongloo.

Bleckshire
2009-03-31, 14:10
After gongloo fixed the script for the changes ADE made, I noticed the plot detection was screwed up. I've been messing around with it all night and changed a few things:
- The script still pulls the same info as before: title, front cover (and now the back cover as well), production year, studio, director, all actresses and actors, all genres and categories (great for sorting by star or style of porn), runtime, and plot + tagline.

As before, not every film has both a plot and a tagline, and the way ADE coded it made it a little difficult to scrape. It now pulls both tagline and plot if a film has both, or just the plot if the film has a plot but no tagline. I haven't run across any films that have ONLY a tagline; I've seen both, just a plot, or nothing. If you run across one that has only a tagline, nothing will be pulled. That seems rare, if not non-existent. Also, just like before, plots will be scraped completely unless the plot uses HTML tags, in which case the plot is scraped only up to the first HTML tag it hits. Taglines should always be complete.
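
The three cases described above (both, plot only, tagline only) can be sketched with a single expression that makes the tagline optional and the plot mandatory. This is a Python sketch under assumptions: the markup and the pattern are illustrative and are not taken from the submitted scraper, and `scrape` is a hypothetical helper name.

```python
import re

# Optional tagline span, then plot text that must be non-empty.
# Pattern and markup are illustrative assumptions, not the submitted scraper.
PATTERN = re.compile(
    r'Item_InfoContainer">'
    r'(?:<span class="InfoTagLine">([^<]*)</span>)?'
    r'\s*([^<]+)<'
)

def scrape(html):
    """Return (tagline, plot), or None when no plot text is found."""
    m = PATTERN.search(html)
    if m is None:
        return None
    return m.group(1), m.group(2).strip()

both = ('<div class="Item_InfoContainer">'
        '<span class="InfoTagLine">A tagline.</span> A plot.</div>')
plot_only = '<div class="Item_InfoContainer">A plot.</div>'
tagline_only = ('<div class="Item_InfoContainer">'
                '<span class="InfoTagLine">A tagline.</span></div>')

for page in (both, plot_only, tagline_only):
    print(scrape(page))
```

Because the plot group requires at least one character before the next `<`, a page with only a tagline produces no match at all, which mirrors the "nothing will be pulled" case.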

http://xbmc.org/trac/ticket/6215

http://www.bleckshire.com/skrn001.jpg http://www.bleckshire.com/skrn002.jpg

nc88keyz
2009-08-05, 16:54
broken - box covers

looks like url might have changed to:

http://images2.dvdempire.com/res/movies/1/

checked all r/w permissions.

might want to check it out.

r21936 / all skins.

If you rescrape it loses cover as well.

spiff
2009-08-05, 17:02
see the sticky in this very forum

nc88keyz
2009-08-05, 17:12
I just saw it, but I don't know how to fix it. Can you update the XML for us? :)

I did try, however. I'm just not that good at all this.

Is there any way I can erase my name from this post? lol

vdrfan
2009-08-05, 18:25
Fixed in SVN r22012

nc88keyz
2009-09-19, 16:34
broke again for covers.

Any idea?

Back on the Windows platform. Confirmed the same on ATV.

Everything else scrapes.

vdrfan
2009-09-19, 20:22
Please create a new bug report at xbmc.org/trac (with debug log attached). Thanks.

vdrfan
2009-09-20, 00:47
Fixed in SVN r22999.

NandoBR
2009-09-20, 09:25
Still having cover problems (r22999)

vdrfan
2009-09-20, 11:54
Works for me (tm). Debug log please.

NandoBR
2009-09-20, 17:34
Here is my debug log

http://pastebin.com/m478b2a44

vdrfan
2009-09-20, 17:37
This is not a debug log. Please enable debug logging in system settings.

Freddo
2009-09-20, 17:57
Bleckshire wrote: "plots will be scraped completely unless the plot uses HTML tags. It will then be scraped up until the first HTML tag it hits."


Hi, I don't know if this helps at all but this rang a bell with me so I dug out an old TRAC ticket I filed about the imdb scraper having a similar problem:

http://xbmc.org/trac/ticket/5013

Maybe you can get some clues from there?

NandoBR
2009-09-20, 17:58
I'm sorry.
Here is the debug log (I hope... lol)

http://pastebin.com/m48fa3acd

vdrfan
2009-09-20, 23:21
There's no scan in that log, plus you're running SVN r22943. Please upgrade to a version >= r22999, run a scan or refresh the given movies, and report back.

@Freddo: You're quoting an outdated post from March ;)

NandoBR
2009-09-21, 00:26
Where can I find XBMCSetup SVN22999 (Windows version)?
I use the XBMC Update utility and the latest version there is 22943.

vdrfan
2009-09-21, 00:29
Search for xbmc nightly builds or alternatively just replace your current scraper .xml with http://xbmc.org/trac/export/22999/branches/linuxport/XBMC/system/scrapers/video/adultdvdempire.xml

NandoBR
2009-09-21, 05:17
vdrfan,

I downloaded and installed the nightly build (r22968). Didn't work.
Then I downloaded and installed ikon's build (r22975-gl). That didn't work either.

In both cases I used the new adultdvdempire.xml (r22999).

NandoBR
2009-09-21, 05:38
I don't know what I did but now it works.
Thanks for your help.

vdrfan
2009-09-21, 09:36
Glad to hear.