XBMC Community Forum  

2009-01-02, 23:12   #1
jhhbe
Variable in <expression> section?

hi,

My challenge is that I have no way to end up on a single movie page. If I search for movie A, the results contain that movie and all its sequels, which would be OK for GetSearchResults. However, those URLs all link to one and the same detail page, which holds the initial movie and all its sequels (I hope this still makes sense).

So I can populate GetSearchResults, and that gives a link to the 'detail page' for every movie found, but when I use that URL I potentially get more than one movie.

I think this would be easier to fix if I could use a variable in my <expression>Regex comes here</expression> bit, which would then look like <expression>Regex with Movietitle comes here</expression>.

Problem is that I have no clue how to sneak in the Movietitle, if it is possible at all.

Jan
2009-01-02, 23:32   #2
spiff

that's easy. just stick the movie title in a buffer, then address it with $$<#buffer>. e.g. $$1 works everywhere: in output, input and expressions
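
For anyone following along, here is a rough Python model of what that substitution amounts to. It is only a sketch, not the XBMC parser: `expand_buffers`, the sample page and the choice of buffer 6 are made up for illustration, and it assumes the buffer content is pasted into the expression verbatim before the regex is applied.

```python
import re


def expand_buffers(text, buffers):
    """Replace every $$N marker in text with the content of buffer N.

    Toy model only: a buffer that was never filled expands to an empty string.
    """
    return re.sub(r"\$\$(\d+)", lambda m: buffers.get(int(m.group(1)), ""), text)


# Hypothetical buffers: 1 holds the fetched page, 6 holds the movie title.
buffers = {
    1: "<movie><title>Alien</title></movie><movie><title>Aliens</title></movie>",
    6: "Alien",
}

# The marker is expanded first; the expanded string is then used as the regex.
expression = expand_buffers(r"<title>($$6)</title>", buffers)
titles = re.findall(expression, buffers[1])

print(expression)
print(titles)
```

In this model the title is pasted in as-is, so any regex metacharacters in a title would be read as regex syntax.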
__________________
Always read the XBMC online-manual, FAQ and search the forum before posting.
Do not e-mail XBMC-Team members directly asking for support. Read/follow the forum rules.
For troubleshooting and bug reporting please make sure you read this first.
2009-01-03, 11:48   #3
jhhbe

Quote:
Originally Posted by spiff
that's easy. just stick the movie title in a buffer, then address it with $$<#buffer>. e.g. $$1 works everywhere: in output, input and expressions
Right - that makes me feel a bit silly, but I still can't figure it out. CreateSearchUrl has the movie title in $$1 and sends the web page to $$3. For testing I'm simply using the exported XML movie database from XBMC (so I have several movies on one page).

GetSearchResults then has all the titles in it, so I need the title to restrict the overview list.

I threw in a couple of lines to send the movie title to buffer 2 (because I have it when creating the search URL), but I can't find it again further down when reading the results.

If I replace the .*? in the <title>(.*?)</title> group with $$2 in the GetSearchResults section, it does not work. When I hardcode a movie title there instead, I get just that title in the overview rather than the full list, so that looks promising. So where has my $$2 gone?

Code:
	<CreateSearchUrl dest="3">
		<RegExp input="$$1" output="http://smart-pvr/movie.xml" dest="3">
			<RegExp input="$$1" output="\1" dest="2">
				<expression noclean="1"/>
			</RegExp>
			<expression noclean="1"/>
		</RegExp>
	</CreateSearchUrl>
	<GetSearchResults dest="8">
		<RegExp input="$$5" output="&lt;?xml version=&quot;1.0&quot; encoding=&quot;iso-8859-1&quot; standalone=&quot;yes&quot;?&gt;&lt;results&gt;\1&lt;/results&gt;" dest="8">
			<RegExp input="$$1" output="&lt;entity&gt;&lt;title&gt;&apos;\2&apos; van \4 (\3)&lt;/title&gt;&lt;url&gt;http://smart-pvr/movie.xml&lt;/url&gt;&lt;/entity&gt;" dest="5">
				<expression repeat="yes">&lt;(movie)&gt;.*?&lt;title&gt;(.*?)&lt;/title&gt;.*?&lt;year&gt;(.*?)&lt;/year&gt;.*?&lt;director&gt;(.*?)&lt;/director&gt;</expression>
			</RegExp>
			<expression noclean="1"/>
		</RegExp>
	</GetSearchResults>
2009-01-03, 15:29   #4
spiff

so you want the movie title in getsearchresults if i understand you correctly?

by default the scraper parser clears buffers between function calls. use the clearbuffers="no" parameter to override this behaviour. be warned though; getsearchresults puts the url in buffer 2 so you will have to use another one
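
A toy Python model of that behaviour, in case it helps to see it spelled out. This is a sketch under assumptions, not the real parser: `ToyBuffers`, `title_seen_by_search_results` and the sample inputs are hypothetical, and it assumes the wipe happens once a function has returned and that GetSearchResults receives the page in buffer 1 and the url in buffer 2.

```python
class ToyBuffers:
    """Toy model of the scraper's numbered buffers (not the real XBMC parser)."""

    def __init__(self):
        self.slots = {}

    def call(self, inputs, body, clearbuffers):
        # The caller's inputs overwrite whatever those buffers held before.
        self.slots.update(inputs)
        result = body(self.slots)
        if clearbuffers:
            # Modelled default: wipe everything once the function has returned.
            self.slots.clear()
        return result


SEARCH_URL = "http://smart-pvr/movie.xml"


def title_seen_by_search_results(stash_slot, clearbuffers):
    """Stash the title in CreateSearchUrl, read it back in GetSearchResults."""
    buffers = ToyBuffers()

    def create_search_url(slots):
        slots[stash_slot] = slots[1]  # copy the title out of buffer 1
        return SEARCH_URL

    def get_search_results(slots):
        return slots.get(stash_slot, "")

    buffers.call({1: "Alien"}, create_search_url, clearbuffers)
    # Assumed inputs: the page in buffer 1 and the url in buffer 2.
    page_inputs = {1: "<movie>...</movie>", 2: SEARCH_URL}
    return buffers.call(page_inputs, get_search_results, clearbuffers)


print(repr(title_seen_by_search_results(6, clearbuffers=True)))   # title is gone
print(repr(title_seen_by_search_results(6, clearbuffers=False)))  # title survives
print(repr(title_seen_by_search_results(2, clearbuffers=False)))  # url replaced it
```

The third call is the buffer 2 trap: the title survives the first function but is overwritten by the url before the second one runs.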

Last edited by spiff; 2009-01-03 at 15:44.
2009-01-03, 23:53   #5
jhhbe

Great - that did the trick!

One final issue though: when the title is '101 Dalmatians', it is NOT passing on '101 Dalmatians' but '101%20dalmatians' instead. I tried the noclean option, but that is not working.

When a movie on my web page carries the title '101%20dalmatians', it all works, so I'm quite happy with that so far.

Any pointers as to where I can find the untouched title? When the script fails I get a popup with the exact title, so the system has to know it somehow.

btw - thanks a lot for your help - I needed to be put on the right track or I would never have made it so far.

Jan

Last edited by jhhbe; 2009-01-03 at 23:53. Reason: 'passing on' or 'NOT passing on' makes a difference
2009-01-04, 00:34   #6
spiff

not much to do about that i'm afraid. since the search is usually done using urls, we have to url-encode the title before passing it. if this really is a problem we can add an additional input that holds the non-encoded search title, just say jump
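
To make the encoding concrete, here is a small Python sketch of percent-encoding and of undoing it. It only illustrates url encoding in general using Python's standard library; it is not the code XBMC runs, and the lowercase form is simply the string reported earlier in this thread.

```python
from urllib.parse import quote, unquote

title = "101 Dalmatians"

# Percent-encoding turns the space into %20 so the title is safe inside a url.
encoded = quote(title)

# Undoing it on the string reported in this thread gives back a readable title.
decoded = unquote("101%20dalmatians")

# Comparing case-insensitively sidesteps the lowercase difference.
same_movie = decoded.lower() == title.lower()

print(encoded)
print(decoded)
print(same_movie)
```

So a page that stores plain titles could still be matched if the encoded title were decoded first, which is roughly what an extra non-encoded input would make unnecessary.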

Last edited by spiff; 2009-01-04 at 02:41.
2009-01-05, 21:05   #7
jhhbe

hi,

Another cry for help. OK, the scraper has the wrong name: originally I intended to integrate the SageTV library into XBMC, but it seems I can't get the plot from the SageTV web server.

The next idea was simply to put the exported videodb.xml from XBMC on the web server. Searching for a title is then not really possible, as the web server always returns the full .xml with all movies. That XML is quite easy to produce, and why would we not consider the XBMC format of the XML as the reference format?

So when CreateSearchUrl performs the call, the web server returns the full .xml. The regex then filters out that one title, because before the <title> tag I added a <cleantitle> tag, and that works nicely (we would not have to do this if we got the exact title, but let's not go there yet).
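
For illustration, adding such a <cleantitle> tag could be scripted roughly like this. It is a sketch only: `add_cleantitle` and the sample XML are made up, and it assumes the search title arrives percent-encoded and lowercased, as seen with '101%20dalmatians' earlier in this thread.

```python
import re
from urllib.parse import quote


def add_cleantitle(xml_text):
    """Insert a <cleantitle> tag in front of every <title> tag.

    Assumption: the scraper receives the title percent-encoded and lowercased.
    """
    def with_cleantitle(match):
        clean = quote(match.group(1)).lower()
        return "<cleantitle>" + clean + "</cleantitle>" + match.group(0)

    return re.sub(r"<title>(.*?)</title>", with_cleantitle, xml_text)


# Hypothetical one-movie sample; a real export holds many more fields.
sample = "<movie><title>101 Dalmatians</title></movie>"
patched = add_cleantitle(sample)

print(patched)
```

Titles containing XML entities such as &amp; are not handled here and would need decoding before being encoded.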

What I absolutely don't understand is why I lose the content of $$6 in GetDetails, regardless of whether I use <GetDetails clearbuffers="no" dest="3"> or <GetDetails dest="3"> (I thought clearbuffers affected the bit that came after the module). At that point $$6 is empty, so the regex matches the first title in the .xml, just when I thought I was there.

The output bit for GetDetails is really easy, because we simply copy the content of the .xml we get from the web server.

Any hints? Many thanks!

p.s.: I could achieve this by simply importing the videodb, but I've given up on that as I could not get the thumbnails in, and this is good practice for when I start looking into TV show integration with SageTV.

Code:
<scraper name="SageTV" content="movies" thumb="SageTV.gif" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
	<NfoUrl dest="3">
		<RegExp input="$$1" output="http://smart-pvr:8080/thumbs/movie.xml" dest="3">
			<expression noclean="1"/>
		</RegExp>
	</NfoUrl>
	<CreateSearchUrl clearbuffers="no" dest="3">
		<RegExp input="$$1" output="http://smart-pvr:8080/thumbs/movie.xml" dest="3">
			<RegExp input="$$1" output="\1" dest="6">
				<expression/>
			</RegExp>
			<expression noclean="1"/>
		</RegExp>
	</CreateSearchUrl>
	<GetSearchResults clearbuffers="no" dest="8">
		<RegExp input="$$5" output="&lt;?xml version=&quot;1.0&quot; encoding=&quot;iso-8859-1&quot; standalone=&quot;yes&quot;?&gt;&lt;results&gt;\1&lt;/results&gt;" dest="8">
			<RegExp input="$$1" output="&lt;entity&gt;&lt;title&gt;&apos;\2&apos; van \4 (\3)&lt;/title&gt;&lt;url&gt;http://smart-pvr:8080/thumbs/movie.xml&lt;/url&gt;&lt;/entity&gt;" dest="5">
				<expression repeat="yes">&lt;(movie)&gt;.*?&lt;cleantitle&gt;($$6)&lt;/cleantitle&gt;.*?&lt;year&gt;(.*?)&lt;/year&gt;.*?&lt;director&gt;(.*?)&lt;/director&gt;</expression>
			</RegExp>
			<expression noclean="1"/>
		</RegExp>
	</GetSearchResults>
	<GetDetails dest="3">
		<RegExp input="$$5" output="&lt;details&gt;\1&lt;/details&gt;" dest="3">
			<RegExp input="$$1" output="\1" dest="5">
				<expression trim="1" noclean="1">$$6&lt;/cleantitle&gt;(.*?)&lt;/movie&gt;</expression>
			</RegExp>
			<expression noclean="1"/>
		</RegExp>
	</GetDetails>
</scraper>
2009-01-06, 17:28   #8
spiff

i do understand. r16922
2009-01-15, 15:13   #9
jhhbe

yep - that works nicely! (he said after downloading and installing Linux, compiling XBMC and some more suffering with a drive that refused to read half of the CDs I burned with the Ubuntu image).

Thanks!
All times are GMT +2.
Copyright © 2008, XBMC Project