| Scraper Development Developers forum for meta data scrapers. Scraper developers only! Not for posting feature requests, bugs, or end-user support requests! |
#1
Member
Join Date: Feb 2004
Posts: 62
I'm working on a TV Show scraper for allocine.fr.
I'm down to the episode list, but I have a small problem. I use the scrap.exe tool to test the scraper, and when the tool gets the links for the episode list, an "&" sign gets lost. Let me show you:
Code:
</status><premiered> 7 Août 2005</premiered><episodeguide><url>http://www.allocine.fr/series/episodes_gen_csaison=1511&cserie=513.html</url> <url>http://www.allocine.fr/series/episodes_gen_csaison=2450&cserie=513.html</url></episodeguide></details>
Episodelist URL 1:http://www.allocine.fr/series/episodes_gen_csaison=1511cserie=513.html
Episodelist URL 2:http://www.allocine.fr/series/episodes_gen_csaison=2450cserie=513.html
GetEpisodeListInternal 2 returned :
GetEpisodeList returned :
Error: Unable to parse episodelist.xml
You can see that in the <details> tag the URLs are OK:
Code:
<url>http://www.allocine.fr/series/episodes_gen_csaison=1511&cserie=513.html</url>
But in the episode list URLs the "&" is missing:
Code:
Episodelist URL 1:http://www.allocine.fr/series/episodes_gen_csaison=1511cserie=513.html
Here is the relevant part of the scraper:
Code:
<RegExp input="$$8" output="<episodeguide>\1</episodeguide>" dest="5+">
    <RegExp input="$$2" output="<url>http://www.allocine.fr/series/episodes_gen_csaison=\1&cserie=$$4.html</url>" dest="8">
        <expression repeat="yes">"/series/casting_gen_csaison=([0-9]*)&cserie=$$4.html" class="link1">[0-9]</a></expression>
    </RegExp>
    <expression noclean="1"></expression>
</RegExp>
Any help would be appreciated.
The_Dogg
#2
Member
Join Date: Feb 2004
Posts: 62
After a little more research I found how to make the missing & show up. I had to put
Code:
&amp;
in place of the bare & in the output attribute:
Code:
<RegExp input="$$8" output="<episodeguide>\1</episodeguide>" dest="5+">
    <RegExp input="$$2" output="<url>http://www.allocine.fr/series/episodes_gen_csaison=\1&amp;cserie=$$4.html</url>" dest="8">
        <expression repeat="yes">"/series/casting_gen_csaison=([0-9]*)&cserie=$$4.html" class="link1">[0-9]</a></expression>
    </RegExp>
    <expression noclean="1"></expression>
</RegExp>
#3
Grumpy Bastard Developer
Join Date: Nov 2003
Posts: 7,715
The reason for this is that you are in an XML document, and you return XML. Each time the XML is parsed, you need &amp; in place of a bare &, or the & will be stripped, because a bare & is not valid XML.
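To illustrate the point about each round of parsing, here is a minimal Python sketch. It is not part of the scraper or of scrap.exe; it uses Python's standard library XML parser, which is stricter than the parser in this thread (it rejects a bare & instead of silently dropping it). The variable names are made up for the example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# The URL the scraper wants to end up with (taken from the thread).
raw_url = "http://www.allocine.fr/series/episodes_gen_csaison=1511&cserie=513.html"


def parses(text):
    """Return True if text is well-formed XML for Python's parser."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# A bare & is not well-formed XML, so a strict parser refuses it.
bare_ok = parses("<url>" + raw_url + "</url>")

# Escaped once (& becomes &amp;): the URL survives one round of parsing.
once = escape(raw_url)
parsed_once = ET.fromstring("<url>" + once + "</url>").text

# Escaped twice: the URL survives two rounds of parsing,
# losing one level of escaping per round.
twice = escape(once)
after_first_parse = ET.fromstring("<output>" + twice + "</output>").text
after_second_parse = ET.fromstring("<url>" + after_first_parse + "</url>").text

print(bare_ok)
print(parsed_once == raw_url)
print(after_second_parse == raw_url)
```

Each parse removes one level of escaping, so text that passes through the parser N times needs to be escaped N times to come out with its & intact.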