View Full Version : MovieMeter.nl (Dutch Movies) Scraper development...
I would like to try and create a moviemeter.nl scraper.
It just doesn't have as much information as IMDb, only:
Name
Year
Country
Genre(s)
Runtime
Director name
Actor names
Plot
Image
Is it possible to retrieve other things like rating from imdb?
snaaps50
2008-01-24, 02:06
i want this also
I have checked different pages and all pages have the same content:
<img class="poster" src="http://www.moviemeter.nl/images/covers/30000/30611.jpg" style="width: 200px;" alt="Verlengd Weekend (2005)" /><div id="film_stars"><div class="star_full"></div><div class="star_full"></div><div class="star_full"></div><div class="star_empty"></div><div class="star_empty"></div></div><div id="film_votes"><b>133</b> stemmen<span class="divider"> | </span>gemiddelde <b>3,11</b></div><div id="film_poster_stats_topnotation" style="padding-bottom: 20px;"></div><div id="film_myvote"></div><div id="film_poster_stats_bottom" style="background: red;margin-top: -46px;"></div></div><div id="film_info">België<br />Komedie<br />93 minuten<br /><br />geregisseerd door <a href="http://www.moviemeter.nl/director/4743">Hans Herbots</a><br />met Jan Decleir, Koen De Bouw en Wouter Hendrickx<br /><br />Deze sociale komedie vertelt het verhaal van twee ontslagen werknemers van een Kempisch familiebedrijfje (Jan Decleir en Wouter Hendrickx) die hun ex-werkgever (Koen De Bouw) gijzelen in zijn eigen huis. Het duo heeft het plan opgevat voor zichzelf en hun gedupeerde werkmakkers een schadevergoeding af te dwingen. Maar niets loopt zoals gepland.</div>
</div>
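For readers following along, here is a rough Python sketch of how the fields in that fragment could be pulled out with regular expressions. The sample string is a shortened copy of the HTML above (the plot is cut off), and the function name and field names are made up for illustration; the real page may differ from this fragment.

```python
import re

# Shortened copy of the detail-page fragment quoted above (plot truncated).
SAMPLE = (
    '<div id="film_votes"><b>133</b> stemmen<span class="divider"> | </span>'
    'gemiddelde <b>3,11</b></div>'
    '<div id="film_info">België<br />Komedie<br />93 minuten<br /><br />'
    'geregisseerd door <a href="http://www.moviemeter.nl/director/4743">Hans Herbots</a><br />'
    'met Jan Decleir, Koen De Bouw en Wouter Hendrickx<br /><br />Deze sociale komedie</div>'
)

# Hypothetical field names; each pattern captures one value from the fragment.
PATTERNS = {
    "votes": r"<b>(\d+)</b> stemmen",
    "average": r"gemiddelde <b>([\d,]+)</b>",
    "runtime": r"(\d+) minuten",
    "director": r"geregisseerd door <a [^>]*>([^<]+)</a>",
    "country": r'<div id="film_info">([^<]+)<br />',
}

def parse_detail(html):
    """Return a dict of field -> captured text; fields that are not found map to None."""
    result = {}
    for field, pattern in PATTERNS.items():
        match = re.search(pattern, html)
        result[field] = match.group(1) if match else None
    return result

info = parse_detail(SAMPLE)
```

Note that the average comes back as the text "3,11" with a decimal comma, so it still needs converting before it can be used as a number.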
Maybe there is somebody who can make a plugin based on the source code from videometer.nl.
yes, you can chain as you want in the scrapers - just have a look at the imdb one
I'm also trying to make a scraper for moviemeter.nl. The problem is that I can't get the search results.
the search page:
http://www.moviemeter.nl/film.search/moviename
redirects to
http://www.moviemeter.nl/film/searchresults#results
What i get after scraping this page is a blank search page.
Anyone have any tips how to get the search results?
Been looking at other scrapers for about a day now but haven't been able to find the solution.
you need to post the form at http://www.moviemeter.nl/film/search/
Forgive me if this is a newbie question. I'm just starting with this scraper stuff.
Here's what i got
After some searching i found that you can query moviemeter.nl like this:
http://www.moviemeter.nl/calls/quicksearch.php?hash=[HASHCODE]&type=films&search=[MOVIENAME]
To do this I need to get the hash code from the http://www.moviemeter.nl main page. This means I first have to request the main page, read the hash code from the result, and then create my actual search request.
I can't seem to figure out how to do this in the scraper. Does anyone have an idea/solution?
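Outside the scraper syntax, the two-step idea looks roughly like this in Python. This is only a sketch: it assumes the main page contains the token as "hash=" followed by 32 hex characters, which the thread does not confirm, and the function names are invented for the example.

```python
import re
from urllib.parse import quote

# URL template taken from the post above.
QUICKSEARCH = ("http://www.moviemeter.nl/calls/quicksearch.php"
               "?hash={hash}&type=films&search={term}")

def extract_hash(main_page_html):
    """Return the first 32-character hex token after 'hash=', or None (assumed format)."""
    match = re.search(r"hash=([0-9a-f]{32})", main_page_html)
    return match.group(1) if match else None

def build_search_url(main_page_html, movie_name):
    """Step 2: combine the hash from the main page with the URL-encoded movie name."""
    token = extract_hash(main_page_html)
    if token is None:
        raise ValueError("no hash found in main page")
    return QUICKSEARCH.format(hash=token, term=quote(movie_name))
```

The main page itself would be downloaded first (step 1) and its HTML passed in as main_page_html.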
i just added chaining in getsearchresults so you can do this
cheers
spiff
Bigfoot87
2008-07-11, 11:16
What's the status of this project?
I would love to see my movieinformation in Dutch... :)
And a lot of Dutch community-members with me I guess...
I can't speak for the topic starter but i'm still working on it. As i'm just getting into this scraper development i'm at a trial and error stage so it might take me a while.
you got the search thing working i assume?
I've got it as far as downloading the main page and recovering the hash value needed for the search. I've been on vacation so haven't worked on it much.
What it does now is request the main page and retrieve the hash code. Next I lose the movie name/search string, as $$1 gets overwritten, and I need both to build the complete search string. The next problem is getting the scraper to request the actual search URL, which is where chaining (what you mentioned) comes in, I guess?
I have very little time to figure everything out. Mostly I skip my lunch breaks at work to work on this, so please excuse me if it takes me a long time. I hope to have it figured out and complete in a few weeks if I can find the time.
pseudo:
<CreateSearchUrl clearbuffers="no" dest="3">
  store the input string in, say, $$9
  return <url>mainpage</url>
</CreateSearchUrl>
<GetSearchResults dest="3">
  fetch hash
  return <url function="realgetresults">someurl</url> based on hash and $$9
</GetSearchResults>
<RealGetResults>
  return the parsed results
</RealGetResults>
important points being;
1) the clearbuffers=no will make sure you don't clear the buffers between subsequent calls. this means you can pass the searchstring further on.
2) the url with a function parameter is what i'm referring to as chaining
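To make the control flow of the pseudocode concrete, here is a small Python model of chaining. It is not how XBMC implements it. The buffers dict stands in for the $$n buffers that survive between calls, the fetch function is a stub, and the hash pattern and the <title> parsing in the last step are placeholders.

```python
import re
from urllib.parse import quote

def create_search_url(buffers, search_string):
    # Keep the search string in buffer 9 so later steps can still read it.
    buffers[9] = search_string
    return ("http://www.moviemeter.nl/", "get_search_results")

def get_search_results(buffers, page):
    # Fetch the hash from the main page, then chain to the real search URL.
    token = re.search(r"hash=([0-9a-f]+)", page).group(1)
    url = ("http://www.moviemeter.nl/calls/quicksearch.php?hash=%s&type=films&search=%s"
           % (token, quote(buffers[9])))
    return (url, "real_get_results")

def real_get_results(buffers, page):
    # Last link of the chain: returns parsed results instead of another URL.
    return re.findall(r"<title>([^<]*)</title>", page)

STEPS = {"get_search_results": get_search_results,
         "real_get_results": real_get_results}

def run_search(search_string, fetch):
    """Follow (url, next_function) pairs until a step returns a plain list."""
    buffers = {}  # shared across all steps, like clearbuffers="no"
    outcome = create_search_url(buffers, search_string)
    while isinstance(outcome, tuple):
        url, next_step = outcome
        outcome = STEPS[next_step](buffers, fetch(url))
    return outcome

def fake_fetch(url):
    # Stub downloader with canned pages so the sketch runs offline.
    if "quicksearch" in url:
        return "<title>Jurassic Park</title>"
    return "search.php?hash=0123abcd&qs=1"

titles = run_search("jurassic park", fake_fetch)
```

The point of the model is that a step returning a URL plus a function name causes another download, and the shared buffers are what let the search string reach the second request.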
Thanks spiff that clears some things up.
I made/found some time yesterday to work on the scraper and think I've got it to a stage where it can retrieve the basic details from moviemeter. I'm testing it using scrap.exe but it keeps crashing. I also tried imdb.xml with scrap.exe, which also fails. So I'm guessing that scrap.exe differs from xbmc in functionality and that I should test the scraper with xbmc. Hopefully I will find time this weekend to make a stable "basic" version of the scraper, after which I will look at adding impawards and movieposterdb support for posters and imdb support for additional movie details.
scrap is outdated and we lost the source code for parts of it :/
Thanks for all the help you are providing spiff.
I think it might be useful to update the wiki with the fact that scrap.exe is outdated, because the current wiki page (http://xbmc.org/wiki/?title=Scrap) doesn't mention this. Or am I missing something?
Has this scraper made any progress? :) I'd still love my movie information in Dutch.
Sadly I can't report much progress. As I said in my latest post in this thread, I think I have the basic functionality done but need to test it in xbmc. I just haven't found the time these last weeks to do this. I'm still determined to finish the scraper and add support for impawards, movieposterdb, fanart and additional imdb info. But sadly I can't give you a timeframe for all this. I'm working on it in the (very) little spare time that I have.
Sounds very ambitious :) I have plenty of time; it's just nice to hear you're planning on finishing it in the future. Hopefully someone pops up to offer you help and speed everything up a bit. There are plenty of Dutch/Flemish xbmc users out there.
Bigfoot87
2008-09-18, 00:10
True!
Maybe you can ask for help in the Dutch community topic (http://gathering.tweakers.net/forum/list_messages/1306695/0). :)
Finally had some time to fiddle with the scraper. Seems moviemeter.nl changed something on the main page so i had to change the regexp for retrieving the hashcode. At least that part is working again.
I have a question about chaining for the GetSearchResults function. Does scrap.exe support this, or can I only test this using xbmc? My guess is it's the latter. Am I guessing right?
yeah, scrap is totally deprecated as we lost the source code :/
Wow, that was a fast reply. Thanks, much appreciated.
Time for me to set up xbmc for Windows to test the scraper properly and expand its functionality.
Where can I find the moviemeter scraper?
Bigfoot87
2008-10-07, 15:29
It's still under construction. ;)
Hi, i made a php script that hopefully someone can translate into a working scraper file...
If you need more info please reply
<?php
/*
if ($('quicksearch')) {
new Searcher.Ajax.Json('quicksearch', 'http://www.moviemeter.nl/calls/search.php?hash=29918b11647fdd3755d59e6ac45d4977&qs=1', {
'postVar': 'search',
'quicksearch': true,
'maxChoices': 12,
'overflow':true,
'basic':true
});
}
*/
$term = 'jurassic park';
// urlencode() so spaces and special characters survive in the query string
$url = 'http://www.moviemeter.nl/calls/search.php?hash=29918b11647fdd3755d59e6ac45d4977&qs=1&search='.urlencode($term);
$str = file_get_contents($url);
//json response example for search "jurassic park"
//$str = '["header_films_0_3",{"i":"365","ty":"f","t":"Jurassic Park","a":"","y":"1993","img":"%3Cimg src%3D%22http%3A%2F%2Fwww.moviemeter.nl%2Fimages%2Fcovers%2Fthumbs%2F0%2F365.jpg%22 class%3D%22thumbnail%22 alt%3D%22Jurassic Park %281993%29%22 %2F%3E","px":75,"h":"%3Cp class%3D%22subtext%22%3EAvontuur %2F Science-Fiction%2C 127 minuten%3Cbr %2F%3Egeregisseerd door Steven Spielberg%3Cbr %2F%3Emet Sam Neill%2C Jeff Goldblum en Laura Dern%3Cbr %2F%3E%3C%2Fp%3E"},{"i":"341","ty":"f","t":"Jurassic Park III","a":"Jurassic Park 3","y":"2001","img":"%3Cimg src%3D%22http%3A%2F%2Fwww.moviemeter.nl%2Fimages%2Fcovers%2Fthumbs%2F0%2F341.jpg%22 class%3D%22thumbnail%22 alt%3D%22Jurassic Park III %282001%29%22 %2F%3E","px":75,"h":"%3Cp class%3D%22subtext%22%3EScience-Fiction %2F Actie%2C 92 minuten%3Cbr %2F%3Egeregisseerd door Joe Johnston%3Cbr %2F%3Emet Sam Neill%2C William H. Macy en T%E9a Leoni%3Cbr %2F%3E%3C%2Fp%3E"},{"i":"364","ty":"f","t":"Lost World%3A Jurassic Park%2C The","a":"Jurassic Park 2","y":"1997","img":"%3Cimg src%3D%22http%3A%2F%2Fwww.moviemeter.nl%2Fimages%2Fcovers%2Fthumbs%2F0%2F364.jpg%22 class%3D%22thumbnail%22 alt%3D%22Lost World%3A Jurassic Park%2C The %281997%29%22 %2F%3E","px":75,"h":"%3Cp class%3D%22subtext%22%3EScience-Fiction %2F Avontuur%2C 129 minuten%3Cbr %2F%3Egeregisseerd door Steven Spielberg%3Cbr %2F%3Emet Jeff Goldblum%2C Julianne Moore en Vince Vaughn%3Cbr %2F%3E%3C%2Fp%3E"},"header_directors_0_0","header_topics_0_2",{"i":"1424","t":"Jurassic Park 4 %28Film %3E Nieuws%29","ty":"t"},{"i":"5679","t":"Favoriete dino uit de Jurassic Park reeks %28Film %3E Toplijsten en favorieten%29","ty":"t"},"header_users_0_2",{"i":"37185","t":"JurassicPark","ty":"u","img":"%3Cimg src%3D%22http%3A%2F%2Fwww.moviemeter.nl%2Fimages%2Fuser_unknown.jpg%22 class%3D%22avatar%22 %2F%3E","px":54,"h":"%3Cp class%3D%22subtext%22%3Eingeschreven sinds 15 augustus 2006%3Cbr %2F%3E632 stemmen%2C 509 berichten%3C%2Fp%3E"},{"i":"1720","t":"Jurassic Smurf","ty":"u"}]';
//echo $str;
//echo '<hr />';
//i = id, t = movie title, y = movie year
preg_match_all('|"i":"(.*)".*"t":"(.*)".*"y":"(.*)".*|iUm', $str, $ids);
$detail_urls = array();
if (!empty($ids[1])) {
foreach($ids[1] as $id) {
array_push($detail_urls, 'http://www.moviemeter.nl/film/'.$id);
}
}
//echo 'matches<pre>';
//print_r($detail_urls);
//echo '</pre>';
//parse a detail url
$contents = file_get_contents('http://www.moviemeter.nl/film/364');
$contents = str_replace("\r\n", '', $contents);
$contents = str_replace("\r", '', $contents);
$contents = str_replace("\n", '', $contents);
preg_match_all('|.*<div id="film_info">(.*)<br />(.*)<br />(.*)<br />(.*)<br />(.*)<br />(.*)<br />(.*)<br />(.*)<br />.*</div>.*|iUm', $contents, $movie_info);
echo 'country:'.$movie_info[1][0];
echo '<br />';
echo 'genre(s):'.$movie_info[2][0];
echo '<br />';
echo 'movie length:'.$movie_info[3][0];
echo '<br />';
echo 'director:'.$movie_info[5][0];
echo '<br />';
echo 'actors:'.$movie_info[6][0];
echo '<br />';
echo 'movie info:'.$movie_info[8][0];
echo '<br />';
?>
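As a comparison with the regex approach in the script, the sample response can also be read with a JSON parser. The Python sketch below uses two entries shortened from the sample above. It assumes the percent-encoded values are Latin-1 (the sample contains "T%E9a" for "Téa") and that film entries are the objects with "ty" equal to "f"; both are readings of the sample, not documented behaviour.

```python
import json
from urllib.parse import unquote

# Two entries shortened from the sample response quoted in the script above.
SAMPLE = (
    '["header_films_0_3",'
    '{"i":"364","ty":"f","t":"Lost World%3A Jurassic Park%2C The",'
    '"a":"Jurassic Park 2","y":"1997"},'
    '"header_topics_0_2",'
    '{"i":"1424","t":"Jurassic Park 4 %28Film %3E Nieuws%29","ty":"t"}]'
)

def parse_search_response(raw):
    """Return (id, title, year, detail_url) tuples for film entries only."""
    films = []
    for item in json.loads(raw):
        # Section headers are plain strings; films are objects with ty == "f".
        if isinstance(item, dict) and item.get("ty") == "f":
            title = unquote(item["t"], encoding="latin-1")
            url = "http://www.moviemeter.nl/film/" + item["i"]
            films.append((item["i"], title, item.get("y"), url))
    return films

films = parse_search_response(SAMPLE)
```

Filtering on the entry type keeps forum topics and users out of the result list, which a single regex over the whole string does not guarantee.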
Currently I have something working to get the movie description from moviemeter, but I am having a problem with the following expression:
<expression><div id="film_info">(.*)[^<div]</expression>
source string = 'fdqskfdq<div id="film_info">MOVIE CONTENT THAT I NEED<div>fdqsfqds</div></div>fsdjlk';
I am trying to get all the data between <div id="film_info"> and </div>, but I get a lot more (including the comments); see for example http://www.moviemeter.nl/film/365
does anyone know how to set this to the right regex?
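A note on why that expression over-matches, with a Python sketch: [^<div] is a character class that matches one single character that is not "<", "d", "i" or "v"; it does not mean "up to the next <div". The greedy (.*) before it therefore runs almost to the end of the page. A lazy group that stops at the first following tag opener behaves as intended here. Whether the scraper's regex engine supports lazy quantifiers is not confirmed anywhere in this thread, so treat this as an illustration only.

```python
import re

# Source string from the post above.
SOURCE = ('fdqskfdq<div id="film_info">MOVIE CONTENT THAT I NEED'
          '<div>fdqsfqds</div></div>fsdjlk')

# (.*?) is lazy: it stops at the first "<div" or "</div" that follows.
FILM_INFO = re.compile(r'<div id="film_info">(.*?)</?div', re.DOTALL)

def extract_film_info(html):
    """Return the text directly inside the film_info div, or None if absent."""
    match = FILM_INFO.search(html)
    return match.group(1) if match else None

content = extract_film_info(SOURCE)

# For comparison, the original expression captures far more than intended.
greedy = re.search(r'<div id="film_info">(.*)[^<div]', SOURCE).group(1)
```

Here content holds only the wanted text, while greedy also contains the nested div and everything after it.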
i have some working code for moviemeter:
<scraper name="Moviemeter" content="movies" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<!-- By fjskmdl 2 jan 2009 -->
<CreateSearchUrl dest="3">
<RegExp input="$$1" output="http://www.moviemeter.nl/calls/search.php?hash=3d669ba0d93914426945f6985e135be6&qs=1&search=\1" dest="3">
<expression noclean="1"/>
</RegExp>
</CreateSearchUrl>
<GetSearchResults dest="8">
<RegExp input="$$5" output="<?xml version="1.0" encoding="iso-8859-1" standalone="yes"?><results>\1</results>" dest="8">
<RegExp input="$$1" output="<entity><title>\3</title><url>http://www.moviemeter.nl/film/\2</url></entity>" dest="5">
<expression repeat="yes">({"i":"([0-9]+)","ty":"[a-z]*","t":"(.[^"]*).[^}])</expression>
</RegExp>
<expression noclean="1"/>
</RegExp>
</GetSearchResults>
<GetDetails dest="3">
<RegExp input="$$8" output="<details>\1</details>" dest="3">
<!-- title,year -->
<RegExp input="$$1" output="<title>\1</title><year>\2</year>" dest="8">
<expression trim="1" noclean="1"><h1>([^\(]*)\(([^\(]*)</expression>
</RegExp>
<!--Director-->
<RegExp input="$$1" output="<director>\2</director>" dest="8+">
<expression repeat="yes">geregisseerd door ([^>]*)>([^<]*)</expression>
</RegExp>
<!--Actors -->
<RegExp input="$$1" output="<actor><name>\1</name><role></role></actor>" dest="8+">
<expression>met ([^<]*)</expression>
</RegExp>
<!-- Runtime !-->
<RegExp input="$$1" output="<runtime>\1 minuten</runtime>" dest="8+">
<expression repeat="yes">([0-9]+) minuten</expression>
</RegExp>
<!-- Thumbnail !-->
<RegExp input="$$1" output="<thumb><url spoof="http://www.moviemeter.nl">http://www.moviemeter.nl/images/covers/\1/\2.jpg</url></thumb>" dest="8+">
<expression>http://www.moviemeter.nl/images/covers/([0-9]+)/([0-9]+)\.jpg</expression>
</RegExp>
<!--rating -->
<RegExp input="$$1" output="<rating>\1</rating>" dest="8+">
<expression>gemiddelde <b>([0-9,]+)([^<]*)</b></expression>
</RegExp>
<!-- nr votes -->
<RegExp input="$$1" output="<votes>\1</votes>" dest="8+">
<expression><b>([0-9]+)</b> stemmen</expression>
</RegExp>
<!-- genre -->
<RegExp input="$$1" output="<genre>\2</genre>" dest="8+">
<expression>film_info">([^<]*)<br />([^<]*)</expression>
</RegExp>
<!-- Plot -->
<RegExp input="$$1" output="<plot>\7</plot>" dest="8+">
<expression repeat="yes"><div id="film_info">([^<]*)<br />([^<]*)<br />([^<]*)<br /><br />geregisseerd door <a href="http://www\.moviemeter\.nl/director/([0-9]+)"([^<]*)</a><br />([^<]*)<br /><br />([^<]*)</expression>
</RegExp>
<expression noclean="1"/>
</RegExp>
</GetDetails>
</scraper>
bugs: when rating is 2,98 it shows 2.00
cast --> all persons are shown on 1 line
yeyh! i see you figured out the scraper syntax :)
the problem with the comma separated number is that they are simply not valid floating point numbers (parsed as %f in a sscanf like function if that tells you anything). you need to translate them to use a dot.
cast is just the expression, there is no repeat on it (but i assume you knew that)
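Outside the scraper syntax, the two fixes come down to the following Python sketch. The function names are invented for the example, and the cast splitter assumes the "A, B en C" pattern seen on the detail page.

```python
import re

def dutch_decimal_to_float(text):
    """'2,98' -> 2.98: swap the decimal comma for a dot before parsing."""
    return float(text.replace(",", "."))

def split_cast(text):
    """Split 'A, B en C' into a list of individual names."""
    return [name.strip() for name in re.split(r",| en ", text) if name.strip()]

rating = dutch_decimal_to_float("2,98")
cast = split_cast("Jan Decleir, Koen De Bouw en Wouter Hendrickx")
```

One caveat: a name that itself contains " en " surrounded by spaces would be split in two by this approach.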
Hi,
This scraper is going to stop working soon because of some changes in the HTML of the site I'm going to make. However, I'm creating an XML-RPC API (web service) for accessing the MovieMeter.nl film information. Would it be possible for you to change your scripts so this API is used instead of scraping the HTML? If someone wants to test using this API, please contact me at info@moviemeter.nl
yeah, that is very much possible and very nice of you to do so :)
Is there any news yet on this scraper? Unfortunately I'm a total noob at programming for xbmc. Is there any way we can help the development of this scraper?
I tried to look into it, but my mind is dazzled. This is too much for me to comprehend. Instructions for the new API are here:
http://wiki.moviemeter.nl/index.php/API
hopefully someone more experienced will pick this up and make a scraper for moviemeter
Arvinine
2009-04-06, 16:11
Any update on this scraper?
The problem with Moviemeter is that they keep changing their hash.
For more info (in Dutch), see:
http://gathering.tweakers.net/forum/list_messages/1306695?data%5Bfilter_keywords%5D=moviemeter&data%5Bboolean%5D=AND
Arvinine
2009-04-06, 23:59
http://www.moviemeter.nl/forum/1/9053/
Arvinine
2009-04-07, 12:26
Moviemeter has had its own API since last January. I guess this is a good start for a Moviemeter scraper? http://wiki.moviemeter.nl/index.php/API
Is there anyone who is developing this scraper? Not that I am impatient but I really want this :laugh::grin:
Is there any change in the status of the moviemeter scraper? :;):
Not possible at this point as there's no XML-RPC support in XBMC (web) scrapers.
OK thanks for the information.
Gamester17
2009-06-11, 20:08
I thought that PCRE (http://www.pcre.org) was supported; is that not the same as XML-RPC (http://www.xmlrpc.com)?
Nope, xml-rpc has nothing to do with regular expressions.
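To illustrate the difference: an XML-RPC call is an XML document sent to the server in an HTTP POST, and the reply is XML as well, so there is no page to run a regular expression over in the usual way. The Python sketch below only builds and decodes such a request locally. The method name "film.search" and its argument are placeholders for illustration, not confirmed MovieMeter API names.

```python
import xmlrpc.client

# "film.search" is a placeholder method name, not taken from the MovieMeter API docs.
payload = xmlrpc.client.dumps(("jurassic park",), methodname="film.search")

# Decoding the payload gives back the arguments and the method name.
params, method = xmlrpc.client.loads(payload)
```

The payload string is what a client would send in the body of the POST request.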
So can we put in a request with the xbmc developers? And is someone able to code it once xbmc supports it?
stpedejo
2009-07-22, 22:37
I am able to make a little website that, under the hood, uses the moviemeter.nl XML-RPC API and returns a fixed layout (anything from plain text to HTML or even XML).
I think it would then be very "easy" to make a scraper, which I would like to ask somebody here to do.
To find the layout that makes the scraper easiest to write, please contact me here in the forum and let me know how to format the output of the search results and the detail view.
http://xbmc.org/wiki/?title=Scrapers
there you can see the wanted output formats
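As a rough idea of what such a proxy could return for search results, here is a Python sketch that builds the <results>/<entity> shape used by the scraper posted earlier in this thread. The function name and the input format (title, film id pairs) are assumptions for the example; the wiki page remains the authority on the exact format.

```python
import xml.etree.ElementTree as ET

def build_results_xml(films):
    """films: iterable of (title, film_id) pairs -> <results> document as a string."""
    results = ET.Element("results")
    for title, film_id in films:
        entity = ET.SubElement(results, "entity")
        ET.SubElement(entity, "title").text = title
        ET.SubElement(entity, "url").text = "http://www.moviemeter.nl/film/%s" % film_id
    return ET.tostring(results, encoding="unicode")

xml_text = build_results_xml([("Jurassic Park", "365")])
```

Building the document with an XML library rather than string concatenation means characters such as "&" in titles are escaped automatically.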
is anyone still working on this?
dionyssoss
2009-09-11, 15:00
Can no one make a moviemeter scraper for XBMC?
MediaPortal is working with that.. but i want to use XBMC.
Please no one?
you can make it if you want it
If only I had the programming skills... :( I know some basic HTML, but I'm completely in the dark on scrapers. I did some research with xbmc and the wiki, but I just don't know how and where to start. Maybe someone else with more programming skills is able to do it.
I also spent quite some time searching for a Moviemeter scraper and tried a few, like YAMJ, but I think I've got a solution:
Download MediaPortal and use the MP Config to scrape your database from Moviemeter.nl.
Really easy to use, although I advise you to name your films including the release year, for example Wanted (2008).avi; otherwise the scraper might get wrong results. I spent quite some time doing my entire database, but once you're finished it's easy to use for your new movies.
Good Luck
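The naming tip works because a tool can split the title and year out of the file name before searching. How MediaPortal actually does this is not described here; the Python sketch below just shows the general idea with an invented function name.

```python
import re

def title_and_year(filename):
    """'Wanted (2008).avi' -> ('Wanted', 2008); year is None when absent."""
    stem = filename.rsplit(".", 1)[0]  # drop the file extension
    match = re.match(r"^(.*?)\s*\((\d{4})\)\s*$", stem)
    if match:
        return match.group(1), int(match.group(2))
    return stem, None

parsed = title_and_year("Wanted (2008).avi")
```

With the year available, a search can tell apart films that share a title.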
tsJarlie
2009-09-29, 19:59
I'm using ant movie catalog. I adapted a script for moviemeter.nl and wrote
an export to xbmc script. It makes nfo files which are recognized by XBMC...
could you please share this script with us? Until anyone develops the scraper for xbmc, this could be a good backup. Maybe a developer can even use some of your script for xbmc.
Hi,
I would also like to use MovieMeter with XBMC. Is there someone who can make a scraper? I understand that the current MovieMeter API is not suitable for XBMC. Maybe I can make a php page that returns the correct data. Could we develop it together with a few people?
I have the test code from the Moviemeter wiki page online:
http://www.cyberpoint.nl/moviemeter/find.php?tt=the bourne
Amelandbor
2009-11-03, 11:38
Yanfoe has added moviemeter support.
http://code.google.com/p/yanfoe/
Anyone had any luck? Moviemeter.nl now has an API :laugh: