KinoPoisk2 (Russian Movies) Scraper
Hi,
Let me present another KinoPoisk.ru (http://xbmc.org/forum/showthread.php?t=45404) scraper. It's a completely reworked Kinopoisk.ru scraper with the following features:
Optimized regexps
Low-res cover if no poster present (really helpful on some old movies)
Artists' roles
Can fetch movie stills fanart, wallpapers fanart, or both
Fixed incorrect parsing of outline/plot
Download version 1.0 of KinoPoisk2 from here:
http://files.me.com/andrey_babak/gtxbcl
P.S. I'd like to thank spiff for his help!
awesome for you russians :)
one question though; what does that ServerEncoding tag do?
You are the man! Thank you very much. I was waiting for this.
Fellow countryman, I'm from Kiev too, now in NY.
I didn't check the source of the parser but as far as I can tell looking at the original scraper, it defines how the external URLs are parsed. Maybe it just does nothing though ;-) (or works in Plex only)
i know that it must be a plex thing as i wrote the scraper parser and most of the surrounding code :)
By the way, does the parser handle the server encoding returned in the headers? It would be great to make the scraper completely UTF-8.
scraper code does honor the encoding you set on the returned xml.
i guess the ServerContentEncoding is used to convert the html pages to utf-8 prior to passing them to the scrapers. i will dig in the plex git
edit: dug a bit. it's nonsense from the plex devs. the servercontentencoding is just a dupe of the encoding set on the returned xml
When XBMC loads info from the site, Kinopoisk.ru bans me for about 30 minutes. Why? This does not happen in Plex.
TigerHeart
2009-05-14, 18:24
I am trying to get info about the movie The Butterfly Effect (I type the movie name in Russian: "Эффект бабочки"), but the scraper returns the following list of movies:
==============
Интервью с вампиром
Сделка с дьяволом
Мадагаскар 2
Ирония судьбы. Продолжение.
Загадочная история Бенджамина Баттона
Суини Тодд,демон+парикмахер с Флит-стрит
==============
And I see the same list every time I try to get info about any movie. What's wrong?
Thanks.
P.S. I made screenshots, but I can't figure out how to attach them here. I can send them to anyone by e-mail.
GooglieS
2009-05-14, 20:42
This scraper does not load any information or art from KinoPoisk! Is something broken?
It doesn't work! It finds the movie in the list, but it doesn't load any info from KinoPoisk :( What should I do?
Attempts to fix it have come to nothing so far. We are waiting for the gurus who created XBMC. It has been fixed only for Plex (see the forum thread: http://forums.plexapp.com/index.php?showtopic=3861&st=20&start=20). There is also a very interesting note there: KinoPoisk itself bans by IP, exactly as happened to TigerHeart.
GooglieS
2009-05-14, 23:25
What do you mean, it bans? I'm not getting banned over HTTP!
TigerHeart
2009-05-15, 09:38
Please bring back the old version!!! We don't need your version 2!!! Nobody needs it. It doesn't work at all!!! Version 1 is the best!!!
TigerHeart
2009-05-15, 11:11
Does anybody know where I can download the old version of kinopoisk.xml?
kinopoisk.xml works fine, but ScraperParser.cpp does not.
The problem is not in the KinoPoisk scraper but in the code that processes it, namely ScraperParser.cpp.
Here is its history: http://xbmc.org/trac/log/branches/linuxport/XBMC/xbmc/utils/ScraperParser.cpp?rev=10815
Try this one. If it works, post here. 86
TigerHeart
2009-05-15, 15:31
Wow! It works! Thank you!!!
Oops! I celebrated too early. Now it finds the movie title correctly, but when you open "Movie information" there is no information about the movie at all. Instead of the movie title there is only one line: "Кинопоиск.ru - Все фильмы планеты" ("KinoPoisk.ru - All the films of the planet"). Maybe I have simply been banned by KinoPoisk itself?
althekiller
2009-05-15, 19:49
Please keep discussion in the XBMC forums in English, thanks.
GooglieS
2009-05-15, 20:30
It does not work... XBMC hangs for a while when fetching film info.
Hi all.
Sorry for my bad English ))))
Can a scraper send a User-Agent header?
e.g. request.UserAgent = "Mozilla/5.0 (Windows; U; MSIE 7.0; Windows NT 6.0; en-US)";
request.Accept = "text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5";
When MediaPortal had a similar problem, it was solved by tricking KinoPoisk: a User-Agent was sent along with the request.
I could try, but I don't know where to put this code...
uhm, what's that about user agent? i stopped reading this thread pages back. if you want our attention
1) stick to the forum rules - english only
2) stay out of the dev forum unless you have something to contribute
Help...)
Can a scraper send a User-Agent header?
request.UserAgent = "Mozilla/5.0 (Windows; U; MSIE 7.0; Windows NT 6.0; en-US)";
request.Accept = "text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5";
hmm, currently not. but it seems like something i would consider adding. ticket please
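For readers who want to see what such a request looks like outside XBMC, here is a minimal Python sketch that attaches the two headers quoted in this thread to an HTTP request. It is not scraper code and says nothing about how XBMC would implement this; the function name `build_request` and the target URL are assumptions made for illustration only.

```python
import urllib.request

# Header values as quoted in this thread.
USER_AGENT = "Mozilla/5.0 (Windows; U; MSIE 7.0; Windows NT 6.0; en-US)"
ACCEPT = (
    "text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,"
    "text/plain;q=0.8,image/png,*/*;q=0.5"
)


def build_request(url):
    """Return a Request carrying browser-like User-Agent and Accept headers.

    build_request is a hypothetical helper, not part of XBMC or any scraper.
    """
    return urllib.request.Request(
        url,
        headers={"User-Agent": USER_AGENT, "Accept": ACCEPT},
    )


# Hypothetical target URL, used only for illustration; nothing is sent here.
request = build_request("http://www.kinopoisk.ru/")
# Calling urllib.request.urlopen(request) would perform the actual fetch.
```

Building the request separately from sending it makes it easy to inspect the headers before any traffic reaches the site.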
Sorry, no time for ticket. Help.
What are the differences between versions 8.10 and 9.04 of the scraper source?
http://xbmc.org/trac/browser/branches/8.10_Atlantis-linux-osx-win32/XBMC/xbmc/utils/ScraperParser.cpp
http://xbmc.org/trac/browser/branches/9.04_Babylon-linux-osx-win32/XBMC/xbmc/utils/ScraperParser.cpp
http://xbmc.org/trac/browser/branches/8.10_Atlantis-linux-osx-win32/XBMC/xbmc/utils/ScraperUrl.cpp
http://xbmc.org/trac/browser/branches/9.04_Babylon-linux-osx-win32/XBMC/xbmc/utils/ScraperUrl.cpp
It works. Waiting for testing; I will upload it soon.
It turned out the error was in the scraper code.
Wow! That's cool! I'm looking forward to it.
goodwill
2009-07-05, 12:09
Hamp? Could you post the fix you made for this somewhere?
Either a patch or the file itself would do.
Komandor
2009-07-28, 22:26
Hello. Thank you for the scraper. It works well, but there is one small problem: genres are not scraped. I don't see this field on the information screen. I hope this problem will be solved. Good luck.
vlavrinenko
2009-08-02, 17:20
When I use this scraper in XBMC, it gets the information in Windows-1251 encoding and displays it incorrectly. It seems the text has to be converted to UTF-8. Is that possible?
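As a rough illustration of the conversion being asked about, the Python sketch below decodes Windows-1251 bytes and re-encodes them as UTF-8. It only demonstrates the general technique, not how XBMC or the scraper handles encodings; the helper name `cp1251_to_utf8` is an assumption, and the sample bytes are built locally rather than fetched from the site.

```python
def cp1251_to_utf8(raw: bytes) -> bytes:
    """Decode Windows-1251 bytes and re-encode the text as UTF-8.

    cp1251_to_utf8 is a hypothetical helper written for this example.
    """
    text = raw.decode("cp1251")
    return text.encode("utf-8")


# Simulate a page fragment as a Windows-1251 server would send it.
page_bytes = "Эффект бабочки".encode("cp1251")

# Convert it to UTF-8; Cyrillic letters grow from one byte to two each.
utf8_bytes = cp1251_to_utf8(page_bytes)
```

If the bytes were displayed without this step, each Windows-1251 byte would be misread, which matches the garbled text described in the post.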