Python Scrapy Not Crawling All Urls In Scraped List
I am trying to scrape information from the pages listed on this page: https://pardo.ch/pardo/program/archive/2017/catalog-films.html The XPath selector: film_page_urls_startpage =
Solution 1:
I checked
len(film_page_urls_startpage)
and I get only 11 URLs, not 23.
If I use xpath('//article/a/@href') instead, I get all 23 URLs.
There is no need to add @class to the selector, because the page has no other article elements.
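To make the difference concrete, here is a minimal sketch that runs without Scrapy. It uses the standard library's xml.etree.ElementTree as a stand-in for the Scrapy selector, and the markup, class names and hrefs are made up and shortened for illustration; they are not the real page.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup: every second <article> has an extra "even" class.
SAMPLE = """
<div>
  <article class="strip-list row row--5 even"><a href="/film/1.html">One</a></article>
  <article class="strip-list row row--5"><a href="/film/2.html">Two</a></article>
  <article class="strip-list row row--5 even"><a href="/film/3.html">Three</a></article>
  <article class="strip-list row row--5"><a href="/film/4.html">Four</a></article>
</div>
""".strip()

root = ET.fromstring(SAMPLE)

# Exact comparison on @class: articles whose class string also contains "even" are skipped.
exact = [a.get("href")
         for a in root.findall(".//article[@class='strip-list row row--5']/a")]

# No class predicate: the link of every article is returned.
all_links = [a.get("href") for a in root.findall(".//article/a")]

print(len(exact), len(all_links))
```

With this sample the exact match returns only half of the links, which is the same effect as getting 11 instead of 23 on the real page.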
EDIT:
If I run
for item in sel.xpath('//article/@class').extract():
    print('class:', item)
then I get
class: strip-list_link_all strip-list strip--color row row--5 even
class: strip-list_link_all strip-list strip--color row row--5
class: strip-list_link_all strip-list strip--color row row--5 even
class: strip-list_link_all strip-list strip--color row row--5
class: strip-list_link_all strip-list strip--color row row--5 even
class: strip-list_link_all strip-list strip--color row row--5
class: strip-list_link_all strip-list strip--color row row--5 even
class: strip-list_link_all strip-list strip--color row row--5
class: strip-list_link_all strip-list strip--color row row--5 even
class: strip-list_link_all strip-list strip--color row row--5
class: strip-list_link_all strip-list strip--color row row--5 even
class: strip-list_link_all strip-list strip--color row row--5
class: strip-list_link_all strip-list strip--color row row--5 even
class: strip-list_link_all strip-list strip--color row row--5
class: strip-list_link_all strip-list strip--color row row--5 even
class: strip-list_link_all strip-list strip--color row row--5
class: strip-list_link_all strip-list strip--color row row--5 even
class: strip-list_link_all strip-list strip--color row row--5
class: strip-list_link_all strip-list strip--color row row--5 even
class: strip-list_link_all strip-list strip--color row row--5
class: strip-list_link_all strip-list strip--color row row--5 even
class: strip-list_link_all strip-list strip--color row row--5
class: strip-list_link_all strip-list strip--color row row--5 even
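So some items have an extra even at the end of their class string, and this was your problem: a selector that compares @class against the exact string without even skips those items.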
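If you do want to filter on a class, one option is to test for a single class token instead of comparing the whole attribute string. The sketch below shows the idea with the standard library only; the markup is invented, and has_class is a hypothetical helper, not part of Scrapy.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: three film articles (two with "even") and one unrelated article.
SAMPLE = """
<div>
  <article class="strip-list row row--5 even"><a href="/film/1.html">One</a></article>
  <article class="strip-list row row--5"><a href="/film/2.html">Two</a></article>
  <article class="teaser"><a href="/news/1.html">News</a></article>
  <article class="strip-list row row--5 even"><a href="/film/3.html">Three</a></article>
</div>
""".strip()


def has_class(element, name):
    """Return True if `name` is one of the space-separated class tokens."""
    return name in element.get("class", "").split()


root = ET.fromstring(SAMPLE)

# Keep links from every article that carries the "strip-list" token,
# whether or not "even" (or any other class) is also present.
film_links = [
    a.get("href")
    for article in root.iter("article")
    if has_class(article, "strip-list")
    for a in article.findall("a")
]

print(film_links)
```

In Scrapy itself a CSS selector matches by class token in the same way, so something like sel.css('article.strip-list a::attr(href)') should pick up both the even and the non-even items (assuming strip-list is the class you care about).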