python - How to tell scrapy crawler to STOP following more links dynamically?
Basically, I have a regex rule for following pages, and each page has 50 links. When I hit a link that is too old (based on a pre-defined date-time), I want to tell Scrapy to stop following more pages, but not to stop entirely: it must still scrape the links it has already decided to scrape (i.e., the Request objects already created); it just must not follow any new links. The program should then grind to a stop once it's done scraping those links.

Is there a way I can do this from inside the spider?
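The desired behavior (stop scheduling new listing pages while still draining the requests already queued) can be sketched independently of Scrapy. The page table, dates, and cutoff below are all hypothetical:

```python
from collections import deque
from datetime import date

CUTOFF = date(2020, 1, 1)  # hypothetical "too old" threshold

# Hypothetical site: each listing page has a date and a link to the next page.
pages = {
    "/page/1": {"date": date(2021, 6, 1), "next": "/page/2"},
    "/page/2": {"date": date(2020, 3, 1), "next": "/page/3"},
    "/page/3": {"date": date(2019, 9, 1), "next": "/page/4"},  # older than cutoff
    "/page/4": {"date": date(2019, 1, 1), "next": None},
}

def crawl():
    queue = deque(["/page/1"])
    follow = True          # once False, stop scheduling new pages
    scraped = []
    while queue:           # keep draining whatever is already scheduled
        url = queue.popleft()
        page = pages[url]
        scraped.append(url)
        if page["date"] < CUTOFF:
            follow = False  # too old: stop following, but finish the queue
        if follow and page["next"]:
            queue.append(page["next"])
    return scraped

print(crawl())  # /page/4 is never scheduled
```

Here the `follow` flag plays the role of the "stop following" switch: pages already in the queue are still scraped, but nothing new is added after the cutoff is hit.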
Once you hit a "too old" page, raise a CloseSpider exception. In that case, Scrapy will finish processing the links that are already scheduled and then shut down.