
One Spider With 2 Different Url And 2 Parse Using Scrapy

Hi, I have two different domains that need two different approaches, and I want to run both in one spider. I have tried this code, but nothing works. Any ideas, please?

class SalesitemSpiderSpider(scrapy.Spider):

Solution 1:

You effectively have two different spiders in the same class. For the sake of maintainability, I recommend keeping them in separate files.

If you really want to keep them together, it would be easier to split the URLs into two lists and give each list its own callback:

type1_urls = ['https://www.forever21.com/eu/shop/Catalog/GetProducts', ]
type2_urls = ['https://www2.hm.com/en_us/sale/shopbyproductladies/view-all.html?sort=stock&image-size=small&image=stillLife&offset=0&page-size=20', ]

def start_requests(self):
    # First group: requests built from the payload, handled by parse_1.
    for url in self.type1_urls:
        payload = self.payload.copy()
        yield scrapy.Request(
            # ...
            callback=self.parse_1
        )

    for url in self.type2_urls:
        yield scrapy.Request(url, callback=self.parse_2)
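
The pairing of each URL list with its own callback can be checked without touching the network. The following is a minimal sketch in plain Python, not Scrapy code: `PlannedRequest`, `RoutingSketch`, and `plan_requests` are hypothetical names introduced here as stand-ins for `scrapy.Request` and the spider. Only the URLs and the parse_1/parse_2 pairing come from the answer above.

```python
from dataclasses import dataclass
from typing import Iterator, List


@dataclass
class PlannedRequest:
    """Hypothetical stand-in for scrapy.Request: a URL plus a callback name."""
    url: str
    callback: str


class RoutingSketch:
    """Hypothetical stand-in for the spider; it models only the routing."""

    type1_urls: List[str] = [
        'https://www.forever21.com/eu/shop/Catalog/GetProducts',
    ]
    type2_urls: List[str] = [
        'https://www2.hm.com/en_us/sale/shopbyproductladies/view-all.html'
        '?sort=stock&image-size=small&image=stillLife&offset=0&page-size=20',
    ]

    def plan_requests(self) -> Iterator[PlannedRequest]:
        # Every URL in the first list is routed to parse_1 ...
        for url in self.type1_urls:
            yield PlannedRequest(url, 'parse_1')
        # ... and every URL in the second list to parse_2.
        for url in self.type2_urls:
            yield PlannedRequest(url, 'parse_2')


planned = list(RoutingSketch().plan_requests())
for request in planned:
    print(request.callback, request.url)
```

Adding a third site would then mean adding a third list and a third loop, which is also the point at which separate spider files start to look simpler.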

Solution 2:

Iterate over self.url in a for loop, and use the loop variable i inside the loop for the comparison and for yielding the matching request:

def start_requests(self):
    # Needs `import json` and `import scrapy` at the top of the module.
    for i in self.url:
        if i == 'https://www.forever21.com/eu/shop/Catalog/GetProducts':
            # JSON endpoint: POST the payload for page 1.
            payload = self.payload.copy()
            payload['page']['pageNo'] = 1
            yield scrapy.Request(
                i, method='POST', body=json.dumps(payload),
                headers={'X-Requested-With': 'XMLHttpRequest',
                         'Content-Type': 'application/json; charset=UTF-8'},
                callback=self.parse_2, meta={'pageNo': 1})

        if i == 'https://www2.hm.com/en_us/sale/shopbyproductladies/view-all.html?sort=stock&image-size=small&image=stillLife&offset=0&page-size=20':
            # HTML listing page: a default GET request is enough.
            yield scrapy.Request(i, callback=self.parse_1)
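
One detail in the snippet above deserves care: `self.payload.copy()` is a shallow copy, so the nested `page` dictionary is still shared with `self.payload`, and assigning `payload['page']['pageNo']` changes the original as well. Below is a minimal sketch of the difference, using only the standard library. `PAYLOAD_TEMPLATE`, its `pageSize` key, and `build_body` are hypothetical names for illustration; only the `page`/`pageNo` nesting comes from the answer.

```python
import copy
import json

# Hypothetical template: only the nested page/pageNo key appears in the answer.
PAYLOAD_TEMPLATE = {'page': {'pageNo': 0, 'pageSize': 60}}


def build_body(template: dict, page_no: int) -> str:
    """Return a JSON request body for one page without mutating the template."""
    payload = copy.deepcopy(template)  # copies the nested 'page' dict too
    payload['page']['pageNo'] = page_no
    return json.dumps(payload)


# A shallow copy creates a new outer dict but reuses the same inner dict.
shallow = PAYLOAD_TEMPLATE.copy()
shallow_shares_nested = shallow['page'] is PAYLOAD_TEMPLATE['page']

body = build_body(PAYLOAD_TEMPLATE, 1)
print(shallow_shares_nested)
print(body)
print(PAYLOAD_TEMPLATE['page']['pageNo'])
```

Inside the spider, the same idea would mean using `copy.deepcopy(self.payload)` in place of `self.payload.copy()`, assuming the payload is nested as sketched here.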
