How to send a POST request with headers and payload in Scrapy


I am sending a POST request to a GraphQL API with the requests library, and it works, but I want to send the same request in Scrapy and I don't know how to make a POST request with headers and a payload there.

Here is my code:

import json

import requests

url = 'https://www.kickstarter.com/graph'
headers =  {'authority':'www.kickstarter.com',
'method':'POST',
'path':'/graph',
'scheme':'https',
'accept':'*/*',
'accept-encoding':'gzip, deflate, br',
'accept-language':'en-US,en;q=0.9',
'content-length':'606',
'content-type':'application/json',
'cookie':"vis=f5761fb0e1994852-b38b5b3d46161036-c3a4a56c5add1076v1; lang=en; woe_id=YzFrZ1NUV1lRTUhMT2tsc1ZURHVsQT09LS12L0pidVVCeDBHZU16dk81MmVpeTNBPT0%3D--468e7c1e5daf8c17cdd902b0a1cb1ef4e2856543; optimizely_current_variations=%7B%7D; _pxhd=75f70796791b6f8a5930b19c70bcd30d268fe4a4f1644460c7c7bbe65d5e8196:837ba981-9d56-11eb-841e-e7065f1f0101; _pxvid=837ba981-9d56-11eb-841e-e7065f1f0101; ajs_anonymous_id=%22f5761fb0e1994852-b38b5b3d46161036-c3a4a56c5add1076v1%22; _ga=GA1.2.17378398.1618428050; _gid=GA1.2.1258279558.1618428050; __ssid=3d59a55ffedce2904d3464e3a555309; em_cdn_uid=t%3D1618428051657%26u%3D8d620439ed7740b89c98770bbaee8b05; __stripe_mid=e4e89c20-83c7-4ba0-907b-7b83f8b24051e87f22; em_p_uid=l:1618428053354|t:1618428053353|u:c814f9e5a157438b910a57075a7fe320; __stripe_sid=eaa7f9e2-2ba2-45db-8213-c79be847d1100aa907; ajs_anonymous_id=%22f5761fb0e1994852-b38b5b3d46161036-c3a4a56c5add1076v1%22; last_page=https%3A%2F%2Fwww.kickstarter.com%2Fprojects%2F1202256831%2Flumicube-an-led-cube-kit-for-the-raspberry-pi%3Fref%3D404-ksr10; local_offset=-2528; _gat_creatorAnalytics=1; _gat=1; _px2=eyJ1IjoiNmMwYTZiODAtOWRkMS0xMWViLTkyNzItOWRkZDk3Y2VlODdkIiwidiI6IjgzN2JhOTgxLTlkNTYtMTFlYi04NDFlLWU3MDY1ZjFmMDEwMSIsInQiOjE2MTg0ODExMzU4NzksImgiOiJhZWM4ZDc0MjgwM2IzZGFlY2JiZWNkZjYxNjc0Yjg4MWY5YWRhNTVkOTRiNDk5NjhmNzdmZWZjMzUzMmZkMDRiIn0=; _ksr_session=NEVzc0R3N0tIZHNsVlBoVzNQQ3haUXBCeC9jaWY4MExzbjNnNzZ0V3ZTTE1BcE1hcC94eFZVSTVUdXc4anJLRVJ3Zk81MVByNDVhdEhyaW9lZHNGa1l1OGdDTjhZN0FvUjd3Z1ZZRW8vb2x2ZGhsTm1Bb2N5TnV6TklEOFV5YzFBYzg5VHUzS3VPakpDT3pVQlgvY21RPT0tLXIzcFlXVFFsbG9Gc3JJRS9IU3VEdlE9PQ%3D%3D--1d66e41aef503bec8ea9d964160d776cee928583; request_time=Thu%2C+15+Apr+2021+10%3A00%3A53+-0000",
'origin':'https://www.kickstarter.com',
'referer':'https://www.kickstarter.com/projects/818583073/dies-irae-day-of-wrath-rated-r/description',
'User-Agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.114 Safari/537.36',
'x-csrf-token':'KFhfbaWae3u6BzTKoYZDw65CrYUk1NMQnI4zVruvfKspDvFRlIjlFY/HESrLol2iGX/+W1Yqww40nFqfgBdL7Q=='
}
urrl = '818583073/dies-irae-day-of-wrath-rated-r'
payload = {"operationName":"Campaign","variables":{"slug":urrl},"query":"query Campaign($slug: String!) {\n  project(slug: $slug) {\n    id\n    isSharingProjectBudget\n    risks\n    story(assetWidth: 680)\n    currency\n    spreadsheet {\n      displayMode\n      public\n      url\n      data {\n        name\n        value\n        phase\n        rowNum\n        __typename\n      }\n      dataLastUpdatedAt\n      __typename\n    }\n    environmentalCommitments {\n      id\n      commitmentCategory\n      description\n      __typename\n    }\n    __typename\n  }\n}\n"}

r = requests.post(url, headers=headers, data=json.dumps(payload))

Can anyone please guide me on how to send the same request in Scrapy?

python web-scraping scrapy data-mining data-extraction
1 Answer

In Scrapy there are two ways to make a POST request: scrapy.Request and scrapy.FormRequest. The latter is meant for pages with HTML forms; its formdata is sent form-encoded (application/x-www-form-urlencoded) and must be a flat dict of strings. Because /graph is a GraphQL endpoint that expects a JSON body, use scrapy.Request with body=json.dumps(payload) here. For more information, see the FormRequest documentation.
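To see the difference concretely, here is a stdlib-only sketch (with a simplified, flat payload) of the body FormRequest would produce versus the JSON body this endpoint expects:

```python
import json
from urllib.parse import urlencode

# Simplified, flat payload for illustration only.
payload = {"operationName": "Campaign",
           "slug": "818583073/dies-irae-day-of-wrath-rated-r"}

# FormRequest(formdata=...) would send this, form-encoded:
form_body = urlencode(payload)

# scrapy.Request(body=json.dumps(...)) sends this, as JSON:
json_body = json.dumps(payload)

print(form_body)
print(json_body)
```

The real payload above is nested (`"variables": {"slug": ...}`), which form encoding cannot represent — another reason a plain JSON body is required here.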

import json

import scrapy

class Kickstarter(scrapy.Spider):
    name = 'kickstarter'

    def start_requests(self):
        url = 'https://www.kickstarter.com/graph'
        # Note: the HTTP/2 pseudo-headers copied from devtools (authority,
        # method, path, scheme) and content-length are set by Scrapy itself
        # and should not be sent manually.
        headers =  {'accept':'*/*',
                    'accept-encoding':'gzip, deflate, br',
                    'accept-language':'en-US,en;q=0.9',
                    'content-type':'application/json',
                    'cookie':"vis=f5761fb0e1994852-b38b5b3d46161036-c3a4a56c5add1076v1; lang=en; woe_id=YzFrZ1NUV1lRTUhMT2tsc1ZURHVsQT09LS12L0pidVVCeDBHZU16dk81MmVpeTNBPT0%3D--468e7c1e5daf8c17cdd902b0a1cb1ef4e2856543; optimizely_current_variations=%7B%7D; _pxhd=75f70796791b6f8a5930b19c70bcd30d268fe4a4f1644460c7c7bbe65d5e8196:837ba981-9d56-11eb-841e-e7065f1f0101; _pxvid=837ba981-9d56-11eb-841e-e7065f1f0101; ajs_anonymous_id=%22f5761fb0e1994852-b38b5b3d46161036-c3a4a56c5add1076v1%22; _ga=GA1.2.17378398.1618428050; _gid=GA1.2.1258279558.1618428050; __ssid=3d59a55ffedce2904d3464e3a555309; em_cdn_uid=t%3D1618428051657%26u%3D8d620439ed7740b89c98770bbaee8b05; __stripe_mid=e4e89c20-83c7-4ba0-907b-7b83f8b24051e87f22; em_p_uid=l:1618428053354|t:1618428053353|u:c814f9e5a157438b910a57075a7fe320; __stripe_sid=eaa7f9e2-2ba2-45db-8213-c79be847d1100aa907; ajs_anonymous_id=%22f5761fb0e1994852-b38b5b3d46161036-c3a4a56c5add1076v1%22; last_page=https%3A%2F%2Fwww.kickstarter.com%2Fprojects%2F1202256831%2Flumicube-an-led-cube-kit-for-the-raspberry-pi%3Fref%3D404-ksr10; local_offset=-2528; _gat_creatorAnalytics=1; _gat=1; _px2=eyJ1IjoiNmMwYTZiODAtOWRkMS0xMWViLTkyNzItOWRkZDk3Y2VlODdkIiwidiI6IjgzN2JhOTgxLTlkNTYtMTFlYi04NDFlLWU3MDY1ZjFmMDEwMSIsInQiOjE2MTg0ODExMzU4NzksImgiOiJhZWM4ZDc0MjgwM2IzZGFlY2JiZWNkZjYxNjc0Yjg4MWY5YWRhNTVkOTRiNDk5NjhmNzdmZWZjMzUzMmZkMDRiIn0=; _ksr_session=NEVzc0R3N0tIZHNsVlBoVzNQQ3haUXBCeC9jaWY4MExzbjNnNzZ0V3ZTTE1BcE1hcC94eFZVSTVUdXc4anJLRVJ3Zk81MVByNDVhdEhyaW9lZHNGa1l1OGdDTjhZN0FvUjd3Z1ZZRW8vb2x2ZGhsTm1Bb2N5TnV6TklEOFV5YzFBYzg5VHUzS3VPakpDT3pVQlgvY21RPT0tLXIzcFlXVFFsbG9Gc3JJRS9IU3VEdlE9PQ%3D%3D--1d66e41aef503bec8ea9d964160d776cee928583; request_time=Thu%2C+15+Apr+2021+10%3A00%3A53+-0000",
                    'origin':'https://www.kickstarter.com',
                    'referer':'https://www.kickstarter.com/projects/818583073/dies-irae-day-of-wrath-rated-r/description',
                    'User-Agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.114 Safari/537.36',
                    }
        slug = '818583073/dies-irae-day-of-wrath-rated-r'
        payload = {"operationName":"Campaign","variables":{"slug":slug},"query":"query Campaign($slug: String!) {\n  project(slug: $slug) {\n    id\n    isSharingProjectBudget\n    risks\n    story(assetWidth: 680)\n    currency\n    spreadsheet {\n      displayMode\n      public\n      url\n      data {\n        name\n        value\n        phase\n        rowNum\n        __typename\n      }\n      dataLastUpdatedAt\n      __typename\n    }\n    environmentalCommitments {\n      id\n      commitmentCategory\n      description\n      __typename\n    }\n    __typename\n  }\n}\n"}
        # The endpoint expects a JSON body, so serialize the payload with
        # json.dumps and use scrapy.Request; FormRequest's formdata would
        # send it form-encoded and cannot represent the nested dict.
        yield scrapy.Request(url, method='POST', body=json.dumps(payload),
                             headers=headers, callback=self.parse)

    def parse(self, response):
        print('successfully...')
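Once the request succeeds, the response body is JSON, so the parse callback can decode it. A minimal sketch of that decoding step — the sample body below is a trimmed, hypothetical shape inferred from the fields the Campaign query asks for, not a real API response:

```python
import json

# Hypothetical, trimmed response body mirroring the query's fields.
sample_body = '{"data": {"project": {"risks": "Shipping delays", "currency": "USD"}}}'

def parse_campaign(body):
    """Decode a GraphQL response and return the project dict."""
    data = json.loads(body)
    return data["data"]["project"]

project = parse_campaign(sample_body)
print(project["currency"])
```

In the spider itself you would call `json.loads(response.text)` inside `parse` and pull the fields out the same way.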