Writing a Twisted client that sends looping GET requests to multiple API endpoints and logs the responses

Question

I haven't done any Twisted programming in a while, so I'm trying to get back into it for a new project. I'm trying to set up a Twisted client that takes a list of servers as an argument and, for each server, sends an API GET call and writes the returned message to a file. This API GET call should be repeated every 60 seconds.

I've managed to get it working for a single server using Twisted's Agent class:

from StringIO import StringIO

from twisted.internet import reactor
from twisted.internet.protocol import Protocol
from twisted.web.client import Agent
from twisted.web.http_headers import Headers
from twisted.internet.defer import Deferred

import datetime
from datetime import timedelta
import time
import optparse

count = 1
filename = "test.csv"

class server_response(Protocol):
    def __init__(self, finished):
        print "init server response"
        self.finished = finished
        self.remaining = 1024 * 10

    def dataReceived(self, bytes):
        if self.remaining:
            display = bytes[:self.remaining]
            print 'Some data received:'
            print display
            with open(filename, "a") as myfile:
                myfile.write(display)

            self.remaining -= len(display)


    def connectionLost(self, reason):
        print 'Finished receiving body:', reason.getErrorMessage()

        self.finished.callback(None)

def capture_response(response): 
    print "Capturing response"
    finished = Deferred()
    response.deliverBody(server_response(finished))
    print "Done capturing:", finished

    return finished

def responseFail(err):
    print "error:", err
    reactor.stop()


def cl(ignored):
    print "sending req"
    agent = Agent(reactor)
    headers = {
        'authorization': [<snipped>],
        'cache-control': [<snipped>],
        'postman-token': [<snipped>]
    }

    URL = <snipped>
    print URL

    a = agent.request(
        'GET',
        URL,
        Headers(headers),
        None)

    a.addCallback(capture_response)
    reactor.callLater(60, cl, None)
    #a.addBoth(cbShutdown, count)


def cbShutdown(ignored, count):
    print "reactor stop"
    reactor.stop()

def parse_args():
    usage = """usage: %prog [options] [hostname]:port ...
    Run it like this:
      python test.py hostname1:instanceName1 hostname2:instancename2 ...
    """

    parser = optparse.OptionParser(usage)

    _, addresses = parser.parse_args()

    if not addresses:
        print parser.format_help()
        parser.exit()

    def parse_address(addr):
        if ':' not in addr:
            hostName = '127.0.0.1'
            instanceName = addr
        else:
            hostName, instanceName = addr.split(':', 1)

        return hostName, instanceName

    return map(parse_address, addresses)

if __name__ == '__main__':
    d = Deferred()
    d.addCallbacks(cl, responseFail)
    reactor.callWhenRunning(d.callback, None)

    reactor.run()

However, I'm having trouble figuring out how to have multiple agents sending calls. Right now I'm relying on the reactor.callLater(60, cl, None) at the end of cl() to create the call loop. So once my reactor is up, how do I create multiple agents calling the protocol (server_response(Protocol)) and keep each of them looping its GET?
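For clarity, the structure I have in mind is roughly the sketch below, with one self-rescheduling cl chain per parsed server (the URL scheme and the headers dict are placeholders, just like the <snipped> values above):

def cl(url):
    # sketch only: assumes `headers` is the same dict as above, built once at module level
    agent = Agent(reactor)
    d = agent.request('GET', url, Headers(headers), None)
    d.addCallback(capture_response)
    d.addErrback(responseFail)
    reactor.callLater(60, cl, url)  # re-schedule this same URL in 60 seconds

if __name__ == '__main__':
    for hostName, instanceName in parse_args():
        url = 'http://%s/%s' % (hostName, instanceName)  # placeholder URL scheme
        reactor.callWhenRunning(cl, url)
    reactor.run()

But I'm not sure this is the idiomatic way to keep several of these loops going at once.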

python api client twisted
1 Answer

Well, look what the cat dragged in!

"So how do I create multiple agents making calls"

Use treq. You rarely want to get tangled up with the Agent class.

"This API GET call should be repeated every 60 seconds"

Use a LoopingCall instead of callLater; it's easier for this kind of thing and you'll run into fewer problems later.

import treq
from twisted.internet import task, reactor

filename = 'test.csv'

def writeToFile(content):
    with open(filename, 'ab') as f:
        f.write(content)

def everyMinute(*urls):
    for url in urls:
        d = treq.get(url)
        d.addCallback(treq.content)
        d.addCallback(writeToFile)

#----- Main -----#            
sites = [
    'https://www.google.com',
    'https://www.amazon.com',
    'https://www.facebook.com']

repeating = task.LoopingCall(everyMinute, *sites)
repeating.start(60)

reactor.run()

It starts the everyMinute() function, which runs every 60 seconds. Inside that function, each endpoint is queried, and once the content of the response becomes available, the treq.content function takes the response and returns its content. Finally, the content is written to a file.
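If you still need the authorization headers and the per-server hosts from your original script, treq.get accepts a headers argument, so a minimal adaptation could look like the sketch below (the host list, URL scheme and header values are placeholders):

import treq
from twisted.internet import task, reactor

# placeholder values; substitute your real tokens and hosts (e.g. from your parse_args())
headers = {'authorization': ['<token>'],
           'cache-control': ['no-cache']}
servers = [('hostname1', 'instanceName1'), ('hostname2', 'instanceName2')]

def writeToFile(content, filename):
    with open(filename, 'ab') as f:
        f.write(content)

def everyMinute(*servers):
    for hostName, instanceName in servers:
        url = 'https://%s/%s' % (hostName, instanceName)  # placeholder URL scheme
        d = treq.get(url, headers=headers)
        d.addCallback(treq.content)
        d.addCallback(writeToFile, '%s.csv' % instanceName)  # one file per server

repeating = task.LoopingCall(everyMinute, *servers)
repeating.start(60)
reactor.run()

Writing to a per-server filename keeps the responses from different endpoints from interleaving in a single CSV.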

PS

Are you scraping, or trying to extract particular content from those sites? If so, Scrapy might be a good option.
