I'm trying to create a Python proxy server that streams large request bodies from clients to some internal storage (possibly Amazon S3, Swift, FTP, or something similar). Before streaming, the server should query an internal API server to determine the parameters for the upload to internal storage. The main constraint is that it must happen in a single HTTP operation using the PUT method. It also has to work asynchronously, because there will be a lot of file uploads.

What solution would let me read chunks from the uploaded content and start transferring those chunks to internal storage before the user has finished uploading the whole file? Every Python web application I know of waits for the full request body to arrive before handing the request to the WSGI application / Python web server.

One solution I've found is the Tornado fork at https://github.com/nephics/tornado, but it's unofficial and the Tornado developers are in no hurry to merge it into the main branch. So, do you know of any existing solutions to my problem? Tornado? Twisted? Eventlet?
Here is an example of a server that does streaming upload handling, written with Twisted:
from twisted.internet import reactor
from twisted.internet.endpoints import serverFromString
from twisted.web.server import Request, Site
from twisted.web.resource import Resource
from twisted.application.service import Application
from twisted.application.internet import StreamServerEndpointService

# Define a Resource class that doesn't really care what requests are made
# of it.  This simplifies things since it lets us mostly ignore Twisted
# Web's resource traversal features.
class StubResource(Resource):
    isLeaf = True

    def render(self, request):
        return b""

class StreamingRequestHandler(Request):
    def handleContentChunk(self, chunk):
        # `chunk` is part of the request body.
        # This method is called as the chunks are received.
        Request.handleContentChunk(self, chunk)

        # Unfortunately you have to use a private attribute to learn where
        # the content is being sent.
        path = self.channel._path

        print("Server received %d more bytes for %s" % (len(chunk), path))

class StreamingSite(Site):
    requestFactory = StreamingRequestHandler

application = Application("Streaming Upload Server")
factory = StreamingSite(StubResource())
endpoint = serverFromString(reactor, b"tcp:8080")
StreamServerEndpointService(endpoint, factory).setServiceParent(application)
This is a tac file (put it in streamingserver.tac and run twistd -ny streamingserver.tac).
Because of the need to use self.channel._path, this is not a completely supported approach. The API overall is pretty clunky as well, so this is more an example that it's possible than that it's good. There has long been an intent to make this sort of thing easier (http://tm.tl/288), but it will probably be quite a while before that happens.
You can use Tremolo, an HTTP server framework designed with streaming in mind:
#!/usr/bin/env python3
from tremolo import Tremolo

app = Tremolo()

@app.route('/upload')
async def upload(**server):
    request = server['request']

    with open('/save/to/image_uploaded.png', 'wb') as f:
        # read the body chunk by chunk
        async for data in request.read():
            # write each chunk to the file as it arrives
            f.write(data)

    return 'Done.'

if __name__ == '__main__':
    app.run('0.0.0.0', 8000, debug=True, reload=True)
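The chunked-read pattern in the handler above can be exercised without running Tremolo by driving the same async for loop with a stand-in async generator (fake_chunks below is a hypothetical stub, not part of Tremolo's API):

```python
import asyncio

async def fake_chunks():
    # Stand-in for request.read(): yields the request body in pieces.
    for piece in (b"header", b"payload-bytes", b"tail"):
        yield piece

async def consume(chunks):
    # Same shape as the upload handler: act on each chunk as it arrives,
    # instead of waiting for the whole body to be buffered.
    received = []
    async for data in chunks:
        received.append(data)
    return b"".join(received)

body = asyncio.run(consume(fake_chunks()))
print(body)  # → b'headerpayload-bytestail'
```

In the real handler the append would be replaced by whatever forwards the chunk onward (a file write, or a transfer to internal storage).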
Another option is gevent: monkey-patch the standard library, then read the WSGI input stream in chunks from a plain WSGI application:

from gevent.monkey import patch_all
patch_all()

from gevent.pywsgi import WSGIServer

def stream_to_internal_storage(data):
    pass

def simple_app(environ, start_response):
    bytes_to_read = 1024
    while True:
        readbuffer = environ["wsgi.input"].read(bytes_to_read)
        if not len(readbuffer) > 0:
            break
        stream_to_internal_storage(readbuffer)
    start_response("200 OK", [("Content-type", "text/html")])
    return [b"hello world"]

def run():
    config = {'host': '127.0.0.1', 'port': 45000}
    server = WSGIServer((config['host'], config['port']), application=simple_app)
    server.serve_forever()

if __name__ == '__main__':
    run()
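The heart of simple_app is the read-until-empty loop. Factored out (as a hypothetical read_in_chunks helper, not part of gevent), it works against any file-like object, which makes the chunking behaviour easy to check with an in-memory stream:

```python
import io

def read_in_chunks(stream, chunk_size=1024):
    """Yield successive chunks from a file-like object until EOF,
    mirroring the wsgi.input loop in simple_app."""
    while True:
        buf = stream.read(chunk_size)
        if not buf:
            break
        yield buf

payload = b"x" * 2500
chunks = list(read_in_chunks(io.BytesIO(payload), chunk_size=1024))
print([len(c) for c in chunks])  # → [1024, 1024, 452]
```

Each chunk can be handed to stream_to_internal_storage as soon as it is read, so the upload is forwarded while the client is still sending.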
It works well when I try to upload a huge file:
curl -i -X PUT --progress-bar --verbose --data-binary @/path/to/huge/file "http://127.0.0.1:45000"
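Since the question mentions Amazon S3 as one possible backend: boto3's upload_fileobj accepts any object with a read() method, so a small adapter can expose an iterator of incoming chunks as a file-like object and hand it to S3 without buffering the whole body first. ChunkReader below is a hypothetical sketch, not part of boto3:

```python
import io

class ChunkReader(io.RawIOBase):
    """File-like wrapper over an iterator of byte chunks, so a streaming
    consumer (e.g. boto3's upload_fileobj) can call read() on it."""
    def __init__(self, chunks):
        self._chunks = iter(chunks)
        self._buf = b""

    def readable(self):
        return True

    def readinto(self, b):
        # Refill the internal buffer from the chunk iterator as needed.
        while not self._buf:
            try:
                self._buf = next(self._chunks)
            except StopIteration:
                return 0  # EOF
        n = min(len(b), len(self._buf))
        b[:n] = self._buf[:n]
        self._buf = self._buf[n:]
        return n

reader = ChunkReader([b"abc", b"defgh"])
data = reader.read()  # read(-1) drains the stream via readinto()
print(data)  # → b'abcdefgh'
```

With this, something like boto3.client('s3').upload_fileobj(ChunkReader(chunks), bucket, key) could stream the body onward as it arrives; bucket and key here are placeholders.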