Uploading a large file from the client to GDrive with google-api-client, using resumable upload in Flask and in-memory storage in Python

Question  Votes: 2  Answers: 1

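I am trying to upload large files to GDrive using google-api-client, with resumable upload. The problem is that I don't want to save or write the file object to my file system. I want to read it in chunks and upload those same chunks to GDrive using resumable upload. Is there any way to achieve this by sending chunks through the Google API Python client?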

Here is my sample code. It works, but it gets the entire file object from the client.

@app.route('/upload', methods=["GET", "POST"])
def upload_buffer():
    drive = cred()
    if request.method == "POST":
        mime_type = request.headers['Content-Type']
        body = {
            'name': "op.pdf",
            'mimeType': mime_type,
        }

        chunk = BytesIO(request.stream.read())  # as you can see here the entire file stream is obtained

        # I want to read in chunks and simultaneously send that chunk to GDrive

        # chunk = BytesIO(request.stream.read(1024))

        # if I send like the above only some part of the file is uploaded in Gdrive

        media_body = MediaIoBaseUpload(chunk, chunksize=1024, mimetype=mime_type,
                                       resumable=True)

        return drive.files().create(body=body,
                                    media_body=media_body,
                                    fields='id,name,mimeType,createdTime,modifiedTime').execute()

    return render_template("upload_image.html")

This is how I do it using the Google REST API:

@app.route('/upload3', methods=["GET", "POST"])
def upload_buff():

    if request.method == "POST":
        Content_Length = request.headers['Content-Length']

        access_token = '####'

        headers = {"Authorization": "Bearer " + access_token, "Content-Type": "application/json", "Content-Length": Content_Length}
        params = {
            "name": "file_name.pdf",
            "mimeType": "application/pdf"
        }
        r = requests.post("https://www.googleapis.com/upload/drive/v3/files?uploadType=resumable", headers=headers, data=json.dumps(params))

        location = r.headers['Location']
        print("----------------------------------------------------")
        print("GDrive Upload url : ", location)
        print("----------------------------------------------------")

        start = str(0)
        while True:
            chunk = request.stream.read(1024 * 1024)
            chunk_size = len(chunk)
            print("----------------------------------------------------")
            print("Size of Received Chunk From Client:  ", chunk_size)
            print("----------------------------------------------------")
            if chunk_size == 0:
                break
            end = str(int(start)+chunk_size-1)
            headers = {'Content-Range': 'bytes '+start+'-' + end + '/' +str(Content_Length), 'Content-Length': str(chunk_size)}
            start = str(int(end)+1)
            print("The headers set for the chunk upload : ", headers)
            r = requests.put(location, headers=headers, data=chunk)
            print("----------------------------------------------------")
            print("Response content : ", r.content)
            print("Response headers : ", r.headers)
            print("Response status : ", r.status_code)
            print("----------------------------------------------------")
        return r.content

    return render_template("upload_image.html")
python flask file-upload google-drive-api google-api-client
1 Answer
1
vote

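Reading your question and code, I assume that you have saved the stream in a variable named chunk and want to split it into 1024-byte pieces to use resumable upload. If my understanding of the problem is correct, you can use slicing on the bytes object chunk, in a way similar to this: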

chunk = b"\x04\x09\x09\x01\x01\x01\x00\x03" # Example values
chunk[:3] # Equals to b"\x04\x09\x09"
chunk[-3:] # Equals to b"\x01\x00\x03"
chunk[4:6] # Equals to b"\x01\x01"

You can use this approach to slice chunk into 1024-byte pieces. Please ask me if you need more help.
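As a minimal sketch of that slicing idea, the helper below cuts a bytes object into consecutive pieces of at most 1024 bytes. The name `split_bytes` is hypothetical and only used for illustration; it assumes the whole payload is already in memory.

```python
def split_bytes(data: bytes, size: int = 1024) -> list:
    """Slice `data` into consecutive pieces of at most `size` bytes."""
    if size <= 0:
        raise ValueError("size must be positive")
    # range() steps through the start offset of every piece.
    return [data[i:i + size] for i in range(0, len(data), size)]


pieces = split_bytes(b"x" * 2500)
print([len(p) for p in pieces])  # [1024, 1024, 452]
```

Only the last piece can be shorter than `size`, and joining the pieces gives back the original bytes.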


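I apologize for misunderstanding your question. I now understand that you have a bytes object divided into chunks and want to upload it to Drive using resumable upload. If my current assumption is correct, you can use the code I wrote for that scenario. With this code, nothing needs to be written to the hard drive.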

#!/usr/bin/python
# -*- coding: utf-8 -*-
import json
import requests

accessToken = '{YOUR ACCESS TOKEN HERE}'
# Download the example image straight into memory as a bytes object
fileData = requests.get('https://upload.wikimedia.org/wikipedia/commons/c/cf/Alhambra_evening_panorama_Mirador_San_Nicolas_sRGB-1.jpg'
                        ).content
fileSize = len(fileData)  # exact size of the data in bytes

# Step I - Chop data into chunks
wholeSize = fileSize
chunkSize = 4980736  # Almost 5 MB
chunkTally = 0
chunkData = []
while wholeSize > 0:
    if (chunkTally + 1) * chunkSize > fileSize:
        chunkData.append(fileData[chunkTally * chunkSize:fileSize])
    else:
        chunkData.append(fileData[chunkTally * chunkSize:(chunkTally
                         + 1) * chunkSize])
    wholeSize -= chunkSize
    chunkTally += 1

# Step II - Initiate resumable upload
headers = {'Authorization': 'Bearer ' + accessToken,
           'Content-Type': 'application/json'}
parameters = {'name': 'alhambra.jpg',
          'description': 'Evening panorama of Alhambra from Mirador de San Nicolás, Granada, Spain.'}
r = \
    requests.post('https://www.googleapis.com/upload/drive/v3/files?uploadType=resumable'
                  , headers=headers, data=json.dumps(parameters))
location = r.headers['location']

# Step III - File upload
chunkTally = 0
for chunk in chunkData:
    if (chunkTally + 1) * chunkSize - 1 > fileSize - 1:
        finalByte = fileSize - 1
        chunkLength = fileSize - chunkTally * chunkSize
    else:
        finalByte = (chunkTally + 1) * chunkSize - 1
        chunkLength = chunkSize
    headers = {'Content-Length': str(chunkLength),
               'Content-Range': 'bytes ' + str(chunkTally * chunkSize) \
               + '-' + str(finalByte) + '/' + str(fileSize)}
    r = requests.put(location, headers=headers, data=chunk)
    print(r.text)  # Response
    chunkTally += 1

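As an example, this script uses a photo from Wikimedia Commons; you can use your file stream instead. After the data is fetched, the code computes the file size from the in-memory bytes object (since nothing is written to the hard drive).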

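The next step is to chop the file into chunks smaller than 5 MB. I made sure to use a multiple of 1024 * 256 bytes, as detailed in the docs. The data is iterated over until it has been split into chunks of almost 5 MB (except for the last one).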
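If the data comes from a stream rather than a bytes object, the same chunking can be done without holding the whole file in memory. The sketch below is an illustration under assumptions: `aligned_chunk_size` and `iter_chunks` are hypothetical helper names, and `stream` is any object with a `read(n)` method (for example `request.stream` in Flask). A single `read(n)` may return fewer than `n` bytes, so the inner loop keeps reading until a full chunk is collected.

```python
from io import BytesIO

GRANULARITY = 256 * 1024  # 262144 bytes, the multiple mentioned above


def aligned_chunk_size(target: int) -> int:
    """Round `target` down to a multiple of 256 KiB (at least one unit)."""
    return max(GRANULARITY, (target // GRANULARITY) * GRANULARITY)


def iter_chunks(stream, chunk_size: int):
    """Yield chunks of exactly `chunk_size` bytes; only the last may be shorter."""
    while True:
        buf = bytearray()
        # Keep reading until the chunk is full or the stream is exhausted.
        while len(buf) < chunk_size:
            piece = stream.read(chunk_size - len(buf))
            if not piece:
                break
            buf.extend(piece)
        if not buf:
            return
        yield bytes(buf)
        if len(buf) < chunk_size:
            return  # short chunk means the stream ended


print(aligned_chunk_size(5_000_000))  # 4980736, the value used in the script
demo = BytesIO(b"a" * (2 * GRANULARITY + 10))
print([len(c) for c in iter_chunks(demo, GRANULARITY)])  # [262144, 262144, 10]
```

With this shape, only one chunk is in memory at a time, which is what the question asks for.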

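After that, the code initiates a resumable upload as documented, authenticating with OAuth 2.0. In this step I used some example metadata for the file, but you can read about other metadata in the Files properties. Finally, the script stores the upload location in a variable for later use.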
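The initiation step can be sketched as a small function that only assembles the request pieces, so it can be inspected without touching the network. `build_initiation_request` is a hypothetical name. The `X-Upload-Content-Type` and `X-Upload-Content-Length` headers are optional in the Drive resumable upload protocol and are not used in the script above; they are included here as an assumption about what a caller might want to send.

```python
import json

UPLOAD_URL = "https://www.googleapis.com/upload/drive/v3/files?uploadType=resumable"


def build_initiation_request(access_token, metadata, total_size=None, mime_type=None):
    """Return (url, headers, body) for the request that opens an upload session."""
    body = json.dumps(metadata).encode("utf-8")
    headers = {
        "Authorization": "Bearer " + access_token,
        "Content-Type": "application/json; charset=UTF-8",
    }
    # Optional hints about the file data that will follow in later requests.
    if mime_type is not None:
        headers["X-Upload-Content-Type"] = mime_type
    if total_size is not None:
        headers["X-Upload-Content-Length"] = str(total_size)
    return UPLOAD_URL, headers, body


url, headers, body = build_initiation_request(
    "{YOUR ACCESS TOKEN HERE}", {"name": "alhambra.jpg"}, 1234, "image/jpeg")
print(headers["X-Upload-Content-Length"])  # 1234
```

The three values can then be passed to `requests.post(url, headers=headers, data=body)`, and the session URL read from the `Location` response header as in the script.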

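In the last step, the chunks are iterated over and uploaded one by one. First, the headers are built based on the specifications. After that, the headers, the chunk and the upload location are all ready, so we can go ahead and send the upload request. After each chunk is uploaded, the response is printed so that errors can be logged; after the last chunk, the response shows the metadata of the uploaded file. This marks the end of the operation. As a final note, I wrote and tested this script in Python 3. If you have any questions, feel free to ask me for clarification.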
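The header arithmetic for each chunk, and the way the responses can be read, might be sketched as follows. Both function names are hypothetical. The sketch assumes the server answers an intermediate chunk with status 308 and a `Range: bytes=0-N` header, and answers the final chunk with 200 or 201, as the Drive resumable upload documentation describes.

```python
import re


def chunk_headers(start: int, chunk_len: int, total: int) -> dict:
    """Build Content-Length and Content-Range for the chunk starting at `start`."""
    end = start + chunk_len - 1  # Content-Range uses an inclusive end byte
    return {
        "Content-Length": str(chunk_len),
        "Content-Range": f"bytes {start}-{end}/{total}",
    }


def next_offset(status_code: int, headers) -> "int | None":
    """Return the next byte to send, or None once the upload is complete."""
    if status_code in (200, 201):
        return None
    if status_code == 308:
        match = re.fullmatch(r"bytes=0-(\d+)", headers.get("Range", ""))
        # No Range header means no bytes have been stored yet.
        return int(match.group(1)) + 1 if match else 0
    raise RuntimeError(f"unexpected status {status_code}")


print(chunk_headers(0, 4980736, 10000000)["Content-Range"])  # bytes 0-4980735/10000000
print(next_offset(308, {"Range": "bytes=0-4980735"}))  # 4980736
```

Checking the returned offset after every `requests.put` call makes it possible to resend from the right byte if a chunk was only partially stored.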
