I am trying to upload large files to Google Drive using google-api-client with resumable upload. The problem is that I don't want to save/write the file to my local filesystem. I want to read it in chunks and upload those same chunks to Drive via the resumable upload. Is there any way to achieve this, where I can send chunks through the Google API Python client?
Here is my sample code. It works, but it reads the entire file object from the client first.
@app.route('/upload', methods=["GET", "POST"])
def upload_buffer():
    drive = cred()
    if request.method == "POST":
        mime_type = request.headers['Content-Type']
        body = {
            'name': "op.pdf",
            'mimeType': mime_type,
        }
        chunk = BytesIO(request.stream.read())  # as you can see here, the entire file stream is read
        # I want to read in chunks and simultaneously send each chunk to GDrive
        # chunk = BytesIO(request.stream.read(1024))
        # if I send it like the above, only some part of the file is uploaded to GDrive
        media_body = MediaIoBaseUpload(chunk, chunksize=1024, mimetype=mime_type,
                                       resumable=True)
        return drive.files().create(body=body,
                                    media_body=media_body,
                                    fields='id,name,mimeType,createdTime,modifiedTime').execute()
    return render_template("upload_image.html")
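The core difficulty is reading the incoming stream in fixed-size pieces instead of all at once. As a minimal sketch of that part alone (using `io.BytesIO` as a stand-in for `request.stream`, and a hypothetical helper name of my own):

```python
from io import BytesIO

def read_in_chunks(stream, chunk_size=1024):
    """Yield successive fixed-size pieces from a file-like object."""
    while True:
        piece = stream.read(chunk_size)
        if not piece:  # empty read means the stream is exhausted
            break
        yield piece

# BytesIO stands in for request.stream here
stream = BytesIO(b"a" * 2500)
sizes = [len(piece) for piece in read_in_chunks(stream, 1024)]
print(sizes)  # [1024, 1024, 452]
```

Each piece yielded this way could then be handed to whatever performs the per-chunk upload.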
This is how I do it using the Google Drive REST API directly:
@app.route('/upload3', methods=["GET", "POST"])
def upload_buff():
    if request.method == "POST":
        Content_Length = request.headers['Content-Length']
        access_token = '####'
        headers = {"Authorization": "Bearer " + access_token,
                   "Content-Type": "application/json",
                   "Content-Length": Content_Length}
        params = {
            "name": "file_name.pdf",
            "mimeType": "application/pdf"
        }
        r = requests.post("https://www.googleapis.com/upload/drive/v3/files?uploadType=resumable",
                          headers=headers, data=json.dumps(params))
        location = r.headers['Location']
        print("----------------------------------------------------")
        print("GDrive Upload url : ", location)
        print("----------------------------------------------------")
        start = str(0)
        while True:
            chunk = request.stream.read(1024 * 1024)
            chunk_size = len(chunk)
            print("----------------------------------------------------")
            print("Size of Received Chunk From Client: ", chunk_size)
            print("----------------------------------------------------")
            if chunk_size == 0:
                break
            end = str(int(start) + chunk_size - 1)
            headers = {'Content-Range': 'bytes ' + start + '-' + end + '/' + str(Content_Length),
                       'Content-Length': str(chunk_size)}
            start = str(int(end) + 1)
            print("The headers set for the chunk upload : ", headers)
            r = requests.put(location, headers=headers, data=chunk)
            print("----------------------------------------------------")
            print("Response content : ", r.content)
            print("Response headers : ", r.headers)
            print("Response status : ", r.status_code)
            print("----------------------------------------------------")
        return r.content
    return render_template("upload_image.html")
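The start/end arithmetic in the loop above is easy to get off by one, since the end of a `Content-Range` is inclusive. One way to sanity-check it is to pull the header construction into a small pure function (the function name here is my own, not part of any API):

```python
def content_range_headers(start, chunk_size, total_size):
    """Build the Content-Range/Content-Length headers for one resumable chunk."""
    end = start + chunk_size - 1  # the Content-Range end byte is inclusive
    return {
        'Content-Length': str(chunk_size),
        'Content-Range': 'bytes {}-{}/{}'.format(start, end, total_size),
    }

print(content_range_headers(0, 1024 * 1024, 2621440))
# {'Content-Length': '1048576', 'Content-Range': 'bytes 0-1048575/2621440'}
```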
Reading your question and code, I assume you keep the stream in a variable named chunk and want to split it into 1024-byte pieces to use the resumable upload. If my understanding of the question is correct, you can use slicing on the bytes object chunk, like this:
chunk = b"\x04\x09\x09\x01\x01\x01\x00\x03" # Example values
chunk[:3] # Equals to b"\x04\x09\x09"
chunk[-3:] # Equals to b"\x01\x00\x03"
chunk[3:5] # Equals to b"\x01\x01"
You can use this approach to slice chunk into 1024-byte pieces. Please ask me if you need any more help.
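For instance, cutting a bytes object into successive 1024-byte slices could look like this (a minimal sketch with made-up example data):

```python
data = bytes(2500)  # example payload of 2500 null bytes
# step through the data in strides of 1024; the last slice is simply shorter
pieces = [data[i:i + 1024] for i in range(0, len(data), 1024)]
print([len(p) for p in pieces])  # [1024, 1024, 452]
```

Joining the pieces back together reproduces the original bytes, so nothing is lost in the split.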
I apologize for having misunderstood your question. I now understand that you have a bytes object and want to upload it to Drive in chunks using a resumable upload. If this assumption is correct, you can use the code I wrote for this scenario. With this code there is no need to write anything to the hard drive.
#!/usr/bin/python
# -*- coding: utf-8 -*-
import json
import requests

accessToken = '{YOUR ACCESS TOKEN HERE}'
fileData = requests.get(
    'https://upload.wikimedia.org/wikipedia/commons/c/cf/Alhambra_evening_panorama_Mirador_San_Nicolas_sRGB-1.jpg'
).content
fileSize = len(fileData)  # size in bytes, measured in memory, nothing written to disk

# Step I - Chop data into chunks
wholeSize = fileSize
chunkSize = 4980736  # almost 5 MB; a multiple of 1024 * 256 as the Drive API requires
chunkTally = 0
chunkData = []
while wholeSize > 0:
    if (chunkTally + 1) * chunkSize > fileSize:
        chunkData.append(fileData[chunkTally * chunkSize:fileSize])
    else:
        chunkData.append(fileData[chunkTally * chunkSize:(chunkTally + 1) * chunkSize])
    wholeSize -= chunkSize
    chunkTally += 1

# Step II - Initiate resumable upload
headers = {'Authorization': 'Bearer ' + accessToken,
           'Content-Type': 'application/json'}
parameters = {'name': 'alhambra.jpg',
              'description': 'Evening panorama of Alhambra from Mirador de San Nicolás, Granada, Spain.'}
r = requests.post('https://www.googleapis.com/upload/drive/v3/files?uploadType=resumable',
                  headers=headers, data=json.dumps(parameters))
location = r.headers['Location']

# Step III - File upload, chunk by chunk
chunkTally = 0
for chunk in chunkData:
    if (chunkTally + 1) * chunkSize - 1 > fileSize - 1:
        finalByte = fileSize - 1
        chunkLength = fileSize - chunkTally * chunkSize
    else:
        finalByte = (chunkTally + 1) * chunkSize - 1
        chunkLength = chunkSize
    headers = {'Content-Length': str(chunkLength),
               'Content-Range': 'bytes ' + str(chunkTally * chunkSize)
               + '-' + str(finalByte) + '/' + str(fileSize)}
    r = requests.put(location, headers=headers, data=chunk)
    print(r.text)  # response; the final chunk returns the file metadata
    chunkTally += 1
As an example, this script uploads a photo from Wikimedia Commons; you can use your own file stream instead. After fetching the data, the code determines the file size from the bytes object held in memory (since it is never written to the hard drive).
The next step is to cut the file into chunks of just under 5 MB. I made sure to use a multiple of 1024 * 256, as detailed in the docs. The data is iterated until it is split into almost-5 MB chunks (except the last one).
After that, the code initiates the resumable upload, authenticating with OAuth 2.0 as documented. In this step I used some sample metadata for the file, but you can read about other metadata in the Files properties reference. The script then saves the upload location returned by the API for later use. In the final step, the chunks are iterated and uploaded one by one. First, the headers are built according to the specifications. With the headers, the chunk, and the upload location ready, we can proceed with the upload in PUT requests. After each chunk is uploaded, the response is printed to log errors, and after the last chunk it shows the metadata of the uploaded file. This marks the end of the operation. As a final note, I'd like to mention that I wrote and tested this script in Python 3. Please feel free to ask me if you need any clarification.
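The "multiple of 1024 * 256" constraint mentioned above can be checked, or enforced, with a couple of lines; 4980736 used in the script is simply the largest such multiple below 5 MB (the helper name is my own):

```python
UNIT = 1024 * 256  # 262144 bytes; Drive wants chunk sizes in multiples of this

def largest_valid_chunk_size(limit):
    """Round a byte limit down to the nearest multiple of 256 KiB."""
    return (limit // UNIT) * UNIT

print(largest_valid_chunk_size(5 * 1000 * 1000))  # 4980736
```

Picking the chunk size this way keeps it as close to the target as possible while staying valid for the API.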