Python with the Office365 REST API: uploading large files to SharePoint using create_upload_session

Question · Votes: 0 · Answers: 2

Python with the Office365 REST API

I'm using the REST API to upload files of several hundred MB each. My code works, but I'm forced to create a temporary file for every upload. I already have the file in memory and would love to send it directly. However, the API only seems to accept file handles. Am I missing something, or is there a better way?

    file_b_data = io.BytesIO(
        sp_obj.download_file(ab_zip_file.name, folder.name)
    )
    logger.warning(f"Finished downloading {folder.name}/{ab_zip_file.name}")
    zip_file = ZipFile(file_b_data)

    for z_file in zip_file.filelist:
        logger.warning(
            f"Extracting: {folder.name}_{ab_zip_file.name}_{z_file.filename}, Size: {z_file.file_size}"
        )
        content = zip_file.read(z_file)
        upload_fname = z_file.filename

        if "/" in upload_fname:
            upload_fname = f'{folder.name}_{upload_fname.partition("/")[2]}'

        tmp_file = tmp_path.joinpath(upload_fname)
        logger.debug(f"Uploading File: {tmp_file}")
        with open(tmp_file, "wb") as f:
            f.write(content)
            f.flush()
            f.close()

        time.sleep(2)
        ab_folder.files.create_upload_session(
            tmp_file,
            chunk_size=3072000,
            chunk_uploaded=print_upload_progress,
        ).execute_query_retry()

The flush() and sleep(2) are there because the code runs in multiple threads; unless the first file is given a couple of seconds to close and reach storage, it cannot be read back. This is one of the many reasons I want to use an in-memory buffer instead of a temporary file.

Thanks...

I tried using an io.BytesIO buffer, but it wasn't supported.

python rest sharepoint
2 Answers

1 vote

After a lot of research, I found the answer. It turns out io.BytesIO is supported, but you must pass file_name and file_size as kwargs. When you do, the library does not call fileno().

def upload_file(ab_files: FileCollection, content: io.BytesIO | Path, **kwargs):
    if isinstance(content, Path):
        upload_name = content.stem + content.suffix
    else:
        # For in-memory buffers the caller must supply the name.
        upload_name = kwargs.get("upload_name")
    logger.debug(f"Uploading File: {upload_name}")
    fc_session = ab_files.create_upload_session(
        content,
        chunk_size=2048000,
        chunk_uploaded=print_upload_progress,
        print_f_name=upload_name,
        file_name=upload_name,              # required so fileno() is never called
        file_size=kwargs.get("file_size"),  # required for io.BytesIO
    )
    fc_session.execute_query_retry(max_retry=10)
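A call with an in-memory buffer might then look like the sketch below. The payload and upload name are made-up values, and the actual upload line is left commented out because it needs a live FileCollection (e.g. ab_folder.files from the question):

```python
import io

# Hypothetical usage of the upload_file helper above with an in-memory buffer.
payload = b"example file contents"
buf = io.BytesIO(payload)

# The size can be read from the buffer without consuming it,
# which is exactly what the file_size kwarg needs.
file_size = buf.getbuffer().nbytes

# With a real connection this would be:
# upload_file(ab_folder.files, buf, upload_name="report.zip", file_size=file_size)
print(file_size)  # 21
```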

0 votes

After following GLayton's suggestion, I still get the error io.BytesIO is not supported. The code is below. Please help me get it working.

def print_upload_progress(offset):
    file_size = len(downloaded_bytes)
    print("Uploaded '{0}' bytes from '{1}'...[{2}%]".format(
        offset, file_size, round(offset / file_size * 100, 2)))

def upload_file(content: io.BytesIO, **kwargs):
    chunk_size = 2048000
    upload_name = kwargs.get("upload_name")
    target_url = "Shared Documents/UploadFileTest"
    target_folder = ctx.web.get_folder_by_server_relative_url(target_url)
    uploaded_file = target_folder.files.create_upload_session(
        content, chunk_size=100000, chunk_uploaded=print_upload_progress)


service_client = authenticate.initialize_storage_account_ad(store_name_value)
file_system_client = service_client.get_file_system_client('XXX')
file_path = '/Test/Test.csv'
file_client = file_system_client.get_file_client(file_path)
downloaded_file = file_client.download_file()
downloaded_bytes = downloaded_file.readall()
file_content = BytesIO(downloaded_bytes)
upload_file(file_content, upload_name="test.csv",
            file_size=len(downloaded_bytes))
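Judging by the accepted answer, the likely culprit is that file_name and file_size are never forwarded to create_upload_session, so the library falls back to fileno(), which io.BytesIO does not implement. A minimal sketch of the corrected call, reusing ctx, print_upload_progress, and the target URL from the snippet above (all assumed to exist as shown there):

```python
# Sketch only: the same upload_file, with the file_name/file_size kwargs
# from the accepted answer added. ctx and print_upload_progress are
# assumed to be defined as in the snippet above.
def upload_file(content, **kwargs):
    upload_name = kwargs.get("upload_name")
    target_url = "Shared Documents/UploadFileTest"
    target_folder = ctx.web.get_folder_by_server_relative_url(target_url)
    target_folder.files.create_upload_session(
        content,
        chunk_size=100000,
        chunk_uploaded=print_upload_progress,
        file_name=upload_name,              # avoids the fileno() path
        file_size=kwargs.get("file_size"),  # required for io.BytesIO
    ).execute_query_retry()
```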