Python with the Office365 REST API: uploading large files to SharePoint using create_upload_session

Question · Votes: 0 · Answers: 2

Python with the Office365 REST API

I'm using the REST API to upload files of several hundred MB each. My code works, but I'm forced to create a temporary file for every upload. I already have the file in memory and would love to send it directly. However, the API only seems to accept file handles. Am I missing something, or is there a better way?

    file_b_data = io.BytesIO(
        sp_obj.download_file(ab_zip_file.name, folder.name)
    )
    logger.warning(f"Finished downloading {folder.name}/{ab_zip_file.name}")
    zip_file = ZipFile(file_b_data)

    for z_file in zip_file.filelist:
        logger.warning(
            f"Extracting: {folder.name}_{ab_zip_file.name}_{z_file.filename}, Size: {z_file.file_size}"
        )
        content = zip_file.read(z_file)
        upload_fname = z_file.filename

        if "/" in upload_fname:
            upload_fname = f'{folder.name}_{upload_fname.partition("/")[2]}'

        tmp_file = tmp_path.joinpath(upload_fname)
        logger.debug(f"Uploading File: {tmp_file}")
        with open(tmp_file, "wb") as f:
            f.write(content)
            f.flush()
            f.close()

        time.sleep(2)
        ab_folder.files.create_upload_session(
            tmp_file,
            chunk_size=3072000,
            chunk_uploaded=print_upload_progress,
        ).execute_query_retry()

The flush() and sleep(2) are there because the code runs in multiple threads; unless the first file is given a couple of seconds to close and reach storage, it cannot be read back. This is one of the many reasons I want to use an in-memory buffer instead of a temporary file.

Thanks...

I tried using an io.BytesIO buffer, but it wasn't supported.

python rest sharepoint
2 Answers

1 vote

After a lot of research, I found the answer. It turns out io.BytesIO is supported, but you must pass file_name and file_size as kwargs. When you do, the library does not call fileno().

def upload_file(ab_files: FileCollection, content: io.BytesIO | Path, **kwargs):
    if isinstance(content, Path):
        upload_name = content.stem + content.suffix
    else:
        # For in-memory buffers the caller must supply the name.
        upload_name = kwargs.get("upload_name")
    logger.debug(f"Uploading File: {upload_name}")
    fc_session = ab_files.create_upload_session(
        content,
        chunk_size=2048000,
        chunk_uploaded=print_upload_progress,
        print_f_name=upload_name,
        file_name=upload_name,              # required so fileno() is never called
        file_size=kwargs.get("file_size"),  # required for io.BytesIO
    )
    fc_session.execute_query_retry(max_retry=10)
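A call with an in-memory buffer might then look like the sketch below. The payload and upload name are made-up values, and the actual upload line is left commented out because it needs a live FileCollection (e.g. ab_folder.files from the question):

```python
import io

# Hypothetical usage of the upload_file helper above with an in-memory buffer.
payload = b"example file contents"
buf = io.BytesIO(payload)

# The size can be read from the buffer without consuming it,
# which is exactly what the file_size kwarg needs.
file_size = buf.getbuffer().nbytes

# With a real connection this would be:
# upload_file(ab_folder.files, buf, upload_name="report.zip", file_size=file_size)
print(file_size)  # 21
```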

0 votes

After following GLayton's suggestion, I still get the error io.BytesIO is not supported. The code is below. Please help me get it working.

def print_upload_progress(offset):
    file_size = len(downloaded_bytes)
    print("Uploaded '{0}' bytes from '{1}'...[{2}%]".format(
        offset, file_size, round(offset / file_size * 100, 2)))

def upload_file(content: io.BytesIO, **kwargs):
    chunk_size = 2048000
    upload_name = kwargs.get("upload_name")
    target_url = "Shared Documents/UploadFileTest"
    target_folder = ctx.web.get_folder_by_server_relative_url(target_url)
    uploaded_file = target_folder.files.create_upload_session(
        content, chunk_size=100000, chunk_uploaded=print_upload_progress)


service_client = authenticate.initialize_storage_account_ad(store_name_value)
file_system_client = service_client.get_file_system_client('XXX')
file_path = '/Test/Test.csv'
file_client = file_system_client.get_file_client(file_path)
downloaded_file = file_client.download_file()
downloaded_bytes = downloaded_file.readall()
file_content = BytesIO(downloaded_bytes)
upload_file(file_content, upload_name="test.csv",
            file_size=len(downloaded_bytes))
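Judging by the accepted answer, the likely culprit is that file_name and file_size are never forwarded to create_upload_session, so the library falls back to fileno(), which io.BytesIO does not implement. A minimal sketch of the corrected call, reusing ctx, print_upload_progress, and the target URL from the snippet above (all assumed to exist as shown there):

```python
# Sketch only: the same upload_file, with the file_name/file_size kwargs
# from the accepted answer added. ctx and print_upload_progress are
# assumed to be defined as in the snippet above.
def upload_file(content, **kwargs):
    upload_name = kwargs.get("upload_name")
    target_url = "Shared Documents/UploadFileTest"
    target_folder = ctx.web.get_folder_by_server_relative_url(target_url)
    target_folder.files.create_upload_session(
        content,
        chunk_size=100000,
        chunk_uploaded=print_upload_progress,
        file_name=upload_name,              # avoids the fileno() path
        file_size=kwargs.get("file_size"),  # required for io.BytesIO
    ).execute_query_retry()
```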