How to upload a folder from local to a GCP storage bucket using Python

Problem description  Votes: 0  Answers: 5

I was following this link and ran into some errors:

How to upload a folder to Google Cloud Storage using the Python API

I have saved a model in the container environment and I want to copy it from there to a GCP bucket.

Here is my code:

storage_client = storage.Client(project='*****')

def upload_local_directory_to_gcs(local_path, bucket, gcs_path):
    bucket = storage_client.bucket(bucket)
    assert os.path.isdir(local_path)
    for local_file in glob.glob(local_path + '/**'):
        print(local_file)
        print("this is bucket", bucket)
        blob = bucket.blob(gcs_path)
        print("here")
        blob.upload_from_filename(local_file)
        print("done")

path = "/pythonPackage/trainer/model_mlm_demo"  # local absolute path to my folder; the folder name is model_mlm_demo
buc = "py*****"  # this is my GCP bucket name
gcs = "model_mlm_demo2/"  # this is the new folder I want to store the files in on GCP

upload_local_directory_to_gcs(local_path=path, bucket=buc, gcs_path=gcs)

/pythonPackage/trainer/model_mlm_demo
contains 3 files: config, model.bin and arguments.bin

Error

The code does not throw any error, but no files are uploaded to the GCP bucket. It only creates an empty folder.

python-3.x google-cloud-platform
5 Answers
1 vote

The mistake I see is that you don't need to pass gs:// as the bucket argument. Actually, here is a sample you may want to look at:

https://cloud.google.com/storage/docs/uploading-objects#storage-upload-object-python

def upload_blob(bucket_name, source_file_name, destination_blob_name):
    """Uploads a file to the bucket."""
    # The ID of your GCS bucket
    # bucket_name = "your-bucket-name"
    # The path to your file to upload
    # source_file_name = "local/path/to/file"
    # The ID of your GCS object
    # destination_blob_name = "storage-object-name"

    storage_client = storage.Client()
    bucket = storage_client.bucket(bucket_name)
    blob = bucket.blob(destination_blob_name)

    blob.upload_from_filename(source_file_name)

    print(
        "File {} uploaded to {}.".format(
            source_file_name, destination_blob_name
        )
    )
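The point about not passing gs:// can be sketched as a small guard. This is my own illustration, assuming the helper name normalize_bucket_name; it is not part of the google-cloud-storage API:

```python
def normalize_bucket_name(bucket: str) -> str:
    """Strip an accidental gs:// prefix and any path suffix.

    storage_client.bucket() expects a bare bucket name such as
    "py*****", not a full gs:// URI.
    """
    if bucket.startswith("gs://"):
        bucket = bucket[len("gs://"):]
    # Keep only the bucket component, dropping any object prefix.
    return bucket.rstrip("/").split("/")[0]
```

You could call this once at the top of the upload function so either form of input works.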

1 vote

I have reproduced your issue, and the code snippet below works fine. I have updated the code with the folder and names you mentioned in the question. Let me know if you have any questions.

import os
import glob
from google.cloud import storage
storage_client = storage.Client(project='')

def upload_local_directory_to_gcs(local_path, bucket, gcs_path):

    bucket = storage_client.bucket(bucket)

    assert os.path.isdir(local_path)
    for local_file in glob.glob(local_path + '/**'):

        print(local_file)

        print("this is bucket", bucket)
        filename = local_file.split('/')[-1]
        blob = bucket.blob(gcs_path + filename)
        print("here")
        blob.upload_from_filename(local_file)
        print("done")


# this is local absolute path where my folder is. Folder name is **model_mlm_demo**
path = "/pythonPackage/trainer/model_mlm_demo"
buc = "py*****"  # this is my GCP bucket address
gcs = "model_mlm_demo2/"  # this is the new folder that I want to store files in GCP

upload_local_directory_to_gcs(local_path=path, bucket=buc, gcs_path=gcs)
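One caveat: glob.glob(local_path + '/**') without recursive=True only matches entries directly under the folder, so nested sub-directories would be skipped. A sketch based on os.walk covers them; the helper name iter_upload_pairs is my own, and the commented-out upload loop is untested against a real bucket:

```python
import os


def iter_upload_pairs(local_path, gcs_path):
    """Yield (local_file, blob_name) pairs for every file under local_path,
    preserving the relative sub-directory layout under gcs_path."""
    for root, _dirs, files in os.walk(local_path):
        for name in files:
            local_file = os.path.join(root, name)
            rel = os.path.relpath(local_file, local_path)
            # GCS object names always use forward slashes.
            yield local_file, gcs_path + rel.replace(os.sep, "/")


# Uploading would then look like (untested sketch):
# bucket = storage_client.bucket(buc)
# for local_file, blob_name in iter_upload_pairs(path, gcs):
#     bucket.blob(blob_name).upload_from_filename(local_file)
```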

1 vote

I just came across the gcsfs library, which also seems to offer a nicer interface.

You can copy an entire directory to a gcs location like this:


import gcsfs

def upload_to_gcs(src_dir: str, gcs_dst: str):
    fs = gcsfs.GCSFileSystem()
    fs.put(src_dir, gcs_dst, recursive=True)

0 votes

I found a way to upload model artifacts to a GCP bucket using subprocess.

import subprocess

subprocess.call('gsutil cp -r source_folder_in_local gs://*****/folder_name', shell=True, stdout=subprocess.PIPE)
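A variant that avoids shell=True is to pass the command as an argument list; the helper name gsutil_copy_command is my own, and the paths in the commented call are the same placeholders as above, not verified against a real bucket:

```python
import subprocess


def gsutil_copy_command(src_dir, gs_uri):
    """Build the gsutil argument list.

    Passing a list to subprocess.run avoids shell quoting issues
    with paths that contain spaces or special characters."""
    return ["gsutil", "cp", "-r", src_dir, gs_uri]


# result = subprocess.run(
#     gsutil_copy_command("source_folder_in_local", "gs://*****/folder_name"),
#     check=True,
# )
```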

If gsutil is not installed, you can install it using this link:

https://cloud.google.com/storage/docs/gsutil_install

