I can't figure out how to automate OAuth authentication to access Google Drive (Python)

Question (votes: 0, answers: 2)

What we are trying to solve

  1. Access a shared drive with a user account via OAuth authentication
  2. Retrieve a spreadsheet and convert it to Parquet
  3. Save it to GCS

These steps are written in the main() function below, and I want to run them as a daily scheduled job using Cloud Functions and Cloud Scheduler.

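In practice, however, the code below requires the user to log in to their Google account manually through a browser. I would like to rewrite it so that this login happens automatically, but I can't work out how. I would be grateful for any help.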


### ※※ Browser authentication is required here ※※
creds = flow.run_local_server(port=0)
### Result
Please visit this URL to authorize this application: 
https://accounts.google.com/o/oauth2/auth?response_type=code&client_id=132987612861-
4j24afrouontpeiv5ryy7sn64inhr.apps.googleusercontent.com&redirect_uri=
http%yyy%2Flocalhost%3yy6%2F&scope=httpsyyF%2Fwww.googleapis.com%2Fauth%2Fdrive.
readonly&state=XXXXXXXXXXXXXXXXXXXXXXXXXXX&access_type=offline

The state=XXXXXXXXXXXXXXXXXXXXX part of the URL changes on every run.
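The changing value is the OAuth 2.0 `state` parameter, which the client library generates afresh for each authorization request. Because the flow is started with `port=0`, the local server also listens on a different free port each time, so `redirect_uri` varies as well. A minimal sketch using only the standard library shows which query parameters stay fixed and which vary; the two URLs are made-up placeholders, not output from the real application, and the helper names are hypothetical.

```python
from urllib.parse import urlsplit, parse_qs


def query_params(auth_url):
    """Return the query parameters of an authorization URL as a flat dict."""
    parsed = parse_qs(urlsplit(auth_url).query)
    return {name: values[0] for name, values in parsed.items()}


def changed_params(url_a, url_b):
    """Return the sorted names of parameters whose values differ between two URLs."""
    a, b = query_params(url_a), query_params(url_b)
    return sorted(name for name in a.keys() | b.keys() if a.get(name) != b.get(name))


# Hypothetical URLs standing in for two separate runs of the flow.
RUN_1 = (
    "https://accounts.google.com/o/oauth2/auth?response_type=code"
    "&client_id=EXAMPLE_CLIENT_ID"
    "&redirect_uri=http%3A%2F%2Flocalhost%3A50001%2F"
    "&scope=https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fdrive.readonly"
    "&state=AAAA&access_type=offline"
)
RUN_2 = RUN_1.replace("state=AAAA", "state=BBBB").replace("50001", "50002")

print(changed_params(RUN_1, RUN_2))  # ['redirect_uri', 'state']
```

Everything else in the URL (client ID, scope, response type) is identical between runs, so a differing URL is expected behaviour and not a sign of misconfiguration.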

Browser screen that opens when the code above is executed

Full relevant source code

from __future__ import print_function
import io
import os
import key
import json
import os.path
import pandas as pd
import pyarrow as pa
import pyarrow.parquet as pq
from pprint import pprint
from google.cloud import storage as gcs
from google.oauth2 import service_account
from google.auth.transport.requests import Request
from google.oauth2.credentials import Credentials
from google_auth_oauthlib.flow import InstalledAppFlow
from googleapiclient.http import MediaIoBaseDownload, MediaIoBaseUpload, MediaFileUpload
from googleapiclient.discovery import build
from googleapiclient.errors import HttpError

SCOPES = ['https://www.googleapis.com/auth/drive.readonly']

def main(event, context):
    """Drive v3 API.

    Access the shared drive → get the spreadsheet → convert to Parquet → upload to GCS.
    """
    creds = None
    file_id = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxx' #Unedited data in shared drive
    mime_type = 'text/csv'

    # OAuth authentication to access shared drives
    if os.path.exists('token.json'):
        creds = Credentials.from_authorized_user_file('token.json', SCOPES)
    # Let the user log in if there are no (valid) credentials available
    if not creds or not creds.valid:
        if creds and creds.expired and creds.refresh_token:
            creds.refresh(Request())
        else:
            flow = InstalledAppFlow.from_client_secrets_file(
                'credentials.json', SCOPES)
            ### ※※Browser authentication required※※
            creds = flow.run_local_server(port=0)##Currently, we need a manual login here!
        with open('token.json', 'w') as token:
            token.write(creds.to_json())
    try:
        # Retrieve spreadsheets from shared drives
        service = build('drive', 'v3', credentials=creds)
        request = service.files().export_media(fileId=file_id, mimeType=mime_type)
        fh = io.BytesIO()
        downloader = MediaIoBaseDownload(fh, request)
        done = False
        # Download chunk by chunk until the export is complete
        while not done:
            status, done = downloader.next_chunk()
        # Read "Shared Drive/SpreadSheet" -> convert to parquet
        df = pd.read_csv(io.StringIO(fh.getvalue().decode()))
        table = pa.Table.from_pandas(df)
        buf = pa.BufferOutputStream()
        pq.write_table(table, buf, compression=None)

        # service_account for save to GCS
        key_path = 'service_account_file.json'
        with open(key_path) as key_file:  # close the key file after reading
            service_account_info = json.load(key_file)
        credentials = service_account.Credentials.from_service_account_info(service_account_info)
        client = gcs.Client(
            credentials=credentials,
            project=credentials.project_id,
        )

        # GCS information to be saved 
        bucket_name = 'bucket-name'
        blob_name = 'sample-folder/daily-data.parquet'  # save path
        bucket = client.get_bucket(bucket_name)
        blob = bucket.blob(blob_name)

        # parquet save to GCS
        blob.upload_from_string(data=buf.getvalue().to_pybytes())
        # ↓If a print appears, the data has been saved.
        print("Blob '{}' created to '{}'!".format(blob_name, bucket_name))

    except HttpError as error:
        # TODO(developer) - Handle errors from drive API.
        print(f'An error occurred: {error}')

What I have tried myself

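I tried driving the browser with Selenium, but could not get it to work well because the browser login URL is different every time. ← I may still be able to find a way around this.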

python google-cloud-platform oauth-2.0 oauth google-drive-api
2 answers
0
votes

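Try this approach. It worked for me!

The solution is to create a service account and share your data folder with the service account's email address.

Drive API with a service account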
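A minimal sketch of this approach, assuming the spreadsheet (or the folder or shared drive that contains it) has already been shared with the service account's email address. `service_account_file.json` is the key file name used in the question; `build_drive_service` and `export_sheet_as_csv` are hypothetical helper names. The Google client imports are deferred into the function so that the export helper, which only needs a ready-made `service` object, can be exercised on its own.

```python
SCOPES = ["https://www.googleapis.com/auth/drive.readonly"]


def build_drive_service(key_path="service_account_file.json"):
    """Build a Drive v3 client from a service account key; no browser login is involved."""
    # Imported here so this sketch loads even where the Google libraries are not installed.
    from google.oauth2 import service_account
    from googleapiclient.discovery import build

    creds = service_account.Credentials.from_service_account_file(
        key_path, scopes=SCOPES
    )
    return build("drive", "v3", credentials=creds)


def export_sheet_as_csv(service, file_id):
    """Export a Google Sheets file as CSV and return it as text."""
    data = service.files().export_media(fileId=file_id, mimeType="text/csv").execute()
    return data.decode("utf-8")
```

In a Cloud Function, `export_sheet_as_csv(build_drive_service(), file_id)` could stand in for the `InstalledAppFlow` branch of the question's `main()`. The request then runs with the service account's access, not a user account's, so it only sees files that were shared with it.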


0
votes

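I have the same problem as you. I want to automate the step "creds = flow.run_local_server(port=0)". I wrote a method that fetches the URL automatically, logs in, and returns the token, but I can't work out which URL has to be passed to that method as a parameter. I used the URL returned by "auth_url, _ = flow.authorization_url(prompt='consent')", but that URL has no redirect_uri, so my code raises an error.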
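On the missing redirect_uri: `run_local_server` sets the flow's redirect URI itself before it builds the URL, so when `authorization_url` is called directly the caller has to supply one. A minimal sketch, assuming a `credentials.json` client secrets file as in the question and a hypothetical redirect URI `http://localhost:8080/` (which may need to be registered for the OAuth client, depending on its type); `build_auth_url` and `missing_params` are illustrative names. This only produces a complete URL; a user still has to grant consent in a browser.

```python
from urllib.parse import urlsplit, parse_qs

# Parameters Google's authorization endpoint expects on every request.
EXPECTED = ("response_type", "client_id", "redirect_uri", "scope")


def missing_params(auth_url):
    """List the expected authorization parameters that are absent from a URL."""
    present = parse_qs(urlsplit(auth_url).query)
    return [name for name in EXPECTED if name not in present]


def build_auth_url(secrets_path="credentials.json",
                   redirect_uri="http://localhost:8080/"):  # hypothetical redirect URI
    """Build an authorization URL that carries a redirect_uri."""
    # Deferred import so missing_params can be used without google-auth-oauthlib.
    from google_auth_oauthlib.flow import Flow

    flow = Flow.from_client_secrets_file(
        secrets_path,
        scopes=["https://www.googleapis.com/auth/drive.readonly"],
        redirect_uri=redirect_uri,
    )
    auth_url, _state = flow.authorization_url(prompt="consent", access_type="offline")
    return auth_url
```

Calling `missing_params` on a URL built without a redirect URI should report `['redirect_uri']`, which is a quick way to confirm the cause of the error described above.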
