Loading big data from BigQuery into Python

from google.cloud import bigquery as bq
import google_auth_oauthlib.flow
import pandas as pd

query = '''select ... from ...'''

bigquery_client = bq.Client()
table = bq.query.QueryResults(query=query,client=bigquery_client)
table.use_legacy_sql = False
table.run()

# transfer bigquery data to pandas dataframe
columns=[field.name for field in table.schema]
rows = table.fetch_data()
data = []
for row in rows:
    data.append(row)

df = pd.DataFrame(data=data[0],columns=columns)

I want to load more than 10 million rows in Python. This ran fine a few weeks ago, but now it returns only 100,000 rows. Does anyone know a reliable way to do this?
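As an aside: if `fetch_data()` yields individual row tuples, then `pd.DataFrame(data=data[0], ...)` builds the frame from only the first row; passing the whole list is usually what's wanted. A minimal sketch of that accumulation pattern, with a hypothetical stand-in iterator instead of a live BigQuery table:

```python
import pandas as pd

# Hypothetical stand-ins for table.schema and table.fetch_data()
# in the question's code: a fixed column list and a small row iterator.
columns = ["id", "name"]

def fetch_rows():
    # In the real code this would stream row tuples from BigQuery.
    yield (1, "alice")
    yield (2, "bob")

# Accumulate all rows first, then build the DataFrame in one pass;
# note that the full list `data` (not data[0]) is passed in.
data = list(fetch_rows())
df = pd.DataFrame(data=data, columns=columns)
print(df.shape)  # (2, 2)
```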

python google-bigquery google-cloud-platform google-python-api
1 Answer

I just tested this code here and it brought back 3 million rows with no cap applied:

import os
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = 'path/to/key.json'

from google.cloud.bigquery import Client

bc = Client()
query = 'your query'

job = bc.run_sync_query(query)  # run_sync_query comes from an older google-cloud-bigquery release
job.use_legacy_sql = False
job.run()

data = list(job.fetch_data())

Does that work for you?
