ETL: loading from Google Cloud Storage into BigQuery


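I want to load data from hundreds of CSV files on Google Cloud Storage and append it daily to a single table in BigQuery using Cloud Dataflow (preferably with the Python SDK). Can you tell me how to do this?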

Thanks

python google-cloud-platform google-bigquery google-cloud-storage dataflow
1 Answer

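This can also be done in Python. The snippet below turns one CSV line into a dictionary keyed by column name, which is the row shape a BigQuery write step can consume.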

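def format_output_json(element):
    """
    Convert one line of the CSV into a row dictionary.

    :param element: one row of the CSV, as a comma-separated string
    :return: a one-element list holding a dictionary that maps each column
        name to the value in that row

    The column names in row_indices are hard-coded here, but they could be
    obtained at run time.
    """
    row_indices = ['time_stamp', 'product_name', 'units_sold', 'retail_price']
    # Plain split: assumes no field contains a quoted comma.
    row_data = element.split(',')
    # zip stops at the shorter sequence, so a row with extra fields no longer
    # raises IndexError as an index-based loop would.
    return [dict(zip(row_indices, row_data))]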
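
The function above only covers the per-row conversion. A minimal sketch of how it might be wired into an Apache Beam pipeline follows. The bucket pattern, table name, and the all-STRING schema are placeholders I made up for illustration, and the sketch assumes the files have no header row. The function is repeated in compact form so the block stands on its own.

```python
# Column names taken from the snippet above.
ROW_INDICES = ['time_stamp', 'product_name', 'units_sold', 'retail_price']

# Assumption: every column is loaded as STRING, since split() yields strings.
TABLE_SCHEMA = ','.join(name + ':STRING' for name in ROW_INDICES)


def format_output_json(element):
    """Turn one CSV line into a one-element list holding a row dictionary."""
    return [dict(zip(ROW_INDICES, element.split(',')))]


def run(input_pattern, output_table, pipeline_args=None):
    """Read every file matching input_pattern and append the rows to output_table.

    input_pattern, e.g. 'gs://my-bucket/daily/*.csv' (hypothetical bucket)
    output_table,  e.g. 'my-project:my_dataset.sales' (hypothetical table)
    """
    # Imported here so the helper above can be used without Beam installed.
    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    options = PipelineOptions(pipeline_args)
    with beam.Pipeline(options=options) as pipeline:
        (
            pipeline
            # A glob pattern lets one read step cover hundreds of files.
            | 'ReadCsvLines' >> beam.io.ReadFromText(input_pattern)
            # FlatMap, because format_output_json returns a list.
            | 'ToRowDict' >> beam.FlatMap(format_output_json)
            | 'AppendToBigQuery' >> beam.io.WriteToBigQuery(
                output_table,
                schema=TABLE_SCHEMA,
                create_disposition=beam.io.BigQueryDisposition.CREATE_IF_NEEDED,
                # WRITE_APPEND adds rows to the existing table on each run.
                write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
            )
        )
```

To run this on Dataflow rather than locally, the usual pipeline arguments (such as --runner DataflowRunner, --project, --region and --temp_location) would be passed through pipeline_args. Triggering the job once a day, for example from cron or a scheduler, is outside this sketch.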