将数据从 pyspark 加载到 bigquery 表中,但出现架构不兼容错误

问题描述 投票:0回答:0

我正在尝试将数据从 pyspark dataframea 加载到 bigquery 表中,但遇到以下错误:

    1) [Guice/ErrorInCustomProvider]: IllegalArgumentException: BigQueryConnectorException$InvalidSchemaException: Destination table's schema is not compatible with dataframe's schema
E                     at BigQueryDataSourceWriterModule.provideDirectDataSourceWriterContext(BigQueryDataSourceWriterModule.java:60)
E                     while locating BigQueryDirectDataSourceWriterContext
E                   
E                   Learn more:
E                     https://github.com/google/guice/wiki/ERROR_IN_CUSTOM_PROVIDER
E                   
E                   1 error

我试过使模式匹配,如图所示:

Pyspark 数据框架构

root
 |-- key_column: string (nullable = false)
 |-- column_a: string (nullable = false)
 |-- column_b: string (nullable = true)
 |-- column_c: string (nullable = false)

BigQuery 表架构

{"fields":[{"metadata":{},"name":"key_column","nullable":false,"type":"string"},{"metadata":{},"name":"column_a","nullable":false,"type":"string"},{"metadata":{},"name":"column_b","nullable":true,"type":"string"},{"metadata":{},"name":"column_c","nullable":false,"type":"string"}],"type":"struct"}

我需要修改/更正什么才能使此负载正常工作?

pyspark google-bigquery
© www.soinside.com 2019 - 2024. All rights reserved.