I'm running the following code in Databricks to save a table with sparklyr:
library(sparklyr)
library(dplyr)
sc <- sparklyr::spark_connect(method = "databricks")
dat <- sparklyr::spark_read_table(sc, "products.output")
dat <- dat %>% dplyr::mutate(x = as.character(x), y = as.character(y))
%sql
drop table products.output
sparklyr::spark_write_table(x = dat, name = "products.output")
org.apache.spark.sql.AnalysisException:
The schema of your Delta table has changed in an incompatible way since your DataFrame or
DeltaTable object was created. Please redefine your DataFrame or DeltaTable object
Can I overwrite the schema?
Use the same approach as the answer to this question. Following the documentation for
sparklyr::spark_write_table
, add another argument: options = list(overwriteSchema = "true")
. This Databricks documentation may help: https://docs.databricks.com/en/delta/update-schema.html#explicitly-update-schema-to-change-column-type-or-name
sparklyr::spark_write_table(x = dat, name = "products.output",
mode = "overwrite",
options = list(overwriteSchema = "true"))
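With mode = "overwrite" and overwriteSchema = "true", the separate %sql DROP TABLE cell shouldn't be needed, since the write replaces both the data and the schema. If you do want to drop the table first, you can also issue the statement from R instead of switching to a SQL cell. A minimal sketch, assuming sc is the sparklyr connection created above:

```r
library(sparklyr)
library(DBI)

# sparklyr connections implement the DBI interface, so SQL
# statements can be run directly from the R cell.
# Assumes `sc` is the Databricks Spark connection from the question.
DBI::dbExecute(sc, "DROP TABLE IF EXISTS products.output")
```

This keeps the whole workflow in one language, which is easier to rerun as a single script.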