Can Spark execute a SQL UPDATE query directly against a table in the source database, without first loading the table into a DataFrame and then writing the DataFrame back to the database?
Thanks for your input :)
Not as a direct UPDATE statement — Spark's JDBC data source doesn't issue UPDATEs against the source table. The usual pattern is to read the table, transform it, and write the result back. Try something like this:
import java.util.Properties

val properties = new Properties()

// Read the source table over JDBC into a DataFrame
val readDF = sqlContext.read
  .format("jdbc")
  .options(Map("url" -> sys.env("SQL_CONNECTION"), "dbtable" -> "MyTableName"))
  .load()

// convert, map, add/remove columns ... so that readDF
// becomes finalProduct

// Write the transformed DataFrame back, replacing the table
finalProduct
  .write
  .mode(org.apache.spark.sql.SaveMode.Overwrite)
  .jdbc(sys.env("SQL_CONNECTION"), "MyTableName", properties)
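
On Spark 2.x and later, `sqlContext` is superseded by `SparkSession`, but the same read/transform/write-back flow applies. A minimal sketch, assuming the `SQL_CONNECTION` environment variable and `MyTableName` from above (the pass-through transform is a placeholder for your own logic):

```scala
import java.util.Properties
import org.apache.spark.sql.{SaveMode, SparkSession}

val spark = SparkSession.builder()
  .appName("jdbc-roundtrip")
  .getOrCreate()

val url = sys.env("SQL_CONNECTION") // assumed JDBC connection string

// Read the source table over JDBC
val readDF = spark.read
  .format("jdbc")
  .option("url", url)
  .option("dbtable", "MyTableName")
  .load()

// Placeholder transform: replace with your column mapping / filtering
val finalProduct = readDF

// Credentials can also go into Properties (user/password keys)
val properties = new Properties()

// Overwrite mode drops and recreates the table by default;
// set the "truncate" option to true to keep the table's schema.
finalProduct.write
  .mode(SaveMode.Overwrite)
  .option("truncate", "true")
  .jdbc(url, "MyTableName", properties)
```

Note that `Overwrite` replaces the whole table, so this is a full-table rewrite rather than a row-level update; for true in-place UPDATEs you would need to go outside Spark and use a plain JDBC connection.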