Load a Spark SQL table into a database table

Problem description

Is there any way to load a Spark SQL table into a database table as-is, the way you would in SQL:

insert into database_table select * from sparksql_table;

from pyspark.sql import SparkSession
from airflow.providers.postgres.hooks.postgres import PostgresHook  # on older Airflow: airflow.hooks.postgres_hook

# Postgres connection obtained through Airflow's PostgresHook
pg_hook = PostgresHook(postgres_conn_id="ingestion_db", schema="ingestiondb")
connection = pg_hook.get_conn()
cursor = connection.cursor()

# Spark session with Hive support, so spark.sql() can see metastore tables
spark = SparkSession \
    .builder \
    .appName("Spark csv schema inference") \
    .config("spark.sql.warehouse.dir", warehouse_location) \
    .enableHiveSupport() \
    .getOrCreate()
I am able to run this:

spark.sql("select * from MetadataTable").show()

but not this:

cursor.execute("select * from MetadataTable")

python apache-spark pyspark apache-spark-sql
1 Answer

Open a spark-shell with the PostgreSQL JDBC driver package:

spark-shell --packages org.postgresql:postgresql:42.1.1
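If you are working from PySpark instead, as in the question, the same flag works for the Python shell (assuming the same driver version):

pyspark --packages org.postgresql:postgresql:42.1.1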

import java.util.Properties

val url = "jdbc:postgresql://localhost:5432/dbname"

// JDBC connection properties for Postgres
def getProperties: Properties = {
  val prop = new Properties
  prop.setProperty("user", "dbuser")
  prop.setProperty("password", "dbpassword")
  prop.setProperty("driver", "org.postgresql.Driver")
  prop
}

// Read the Spark SQL table into a DataFrame...
val df = spark.sql("""select * from table""")

// ...and append its rows to the Postgres table over JDBC
df.write.mode("append").option("driver", "org.postgresql.Driver").jdbc(url, "tablename", getProperties)

After that, you can check the table in your Postgres database. Also look at the different save-mode options available and pick the one that fits your case.
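Since the question itself uses PySpark, a rough Python equivalent of the write above might look like this; a minimal sketch, where the URL, credentials, and table names are placeholders and spark is the session created in the question:

# Read the Spark SQL table, then append its rows to Postgres over JDBC
df = spark.sql("select * from MetadataTable")

df.write \
    .format("jdbc") \
    .mode("append") \
    .option("url", "jdbc:postgresql://localhost:5432/dbname") \
    .option("dbtable", "database_table") \
    .option("user", "dbuser") \
    .option("password", "dbpassword") \
    .option("driver", "org.postgresql.Driver") \
    .save()

On the save modes: append adds rows to an existing table, overwrite drops and recreates it, ignore skips the write if the table already exists, and error (the default) raises in that case.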
