sql_list = ['(select * from table1 where rownum <= 100) alias1', '(select * from table2 where rownum <= 100) alias2']
for sql_statement in sql_list:
    # Read each aliased subquery from the database over JDBC.
    df = spark.read.format("jdbc").option("driver", jdbc_driver_name).option("url", db_url).option("dbtable", sql_statement).option("user", db_username).option("password", db_password).option("fetchSize", 100000).load()
    # The output path is built from the whole query string, so each dataset is named after the full statement.
    df.write.format("parquet").mode("overwrite").save("s3://s3-location/" + sql_statement)
The source is an Oracle database.
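I am able to run the array of queries and store the results in S3 as Parquet, but the output is named after the full strings listed in sql_list. I would like to store each dataset separately in S3, named alias1 and alias2 respectively.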
sql_list = ['(select * from table1 where rownum <= 100) alias1', '(select * from table2 where rownum <= 100) alias2']
for sql_statement in sql_list: df = spark.read.format("jdbc").option("driver" ...
Consider using a dictionary instead of a list, since it is both tidier and more flexible.
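Here is a minimal sketch of what the dictionary approach could look like. It assumes each alias is used as the key and the bare subquery as the value. The helper names `build_jobs` and `export_all` and the `jdbc_options` dictionary are hypothetical and introduced only for illustration; the queries and the base path `s3://s3-location/` are taken from the question.

```python
# Assumed mapping: alias -> subquery (queries taken from the question).
queries = {
    "alias1": "select * from table1 where rownum <= 100",
    "alias2": "select * from table2 where rownum <= 100",
}

S3_BASE = "s3://s3-location/"  # base path from the question


def build_jobs(queries, base_path=S3_BASE):
    """Return one (dbtable, output_path) pair per alias (hypothetical helper)."""
    jobs = []
    for alias, query in queries.items():
        # Wrap the query as an aliased subquery for the JDBC "dbtable" option.
        dbtable = f"({query}) {alias}"
        # Name the output after the alias only, not the whole statement.
        jobs.append((dbtable, base_path + alias))
    return jobs


def export_all(spark, jobs, jdbc_options):
    """Read each subquery over JDBC and write it to its own Parquet path.

    jdbc_options is assumed to hold driver, url, user, password and fetchSize.
    """
    for dbtable, path in jobs:
        df = (
            spark.read.format("jdbc")
            .options(**jdbc_options)
            .option("dbtable", dbtable)
            .load()
        )
        df.write.format("parquet").mode("overwrite").save(path)
```

With an existing `spark` session this would be called as `export_all(spark, build_jobs(queries), jdbc_options)`, which should write the two datasets to `s3://s3-location/alias1` and `s3://s3-location/alias2`. Keeping the alias as the dictionary key means the output name never has to be parsed back out of the SQL string.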