pyspark: How can I submit Spark SQL queries in parallel?


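Hello, I have more than 1,200 SQL queries. I want to submit several of them in parallel and save the result of each query to a CSV file. Since Python has the GIL, how can I submit them in parallel? The other demos I have seen are all Scala-based Spark apps.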

# returns about 61K records
SQL = """
SELECT * FROM TEMP_VIEW WHERE index>=1 and index<=10;
"""
# returns about 60K records
SQL2 = """
SELECT * FROM TEMP_VIEW WHERE index>=11 and index<=20;
"""
....
# the queries are submitted with a for loop

Any suggestions would be very helpful! Thanks in advance!
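
One possible approach, sketched here under assumptions rather than taken from the original post: the GIL is usually not the bottleneck in this situation, because each `spark.sql(...)` call spends its time waiting on the JVM, and Spark's scheduler accepts jobs submitted from several threads of the same application. A plain `ThreadPoolExecutor` sharing one `SparkSession` may therefore be enough. The helper names `export_query` and `export_all`, the `queries` dict, the output directory layout, and the default of 8 workers are all hypothetical. Note that `DataFrameWriter.csv` writes a directory of part files per query, not a single CSV file.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def export_query(spark, name, sql, out_dir):
    """Run one query and write its result as CSV under out_dir/name (hypothetical layout)."""
    path = f"{out_dir}/{name}"
    # Drop a trailing semicolon defensively; some Spark versions may reject it in spark.sql().
    statement = sql.strip().rstrip(";")
    spark.sql(statement).write.mode("overwrite").option("header", True).csv(path)
    return path


def export_all(spark, queries, out_dir, max_workers=8):
    """Submit each query from a worker thread; return (paths by name, exceptions by name)."""
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {
            pool.submit(export_query, spark, name, sql, out_dir): name
            for name, sql in queries.items()
        }
        for future in as_completed(futures):
            name = futures[future]
            try:
                results[name] = future.result()
            except Exception as exc:  # keep going if a single query fails
                errors[name] = exc
    return results, errors


# Usage sketch (assumes pyspark is installed and TEMP_VIEW is already registered):
#
#   from pyspark.sql import SparkSession
#   spark = SparkSession.builder.appName("parallel-sql").getOrCreate()
#   queries = {"q1": SQL, "q2": SQL2}          # name -> SQL text
#   done, failed = export_all(spark, queries, "/tmp/sql_out")
```

How many queries actually run concurrently depends on the executors available to the application. With the default FIFO scheduling, an early large job can occupy most of the cluster, so setting `spark.scheduler.mode` to `FAIR` may help the jobs share resources more evenly.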

apache-spark pyspark apache-spark-sql spark-streaming