Writing to S3 from Spark on EMR fails with UnsupportedStagingDirectoryOperationException


I am trying to save a DataFrame to S3 as a partitioned, bucketed table like this:

(
    fl.write
    .partitionBy("XXX")                    # one directory per XXX value
    .option("path", "s3://some/location")  # table data location on S3
    .bucketBy(40, "YY", "ZZ")              # 40 buckets hashed on YY and ZZ
    .saveAsTable("DB_NAME.TABLE_NAME")
)

I saw a lot of small multipart uploads, so I decided to disable multipart uploads with:

spark.conf.set("fs.s3n.multipart.uploads.enabled", "false")

Now when I run the code, I get:

Caused by: org.apache.hadoop.fs.staging.UnsupportedStagingDirectoryOperationException: Multipart uploads (fs.s3n.multipart.uploads.enabled) must be enabled in order to create files under a staging directory

I thought multipart uploads were optional. What is going on here?
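
Going only by the error message quoted above, the write path creates files under a staging directory, and that path requires fs.s3n.multipart.uploads.enabled to be on. A minimal sketch of undoing the override is shown here. The helper name ensure_multipart_enabled and the DictConf class are hypothetical, introduced only for illustration; DictConf is a stand-in for spark.conf so the snippet runs without a cluster. The sketch also assumes that an unset key means multipart uploads are enabled, which should be checked against the EMR release in use.

```python
# Key taken verbatim from the error message in the question.
MULTIPART_KEY = "fs.s3n.multipart.uploads.enabled"


def ensure_multipart_enabled(conf, key=MULTIPART_KEY):
    """Set the multipart flag back to "true" if it was turned off.

    `conf` is any object exposing get(key, default) and set(key, value),
    for example `spark.conf` in a PySpark session.
    Returns True if the value was changed, False otherwise.
    """
    # Assumption: a missing key is treated as "enabled".
    current = str(conf.get(key, "true")).strip().lower()
    if current == "false":
        conf.set(key, "true")
        return True
    return False


class DictConf:
    """Hypothetical dict-backed stand-in for spark.conf (illustration only)."""

    def __init__(self, values=None):
        self._values = dict(values or {})

    def get(self, key, default=None):
        return self._values.get(key, default)

    def set(self, key, value):
        self._values[key] = value


if __name__ == "__main__":
    conf = DictConf({MULTIPART_KEY: "false"})  # mirrors the setting in the question
    changed = ensure_multipart_enabled(conf)
    print(changed, conf.get(MULTIPART_KEY))
```

In a real session the call would be ensure_multipart_enabled(spark.conf) before the write. This only removes the conflict reported in the exception; it does not by itself reduce the number of small files.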

apache-spark pyspark apache-spark-sql aws-glue amazon-emr