Resources used
Log Analytics is using a data export rule to write data into a container in a blob storage account. Databricks mounts the same container and runs a pipeline that reads and transforms the data every hour. The Databricks pipeline sometimes runs fine and sometimes fails with the error below. I understand there is a race condition between writing to and reading from blob storage: the Log Analytics data export rule has no fixed time threshold for sending data to storage. Any ideas on how to handle this race condition?
Caused by: java.io.IOException: Operation failed: "The condition specified using HTTP conditional header(s) is not met.", 412, GET, https://xxx.dfs.core.windows.net/xx-xxx/WorkspaceResourceId%3D/subscriptions/xxx.json?timeout=90, ConditionNotMet, "The condition specified using HTTP conditional header(s) is not met. RequestId:xxx-xxxx-xxxx-xxxx-xxx Time:xxx-04-03T20:11:21.xxxx"
at shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs.services.AbfsInputStream.readRemote(AbfsInputStream.java:673)
at shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs.services.AbfsInputStream.readInternal(AbfsInputStream.java:619)
at shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs.services.AbfsInputStream.readOneBlock(AbfsInputStream.java:409)
at shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs.services.AbfsInputStream.read(AbfsInputStream.java:346)
at java.io.DataInputStream.read(DataInputStream.java:149)
at com.databricks.common.filesystem.LokiAbfsInputStream.$anonfun$read$3(LokiABFS.scala:204)
at scala.runtime.java8.JFunction0$mcI$sp.apply(JFunction0$mcI$sp.java:23)
at com.databricks.common.filesystem.LokiAbfsInputStream.withExceptionRewrites(LokiABFS.scala:194)
"The condition specified using HTTP conditional header(s) is not met.", 412, GET, https://xxx.dfs.core.windows.net/xx-xxx/WorkspaceResourceId%3D/subscriptions/xxx.json?timeout=90, ConditionNotMet, "The condition specified using HTTP conditional header(s) is not met."
As per this, when a write operation is performed on a blob, the blob's ETag is reset. Say that before the read is triggered the blob's ETag value is 0x8CDA1BF0593B660; the blob is then updated by another service, and its ETag changes from 0x8CDA1BF0593B660 to 0x8CDA1BF0593B661.
This could be the reason for the above error when reading the JSON file from the storage account in Databricks. The concurrency behavior can be configured per the hadoop-azure library documentation; it is the library used to access ADLS (abfss).
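One pragmatic way to cope with the race is to retry the read when the 412 ConditionNotMet error surfaces, since on a reopen the ABFS driver picks up the blob's current ETag. Below is a minimal sketch, not an official Databricks or hadoop-azure API: `read_fn`, `max_attempts`, and `base_delay` are all assumed names, and the caller supplies the actual read (e.g. a `spark.read.json(...)` call) as a callable.

```python
import time


def read_with_retry(read_fn, max_attempts=5, base_delay=1.0):
    """Retry a read that intermittently fails with a 412 ConditionNotMet.

    The export rule may update the blob (changing its ETag) while the
    pipeline is mid-read; retrying reopens the stream against the
    blob's new ETag. `read_fn` is a zero-argument callable performing
    the actual read (hypothetical wrapper, not a Databricks API).
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return read_fn()
        except IOError as exc:
            # Only retry the conditional-header failure; re-raise anything else.
            if "ConditionNotMet" not in str(exc) or attempt == max_attempts:
                raise
            # Exponential backoff before reopening the stream.
            time.sleep(base_delay * 2 ** (attempt - 1))
```

A possible usage in a notebook cell would be `df = read_with_retry(lambda: spark.read.json(path))`. The backoff gives the export rule time to finish its write burst before the next attempt.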
For more information, you can refer to the following links: