Spark job throws an error (Spark node heartbeat error)

Problem description

I am using Spark 2.0 on a 2-node cluster (Windows machines, firewall disabled). I am running a socket streaming program and get the following error:

16/10/12 12:10:41 WARN NettyRpcEndpointRef: Error sending message [message = Heartbeat(2,[Lscala.Tuple2;@1ee84de,BlockManagerId(2, IP1, 2726))] in 1 attempts
org.apache.spark.rpc.RpcTimeoutException: Futures timed out after [30 seconds]. This timeout is controlled by spark.executor.heartbeatInterval
    at org.apache.spark.rpc.RpcTimeout.org$apache$spark$rpc$RpcTimeout$$createRpcTimeoutException(RpcTimeout.scala:48)
    at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:63)
    at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
    at scala.PartialFunction$OrElse.apply(PartialFunction.scala:167)
    at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:83)
    at org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:102)
    at org.apache.spark.executor.Executor.org$apache$spark$executor$Executor$$reportHeartBeat(Executor.scala:518)
    at org.apache.spark.executor.Executor$$anon$1$$anonfun$run$1.apply$mcV$sp(Executor.scala:547)
    at org.apache.spark.executor.Executor$$anon$1$$anonfun$run$1.apply(Executor.scala:547)
    at org.apache.spark.executor.Executor$$anon$1$$anonfun$run$1.apply(Executor.scala:547)
    at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1857)
    at org.apache.spark.executor.Executor$$anon$1.run(Executor.scala:547)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.util.concurrent.TimeoutException: Futures timed out after [30 seconds]
    at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
    at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
    at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:190)
    at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
    at scala.concurrent.Await$.result(package.scala:190)
    at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:81)

I have already set spark.executor.heartbeatInterval to 30, but I still get the error.
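
One possible pitfall (an assumption, since the question does not show how the value was set): in Spark 2.x this property is parsed as a time string, and a bare 30 can be interpreted as 30 milliseconds rather than 30 seconds, which would tighten the timeout instead of relaxing it. Passing the value with an explicit unit on the existing spark-submit command would look like this:

spark-submit --master spark://server1:7077 --conf spark.executor.heartbeatInterval=30s --num-executors 2 --executor-cores 1 C:\sparkpoc\streamexample.py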

My environment:

Server 1: I start the master with

spark-class org.apache.spark.deploy.master.Master

and then start a worker with

spark-class org.apache.spark.deploy.worker.Worker spark://server1:7077

Server 2 (ip2): I start a worker with

spark-class org.apache.spark.deploy.worker.Worker spark://172.16.2.95:7077

The driver runs on server 1:

spark-submit --master spark://server1:7077  --num-executors 2  --executor-cores 1  C:\sparkpoc\streamexample.py
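
The question does not include streamexample.py; a minimal socket-streaming driver for Spark 2.0 might look like the sketch below, where the application name, host, port, and batch interval are all assumptions:

    from pyspark import SparkConf, SparkContext
    from pyspark.streaming import StreamingContext

    # Hypothetical reconstruction of streamexample.py; host and port are assumptions.
    conf = SparkConf().setAppName("streamexample")
    sc = SparkContext(conf=conf)
    ssc = StreamingContext(sc, batchDuration=5)  # 5-second micro-batches

    # Read lines from a TCP socket and count words in each batch.
    lines = ssc.socketTextStream("server1", 9999)
    counts = (lines.flatMap(lambda line: line.split(" "))
                   .map(lambda word: (word, 1))
                   .reduceByKey(lambda a, b: a + b))
    counts.pprint()

    ssc.start()
    ssc.awaitTermination()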

Can anyone help me?

apache-spark pyspark
1 Answer

The error message indicates that the executor failed to send a heartbeat message to the driver within the expected time frame. This timeout is controlled by the spark.executor.heartbeatInterval configuration parameter.

To resolve this error, you can try increasing spark.executor.heartbeatInterval beyond its current value. You can set this configuration parameter in the Spark configuration file (spark-defaults.conf) or pass it as a command-line argument when launching the Spark application, for example:

spark.executor.heartbeatInterval 60s
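
The same setting can also be applied programmatically on the driver's SparkConf before the context is created. A minimal sketch with illustrative values (the property names are real Spark settings; note that spark.executor.heartbeatInterval should stay significantly below spark.network.timeout, which defaults to 120s):

    from pyspark import SparkConf, SparkContext

    # Illustrative values: keep the heartbeat interval well below
    # spark.network.timeout, or heartbeats cannot time out meaningfully.
    conf = (SparkConf()
            .setAppName("streamexample")
            .set("spark.executor.heartbeatInterval", "60s")
            .set("spark.network.timeout", "300s"))
    sc = SparkContext(conf=conf)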
