Stopping local node on Ignite failure - V2.15.0


We are facing an issue with the latest version of Ignite (V2.15.0). We run three nodes, and every day one of them goes down because of the problems below.

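  1. A system worker thread related to the data streamer gets blocked. This seems to be caused by our change from clear() to removeAll(), although our code has no reference to data streamer handlers as such. The exception log is below.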
Stopping local node on Ignite failure: [failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class o.a.i.IgniteException: GridWorker [name=data-streamer-stripe-8, igniteInstanceName=CasperIgniteCluster, finished=false, heartbeatTs=1693369119278]]]
2023-08-31 00:18:39.865 ERROR 8 --- [rIgniteCluster%] ROOT : Critical system error detected. Will be handled accordingly to configured handler [hnd=StopNodeFailureHandler [super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet []]], failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class o.a.i.IgniteException: GridWorker [name=data-streamer-stripe-19, igniteInstanceName=CasperIgniteCluster, finished=false, heartbeatTs=1693455490076]]] org.apache.ignite.IgniteException: GridWorker [name=data-streamer-stripe-19, igniteInstanceName=CasperIgniteCluster, finished=false, heartbeatTs=1693455490076]
   at sun.nio.ch.Net.poll(Native Method) ~[na:1.8.0_322]
   at sun.nio.ch.SocketChannelImpl.poll(SocketChannelImpl.java:953) ~[na:1.8.0_322]
   at sun.nio.ch.SocketAdaptor.connect(SocketAdaptor.java:121) ~[na:1.8.0_322]
   at org.apache.ignite.spi.communication.tcp.internal.GridNioServerWrapper.createNioSession(GridNioServerWrapper.java:462) ~[ignite-core-2.15.0.jar!/:2.15.0]
   at org.apache.ignite.spi.communication.tcp.internal.GridNioServerWrapper.createTcpClient(GridNioServerWrapper.java:693) ~[ignite-core-2.15.0.jar!/:2.15.0]
   at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:1181) ~[ignite-core-2.15.0.jar!/:2.15.0]
   at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$Lambda$1772/1684778436.apply(Unknown Source) ~[na:na]
   at org.apache.ignite.spi.communication.tcp.internal.GridNioServerWrapper.createTcpClient(GridNioServerWrapper.java:691) ~[ignite-core-2.15.0.jar!/:2.15.0]
   at org.apache.ignite.spi.communication.tcp.internal.ConnectionClientPool.createCommunicationClient(ConnectionClientPool.java:442) ~[ignite-core-2.15.0.jar!/:2.15.0]
   at org.apache.ignite.spi.communication.tcp.internal.ConnectionClientPool.reserveClient(ConnectionClientPool.java:231) ~[ignite-core-2.15.0.jar!/:2.15.0]
   at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage0(TcpCommunicationSpi.java:1105) ~[ignite-core-2.15.0.jar!/:2.15.0]
   at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage(TcpCommunicationSpi.java:1052) ~[ignite-core-2.15.0.jar!/:2.15.0]
   at org.apache.ignite.internal.managers.communication.GridIoManager.send(GridIoManager.java:2102) ~[ignite-core-2.15.0.jar!/:2.15.0]
   at org.apache.ignite.internal.managers.communication.GridIoManager.sendToGridTopic(GridIoManager.java:2195) ~[ignite-core-2.15.0.jar!/:2.15.0]
   at org.apache.ignite.internal.processors.cache.GridCacheIoManager.send(GridCacheIoManager.java:1279) ~[ignite-core-2.15.0.jar!/:2.15.0]
   at org.apache.ignite.internal.processors.cache.GridCacheIoManager.send(GridCacheIoManager.java:1318) ~[ignite-core-2.15.0.jar!/:2.15.0]
   at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicAbstractUpdateFuture.sendDhtRequests(GridDhtAtomicAbstractUpdateFuture.java:476) ~[ignite-core-2.15.0.jar!/:2.15.0]
   at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicAbstractUpdateFuture.map(GridDhtAtomicAbstractUpdateFuture.java:433) ~[ignite-core-2.15.0.jar!/:2.15.0]
   at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal0(GridDhtAtomicCache.java:1920) ~[ignite-core-2.15.0.jar!/:2.15.0]
   at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal(GridDhtAtomicCache.java:1688) ~[ignite-core-2.15.0.jar!/:2.15.0]
   at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicAbstractUpdateFuture.sendSingleRequest(GridNearAtomicAbstractUpdateFuture.java:300) ~[ignite-core-2.15.0.jar!/:2.15.0]
   at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicUpdateFuture.map(GridNearAtomicUpdateFuture.java:812) ~[ignite-core-2.15.0.jar!/:2.15.0]
   at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicUpdateFuture.mapOnTopology(GridNearAtomicUpdateFuture.java:664) ~[ignite-core-2.15.0.jar!/:2.15.0]
   at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicAbstractUpdateFuture.map(GridNearAtomicAbstractUpdateFuture.java:249) ~[ignite-core-2.15.0.jar!/:2.15.0]
   at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.removeAllAsync0(GridDhtAtomicCache.java:1356) ~[ignite-core-2.15.0.jar!/:2.15.0]
   at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.removeAll0(GridDhtAtomicCache.java:703) ~[ignite-core-2.15.0.jar!/:2.15.0]
   at org.apache.ignite.internal.processors.cache.GridCacheAdapter.removeAll(GridCacheAdapter.java:3186) ~[ignite-core-2.15.0.jar!/:2.15.0]
   at org.apache.ignite.internal.processors.cache.distributed.near.GridNearAtomicCache.removeAll(GridNearAtomicCache.java:549) ~[ignite-core-2.15.0.jar!/:2.15.0]
   at org.apache.ignite.internal.processors.cache.IgniteCacheProxyImpl.removeAll(IgniteCacheProxyImpl.java:1585) ~[ignite-core-2.15.0.jar!/:2.15.0]
   at org.apache.ignite.internal.processors.cache.GatewayProtectedCacheProxy.removeAll(GatewayProtectedCacheProxy.java:1106) ~[ignite-core-2.15.0.jar!/:2.15.0]
   at org.apache.ignite.internal.processors.datastreamer.DataStreamerCacheUpdaters.updateAll(DataStreamerCacheUpdaters.java:94) ~[ignite-core-2.15.0.jar!/:2.15.0]
   at org.apache.ignite.internal.processors.datastreamer.DataStreamerCacheUpdaters$Batched.receive(DataStreamerCacheUpdaters.java:163) ~[ignite-core-2.15.0.jar!/:2.15.0]
   at org.apache.ignite.internal.processors.datastreamer.DataStreamerUpdateJob.call(DataStreamerUpdateJob.java:144) ~[ignite-core-2.15.0.jar!/:2.15.0]
   at org.apache.ignite.internal.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:7431) ~[ignite-core-2.15.0.jar!/:2.15.0]
   at org.apache.ignite.internal.processors.closure.GridClosureProcessor$2.body(GridClosureProcessor.java:789) ~[ignite-core-2.15.0.jar!/:2.15.0]
   at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:125) ~[ignite-core-2.15.0.jar!/:2.15.0]
   at org.apache.ignite.internal.processors.security.thread.SecurityAwareRunnable.run(SecurityAwareRunnable.java:51) ~[ignite-core-2.15.0.jar!/:2.15.0]
   at org.apache.ignite.internal.util.StripedExecutor$Stripe.body(StripedExecutor.java:637) ~[ignite-core-2.15.0.jar!/:2.15.0]
   at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:125) ~[ignite-core-2.15.0.jar!/:2.15.0]
   at java.lang.Thread.run(Thread.java:750) ~[na:1.8.0_322]
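  2. The second problem seems to be that a system striped pool worker gets blocked. This appears to happen about once a day, and the node shuts down in this case as well. The log is below.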
2023-08-29 03:52:05.423 ERROR 8 --- [rIgniteCluster%] ROOT : Critical system error detected. Will be handled accordingly to configured handler [hnd=StopNodeFailureHandler [super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet []]], failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class o.a.i.IgniteException: GridWorker [name=sys-stripe-24, igniteInstanceName=CasperIgniteCluster, finished=false, heartbeatTs=1693295501788]]] org.apache.ignite.IgniteException: GridWorker [name=sys-stripe-24, igniteInstanceName=CasperIgniteCluster, finished=false, heartbeatTs=1693295501788]
   at sun.nio.ch.Net.poll(Native Method) ~[na:1.8.0_322]
   at sun.nio.ch.SocketChannelImpl.poll(SocketChannelImpl.java:953) ~[na:1.8.0_322]
   at sun.nio.ch.SocketAdaptor.connect(SocketAdaptor.java:121) ~[na:1.8.0_322]
   at org.apache.ignite.spi.communication.tcp.internal.GridNioServerWrapper.createNioSession(GridNioServerWrapper.java:462) ~[ignite-core-2.15.0.jar!/:2.15.0]
   at org.apache.ignite.spi.communication.tcp.internal.GridNioServerWrapper.createTcpClient(GridNioServerWrapper.java:693) ~[ignite-core-2.15.0.jar!/:2.15.0]
   at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:1181) ~[ignite-core-2.15.0.jar!/:2.15.0]
   at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$Lambda$1772/2003383653.apply(Unknown Source) ~[na:na]
   at org.apache.ignite.spi.communication.tcp.internal.GridNioServerWrapper.createTcpClient(GridNioServerWrapper.java:691) ~[ignite-core-2.15.0.jar!/:2.15.0]
   at org.apache.ignite.spi.communication.tcp.internal.ConnectionClientPool.createCommunicationClient(ConnectionClientPool.java:442) ~[ignite-core-2.15.0.jar!/:2.15.0]
   at org.apache.ignite.spi.communication.tcp.internal.ConnectionClientPool.reserveClient(ConnectionClientPool.java:231) ~[ignite-core-2.15.0.jar!/:2.15.0]
   at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage0(TcpCommunicationSpi.java:1105) ~[ignite-core-2.15.0.jar!/:2.15.0]
   at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage(TcpCommunicationSpi.java:1052) ~[ignite-core-2.15.0.jar!/:2.15.0]
   at org.apache.ignite.internal.managers.communication.GridIoManager.send(GridIoManager.java:2102) ~[ignite-core-2.15.0.jar!/:2.15.0]
   at org.apache.ignite.internal.managers.communication.GridIoManager.sendToGridTopic(GridIoManager.java:2195) ~[ignite-core-2.15.0.jar!/:2.15.0]
   at org.apache.ignite.internal.processors.cache.GridCacheIoManager.send(GridCacheIoManager.java:1279) ~[ignite-core-2.15.0.jar!/:2.15.0]
   at org.apache.ignite.internal.processors.cache.GridCacheIoManager.send(GridCacheIoManager.java:1318) ~[ignite-core-2.15.0.jar!/:2.15.0]
   at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicAbstractUpdateFuture.sendDhtRequests(GridDhtAtomicAbstractUpdateFuture.java:476) ~[ignite-core-2.15.0.jar!/:2.15.0]
   at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicAbstractUpdateFuture.map(GridDhtAtomicAbstractUpdateFuture.java:433) ~[ignite-core-2.15.0.jar!/:2.15.0]
   at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal0(GridDhtAtomicCache.java:1920) ~[ignite-core-2.15.0.jar!/:2.15.0]
   at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal(GridDhtAtomicCache.java:1688) ~[ignite-core-2.15.0.jar!/:2.15.0]
   at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.processNearAtomicUpdateRequest(GridDhtAtomicCache.java:3179) ~[ignite-core-2.15.0.jar!/:2.15.0]
   at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.access$200(GridDhtAtomicCache.java:147) ~[ignite-core-2.15.0.jar!/:2.15.0]
   at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$3.apply(GridDhtAtomicCache.java:270) ~[ignite-core-2.15.0.jar!/:2.15.0]
   at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$3.apply(GridDhtAtomicCache.java:265) ~[ignite-core-2.15.0.jar!/:2.15.0]
   at org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:1164) ~[ignite-core-2.15.0.jar!/:2.15.0]
   at org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:605) ~[ignite-core-2.15.0.jar!/:2.15.0]
   at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:406) ~[ignite-core-2.15.0.jar!/:2.15.0]
   at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:324) ~[ignite-core-2.15.0.jar!/:2.15.0]
   at org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$100(GridCacheIoManager.java:112) ~[ignite-core-2.15.0.jar!/:2.15.0]
   at org.apache.ignite.internal.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:314) ~[ignite-core-2.15.0.jar!/:2.15.0]
   at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1907) ~[ignite-core-2.15.0.jar!/:2.15.0]
   at org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1528) ~[ignite-core-2.15.0.jar!/:2.15.0]
   at org.apache.ignite.internal.managers.communication.GridIoManager.access$5300(GridIoManager.java:243) ~[ignite-core-2.15.0.jar!/:2.15.0]
   at org.apache.ignite.internal.managers.communication.GridIoManager$9.execute(GridIoManager.java:1421) ~[ignite-core-2.15.0.jar!/:2.15.0]
   at org.apache.ignite.internal.managers.communication.TraceRunnable.run(TraceRunnable.java:55) ~[ignite-core-2.15.0.jar!/:2.15.0]
   at org.apache.ignite.internal.util.StripedExecutor$Stripe.body(StripedExecutor.java:637) ~[ignite-core-2.15.0.jar!/:2.15.0]
   at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:125) ~[ignite-core-2.15.0.jar!/:2.15.0]
   at java.lang.Thread.run(Thread.java:750) ~[na:1.8.0_322]

As part of upgrading Ignite from v2.11.0 to V2.15.0, we also replaced clear() with removeAll(), because clear() was throwing an error.

ignite
1 Answer

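As far as I can tell, you have configured the node to behave this way on purpose. Judging by the log snippet, it looks like you have set up a custom failure handler.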

[hnd=StopNodeFailureHandler [super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet []]]

I bet you have overridden the default handler with StopNodeFailureHandler and set its ignoredFailureTypes to an empty collection.

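In general, you should have a very solid reason for modifying the default set of ignored failure types. That list was introduced on purpose and contains failures that may merely indicate potential and/or ongoing stability issues. Such a failure does not necessarily mean the node should be terminated immediately.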

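If this part was not introduced on purpose, you need to review your configuration and fix it. See the documentation. Here is a snippet of this configuration.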

<bean class="org.apache.ignite.configuration.IgniteConfiguration">
    <property name="failureHandler">
        <bean class="org.apache.ignite.failure.StopNodeFailureHandler"/>
    </property>
</bean>
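
If you would rather spell the ignored failure types out than rely on the handler's defaults, a variant along these lines may help. This is only a sketch: it assumes that SYSTEM_WORKER_BLOCKED and SYSTEM_CRITICAL_OPERATION_TIMEOUT are the failure types Ignite ignores by default, so please check that against the documentation for your version before using it.

```xml
<bean class="org.apache.ignite.configuration.IgniteConfiguration">
    <property name="failureHandler">
        <bean class="org.apache.ignite.failure.StopNodeFailureHandler">
            <!-- Assumed default set: failures of these types are logged
                 but do not cause the handler to stop the node. -->
            <property name="ignoredFailureTypes">
                <list>
                    <value>SYSTEM_WORKER_BLOCKED</value>
                    <value>SYSTEM_CRITICAL_OPERATION_TIMEOUT</value>
                </list>
            </property>
        </bean>
    </property>
</bean>
```

Keep in mind that ignoring SYSTEM_WORKER_BLOCKED only keeps the node from being stopped. Both stack traces in the question show the worker waiting in SocketAdaptor.connect while opening a communication connection to another node, so the connectivity between the nodes is probably still worth investigating.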