We are running into a problem with the latest version of Ignite (V2.15.0). We run three nodes, and every day one of them goes down because of the issue below.
We changed clear() to removeAll(), but our code contains no reference to a data streamer handler. The exception logs follow.
Stopping local node on Ignite failure: [failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class o.a.i.IgniteException: GridWorker [name=data-streamer-stripe-8, igniteInstanceName=CasperIgniteCluster, finished=false, heartbeatTs=1693369119278]]]
2023-08-31 00:18:39.865 ERROR 8 --- [rIgniteCluster%] ROOT : Critical system error detected. Will be handled accordingly to configured handler [hnd=StopNodeFailureHandler [super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet []]], failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class o.a.i.IgniteException: GridWorker [name=data-streamer-stripe-19, igniteInstanceName=CasperIgniteCluster, finished=false, heartbeatTs=1693455490076]]] org.apache.ignite.IgniteException: GridWorker [name=data-streamer-stripe-19, igniteInstanceName=CasperIgniteCluster, finished=false, heartbeatTs=1693455490076]
at sun.nio.ch.Net.poll(Native Method) ~[na:1.8.0_322]
at sun.nio.ch.SocketChannelImpl.poll(SocketChannelImpl.java:953) ~[na:1.8.0_322]
at sun.nio.ch.SocketAdaptor.connect(SocketAdaptor.java:121) ~[na:1.8.0_322]
at org.apache.ignite.spi.communication.tcp.internal.GridNioServerWrapper.createNioSession(GridNioServerWrapper.java:462) ~[ignite-core-2.15.0.jar!/:2.15.0]
at org.apache.ignite.spi.communication.tcp.internal.GridNioServerWrapper.createTcpClient(GridNioServerWrapper.java:693) ~[ignite-core-2.15.0.jar!/:2.15.0]
at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:1181) ~[ignite-core-2.15.0.jar!/:2.15.0]
at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$Lambda$1772/1684778436.apply(Unknown Source) ~[na:na]
at org.apache.ignite.spi.communication.tcp.internal.GridNioServerWrapper.createTcpClient(GridNioServerWrapper.java:691) ~[ignite-core-2.15.0.jar!/:2.15.0]
at org.apache.ignite.spi.communication.tcp.internal.ConnectionClientPool.createCommunicationClient(ConnectionClientPool.java:442) ~[ignite-core-2.15.0.jar!/:2.15.0]
at org.apache.ignite.spi.communication.tcp.internal.ConnectionClientPool.reserveClient(ConnectionClientPool.java:231) ~[ignite-core-2.15.0.jar!/:2.15.0]
at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage0(TcpCommunicationSpi.java:1105) ~[ignite-core-2.15.0.jar!/:2.15.0]
at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage(TcpCommunicationSpi.java:1052) ~[ignite-core-2.15.0.jar!/:2.15.0]
at org.apache.ignite.internal.managers.communication.GridIoManager.send(GridIoManager.java:2102) ~[ignite-core-2.15.0.jar!/:2.15.0]
at org.apache.ignite.internal.managers.communication.GridIoManager.sendToGridTopic(GridIoManager.java:2195) ~[ignite-core-2.15.0.jar!/:2.15.0]
at org.apache.ignite.internal.processors.cache.GridCacheIoManager.send(GridCacheIoManager.java:1279) ~[ignite-core-2.15.0.jar!/:2.15.0]
at org.apache.ignite.internal.processors.cache.GridCacheIoManager.send(GridCacheIoManager.java:1318) ~[ignite-core-2.15.0.jar!/:2.15.0]
at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicAbstractUpdateFuture.sendDhtRequests(GridDhtAtomicAbstractUpdateFuture.java:476) ~[ignite-core-2.15.0.jar!/:2.15.0]
at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicAbstractUpdateFuture.map(GridDhtAtomicAbstractUpdateFuture.java:433) ~[ignite-core-2.15.0.jar!/:2.15.0]
at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal0(GridDhtAtomicCache.java:1920) ~[ignite-core-2.15.0.jar!/:2.15.0]
at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal(GridDhtAtomicCache.java:1688) ~[ignite-core-2.15.0.jar!/:2.15.0]
at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicAbstractUpdateFuture.sendSingleRequest(GridNearAtomicAbstractUpdateFuture.java:300) ~[ignite-core-2.15.0.jar!/:2.15.0]
at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicUpdateFuture.map(GridNearAtomicUpdateFuture.java:812) ~[ignite-core-2.15.0.jar!/:2.15.0]
at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicUpdateFuture.mapOnTopology(GridNearAtomicUpdateFuture.java:664) ~[ignite-core-2.15.0.jar!/:2.15.0]
at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicAbstractUpdateFuture.map(GridNearAtomicAbstractUpdateFuture.java:249) ~[ignite-core-2.15.0.jar!/:2.15.0]
at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.removeAllAsync0(GridDhtAtomicCache.java:1356) ~[ignite-core-2.15.0.jar!/:2.15.0]
at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.removeAll0(GridDhtAtomicCache.java:703) ~[ignite-core-2.15.0.jar!/:2.15.0]
at org.apache.ignite.internal.processors.cache.GridCacheAdapter.removeAll(GridCacheAdapter.java:3186) ~[ignite-core-2.15.0.jar!/:2.15.0]
at org.apache.ignite.internal.processors.cache.distributed.near.GridNearAtomicCache.removeAll(GridNearAtomicCache.java:549) ~[ignite-core-2.15.0.jar!/:2.15.0]
at org.apache.ignite.internal.processors.cache.IgniteCacheProxyImpl.removeAll(IgniteCacheProxyImpl.java:1585) ~[ignite-core-2.15.0.jar!/:2.15.0]
at org.apache.ignite.internal.processors.cache.GatewayProtectedCacheProxy.removeAll(GatewayProtectedCacheProxy.java:1106) ~[ignite-core-2.15.0.jar!/:2.15.0]
at org.apache.ignite.internal.processors.datastreamer.DataStreamerCacheUpdaters.updateAll(DataStreamerCacheUpdaters.java:94) ~[ignite-core-2.15.0.jar!/:2.15.0]
at org.apache.ignite.internal.processors.datastreamer.DataStreamerCacheUpdaters$Batched.receive(DataStreamerCacheUpdaters.java:163) ~[ignite-core-2.15.0.jar!/:2.15.0]
at org.apache.ignite.internal.processors.datastreamer.DataStreamerUpdateJob.call(DataStreamerUpdateJob.java:144) ~[ignite-core-2.15.0.jar!/:2.15.0]
at org.apache.ignite.internal.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:7431) ~[ignite-core-2.15.0.jar!/:2.15.0]
at org.apache.ignite.internal.processors.closure.GridClosureProcessor$2.body(GridClosureProcessor.java:789) ~[ignite-core-2.15.0.jar!/:2.15.0]
at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:125) ~[ignite-core-2.15.0.jar!/:2.15.0]
at org.apache.ignite.internal.processors.security.thread.SecurityAwareRunnable.run(SecurityAwareRunnable.java:51) ~[ignite-core-2.15.0.jar!/:2.15.0]
at org.apache.ignite.internal.util.StripedExecutor$Stripe.body(StripedExecutor.java:637) ~[ignite-core-2.15.0.jar!/:2.15.0]
at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:125) ~[ignite-core-2.15.0.jar!/:2.15.0]
at java.lang.Thread.run(Thread.java:750) ~[na:1.8.0_322]
2023-08-29 03:52:05.423 ERROR 8 --- [rIgniteCluster%] ROOT : Critical system error detected. Will be handled accordingly to configured handler [hnd=StopNodeFailureHandler [super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet []]], failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class o.a.i.IgniteException: GridWorker [name=sys-stripe-24, igniteInstanceName=CasperIgniteCluster, finished=false, heartbeatTs=1693295501788]]] org.apache.ignite.IgniteException: GridWorker [name=sys-stripe-24, igniteInstanceName=CasperIgniteCluster, finished=false, heartbeatTs=1693295501788]
at sun.nio.ch.Net.poll(Native Method) ~[na:1.8.0_322]
at sun.nio.ch.SocketChannelImpl.poll(SocketChannelImpl.java:953) ~[na:1.8.0_322]
at sun.nio.ch.SocketAdaptor.connect(SocketAdaptor.java:121) ~[na:1.8.0_322]
at org.apache.ignite.spi.communication.tcp.internal.GridNioServerWrapper.createNioSession(GridNioServerWrapper.java:462) ~[ignite-core-2.15.0.jar!/:2.15.0]
at org.apache.ignite.spi.communication.tcp.internal.GridNioServerWrapper.createTcpClient(GridNioServerWrapper.java:693) ~[ignite-core-2.15.0.jar!/:2.15.0]
at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:1181) ~[ignite-core-2.15.0.jar!/:2.15.0]
at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$Lambda$1772/2003383653.apply(Unknown Source) ~[na:na]
at org.apache.ignite.spi.communication.tcp.internal.GridNioServerWrapper.createTcpClient(GridNioServerWrapper.java:691) ~[ignite-core-2.15.0.jar!/:2.15.0]
at org.apache.ignite.spi.communication.tcp.internal.ConnectionClientPool.createCommunicationClient(ConnectionClientPool.java:442) ~[ignite-core-2.15.0.jar!/:2.15.0]
at org.apache.ignite.spi.communication.tcp.internal.ConnectionClientPool.reserveClient(ConnectionClientPool.java:231) ~[ignite-core-2.15.0.jar!/:2.15.0]
at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage0(TcpCommunicationSpi.java:1105) ~[ignite-core-2.15.0.jar!/:2.15.0]
at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage(TcpCommunicationSpi.java:1052) ~[ignite-core-2.15.0.jar!/:2.15.0]
at org.apache.ignite.internal.managers.communication.GridIoManager.send(GridIoManager.java:2102) ~[ignite-core-2.15.0.jar!/:2.15.0]
at org.apache.ignite.internal.managers.communication.GridIoManager.sendToGridTopic(GridIoManager.java:2195) ~[ignite-core-2.15.0.jar!/:2.15.0]
at org.apache.ignite.internal.processors.cache.GridCacheIoManager.send(GridCacheIoManager.java:1279) ~[ignite-core-2.15.0.jar!/:2.15.0]
at org.apache.ignite.internal.processors.cache.GridCacheIoManager.send(GridCacheIoManager.java:1318) ~[ignite-core-2.15.0.jar!/:2.15.0]
at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicAbstractUpdateFuture.sendDhtRequests(GridDhtAtomicAbstractUpdateFuture.java:476) ~[ignite-core-2.15.0.jar!/:2.15.0]
at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicAbstractUpdateFuture.map(GridDhtAtomicAbstractUpdateFuture.java:433) ~[ignite-core-2.15.0.jar!/:2.15.0]
at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal0(GridDhtAtomicCache.java:1920) ~[ignite-core-2.15.0.jar!/:2.15.0]
at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal(GridDhtAtomicCache.java:1688) ~[ignite-core-2.15.0.jar!/:2.15.0]
at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.processNearAtomicUpdateRequest(GridDhtAtomicCache.java:3179) ~[ignite-core-2.15.0.jar!/:2.15.0]
at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.access$200(GridDhtAtomicCache.java:147) ~[ignite-core-2.15.0.jar!/:2.15.0]
at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$3.apply(GridDhtAtomicCache.java:270) ~[ignite-core-2.15.0.jar!/:2.15.0]
at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$3.apply(GridDhtAtomicCache.java:265) ~[ignite-core-2.15.0.jar!/:2.15.0]
at org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:1164) ~[ignite-core-2.15.0.jar!/:2.15.0]
at org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:605) ~[ignite-core-2.15.0.jar!/:2.15.0]
at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:406) ~[ignite-core-2.15.0.jar!/:2.15.0]
at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:324) ~[ignite-core-2.15.0.jar!/:2.15.0]
at org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$100(GridCacheIoManager.java:112) ~[ignite-core-2.15.0.jar!/:2.15.0]
at org.apache.ignite.internal.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:314) ~[ignite-core-2.15.0.jar!/:2.15.0]
at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1907) ~[ignite-core-2.15.0.jar!/:2.15.0]
at org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1528) ~[ignite-core-2.15.0.jar!/:2.15.0]
at org.apache.ignite.internal.managers.communication.GridIoManager.access$5300(GridIoManager.java:243) ~[ignite-core-2.15.0.jar!/:2.15.0]
at org.apache.ignite.internal.managers.communication.GridIoManager$9.execute(GridIoManager.java:1421) ~[ignite-core-2.15.0.jar!/:2.15.0]
at org.apache.ignite.internal.managers.communication.TraceRunnable.run(TraceRunnable.java:55) ~[ignite-core-2.15.0.jar!/:2.15.0]
at org.apache.ignite.internal.util.StripedExecutor$Stripe.body(StripedExecutor.java:637) ~[ignite-core-2.15.0.jar!/:2.15.0]
at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:125) ~[ignite-core-2.15.0.jar!/:2.15.0]
at java.lang.Thread.run(Thread.java:750) ~[na:1.8.0_322]
We upgraded Ignite from v2.11.0 to V2.15.0 and, at the same time, replaced clear() with removeAll(), because clear() was throwing an error.
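As far as I can tell, you configured the node to behave this way on purpose. Judging by the log snippet, you appear to have configured a custom failure handler: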
[hnd=StopNodeFailureHandler [super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet []]]
My bet is that you have overridden the default handler with StopNodeFailureHandler and set ignoredFailureTypes to an empty collection.
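In general, you should have a very solid rationale for modifying the default set of ignored failure types. That list was introduced deliberately: it contains failures that may merely indicate a potential and/or ongoing stability problem, which does not necessarily mean the node should be terminated immediately.
If this was not intentional, review your configuration and fix it. See the documentation. Here is a snippet of this configuration: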
<bean class="org.apache.ignite.configuration.IgniteConfiguration">
    <property name="failureHandler">
        <bean class="org.apache.ignite.failure.StopNodeFailureHandler"/>
    </property>
</bean>
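For comparison, here is a rough sketch of the same handler with the ignored failure types listed explicitly instead of being empty. It assumes that SYSTEM_WORKER_BLOCKED and SYSTEM_CRITICAL_OPERATION_TIMEOUT are the types Ignite ignores by default, and that Spring converts the string values into FailureType constants through the setter's generic type; please verify both against the documentation for your Ignite version before relying on it.

```xml
<bean class="org.apache.ignite.configuration.IgniteConfiguration">
    <property name="failureHandler">
        <bean class="org.apache.ignite.failure.StopNodeFailureHandler">
            <!-- Assumption: these two types match Ignite's default ignored set.
                 With them listed, a blocked system worker is logged but should
                 not stop the node. -->
            <property name="ignoredFailureTypes">
                <set>
                    <value>SYSTEM_WORKER_BLOCKED</value>
                    <value>SYSTEM_CRITICAL_OPERATION_TIMEOUT</value>
                </set>
            </property>
        </bean>
    </property>
</bean>
```

If the log line for the handler then shows a non-empty ignoredFailureTypes set, the setting has taken effect. The blocked worker itself (the thread stuck in SocketAdaptor.connect in the traces above) would still need to be investigated separately.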