My application hangs at startup because of a "deadlock" involving InetAddress.getByName, and it is not clear to me how to fix it.
For some background, the two threads involved are not directly under my control.
The relevant code in the first thread is:
new InetSocketAddress("0.0.0.0", somePort)
And in the second:
static final InetAddress INET6_ANY = InetAddress.getByName(i)
static final InetAddress INET_ANY = InetAddress.getByName("0.0.0.0")
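I have read that using InetAddress can involve some blocking, but why would it hang forever? Especially since we are referring to the local address 0.0.0.0 rather than some remote address.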
The application runs in a container on Kubernetes, in case that helps explain anything.
$ cat /etc/hosts
# Kubernetes-managed hosts file.
127.0.0.1 localhost
::1 localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet
fe00::0 ip6-mcastprefix
fe00::1 ip6-allnodes
fe00::2 ip6-allrouters
10.42.126.184 my-app-756f44d67-tgr5b
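Note that this is not always reproducible, but we have seen it several times recently.
Could this be a "bug" caused by misusing a library in some way? Or am I missing something that has to be defined for this kind of code to work in a Kubernetes context?
For completeness, here are the thread dumps of the two threads.
The blocked one: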
"ZScheduler-Worker-6" #30 daemon prio=5 os_prio=0 cpu=276.23ms elapsed=7541.20s tid=0x00007f7bd54d4880 nid=0x55 waiting for monitor entry [0x00007f7b663f4000]
java.lang.Thread.State: BLOCKED (on object monitor)
at jdk.internal.loader.NativeLibraries.loadLibrary([email protected]/Unknown Source)
- waiting to lock <0x00000000a02e9a90> (a java.util.HashSet)
at jdk.internal.loader.NativeLibraries.loadLibrary([email protected]/Unknown Source)
at jdk.internal.loader.NativeLibraries.findFromPaths([email protected]/Unknown Source)
at jdk.internal.loader.NativeLibraries.loadLibrary([email protected]/Unknown Source)
at jdk.internal.loader.NativeLibraries.loadLibrary([email protected]/Unknown Source)
at jdk.internal.loader.BootLoader.loadLibrary([email protected]/Unknown Source)
at java.net.InetAddress.<clinit>([email protected]/Unknown Source)
at java.net.InetSocketAddress.<init>([email protected]/Unknown Source)
at io.prometheus.metrics.exporter.httpserver.HTTPServer$Builder.makeInetSocketAddress(HTTPServer.java:209)
at io.prometheus.metrics.exporter.httpserver.HTTPServer$Builder.buildAndStart(HTTPServer.java:197)
at io.opentelemetry.exporter.prometheus.PrometheusHttpServer.<init>(PrometheusHttpServer.java:71)
at io.opentelemetry.exporter.prometheus.PrometheusHttpServerBuilder.build(PrometheusHttpServerBuilder.java:68)
at com.myapp.metrics.sdk.PrometheusMetricReader$.$anonfun$startReader$2(PrometheusMetricReader.scala:21)
at com.myapp.metrics.sdk.PrometheusMetricReader$$$Lambda$1109/0x00007f7b7840e078.apply(Unknown Source)
at zio.ZIOCompanionVersionSpecific.$anonfun$attempt$1(ZIOCompanionVersionSpecific.scala:100)
at zio.ZIOCompanionVersionSpecific$$Lambda$430/0x00007f7b782ba000.apply(Unknown Source)
at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:904)
at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:890)
at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:890)
at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:890)
at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:1024)
at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:890)
at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:1024)
at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:967)
at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:1024)
at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:890)
at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:1024)
at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:967)
at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:890)
at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:890)
at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:1024)
at zio.internal.FiberRuntime.evaluateEffect(FiberRuntime.scala:381)
at zio.internal.FiberRuntime.evaluateMessageWhileSuspended(FiberRuntime.scala:504)
at zio.internal.FiberRuntime.drainQueueOnCurrentThread(FiberRuntime.scala:220)
at zio.internal.FiberRuntime.run(FiberRuntime.scala:139)
at zio.internal.ZScheduler$$anon$4.run(ZScheduler.scala:478)
The one holding the lock:
"ZScheduler-Worker-20" #44 daemon prio=5 os_prio=0 cpu=191.27ms elapsed=7541.20s tid=0x00007f7bd54e2e70 nid=0x63 in Object.wait() [0x00007f7b655e3000]
java.lang.Thread.State: RUNNABLE
at io.netty.channel.epoll.LinuxSocket.unsafeInetAddrByName(LinuxSocket.java:364)
- waiting on the Class initialization monitor for java.net.InetAddress
at io.netty.channel.epoll.LinuxSocket.<clinit>(LinuxSocket.java:42)
at jdk.internal.loader.NativeLibraries.load([email protected]/Native Method)
at jdk.internal.loader.NativeLibraries$NativeLibraryImpl.open([email protected]/Unknown Source)
at jdk.internal.loader.NativeLibraries.loadLibrary([email protected]/Unknown Source)
- locked <0x00000000a02e9a90> (a java.util.HashSet)
at jdk.internal.loader.NativeLibraries.loadLibrary([email protected]/Unknown Source)
at java.lang.ClassLoader.loadLibrary([email protected]/Unknown Source)
at java.lang.Runtime.load0([email protected]/Unknown Source)
at java.lang.System.load([email protected]/Unknown Source)
at io.netty.util.internal.NativeLibraryUtil.loadLibrary(NativeLibraryUtil.java:36)
at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0([email protected]/Native Method)
at jdk.internal.reflect.NativeMethodAccessorImpl.invoke([email protected]/Unknown Source)
at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke([email protected]/Unknown Source)
at java.lang.reflect.Method.invoke([email protected]/Unknown Source)
at io.netty.util.internal.NativeLibraryLoader$1.run(NativeLibraryLoader.java:430)
at java.security.AccessController.executePrivileged([email protected]/Unknown Source)
at java.security.AccessController.doPrivileged([email protected]/Unknown Source)
at io.netty.util.internal.NativeLibraryLoader.loadLibraryByHelper(NativeLibraryLoader.java:422)
at io.netty.util.internal.NativeLibraryLoader.loadLibrary(NativeLibraryLoader.java:388)
at io.netty.util.internal.NativeLibraryLoader.load(NativeLibraryLoader.java:218)
at io.netty.channel.epoll.Native.loadNativeLibrary(Native.java:323)
at io.netty.channel.epoll.Native.<clinit>(Native.java:85)
at io.netty.channel.epoll.Epoll.<clinit>(Epoll.java:40)
at zio.http.netty.ChannelFactories$Client$.$anonfun$fromConfig$4(ChannelFactories.scala:83)
at zio.http.netty.ChannelFactories$Client$$$Lambda$966/0x00007f7b783d3e60.apply(Unknown Source)
at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:890)
at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:890)
at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:890)
at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:1024)
at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:967)
at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:890)
at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:890)
at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:1024)
at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:890)
at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:1024)
at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:967)
at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:890)
at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:890)
at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:1024)
at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:1024)
at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:967)
at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:967)
at zio.internal.FiberRuntime.evaluateEffect(FiberRuntime.scala:381)
at zio.internal.FiberRuntime.evaluateMessageWhileSuspended(FiberRuntime.scala:504)
at zio.internal.FiberRuntime.drainQueueOnCurrentThread(FiberRuntime.scala:220)
at zio.internal.FiberRuntime.run(FiberRuntime.scala:139)
at zio.internal.ZScheduler$$anon$4.run(ZScheduler.scala:478)
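From your thread dump, I can see the following:
The class initializer of InetAddress is waiting to load a native library, but cannot because another thread is in the middle of loading a native library. That other thread, in turn, is waiting for the class initialization of InetAddress as part of loading its native library. Specifically, the class initializer of Netty's Native class loads a native library (netty_transport_native_epoll), which in turn makes an upcall into LinuxSocket (or at least initializes it), and that requires InetAddress to be initialized. Each thread holds what the other one needs, so neither can proceed.
So the problem is that Netty uses InetAddress while loading a native library, and that can overlap with the initialization of InetAddress on another thread.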
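The lock ordering described above can be reproduced in isolation. The following is a minimal sketch, not Netty or JDK code: LIBRARY_LOCK is a hypothetical stand-in for the native-library lock seen in the dump (the java.util.HashSet monitor), and the nested Resolver class is a hypothetical stand-in for InetAddress. The two latches only force the interleaving so that the hang happens every time instead of occasionally.

```java
import java.util.concurrent.CountDownLatch;

public final class ClassInitDeadlockDemo {
    // Hypothetical stand-in for the lock held while a native library is loaded.
    static final Object LIBRARY_LOCK = new Object();
    static final CountDownLatch lockHeld = new CountDownLatch(1);
    static final CountDownLatch initStarted = new CountDownLatch(1);

    // Hypothetical stand-in for InetAddress: its static initializer needs the library lock.
    static final class Resolver {
        static {
            initStarted.countDown();       // class initialization has begun on this thread
            synchronized (LIBRARY_LOCK) {  // blocks: the loader thread holds the lock
            }
        }
        static void touch() { }
    }

    // Starts both threads and reports whether they are still stuck after a short wait.
    // Class initialization happens once per JVM, so call this only once.
    static boolean runAndDetectDeadlock() throws InterruptedException {
        Thread initializer = new Thread(() -> {
            try {
                lockHeld.await();          // wait until the loader owns the lock
            } catch (InterruptedException e) {
                return;
            }
            Resolver.touch();              // triggers Resolver.<clinit> on this thread
        }, "initializer");

        Thread loader = new Thread(() -> {
            synchronized (LIBRARY_LOCK) {  // "loading a native library"
                lockHeld.countDown();
                try {
                    initStarted.await();   // wait until <clinit> is in progress elsewhere
                } catch (InterruptedException e) {
                    return;
                }
                Resolver.touch();          // waits for the other thread's <clinit> to finish
            }
        }, "loader");

        // Daemon threads so the JVM can still exit although they never finish.
        initializer.setDaemon(true);
        loader.setDaemon(true);
        initializer.start();
        loader.start();
        initializer.join(1000);
        loader.join(1000);
        return initializer.isAlive() && loader.isAlive();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("deadlocked = " + runAndDetectDeadlock());
    }
}
```

Because one side of the cycle is a class-initialization wait rather than a plain monitor, the waiting thread can show up as RUNNABLE in a thread dump, as it does in the dump from the question.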
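As a workaround, you can make sure that InetAddress is fully initialized before giving Netty a chance to do anything. You can do that by calling InetAddress.getLocalHost(); at the very start of main. That call should run before Netty is used anywhere, and it should initialize InetAddress.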
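A minimal sketch of that workaround follows. The class name InetAddressWarmUp and the helper warmUp are hypothetical. The sketch uses InetAddress.getLoopbackAddress() instead of getLocalHost(): this is a substitution on my part, chosen because any static call triggers the class initializer and this one neither throws a checked exception nor resolves the host name.

```java
import java.net.InetAddress;

public final class InetAddressWarmUp {
    // Hypothetical helper: forces InetAddress.<clinit> to run on the calling thread.
    static InetAddress warmUp() {
        // Any static method call initializes the class; this one does no name lookup.
        return InetAddress.getLoopbackAddress();
    }

    public static void main(String[] args) {
        // Must run while the application is still single-threaded, before
        // anything can touch Netty or start the metrics HTTP server.
        InetAddress loopback = warmUp();
        System.out.println("InetAddress initialized, loopback = " + loopback);

        // ... start the rest of the application from here ...
    }
}
```

In a ZIO application the equivalent call would need to happen before the runtime starts any fibers that might reach Netty or the Prometheus exporter, since the point is to finish the initialization before a second thread can get involved.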
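You could also file a bug report with the Netty team. One fix they could implement is to initialize InetAddress themselves before loading the native library, since the library depends on InetAddress being loaded (or loadable).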