Thread blocked on InetAddress.getByName 0.0.0.0


I'm facing a situation where my application hangs at startup because of what looks like a "deadlock" around InetAddress.getByName, and it's not clear to me how to fix it.

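For some context, neither of the 2 threads involved is directly under my control:

  • 1 thread (BLOCKED) is starting a Prometheus HTTP server
  • 1 thread (RUNNABLE) belongs to the ZIO Http client library and calls into Netty as a client (not as a server)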

The relevant code for the 1st thread is:

new InetSocketAddress("0.0.0.0", somePort)

And for the 2nd:

static final InetAddress INET6_ANY = InetAddress.getByName(i)
static final InetAddress INET_ANY = InetAddress.getByName("0.0.0.0")

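I've read that using InetAddress can involve some blocking and so on, but why would it hang forever? Especially when we are referring to the local address 0.0.0.0 rather than some remote address.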

This application runs in a container on Kubernetes, in case that explains anything.

$ cat /etc/hosts
# Kubernetes-managed hosts file.
127.0.0.1       localhost
::1     localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet
fe00::0 ip6-mcastprefix
fe00::1 ip6-allnodes
fe00::2 ip6-allrouters
10.42.126.184   my-app-756f44d67-tgr5b

Note that this isn't always reproducible, but we have seen it several times recently.

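Could this be a "bug" caused by misusing the libraries in some way? Or am I missing something that has to be defined for this kind of code to work in a Kubernetes context?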


For completeness, here are the thread dumps of the 2 threads.

The blocked one:

"ZScheduler-Worker-6" #30 daemon prio=5 os_prio=0 cpu=276.23ms elapsed=7541.20s tid=0x00007f7bd54d4880 nid=0x55 waiting for monitor entry  [0x00007f7b663f4000]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at jdk.internal.loader.NativeLibraries.loadLibrary([email protected]/Unknown Source)
    - waiting to lock <0x00000000a02e9a90> (a java.util.HashSet)
    at jdk.internal.loader.NativeLibraries.loadLibrary([email protected]/Unknown Source)
    at jdk.internal.loader.NativeLibraries.findFromPaths([email protected]/Unknown Source)
    at jdk.internal.loader.NativeLibraries.loadLibrary([email protected]/Unknown Source)
    at jdk.internal.loader.NativeLibraries.loadLibrary([email protected]/Unknown Source)
    at jdk.internal.loader.BootLoader.loadLibrary([email protected]/Unknown Source)
    at java.net.InetAddress.<clinit>([email protected]/Unknown Source)
    at java.net.InetSocketAddress.<init>([email protected]/Unknown Source)
    at io.prometheus.metrics.exporter.httpserver.HTTPServer$Builder.makeInetSocketAddress(HTTPServer.java:209)
    at io.prometheus.metrics.exporter.httpserver.HTTPServer$Builder.buildAndStart(HTTPServer.java:197)
    at io.opentelemetry.exporter.prometheus.PrometheusHttpServer.<init>(PrometheusHttpServer.java:71)
    at io.opentelemetry.exporter.prometheus.PrometheusHttpServerBuilder.build(PrometheusHttpServerBuilder.java:68)
    at com.myapp.metrics.sdk.PrometheusMetricReader$.$anonfun$startReader$2(PrometheusMetricReader.scala:21)
    at com.myapp.metrics.sdk.PrometheusMetricReader$$$Lambda$1109/0x00007f7b7840e078.apply(Unknown Source)
    at zio.ZIOCompanionVersionSpecific.$anonfun$attempt$1(ZIOCompanionVersionSpecific.scala:100)
    at zio.ZIOCompanionVersionSpecific$$Lambda$430/0x00007f7b782ba000.apply(Unknown Source)
    at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:904)
    at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:890)
    at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:890)
    at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:890)
    at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:1024)
    at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:890)
    at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:1024)
    at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:967)
    at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:1024)
    at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:890)
    at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:1024)
    at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:967)
    at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:890)
    at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:890)
    at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:1024)
    at zio.internal.FiberRuntime.evaluateEffect(FiberRuntime.scala:381)
    at zio.internal.FiberRuntime.evaluateMessageWhileSuspended(FiberRuntime.scala:504)
    at zio.internal.FiberRuntime.drainQueueOnCurrentThread(FiberRuntime.scala:220)
    at zio.internal.FiberRuntime.run(FiberRuntime.scala:139)
    at zio.internal.ZScheduler$$anon$4.run(ZScheduler.scala:478)

The "locking" one:

"ZScheduler-Worker-20" #44 daemon prio=5 os_prio=0 cpu=191.27ms elapsed=7541.20s tid=0x00007f7bd54e2e70 nid=0x63 in Object.wait()  [0x00007f7b655e3000]
   java.lang.Thread.State: RUNNABLE
    at io.netty.channel.epoll.LinuxSocket.unsafeInetAddrByName(LinuxSocket.java:364)
    - waiting on the Class initialization monitor for java.net.InetAddress
    at io.netty.channel.epoll.LinuxSocket.<clinit>(LinuxSocket.java:42)
    at jdk.internal.loader.NativeLibraries.load([email protected]/Native Method)
    at jdk.internal.loader.NativeLibraries$NativeLibraryImpl.open([email protected]/Unknown Source)
    at jdk.internal.loader.NativeLibraries.loadLibrary([email protected]/Unknown Source)
    - locked <0x00000000a02e9a90> (a java.util.HashSet)
    at jdk.internal.loader.NativeLibraries.loadLibrary([email protected]/Unknown Source)
    at java.lang.ClassLoader.loadLibrary([email protected]/Unknown Source)
    at java.lang.Runtime.load0([email protected]/Unknown Source)
    at java.lang.System.load([email protected]/Unknown Source)
    at io.netty.util.internal.NativeLibraryUtil.loadLibrary(NativeLibraryUtil.java:36)
    at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0([email protected]/Native Method)
    at jdk.internal.reflect.NativeMethodAccessorImpl.invoke([email protected]/Unknown Source)
    at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke([email protected]/Unknown Source)
    at java.lang.reflect.Method.invoke([email protected]/Unknown Source)
    at io.netty.util.internal.NativeLibraryLoader$1.run(NativeLibraryLoader.java:430)
    at java.security.AccessController.executePrivileged([email protected]/Unknown Source)
    at java.security.AccessController.doPrivileged([email protected]/Unknown Source)
    at io.netty.util.internal.NativeLibraryLoader.loadLibraryByHelper(NativeLibraryLoader.java:422)
    at io.netty.util.internal.NativeLibraryLoader.loadLibrary(NativeLibraryLoader.java:388)
    at io.netty.util.internal.NativeLibraryLoader.load(NativeLibraryLoader.java:218)
    at io.netty.channel.epoll.Native.loadNativeLibrary(Native.java:323)
    at io.netty.channel.epoll.Native.<clinit>(Native.java:85)
    at io.netty.channel.epoll.Epoll.<clinit>(Epoll.java:40)
    at zio.http.netty.ChannelFactories$Client$.$anonfun$fromConfig$4(ChannelFactories.scala:83)
    at zio.http.netty.ChannelFactories$Client$$$Lambda$966/0x00007f7b783d3e60.apply(Unknown Source)
    at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:890)
    at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:890)
    at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:890)
    at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:1024)
    at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:967)
    at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:890)
    at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:890)
    at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:1024)
    at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:890)
    at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:1024)
    at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:967)
    at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:890)
    at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:890)
    at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:1024)
    at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:1024)
    at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:967)
    at zio.internal.FiberRuntime.runLoop(FiberRuntime.scala:967)
    at zio.internal.FiberRuntime.evaluateEffect(FiberRuntime.scala:381)
    at zio.internal.FiberRuntime.evaluateMessageWhileSuspended(FiberRuntime.scala:504)
    at zio.internal.FiberRuntime.drainQueueOnCurrentThread(FiberRuntime.scala:220)
    at zio.internal.FiberRuntime.run(FiberRuntime.scala:139)
    at zio.internal.ZScheduler$$anon$4.run(ZScheduler.scala:478)
java kubernetes prometheus netty zio-http
1 Answer

What's happening

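From your thread dumps, I can see the following:

  • In the blocked thread, the class initializer of InetAddress is waiting to load a native library, but it cannot proceed because another thread is already loading a native library and holds the lock.
  • The other thread tries to use InetAddress as part of loading a native library. Specifically, the class initializer of Netty's Native loads a native library (netty_transport_native_epoll), which in turn makes an upcall into LinuxSocket (or at least initializes it), and that requires InetAddress to be initialized. Since InetAddress is already being initialized by the first thread, this thread waits for it, and neither thread can make progress.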
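
The lock ordering described in the two bullet points can be reproduced in isolation. The following is only a sketch with stand-in names: LIBRARY_LOCK plays the role of the JDK's internal native-library lock, Address plays the role of InetAddress, and the two latches exist only to force the unlucky interleaving. None of these names come from the JDK or Netty.

```java
import java.util.concurrent.CountDownLatch;

class ClinitDeadlockSketch {
    // Hypothetical stand-in for the lock the JDK holds while loading a native library.
    static final Object LIBRARY_LOCK = new Object();
    // Latches that force the interleaving seen in the thread dumps.
    static final CountDownLatch lockHeld = new CountDownLatch(1);
    static final CountDownLatch initStarted = new CountDownLatch(1);

    // Hypothetical stand-in for InetAddress: its static initializer needs LIBRARY_LOCK.
    static class Address {
        static {
            initStarted.countDown();
            awaitQuietly(lockHeld);
            synchronized (LIBRARY_LOCK) {
                // Never reached: the other thread holds LIBRARY_LOCK and is
                // itself waiting for this initializer to finish.
            }
        }

        static void touch() { }
    }

    static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    static void joinQuietly(Thread thread, long millis) {
        try {
            thread.join(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Returns true if both threads are still stuck after the timeouts. */
    static boolean deadlocks() {
        // Like the Prometheus thread: first to touch the class, so it runs the initializer.
        Thread serverLike = new Thread(() -> Address.touch(), "server-like");
        // Like the Netty thread: holds the library lock, then needs the class.
        Thread clientLike = new Thread(() -> {
            synchronized (LIBRARY_LOCK) {
                lockHeld.countDown();
                awaitQuietly(initStarted);
                Address.touch(); // waits for the class initialization to complete
            }
        }, "client-like");
        serverLike.setDaemon(true); // daemon threads so the JVM can still exit
        clientLike.setDaemon(true);
        serverLike.start();
        clientLike.start();
        joinQuietly(serverLike, 500);
        joinQuietly(clientLike, 500);
        return serverLike.isAlive() && clientLike.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("deadlocked: " + deadlocks());
    }
}
```

Without the latches the outcome depends on timing, which would be consistent with the hang only showing up occasionally.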

So the problem is that Netty uses InetAddress while loading a native library, which can happen while InetAddress is still being initialized.

Workaround

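You can make sure InetAddress is fully initialized before giving Netty a chance to do anything. You can do that by running InetAddress.getLocalHost(); at the start of main. That should run before Netty is used anywhere, and it should initialize InetAddress.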
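
As a rough sketch of that workaround, assuming you control the entry point of the application: the class name EagerInetAddressInit and the helper initInetAddress are made up for illustration, and only InetAddress.getLocalHost() is the call suggested above.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

class EagerInetAddressInit {

    /**
     * Forces java.net.InetAddress through its static initializer on the
     * calling thread, before any other thread can race to initialize it.
     */
    static boolean initInetAddress() {
        try {
            InetAddress.getLocalHost();
        } catch (UnknownHostException e) {
            // The lookup result is irrelevant here. The class initializer
            // runs before the method body, so it has completed even if the
            // lookup itself fails.
        }
        return true;
    }

    public static void main(String[] args) {
        initInetAddress();
        // Only now start anything that may touch Netty or the Prometheus server.
        System.out.println("InetAddress initialized");
    }
}
```

In a ZIO application the same call would have to run before the first effect that touches zio-http or the metrics reader. Where exactly that is depends on how the application is wired, so treat the placement as something to verify.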

Actually fixing the problem

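You could file a bug report with the Netty team. One fix they could implement is to initialize InetAddress themselves before loading the native library (which depends on it being loaded/loadable).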
