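While connecting to a Windows machine as a slave, I get the following error. I think it is a network-related issue, but I need some help on where to start looking or what the possible solutions might be.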
INFO: Terminated
Aug 01, 2017 10:15:54 PM hudson.remoting.JarCacheSupport$1 run
WARNING: Failed to resolve a jar 06bcb4519543f5ec83cf9d6da9f6cfbe
java.io.IOException: Failed to write to C:\Users\Administrator\.jenkins\cache\jars\06\BCB4519543F5EC83CF9D6DA9F6CFBE.jar
at hudson.remoting.FileSystemJarCache.retrieve(FileSystemJarCache.java:133)
at hudson.remoting.JarCacheSupport$1.run(JarCacheSupport.java:64)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:483)
at java.util.concurrent.FutureTask.run(FutureTask.java:274)
at hudson.remoting.AtmostOneThreadExecutor$Worker.run(AtmostOneThreadExecutor.java:110)
at java.lang.Thread.run(Thread.java:809)
Caused by: java.io.IOException: Backing channel 'JNLP4-connect connection to dr2r4m1p21/172.20.238.41:9001' is disconnected.
at hudson.remoting.RemoteInvocationHandler.channelOrFail(RemoteInvocationHandler.java:192)
at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:257)
at com.sun.proxy.$Proxy4.writeJarTo(Unknown Source)
at hudson.remoting.FileSystemJarCache.retrieve(FileSystemJarCache.java:98)
... 5 more
Caused by: java.nio.channels.ClosedChannelException
at org.jenkinsci.remoting.protocol.impl.ChannelApplicationLayer.onReadClosed(ChannelApplicationLayer.java:208)
at org.jenkinsci.remoting.protocol.ApplicationLayer.onRecvClosed(ApplicationLayer.java:222)
at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.onRecvClosed(ProtocolStack.java:832)
at org.jenkinsci.remoting.protocol.FilterLayer.onRecvClosed(FilterLayer.java:287)
at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.onRecvClosed(SSLEngineFilterLayer.java:181)
at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.switchToNoSecure(SSLEngineFilterLayer.java:283)
at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.processWrite(SSLEngineFilterLayer.java:503)
at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.processQueuedWrites(SSLEngineFilterLayer.java:248)
at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.doSend(SSLEngineFilterLayer.java:200)
at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.onRecvClosed(SSLEngineFilterLayer.java:166)
at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.onRecvClosed(ProtocolStack.java:832)
at org.jenkinsci.remoting.protocol.NetworkLayer.onRecvClosed(NetworkLayer.java:154)
at org.jenkinsci.remoting.protocol.impl.BIONetworkLayer.access$1500(BIONetworkLayer.java:48)
at org.jenkinsci.remoting.protocol.impl.BIONetworkLayer$Reader.run(BIONetworkLayer.java:247)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1157)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:627)
at hudson.remoting.Engine$1$1.run(Engine.java:94)
... 1 more
The stack trace above is from the slave (Windows) machine. My Jenkins master runs on RHEL, and there I can see the following stack trace.
INFO: Accepted JNLP4-connect connection #113 from /172.20.238.31:60363
Aug 01, 2017 12:45:55 PM jenkins.slaves.DefaultJnlpSlaveReceiver channelClosed
WARNING: Computer.threadPoolForRemoting [#42] for Build_Agent terminated
java.nio.channels.ClosedChannelException
at org.jenkinsci.remoting.protocol.impl.ChannelApplicationLayer.onReadClosed(ChannelApplicationLayer.java:208)
at org.jenkinsci.remoting.protocol.ApplicationLayer.onRecvClosed(ApplicationLayer.java:222)
at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.onRecvClosed(ProtocolStack.java:832)
at org.jenkinsci.remoting.protocol.FilterLayer.onRecvClosed(FilterLayer.java:287)
at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.onRecvClosed(SSLEngineFilterLayer.java:181)
at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.switchToNoSecure(SSLEngineFilterLayer.java:283)
at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.processWrite(SSLEngineFilterLayer.java:503)
at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.processQueuedWrites(SSLEngineFilterLayer.java:248)
at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.doSend(SSLEngineFilterLayer.java:200)
at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.doCloseSend(SSLEngineFilterLayer.java:213)
at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.doCloseSend(ProtocolStack.java:800)
at org.jenkinsci.remoting.protocol.ApplicationLayer.doCloseWrite(ApplicationLayer.java:173)
at org.jenkinsci.remoting.protocol.impl.ChannelApplicationLayer$ByteBufferCommandTransport.closeWrite(ChannelApplicationLayer.java:311)
at hudson.remoting.Channel.close(Channel.java:1295)
at hudson.remoting.Channel.close(Channel.java:1263)
at jenkins.slaves.DefaultJnlpSlaveReceiver.afterChannel(DefaultJnlpSlaveReceiver.java:173)
at org.jenkinsci.remoting.engine.JnlpConnectionState$4.invoke(JnlpConnectionState.java:421)
at org.jenkinsci.remoting.engine.JnlpConnectionState.fire(JnlpConnectionState.java:312)
at org.jenkinsci.remoting.engine.JnlpConnectionState.fireAfterChannel(JnlpConnectionState.java:418)
at org.jenkinsci.remoting.engine.JnlpProtocol4Handler$Handler$1.run(JnlpProtocol4Handler.java:334)
at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
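I ran into a similar error to the OP, where the connection to my slaves kept dropping. The root cause was not a Java version mismatch between the Jenkins slave and master hosts.
Solution: if you are running Jenkins in an EC2 instance on AWS behind an Elastic Load Balancer (ELB), increase the "Idle timeout" value under the "Attributes" section from the default of 60 seconds. I set the new value to 600 and the errors no longer occurred.
It appears that if a single command in your build process takes longer than 60 seconds without producing any log output, the ELB terminates the session due to inactivity.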
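If you would rather script this change than click through the console, something like the following should work. This is only a sketch: it assumes a Classic ELB and the boto3 library, and `my-jenkins-elb` is a placeholder name, not taken from any setup described here.

```python
def idle_timeout_attributes(seconds):
    """Build the attribute payload that sets a Classic ELB's idle timeout."""
    if not isinstance(seconds, int) or seconds < 1:
        raise ValueError("idle timeout must be a positive whole number of seconds")
    return {"ConnectionSettings": {"IdleTimeout": seconds}}


def apply_idle_timeout(load_balancer_name, seconds=600):
    """Apply the timeout to the named Classic ELB (needs AWS credentials)."""
    # boto3 is imported lazily so the payload helper above works without it.
    import boto3

    elb = boto3.client("elb")  # "elb" is the Classic Load Balancer API
    elb.modify_load_balancer_attributes(
        LoadBalancerName=load_balancer_name,
        LoadBalancerAttributes=idle_timeout_attributes(seconds),
    )
```

Calling `apply_idle_timeout("my-jenkins-elb", 600)` would raise the timeout from the default of 60 seconds to the 600 seconds used above.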
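I had the same issue. I found that the Windows slave switches to "sleep" mode if your job is not running against the GUI.
I then managed to resolve it. On a Windows 7 slave, this is what I did:
After this procedure it should be fine.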
In addition to the error log in the post, I also got an error log in the jenkins directory on the slave (for me it is C:\jenkins\jenkins-slave.err.log):
The JNLP file http://jenkins.domain.com/computer/my_slave_name/slave-agent.jnlp?encrypt=true has invalid arguments: [######################################, my_slave_name, -workDir, c:\jenkins, -internalDir, remoting, -url, http://jenkins.domain.com/, -headless, -jar-cache, C:\Users\Administrator.jenkins\cache\jars] Most likely a configuration error in the master "-workDir" is not a valid option
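My solution:
1) Windows slave level: close the Services console in the GUI for all users. This is a must. For some reason, Microsoft locks the installation/removal of Windows services while it is open.
2) Windows slave level: kill all java and jenkins-slave processes (if any exist).
3) Windows slave level: delete the Jenkins slave service (if it exists) from cmd: sc delete jenkinsslave-c__jenkins /force (the service name in my case).
4) Windows slave level: verify that you have Java 8 installed (I am using jdk1.8.0_151). Uninstall all older Java versions.
5) Jenkins master UI level: under the slave's Configure page, change the way Jenkins connects to the slave. Launch method: "Let Jenkins control this Windows slave as a Windows service" (instead of "Launch agent via Java Web Start").
6) AWS level: increase the AWS ELB idle timeout to 600 (from 60), as @njtman suggested.
7) Jenkins master UI level: relaunch the agent in Jenkins and wait a few minutes.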
My environment:
Jenkins: 2.89.2, OS: Windows 2012 R2, Java: jdk1.8.0_151
Hmm... for me, the following fixed it:
Mark the node as "temporarily offline" and then bring it back "online"
Reconnect
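OK, here is how I solved my particular case:
I have some libvirt/QEMU VMs running as slaves. Because the libvirt-plugin was not reliable for me, I started these VMs myself. I had asked myself: "Why does this libvirt-plugin have a mandatory delay time?" ... impatient ...
So if a libvirt client (slave) says hello to Jenkins, you should wait a while after startup to let the poor guy catch its breath.
The slave is Win7; the host is Ubuntu 18.04.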
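One way to build that pause into a startup script is sketched below. It is not what the libvirt-plugin does: it simply sleeps for a fixed grace period and then polls until the master's agent port accepts TCP connections. The function names, the 60-second default, and the host/port arguments are all assumptions for illustration.

```python
import socket
import time


def wait_for_port(host, port, timeout=120.0, interval=2.0):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Not reachable yet: give up after the deadline, otherwise retry.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


def ready_to_launch(host, port, settle_seconds=60.0, timeout=120.0):
    """Give a freshly booted VM a grace period, then check the master is reachable."""
    time.sleep(settle_seconds)  # hypothetical settle delay after boot
    return wait_for_port(host, port, timeout=timeout)
```

The agent would be started only after `ready_to_launch(...)` returns True; the right grace period depends on how long the VM takes to finish booting.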
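On Windows, I realized that I needed to add the "-noCertificateCheck" option to the arguments in jenkins-slave.xml in the workdir. We use a certificate from our internal PKI on the master, and this was the easiest way to solve it (everything is on the internal network).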
<arguments>-Xrs -jar "%BASE%\slave.jar" -jnlpUrl https://jenkins.ourdomain.com/computer/Windows%20build%20server%20-%20Bare%20metal/slave-agent.jnlp -secret abc -noCertificateCheck</arguments>
I identified this by running the agent manually from the command prompt:
java -jar agent.jar -jnlpUrl https://jenkins.ourdomain.com/computer/Windows%20build%20server%20-%20Bare%20metal/slave-agent.jnlp -secret abc -workDir "D:\agentroot" -noCertificateCheck
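If the flag has to be added on several Windows slaves, the edit to jenkins-slave.xml could be scripted. The sketch below is a rough illustration, assuming the file contains a single `<arguments>` element like the one shown above; the function name `add_agent_flag` is made up for this example. Note that `xml.etree.ElementTree` drops XML comments when it rewrites the file.

```python
import xml.etree.ElementTree as ET


def add_agent_flag(xml_text, flag="-noCertificateCheck"):
    """Return the service XML with `flag` appended to <arguments> if it is missing."""
    root = ET.fromstring(xml_text)
    # iter() also matches the root element itself, so a bare <arguments> works too.
    args = next(root.iter("arguments"), None)
    if args is None:
        raise ValueError("no <arguments> element found")
    current = (args.text or "").strip()
    if flag not in current.split():
        args.text = f"{current} {flag}".strip()
    return ET.tostring(root, encoding="unicode")
```

Running it a second time on its own output leaves the arguments unchanged, so it is safe to apply repeatedly.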
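user2015131's suggestion inspired me to find a solution to this problem.
Let me explain my situation, since it may apply to some of you:
The Jenkins service code stored on the slaves was outdated.
Follow these steps on each slave:
Hope this helps!
I was facing the same issue and fixed it with the following steps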