如何重试 bash 命令,直到其状态正常或达到超时?
我最好的镜头(我正在寻找更简单的东西):
NEXT_WAIT_TIME=0
COMMAND_STATUS=1
until [ $COMMAND_STATUS -eq 0 || $NEXT_WAIT_TIME -eq 4 ]; do
command
COMMAND_STATUS=$?
sleep $NEXT_WAIT_TIME
let NEXT_WAIT_TIME=NEXT_WAIT_TIME+1
done
您可以通过将
command
放在测试中并以不同的方式进行增量来稍微简化事情。否则脚本看起来不错:
NEXT_WAIT_TIME=0
until (( NEXT_WAIT_TIME == 5 )) || cmd; do
sleep "$(( NEXT_WAIT_TIME++ ))"
done
(( NEXT_WAIT_TIME < 5 ))
一行最短,也许是最好的方法:
timeout 12h bash -c 'until ssh root@mynewvm; do sleep 10; done'
提供
此脚本可以从 retrying-commands-in-shell-scripts
下载#!/bin/bash
# Retries a command on failure.
# $1 - the max number of attempts
# $2... - the command to run
retry() {
local -r -i max_attempts="$1"; shift
local -r cmd="$@"
local -i attempt_num=1
until $cmd
do
if (( attempt_num == max_attempts ))
then
echo "Attempt $attempt_num failed and there are no more attempts left!"
return 1
else
echo "Attempt $attempt_num failed! Trying again in $attempt_num seconds..."
sleep $(( attempt_num++ ))
fi
done
}
用法示例:
retry 5 ls -ltr foo
如果您想重试脚本中的某个函数,请按如下所示使用它:
# example usage:
foo()
{
#whatever you want do.
}
declare -fxr foo
retry 3 timeout 60 bash -ce 'foo'
组装一些工具。
重试:https://github.com/kadwanev/retry
超时:http://manpages.courier-mta.org/htmlman1/timeout.1.html
那就看看神奇吧
retry timeout 3 ping google.com
PING google.com (173.194.123.97): 56 data bytes
64 bytes from 173.194.123.97: icmp_seq=0 ttl=55 time=13.982 ms
64 bytes from 173.194.123.97: icmp_seq=1 ttl=55 time=44.857 ms
64 bytes from 173.194.123.97: icmp_seq=2 ttl=55 time=64.187 ms
Before retry #1: sleeping 0.3 seconds
PING google.com (173.194.123.103): 56 data bytes
64 bytes from 173.194.123.103: icmp_seq=0 ttl=55 time=56.549 ms
64 bytes from 173.194.123.103: icmp_seq=1 ttl=55 time=60.220 ms
64 bytes from 173.194.123.103: icmp_seq=2 ttl=55 time=8.872 ms
Before retry #2: sleeping 0.6 seconds
PING google.com (173.194.123.103): 56 data bytes
64 bytes from 173.194.123.103: icmp_seq=0 ttl=55 time=25.819 ms
64 bytes from 173.194.123.103: icmp_seq=1 ttl=55 time=16.382 ms
64 bytes from 173.194.123.103: icmp_seq=2 ttl=55 time=3.224 ms
Before retry #3: sleeping 1.2 seconds
PING google.com (173.194.123.103): 56 data bytes
64 bytes from 173.194.123.103: icmp_seq=0 ttl=55 time=58.438 ms
64 bytes from 173.194.123.103: icmp_seq=1 ttl=55 time=94.828 ms
64 bytes from 173.194.123.103: icmp_seq=2 ttl=55 time=61.075 ms
Before retry #4: sleeping 2.4 seconds
PING google.com (173.194.123.103): 56 data bytes
64 bytes from 173.194.123.103: icmp_seq=0 ttl=55 time=43.361 ms
64 bytes from 173.194.123.103: icmp_seq=1 ttl=55 time=32.171 ms
...
检查最终通过/失败的退出状态。
对于任何想要真正等待一段时间的人来说,考虑你的命令的时间可能很重要:
TIMEOUT_SEC=180
start_time="$(date -u +%s)"
while [ condition_or_just_true ]; do
current_time="$(date -u +%s)"
elapsed_seconds=$(($current_time-$start_time))
if [ $elapsed_seconds -gt $TIMEOUT_SEC ]; then
echo "timeout of $TIMEOUT_SEC sec"
exit 1
fi
echo "another attempt (elapsed $elapsed_seconds sec)"
some_command_and_maybe_sleep
done
我发现这可以满足我的需求:
function wait_for_success() {
local timeout start_time end_time
timeout=${TIMEOUT:-60}
interval=${INTERVAL:-2}
start_time=$(date +%s)
end_time=$((start_time + timeout))
while [ $(date +%s) -lt $end_time ]; do
if $@; then
return 0
fi
sleep $interval
done
>&2 echo "Timeout exceeded."
return 1
}
我对 this 答案做了一些调整,让您可以打开是否达到超时,或者命令是否成功。另外,在此版本中每秒重试一次:
ELAPSED=0
started=$(mktemp)
echo "False" > $started
until the_command_here && echo "True" > $started || [ $ELAPSED -eq 30 ]
do
sleep 1
(( ELAPSED++ ))
done
if [[ $(cat $started) == "True" ]]
then
echo "the command completed after $ELAPSED seconds"
else
echo "timed out after $ELAPSED seconds"
exit 111
fi