I'm trying to filter out log entries that are more than 7 days older than the newest one. For example, if the last entry in the log is dated 2024-02-13, then every entry from 2024-02-05 should be deleted.
Sample log file:
* Server Name: png9iwb4a
* Date and Time: 2024-02-05 23:00:01
* Total Number of Child (PID): 5
* PID: 117703, Worker Threads: 46
* PID: 117704, Worker Threads: 30
* PID: 117705, Worker Threads: 30
* PID: 117809, Worker Threads: 30
* Server Name: png9iwb4a
* Date and Time: 2024-02-05 23:30:01
* Total Number of Child (PID): 5
* PID: 117703, Worker Threads: 46
* PID: 117704, Worker Threads: 30
* PID: 117705, Worker Threads: 30
* PID: 117809, Worker Threads: 30
* Server Name: png9iwb4a
* Date and Time: 2024-02-06 00:00:01
* Total Number of Child (PID): 5
* PID: 117703, Worker Threads: 46
* PID: 117704, Worker Threads: 30
* PID: 117705, Worker Threads: 30
* PID: 117809, Worker Threads: 30
.
.
.
* Server Name: png9iwb4a
* Date and Time: 2024-02-13 23:00:01
* Total Number of Child (PID): 5
* PID: 117703, Worker Threads: 46
* PID: 117704, Worker Threads: 30
* PID: 117705, Worker Threads: 30
* PID: 117809, Worker Threads: 30
* Server Name: png9iwb4a
* Date and Time: 2024-02-13 23:30:01
* Total Number of Child (PID): 5
* PID: 117703, Worker Threads: 46
* PID: 117704, Worker Threads: 30
* PID: 117705, Worker Threads: 30
* PID: 117809, Worker Threads: 30
Expected output:
* Server Name: png9iwb4a
* Date and Time: 2024-02-06 00:00:01
* Total Number of Child (PID): 5
* PID: 117703, Worker Threads: 46
* PID: 117704, Worker Threads: 30
* PID: 117705, Worker Threads: 30
* PID: 117809, Worker Threads: 30
.
.
.
* Server Name: png9iwb4a
* Date and Time: 2024-02-13 23:00:01
* Total Number of Child (PID): 5
* PID: 117703, Worker Threads: 46
* PID: 117704, Worker Threads: 30
* PID: 117705, Worker Threads: 30
* PID: 117809, Worker Threads: 30
* Server Name: png9iwb4a
* Date and Time: 2024-02-13 23:30:01
* Total Number of Child (PID): 5
* PID: 117703, Worker Threads: 46
* PID: 117704, Worker Threads: 30
* PID: 117705, Worker Threads: 30
* PID: 117809, Worker Threads: 30
I did try an awk command, but it doesn't do what I want; it just wipes the entire log file.
My code is below:
#!/bin/bash
LOG_FILE="logs/worker_count.log"
TMP_FILE="$LOG_FILE.tmp"
# Extract the newest and oldest dates from the log file
newest_date=$(grep -oP 'Date and Time: \K[^\n]+' "$LOG_FILE" | tail -n 1)
oldest_date=$(grep -oP 'Date and Time: \K[^\n]+' "$LOG_FILE" | head -n 1)
# Check if either date is empty
if [ -z "$newest_date" ] || [ -z "$oldest_date" ]; then
    echo "Error: Unable to extract dates from the log file."
    exit 1
fi
# Convert dates to timestamps for comparison
newest_timestamp=$(date -d "$newest_date" +"%s")
oldest_timestamp=$(date -d "$oldest_date" +"%s")
# Calculate the difference in seconds
time_difference=$((newest_timestamp - oldest_timestamp))
# If the difference is greater than or equal to 7 days (604800 seconds)
if [ "$time_difference" -ge 604800 ]; then
    # Calculate the cutoff date based on the newest date minus 7 days
    cutoff_date=$(date -d "@$((newest_timestamp - 604800))" +"%Y-%m-%d %T")
    # Extract entries within the specified date range and remove old entries
    awk -v cutoff="$cutoff_date" '/^(\* Server Name:|\* Date and Time:)/ {
        server_name = $NF
        getline datetime
        if (datetime >= cutoff) {
            print "* Server Name: " server_name
            print datetime
            for (i = 1; i <= 5; i++) {
                getline line
                print line
            }
        }
    }' "$LOG_FILE" > "$TMP_FILE"
    # Replace the original log file with the filtered entries
    #mv "$TMP_FILE" "$LOG_FILE"
else
    echo "No need to remove old entries. Time difference is less than 7 days."
fi
Can anyone help me?
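Incidentally, a likely reason the script above empties the file: getline datetime keeps the entire "* Date and Time: ..." line, leading asterisk included, and in a string comparison "*" sorts before any digit, so the test against the cutoff never succeeds. A one-liner demonstrates this (the cutoff value here is just an example):

```shell
# The awk block compares the whole "* Date and Time: ..." line against a
# bare timestamp; "*" sorts before "2", so the test is always false and
# nothing is ever printed to the tmp file.
echo '* Date and Time: 2024-02-06 00:00:01' |
awk -v cutoff="2024-02-06 16:00:01" '{ print ($0 >= cutoff ? "keep" : "drop") }'
# prints "drop" for every entry
```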
If you have tac, it may be more efficient to read the file backwards:
tac "$LOG_FILE" |
awk -v RS= '
    /Date and Time/ {
        if (!cutoff)
            "date --date \""$(NF-4)" "$(NF-5)" - 7 days\" +%F%T" | getline cutoff
        else
            if ($(NF-5)$(NF-4) < cutoff) exit
    }
    { print ORS $0 }
' |
tac
Note that this relies on GNU date for the relative "- 7 days" arithmetic, and on awk's paragraph mode (RS=), which treats blank-line-separated blocks as single records, so it assumes a blank line between entries. The +%F%T format deliberately contains no space, so the cutoff compares correctly against the concatenated date and time in $(NF-5)$(NF-4).
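If the real log has no blank lines between entries, as in the sample shown in the question, paragraph mode cannot split the records. A line-based two-pass sketch is possible instead; this assumes GNU date and the fixed field layout above, and the file path and shortened demo data are illustrative only:

```shell
#!/bin/bash
# Line-based variant for logs with NO blank line between entries.
# Assumes GNU date; the path and abbreviated demo data are illustrative.
LOG_FILE="worker_count.log"

cat > "$LOG_FILE" <<'EOF'
* Server Name: png9iwb4a
* Date and Time: 2024-02-05 23:30:01
* Total Number of Child (PID): 1
* PID: 117703, Worker Threads: 46
* Server Name: png9iwb4a
* Date and Time: 2024-02-13 23:30:01
* Total Number of Child (PID): 1
* PID: 117703, Worker Threads: 46
EOF

# Pass 1: the newest date is on the last "Date and Time" line.
newest=$(awk '/Date and Time:/ { d = $(NF-1) } END { print d }' "$LOG_FILE")
cutoff=$(date -d "$newest 7 days ago" +%F)   # date-only cutoff: 2024-02-06

# Pass 2: buffer each entry from its "Server Name" line; once the
# entry's date is known, print or drop the whole entry.
awk -v cutoff="$cutoff" '
    /Server Name:/   { buf = $0 ORS; next }
    /Date and Time:/ { keep = ($(NF-1) >= cutoff)
                       if (keep) printf "%s%s", buf, $0 ORS
                       next }
    keep
' "$LOG_FILE"
```

Comparing only the date part (rather than the full timestamp) matches the expected output above, where every entry from 2024-02-06 is kept even though the newest entry is at 23:30:01.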