I'm running a Lambda function that queries ALB access logs and sends the output to an S3 bucket. Two queries are executed:
Daily Logs
Monthly Logs
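I created a bucket with two folders, DailyLogs and MonthlyLogs, so that when the Lambda function runs, the logs are stored in their respective folders.
I also added a feature that deletes the previous CSV files in the log folders and replaces them with the new logs as they are generated. However, when the Lambda function runs, the DailyLogs and MonthlyLogs folders themselves get deleted.
I only want to delete the contents of the folders and replace them with the new logs.
Can you help me?
The Lambda code is attached.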
import boto3
import json
import time
from datetime import datetime

# Query string to execute
daily_query = "SELECT * FROM \"DATABASE\".\"TABLE\" WHERE user_agent LIKE '%test%' AND date_parse(time, '%Y-%m-%dT%H:%i:%s.%fZ') >= date_parse(date_format(date_add('day', -1, current_date), '%Y-%m-%d'), '%Y-%m-%d') AND date_parse(time, '%Y-%m-%dT%H:%i:%s.%fZ') < date_parse(date_format(current_date, '%Y-%m-%d'), '%Y-%m-%d') ORDER BY time ASC"
monthly_query = "SELECT * FROM \"DATABASE\".\"TABLE\" WHERE parse_datetime(time,'yyyy-MM-dd''T''HH:mm:ss.SSSSSS''Z') BETWEEN parse_datetime(CAST(date_trunc('month', current_date) AS varchar), 'yyyy-MM-dd') AND parse_datetime(CAST(current_date AS varchar), 'yyyy-MM-dd') AND user_agent LIKE '%test%' ORDER BY time ASC"

# Database to execute the query against
DATABASE = 'DATABASE'

# Output bucket
bucket_name = 'BUCKET_NAME'

# Initialize Boto3 clients
s3_client = boto3.client('s3')
athena_client = boto3.client('athena')

def lambda_handler(event, context):
    try:
        # Get current date
        current_date = datetime.now()

        # Create folder names for daily and monthly logs
        daily_folder = f"DailyLogs/{current_date.strftime('%Y-%m-%d')}/"
        monthly_folder = f"MonthlyLogs/{current_date.strftime('%Y-%m')}/"

        # Delete existing files in the S3 bucket
        delete_daily_files(daily_folder)
        delete_monthly_files(monthly_folder)

        # Start the query executions
        response = athena_client.start_query_execution(
            QueryString=daily_query,
            QueryExecutionContext={'Database': DATABASE},
            ResultConfiguration={'OutputLocation': f's3://{bucket_name}/{daily_folder}'}
        )
        response = athena_client.start_query_execution(
            QueryString=monthly_query,
            QueryExecutionContext={'Database': DATABASE},
            ResultConfiguration={'OutputLocation': f's3://{bucket_name}/{monthly_folder}'}
        )
        return response
    except Exception as e:
        print(f"An error occurred: {str(e)}")
        return {'statusCode': 500, 'body': json.dumps({'error': str(e)})}

def delete_daily_files(folder):
    response = s3_client.list_objects_v2(Bucket=bucket_name, Prefix=folder)
    if 'Contents' in response:
        keys_to_delete = [{'Key': obj['Key']} for obj in response['Contents']]
        if keys_to_delete:
            s3_client.delete_objects(Bucket=bucket_name, Delete={'Objects': keys_to_delete})

def delete_monthly_files(folder):
    response = s3_client.list_objects_v2(Bucket=bucket_name, Prefix=folder)
    if 'Contents' in response:
        keys_to_delete = [{'Key': obj['Key']} for obj in response['Contents']]
        if keys_to_delete:
            s3_client.delete_objects(Bucket=bucket_name, Delete={'Objects': keys_to_delete})
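See the attached code. Instead of deleting the files, it deletes the entire folder.
You can tell whether an object is a folder by checking whether its key ends with a slash.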
if not obj['Key'].endswith('/')
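To illustrate the check, here is a small sketch that separates folder placeholders from regular files by their trailing slash. The key names are hypothetical, and it assumes the folders exist as zero-byte placeholder objects whose keys end in "/", which is how folders created in the S3 console are usually stored.

```python
from typing import List, Tuple


def split_placeholders(keys: List[str]) -> Tuple[List[str], List[str]]:
    """Split S3 keys into folder placeholders and regular files."""
    # A key ending in "/" is treated as a folder placeholder (assumption).
    folders = [key for key in keys if key.endswith("/")]
    files = [key for key in keys if not key.endswith("/")]
    return folders, files


# Hypothetical keys, roughly what a listing with Prefix="DailyLogs/" might return.
sample_keys = [
    "DailyLogs/",
    "DailyLogs/report.csv",
    "DailyLogs/report.csv.metadata",
]

folders, files = split_placeholders(sample_keys)
print(folders)  # ['DailyLogs/']
print(files)    # ['DailyLogs/report.csv', 'DailyLogs/report.csv.metadata']
```

Only the keys in the second list would be passed to delete_objects, so the placeholder object survives.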
Here is a function that solves the problem:
def delete_files_in_folder(folder):
    response = s3_client.list_objects_v2(Bucket=bucket_name, Prefix=folder)
    if "Contents" in response:
        keys_to_delete = [
            {"Key": obj["Key"]}
            for obj in response["Contents"]
            if not obj["Key"].endswith("/")
        ]
        if keys_to_delete:
            s3_client.delete_objects(
                Bucket=bucket_name, Delete={"Objects": keys_to_delete}
            )
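One caveat: a single list_objects_v2 call returns at most 1,000 keys, and delete_objects accepts at most 1,000 keys per request. If a folder may hold more files than that, a paginated variant could look like the sketch below. The function name delete_files_in_folder_paginated and the chunked helper are hypothetical, and the client is passed in as a parameter (an assumption made for testability) rather than read from a module-level variable.

```python
from typing import Any, Dict, Iterator, List


def chunked(items: List[Dict[str, str]], size: int) -> Iterator[List[Dict[str, str]]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def delete_files_in_folder_paginated(s3_client: Any, bucket_name: str, folder: str) -> int:
    """Delete every object under `folder` except folder placeholders.

    Returns the number of keys sent for deletion.
    """
    paginator = s3_client.get_paginator("list_objects_v2")
    keys_to_delete = []
    # Walk every page of the listing, not just the first 1,000 keys.
    for page in paginator.paginate(Bucket=bucket_name, Prefix=folder):
        for obj in page.get("Contents", []):
            # Skip keys ending in "/" so the folder placeholder is kept.
            if not obj["Key"].endswith("/"):
                keys_to_delete.append({"Key": obj["Key"]})
    # delete_objects takes at most 1,000 keys per request.
    for batch in chunked(keys_to_delete, 1000):
        s3_client.delete_objects(Bucket=bucket_name, Delete={"Objects": batch})
    return len(keys_to_delete)
```

In the Lambda it would be called as delete_files_in_folder_paginated(s3_client, bucket_name, daily_folder), with s3_client being the boto3.client('s3') already created at the top of the code.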