Get the last modified object from S3 using the AWS CLI

Question

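I have a use case where I programmatically bring up an EC2 instance, copy an executable file from S3, run it, and shut the instance down (all done in user data). I only need to get the last added file from S3.

Is there a way to get the last modified file/object from an S3 bucket using the AWS CLI tool?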

Answer 1

You can list all the objects in the bucket with aws s3 ls $BUCKET --recursive:

$ aws s3 ls $BUCKET --recursive
2015-05-05 15:36:17          4 an_object.txt
2015-06-08 14:14:44   16322599 some/other/object
2015-04-29 12:09:29      32768 yet-another-object.sh

They are sorted alphabetically by key, but the first column is the last-modified time. Piping the listing through sort reorders it by date:

$ aws s3 ls $BUCKET --recursive | sort
2015-04-29 12:09:29      32768 yet-another-object.sh
2015-05-05 15:36:17          4 an_object.txt
2015-06-08 14:14:44   16322599 some/other/object

tail -n 1 selects the last line, and awk '{print $4}' extracts the fourth column (the name of the object):

$ aws s3 ls $BUCKET --recursive | sort | tail -n 1 | awk '{print $4}'
some/other/object

Last but not least, feed that into aws s3 cp to download the object:

$ KEY=`aws s3 ls $BUCKET --recursive | sort | tail -n 1 | awk '{print $4}'`
$ aws s3 cp s3://$BUCKET/$KEY ./latest-object
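
One caveat with awk '{print $4}' is that it splits on whitespace, so a key that contains spaces is cut off after its first word. As a rough sketch of a workaround, the listing can be parsed with a pattern that keeps the rest of the line as the key. The function name latest_key and the sample listing are illustrative assumptions, and the pattern assumes plain byte sizes (no --human-readable).

```python
import re

# One object line looks like: "<date> <time> <size in bytes> <key>"
_LINE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\s+(\d+) (.+)$")


def latest_key(listing: str) -> str:
    """Return the key with the newest timestamp in `aws s3 ls --recursive` output."""
    entries = []
    for line in listing.splitlines():
        match = _LINE.match(line)
        if match is None:
            continue  # skip anything that is not an object line
        timestamp, _size, key = match.groups()
        entries.append((timestamp, key))
    if not entries:
        raise ValueError("no objects found in listing")
    # "YYYY-MM-DD HH:MM:SS" timestamps sort chronologically as plain strings.
    return max(entries)[1]


# Hypothetical listing; the second key contains a space on purpose.
sample = """\
2015-05-05 15:36:17          4 an_object.txt
2015-06-08 14:14:44   16322599 some/other/my object
2015-04-29 12:09:29      32768 yet-another-object.sh
"""
print(latest_key(sample))  # some/other/my object
```

In practice the listing text would come from the aws s3 ls command itself, for example captured with subprocess.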

Answer 2

Updated answer

After a while, here is a small update on how to do it a bit more elegantly:

aws s3api list-objects-v2 --bucket "my-awesome-bucket" --query 'sort_by(Contents, &LastModified)[-1].Key' --output=text

Instead of the extra reverse function, we can take the last entry of the sorted list with [-1].

Old answer

This command does the job without any external dependencies:

aws s3api list-objects-v2 --bucket "my-awesome-bucket" --query 'reverse(sort_by(Contents, &LastModified))[:1].Key' --output=text
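
To make the --query expressions more concrete, here is a small Python sketch of what they compute on a list-objects-v2 style response. The sample response dictionary and the helper name newest_key are made up for illustration; the CLI itself evaluates the expression with JMESPath, not with this code.

```python
def newest_key(response: dict) -> str:
    """Mimic sort_by(Contents, &LastModified)[-1].Key on a parsed response."""
    # sort_by(Contents, &LastModified): order the objects from oldest to newest.
    ordered = sorted(response["Contents"], key=lambda obj: obj["LastModified"])
    # [-1].Key: take the last (newest) entry and return its key.
    return ordered[-1]["Key"]


# Hypothetical response; ISO 8601 timestamps sort correctly as plain strings.
response = {
    "Contents": [
        {"Key": "an_object.txt", "LastModified": "2015-05-05T15:36:17.000Z"},
        {"Key": "some/other/object", "LastModified": "2015-06-08T14:14:44.000Z"},
        {"Key": "yet-another-object.sh", "LastModified": "2015-04-29T12:09:29.000Z"},
    ]
}
print(newest_key(response))  # some/other/object
```

The older reverse(...)[:1] form selects the same object but goes through a one-element list first. For buckets with more than one page of results, it is worth checking in the AWS CLI documentation how --query interacts with pagination when --output text is used, since the query may then be applied to each page separately.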

Answer 3

aws s3api list-objects-v2 --bucket "bucket-name" | jq -r '.Contents | max_by(.LastModified) | .Key'

Answer 4

If this is a freshly uploaded file, you can use Lambda to run a piece of code whenever a new S3 object is created.
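
As a minimal sketch of the Lambda route, the handler below pulls the bucket name and object key out of an S3 "object created" notification. The handler name, the sample event, and the bucket and key values are assumptions for illustration; a real event carries many more fields, and what to do with the key afterwards is left open.

```python
from urllib.parse import unquote_plus


def handler(event, context):
    """Collect (bucket, key) pairs from an S3 ObjectCreated notification."""
    uploaded = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Keys arrive URL-encoded in S3 event notifications, so decode them.
        key = unquote_plus(record["s3"]["object"]["key"])
        uploaded.append((bucket, key))
    return uploaded


# Hypothetical, trimmed-down event for a local dry run.
sample_event = {
    "Records": [
        {
            "s3": {
                "bucket": {"name": "my-awesome-bucket"},
                "object": {"key": "builds/my+app.bin"},
            }
        }
    ]
}
print(handler(sample_event, None))  # [('my-awesome-bucket', 'builds/my app.bin')]
```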

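If you really do need to fetch the latest file, you can prefix the file names with the date, sort by name in descending order, and take the first object.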
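
The date-based naming idea could look roughly like this. The helper name dated_key, the timestamp format, and the file name app.bin are assumptions chosen for the example; any fixed-width, most-significant-first timestamp prefix would sort the same way.

```python
from datetime import datetime, timezone


def dated_key(name: str, when: datetime) -> str:
    """Prefix a file name with a UTC timestamp that sorts chronologically."""
    stamp = when.astimezone(timezone.utc).strftime("%Y-%m-%dT%H-%M-%SZ")
    return f"{stamp}_{name}"


# Hypothetical uploads of the same file at three different times.
keys = [
    dated_key("app.bin", datetime(2015, 5, 5, 15, 36, 17, tzinfo=timezone.utc)),
    dated_key("app.bin", datetime(2015, 6, 8, 14, 14, 44, tzinfo=timezone.utc)),
    dated_key("app.bin", datetime(2015, 4, 29, 12, 9, 29, tzinfo=timezone.utc)),
]
# Descending name order puts the newest upload first.
newest = sorted(keys, reverse=True)[0]
print(newest)  # 2015-06-08T14-14-44Z_app.bin
```

Because S3 lists keys in ascending order, the newest object would be the last entry of a plain listing under this naming scheme.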

Answer 5

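Here is my take: a CLI helper that you run with Python, though the same logic can be implemented in any other language. It simply automates the manual work of checking bucket by bucket.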

import subprocess

# Define your bucket names here
buckets = ['development', 'human-resources', 'marketing']
last_backups = {}

# Print a message to ignore errors during the process
print("Ignore any errors during the process.")

# Prompt the user to choose between backup sizes and last backup dates
option = input("- Backup sizes - 1\n- Last backup dates - 2\n->> ")
command = ''

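# Determine the appropriate command based on user input
if option == '1':
    # The last two lines of the --summarize output hold the object count and total size
    command = 'tail -2'
elif option == '2':
    # Keep only object lines (they start with a date), then take the newest one
    command = "grep -E '^[0-9]{4}-' | sort | tail -1"
else:
    raise SystemExit("Unknown option; enter 1 or 2.")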

# Retrieve backup information for each bucket
for bucket in buckets:
    # Execute AWS command to get backup information
    process = subprocess.Popen(
        f"aws s3 ls s3://{bucket}/ --recursive --human-readable --summarize | {command}",
        shell=True,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE
    )
    # Read the output of the command
    output, _ = process.communicate()
    # Store the output in the dictionary
    last_backups[bucket] = output.decode()

# Print the backup information for each bucket
for bucket, backup_info in last_backups.items():
    # Print the last backup date
    if option == '2':
        print(f" - {bucket}: {backup_info[:20]}\n")
    # Print the backup sizes and object counts
    elif option == '1':
        print(f" - {bucket}: {backup_info}\n")

Answer 6

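Below is a bash script that downloads the latest file from an S3 bucket. I used the aws s3 sync command instead, so the file is not downloaded from S3 again if it already exists locally.

--exclude "*" excludes all files

--include includes all files that match the pattern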

#!/usr/bin/env bash

BUCKET="s3://my-s3-bucket-eu-west-1/list/"
# Name of the most recently modified file in the listing
FILE_NAME=$(aws s3 ls "$BUCKET" | sort | tail -n 1 | awk '{print $4}')
TARGET_FILE_PATH=target/datdump/
TARGET_FILE="${TARGET_FILE_PATH}localData.json.gz"

echo "$FILE_NAME"
echo "$TARGET_FILE"

# Fetch only the latest file; sync skips it when an up-to-date local copy exists
aws s3 sync "$BUCKET" "$TARGET_FILE_PATH" --exclude "*" --include "*$FILE_NAME*"

cp "${TARGET_FILE_PATH}${FILE_NAME}" "$TARGET_FILE"

P.S. Thanks to @David Murray.
