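I have this code that rolls back an S3 object to a specific version, but the method I'm using has no "key" option, only "prefix". That is a problem because, in this example, I would also end up deleting versions of the object named "questions copy". So, as you can see, I have to filter in Python, which seems less efficient.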

    import boto3
    import logging
    from operator import attrgetter

    logger = logging.getLogger(__name__)
    logger.setLevel(logging.DEBUG)
    logger.addHandler(logging.StreamHandler())


    def rollback_object(bucket, object_key, version_id):
        """
        Rolls back an object to an earlier version by deleting all versions that
        occurred after the specified rollback version.

        Usage is shown in the __main__ block at the end of this module.

        :param bucket: The bucket that holds the object to roll back.
        :param object_key: The object to roll back.
        :param version_id: The version ID to roll back to.
        """
        # Versions must be sorted by last_modified date because delete markers are
        # at the end of the list even when they are interspersed in time. Sorting
        # newest-first with the builtin sorted() puts versions and delete markers
        # into one chronological order.
        # Note: Prefix also matches longer keys such as "questions copy", so without
        # the exact-key filter below their newer versions would be deleted too.
        # Note: a delete marker that is newer than the rollback version is deleted
        # as well.
        versions = sorted(
            bucket.object_versions.filter(Prefix=object_key),
            key=attrgetter("last_modified"),
            reverse=True,
        )
        # Keep only the versions whose key matches exactly.
        filtered_versions = [v for v in versions if v.key == object_key]

        logger.debug(
            "Got versions:\n%s",
            "\n".join(
                [
                    f"\t{version.version_id}, last modified {version.last_modified}"
                    for version in filtered_versions
                ]
            ),
        )

        if version_id in [ver.version_id for ver in filtered_versions]:
            print(f"Rolling back to version {version_id}")
            # The list is newest-first, so delete until the target version is reached.
            for version in filtered_versions:
                if version.version_id != version_id:
                    version.delete()
                    print(f"Deleted version {version.version_id}")
                else:
                    break
            print(f"Active version is now {bucket.Object(object_key).version_id}")
        else:
            raise KeyError(
                f"{version_id} was not found in the list of versions for {object_key}."
            )


    if __name__ == '__main__':
        mybucket = boto3.resource('s3').Bucket('scottedwards2000')
        # rollback_object returns nothing, so there is no result to print.
        rollback_object(mybucket, 'questions', 'RQY0ebFXtUnm.A48N2I62CEmdu2QZGEO')

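So the situation appears to be this: .object_versions (or list_object_versions() for client calls) returns the versions of all objects in the bucket, but the results can be filtered by Prefix. There is no parameter for matching one exact key.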
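
For the client-style call, a minimal sketch could look like the following. It assumes a boto3 S3 client is passed in as s3_client; the helper name list_versions_for_key is made up for illustration. The exact-key comparison still happens in Python because the API only filters by Prefix.

```python
def list_versions_for_key(s3_client, bucket_name, object_key):
    """Return versions and delete markers for exactly object_key, newest first.

    Hypothetical helper: s3_client is assumed to be boto3.client("s3").
    """
    paginator = s3_client.get_paginator("list_object_versions")
    entries = []
    for page in paginator.paginate(Bucket=bucket_name, Prefix=object_key):
        # Versions and delete markers come back in two separate lists.
        for entry in page.get("Versions", []) + page.get("DeleteMarkers", []):
            # Prefix also matches longer keys, so compare the full key.
            if entry["Key"] == object_key:
                entries.append(entry)
    # Merge both lists into one newest-first ordering.
    return sorted(entries, key=lambda e: e["LastModified"], reverse=True)
```

With a real client this would be called as list_versions_for_key(boto3.client("s3"), "scottedwards2000", "questions"). Each returned entry is a dict containing Key, VersionId and LastModified.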
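You asked about efficiency. Because your code is written to roll back only a single object, it is not really possible to reduce the number of listing calls to AWS. It could be more "efficient" if ALL versions in the bucket were retrieved once and you then worked out the versions you need from the returned data. Similarly, using delete_objects() to delete multiple object versions in one API call could be more efficient than making one API call per object version.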
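
The batch-delete idea might be sketched as follows. This is only an illustration under assumptions: s3_client is taken to be a boto3 S3 client, the helper name delete_newer_versions is hypothetical, and the caller supplies the version IDs for the key already sorted newest first.

```python
def delete_newer_versions(s3_client, bucket_name, object_key,
                          version_ids_newest_first, target_version_id):
    """Delete every version listed before target_version_id, in batches.

    Hypothetical helper: s3_client is assumed to be boto3.client("s3").
    Returns the number of versions submitted for deletion.
    """
    if target_version_id not in version_ids_newest_first:
        raise KeyError(f"{target_version_id} was not found for {object_key}.")
    cutoff = version_ids_newest_first.index(target_version_id)
    to_delete = [
        {"Key": object_key, "VersionId": version_id}
        for version_id in version_ids_newest_first[:cutoff]
    ]
    # delete_objects accepts at most 1000 keys per request.
    for start in range(0, len(to_delete), 1000):
        response = s3_client.delete_objects(
            Bucket=bucket_name,
            Delete={"Objects": to_delete[start:start + 1000], "Quiet": True},
        )
        # In quiet mode the response lists only the failures.
        if response.get("Errors"):
            raise RuntimeError(f"Some versions were not deleted: {response['Errors']}")
    return len(to_delete)
```

For a handful of versions the saving is small, but it turns N delete calls into one call per 1000 versions.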
As for the efficiency of the list operation itself, it runs quickly, so there is no real benefit in changing it.
I note that your filtered_versions check avoids situations where a prefix such as foo/lunch also matches the key foo/lunchtime. It's good that you spotted the potential problem.
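
To make the difference concrete, here is a tiny pure-Python illustration with no AWS call involved. The key list is invented, and str.startswith merely stands in for how Prefix matching behaves.

```python
# Invented keys, used only to illustrate prefix versus exact matching.
keys_in_bucket = ["foo/lunch", "foo/lunchtime", "foo/dinner"]
object_key = "foo/lunch"

# What a Prefix filter would return: every key that starts with the prefix.
prefix_matches = [k for k in keys_in_bucket if k.startswith(object_key)]

# What the rollback actually needs: the one key that matches exactly.
exact_matches = [k for k in keys_in_bucket if k == object_key]

print(prefix_matches)  # ['foo/lunch', 'foo/lunchtime']
print(exact_matches)   # ['foo/lunch']
```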
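An alternative method that could be considered "more efficient" is to copy the desired previous version onto the same key, which makes that data the "current" version of the object. So, rather than "deleting everything since that version", you are "copying that version to be the current version". This way you never lose any versions, and you can even "roll back" to a newer version later!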
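
A minimal sketch of this copy-based rollback, assuming a versioning-enabled bucket and a boto3 S3 client passed in as s3_client. The helper name restore_version_by_copy is hypothetical. A single copy_object call is limited to objects of up to 5 GB, so larger objects would need a multipart copy instead.

```python
def restore_version_by_copy(s3_client, bucket_name, object_key, version_id):
    """Make an old version current again by copying it onto the same key.

    Hypothetical helper: s3_client is assumed to be boto3.client("s3").
    Returns the VersionId of the newly created current version, if reported.
    """
    response = s3_client.copy_object(
        Bucket=bucket_name,
        Key=object_key,
        # The source is the same key, pinned to the version to restore.
        CopySource={
            "Bucket": bucket_name,
            "Key": object_key,
            "VersionId": version_id,
        },
    )
    # No version is deleted; the copy simply becomes the newest version.
    return response.get("VersionId")
```

With a real client this would be called as restore_version_by_copy(boto3.client("s3"), "scottedwards2000", "questions", "RQY0ebFXtUnm.A48N2I62CEmdu2QZGEO"). It needs one API call and no listing at all, since the version ID is already known.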