以下代码通过将 S3 拉入内存来创建 zip 文件,并将最终产品写入磁盘上的文件。然而,观察者发现它在创建 zip 时损坏了几个文件(数千个)。我已经检查过,在此过程中损坏的文件没有任何问题,因为相同的文件通过其他方式正确压缩。有什么建议可以微调代码吗?
代码:
public static async Task S3ToZip(List<string> pdfBatch, string zipPath, IAmazonS3 s3Client)
{
FileStream fileStream = new FileStream(zipPath, FileMode.OpenOrCreate, FileAccess.ReadWrite, FileShare.ReadWrite);
using (ZipArchive archive = new ZipArchive(fileStream, ZipArchiveMode.Update, true))
{
foreach (var file in pdfBatch)
{
GetObjectRequest request = new GetObjectRequest
{
BucketName = "sample-bucket",
Key = file
};
using GetObjectResponse response = await s3Client.GetObjectAsync(request);
using Stream responseStream = response.ResponseStream;
ZipArchiveEntry zipFileEntry = archive.CreateEntry(file.Split('/')[^1]);
using Stream zipEntryStream = zipFileEntry.Open();
await responseStream.CopyToAsync(zipEntryStream);
zipEntryStream.Seek(0, SeekOrigin.Begin);
zipEntryStream.CopyTo(fileStream);
}
archive.Dispose();
fileStream.Close();
}
}
不要明确调用
Dispose()
或 Close()
,让 using
完成所有工作。而且您不需要向 fileStream
写入任何内容,只需向 ZipArchiveEntry
stream 写入即可。您还需要使用 FileMode.Create
来保证您的文件在写入之前始终被截断。此外,由于您只创建存档而不更新它,因此您应该使用 ZipArchiveMode.Create
来启用内存高效流(感谢 @canton7 对 zip 存档格式的详细信息进行深入研究)。
public static async Task S3ToZip(List<string> pdfBatch, string zipPath, IAmazonS3 s3Client)
{
using FileStream fileStream = new FileStream(zipPath, FileMode.Create, FileAccess.ReadWrite, FileShare.ReadWrite);
using ZipArchive archive = new ZipArchive(fileStream, ZipArchiveMode.Create, true);
foreach (var file in pdfBatch)
{
GetObjectRequest request = new GetObjectRequest
{
BucketName = "sample-bucket",
Key = file
};
using GetObjectResponse response = await s3Client.GetObjectAsync(request);
using Stream responseStream = response.ResponseStream;
ZipArchiveEntry zipFileEntry = archive.CreateEntry(file.Split('/')[^1]);
using Stream zipEntryStream = zipFileEntry.Open();
await responseStream.CopyToAsync(zipEntryStream);
}
}