Extracting a password-protected zip file in Azure ADLS


I have a password-protected zip file in an Azure ADLS container. I want to unzip it and move the extracted files to a different container.

I tried a Copy activity, but it failed with the following error:

"errorCode": "2200",
"message": "ErrorCode=UserErrorUnzipInvalidFile,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=The file 'Myriad_DataConnect_AON_04-16-2024.zip' is not a valid Zip file with Deflate compression method.,Source=Microsoft.DataTransfer.ClientLibrary,''Type=System.IO.InvalidDataException,Message=The archive entry was compressed using an unsupported compression method.,Source=System.IO.Compression,'",
"failureType": "UserError",
"target": "cpy-SFTP-ADLS-myriad",
"details": []

Please suggest a solution.

azure azure-data-factory
1 Answer

According to this:

Data Factory does not currently support reading or generating password-protected files.
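
Before building a workaround, it can help to confirm what the archive actually contains. The following is a minimal sketch, not part of the original answer, that uses Python's standard `zipfile` module to list each entry's encryption flag and compression method. The helper name `describe_zip` is hypothetical, and the demo runs on a small in-memory archive because the standard library cannot create encrypted zips.

```python
import io
import zipfile


def describe_zip(source):
    """Return (name, is_encrypted, compress_type) for every entry.

    `source` is a file path or a binary file-like object.
    Bit 0 of the general-purpose flags marks an encrypted entry.
    """
    with zipfile.ZipFile(source) as zf:
        return [
            (info.filename, bool(info.flag_bits & 0x1), info.compress_type)
            for info in zf.infolist()
        ]


# Demo: build a small unencrypted Deflate archive in memory and inspect it.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("hello.txt", "hello")

for name, encrypted, method in describe_zip(buffer):
    print(name, encrypted, method)
```

A method of 8 is Deflate. An entry that reports `True` for encryption, or a method other than 0 or 8, is consistent with the "unsupported compression method" error shown in the question.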

As a workaround, you can use an Azure Function to unzip the password-protected zip file.

Create an HTTP-triggered Azure Function with the following code:

import azure.functions as func
import uuid
import os
import shutil
from azure.storage.blob import ContainerClient
from zipfile import ZipFile

storageAccountConnstr = '<storage account conn str>'
container = '<container name>'

# Local temp paths; on Azure, a path under /home is recommended
tempPathRoot = 'd:/temp/'
unZipTempPathRoot = 'd:/unZipTemp/'


def main(req: func.HttpRequest) -> func.HttpResponse:
    reqBody = req.get_json()
    fileName = reqBody['fileName']
    zipPass = reqBody['password']

    container_client = ContainerClient.from_connection_string(storageAccountConnstr, container)

    # Download zip file (make sure the local temp folder exists first)
    os.makedirs(tempPathRoot, exist_ok=True)
    zipFilePath = tempPathRoot + fileName
    with open(zipFilePath, "wb") as my_blob:
        download_stream = container_client.get_blob_client(fileName).download_blob()
        my_blob.write(download_stream.readall())

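    # Unzip to temp folder
    # Note: zipfile can only decrypt legacy ZipCrypto archives; AES-encrypted
    # zips would need a third-party library (pyzipper is one option)
    unZipTempPath = unZipTempPathRoot + str(uuid.uuid4())
    with ZipFile(zipFilePath) as zf:
        zf.extractall(path=unZipTempPath, pwd=bytes(zipPass, 'utf8'))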

    # Upload all files in temp folder
    for root, dirs, files in os.walk(unZipTempPath):
        for file in files:
            filePath = os.path.join(root, file)
            # Build the blob name with forward slashes, whatever the local OS separator is
            relPath = os.path.relpath(filePath, unZipTempPath).replace(os.sep, '/')
            destBlobClient = container_client.get_blob_client(fileName + '/' + relPath)
            with open(filePath, "rb") as data:
                destBlobClient.upload_blob(data, overwrite=True)
    
    # Remove all temp files 
    shutil.rmtree(unZipTempPath)
    os.remove(zipFilePath)

    return func.HttpResponse("done")

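Then run it from Azure Data Factory (for example, with an Azure Function activity). For more information, you can refer to this SO answer.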
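
The function above reads `fileName` and `password` from the JSON request body, so whatever calls it has to send a body of that shape. As a rough sketch for testing the function outside Data Factory, the request could be built like this. The URL is a made-up placeholder and `build_unzip_request` is a hypothetical helper; any function key or other authentication your app requires is not shown.

```python
import json
import urllib.request


def build_unzip_request(function_url, file_name, password):
    """Build a POST request carrying the JSON body the function expects."""
    body = json.dumps({"fileName": file_name, "password": password}).encode("utf-8")
    return urllib.request.Request(
        function_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder values; replace them with your own function URL, blob name and password.
request = build_unzip_request(
    "https://example.azurewebsites.net/api/unzip",
    "archive.zip",
    "zip-password",
)

# To actually send it (requires a deployed function):
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode())
```

In a Data Factory pipeline, the same JSON object would go into the request body of the activity that calls the function.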
