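I'm trying to develop a Python script that reads an .xlsx file from a blob storage container called "source", converts it to .csv, and stores it in a new container (I'm testing the script locally; if it works, I plan to include it in an ADF pipeline). So far I've managed to access the blob storage, but I'm having trouble reading the file's contents.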
from azure.storage.blob import BlobServiceClient, ContainerClient, BlobClient
import pandas as pd
conn_str = "DefaultEndpointsProtocol=https;AccountName=XXXXXX;AccountKey=XXXXXX;EndpointSuffix=core.windows.net"
container = "source"
blob_name = "prova.xlsx"
container_client = ContainerClient.from_connection_string(
conn_str=conn_str,
container_name=container
)
# Download blob as StorageStreamDownloader object (stored in memory)
downloaded_blob = container_client.download_blob(blob_name)
df = pd.read_excel(downloaded_blob)
print(df)
I get the following error:
ValueError: Invalid file path or buffer object type: <class 'azure.storage.blob._download.StorageStreamDownloader'>
I tried using a .csv file as input instead and wrote the parsing code as follows:
from io import StringIO
df = pd.read_csv(StringIO(downloaded_blob.content_as_text()))
and that works.
Any suggestions on how to modify the code so that the Excel file can be read?
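For reference, here is a minimal sketch of the direction I suspect might work, by analogy with the CSV case: since pd.read_excel seems to want a path or a file-like object, the downloaded bytes could be wrapped in a BytesIO buffer first. This is untested against my real storage account. The helper names blob_to_buffer and xlsx_blob_to_csv and the "destination" container name are hypothetical, and I'm assuming readall() on the downloader returns the raw bytes and that openpyxl is installed so pandas can parse .xlsx.

```python
from io import BytesIO

import pandas as pd


def blob_to_buffer(downloader):
    """Copy a downloader's full content into a seekable in-memory buffer.

    `downloader` is assumed to be the StorageStreamDownloader returned by
    download_blob(); only its readall() method is used here.
    """
    return BytesIO(downloader.readall())


def xlsx_blob_to_csv(source_client, dest_client, blob_name):
    """Read an .xlsx blob, convert it to CSV, upload it, and return the new name.

    Hypothetical helper: both clients are assumed to be ContainerClient
    instances, one per container.
    """
    downloader = source_client.download_blob(blob_name)
    # pandas needs an Excel engine such as openpyxl to parse .xlsx content
    df = pd.read_excel(blob_to_buffer(downloader))
    # Swap the extension, e.g. "prova.xlsx" becomes "prova.csv"
    csv_name = blob_name.rsplit(".", 1)[0] + ".csv"
    dest_client.upload_blob(name=csv_name, data=df.to_csv(index=False), overwrite=True)
    return csv_name


# Possible usage ("destination" is an assumed container name):
#
# from azure.storage.blob import ContainerClient
# source = ContainerClient.from_connection_string(conn_str=conn_str, container_name="source")
# dest = ContainerClient.from_connection_string(conn_str=conn_str, container_name="destination")
# xlsx_blob_to_csv(source, dest, "prova.xlsx")
```

Passing the container clients in as arguments keeps the conversion logic separate from the connection setup, which might make it easier to move into the ADF pipeline later.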