Extracting a zip file from a public Google Drive in a Colab notebook


I want to download a dataset (saved as a zip) from a public Google Drive folder.

 url = https://drive.google.com/drive/folders/1TzwfNA5JRFTPO-kHMU___kILmOEodoBo

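Because I want others to be able to reproduce this, I don't want to copy the file to my own Drive (and preferably not mount my Drive in the notebook either).

How can I do this?

What I have tried so far: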

import requests
import io
import zipfile

zip_url = 'https://drive.google.com/file/d/1fdFu5NGXe4rTLYKD5wOqk9dl-eJOefXo'

response = requests.get(zip_url)
file_contents = io.BytesIO(response.content)
print(file_contents)
with zipfile.ZipFile(file_contents, 'r') as zip_ref:
    zip_ref.extractall('/content/')  # Replace with your desired extraction path

But I get this error (preceded by the printed "file_contents"):

<_io.BytesIO object at 0x7ad7efbf27f0>
---------------------------------------------------------------------------
BadZipFile                                Traceback (most recent call last)
<ipython-input-18-56d2c8f2bfe8> in <cell line: 14>()
     12 print(file_contents)
     13 # Extract the zip file (if needed)
---> 14 with zipfile.ZipFile(file_contents, 'r') as zip_ref:
     15     zip_ref.extractall('/content/')  # Replace with your desired extraction path

1 frames
/usr/lib/python3.10/zipfile.py in _RealGetContents(self)
   1334             raise BadZipFile("File is not a zip file")
   1335         if not endrec:
-> 1336             raise BadZipFile("File is not a zip file")
   1337         if self.debug > 1:
   1338             print(endrec)

BadZipFile: File is not a zip file

If I try the following instead, I get an empty zip file:

file_id = '1fdFu5NGXe4rTLYKD5wOqk9dl-eJOefXo'
download_url = f'https://drive.google.com/uc?export=download&id={file_id}'
!wget --no-check-certificate -O '/content/file.zip' 'https://drive.google.com/uc?export=download&id=1fdFu5NGXe4rTLYKD5wOqk9dl-eJOefXo'

Any help would be appreciated.

python jupyter-notebook google-drive-api dataset google-colaboratory
1 Answer

The content type returned is HTML, not a zip file.

import requests
import io

file_id = '1fdFu5NGXe4rTLYKD5wOqk9dl-eJOefXo'
download_url = f'https://drive.google.com/uc?export=download&id={file_id}'
response = requests.get(download_url)
print(response.headers.get("Content-Type"))

This tells you what the server actually returned. In this case it is text/html, not a zip file.
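As a complement to the header check, the downloaded bytes themselves can be tested before extraction, which avoids the BadZipFile traceback from the question. The following is a minimal sketch using only the standard library; the helper name extract_if_zip is hypothetical and not part of the original post.

```python
import io
import zipfile


def extract_if_zip(payload: bytes, dest: str) -> bool:
    """Extract payload into dest if it is a zip archive.

    Returns True when the bytes were a zip archive and were extracted,
    False when they were something else (for example an HTML page).
    """
    buffer = io.BytesIO(payload)
    # is_zipfile accepts a file-like object and looks for the zip end record.
    if not zipfile.is_zipfile(buffer):
        return False
    buffer.seek(0)
    with zipfile.ZipFile(buffer) as archive:
        archive.extractall(dest)
    return True
```

With the snippet above, this could be called as extract_if_zip(response.content, '/content/'); a False result means the server sent something other than the archive.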

Check whether the URL points to an actual zip file.
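One way to check this is to look at the shape of the URL: a link of the form https://drive.google.com/file/d/<id> is the file's viewer page, and a /drive/folders/ link is a folder listing, so neither is the archive itself. The sketch below pulls the file ID out of a share link and builds the uc?export=download URL already used in the question. The helper names are hypothetical, and Drive may still answer that URL with an HTML page (for example a confirmation page for large files), so the Content-Type check still applies.

```python
import re
from typing import Optional


def drive_file_id(url: str) -> Optional[str]:
    """Return the file ID from a Drive file link, or None if there is none."""
    match = re.search(r"/file/d/([\w-]+)", url) or re.search(r"[?&]id=([\w-]+)", url)
    return match.group(1) if match else None


def direct_download_url(file_id: str) -> str:
    """Build the download URL form used in the question for a given file ID."""
    return f"https://drive.google.com/uc?export=download&id={file_id}"


share_link = "https://drive.google.com/file/d/1fdFu5NGXe4rTLYKD5wOqk9dl-eJOefXo"
file_id = drive_file_id(share_link)
if file_id is not None:
    print(direct_download_url(file_id))
```

A folder link such as the one at the top of the question yields None here, which signals that it cannot be fetched as a single file.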
