Udacity: Unable to download the dataset "enron_mail_20150507.tar.gz" in the ud120 project


I cannot download "enron_mail_20150507.tar.gz" by running "python startup.py". I get the following error and don't know how to fix it.

    downloading the Enron dataset (this may take a while)
    to check on progress, you can cd up one level, then execute <ls -lthr>
    Enron dataset should be last item on the list, along with its current size
    download will complete at about 423 MB
    Traceback (most recent call last):
      File "startup.py", line 36, in <module>
        urllib.urlretrieve(url, filename="../enron_mail_20150507.tar.gz")
      File "C:\Python27\lib\urllib.py", line 98, in urlretrieve
        return opener.retrieve(url, filename, reporthook, data)
      File "C:\Python27\lib\urllib.py", line 245, in retrieve
        fp = self.open(url, data)
      File "C:\Python27\lib\urllib.py", line 213, in open
        return getattr(self, name)(url)
      File "C:\Python27\lib\urllib.py", line 350, in open_http
        h.endheaders(data)
      File "C:\Python27\lib\httplib.py", line 1049, in endheaders
        self._send_output(message_body)
      File "C:\Python27\lib\httplib.py", line 893, in _send_output
        self.send(msg)
      File "C:\Python27\lib\httplib.py", line 855, in send
        self.connect()
      File "C:\Python27\lib\httplib.py", line 832, in connect
        self.timeout, self.source_address)
      File "C:\Python27\lib\socket.py", line 557, in create_connection
        for res in getaddrinfo(host, port, 0, SOCK_STREAM):
    IOError: [Errno socket error] [Errno 11001] getaddrinfo failed

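I tried changing the URL in "startup.py" to "http://www.cs.cmu.edu/~enron/enron_mail_20150507.tar.gz", but that did not work either. If anyone has downloaded it with Python on Windows, please tell me how. I would really appreciate it.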
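
The `getaddrinfo failed` line in the traceback is raised while the host name is being looked up, before any data is transferred, so it may be worth checking whether this machine can resolve the download host at all. Below is a minimal sketch of such a check. The helper names `host_of` and `can_resolve` are made up for illustration and are not part of startup.py.

```python
import socket

try:
    from urllib.parse import urlparse  # Python 3
except ImportError:
    from urlparse import urlparse  # Python 2.7, as used in the traceback


def host_of(url):
    """Return the host name part of a URL (hypothetical helper)."""
    return urlparse(url).hostname


def can_resolve(host, port=80):
    """Return True if the host name resolves to at least one address.

    This repeats the getaddrinfo() lookup that fails in the traceback,
    but reports the result instead of raising.
    """
    try:
        results = socket.getaddrinfo(host, port, 0, socket.SOCK_STREAM)
    except socket.gaierror:
        return False
    return len(results) > 0
```

Calling `can_resolve(host_of("http://www.cs.cmu.edu/~enron/enron_mail_20150507.tar.gz"))` returns `False` when the lookup fails. In that case the cause is more likely the network, DNS, or proxy setup on the machine than the URL in the script.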

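I also tried downloading it manually, but the download kept going even after 1.1 GB had been transferred, so I got scared and stopped it... haha XD. How big is the "enron_mail_20150507.tar.gz" file? And where do I put the file once it is downloaded? In the ud120 project?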

Please help me. I'm stuck.

python machine-learning artificial-intelligence naivebayes
1 Answer

Problem solved. I downloaded the file manually using the link in startup.py. The file is 1.69 GB compressed and 2.23 GB uncompressed.
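
On where to put the file: the traceback shows startup.py saving the archive as `../enron_mail_20150507.tar.gz`, that is, one directory above the folder containing startup.py, so that is presumably where a manually downloaded copy belongs. Here is a rough sketch for checking the archive size and unpacking it. The extraction target and the helper names `size_in_gb` and `extract_archive` are my assumptions, not something taken from startup.py.

```python
import os
import tarfile


def size_in_gb(path):
    """Return the size of a file in GB, taking 1 GB as 1024**3 bytes."""
    return os.path.getsize(path) / float(1024 ** 3)


def extract_archive(archive_path, dest_dir):
    """Unpack a .tar.gz archive into dest_dir and return the member count."""
    with tarfile.open(archive_path, "r:gz") as tar:
        members = tar.getmembers()  # reads the archive index once
        tar.extractall(dest_dir)
    return len(members)
```

Run from the folder that holds startup.py, the call would look something like `extract_archive("../enron_mail_20150507.tar.gz", "..")`. The script's message mentions about 423 MB, which does not match the 1.69 GB I ended up with, so a download that grows well past 423 MB is not by itself a sign that something went wrong.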
