Snowflake throws an error when using the NLTK package with the Snowpark framework


I'm trying to do text mining on Snowflake with Python, which requires the NLTK package. But it gives me this error:

Traceback (most recent call last):
  File "nltk/corpus/util.py", line 84, in __load
    root = nltk.data.find(f"{self.subdir}/{zip_name}")
  File "nltk/data.py", line 583, in find
    raise LookupError(resource_not_found)
LookupError: 
**********************************************************************
  Resource wordnet not found.
  Please use the NLTK Downloader to obtain the resource:

  >>> import nltk
  >>> nltk.download('wordnet')

  For more information see: https://www.nltk.org/data.html

  Attempted to load corpora/wordnet.zip/wordnet/

  Searched in:
    - '/home/udf/nltk_data'
    - '/usr/lib/python_udf/60bdac8bd3ab49160e68d1c230dbee4e9fe700de7ef0f3c065830b86356ce449/nltk_data'
    - '/usr/lib/python_udf/60bdac8bd3ab49160e68d1c230dbee4e9fe700de7ef0f3c065830b86356ce449/share/nltk_data'
    - '/usr/lib/python_udf/60bdac8bd3ab49160e68d1c230dbee4e9fe700de7ef0f3c065830b86356ce449/lib/nltk_data'
    - '/usr/share/nltk_data'
    - '/usr/local/share/nltk_data'
    - '/usr/lib/nltk_data'
    - '/usr/local/lib/nltk_data'
**********************************************************************


During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  Worksheet, line 19, in main
  Worksheet, line 68, in tokenize
  Worksheet, line 69, in <listcomp>
  File "nltk/stem/wordnet.py", line 45, in lemmatize
    lemmas = wn._morphy(word, pos)
  File "nltk/corpus/util.py", line 121, in __getattr__
    self.__load()
  File "nltk/corpus/util.py", line 86, in __load
    raise e
  File "nltk/corpus/util.py", line 81, in __load
    root = nltk.data.find(f"{self.subdir}/{self.__name}")
  File "nltk/data.py", line 583, in find
    raise LookupError(resource_not_found)
LookupError: 
**********************************************************************

However, I already have

import nltk
nltk.download('stopwords')
from nltk.corpus import stopwords

in my code. Can anyone help me figure out what I'm doing wrong?

snowflake-cloud-data-platform nltk
1 Answer

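You aren't doing anything wrong; this is simply not supported (as of August 7, 2023). Snowpark functions execute in a sandbox that has no access to external resources over the public internet, so nltk.download() cannot fetch anything at run time. Stay tuned...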

I haven't tried this specific scenario myself, but I would suggest trying the following:

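  1. Download the NLTK packages manually, following the instructions in its doc.
  2. Then check this KB, which explains how to use the /tmp folder to upload your own packages for a Python UDF.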
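
The two steps above could be sketched roughly as follows. This is untested against Snowflake and rests on several assumptions: that wordnet.zip has been downloaded by hand, uploaded to a stage, and added to the function's imports; that the import directory is exposed through sys._xoptions["snowflake_import_directory"]; and that /tmp is writable inside the sandbox. The helper names (extract_corpus, register_with_nltk, load_wordnet_from_imports) are hypothetical and exist only for this illustration.

```python
import os
import sys
import zipfile


def extract_corpus(archive_path, data_root="/tmp/nltk_data", category="corpora"):
    """Unpack an NLTK data archive (e.g. wordnet.zip) under data_root/category.

    wordnet.zip contains a top-level wordnet/ folder, so the files end up in
    data_root/corpora/wordnet, which is the layout NLTK searches for.
    """
    target = os.path.join(data_root, category)
    os.makedirs(target, exist_ok=True)
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(target)
    return data_root


def register_with_nltk(data_root):
    """Add data_root to NLTK's search path so nltk.data.find() looks there."""
    import nltk  # imported lazily so extract_corpus has no third-party dependency

    if data_root not in nltk.data.path:
        nltk.data.path.append(data_root)


def load_wordnet_from_imports(archive_name="wordnet.zip"):
    """Hypothetical entry point to call once at the start of the UDF or procedure."""
    # Assumption: Snowflake exposes the directory holding imported files under this key.
    import_dir = sys._xoptions["snowflake_import_directory"]
    data_root = extract_corpus(os.path.join(import_dir, archive_name))
    register_with_nltk(data_root)
    return data_root
```

With this in place, the idea would be to call load_wordnet_from_imports() before the first use of WordNetLemmatizer instead of calling nltk.download(). Note that the traceback asks for wordnet, while the code in the question only downloads stopwords, so each corpus the code relies on would need its own archive.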

Hope that helps.
