UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe9 in position 122: invalid continuation byte

Question

When I type letters in the text box, the avatar moves, but when I write text, it does not move. The error is raised in this function:

def _decode_stdoutdata(stdoutdata):
    """ Convert data read from stdout/stderr to unicode """
    if not isinstance(stdoutdata, bytes):
        return stdoutdata

    encoding = getattr(sys.__stdout__, "encoding", locale.getpreferredencoding())
    if encoding is None:
        return stdoutdata.decode()
    return stdoutdata.decode(encoding)

The terminal output is attached below.

The input text is "hello".

Traceback (most recent call last):
  File "C:\Users\lenovo\Desktop\folder\text_to_isl-main\venv\lib\site-packages\flask\app.py", line 2070, in wsgi_app
    response = self.full_dispatch_request()
  File "C:\Users\lenovo\Desktop\folder\text_to_isl-main\venv\lib\site-packages\flask\app.py", line 1515, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "C:\Users\lenovo\Desktop\folder\text_to_isl-main\venv\lib\site-packages\flask\app.py", line 1513, in full_dispatch_request
    rv = self.dispatch_request()
  File "C:\Users\lenovo\Desktop\folder\text_to_isl-main\venv\lib\site-packages\flask\app.py", line 1499, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**req.view_args)
  File "C:\Users\lenovo\Desktop\folder\text_to_isl-main\main.py", line 375, in flask_test
    take_input(text)
  File "C:\Users\lenovo\Desktop\folder\text_to_isl-main\main.py", line 321, in take_input
    convert(some_text);
  File "C:\Users\lenovo\Desktop\folder\text_to_isl-main\main.py", line 330, in convert
    word_list[i]=reorder_eng_to_isl(words)
  File "C:\Users\lenovo\Desktop\folder\text_to_isl-main\main.py", line 256, in reorder_eng_to_isl
    possible_parse_tree_list = [tree for tree in parser.parse(input_string)]
  File "C:\Users\lenovo\Desktop\folder\text_to_isl-main\venv\lib\site-packages\nltk\parse\api.py", line 47, in parse
    return next(self.parse_sents([sent], *args, **kwargs))
  File "C:\Users\lenovo\Desktop\folder\text_to_isl-main\venv\lib\site-packages\nltk\parse\stanford.py", line 138, in parse_sents
    self._execute(
  File "C:\Users\lenovo\Desktop\folder\text_to_isl-main\venv\lib\site-packages\nltk\parse\stanford.py", line 260, in _execute
    stdout, stderr = java(
  File "C:\Users\lenovo\Desktop\folder\text_to_isl-main\venv\lib\site-packages\nltk\internals.py", line 139, in java
    print(_decode_stdoutdata(stderr))
  File "C:\Users\lenovo\Desktop\folder\text_to_isl-main\venv\lib\site-packages\nltk\internals.py", line 868, in _decode_stdoutdata
    return stdoutdata.decode(encoding)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe9 in position 122: invalid continuation byte

I have already changed the input to English (UTF-8).

python windows visual-studio-code character-encoding
1 Answer

Using a package like chardet would do a better job of guessing the encoding, at least for debugging purposes, and possibly for the code as a whole.
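
As a rough debugging sketch, the snippet below reproduces the same failure without NLTK and then asks chardet for a guess. The sample bytes are hypothetical (the traceback does not show what the Java process actually wrote to stderr); they assume a message in a Windows code page such as cp1252, where "é" is the single byte 0xe9. chardet is a third-party package (pip install chardet), so the import is guarded.

```python
try:
    import chardet  # third-party; install with `pip install chardet`
except ImportError:
    chardet = None

# Hypothetical stderr bytes: in cp1252, "é" is the single byte 0xe9.
raw = "Mémoire insuffisante pour démarrer la machine virtuelle Java".encode("cp1252")

try:
    raw.decode("utf-8")
    utf8_error = None
except UnicodeDecodeError as exc:
    # 0xe9 opens a three-byte UTF-8 sequence, so the next byte would have to
    # be a continuation byte (0x80-0xBF); here it is "m" (0x6d).
    utf8_error = exc

print("utf-8 failed:", utf8_error)

if chardet is not None:
    # detect() takes bytes and returns a dict with "encoding" and "confidence".
    guess = chardet.detect(raw)
    print("chardet guess:", guess)
else:
    guess = None
    print("chardet is not installed; no guess available")
```

The guess is only a statistical estimate, so short messages may be misidentified; treat it as a hint about which code page the subprocess is using rather than as a definitive answer.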

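The line:

getattr(sys.__stdout__, "encoding", locale.getpreferredencoding())
should be replaced with a call to chardet.detect, passing it the bytes being decoded (stdoutdata in the function above) rather than sys.__stdout__ itself.
chardet.detect returns a dict with an "encoding" entry and a "confidence" score. In recent chardet versions, chardet.detect_all returns a list of such dicts, one per candidate encoding; just pick the one with the highest confidence.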
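
One way this could look is sketched below: a variant of _decode_stdoutdata that tries the console encoding first and only falls back to a chardet guess when that fails. This is an assumption about how such a patch might be written, not NLTK's actual code; the name decode_with_fallback and the final errors="replace" step are illustrative choices, and the chardet import is guarded because it is a third-party package.

```python
import locale
import sys

try:
    import chardet  # third-party; optional in this sketch
except ImportError:
    chardet = None


def decode_with_fallback(data):
    """Decode subprocess output without raising UnicodeDecodeError.

    Hypothetical stand-in for nltk.internals._decode_stdoutdata.
    """
    if not isinstance(data, bytes):
        return data

    # Console encoding first, then the locale's preferred encoding.
    encoding = getattr(sys.__stdout__, "encoding", None) or locale.getpreferredencoding()
    try:
        return data.decode(encoding)
    except (UnicodeDecodeError, LookupError):
        pass

    # Ask chardet for its best guess based on the bytes themselves.
    if chardet is not None:
        guessed = chardet.detect(data)["encoding"]
        if guessed:
            try:
                return data.decode(guessed)
            except (UnicodeDecodeError, LookupError):
                pass

    # Last resort: keep going, marking undecodable bytes with U+FFFD.
    return data.decode("utf-8", errors="replace")
```

With this in place, a call such as decode_with_fallback(b"M\xe9moire") returns a string instead of raising, although the exact characters depend on which encoding ends up being used.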
