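What character set is é from? In Windows Notepad, an ANSI text file containing this character saves fine. Insert something like 😍 and you get an error. é seems to work fine in an ASCII terminal in PuTTY (are CP437 and IBM437 the same?), whereas 😍 does not.
I can see that 😍 is Unicode, not ASCII. But what is é? It doesn't produce the error I get with Unicode in Notepad, but Python was throwing SyntaxError: Non-ASCII character '\xc3' in file on line , but no encoding declared; until I added the "magic comment" suggested in Python NLTK: SyntaxError: Non-ASCII character '\xc3' in file (Sentiment Analysis -NLP).
I added the "magic comment" and no longer get that error, but os.path.isfile() says that a file name containing é does not exist. Ironically, the character é appears in Marc-André Lemburg, the name of the author of the PEP the error links to.
EDIT: If I print the path of the file, the accented e shows up as ├⌐, but I can copy and paste é into the command prompt.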
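For reference, the ├⌐ output is what the two UTF-8 bytes of é look like when they are decoded with the CP437 console code page. The sketch below is a minimal illustration, not part of the original script; the names `e_acute`, `emoji`, and `fits` are made up for the example, and it assumes the Windows "ANSI" code page is cp1252 (true for Western locales, not everywhere).

```python
# -*- coding: utf-8 -*-
import codecs

e_acute = u"\u00e9"    # é, U+00E9 LATIN SMALL LETTER E WITH ACUTE
emoji = u"\U0001F60D"  # 😍, U+1F60D, not present in any 8-bit code page

utf8_bytes = e_acute.encode("utf-8")     # two bytes: C3 A9
cp1252_bytes = e_acute.encode("cp1252")  # one byte: E9 (assumed "ANSI" code page)
cp437_bytes = e_acute.encode("cp437")    # one byte: 82 (DOS/console code page)

# Reading the UTF-8 bytes through the CP437 table yields the two box-drawing
# style characters seen in the printed path.
mojibake = utf8_bytes.decode("cp437")

# Python treats "IBM437" as an alias of its "cp437" codec.
ibm437_codec_name = codecs.lookup("IBM437").name


def fits(text, encoding):
    """Return True if every character of text exists in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


print(repr(utf8_bytes), repr(cp1252_bytes), repr(cp437_bytes))
print(fits(e_acute, "ascii"), fits(e_acute, "cp1252"), fits(emoji, "cp1252"))
```

So é is a Unicode character that also happens to exist in the cp1252 and CP437 code pages (but not in ASCII), which would explain why Notepad can save it as ANSI while 😍 cannot be saved that way.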
EDIT2: See below.
Private > cat scratch.py ### LOL cat scratch :3
# coding=utf-8
file_name = r"Filéname"
file_name = unicode(file_name)
Private > python scratch.py
Traceback (most recent call last):
  File "scratch.py", line 3, in <module>
    file_name = unicode(file_name)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 3: ordinal not in range(128)
Private >
EDIT3:
Private > PS1="Private > " ; echo code below ; cat scratch.py ; echo ======= ; echo output below ; python scratch.py
code below
# -*- coding: utf-8 -*-
file_name = r"Filéname"
file_name = unicode(file_name, encoding="utf-8")
# I have code here to determine a path depending on the hostname of the
# machine, the folder paths contain no Unicode characters, for my debug
# version of the script, I will hardcode the redacted hostname.
hostname = "One"
if hostname == "One":
    folder = "C:/path/folder_one"
elif hostname == "Two":
    folder = "C:/path/folder_two"
else:
    folder = "C:/path/folder_three"
path = "%s/%s" % (folder, file_name)
path = unicode(path, encoding="utf-8")
print path
=======
output below
Traceback (most recent call last):
  File "scratch.py", line 18, in <module>
    path = unicode(path, encoding="utf-8")
TypeError: decoding Unicode is not supported
Private >
You need to tell unicode what encoding the string is in; in this case it is utf-8, not ascii. The file header should be # -*- coding: utf-8 -*-. See Encoding Declarations.
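As a rough sketch of that advice applied to the script in the question: decode the byte string once with an explicit encoding, and do not decode a value that is already text. `to_text` is a hypothetical helper introduced only for this illustration, and `raw_name` spells out the bytes that the Python 2 literal "Filéname" holds in a UTF-8 source file. The code is written so it runs on both Python 2 and Python 3.

```python
# -*- coding: utf-8 -*-

# Bytes of the literal "Filéname" as stored in a UTF-8 encoded source file.
raw_name = b"Fil\xc3\xa9name"


def to_text(value, encoding="utf-8"):
    """Decode byte strings with an explicit encoding; return text unchanged.

    Hypothetical helper for this example, not a library function.
    """
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


file_name = to_text(raw_name)  # explicit utf-8 instead of the ascii default
folder = u"C:/path/folder_one"
path = u"%s/%s" % (folder, file_name)
path = to_text(path)           # already text, so nothing is decoded twice

# Decoding the same bytes as ascii fails on byte 0xc3, as in EDIT2.
try:
    raw_name.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

print(repr(path), ascii_ok)
```

In EDIT3, formatting with a unicode file_name already produces a unicode path, so the second unicode(path, encoding="utf-8") call has nothing left to decode and raises the TypeError; removing that call, or guarding it as above, avoids the error.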