UnicodeDecodeError: 'utf-8' codec can't decode byte 0xbf in position 355: invalid start byte


I have been trying to iterate over CSV files using the following code:

```python
import csv
import os, sys

directory = "/Users/aliharam/Desktop/Lamis File"
files = []
for filename in os.listdir(directory):
    f = os.path.join(directory, filename)
    # checking if it is a file
    if os.path.isfile(f):
        files.append(f)
files.pop()

for i in files:
    with open(i, 'r') as csvfile:
        datareader = csv.reader(csvfile)
        for row in datareader:
            print(row)
```

This is the error I get:

Traceback (most recent call last):
  File "/Users/aliharam/PycharmProjects/LamisTasks/Normalization.py", line 16, in <module>
    for row in datareader:
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xbf in position 355: invalid start byte
['\tAli Haram                                             \tAli Haram                                             ']

Process finished with exit code 1

How can I fix this?

I tried using

dataset = pd.read_csv(i, header= 0,
                          encoding= 'unicode_escape')

with io.open(filename, 'r', encoding='utf-8') as fn:
  lines = fn.readlines()

Neither of them worked.

python csv
1 Answer

The file your program is reading contains a byte (0xbf, at position 355) that is not valid UTF-8 at that point in the data.
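To see why the decoder complains about an "invalid start byte", here is a minimal sketch. The sample bytes `b"caf\xbf"` are made up for illustration and are not taken from the asker's file. In UTF-8, bytes in the range 0x80 to 0xBF are continuation bytes, so they can only follow a lead byte and can never begin a character.

```python
# 0xBF has the bit pattern 10111111: a UTF-8 continuation byte.
# It is only valid after a lead byte, never as the first byte of a character.
raw = b"caf\xbf"  # hypothetical sample data, not the asker's file

try:
    raw.decode("utf-8")
    utf8_error = None
except UnicodeDecodeError as exc:
    utf8_error = exc

# Same kind of message as in the traceback, with position 3 instead of 355.
print(utf8_error)

# In a single-byte encoding such as Latin-1, the same byte is an ordinary
# character (an inverted question mark).
as_latin1 = raw.decode("latin-1")
print(as_latin1)
```

So the error by itself does not prove the file is damaged. It may simply have been saved in an encoding other than UTF-8.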

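If the file is really supposed to be UTF-8, then the data file itself contains an error. Otherwise it was saved in a different encoding. So first check which encoding the file actually uses, then open it with that encoding.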
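One possible way to do that check is sketched below. The helper names `guess_encoding` and `read_rows` and the candidate list are assumptions made for this example, not part of the `csv` module. The idea is to read the raw bytes, try a few encodings in order, and pass the first one that works to `open()`.

```python
import csv


def guess_encoding(path, candidates=("utf-8", "cp1252", "latin-1")):
    """Return the first candidate encoding that decodes the whole file.

    Hypothetical helper. Note that "latin-1" accepts every byte value, so it
    never fails and only serves as a last resort: the text it produces may
    still be wrong if the file uses some other encoding.
    """
    with open(path, "rb") as fh:
        raw = fh.read()
    for enc in candidates:
        try:
            raw.decode(enc)
            return enc
        except UnicodeError:
            continue
    raise ValueError(f"none of {candidates} could decode {path}")


def read_rows(path):
    """Read a CSV file using the encoding picked by guess_encoding."""
    enc = guess_encoding(path)
    # newline="" is what the csv module documentation recommends for files
    # handed to csv.reader.
    with open(path, "r", encoding=enc, newline="") as csvfile:
        return list(csv.reader(csvfile))
```

In the loop from the question, `read_rows(i)` would stand in for the `open(i, 'r')` block. A successful decode only means the bytes are valid in that encoding, so it is still worth checking that the printed text looks right.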
