Python code to remove spaces between Chinese characters in multiple UTF-8 text files


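I am trying to write code in Python 3.7.2 that removes the spaces between Chinese characters in multiple UTF-8 text files located in the same directory.

The code I have so far handles only one file: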

import re

with open("transcript 0623.txt") as text:
    new_text = re.sub("(?<![ -~]) (?![ -~])", "", text)
    with open("transcript 0623_out.txt", "w") as result:
        result.write(new_text)

I get the following error:

Traceback (most recent call last):
  File "C:\Users\Admin\Desktop\Wave.3\test.py", line 4, in <module>
    new_text = re.sub("(?<![ -~]) (?![ -~])", "", text)
  File "C:\Users\Admin\AppData\Local\Programs\Python\Python37-32\Lib\re.py", line 192, in sub
    return _compile(pattern, flags).sub(repl, string, count)
TypeError: expected string or bytes-like object

Could you tell me what is wrong and help me improve the code? Thank you.

python python-3.x
1 Answer

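open() returns a file object, not a string (source: https://docs.python.org/3/library/functions.html#open). Passing that file object to re.sub() is what raises the TypeError.

If you want to run a regular expression over the contents of the file, you have to call the .read() method on the file object to get the text as a string.

For example: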

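import re

# Pass the encoding explicitly: the files are UTF-8, and the Windows default is not
with open("transcript 0623.txt", encoding="utf-8") as f:
    text = f.read()  # read the file contents into a str

# Remove every space that is neither preceded nor followed by a printable ASCII character
new_text = re.sub("(?<![ -~]) (?![ -~])", "", text)

with open("transcript 0623_out.txt", "w", encoding="utf-8") as result:
    result.write(new_text)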
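
To cover the "multiple files in the same directory" part of the question, one possible approach is to wrap the same steps in a loop over the directory. The following is only a sketch under some assumptions: the names strip_spaces and process_directory are made up for illustration, the input files are assumed to match *.txt, and the "_out" suffix simply mirrors the file naming used in the question.

```python
import re
from pathlib import Path

# Same pattern as in the question: a space with no printable ASCII
# character (the range " " to "~") directly before or after it.
SPACE_BETWEEN_NON_ASCII = re.compile(r"(?<![ -~]) (?![ -~])")


def strip_spaces(text: str) -> str:
    """Return text with the spaces matched by the pattern removed."""
    return SPACE_BETWEEN_NON_ASCII.sub("", text)


def process_directory(directory: str = ".", suffix: str = "_out") -> list:
    """Write a cleaned copy of every .txt file found in directory.

    Assumption: inputs match *.txt. Files whose name already ends in
    the suffix are skipped, so rerunning does not reprocess old output.
    """
    written = []
    # sorted() builds the full list first, so files created inside
    # the loop are not picked up during the same run.
    for path in sorted(Path(directory).glob("*.txt")):
        if path.stem.endswith(suffix):
            continue
        text = path.read_text(encoding="utf-8")
        out_path = path.with_name(path.stem + suffix + path.suffix)
        out_path.write_text(strip_spaces(text), encoding="utf-8")
        written.append(out_path)
    return written
```

Calling process_directory(".") from the folder that holds the transcripts should then write one "_out" file per input. Note that the pattern does not test for Chinese characters as such: it removes any space whose neighbours are both outside printable ASCII, which also covers spaces next to a line break or to full-width punctuation. If that is too broad for your data, the character class would need to be narrowed.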