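I'm having a problem reading a file. When I read a file (a TeX file) in Python, spaces appear between the characters of the file content (see screenshot), even though the input contains no spaces. I open the file as UTF-8 in Python, and the space characters show up when I print the content. How can I fix this?
Python code: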
open_TeX = open(file_name,'r',encoding="utf-8")
tex_cnt = open_TeX.read()
open_TeX.close()
File content:
\begin{document}
\frontmatter%
\include{00-Meyer-Prelims}%
\mainmatter%
\include{01-Meyer-Ch01}%
\include{02-Meyer-Ch02}%
\include{03-Meyer-Ch03}%
\include{04-Meyer-Ch04}%
\include{05-Meyer-Ch05}%
\include{06-Meyer-Ch06}%
\include{07-Meyer-Ch07}%
\include{08-Meyer-Ch08}%
\backmatter
\include{09-Meyer-Appendix}%
\include{10-Meyer-Reference}%
\include{12-Meyer-Index}%
\include{11-Meyer-Seriespages}%
\end{document}
Output: (see screenshot)
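Make sure the file really is UTF-8 encoded. If it uses a different encoding, you may see unexpected characters. You can use the third-party chardet library to detect the file's encoding automatically.
Example: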
import chardet

# Read the raw bytes so chardet can inspect them before any decoding happens
with open(file_name, 'rb') as f:
    result = chardet.detect(f.read())
print(f"Detected encoding: {result['encoding']}")
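If installing chardet is not an option, a rough standard-library check can cover one common case. This is only a sketch, and it assumes (not confirmed by the question) that the file is actually UTF-16 without a byte-order mark: such a file decodes as UTF-8 without an error, but every ASCII character is followed by a NUL character, which many terminals render as a space. The helper names `guess_encoding` and `read_text` are made up for illustration.

```python
import codecs

# Byte-order marks to check, longest first, because the UTF-32 LE BOM
# begins with the same two bytes as the UTF-16 LE BOM.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def guess_encoding(raw: bytes) -> str:
    """Hypothetical helper: guess a text encoding from the raw bytes."""
    for bom, name in _BOMS:
        if raw.startswith(bom):
            return name
    # No BOM: mostly-ASCII UTF-16 text has a NUL in every other byte.
    if raw[1::2].count(0) > len(raw) // 4:
        return "utf-16-le"
    if raw[0::2].count(0) > len(raw) // 4:
        return "utf-16-be"
    # Fall back to the encoding the question already uses.
    return "utf-8"


def read_text(path: str) -> str:
    """Hypothetical helper: read a file using the guessed encoding."""
    with open(path, "rb") as f:
        raw = f.read()
    return raw.decode(guess_encoding(raw))
```

With this, `tex_cnt = read_text(file_name)` would replace the open/read/close sequence from the question. The NUL-counting heuristic is crude and only meant for mostly-ASCII files such as this TeX source.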
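When I tried to reproduce your problem, everything seemed fine: I got the same TeX file content back.
I suggest using the .replace and/or .strip methods on the TeX file content, as shown below: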
open_TeX = open("f.tex", 'r', encoding="utf-8")
tex_cnt = open_TeX.read().strip(" ")  # Removes leading and trailing spaces only
# Or use tex_cnt = open_TeX.read().replace(" ", "") instead, which removes every
# space in the content, including any legitimate ones.
open_TeX.close()
print(tex_cnt)
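Note that .strip and .replace(" ", "") only help if the extra characters really are ASCII spaces. Before removing anything, it may be worth checking what they actually are. The sketch below counts characters that Python considers non-printable (NUL, no-break space, and so on); the function name `suspicious_chars` is made up for illustration.

```python
from collections import Counter
import unicodedata


def suspicious_chars(text: str) -> dict:
    """Hypothetical helper: count non-printable characters in text.

    Ordinary spaces are printable and are not reported; newlines,
    carriage returns and tabs are skipped on purpose.
    """
    counts = Counter(
        ch for ch in text
        if ch not in "\n\r\t" and not ch.isprintable()
    )
    # Label each character by code point and Unicode name
    # (control characters have no name, hence the default).
    return {
        f"U+{ord(ch):04X} {unicodedata.name(ch, 'UNNAMED')}": n
        for ch, n in counts.items()
    }
```

Calling `print(suspicious_chars(tex_cnt))` on the content from the question would show whether the "spaces" are something else, such as U+0000, in which case the encoding is the thing to fix rather than the string. `print(repr(tex_cnt[:80]))` is a quicker way to get a similar look at the first few characters.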