ParseError when parsing XML with ElementTree in Python

Question (votes: 0, answers: 1)

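I'm new to Python and am having trouble parsing XML fetched from a URL. I have tried many ways to fix the code but I'm stuck. Any suggestions would be very helpful!

The program below fails at 'xtree = ET.parse(fhand)'.

Error: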

Traceback (most recent call last):
  File "C:\Users\Simeon\Desktop\Py4e\ex13_1.py", line 12, in <module>
    xtree = ET.parse(fhand)
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.10_3.10.2800.0_x64__qbz5n2kfra8p0\lib\xml\etree\ElementTree.py", line 1222, in parse
    tree.parse(source, parser)
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.10_3.10.2800.0_x64__qbz5n2kfra8p0\lib\xml\etree\ElementTree.py", line 580, in parse
    self._root = parser._parse_whole(source)
xml.etree.ElementTree.ParseError: no element found: line 1, column 0

Code:

import xml.etree.ElementTree as ET
import urllib.request
from urllib.request import urlopen

fhand = urllib.request.urlopen('https://py4e-data.dr-chuck.net/comments_42.xml')
sum = 0
for line in fhand: 
    sum = sum + len(line)
print(sum)

xtree = ET.parse(fhand)
xroot = xtree.getroot()

xlist = xtree.findall()
print(len(xlist))
lst = xtree.findall('comments/comment')
print('count:' , len(lst))

flist = [] 
for item in lst: 
    num = item.find('count').text
    flist.append(num) 

for i in range(0, len(flist)): 
    flist[i] = int(flist[i]) 

print(sum(flist))

I tried converting the response to a string, but ET.parse expects a bytes-like object. I also got a lot of HTTPResponse errors and I'm not sure why.

python xml elementtree
1 answer
0 votes

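fhand is a file-like object (the HTTPResponse returned by urlopen). When you loop over it to measure the length of the response, you read the stream through to its end, so nothing is left for ET.parse to read. That is why the parser reports "no element found: line 1, column 0". With a regular file you could call fhand.seek(0) to move the position back to the start, but an HTTPResponse is not seekable, so that does not work here. A better approach is to take the size from the HTTPResponse headers and leave the body unread until you parse it.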
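The effect can be reproduced without a network connection. The following is a minimal sketch, assuming an in-memory io.BytesIO stream as a stand-in for the response; the SAMPLE document is made up for illustration and is not the real feed. Note that BytesIO can be rewound with seek(0), whereas the HTTPResponse returned by urlopen is not seekable, so the final step applies only to seekable objects.

```python
import io
import xml.etree.ElementTree as ET

# Made-up stand-in for the HTTP response body (not the real feed).
SAMPLE = (
    b"<commentinfo><comments>"
    b"<comment><name>A</name><count>97</count></comment>"
    b"</comments></commentinfo>"
)

stream = io.BytesIO(SAMPLE)

# Iterating line by line reads the stream through to its end.
total = 0
for line in stream:
    total += len(line)

# Nothing is left to read, so the parser is handed an empty document.
leftover = stream.read()
try:
    ET.parse(stream)
    error = None
except ET.ParseError as exc:
    error = str(exc)

# BytesIO is seekable, so rewinding makes the data readable again.
stream.seek(0)
root = ET.parse(stream).getroot()

print("bytes counted:", total)
print("left after loop:", leftover)
print("parse error:", error)
print("root after rewind:", root.tag)
```

The ParseError raised here is the same kind of failure as in the traceback from the question: the parser is given a stream that has already been read to the end.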

Rewritten code:

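import urllib.request
import xml.etree.ElementTree as ET

# Open the URL; the response is a one-shot, file-like stream.
fhand = urllib.request.urlopen('https://py4e-data.dr-chuck.net/comments_42.xml')
# Take the size from the headers instead of reading the body.
print(f"Content-Length: {fhand.getheader('Content-Length')}")

# Parse directly from the still-unread response.
xtree = ET.parse(fhand)

# Find every <comment> element anywhere in the tree.
lst = xtree.findall('.//comment')
print(f"Count: {len(lst)}")

# Convert the text of each <count> to int and add them up.
flist = [int(item.find('count').text) for item in lst]
print(f"Sum: {sum(flist)}")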

Output:

Content-Length: 4189
Count: 50
Sum: 2553
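
If the exact number of bytes received matters (a server is not required to send a Content-Length header, in which case getheader returns None), another option is to read the body once into a bytes object and parse that. The following is a rough sketch, not the answer's method: fetch and summarize are hypothetical helper names, and SAMPLE is made-up data used so the example runs offline. It also avoids two further problems in the question's code: assigning sum = 0 shadows the built-in sum, so the later sum(flist) would fail, and findall() needs a path argument.

```python
import urllib.request
import xml.etree.ElementTree as ET

URL = 'https://py4e-data.dr-chuck.net/comments_42.xml'


def fetch(url):
    """Read the whole response body once and return it as bytes (hypothetical helper)."""
    with urllib.request.urlopen(url) as fhand:
        return fhand.read()


def summarize(data):
    """Return (byte_length, comment_count, total) for an XML payload (hypothetical helper)."""
    root = ET.fromstring(data)  # accepts bytes as well as str
    counts = [int(c.text) for c in root.findall('comments/comment/count')]
    return len(data), len(counts), sum(counts)


# Made-up sample with the same element names the question's code looks for.
SAMPLE = (
    b"<commentinfo><comments>"
    b"<comment><name>A</name><count>97</count></comment>"
    b"<comment><name>B</name><count>3</count></comment>"
    b"</comments></commentinfo>"
)

size, count, total = summarize(SAMPLE)
print(f"Bytes read: {size}")
print(f"Count: {count}")
print(f"Sum: {total}")
```

To run it against the real feed, call summarize(fetch(URL)) instead of passing SAMPLE. Because the bytes are kept in a variable, they can be measured and parsed as many times as needed.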