Questions tagged lxml

lxml is a full-featured, high-performance Python library for processing XML and HTML.

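What is the best way to get a list of all XPaths from a given HTML document in Python?

I want to get a list of all XPaths from a given HTML document in Python. My current implementation, which uses the lxml library, only gives me relative XPaths. I need XPaths that use ids and other...

Answers: 1 Votes: 0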
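
The excerpt above is cut off, but it appears to ask for XPaths anchored on id attributes rather than purely positional ones. A minimal sketch of one way to build them by hand follows. The function name all_xpaths and the sample markup are assumptions made for illustration, and the sample is parsed as well-formed XML, whereas real-world HTML would normally go through lxml.html. Only ElementTree-compatible calls are used, so the sketch falls back to the standard library when lxml is not installed.

```python
try:
    from lxml import etree  # preferred; the calls below are ElementTree-compatible
except ImportError:  # fallback so the sketch still runs without lxml
    import xml.etree.ElementTree as etree

# Hypothetical sample document, well-formed so it parses as XML.
SAMPLE = """<html><body>
<div id="main"><p>one</p><p>two</p></div>
<div><span>three</span></div>
</body></html>"""


def all_xpaths(root):
    """Return one XPath per element, anchored on the nearest id attribute."""
    paths = []

    def walk(elem, parent_path, index):
        if elem.get("id"):
            # An id is assumed to be unique, so the path can restart here.
            path = '//*[@id="%s"]' % elem.get("id")
        else:
            path = "%s/%s[%d]" % (parent_path, elem.tag, index)
        paths.append(path)
        counts = {}  # per-tag position among the children, as XPath counts it
        for child in elem:
            if not isinstance(child.tag, str):
                continue  # skip comments and processing instructions
            counts[child.tag] = counts.get(child.tag, 0) + 1
            walk(child, path, counts[child.tag])

    walk(root, "", 1)
    return paths


xpaths = all_xpaths(etree.fromstring(SAMPLE))
for xp in xpaths:
    print(xp)
```

When purely positional absolute paths are enough, lxml's ElementTree.getpath(element) produces them directly.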

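Cannot import name 'etree' in Python 3.7. How can I get it to work?

I am working through chapter 13 of Automate the Boring Stuff with Python, but I can't figure out how to get the python-docx module to work. When I try to import it, I get ImportError: cannot import name 'etree...

Answers: 1 Votes: 0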

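Scraping a website with requests and lxml in Python

I am trying to scrape this website to get the title and body content ("Description" and "Features") as well as the PDFs linked from the page. When I try to get the text using XPath, I...

Answers: 1 Votes: 0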

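Conditionally replace nodes in XML (a Word document) with Python?

How can I replace XML based on the content of a nearby tag? I have a long Word document containing many developer content fields, specifically drop-down lists. I want to change the selected...

Answers: 1 Votes: 0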

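How do I use lxml in Python to insert (append) a string containing XML as XML into inner XML (or remove a parent tag but keep its content)?

I am trying to insert text into XML using lxml. The text contains XML, which should become part of the XML it is inserted into. The following code does not work: from lxml import etree tree = ...

Answers: 1 Votes: 0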
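
The asker's code is cut off in the excerpt above, so the following is only a sketch of one common approach, not a reconstruction of the accepted answer: wrap the string in a temporary element so that it parses as well-formed XML, then move the wrapper's text and children into the target element. The helper name append_fragment, the wrapper tag and the sample document are assumptions.

```python
try:
    from lxml import etree  # preferred; the calls below are ElementTree-compatible
except ImportError:  # fallback so the sketch still runs without lxml
    import xml.etree.ElementTree as etree


def append_fragment(parent, fragment):
    """Append a string of mixed text and XML to parent as real child nodes."""
    # The temporary wrapper makes the fragment well-formed even when it has
    # leading text or several top-level elements.
    wrapper = etree.fromstring("<wrapper>%s</wrapper>" % fragment)
    if wrapper.text:
        if len(parent):
            # Text after existing children belongs to the last child's tail.
            last = parent[-1]
            last.tail = (last.tail or "") + wrapper.text
        else:
            parent.text = (parent.text or "") + wrapper.text
    # Iterate over a copy: in lxml, append() moves the node out of wrapper.
    for child in list(wrapper):
        parent.append(child)


root = etree.fromstring("<doc><para>Hello </para></doc>")
append_fragment(root.find("para"), "big <b>bold</b> world")
result = etree.tostring(root, encoding="unicode")
print(result)
```

The wrapper element itself never reaches the target tree, which is also how a parent tag can be dropped while its content is kept.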

Manipulating XML with Python

I am using Python 3.9. I have a nested XML document string like this:

    payload_xml = """
    <AllData>
      <MyPayload>
        ...
        ...
        ...
      </MyPayload>
    </AllData>
    """

Now I want to create another parent XML document string and pull this payload into the newly created XML document, as follows.

New XML:

    <Full_Message prop1="" prop2="">
      <Header>
        <headerValue1> </headerValue1>
        <headerValue2> </headerValue2>
        <headerValue3> </headerValue3>
        <NestedValues>
          <someval1> </someval1>
        </NestedValues>
      </Header>
      <Body>
        <!--Insert MyPayload xml string here ignoring AllData node-->
      </Body>
    </Full_Message>

This is where I am so far:

    from lxml import etree

    FullMessage_root = etree.Element("Full_Message")
    AllData_root = etree.fromstring(payload_xml)
    payload_only = AllData_root[0]
    FullMessage_root.append(payload_only)
    FullMessage_root.insert(0, etree.Element("Header"))
    FullMessage_root.insert(1, etree.Element("Body"))
    FullMessage_root.attrib['prop1'] = 'hello world'

This results in:

    <Full_Message prop1="hello world">
      <Header/>
      <Body/>
      <MyPayload> </MyPayload>
    </Full_Message>

How can I nest <MyPayload> inside the <Body> tag and create multiple nested values inside <Header>?

Here is one way to achieve your goal. We build the new XML layer by layer from the top down and use append to attach child elements to their parent.

    import xml.etree.ElementTree as ET

    def create_elements(parent_ele, child_tags, child_vals):
        """Append one child per tag to parent_ele, setting text where given."""
        for tag, val in zip(child_tags, child_vals):
            ele = ET.Element(tag)
            if val:
                ele.text = val
            parent_ele.append(ele)

    payload_xml = '''
    <AllData>
        <MyPayload> Foo </MyPayload>
    </AllData>
    '''

    # Create root
    root = ET.Element('Full_Message')
    root.set('prop1', 'some prop')
    root.set('prop2', 'other prop')

    # Add elements to root
    create_elements(
        root,
        ['Header', 'Body'],
        [None] * 2,  # no text value attached to header and body
    )

    # Add elements to header
    create_elements(
        root.find('Header'),
        ['headerValue1', 'headerValue2', 'headerValue3', 'NestedValues'],
        ['val1', 'val2', 'val3', None],  # note that no text value attached to NestedValues
    )

    # Add elements to NestedValues
    create_elements(
        root.find('Header').find('NestedValues'),
        ['someval1', 'someval2', 'someval3'],
        ['nested val1', 'nested val2', 'nested val3'],
    )

    # Insert only <MyPayload>, skipping the <AllData> wrapper as the question asks
    AllData_root = ET.fromstring(payload_xml)
    root.find('Body').append(AllData_root.find('MyPayload'))

    # print the new XML (ET.indent requires Python 3.9+)
    ET.indent(root)
    print(ET.tostring(root, encoding='unicode'))

The output will be:

    <Full_Message prop1="some prop" prop2="other prop">
      <Header>
        <headerValue1>val1</headerValue1>
        <headerValue2>val2</headerValue2>
        <headerValue3>val3</headerValue3>
        <NestedValues>
          <someval1>nested val1</someval1>
          <someval2>nested val2</someval2>
          <someval3>nested val3</someval3>
        </NestedValues>
      </Header>
      <Body>
        <MyPayload> Foo </MyPayload>
      </Body>
    </Full_Message>

Answers: 1 Votes: 0
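
The question above uses lxml while the answer uses the standard library. As a hedged alternative, the same nesting can be sketched with SubElement, which creates a child and attaches it to its parent in one call. The attribute values and the shortened payload string here are placeholders, and the try/except import is only there so the sketch also runs without lxml.

```python
try:
    from lxml import etree  # preferred; the calls below are ElementTree-compatible
except ImportError:  # fallback so the sketch still runs without lxml
    import xml.etree.ElementTree as etree

# Shortened stand-in for the payload string from the question.
payload_xml = "<AllData><MyPayload>Foo</MyPayload></AllData>"

full = etree.Element("Full_Message", prop1="hello world", prop2="")

# SubElement creates the child and appends it to the parent in one step.
header = etree.SubElement(full, "Header")
for tag in ("headerValue1", "headerValue2", "headerValue3"):
    etree.SubElement(header, tag)
nested = etree.SubElement(header, "NestedValues")
etree.SubElement(nested, "someval1")
body = etree.SubElement(full, "Body")

# Take the first child of <AllData> and attach it under <Body>,
# which leaves the <AllData> wrapper behind.
payload_only = etree.fromstring(payload_xml)[0]
body.append(payload_only)

print(etree.tostring(full, encoding="unicode"))
```

Appending to body rather than to the root element is what places <MyPayload> inside <Body>; in the question's code the payload was appended to Full_Message itself.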

Parsing an XLIFF file with the lxml library

I can't parse this XLIFF fragment: text1 text2 text3 text4 I want an iterative method...

Answers: 1 Votes: 0

Python lxml parser does not return the whole text of a <p> element if there is an <xref> inside it

I am trying to extract the text from all <p> elements of an article in .xml format using lxml. Here is a sample of the article:

    <title>Continuing Education Activity</title>
    <p>Ebstein anomaly is a rare congenital heart disease that involves the apical displacement of the tricuspid valve with adherence of the septal and posterior leaflets to the myocardium and &#x00022;atrialization of the inlet portion of the right ventricle&#x00022;. It is usually accompanied by tricuspid regurgitation, right ventricular failure, and arrhythmias. Clinical manifestations range from asymptomatic to severe, depending on the degree of tricuspid valve displacement and severity of regurgitation, the effective right ventricular volume, and the associated malformations (i.e., pulmonary valve stenosis, atresia, atrial septal defect, etc.). Arrhythmias are common and protracted due to the likelihood of having accessory pathways, in addition to having right atrial dilatation. Symptomatic patients can present with cyanosis, congestive heart failure, and arrhythmias, with exertional dyspnea being common in older patients. 
    This activity reviews the pathophysiology and presentation of Ebstein's malformation and highlights the role of the interprofessional team in its management.</p>
    <p>
    <bold>Objectives:</bold>
    <list list-type="bullet"><list-item><p>Describe the pathophysiology of Ebstein anomaly.</p></list-item><list-item><p>Review the clinical presentation of a patient with an Ebstein anomaly.</p></list-item><list-item><p>Outline the approach to the management of patients with Ebstein anomaly, including the indications for non-surgical and surgical intervention.</p></list-item><list-item><p>Summarize the importance of improving care coordination among interprofessional team members to improve outcomes for patients affected by Ebstein anomaly.</p></list-item></list>
    <ext-link xmlns:xlink="http://www.w3.org/1999/xlink" ext-link-type="uri" xlink:href="https://www.statpearls.com/account/trialuserreg/?articleid=20850&#x00026;utm_source=pubmed&#x00026;utm_campaign=reviews&#x00026;utm_content=20850">Access free multiple choice questions on this topic.</ext-link>
    </p>
    </sec>
    <sec id="article-20850.s2" sec-type="pubmed-excerpt">
    <title>Introduction</title>
    <p>Ebstein anomaly is a rare congenital abnormality involving the tricuspid valve and the right ventricle (RV),<xref ref-type="bibr" rid="article-20850.r1">[1]</xref>&#x000a0;with an incidence of &#x0003c;1% of congenital heart defects.<xref ref-type="bibr" rid="article-20850.r2">[2]</xref>&#x000a0;It was first described by the pathologist Wilhelm Ebstein in 1866 when he performed an autopsy of a 19-year-old cyanotic&#x000a0;male who had suffered from exertional dyspnea and palpitations and died of a sudden cardiac arrest.<xref ref-type="bibr" rid="article-20850.r3">[3]</xref>&#x000a0;Ebstein anomaly&#x000a0;is defined by the following characteristics:</p>

Notice how the last <p> element is interspersed with <xref> elements that serve as citations. When I extract the text with the following Python code:

    from lxml import etree

    def extract_text(filename):
        chunks = []
        tree = etree.parse('./data/statpearls_NBK430685/' + filename)
        root = tree.getroot()
        p_tags = tree.findall('.//p')
        # list_tags = tree.findall('.//list')  # whenever there's a list, include the para above as well as context.
        for p in p_tags:
            if p.text is None:
                continue
            elif not any(char.isalpha() for char in p.text):
                # check that there are some alphabetical characters and ignore if there aren't
                continue
            chunks.append(p.text)
        return chunks

    extract_text('article-20850.nxml')

This is the output:

    ['Ebstein anomaly is a rare congenital heart disease that involves the apical displacement of the tricuspid valve with adherence of the septal and posterior leaflets to the myocardium and "atrialization of the inlet portion of the right ventricle". It is usually accompanied by tricuspid regurgitation, right ventricular failure, and arrhythmias. Clinical manifestations range from asymptomatic to severe, depending on the degree of tricuspid valve displacement and severity of regurgitation, the effective right ventricular volume, and the associated malformations (i.e., pulmonary valve stenosis, atresia, atrial septal defect, etc.). Arrhythmias are common and protracted due to the likelihood of having accessory pathways, in addition to having right atrial dilatation. Symptomatic patients can present with cyanosis, congestive heart failure, and arrhythmias, with exertional dyspnea being common in older patients. 
    This activity reviews the pathophysiology and presentation of Ebstein\'s malformation and highlights the role of the interprofessional team in its management.',
     'Describe the pathophysiology of Ebstein anomaly.',
     'Review the clinical presentation of a patient with an Ebstein anomaly.',
     'Outline the approach to the management of patients with Ebstein anomaly, including the indications for non-surgical and surgical intervention.',
     'Summarize the importance of improving care coordination among interprofessional team members to improve outcomes for patients affected by Ebstein anomaly.',
     'Ebstein anomaly is a rare congenital abnormality involving the tricuspid valve and the right ventricle (RV),']

The last chunk completely loses all the text after the first <xref> tag. Does anyone know what causes this behavior and how to prevent it?

I suggest using the beautifulsoup library to parse this mixed HTML/XML file:

    from bs4 import BeautifulSoup

    text = """\
    <title>Continuing Education Activity</title>
    <p>Ebstein anomaly is a rare congenital heart disease that involves the apical displacement of the tricuspid valve with adherence of the septal and posterior leaflets to the myocardium and &#x00022;atrialization of the inlet portion of the right ventricle&#x00022;. It is usually accompanied by tricuspid regurgitation, right ventricular failure, and arrhythmias. Clinical manifestations range from asymptomatic to severe, depending on the degree of tricuspid valve displacement and severity of regurgitation, the effective right ventricular volume, and the associated malformations (i.e., pulmonary valve stenosis, atresia, atrial septal defect, etc.). Arrhythmias are common and protracted due to the likelihood of having accessory pathways, in addition to having right atrial dilatation. Symptomatic patients can present with cyanosis, congestive heart failure, and arrhythmias, with exertional dyspnea being common in older patients. 
    This activity reviews the pathophysiology and presentation of Ebstein's malformation and highlights the role of the interprofessional team in its management.</p>
    <p>
    <bold>Objectives:</bold>
    <list list-type="bullet"><list-item><p>Describe the pathophysiology of Ebstein anomaly.</p></list-item><list-item><p>Review the clinical presentation of a patient with an Ebstein anomaly.</p></list-item><list-item><p>Outline the approach to the management of patients with Ebstein anomaly, including the indications for non-surgical and surgical intervention.</p></list-item><list-item><p>Summarize the importance of improving care coordination among interprofessional team members to improve outcomes for patients affected by Ebstein anomaly.</p></list-item></list>
    <ext-link xmlns:xlink="http://www.w3.org/1999/xlink" ext-link-type="uri" xlink:href="https://www.statpearls.com/account/trialuserreg/?articleid=20850&#x00026;utm_source=pubmed&#x00026;utm_campaign=reviews&#x00026;utm_content=20850">Access free multiple choice questions on this topic.</ext-link>
    </p>
    </sec>
    <sec id="article-20850.s2" sec-type="pubmed-excerpt">
    <title>Introduction</title>
    <p>Ebstein anomaly is a rare congenital abnormality involving the tricuspid valve and the right ventricle (RV),<xref ref-type="bibr" rid="article-20850.r1">[1]</xref>&#x000a0;with an incidence of &#x0003c;1% of congenital heart defects.<xref ref-type="bibr" rid="article-20850.r2">[2]</xref>&#x000a0;It was first described by the pathologist Wilhelm Ebstein in 1866 when he performed an autopsy of a 19-year-old cyanotic&#x000a0;male who had suffered from exertional dyspnea and palpitations and died of a sudden cardiac arrest.<xref ref-type="bibr" rid="article-20850.r3">[3]</xref>&#x000a0;Ebstein anomaly&#x000a0;is defined by the following characteristics:</p>
    """

    soup = BeautifulSoup(text, "html.parser")

    # remove <xref> to not appear in text
    for xref in soup.select("xref"):
        xref.extract()

    for p in soup.select("p"):
        print(p.get_text(strip=True, separator=" "))
        print("-" * 80)

Prints:

    Ebstein anomaly is a rare congenital heart disease that involves the apical displacement of the tricuspid valve with adherence of the septal and posterior leaflets to the myocardium and "atrialization of the inlet portion of the right ventricle". It is usually accompanied by tricuspid regurgitation, right ventricular failure, and arrhythmias. Clinical manifestations range from asymptomatic to severe, depending on the degree of tricuspid valve displacement and severity of regurgitation, the effective right ventricular volume, and the associated malformations (i.e., pulmonary valve stenosis, atresia, atrial septal defect, etc.). Arrhythmias are common and protracted due to the likelihood of having accessory pathways, in addition to having right atrial dilatation. Symptomatic patients can present with cyanosis, congestive heart failure, and arrhythmias, with exertional dyspnea being common in older patients. This activity reviews the pathophysiology and presentation of Ebstein's malformation and highlights the role of the interprofessional team in its management.
    --------------------------------------------------------------------------------
    Objectives: Describe the pathophysiology of Ebstein anomaly. Review the clinical presentation of a patient with an Ebstein anomaly. Outline the approach to the management of patients with Ebstein anomaly, including the indications for non-surgical and surgical intervention. Summarize the importance of improving care coordination among interprofessional team members to improve outcomes for patients affected by Ebstein anomaly. Access free multiple choice questions on this topic.
    --------------------------------------------------------------------------------
    Describe the pathophysiology of Ebstein anomaly.
    --------------------------------------------------------------------------------
    Review the clinical presentation of a patient with an Ebstein anomaly.
    --------------------------------------------------------------------------------
    Outline the approach to the management of patients with Ebstein anomaly, including the indications for non-surgical and surgical intervention.
    --------------------------------------------------------------------------------
    Summarize the importance of improving care coordination among interprofessional team members to improve outcomes for patients affected by Ebstein anomaly.
    --------------------------------------------------------------------------------
    Ebstein anomaly is a rare congenital abnormality involving the tricuspid valve and the right ventricle (RV), with an incidence of <1% of congenital heart defects. It was first described by the pathologist Wilhelm Ebstein in 1866 when he performed an autopsy of a 19-year-old cyanotic male who had suffered from exertional dyspnea and palpitations and died of a sudden cardiac arrest. Ebstein anomaly is defined by the following characteristics:
    --------------------------------------------------------------------------------

Answers: 1 Votes: 0
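
On the cause asked about in the question above: in the ElementTree model that lxml follows, an element's .text holds only the text up to its first child, and any text that follows a child is stored on that child's .tail. The sketch below stays within that API and collects both, while leaving out the citation markers. The helper name paragraph_text and the shortened sample are assumptions, and the import falls back to the standard library only so that the sketch runs without lxml.

```python
try:
    from lxml import etree  # preferred; the calls below are ElementTree-compatible
except ImportError:  # fallback so the sketch still runs without lxml
    import xml.etree.ElementTree as etree

# Shortened, hypothetical stand-in for the article excerpt.
SAMPLE = (
    '<sec><p>Ebstein anomaly is rare,<xref rid="r1">[1]</xref> with an '
    'incidence of &lt;1%.<xref rid="r2">[2]</xref> It was first described '
    'in 1866.</p></sec>'
)


def paragraph_text(p, skip_tags=("xref",)):
    """Join p.text with the text and tail of its descendants, minus skip_tags."""
    parts = [p.text or ""]
    for child in p:
        if child.tag not in skip_tags:
            parts.append(paragraph_text(child, skip_tags))
        # The tail is the text that follows the child inside p; keep it even
        # when the child itself is skipped.
        parts.append(child.tail or "")
    return "".join(parts)


root = etree.fromstring(SAMPLE)
chunks = [paragraph_text(p) for p in root.iter("p")]
print(chunks)
```

If the citation markers may stay in the text, "".join(p.itertext()) gives the full text of a paragraph in a single call.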

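Why do I get a "ModuleNotFoundError: No module named 'lxml'" error in an Azure DevOps pipeline where the library is installed?

I am trying to run a simple Python script that works fine locally but keeps hitting the same error in the DevOps pipeline. I have included the installation of the library in the YAML file and in...

Answers: 1 Votes: 0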

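How to ignore non-content elements with XPath in lxml

I am trying to process a bunch of XML files and add certain attributes to specific elements when certain conditions are met. I have different versions of the same XML document. Some of them have...

Answers: 1 Votes: 0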

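Stripping the lxml root tag in Python

Given a sample country.xml file, I want to copy each country into a new output.xml file as a child element of a new root. The problem is that when I append each country, I get duplicate...

Answers: 1 Votes: 0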
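
The excerpt above ends before it describes the duplication, so the following is only a sketch of the copying step itself. It parses an in-memory stand-in for country.xml, creates a new root, and appends a deep copy of each country so that the old root tag is left behind. The sample data and the new root name countries are assumptions.

```python
import copy

try:
    from lxml import etree  # preferred; the calls below are ElementTree-compatible
except ImportError:  # fallback so the sketch still runs without lxml
    import xml.etree.ElementTree as etree

# Hypothetical stand-in for the contents of country.xml.
COUNTRY_XML = """<data>
  <country name="Liechtenstein"><rank>1</rank></country>
  <country name="Singapore"><rank>4</rank></country>
</data>"""

source_root = etree.fromstring(COUNTRY_XML)
new_root = etree.Element("countries")  # hypothetical name for the new root

for country in source_root.findall("country"):
    # Deep-copy so the source tree stays untouched; in lxml, appending the
    # original element would move it out of source_root.
    new_root.append(copy.deepcopy(country))

names = [c.get("name") for c in new_root]
print(names)
print(etree.tostring(new_root, encoding="unicode"))
```

Writing the result to output.xml would then be a matter of wrapping new_root in etree.ElementTree(new_root) and calling its write() method.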

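Structural pattern matching on lxml HtmlElement attributes

I want to use PEP 634 (Structural Pattern Matching) to match an HtmlElement that has specific attributes. The attributes are accessible through the .attrib property, which returns...

Answers: 1 Votes: 0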

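Parsing a file containing repeated elements with lxml

I am trying to process a GDML file, which is not what I would call a flat structure. Instead of having a single element, I...

Answers: 1 Votes: 0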

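Python 3, lxml and "Symbol not found: _lzma_auto_decoder" on Mac OS X 10.9

I installed Python 3 using Homebrew, then installed pip3 and lxml. The following line: from lxml import etree results in the following error: $ python3 Python...

Answers: 5 Votes: 0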

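AWS Lambda Python 3.11: Unable to import lxml: libxslt.so.1: cannot open shared object file: No such file or directory

I have a Python function on AWS Lambda that depends on lxml. The dependency layer contains the result of installing lxml with Poetry, but I get the following error at runtime: "errorMessage": ...

Answers: 1 Votes: 0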

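XML formatting issue

I am using the Salesforce SOAP API for the first time, so I am not familiar with SOAP formatting issues and the like. I generate the XML with the lxml library, but there seems to be a formatting problem...

Answers: 3 Votes: 0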

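I am trying to install Scrapy, but this is the error I get: Failed building wheel for lxml. Please help

I ran into the error "Failed building wheel for lxml": src/lxml/etree.c:96:10: fatal error: 'Python.h' file not found #include "Python.h" ^~~~~~~~~~ 1 error generated. error: could not build...

Answers: 2 Votes: 0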

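Using lxml with Django/Python: list index out of range

I have a small problem. I am trying to extract some data from XML with lxml, but I keep getting a "list index out of range" error. Right now I am trying to get position [0] of the list, which should...

Answers: 1 Votes: 0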
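
The asker's code is not shown in the excerpt above. In general, indexing [0] into the result of findall() or of lxml's xpath() raises IndexError whenever nothing matched, because both return an empty list in that case. A minimal sketch of a guard follows, with a hypothetical helper first_text and made-up sample data.

```python
try:
    from lxml import etree  # preferred; the calls below are ElementTree-compatible
except ImportError:  # fallback so the sketch still runs without lxml
    import xml.etree.ElementTree as etree

# Hypothetical feed in which the second item has no <title>.
XML = "<feed><item><title>First</title></item><item/></feed>"


def first_text(elem, path, default=None):
    """Return the text of the first match for path, or default if none."""
    matches = elem.findall(path)
    if not matches:  # an empty list is what makes [0] raise IndexError
        return default
    return matches[0].text


root = etree.fromstring(XML)
titles = [
    first_text(item, "title", default="(no title)")
    for item in root.findall("item")
]
print(titles)
```

For simple paths, findtext(path, default) offers a similar shortcut without the explicit check.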

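How to fix: raise ImportError("lxml not found, please install it")

I am currently hosting my Python Flask app on PythonAnywhere. When I run my scraping script, which uses the code df = pd.read_html(current_data.content), I get the error in the title. Running...

Answers: 1 Votes: 0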

© www.soinside.com 2019 - 2024. All rights reserved.