Python lxml parser does not return the full text of a <p> element when it contains <xref>

python xml lxml
1 answer

I suggest using a library such as BeautifulSoup to parse this mixed HTML/XML file:

from bs4 import BeautifulSoup

text = """\
<title>Continuing Education Activity</title>
        <p>Ebstein anomaly is a rare congenital heart disease that involves the apical displacement of the tricuspid valve with adherence of the septal and posterior leaflets to the myocardium and &#x00022;atrialization of the inlet portion of the right ventricle&#x00022;. It is usually accompanied by tricuspid regurgitation, right ventricular failure, and arrhythmias. Clinical manifestations range from asymptomatic to severe, depending on the degree of tricuspid valve displacement and severity of regurgitation, the effective right ventricular volume, and the associated malformations (i.e., pulmonary valve stenosis, atresia, atrial septal defect, etc.). Arrhythmias are common and protracted due to the likelihood of having accessory pathways, in addition to having right atrial dilatation. Symptomatic patients can present with cyanosis, congestive heart failure, and arrhythmias, with exertional dyspnea being common in older patients. This activity reviews the pathophysiology and presentation of Ebstein's malformation and highlights the role of the interprofessional team in its management.</p>
        <p>
<bold>Objectives:</bold>
<list list-type="bullet"><list-item><p>Describe the pathophysiology of Ebstein anomaly.</p></list-item><list-item><p>Review the clinical presentation of a patient with an Ebstein anomaly.</p></list-item><list-item><p>Outline the approach to the management of patients with Ebstein anomaly, including the indications for non-surgical and surgical intervention.</p></list-item><list-item><p>Summarize the importance of improving care coordination among interprofessional team members to improve outcomes for patients affected by Ebstein anomaly.</p></list-item></list>
<ext-link xmlns:xlink="http://www.w3.org/1999/xlink" ext-link-type="uri" xlink:href="https://www.statpearls.com/account/trialuserreg/?articleid=20850&#x00026;utm_source=pubmed&#x00026;utm_campaign=reviews&#x00026;utm_content=20850">Access free multiple choice questions on this topic.</ext-link>
</p>
      </sec>
      <sec id="article-20850.s2" sec-type="pubmed-excerpt">
        <title>Introduction</title>
        <p>Ebstein anomaly is a rare congenital abnormality involving the tricuspid valve and the right ventricle (RV),<xref ref-type="bibr" rid="article-20850.r1">[1]</xref>&#x000a0;with an incidence of &#x0003c;1% of congenital heart defects.<xref ref-type="bibr" rid="article-20850.r2">[2]</xref>&#x000a0;It was first described by the pathologist Wilhelm Ebstein in 1866 when he performed an autopsy of a 19-year-old cyanotic&#x000a0;male who had suffered from exertional dyspnea and palpitations and died of a sudden cardiac arrest.<xref ref-type="bibr" rid="article-20850.r3">[3]</xref>&#x000a0;Ebstein anomaly&#x000a0;is defined by the following characteristics:</p>
"""

soup = BeautifulSoup(text, "html.parser")

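# Remove every <xref> citation marker so "[1]", "[2]", ... do not appear in the text
for xref in soup.select("xref"):
    xref.extract()

# Print each paragraph's text, joining its text fragments with single spaces
for p in soup.select("p"):
    print(p.get_text(strip=True, separator=" "))
    print("-" * 80)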

Prints:

Ebstein anomaly is a rare congenital heart disease that involves the apical displacement of the tricuspid valve with adherence of the septal and posterior leaflets to the myocardium and "atrialization of the inlet portion of the right ventricle". It is usually accompanied by tricuspid regurgitation, right ventricular failure, and arrhythmias. Clinical manifestations range from asymptomatic to severe, depending on the degree of tricuspid valve displacement and severity of regurgitation, the effective right ventricular volume, and the associated malformations (i.e., pulmonary valve stenosis, atresia, atrial septal defect, etc.). Arrhythmias are common and protracted due to the likelihood of having accessory pathways, in addition to having right atrial dilatation. Symptomatic patients can present with cyanosis, congestive heart failure, and arrhythmias, with exertional dyspnea being common in older patients. This activity reviews the pathophysiology and presentation of Ebstein's malformation and highlights the role of the interprofessional team in its management.
--------------------------------------------------------------------------------
Objectives: Describe the pathophysiology of Ebstein anomaly. Review the clinical presentation of a patient with an Ebstein anomaly. Outline the approach to the management of patients with Ebstein anomaly, including the indications for non-surgical and surgical intervention. Summarize the importance of improving care coordination among interprofessional team members to improve outcomes for patients affected by Ebstein anomaly. Access free multiple choice questions on this topic.
--------------------------------------------------------------------------------
Describe the pathophysiology of Ebstein anomaly.
--------------------------------------------------------------------------------
Review the clinical presentation of a patient with an Ebstein anomaly.
--------------------------------------------------------------------------------
Outline the approach to the management of patients with Ebstein anomaly, including the indications for non-surgical and surgical intervention.
--------------------------------------------------------------------------------
Summarize the importance of improving care coordination among interprofessional team members to improve outcomes for patients affected by Ebstein anomaly.
--------------------------------------------------------------------------------
Ebstein anomaly is a rare congenital abnormality involving the tricuspid valve and the right ventricle (RV), with an incidence of <1% of congenital heart defects. It was first described by the pathologist Wilhelm Ebstein in 1866 when he performed an autopsy of a 19-year-old cyanotic male who had suffered from exertional dyspnea and palpitations and died of a sudden cardiac arrest. Ebstein anomaly is defined by the following characteristics:
--------------------------------------------------------------------------------
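
If you would rather stay with lxml, here is a minimal sketch of why the text appears cut off. The question's code is not shown here, so this assumes it read `p.text`. In the ElementTree model that lxml follows, `.text` holds only the text before the first child element, and the text after a child such as `<xref>` is stored in that child's `.tail`. The sample string is a shortened, hypothetical version of the paragraph above, and `text_without` is a helper name made up for this example. The fallback import is only there so the snippet also runs without lxml installed.

```python
try:
    from lxml import etree  # the parser named in the question
except ImportError:  # stdlib fallback, same .text/.tail model
    import xml.etree.ElementTree as etree

# Shortened, hypothetical sample modeled on the <p> from the question.
xml = (
    '<sec><p>Ebstein anomaly involves the tricuspid valve and the right ventricle (RV),'
    '<xref ref-type="bibr" rid="r1">[1]</xref>&#x000a0;with an incidence of &lt;1%.'
    '<xref ref-type="bibr" rid="r2">[2]</xref> It was first described in 1866.</p></sec>'
)


def text_without(elem, skip_tag):
    """Return elem's text, skipping skip_tag elements but keeping their tails."""
    parts = [elem.text or ""]
    for child in elem:
        if child.tag != skip_tag:
            parts.append(text_without(child, skip_tag))
        # The tail belongs to the parent's text flow, so it is always kept.
        parts.append(child.tail or "")
    return "".join(parts)


root = etree.fromstring(xml)
p = root.find("p")

text_only = p.text                 # stops at the first <xref>
with_refs = "".join(p.itertext())  # whole paragraph, "[1]" labels included
# Drop the <xref> labels, then collapse whitespace (including non-breaking spaces).
full_text = " ".join(text_without(p, "xref").split())

print(text_only)
print(with_refs)
print(full_text)
```

Here `text_only` ends at "(RV)," while `full_text` runs to the end of the paragraph without the bracketed reference labels. lxml also provides `etree.strip_elements(root, "xref", with_tail=False)`, which removes the elements in place while keeping their tail text, after which `itertext()` gives the same result.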