尝试使用Python 3解析XML文件

问题描述 投票:1回答:1

希望大家都能帮助我。我对python比较陌生。我有需要在Powershell中工作的东西,但是通过Powershell对象访问XML元素比看起来似乎要容易得多。在powershell中,我可以简单地执行

[xml]$test = Get-Content .\test.xml

然后遍历对象以查找所需的信息。完全公开,尽管XML似乎很容易,但我还是措手不及。这是XML文件的小版本]

<?xml version="1.0" encoding="UTF-8"?>
<!--DISA STIG Viewer :: 2.9-->
<CHECKLIST>
    <ASSET>
        <ROLE>None</ROLE>
        <ASSET_TYPE>Computing</ASSET_TYPE>
        <HOST_NAME></HOST_NAME>
        <HOST_IP></HOST_IP>
        <HOST_MAC></HOST_MAC>
        <HOST_FQDN></HOST_FQDN>
        <TECH_AREA></TECH_AREA>
        <TARGET_KEY>2266</TARGET_KEY>
        <WEB_OR_DATABASE>false</WEB_OR_DATABASE>
        <WEB_DB_SITE></WEB_DB_SITE>
        <WEB_DB_INSTANCE></WEB_DB_INSTANCE>
    </ASSET>
    <STIGS>
        <iSTIG>
            <STIG_INFO>
                <SI_DATA>
                    <SID_NAME>version</SID_NAME>
                    <SID_DATA>5</SID_DATA>
                </SI_DATA>
                <SI_DATA>
                    <SID_NAME>classification</SID_NAME>
                    <SID_DATA>UNCLASSIFIED</SID_DATA>
                </SI_DATA>
                <SI_DATA>
                    <SID_NAME>customname</SID_NAME>
                </SI_DATA>
                <SI_DATA>
                    <SID_NAME>stigid</SID_NAME>
                    <SID_DATA>McAfee_VirusScan88_Managed_Client</SID_DATA>
                </SI_DATA>
                <SI_DATA>
                    <SID_NAME>description</SID_NAME>
                    <SID_DATA>The McAfee VirusScan Managed Client STIG is published as a tool to improve the security of Department of Defense (DoD) information systems. The requirements are derived from the NIST 800-53 and related documents. Comments or proposed revisions to this document should be sent via e-mail to the following address: [email protected].</SID_DATA>
                </SI_DATA>
                <SI_DATA>
                    <SID_NAME>filename</SID_NAME>
                    <SID_DATA>U_McAfee_VirusScan88_Managed_Client_STIG_V5R21_Manual-xccdf.xml</SID_DATA>
                </SI_DATA>
                <SI_DATA>
                    <SID_NAME>releaseinfo</SID_NAME>
                    <SID_DATA>Release: 21 Benchmark Date: 25 Oct 2019</SID_DATA>
                </SI_DATA>
                <SI_DATA>
                    <SID_NAME>title</SID_NAME>
                    <SID_DATA>McAfee VirusScan 8.8 Managed Client STIG</SID_DATA>
                </SI_DATA>
                <SI_DATA>
                    <SID_NAME>uuid</SID_NAME>
                    <SID_DATA>1a441b95-b269-4423-8a40-a34f56441f5a</SID_DATA>
                </SI_DATA>
                <SI_DATA>
                    <SID_NAME>notice</SID_NAME>
                    <SID_DATA>terms-of-use</SID_DATA>
                </SI_DATA>
                <SI_DATA>
                    <SID_NAME>source</SID_NAME>
                </SI_DATA>
            </STIG_INFO>
            <VULN>
                <STIG_DATA>
                    <VULN_ATTRIBUTE>Vuln_Num</VULN_ATTRIBUTE>
                    <ATTRIBUTE_DATA>V-6453</ATTRIBUTE_DATA>
                </STIG_DATA>
                <STIG_DATA>
                    <VULN_ATTRIBUTE>Severity</VULN_ATTRIBUTE>
                    <ATTRIBUTE_DATA>high</ATTRIBUTE_DATA>
                </STIG_DATA>
                <STIG_DATA>
                    <VULN_ATTRIBUTE>Group_Title</VULN_ATTRIBUTE>
                    <ATTRIBUTE_DATA>DTAM001-McAfee VirusScan Control Panel </ATTRIBUTE_DATA>
                </STIG_DATA>
                <STIG_DATA>
                    <VULN_ATTRIBUTE>Rule_ID</VULN_ATTRIBUTE>
                    <ATTRIBUTE_DATA>SV-55134r1_rule</ATTRIBUTE_DATA>
                </STIG_DATA>
                <STIG_DATA>
                    <VULN_ATTRIBUTE>Rule_Ver</VULN_ATTRIBUTE>
                    <ATTRIBUTE_DATA>DTAM001</ATTRIBUTE_DATA>
                </STIG_DATA>
                <STIG_DATA>
                    <VULN_ATTRIBUTE>Rule_Title</VULN_ATTRIBUTE>
                    <ATTRIBUTE_DATA>McAfee VirusScan On-Access General Policies must be configured to enable on-access scanning at system startup.
</ATTRIBUTE_DATA>
                </STIG_DATA>
                <STIG_DATA>
                    <VULN_ATTRIBUTE>Vuln_Discuss</VULN_ATTRIBUTE>
                    <ATTRIBUTE_DATA>For antivirus software to be effective, it must be running at all times, beginning from the point of the system's initial startup. Otherwise, the risk is greater for viruses, trojans, and other malware infecting the system during that startup phase.
</ATTRIBUTE_DATA>
                </STIG_DATA>
                <STIG_DATA>
                    <VULN_ATTRIBUTE>IA_Controls</VULN_ATTRIBUTE>
                    <ATTRIBUTE_DATA></ATTRIBUTE_DATA>
                </STIG_DATA>
                <STIG_DATA>
                    <VULN_ATTRIBUTE>Check_Content</VULN_ATTRIBUTE>
                    <ATTRIBUTE_DATA>From the ePO server console System Tree, select the Systems tab, select the asset to be checked, select Actions, select Agent, and select Modify Policies on a Single System. From the product pull down list, select VirusScan Enterprise 8.8.0. Select from the Policy column the policy associated with the On-Access General Policies. Under the General tab, locate the "Enable on-access scanning:" label. Ensure the "Enable on-access scanning at system startup" option is selected.

Criteria:  If the "Enable on-access scanning at startup" option is selected, this is not a finding. 

On the client machine, use the Windows Registry Editor to navigate to the following key:
HKLM\Software\McAfee\ (32-bit)
HKLM\Software\Wow6432Node\McAfee\ (64-bit)
SystemCore\VSCore\On Access Scanner\McShield\Configuration

Criteria:  If the value of bStartDisabled is 0, this is not a finding. If the value is 1, this is a finding.</ATTRIBUTE_DATA>
                </STIG_DATA>
                <STIG_DATA>
                    <VULN_ATTRIBUTE>Fix_Text</VULN_ATTRIBUTE>
                    <ATTRIBUTE_DATA>From the ePO server console System Tree, select the Systems tab, select the asset to be checked, select Actions, select Agent, and select Modify Policies on a Single System. From the product pull down list, select VirusScan Enterprise 8.8.0. Select from the Policy column the policy associated with the On-Access General Policies. Under the General tab, locate the "Enable on-access scanning:" label. Select the "Enable on-access scanning at system startup" option. Select Save.</ATTRIBUTE_DATA>
                </STIG_DATA>
                <STIG_DATA>
                    <VULN_ATTRIBUTE>False_Positives</VULN_ATTRIBUTE>
                    <ATTRIBUTE_DATA></ATTRIBUTE_DATA>
                </STIG_DATA>
                <STIG_DATA>
                    <VULN_ATTRIBUTE>False_Negatives</VULN_ATTRIBUTE>
                    <ATTRIBUTE_DATA></ATTRIBUTE_DATA>
                </STIG_DATA>
                <STIG_DATA>
                    <VULN_ATTRIBUTE>Documentable</VULN_ATTRIBUTE>
                    <ATTRIBUTE_DATA>false</ATTRIBUTE_DATA>
                </STIG_DATA>
                <STIG_DATA>
                    <VULN_ATTRIBUTE>Mitigations</VULN_ATTRIBUTE>
                    <ATTRIBUTE_DATA></ATTRIBUTE_DATA>
                </STIG_DATA>
                <STIG_DATA>
                    <VULN_ATTRIBUTE>Potential_Impact</VULN_ATTRIBUTE>
                    <ATTRIBUTE_DATA></ATTRIBUTE_DATA>
                </STIG_DATA>
                <STIG_DATA>
                    <VULN_ATTRIBUTE>Third_Party_Tools</VULN_ATTRIBUTE>
                    <ATTRIBUTE_DATA></ATTRIBUTE_DATA>
                </STIG_DATA>
                <STIG_DATA>
                    <VULN_ATTRIBUTE>Mitigation_Control</VULN_ATTRIBUTE>
                    <ATTRIBUTE_DATA></ATTRIBUTE_DATA>
                </STIG_DATA>
                <STIG_DATA>
                    <VULN_ATTRIBUTE>Responsibility</VULN_ATTRIBUTE>
                    <ATTRIBUTE_DATA>System Administrator</ATTRIBUTE_DATA>
                </STIG_DATA>
                <STIG_DATA>
                    <VULN_ATTRIBUTE>Security_Override_Guidance</VULN_ATTRIBUTE>
                    <ATTRIBUTE_DATA></ATTRIBUTE_DATA>
                </STIG_DATA>
                <STIG_DATA>
                    <VULN_ATTRIBUTE>Check_Content_Ref</VULN_ATTRIBUTE>
                    <ATTRIBUTE_DATA>M</ATTRIBUTE_DATA>
                </STIG_DATA>
                <STIG_DATA>
                    <VULN_ATTRIBUTE>Weight</VULN_ATTRIBUTE>
                    <ATTRIBUTE_DATA>10.0</ATTRIBUTE_DATA>
                </STIG_DATA>
                <STIG_DATA>
                    <VULN_ATTRIBUTE>Class</VULN_ATTRIBUTE>
                    <ATTRIBUTE_DATA>Unclass</ATTRIBUTE_DATA>
                </STIG_DATA>
                <STIG_DATA>
                    <VULN_ATTRIBUTE>STIGRef</VULN_ATTRIBUTE>
                    <ATTRIBUTE_DATA>McAfee VirusScan 8.8 Managed Client STIG :: Version 5, Release: 21 Benchmark Date: 25 Oct 2019</ATTRIBUTE_DATA>
                </STIG_DATA>
                <STIG_DATA>
                    <VULN_ATTRIBUTE>TargetKey</VULN_ATTRIBUTE>
                    <ATTRIBUTE_DATA>2266</ATTRIBUTE_DATA>
                </STIG_DATA>
                <STIG_DATA>
                    <VULN_ATTRIBUTE>CCI_REF</VULN_ATTRIBUTE>
                    <ATTRIBUTE_DATA>CCI-001242</ATTRIBUTE_DATA>
                </STIG_DATA>
                <STATUS>Not_Reviewed</STATUS>
                <FINDING_DETAILS></FINDING_DETAILS>
                <COMMENTS></COMMENTS>
                <SEVERITY_OVERRIDE></SEVERITY_OVERRIDE>
                <SEVERITY_JUSTIFICATION></SEVERITY_JUSTIFICATION>
            </VULN>
        </iSTIG>
    </STIGS>
</CHECKLIST>

我知道有几种不同的方法可以做到这一点,但是我首先尝试了低谷槽

import xml.dom.minidom
doc = xml.dom.minidom.parse(r'C:\Temp\test.xml')
print (doc.nodeName)
root = doc.firstChild.tagName
root

这将打印出CHECKLIST,这实际上是文档的根目录。现在在powershell中,我将执行root.STIG.iSTIG.STIG_INFO.SI_DATA并从那里开始循环,但无法绕开为什么这有很大不同的原因。

我也尝试过从ElementTree开始,但没有走远

from xml.etree import ElementTree as ET
doc = ET.parse(r'C:\Temp\test.xml').getroot()

任何人都可以向我指出正确的方向,而不必给出书面代码作为答案?我已经使用lxml转换了XML,并能够输出以下文件,虽然很好,但在下一步时遇到了麻烦。

谢谢!

python python-3.x xml lxml minidom
1个回答
0
投票

由于您正在寻找一个总体方向,请尝试这样的操作并根据需要进行修改:

from lxml import etree

stig = """your xml above"""
parser = etree.XMLParser()

tree = etree.fromstring(stig, parser)
items = tree.xpath('//iSTIG/STIG_INFO//SI_DATA')
for item in items:
    print(item.xpath('string(SID_NAME/text())')," ",item.xpath('string(SID_DATA/text())'))

输出:

version   5
classification   UNCLASSIFIED

显然,不用打印,您可以将每个项目添加到列表中,依此类推。

© www.soinside.com 2019 - 2024. All rights reserved.