Python - XML parsing does not work with nested for loops

Question (votes: 0, answers: 1)

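I am currently building a system that asks for a few words and, if a synonym for a word is found in an XML file, replaces that word with the synonym.

Here is the code: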

def wordproc(self, word):

    lmtzr = nltk.WordNetLemmatizer()
    tokens = nltk.word_tokenize(word)
    tokens_lemma = [lmtzr.lemmatize(tokens) for tokens in tokens]
    tagged = nltk.pos_tag(tokens)
    chunking = nltk.chunk.ne_chunk(tagged)


    important_words = []
    unimportant_tags = ['MD', 'TO', 'DT', 'JJR', 'CC', 'VBZ']

    for x in chunking:

        if x[1] not in unimportant_tags:
            important_words.append(x[0])

    print(important_words)
    self.words = (important_words)
    print(self.words)
    self.loop = len(self.words)
    self.xmlparse(self.words, self.loop)

def xmlparse(self, words, loops):

    root = ElementTree.parse('data/word-test.xml').getroot()
    for i in range(loops):
        syn_loc = [word for word in root.findall('word') if word.findtext('mainword') == words]
        for nym in syn_loc:
            print(nym.attrib)
            word_loop = self.loop
            new_word = (nym.findtext('synonym'))
            words = new_word
    print(words)
    vf = videoPlay()
    vf.moviepy(words)

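When the words from wordproc are passed to the xmlparse function, it does not work. Any guidance? Or am I missing a key point? Any help would be great!

Edit: here is a short excerpt of the XML file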

<synwords>
<word>
    <mainword>affection</mainword>
    <wordtag>N</wordtag>
    <synonym>love</synonym>
</word>
<word>
    <mainword>sweetie</mainword>
    <wordtag>N</wordtag>
    <synonym>love</synonym>
</word>
<word>
    <mainword>appreciation</mainword>
    <wordtag>N</wordtag>
    <synonym>love</synonym>
</word>
<word>
    <mainword>beloved</mainword>
    <wordtag>N</wordtag>
    <synonym>love</synonym>
</word>
<word>
    <mainword>emotion</mainword>
    <wordtag>N</wordtag>
    <synonym>love</synonym>
</word>
</synwords>

What I expect, given the input:

words = ["beloved", "sweetie","affection"]

the result, after comparing against the XML, would be:

words = ["love", "love", "love"]
python xml parsing
1 answer
1 vote

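Instead of searching and parsing the XML every time you look up a word, I suggest mapping your words to their synonyms in a Python dictionary; after that you can easily look up or manipulate whatever you need. I used BeautifulSoup to parse the XML below: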

xml = """<synwords>
<word>
    <mainword>affection</mainword>
    <wordtag>N</wordtag>
    <synonym>love</synonym>
</word>

.
.
.

</synwords>"""

from bs4 import BeautifulSoup

soup = BeautifulSoup(xml, "html.parser")  # xml is your xml content
words = soup.find_all('word')
mapped_dict = {word.find("mainword").text: word.find("synonym").text for word in words}
print(mapped_dict)

Output:

{'sweetie': 'love', 'beloved': 'love', 'appreciation': 'love', 'affection': 'love', 'emotion': 'love'}
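
As a rough sketch of how the dictionary approach could replace the nested loops, using only the standard library: the function names `build_synonym_map` and `replace_with_synonyms` are hypothetical, the XML is a shortened copy of the question's sample with a closing `</synwords>` added, and `xml.etree.ElementTree` is swapped in for BeautifulSoup. Note that in the question's `xmlparse`, `word.findtext('mainword') == words` compares a single string against the whole list, so it can never match; looking up each word individually avoids that.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the question's sample data (assumption: the real file
# is closed with </synwords> so that it parses).
XML = """<synwords>
<word>
    <mainword>affection</mainword>
    <wordtag>N</wordtag>
    <synonym>love</synonym>
</word>
<word>
    <mainword>sweetie</mainword>
    <wordtag>N</wordtag>
    <synonym>love</synonym>
</word>
<word>
    <mainword>beloved</mainword>
    <wordtag>N</wordtag>
    <synonym>love</synonym>
</word>
</synwords>"""


def build_synonym_map(xml_text):
    """Parse the XML once and map each <mainword> to its <synonym>."""
    root = ET.fromstring(xml_text)
    return {
        entry.findtext("mainword"): entry.findtext("synonym")
        for entry in root.findall("word")
    }


def replace_with_synonyms(words, synonym_map):
    """Replace each word by its synonym; keep the word if none is known."""
    return [synonym_map.get(word, word) for word in words]


synonym_map = build_synonym_map(XML)
result = replace_with_synonyms(["beloved", "sweetie", "affection", "hello"], synonym_map)
print(result)  # ['love', 'love', 'love', 'hello']
```

Using `synonym_map.get(word, word)` leaves words without a known synonym unchanged; that fallback is a design choice in this sketch, not something the question specifies.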