How to fix "list indices must be integers or slices, not str"

Question    Votes: -1    Answers: 1

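After my two previous questions (question 1, question 2), I still cannot solve this problem.

I have a Python script that cleans text before the text-analysis step.

I have some functions that clean the text, tokenize it, and apply POS tags. I need to return each word, its tag, and how often it occurs.

The problem is that the function works with a list of tuples, and the script then crashes with the following error: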

File "F:\AIenv\textAnalysis\setup.py", line 221, in tag_and_save
    file.write("{0} /{1} {2} \n".format(word,tag,freq_tagged_data[word]))
TypeError: list indices must be integers or slices, not str
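The error can be reproduced without any tagger. The sketch below uses made-up sample data and a hypothetical `lookup` helper (neither is from the original script) to show that a list of tuples rejects a string index, while a dict built from the same pairs accepts its own keys.

```python
# Minimal reproduction; the sample data below is made up for illustration.
freq_list = [(("cat", "NN"), 2), (("runs", "VBZ"), 1)]


def lookup(container, key):
    """Return container[key], or a short description of the error raised."""
    try:
        return container[key]
    except (TypeError, KeyError) as exc:
        return f"{type(exc).__name__}: {exc}"


# A list can only be indexed by integers or slices, so a str index fails.
list_result = lookup(freq_list, "cat")

# A dict built from the same pairs can be indexed by its keys.
dict_result = lookup(dict(freq_list), ("cat", "NN"))

print(list_result)
print(dict_result)
```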

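from nltk import FreqDist  # assumed import; the original post does not show it

def get_freq(tagged):
    freq_dist = {}
    # Count how often each item of the tagged list occurs.
    freqs = FreqDist(tagged)
    # Note: this rebinds freq_dist to a list of tuples, not a dict.
    freq_dist = [(word, freq) for word, freq in freqs.items()]
    # print(freq_dist)
    return freq_dist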

import operator

def tag_and_save(tagger, text, path):
    clt = clean_text(text)
    tagged_data = tagger.tag(clt)
    print("tagged_data\n\n\n", tagged_data)  # here it is a list of tuples: [('', '')]

    tagged_data = sorted(tagged_data, key=operator.itemgetter(1))
    freq_tagged_data = get_freq(tagged_data)
    file = open(path, "w", encoding="UTF8")
    for word, tag in tagged_data:
        file.write("{0} /{1} {2} \n".format(word, tag, freq_tagged_data[word]))  # the error is raised here
    file.close()

Expected output: ("***** /POS tag") followed by the number of occurrences.

python stanford-nlp pos-tagger word-frequency
1 Answer
Votes: 0

Change

freq_dist = [(word, freq) for word, freq in freqs.items()]

to

for word, freq in freqs.items():
    freq_dist[word] = freq

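This may solve the problem, because that line turns the dictionary into a list, and a list can only be indexed by integers or slices.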
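To see what the dictionary version actually holds, here is a small sketch. It assumes `collections.Counter` as a stand-in for `FreqDist` (in NLTK 3, `FreqDist` subclasses `Counter`), and the tagged sample is invented. Because the counts are taken over (word, tag) tuples, the keys are those tuples and not bare words.

```python
from collections import Counter

# Hypothetical tagger output: a list of (word, tag) tuples.
tagged = [("the", "DT"), ("cat", "NN"), ("sees", "VBZ"), ("the", "DT"), ("cat", "NN")]

# Counter stands in for nltk.FreqDist here (an assumption for this sketch).
freqs = Counter(tagged)

# Copy the counts into a plain dict instead of a list of tuples.
freq_dist = dict(freqs)

# The keys are (word, tag) tuples, so a bare word is not a key.
the_count = freq_dist[("the", "DT")]
has_plain_word_key = "the" in freq_dist

print(the_count, has_plain_word_key)
```

So once `freq_dist` is a dict, a lookup by `word` alone would probably raise a `KeyError`; a lookup by `(word, tag)` should work.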

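In tag_and_save, try the following. Because the frequencies are counted over (word, tag) tuples, the dictionary keys are those tuples, so look the count up by (word, tag) and not by word alone:

for word, tag in tagged_data:
    if word and tag:  # skip empty words or tags
        file.write("{0} /{1} {2} \n".format(word, tag, freq_tagged_data[(word, tag)]))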
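Putting the pieces together, the following is a rough sketch and not the asker's actual script. The function name `write_tag_frequencies` is hypothetical, `Counter` again stands in for `FreqDist`, and an in-memory buffer replaces the output file so the example runs on its own. It writes each distinct (word, tag) pair once, which seems closer to the expected output than repeating a line for every occurrence.

```python
import io
import operator
from collections import Counter


def write_tag_frequencies(tagged_data, out):
    """Write one 'word /tag count' line per distinct (word, tag) pair."""
    freqs = Counter(tagged_data)  # stand-in for nltk.FreqDist (assumption)
    # Sort the distinct pairs by tag, mirroring operator.itemgetter(1) in the question.
    for word, tag in sorted(freqs, key=operator.itemgetter(1)):
        if word and tag:  # skip empty words or tags
            out.write("{0} /{1} {2}\n".format(word, tag, freqs[(word, tag)]))


# Invented sample data; the real input would come from tagger.tag(...).
tagged = [("the", "DT"), ("cat", "NN"), ("sees", "VBZ"), ("the", "DT"), ("", "NN")]

# In the real script this would be open(path, "w", encoding="UTF8").
buffer = io.StringIO()
write_tag_frequencies(tagged, buffer)
output = buffer.getvalue()
print(output)
```

Opening the real file with a `with open(...)` block would also close it if a write fails partway through.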