How can I extract characters from a file based on headers and values in a Python list?


I have a huge file that looks like this:

-HVC1 tank
Contains300gallons
-HVC2 tank
Contains20gallonsofgasand220galonsofkero

I read a second file into a list that looks like this:

s = [['-HVC1', '0', '8'], ['-HVC1', '12', '18'], ['-HVC2', '9', '17']]

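For each header line in the file, such as -HVC1 or -HVC2, I need to check whether it matches a header in the list (-HVC1, -HVC2, and so on). If it does, I want to extract from the line that follows it the characters whose positions fall within the range given by the other two values in that list entry, for example 0,8; 12,18; 9,17.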

The expected result for this sample list is:

-HVC1
Contains
-HVC1
gallons
-HVC2
20gallons

My code:

import csv

sequence =[]
with open('my_huge_file', 'r') as f:
    lines = f.readlines()
    dic = {}
    for line in lines:
        if line.startswith('-'):
            tx = line.split('tank', 1)[0] #include everything before tank in header
        else:
            gh = line[:-1]
            dic[tx] = gh

    s = [['-HVC1', '0', '8'], ['-HVC1', '12', '18'], ['-HVC2', '9', '17']]
    for i in s:
        seq =[]
        for m, n in dic.items():
            for j, k in enumerate(n):
                if int(i[1]) <= j <= int(i[2]) and m == i[0]:
                    seq.append(k)
        sequence.append(seq)
print(sequence)

I get a list of empty lists back:

[[], [], [], []]

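I know I'm doing something wrong, but my logic seems to make sense to me. Any help (ideally with an explanation) would be greatly appreciated. The result of printing sequence should be: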

[[Contains], [gallons], [20gallons]]

I will then format that into the expected result shown above.

1 Answer

@mkreiger1's comment is right: debugging helps a lot in a case like this.
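As a small illustration of the kind of check that helps here, printing a key with repr() makes otherwise invisible whitespace visible. The sketch below reuses the header line from the question; the variable names header_line and key are only illustrative.

```python
# Build the dictionary key the same way the question's code does.
header_line = '-HVC1 tank\n'
key = header_line.split('tank', 1)[0]

# repr() wraps the string in quotes, so a trailing space stands out.
print(repr(key))
# Compare the raw key and the stripped key with the header from the list.
print(key == '-HVC1')
print(key.strip() == '-HVC1')
```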

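The problem is the comparison m == i[0]: on the first iteration, m is '-HVC1 ' (with a trailing space), while i[0] is '-HVC1'. The comparison is therefore always False. The solution is to strip the whitespace: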

lines = ['-HVC1 tank', 'Contains300gallons', '-HVC2 tank',
        'Contains20gallonsofgasand220galonsofkero']

sequence = []
dic = {}
for line in lines:
    if line.startswith('-'):
        # Everything before 'tank' is the header, e.g. '-HVC1 '
        tx = line.split('tank', 1)[0]
    else:
        # Remove only a trailing newline; line[:-1] would cut off a real
        # character here because these sample strings have no newline
        gh = line.rstrip('\n')
        # THE FIX IS HERE: Strip the white spaces in ``tx``
        dic[tx.strip()] = gh

s = [['-HVC1', '0', '8'], ['-HVC1', '12', '18'], ['-HVC2', '9', '17']]
for i in s:
    seq = []
    for m, n in dic.items():
        for j, k in enumerate(n):
            # Keep character k when its index j is inside the inclusive range
            if (int(i[1]) <= j <= int(i[2])) and (m == i[0]):
                seq.append(k)
    sequence.append(seq)

print(sequence)

Output:

[['C', 'o', 'n', 't', 'a', 'i', 'n', 's', '3'], ['a', 'l', 'l', 'o', 'n', 's'], ['0', 'g', 'a', 'l', 'l', 'o', 'n', 's', 'o']]
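The output above is a list of single characters, and the 0-based inclusive range does not quite line up with the expected result in the question. A possible alternative, sketched below, uses string slicing to return whole substrings. It assumes the positions in s are 1-based and inclusive, with a start of 0 treated as the beginning of the line. The question does not state this convention, but it is one reading that reproduces the expected result for the sample data. The function names parse_tanks and extract_ranges are hypothetical.

```python
def parse_tanks(lines):
    """Map each header (text before 'tank', stripped) to the line after it."""
    contents = {}
    header = None
    for line in lines:
        line = line.rstrip('\n')
        if line.startswith('-'):
            header = line.split('tank', 1)[0].strip()
        elif header is not None and line:
            # Skip blank lines so they do not overwrite real content.
            contents[header] = line
    return contents


def extract_ranges(contents, ranges):
    """Slice each requested range out of the matching line.

    Assumption: positions are 1-based and inclusive; a start of 0 is
    clamped to the beginning of the line.
    """
    result = []
    for header, start, end in ranges:
        text = contents.get(header, '')
        begin = max(int(start) - 1, 0)
        result.append(text[begin:int(end)])
    return result


lines = ['-HVC1 tank\n', 'Contains300gallons\n', '-HVC2 tank\n',
         'Contains20gallonsofgasand220galonsofkero\n']
s = [['-HVC1', '0', '8'], ['-HVC1', '12', '18'], ['-HVC2', '9', '17']]

contents = parse_tanks(lines)
for (header, _, _), piece in zip(s, extract_ranges(contents, s)):
    print(header)
    print(piece)
```

Because parse_tanks only iterates over its argument, an open file object can be passed in place of the list, which avoids loading a huge file into memory with readlines().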