How do I assign a variable to each line in a repeating text pattern in Python?

Question · Votes: 0 · Answers: 3

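I have a Python scraping script that fetches information about upcoming concerts. No matter how many concerts appear, the text always follows the same pattern, so each line always refers to the same kind of information, as in this example (note that there are no blank lines between concerts; my data is in exactly this format):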

01/01/99 9PM
Iron Maiden
Madison Square Garden 
New York City
01/01/99 9.30PM
The Doors
Staples Center
Los Angeles
01/02/99 8.45PM
Dr Dre & Snoop Dogg
Staples Center
Los Angeles
01/02/99 9PM
Diana Ross
City Hall
New York City etc...

I need to assign each line to a variable, so there are 4 variables in total:

time = all the 1st lines
name = all the 2nd lines
location = all the 3rd lines
city = all the 4th lines

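Then I want to loop over all the lines to capture the information that belongs to each variable, e.g. get all the dates from the 1st lines, all the names from the 2nd lines, and so on.

So far I haven't found any solution, and I know almost nothing about regular expressions.

I hope the idea is clear. If you have any questions, don't hesitate to ask. Thanks!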

python variables design-patterns text
3 Answers
1
vote
No regex needed:

string = '''01/01/99 9PM
Iron Maiden
Madison Square Garden
New York City
01/01/99 9.30PM
The Doors
Staples Center
Los Angeles
01/02/99 8.45PM
Dr Dre & Snoop Dogg
Staples Center
Los Angeles
01/02/99 9PM
Diana Ross
City Hall
New York City
'''

# splitlines() avoids the empty trailing element that split('\n') would
# leave behind because of the final newline
lines = string.splitlines()

dates = lines[0::4]   # 1st line of every 4-line block
bands = lines[1::4]   # 2nd line of every block
places = lines[2::4]  # 3rd line of every block
cities = lines[3::4]  # 4th line of every block

This gives you lists of dates/bands/places/cities, which will be easier to work with.
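If the scraped text might contain stray whitespace or blank lines, a slightly more defensive variant of the slicing approach could look like the sketch below. The helper name `split_columns`, its `fields_per_record` parameter, and the shortened `raw` sample are assumptions made for illustration; they are not part of the answer above.

```python
def split_columns(text, fields_per_record=4):
    """Return one list per field from text that repeats every N lines."""
    # Strip surrounding whitespace and skip blank lines so the slices stay aligned.
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if len(lines) % fields_per_record != 0:
        raise ValueError(
            f"expected a multiple of {fields_per_record} lines, got {len(lines)}"
        )
    # lines[k::n] picks the k-th line of every n-line record.
    return [lines[k::fields_per_record] for k in range(fields_per_record)]


# Shortened sample (hypothetical input); the blank line is skipped on purpose.
raw = """01/01/99 9PM
Iron Maiden
Madison Square Garden
New York City

01/01/99 9.30PM
The Doors
Staples Center
Los Angeles
"""

dates, bands, places, cities = split_columns(raw)

# Walk the four lists in lock-step, one concert per iteration.
for date, band, place, city in zip(dates, bands, places, cities):
    print(f"{date}: {band} at {place}, {city}")
```

Raising an error when the line count is not a multiple of four is one design choice among several; it makes a misaligned scrape fail loudly instead of silently shifting every field.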

1
vote
I would personally use a namedtuple. Note that I put your data in a file named input.txt.

from collections import namedtuple

Entry = namedtuple("Entry", "time name location city")

with open('input.txt') as f:
    # Strip newlines and skip any blank lines
    lines = [line.strip() for line in f if line.strip()]

# Group every four consecutive lines into one Entry
objects = [Entry(*lines[i:i+4]) for i in range(0, len(lines), 4)]

print(*objects, sep='\n')

for obj in objects:
    print(obj.name)

Output:

Entry(time='01/01/99 9PM', name='Iron Maiden', location='Madison Square Garden', city='New York City')
Entry(time='01/01/99 9.30PM', name='The Doors', location='Staples Center', city='Los Angeles')
Entry(time='01/02/99 8.45PM', name='Dr Dre & Snoop Dogg', location='Staples Center', city='Los Angeles')
Entry(time='01/02/99 9PM', name='Diana Ross', location='City Hall', city='New York City')
Iron Maiden
The Doors
Dr Dre & Snoop Dogg
Diana Ross

0
votes
This calls for slicing:

times = lines[0::4]
names = lines[1::4]
locations = lines[2::4]
cities = lines[3::4]

Now we can zip these lists into tuples:

# list() materializes the lazy zip iterator in Python 3
events = list(zip(times, names, locations, cities))

With your sample data, this gives us one (time, name, location, city) tuple per concert, starting with ('01/01/99 9PM', 'Iron Maiden', 'Madison Square Garden', 'New York City').

You can now process these tuples into whatever data structure best fits your use case.
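As one possible way to process the grouped lines further, the sketch below builds one dictionary per concert and then groups the act names by city. The `FIELDS` tuple, the `events` and `by_city` names, and the inline `lines` list (the question's sample data, already split into lines) are assumptions made for illustration.

```python
from collections import defaultdict

# The question's sample data, assumed to be already split into clean lines.
lines = [
    "01/01/99 9PM", "Iron Maiden", "Madison Square Garden", "New York City",
    "01/01/99 9.30PM", "The Doors", "Staples Center", "Los Angeles",
    "01/02/99 8.45PM", "Dr Dre & Snoop Dogg", "Staples Center", "Los Angeles",
    "01/02/99 9PM", "Diana Ross", "City Hall", "New York City",
]

FIELDS = ("time", "name", "location", "city")

# Pair each 4-line chunk with the field names to get one dict per concert.
events = [dict(zip(FIELDS, lines[i:i + 4])) for i in range(0, len(lines), 4)]

# Group act names by city, preserving the order in which they appear.
by_city = defaultdict(list)
for event in events:
    by_city[event["city"]].append(event["name"])

for city, acts in by_city.items():
    print(city, "->", ", ".join(acts))
```

Dictionaries are convenient if the records are later serialized (for example to JSON); the namedtuple approach is a lighter alternative when attribute access is enough.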