I ran into this odd CSV format, where quoted fields contain unescaped , characters:
641,"Harstad/Narvik Airport, Evenes","Harstad/Narvik","Norway","EVE","ENEV",68.491302490234,16.678100585938,84,1,"E","Europe/Oslo","airport","OurAirports"
I need to return a list like this:
[641,'Harstad/Narvik Airport Evenes', 'Harstad/Narvik', 'Norway', 'EVE', 'ENEV', 68.491302490234,16.678100585938,84,1, 'E', 'Europe/Oslo', 'airport', 'OurAirports']
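I have two regexes that each match part of the string.
(\d+\.?\d*)
matches numbers, and
(["'])(?:(?=(\\?))\2.)*?\1
matches any characters between two single or double quotes. Is there a way to combine the matches into a single result?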
You can use this regex, which combines both cases in a single alternation:
>>> import re
>>> s = '641,"Harstad/Narvik Airport, Evenes","Harstad/Narvik","Norway","EVE","ENEV",68.491302490234,16.678100585938,84,1,"E","Europe/Oslo","airport","OurAirports"'
>>> csvData = re.findall(r'"[^"\\]*(?:\\.[^"\\]*)*"|\d+(?:\.\d+)?', s)
>>> print(csvData)
['641', '"Harstad/Narvik Airport, Evenes"', '"Harstad/Narvik"', '"Norway"', '"EVE"', '"ENEV"', '68.491302490234', '16.678100585938', '84', '1', '"E"', '"Europe/Oslo"', '"airport"', '"OurAirports"']
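Every match above is still a string, and the quoted fields keep their surrounding double quotes. The following is a minimal sketch of one way to get from these matches to the list in the question, assuming that embedded commas should simply be dropped from quoted fields (as in the expected output) and that unquoted fields are always non-negative numbers. The helper names `convert` and `parse_line` are made up for illustration.

```python
import re

# Same pattern as above: a double-quoted string, or an integer/decimal.
PATTERN = re.compile(r'"[^"\\]*(?:\\.[^"\\]*)*"|\d+(?:\.\d+)?')

def convert(token):
    """Turn one regex match into a str, int, or float (hypothetical helper)."""
    if token.startswith('"'):
        # Quoted field: strip the outer quotes and drop embedded commas,
        # matching the expected list in the question.
        # Backslash escapes inside the field are left untouched.
        return token[1:-1].replace(',', '')
    # Unquoted field: the pattern only admits digits with an optional dot.
    return float(token) if '.' in token else int(token)

def parse_line(line):
    """Parse one line into a list of typed values (hypothetical helper)."""
    return [convert(t) for t in PATTERN.findall(line)]

s = ('641,"Harstad/Narvik Airport, Evenes","Harstad/Narvik","Norway","EVE",'
     '"ENEV",68.491302490234,16.678100585938,84,1,"E","Europe/Oslo",'
     '"airport","OurAirports"')
row = parse_line(s)
print(row)
```

Because the pattern has no branch for a leading minus sign or for unquoted text, a negative number would lose its sign and an unquoted word would be skipped silently, so this only suits data shaped like the sample line.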
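RegEx details:
"[^"\\]*(?:\\.[^"\\]*)*"
: matches a quoted string, allowing escaped quotes or any other escaped character, so "ENEV" and "foo\"bar" are each matched as a single element.
|
: or
\d+(?:\.\d+)?
: matches an integer or a decimal number.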
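As an alternative to the regex, and not part of the approach above, the sample line is ordinary CSV with double-quoted fields, so Python's standard `csv` module can split it directly. This is a rough sketch, assuming any field that parses as a number should become one; `to_number` and `parse_line` are hypothetical helper names. Unlike the expected list in the question, it keeps the comma inside "Harstad/Narvik Airport, Evenes".

```python
import csv

def to_number(field):
    """Return field as int or float when it parses as one, else unchanged."""
    for cast in (int, float):
        try:
            return cast(field)
        except ValueError:
            pass
    return field

def parse_line(line):
    """Split one CSV line with csv.reader and convert numeric fields."""
    # csv.reader accepts any iterable of lines, so wrap the line in a list.
    fields = next(csv.reader([line]))
    return [to_number(f) for f in fields]

s = ('641,"Harstad/Narvik Airport, Evenes","Harstad/Narvik","Norway","EVE",'
     '"ENEV",68.491302490234,16.678100585938,84,1,"E","Europe/Oslo",'
     '"airport","OurAirports"')
row = parse_line(s)
print(row)
```

One trade-off: once `csv.reader` has removed the quotes, a quoted field such as "123" can no longer be told apart from an unquoted 123, so it would also be converted to a number.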