How do I correctly extract the following text?

Question

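I'm trying to extract the title, additional information and address from the text below. However, when the additional information line is absent, I can't work out what identifies the title. I'm currently using regular expressions for this. Can anyone offer suggestions? Thanks for any help or advice.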

Plaza Singapura
Addtional information. 11546.63Km Away.
68 Orchard Road #B1-21/22 Plaza Singapura Singapore, 238839
some text
some text
some text
Social Service Centre
Addtional information. 113446.63Km Away.
381 Toa Payoh Lorong 1 #01-13 Joint Social Service Centre Singapore, 319758
some text
some text
Junction 8
9 Bishan Place #01-21 Junction 8 Shopping Centre Singapore, 579837
some text

This is the result I want:

Plaza Singapura
68 Orchard Road #B1-21/22 Plaza Singapura Singapore, 238839

Social Service Centre
381 Toa Payoh Lorong 1 #01-13 Joint Social Service Centre Singapore, 319758

Junction 8
9 Bishan Place #01-21 Junction 8 Shopping Centre Singapore, 579837

Answer 1

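This can be really hard, depending on how much the addresses in your source vary (in real life, of course, they can differ a great deal, even within Asia). I cheated by assuming that your address line always starts with a number and contains a #; you will need to come up with your own way of matching the address, because that is the only chance of picking out the blocks from data shaped like your sample. For example: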

/^(.*)(?:\r?\n(Additional information.*))?\r?\n(\d+[^\n#]*#.*)/gm

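The engine simply matches one line, then optionally tries to match a line starting with Additional information (by the way, you misspelled Additional, in case you are testing against your sample), and then tries to match an address line as described above.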
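To see why the spelling matters, here is a minimal sketch that runs the pattern against the first entry from the question, once with the original "Addtional" and once with the spelling corrected. The names `misspelled`, `corrected`, `pattern` and `firstTitle` are made up for this illustration.

```javascript
// First entry from the question, keeping its "Addtional" misspelling.
const misspelled = [
  "Plaza Singapura",
  "Addtional information. 11546.63Km Away.",
  "68 Orchard Road #B1-21/22 Plaza Singapura Singapore, 238839",
].join("\n");

// The same entry with the spelling corrected.
const corrected = misspelled.replace("Addtional", "Additional");

const pattern = /^(.*)(?:\r?\n(Additional information.*))?\r?\n(\d+[^\n#]*#.*)/gm;

// Hypothetical helper: capture group 1 (the title) of the first match.
const firstTitle = (input) => [...input.matchAll(pattern)][0][1];

// With the misspelling, the optional group cannot match, so the info line
// itself ends up captured as the title.
console.log(firstTitle(misspelled));

// With the corrected spelling, the title is "Plaza Singapura".
console.log(firstTitle(corrected));
```

If the real data carries the misspelling, either fix it in the pattern or loosen the pattern (for example `Addi?tional`).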

Using the code:

// your_text is the input string to search
const regex = /^(.*)(?:\r?\n(Additional information.*))?\r?\n(\d+[^\n#]*#.*)/mg;
const result = your_text.matchAll(regex);
for (const m of result) console.log(m);

The output (too long to show here) has the form:

['Full match', 'Title/Group 1', 'Info/Group 2 (or undefined)', 'Address/Group 3']

See the demo.
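As a rough sketch of how those match arrays could be turned into the layout asked for in the question, the snippet below runs the same idea over the full sample. It assumes the `Addi?tional` variation so that the misspelled sample lines still match; `sampleText`, `entryRegex` and `extractEntries` are names invented for this example.

```javascript
// Sample input copied from the question, including its "Addtional" misspelling.
const sampleText = [
  "Plaza Singapura",
  "Addtional information. 11546.63Km Away.",
  "68 Orchard Road #B1-21/22 Plaza Singapura Singapore, 238839",
  "some text",
  "some text",
  "some text",
  "Social Service Centre",
  "Addtional information. 113446.63Km Away.",
  "381 Toa Payoh Lorong 1 #01-13 Joint Social Service Centre Singapore, 319758",
  "some text",
  "some text",
  "Junction 8",
  "9 Bishan Place #01-21 Junction 8 Shopping Centre Singapore, 579837",
  "some text",
].join("\n");

// Assumption for this sketch: "Addi?tional" accepts both spellings.
const entryRegex = /^(.*)(?:\r?\n(Addi?tional information.*))?\r?\n(\d+[^\n#]*#.*)/gm;

// Hypothetical helper: map each match to its three capture groups.
// `info` is undefined when the optional line is absent.
function extractEntries(input) {
  return Array.from(input.matchAll(entryRegex), ([, title, info, address]) => ({
    title,
    info,
    address,
  }));
}

const entries = extractEntries(sampleText);

// Print each title followed by its address, with a blank line between entries.
console.log(entries.map((e) => `${e.title}\n${e.address}`).join("\n\n"));
```

This still relies on the address line starting with a number and containing a #, so it is only as reliable as that assumption.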
