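I want to write a function that accepts two parameters, $text and $keys, where $keys is an array of keys.
The output should be an array whose keys are the keys passed to the function (if they are found in the text) and whose values are the text that follows each key, up to the next key or the end of the text. If a key is repeated in the text, only the last value is written to the array.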
For example:
Visualized text: Lorem Ipsum is simply **one** dummy text of the printing and **two** typesetting industry. Lorem Ipsum has been the industry's **one** standard dummy text ever since the **three** 1500s.
$text = "Lorem Ipsum is simply one dummy text of the printing and two typesetting industry. Lorem Ipsum has been the industry's one standard dummy text ever since the three 1500s.";
$keys = ['one', 'two', 'three'];
Required output:
[
    'one' => 'standard dummy text ever since the',
    'two' => "typesetting industry. Lorem Ipsum has been the industry's",
    'three' => '1500s.'
]
I tried to write a regular expression to handle this task, but without success.
My latest attempt:
function getKeyedSections($text, $keys) {
    $keysArray = explode(',', $keys);
    $pattern = '/(?:' . implode('|', array_map('preg_quote', $keysArray)) . '):\s*(.*?)(?=\s*(?:' . implode('|', array_map('preg_quote', $keysArray)) . '):\s*|\z)/s';
    preg_match_all($pattern, $text, $matches);
    $keyedSections = [];
    foreach ($keysArray as $key) {
        foreach ($matches[1] as $index => $value) {
            if (stripos($matches[0][$index], $key) !== false) {
                $keyedSections[trim($key)] = trim($value);
                break;
            }
        }
    }
    return $keyedSections;
}
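For what it is worth, a quick check suggests two reasons this attempt cannot succeed on the sample input. First, explode() expects a string, so passing the $keys array to it raises a TypeError on PHP 8. Second, the pattern requires a colon directly after each key, and the sample text contains none. The sketch below reproduces only the second point; it skips the explode() call and builds the same pattern from the array directly (the variable names $alternation and $count are mine).

```php
<?php
$text = "Lorem Ipsum is simply one dummy text of the printing and two typesetting industry. Lorem Ipsum has been the industry's one standard dummy text ever since the three 1500s.";
$keys = ['one', 'two', 'three'];

// Same pattern as in the attempt, built from the array instead of explode(',', $keys).
$alternation = implode('|', array_map('preg_quote', $keys));
$pattern = '/(?:' . $alternation . '):\s*(.*?)(?=\s*(?:' . $alternation . '):\s*|\z)/s';

// The pattern only matches "one:", "two:" or "three:", none of which occur in $text.
$count = preg_match_all($pattern, $text, $matches);
var_dump($count); // int(0)
```

Dropping the colon alone would still leave the keys unanchored (no word boundaries) and would not capture which key each match belongs to.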
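Because the end of any segment can be marked by any of the search strings ($keys), a direct preg_match() pattern may be a bit too noisy (though not impossible).
Perhaps just split the string before each of the $keys values, then iterate over the segments and store the qualifying ones with the surrounding whitespace trimmed.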
$text = "Lorem Ipsum is simply one dummy text of the printing and two typesetting industry. Lorem Ipsum has been the industry's one standard dummy text ever since the three 1500s.";
$keys = ['one', 'two', 'three'];
// Split immediately before each key that appears as a whole word.
$segments = preg_split('#\b(?=(?:' . implode('|', array_map('preg_quote', $keys)) . ')\b)#', $text);
$result = [];
foreach ($segments as $segment) {
    foreach ($keys as $key) {
        if (str_starts_with($segment, $key)) {
            // Drop the leading key so only the text after it is stored; a repeated key overwrites its earlier value.
            $result[$key] = trim(substr($segment, strlen($key)));
            break;
        }
    }
}
var_export($result);
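I would like to point out that the result of the script above does not include search strings that were not matched; you did not say what the result should look like in that scenario.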
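If unmatched keys should still appear in the result, one option is to seed the result with every key before filling in the matches. The sketch below is a variation rather than the script above: it assumes a missing key should map to null, uses a hypothetical function name (collectSegments), and splits with PREG_SPLIT_DELIM_CAPTURE so that each key arrives as its own array element.

```php
<?php
// Hypothetical helper: every key appears in the result; keys not found in $text map to null.
function collectSegments(string $text, array $keys): array
{
    // Seed the result so unmatched keys are still present (null is an assumed default).
    $result = array_fill_keys($keys, null);
    if ($keys === []) {
        return $result;
    }

    // Escape each key, including the '#' pattern delimiter.
    $escaped = implode('|', array_map(fn($key) => preg_quote($key, '#'), $keys));

    // With PREG_SPLIT_DELIM_CAPTURE the array looks like:
    // [leading text, key, segment, key, segment, ...]
    $parts = preg_split('#\b(' . $escaped . ')\b#', $text, -1, PREG_SPLIT_DELIM_CAPTURE);

    // Keys sit at odd indexes, each followed by its segment.
    for ($i = 1; $i < count($parts); $i += 2) {
        // A repeated key overwrites its earlier value, so the last occurrence wins.
        $result[$parts[$i]] = trim($parts[$i + 1]);
    }

    return $result;
}

var_export(collectSegments('alpha one beta two gamma', ['one', 'two', 'four']));
// 'one' => 'beta', 'two' => 'gamma', 'four' => NULL
```

Swapping null for an empty string, or dropping the array_fill_keys() call, restores either of the other plausible behaviours.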
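Here is an alternative using preg_match_all(), which extracts every segment that follows any key and ends before the next key (or at the end of the text). The body-less foreach() simply lets later matches overwrite earlier ones and builds the desired associative result.
$escaped = implode('|', array_map('preg_quote', $keys));
// \K discards the key and the whitespace after it from the full match, so element 0 holds only the text that follows the key.
preg_match_all('#\b(' . $escaped . ')\b\s*\K.*?(?=\s*(?:$|\b(?:' . $escaped . ')\b))#', $text, $m, PREG_SET_ORDER);
foreach ($m as [1 => $key, 0 => $result[$key]]);
var_export($result ?? []);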
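To connect this back to the function the question asks for, the preg_match_all() idea can be wrapped up roughly as follows. This is only a sketch under a few assumptions of mine: the name extractKeyedSections is hypothetical, the value is taken from a second capture group rather than the full match, and the s modifier is added so that a segment may span line breaks.

```php
<?php
// Hypothetical wrapper around the preg_match_all() approach.
function extractKeyedSections(string $text, array $keys): array
{
    if ($keys === []) {
        return [];
    }

    // Escape each key, including the '#' pattern delimiter.
    $escaped = implode('|', array_map(fn($key) => preg_quote($key, '#'), $keys));

    // Group 1: the key. Group 2: the text after it, up to the next key or the end of the text.
    $pattern = '#\b(' . $escaped . ')\b\s*(.*?)(?=\s*(?:\z|\b(?:' . $escaped . ')\b))#s';
    preg_match_all($pattern, $text, $matches, PREG_SET_ORDER);

    $result = [];
    foreach ($matches as [, $key, $value]) {
        // A later occurrence of a key overwrites the earlier one.
        $result[$key] = $value;
    }

    return $result;
}

$text = "Lorem Ipsum is simply one dummy text of the printing and two typesetting industry. Lorem Ipsum has been the industry's one standard dummy text ever since the three 1500s.";
var_export(extractKeyedSections($text, ['one', 'two', 'three']));
```

On the sample text this yields the three entries listed as the required output in the question; keys that never occur are simply absent from the result.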
Do you need to pass in the keys at all? How about this, which picks up the keys from wherever they are marked in the text:
<?php
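$text = "Lorem Ipsum is simply **one** dummy text of the printing and **two** typesetting industry. Lorem Ipsum has been the industry's **one** standard dummy text ever since the **three** 1500s.";
$matches = [];
// Match each **key** marker plus the text after it, up to the next marker or the end of the text.
// (The earlier character class stopped at the first punctuation mark, which cut values short.)
preg_match_all('/(\*\*\w+\*\*)(.*?)(?=\*\*\w+\*\*|$)/s', $text, $matches);
$actualMatches = $matches[0];
$keys = $matches[1];
$index = 0;
$results = array_reduce($actualMatches, function($carry, $item) use ($keys, &$index) {
    $key = $keys[$index];
    // Strip the asterisks from the key and the marker from the matched text; a repeated key overwrites its earlier value.
    $carry[str_replace("*", "", $key)] = trim(substr($item, strlen($key)));
    $index++;
    return $carry;
}, []);
var_dump($results);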
?>
If you only need specific keys, here is an alternative:
<?php
$text = "Lorem Ipsum is simply **one** dummy text of the printing and **two** typesetting industry. Lorem Ipsum has been the industry's **one** standard dummy text ever since the **three** 1500s.";
$matches = [];
// Match each **key** marker plus the text after it, up to the next marker or the end of the text.
preg_match_all('/(\*\*\w+\*\*)(.*?)(?=\*\*\w+\*\*|$)/s', $text, $matches);
$actualMatches = $matches[0];
$keys = $matches[1];
$index = 0;
$targetKeys = ['one', 'three'];
$results = array_reduce($actualMatches, function($carry, $item) use ($keys, &$index, $targetKeys) {
    $key = $keys[$index];
    $cleanedKey = str_replace("*", "", $key);
    // Keep only the keys we were asked for; a repeated key overwrites its earlier value.
    if (in_array($cleanedKey, $targetKeys, true)) {
        $carry[$cleanedKey] = trim(substr($item, strlen($key)));
    }
    $index++;
    return $carry;
}, []);
var_dump($results);