Explode a string on every second occurrence of a character

Question (votes: 0, answers: 4)

How can I split a string of text on every second occurrence of a specific character?

$text = "Online groceries is one of the few channels to market that is growing, though its profitability is questionable. According to industry research group IGD the UK online grocery market will nearly double to 18 billion pounds in the five years to 2020.Online groceries is one of the few channels to market that is growing, though its profitability is questionable. According to industry research group IGD the UK online grocery market will nearly double to 18 billion pounds in the five years to 2020.";

To explode on every dot, I would use:

$domain_fragmented = explode(".", $text);

Is there a way to do this while splitting the string, or do I have to explode it and then implode the pieces again to get the desired result?

php regex split explode preg-split
4 Answers
2
votes

Explode on every dot, then join the array entries back together in pairs:

$string = 'ab.cd.ef.gh.ij.kl.mn.op.qr';

$split = array_map(
    function($value) {
        return implode('.', $value);
    },
    array_chunk(explode('.', $string), 2)
);

var_dump($split);
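
The same idea can be broken into its three steps to make the intermediate arrays visible. This is only a sketch on a shortened, made-up input; the variable names $parts and $pairs are not from the answer above.

```php
<?php
// Illustrative input with an odd number of segments.
$string = 'ab.cd.ef.gh.ij';

// Step 1: split on every dot.
$parts = explode('.', $string);   // ['ab', 'cd', 'ef', 'gh', 'ij']

// Step 2: group the parts two at a time; the last group may hold one item.
$pairs = array_chunk($parts, 2);  // [['ab', 'cd'], ['ef', 'gh'], ['ij']]

// Step 3: glue each group back together with the dot that explode() removed.
$split = array_map(
    function (array $pair) {
        return implode('.', $pair);
    },
    $pairs
);

print_r($split);                  // ab.cd, ef.gh, ij
```

Note that if the input ends with a dot, explode() yields an empty last part, so the final entry of the result may be an empty string.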



1
votes

I would use preg_split, as follows:

$test = "this.is.a.test.to.see.how.preg_split.can.be.used.for.this";
// -1 means "no limit"; passing NULL here is deprecated as of PHP 8.1.
$res = preg_split("/(.*?\..*?)\./", $test, -1,
        PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE);
print_r($res);

Output

Array (
    [0] => this.is
    [1] => a.test
    [2] => to.see
    [3] => how.preg_split
    [4] => can.be
    [5] => used.for
    [6] => this
) 

Explanation

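The regular expression used here matches the text up to and including the second dot, and has a capture group that excludes that second dot. With the PREG_SPLIT_DELIM_CAPTURE option, every match of that capture group is included in the result.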

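Because every character of the original string ends up, one way or another, as part of a split delimiter, the remaining pieces (which are the normal results of a split) are all empty, except for the trailing part of the string. With the PREG_SPLIT_NO_EMPTY option, these empty strings are excluded from the result.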

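The regular expression does not capture the trailing part of the string, because the final dot is missing. However, the normal split behaviour adds that part to the array (as the piece that follows the last delimiter), so we still get everything we need, including the trailing part of the string.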
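
To see the empty strings mentioned above, the same pattern can be run with and without PREG_SPLIT_NO_EMPTY. This is a small sketch on a made-up input; the variable names $withEmpty and $clean are chosen for illustration only.

```php
<?php
// Short illustrative input with an odd number of segments.
$test = 'a.b.c.d.e';

// Only PREG_SPLIT_DELIM_CAPTURE: the empty pieces between delimiters remain.
$withEmpty = preg_split('/(.*?\..*?)\./', $test, -1, PREG_SPLIT_DELIM_CAPTURE);
print_r($withEmpty); // '', 'a.b', '', 'c.d', 'e'

// Adding PREG_SPLIT_NO_EMPTY drops those empty pieces.
$clean = preg_split(
    '/(.*?\..*?)\./',
    $test,
    -1,
    PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY
);
print_r($clean);     // 'a.b', 'c.d', 'e'
```

The trailing 'e' is not a capture at all; it is the ordinary split piece left after the last delimiter.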

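NB: In my original answer I put \.|$ at the end of the regular expression, intending for the end of the string to be recognised as a delimiter as well, but as explained above, that is not necessary. It works without the |$ part.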


0
votes

How about using preg_match_all?

/(?:(?:.+?\.){2})/

This way you get a list in which each item contains two sentences (each terminated by a dot).

$matches = null;
$returnValue = preg_match_all(
    '/(?:(?:.+?\\.){2})/',
    // Built as a single line: "." does not match newlines without the s modifier.
    '1_Online groceries is one of the few channels to market that is growing, though its profitability is questionable. '
    . '2_According to industry research group IGD the UK online grocery market will nearly double to 18 billion pounds in the five years to 2020.'
    . '3_Online groceries is one of the few channels to market that is growing, though its profitability is questionable. '
    . '4_According to industry research group IGD the UK online grocery market will nearly double to 18 billion pounds in the five years to 2020.',
    $matches
);

This returns the following in $matches:

array (
    0 => array (
        0 => '1_Online groceries is one of the few channels to market that is growing, though its profitability is questionable. 2_According to industry research group IGD the UK online grocery market will nearly double to 18 billion pounds in the five years to 2020.',
        1 => '3_Online groceries is one of the few channels to market that is growing, though its profitability is questionable. 4_According to industry research group IGD the UK online grocery market will nearly double to 18 billion pounds in the five years to 2020.',
    )
)

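Sadly, this is not perfect. I did not manage to capture any additional text dangling at the end of the input string (for example "sentence1.sentence2.bla"). If you know the input string always consists of pairs of sentences, everything is already fine. Otherwise, unless you (or someone else) can come up with an improved regular expression that captures the dangling text, you may want to strip what preg_match_all captured from the input string. Whatever is left over is then the remainder :)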
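
A minimal sketch of that trimming idea, assuming single-line input so that the matches are contiguous from the start of the string. The names $input, $pairs and $rest are made up for illustration and do not come from the answer.

```php
<?php
// Illustrative input: two complete pairs followed by dangling text.
$input = 'sentence1.sentence2.sentence3.sentence4.bla';

// Collect every run of two dot-terminated segments.
preg_match_all('/(?:.+?\.){2}/', $input, $matches);
$pairs = $matches[0];

// The matches cover the start of the string back to back, so their
// combined length tells us where the unmatched remainder begins.
$consumed = strlen(implode('', $pairs));
$rest = substr($input, $consumed);

print_r($pairs);   // 'sentence1.sentence2.', 'sentence3.sentence4.'
var_dump($rest);   // 'bla'
```

If the input could contain newlines or other text that the pattern skips over, the length-based offset would no longer be reliable, and the match offsets (PREG_OFFSET_CAPTURE) would be a safer basis.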


0
votes

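The most direct and appropriate function to call for this task is preg_split(). A pattern performs best when it uses greedy quantifiers and no capture groups. For these reasons, my regular expression is leaner than trincot's.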

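Match a literal dot, then zero or more non-dot characters, then "forget" those matched characters with \K, then match the next literal dot. This way, only every second dot is "consumed" as the delimiter for each split.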

Code:

var_export(
    preg_split(
        '/\.[^.]*\K\./',
        $string,
        0,
        PREG_SPLIT_NO_EMPTY
    )
);
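
A self-contained run of the same pattern, as a sketch with made-up inputs ($string is defined here only so the snippet runs on its own; $sentences and the result variable names are likewise illustrative):

```php
<?php
$string = 'ab.cd.ef.gh.ij.kl.mn.op.qr';

// \K resets the start of the match, so only the second dot is the delimiter.
$result = preg_split('/\.[^.]*\K\./', $string, 0, PREG_SPLIT_NO_EMPTY);
print_r($result);  // 'ab.cd', 'ef.gh', 'ij.kl', 'mn.op', 'qr'

// With sentence-like text, the consumed second dot is dropped from each piece,
// and the empty piece after the final dot is removed by PREG_SPLIT_NO_EMPTY.
$sentences = 'One. Two. Three. Four.';
$chunks = preg_split('/\.[^.]*\K\./', $sentences, 0, PREG_SPLIT_NO_EMPTY);
print_r($chunks);  // 'One. Two', ' Three. Four'
```

Because the delimiter dot is discarded, it has to be appended again if each chunk should end with its closing period.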