PHP preg_replace non-alphanumeric characters and selected conjunctions, then split

Question

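In this string:

This is my Store, it has an amazing design; its creator says it was losing money and he doesn't want to maintain it

I want to replace all non-alphanumeric characters except the apostrophe ('), along with all of these selected conjunctions: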

is, it, its, the, this, if, so, and

So far I have got this result:

Array
(
    [1] => This
    [2] => my
    [3] => Store
    [4] => has
    [5] => an
    [6] => amazing
    [7] => design
    [8] => s
    [9] => creator
    [10] => says
    [11] => was
    [12] => losing
    [13] => money
    [14] => and
    [15] => he
    [16] => doesn
    [17] => t
    [18] => want
    [19] => maintain
)

Here is the code:

$string = "This is my Store, it has an amazing design; its creator says it was losing money and he doesn't want to maintain it";
$words = array_filter(preg_split('/\s+/', preg_replace('/\W|\b(it|the|its|is|to)|\b/i', ' ', $string)));

print_r($words);

https://3v4l.org/cLrM4

But as you can see, it is replacing "it" where it should replace "its" (leaving a stray "s"), and it is also replacing the apostrophe in "doesn't".

Can someone help me understand what I am doing wrong? X_X

P.S.: I also need it to be case-insensitive; /i is behaving really strangely :(

Thanks!

Answer 1

Change the regular expression to:

/\W\B|\b(it|the|its|is|to)\b/i

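The pipe in |\b makes no sense to me; perhaps it is a typo.

The additional \B after \W ensures that a non-word character is replaced only when it is not immediately followed by a word character. This is less restrictive than what you asked for, but it may be useful in other cases too, such as hyphenated words (e.g. mother-in-law).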
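To see the difference between the two patterns, here is a minimal sketch that runs both against the string from the question. The splitWords() helper is hypothetical and exists only for this illustration. It also uses PREG_SPLIT_NO_EMPTY instead of array_filter(), so the resulting keys start at 0.

```php
<?php
$string = "This is my Store, it has an amazing design; its creator says it was losing money and he doesn't want to maintain it";

// Pattern from the question: "it" is tried before "its" and nothing anchors
// the end of the word, so only the first two letters of "its" are replaced.
$original = '/\W|\b(it|the|its|is|to)|\b/i';

// Pattern from this answer: \W\B spares a non-word character that is directly
// followed by a word character, and the trailing \b forces whole-word matches.
$revised = '/\W\B|\b(it|the|its|is|to)\b/i';

// Hypothetical helper: replace every match with a space, then split on
// whitespace and discard empty pieces.
function splitWords(string $pattern, string $text): array
{
    $cleaned = preg_replace($pattern, ' ', $text);
    return preg_split('/\s+/', $cleaned, -1, PREG_SPLIT_NO_EMPTY);
}

$before = splitWords($original, $string); // still contains "s", "doesn" and "t"
$after  = splitWords($revised, $string);  // keeps "doesn't" and drops "its" entirely

print_r($after);
```

Note that this pattern only lists it, the, its, is and to, so "This" and "and" remain in the result; add them to the group if they should be removed as well.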

Answer 2

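First, remove all of the whole words on your blacklist (technically, not all of them are conjunctions in English) with a case-insensitive preg_replace() call.

Then use str_word_count() to extract the whole words (including contractions and hyphenated words).

Code: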

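$string = "This is my Store, it has an amazing design; its creator says it was losing money and he doesn't want to maintain it";

print_r(
    str_word_count(
        // the trailing \b ensures that only whole blacklisted words are removed
        preg_replace('/\b(?:its|i[stf]|the|this|so|and)\b/i', '', $string),
        1  // format 1 returns the words as a flat, indexed array
    )
);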

Output:

Array
(
    [0] => my
    [1] => Store
    [2] => has
    [3] => an
    [4] => amazing
    [5] => design
    [6] => creator
    [7] => says
    [8] => was
    [9] => losing
    [10] => money
    [11] => he
    [12] => doesn't
    [13] => want
    [14] => to
    [15] => maintain
)
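If the blacklist is likely to change, the same idea can be wrapped in a small function that builds the pattern from an array. This is only a sketch: extractWords() and its parameters are hypothetical names that do not appear in the answer, and it assumes PHP 7.4 or later for the arrow function.

```php
<?php
// Hypothetical helper: strips whole stop words (case-insensitively) and
// returns the remaining words as a flat, indexed array.
function extractWords(string $text, array $stopWords): array
{
    // Escape each word so that regex metacharacters are treated literally.
    $alternatives = array_map(fn($word) => preg_quote($word, '/'), $stopWords);

    // \b on both sides restricts the match to whole words only.
    $pattern = '/\b(?:' . implode('|', $alternatives) . ')\b/i';

    $stripped = preg_replace($pattern, '', $text);

    // Format 1 makes str_word_count() return the words instead of a count.
    return str_word_count($stripped, 1);
}

$string = "This is my Store, it has an amazing design; its creator says it was losing money and he doesn't want to maintain it";
$words = extractWords($string, ['is', 'it', 'its', 'the', 'this', 'if', 'so', 'and']);

print_r($words);
```

One caveat: by default str_word_count() does not treat digits as word characters, so if the input can contain numbers that must be kept, they need to be passed in its optional third argument.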
