Convert an alphanumeric string from camelCase to snake_case using preg_replace()

Question

I currently have a method that converts my camelCase strings to snake_case, but it is split across three preg_replace() calls:

public function camelToUnderscore($string, $us = "-")
{
    // insert hyphen between any letter and the beginning of a numeric chain
    $string = preg_replace('/([a-z]+)([0-9]+)/i', '$1'.$us.'$2', $string);
    // insert hyphen between any lower-to-upper-case letter chain
    $string = preg_replace('/([a-z]+)([A-Z]+)/', '$1'.$us.'$2', $string);
    // insert hyphen between the end of a numeric chain and the beginning of an alpha chain
    $string = preg_replace('/([0-9]+)([a-z]+)/i', '$1'.$us.'$2', $string);

    // Lowercase
    $string = strtolower($string);

    return $string;
}

I wrote tests to verify its accuracy, and it works correctly with the following array of inputs (array('input' => 'output')):

$test_values = [
    'foo'       => 'foo',
    'fooBar'    => 'foo-bar',
    'foo123'    => 'foo-123',
    '123Foo'    => '123-foo',
    'fooBar123' => 'foo-bar-123',
    'foo123Bar' => 'foo-123-bar',
    '123FooBar' => '123-foo-bar',
];

I would like to know whether there is a way to reduce my preg_replace() calls to a single line that gives me the same result. Any ideas?

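NOTE: Referring to this post, my research turned up a preg_replace() regex that gets me almost the result I want, except that it does not handle the foo123 example, which should be converted to foo-123.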

Answer 1

You can use lookarounds to do all of this in a single regex:

function camelToUnderscore($string, $us = "-") {
    return strtolower(preg_replace(
        '/(?<=\d)(?=[A-Za-z])|(?<=[A-Za-z])(?=\d)|(?<=[a-z])(?=[A-Z])/', $us, $string));
}

RegEx Demo

Code Demo

RegEx explanation:

(?<=\d)(?=[A-Za-z])  # the previous character is a digit and the next is a letter
|                    # OR
(?<=[A-Za-z])(?=\d)  # the previous character is a letter and the next is a digit
|                    # OR
(?<=[a-z])(?=[A-Z])  # the previous character is lowercase and the next is uppercase
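
As a quick check, the lookaround pattern can be run against the test values from the question. This is only a minimal sketch: the function name `splitWithLookarounds` and the `$expected` array are illustrative names and are not part of the answer itself.

```php
<?php
// Hypothetical wrapper around the lookaround pattern from this answer.
function splitWithLookarounds(string $string, string $us = '-'): string
{
    // Every alternative matches a zero-width position, so no characters
    // are consumed and only the delimiter is inserted.
    $pattern = '/(?<=\d)(?=[A-Za-z])|(?<=[A-Za-z])(?=\d)|(?<=[a-z])(?=[A-Z])/';

    return strtolower(preg_replace($pattern, $us, $string));
}

// Input => expected output pairs taken from the question.
$expected = [
    'foo'       => 'foo',
    'fooBar'    => 'foo-bar',
    'foo123'    => 'foo-123',
    '123Foo'    => '123-foo',
    'fooBar123' => 'foo-bar-123',
    'foo123Bar' => 'foo-123-bar',
    '123FooBar' => '123-foo-bar',
];

foreach ($expected as $input => $output) {
    $actual = splitWithLookarounds((string) $input);
    // Flag any pair where the result differs from the question's table.
    echo $input, ' => ', $actual, ($actual === $output ? '' : '  (MISMATCH)'), PHP_EOL;
}
```

Because the pattern only matches positions, the replacement is just the delimiter; no capture groups or back-references are needed.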

Answer 2

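Following up on the duplicate post that I flagged earlier, here are my two cents. The accepted solution here is great. I just wanted to try solving it with what was shared there: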

function camelToUnderscore($string, $us = "-") {
    return strtolower(preg_replace('/(?<!^)[A-Z]+|(?<!^|\d)[\d]+/', $us.'$0', $string));
}

Example:

Array
(
    [0] => foo
    [1] => fooBar
    [2] => foo123
    [3] => 123Foo
    [4] => fooBar123
    [5] => foo123Bar
    [6] => 123FooBar
)

foreach ($arr as $item) {
    echo camelToUnderscore($item);
    echo "\r\n";
}

Output:

foo
foo-bar
foo-123
123-foo
foo-bar-123
foo-123-bar
123-foo-bar

Explanation:

(?<!^)[A-Z]+      // Match one or more capital letters not at the start of the string
|                 // OR
(?<!^|\d)[\d]+    // Match one or more digits not at the start of the string and not preceded by a digit

$us.'$0'          // Replace each match with the delimiter followed by the matched text

Online regex

The question is already solved, so I won't say I hope it helps, but maybe someone will find this useful.

Edit

This regex has limitations:

foo123bar => foo-123bar
fooBARFoo => foo-barfoo

Thanks to @urban for pointing this out. Here is his link testing the three solutions posted on this question:

Demo of the three solutions
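
The two limitations listed above can be reproduced with a short script. This is a sketch only; the function name `prefixRuns` is made up for illustration and wraps the pattern from this answer unchanged.

```php
<?php
// Hypothetical wrapper: prefix each matched run with the delimiter.
function prefixRuns(string $string, string $us = '-'): string
{
    // Only runs of capitals and runs of digits can start a match,
    // so a lowercase letter that follows digits never gets a delimiter.
    return strtolower(preg_replace('/(?<!^)[A-Z]+|(?<!^|\d)[\d]+/', $us . '$0', $string));
}

echo prefixRuns('foo123bar'), PHP_EOL; // foo-123bar  (nothing is inserted before "bar")
echo prefixRuns('fooBARFoo'), PHP_EOL; // foo-barfoo  ("BARF" is consumed as one capital run)
echo prefixRuns('foo123Bar'), PHP_EOL; // foo-123-bar (a capital after digits still works)
```

The first case fails because neither alternative can begin on a lowercase letter; the second because `[A-Z]+` greedily takes the whole run of capitals, including the `F` that starts the next word.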

Answer 3

From a colleague:

$string = preg_replace(array($pattern1, $pattern2), $us.'$1', $string);

might be useful.

My solution:

public function camelToUnderscore($string, $us = "-")
{
    $patterns = [
        '/([a-z]+)([0-9]+)/i',
        '/([a-z]+)([A-Z]+)/',
        '/([0-9]+)([a-z]+)/i'
    ];
    $string = preg_replace($patterns, '$1'.$us.'$2', $string);

    // Lowercase
    $string = strtolower($string);

    return $string;
}
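
When preg_replace() is given an array of patterns and a single replacement string, it applies the patterns in order, each to the result of the previous one, so the array form should behave like the three separate calls from the question. The sketch below compares the two; the function names `threeCalls` and `oneCallWithArray` are illustrative assumptions, written as plain functions rather than class methods so the script runs on its own.

```php
<?php
// The original approach from the question: three sequential calls.
function threeCalls(string $s, string $us = '-'): string
{
    $s = preg_replace('/([a-z]+)([0-9]+)/i', '$1' . $us . '$2', $s);
    $s = preg_replace('/([a-z]+)([A-Z]+)/', '$1' . $us . '$2', $s);
    $s = preg_replace('/([0-9]+)([a-z]+)/i', '$1' . $us . '$2', $s);

    return strtolower($s);
}

// The array form from this answer: one call, same patterns, same order.
function oneCallWithArray(string $s, string $us = '-'): string
{
    $patterns = [
        '/([a-z]+)([0-9]+)/i',
        '/([a-z]+)([A-Z]+)/',
        '/([0-9]+)([a-z]+)/i',
    ];

    return strtolower(preg_replace($patterns, '$1' . $us . '$2', $s));
}

$inputs = ['foo', 'fooBar', 'foo123', '123Foo', 'fooBar123', 'foo123Bar', '123FooBar'];

foreach ($inputs as $input) {
    // Print both results side by side so any difference is visible.
    echo $input, ' => ', threeCalls($input), ' | ', oneCallWithArray($input), PHP_EOL;
}
```

Note that this shortens the code to a single call, but the regex engine still makes one pass over the string per pattern.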

Answer 4

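You don't need to put up with the inefficiency of multiple lookarounds or multiple sets of patterns to target the position between words or runs of digits.

Use greedy matching to find the desired sequence, then reset the full-string match with \K, then check that the position is not the end of the string. Every position that qualifies receives the delimiting character. The speed of this greedy pattern comes from the fact that it consumes one or more characters per sequence and never looks back.

I'll omit the strtolower() call from my answer because it is merely noise to the challenge.

Code: (Demo)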

preg_replace(
    '/(?:\d++|[A-Za-z]?[a-z]++)\K(?!$)/',
    '-',
    $tests
)
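
To see the \K approach end to end, the snippet can be run on the question's inputs. This is a minimal sketch: the `$tests` contents are taken from the question, while the variable names `$delimited` and `$result` and the separate lowercasing step are assumptions added for illustration.

```php
<?php
$tests = ['foo', 'fooBar', 'foo123', '123Foo', 'fooBar123', 'foo123Bar', '123FooBar'];

// preg_replace() accepts an array of subjects and returns an array of results.
// \K drops the consumed run from the reported match, leaving a zero-width
// match at the position after it; (?!$) rejects that position at the string end.
$delimited = preg_replace('/(?:\d++|[A-Za-z]?[a-z]++)\K(?!$)/', '-', $tests);

// Lowercasing is applied separately, as in the other answers.
$result = array_map('strtolower', $delimited);

print_r(array_combine($tests, $result));
```

Since the consumed run is discarded by \K, the replacement string is just the delimiter, and the possessive quantifiers (`++`) keep the engine from backtracking into a run it has already matched.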

Processing between letters/digits:

User	Steps	Pattern	Replacement
anubhava	660	`/(?<=\d)(?=[A-Za-z])|(?<=[A-Za-z])(?=\d)|(?<=[a-z])(?=[A-Z])/`	`'-'`
mickmackusa	337	`/(?:\d++|[A-Za-z]?[a-z]++)\K(?!$)/`	`'-'`

Strict camelCase handling:

User	Steps	Pattern	Replacement
爵士	321	`/(?<!^)[A-Z]+|(?<!^|\d)[\d]+/`	`'-$0'`
mickmackusa	244	`/(?:\d++|[a-z]++)\K(?!$)/`	`'-'`

I discounted @Matt's answer because it makes three full passes over each string; in terms of efficiency, it is not even in the same ballpark.
