Extract parts of a string into an array in PHP

Question. Votes: 1, Answers: 3

I have a string that I need to explode and extract information from.

Example string:

"20' Container 1, 40' Open Container 1, 40-45' Closed Container 3, container roll 10, container lift 50"

First, I explode the string on commas and get

"20' Container 1"
"40' Open Container 1"
"40-45' Closed Container 3"

Now I want to split each element of the exploded array again, so that I get a result in the following format

array[
    0 => [
        0 => "20'"
        1 => "Container"
        2 => "1"
        ]
    1 => [
        0 => "40'"
        1 => "Open Container"
        2 => "1"
        ]
    2 => [
          0=> container roll
          1=> 10
         ]
    3=> [
         0=> container lift
         1 => 50
        ]
    ]

The strings may vary, but the format is always the same: length type number, where length is optional.

This is what I am doing:

$pattern = '/([\d-]*\')\s(.*)\s(\d+)/';
foreach (explode(', ', $equipment->chassis_types) as $value) {
    preg_match($pattern, $value, $matches); // Match length, type, number
    $result[] = array_slice($matches, 1);   // Slice with offset 1
    $equipment->tokenized = $result;
}

I get

Array
(
    [0] => Array
        (
            [0] => 20'
            [1] => container
            [2] => 10
        )

    [1] => Array
        (
            [0] => 40'
            [1] => open container
            [2] => 10
        )

    [2] => Array
        (
            [0] => 40-45'
            [1] => closed container
            [2] => 20
        )

    [3] => Array
        (
        )

    [4] => Array
        (
        )

)
php arrays regex string preg-match
3 Answers
2 votes

With the example you gave, you could use

<?php

$string = "20' Container 1, 40' Open Container 1, 40-45' Closed Container 3, container roll 10, container lift 50";

$regex = "~
        (?:(?P<group1>\d+(?:-\d+)?')\h*)?
        (?P<group2>(?i:[a-z]+\h?)+)\h+
        (?P<group3>\d+(?:'')?)
        ~x";

if (preg_match_all($regex, $string, $matches, PREG_SET_ORDER)) {
    print_r($matches);
}
?>

See a demo on regex101.com.


This yields:
Array
(
    [0] => Array
        (
            [0] => 20' Container 1
            [group1] => 20'
            [1] => 20'
            [group2] => Container
            [2] => Container
            [group3] => 1
            [3] => 1
        )

    [1] => Array
        (
            [0] => 40' Open Container 1
            [group1] => 40'
            [1] => 40'
            [group2] => Open Container
            [2] => Open Container
            [group3] => 1
            [3] => 1
        )

    [2] => Array
        (
            [0] => 40-45' Closed Container 3
            [group1] => 40-45'
            [1] => 40-45'
            [group2] => Closed Container
            [2] => Closed Container
            [group3] => 3
            [3] => 3
        )

    [3] => Array
        (
            [0] => container roll 10
            [group1] => 
            [1] => 
            [group2] => container roll
            [2] => container roll
            [group3] => 10
            [3] => 10
        )

    [4] => Array
        (
            [0] => container lift 50
            [group1] => 
            [1] => 
            [group2] => container lift
            [2] => container lift
            [group3] => 50
            [3] => 50
        )

)


The core regex is
(?:                               # non-capturing group
    (?P<group1>\d+(?:-\d+)?')\h*  # group1 = digits, optionally "-digits", then '
)?                                # make the whole group optional
(?P<group2>(?i:[a-z]+\h?)+)\h+    # group2 = letter-only words separated by horizontal whitespace (no digits)
(?P<group3>\d+(?:'')?)            # group3 = digits, optionally followed by ''
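The named groups can then be reshaped into the nested array the question asks for. The following is only a sketch built on the pattern above: the variable $rows is a name introduced here for illustration, and the empty length is simply skipped for entries that have none.

```php
<?php
$string = "20' Container 1, 40' Open Container 1, 40-45' Closed Container 3, container roll 10, container lift 50";

// Same pattern as in the answer above.
$regex = "~
        (?:(?P<group1>\d+(?:-\d+)?')\h*)?
        (?P<group2>(?i:[a-z]+\h?)+)\h+
        (?P<group3>\d+(?:'')?)
        ~x";

$rows = []; // hypothetical name for the reshaped result
if (preg_match_all($regex, $string, $matches, PREG_SET_ORDER)) {
    foreach ($matches as $m) {
        $row = [];
        // group1 is an empty string when the entry has no length.
        if ($m['group1'] !== '') {
            $row[] = $m['group1'];
        }
        $row[] = $m['group2'];
        $row[] = $m['group3'];
        $rows[] = $row;
    }
}
print_r($rows);
```

Entries with a length come out as three elements (length, type, number), and entries without one as two (type, number).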

0 votes

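Assuming only the length can be missing, you could try this pattern, which I adapted from your existing one, together with array_filter() to remove the empty elements from each $matches.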

$pattern = '/([\d-]*\')?\s?(\D+)\s(\d+)/';
foreach (explode(', ', $equipment->chassis_types) as $value) {
    preg_match($pattern, $value, $matches);
    $result[] = array_slice(array_filter($matches), 1);
}
$equipment->tokenized = $result;

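Changes to your pattern:

  • ? after the first capturing group makes it optional, so it is skipped when the length is absent
  • \s? likewise skips the first space when the first group is absent
  • (.*) is changed to (\D+) to match any run of non-digit characters (assuming the type never contains digits)

Note: I moved the line $equipment->tokenized = $result; outside the loop, so it is set once instead of being reassigned on every iteration.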
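A self-contained variant of this approach is sketched below. It is not part of the original answer: $chassisTypes is a hypothetical stand-in for $equipment->chassis_types, and the "spare 0" entry is made-up input added to show one caveat. Without a callback, array_filter() drops every falsy value, including the string "0", so this sketch filters out only empty strings. It also skips segments that do not match at all, rather than appending an empty array.

```php
<?php
// Hypothetical stand-in for $equipment->chassis_types; "spare 0" is invented input.
$chassisTypes = "20' Container 1, 40-45' Closed Container 3, container roll 10, spare 0";

$pattern = '/([\d-]*\')?\s?(\D+)\s(\d+)/';
$result = [];
foreach (explode(', ', $chassisTypes) as $value) {
    if (!preg_match($pattern, $value, $matches)) {
        continue; // skip segments that do not fit the format
    }
    // Remove only empty strings, so a count of "0" is kept.
    $parts = array_filter($matches, function ($part) {
        return $part !== '';
    });
    // Drop the full match; array_slice() also reindexes the numeric keys.
    $result[] = array_slice($parts, 1);
}
print_r($result);
```

With the default array_filter() call, the last entry would lose its "0" and come back as a single-element array.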


0 votes

You can use * to make the leading digits and the ' optional.

$str = '20\' Container 1, 40\' Open Container 1, 40-45\' Closed Container 3, container roll 10, container lift 50';
preg_match_all('/(\d*\'*)\s([a-zA-Z ]+)(\d+)/', $str, $matches);
var_dump($matches);

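This gives the following output. Note that the range 40-45' is captured as 45', because the pattern does not allow a hyphen, and that the captured types keep a trailing space: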

array(4) {
  [0]=>
  array(5) {
    [0]=>
    string(15) "20' Container 1"
    [1]=>
    string(20) "40' Open Container 1"
    [2]=>
    string(22) "45' Closed Container 3"
    [3]=>
    string(18) " container roll 10"
    [4]=>
    string(18) " container lift 50"
  }
  [1]=>
  array(5) {
    [0]=>
    string(3) "20'"
    [1]=>
    string(3) "40'"
    [2]=>
    string(3) "45'"
    [3]=>
    string(0) ""
    [4]=>
    string(0) ""
  }
  [2]=>
  array(5) {
    [0]=>
    string(10) "Container "
    [1]=>
    string(15) "Open Container "
    [2]=>
    string(17) "Closed Container "
    [3]=>
    string(15) "container roll "
    [4]=>
    string(15) "container lift "
  }
  [3]=>
  array(5) {
    [0]=>
    string(1) "1"
    [1]=>
    string(1) "1"
    [2]=>
    string(1) "3"
    [3]=>
    string(2) "10"
    [4]=>
    string(2) "50"
  }
}

To get an array closer to what you want, you can use array_column() to regroup the matches as you like.

$str = '20\' Container 1, 40\' Open Container 1, 40-45\' Closed Container 3, container roll 10, container lift 50';
preg_match_all('/(\d*\'*)\s([a-zA-Z ]+)(\d+)/', $str, $matches);
unset($matches[0]); // remove full match as it's not needed.

$res =[];
foreach($matches[1] as $key => $val){
    $res[] = array_column($matches, $key);
}
var_dump($res);

https://3v4l.org/4rGod
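The output above shows two rough edges: the range 40-45' is cut down to 45', and each type keeps a trailing space. One possible adjustment is sketched here. The pattern is a variation written for this example, not the one from the answer, and it uses PREG_SET_ORDER so that no regrouping with array_column() is needed.

```php
<?php
$str = '20\' Container 1, 40\' Open Container 1, 40-45\' Closed Container 3, container roll 10, container lift 50';

// Assumed variation of the pattern:
//  - (\d+(?:-\d+)?\') accepts a hyphenated range such as 40-45'
//  - the length and its following space are optional as one unit
//  - the type is letter-only words joined by single spaces, so it has no trailing space
$pattern = '/(?:(\d+(?:-\d+)?\')\s)?([a-zA-Z]+(?: [a-zA-Z]+)*)\s(\d+)/';
preg_match_all($pattern, $str, $matches, PREG_SET_ORDER);

$res = [];
foreach ($matches as $m) {
    $row = array_slice($m, 1); // drop the full match
    if ($row[0] === '') {
        array_shift($row);     // no length present: keep only type and number
    }
    $res[] = $row;
}
var_dump($res);
```

Each element of $res then holds length, type and number, or just type and number when the length is missing.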
