Parsing escaped commas in a CSV file with PHP

Question (votes: 0, answers: 1)

I am trying to parse a CSV file, but I run into a problem with an escaped comma when parsing the following line.

<?php
$str = "19018216307,Public,\,k]'=system1-system2,20230914143505.5,1-050000,No";
$data = str_getcsv($str);
?>

Output:

<?php
Array
(
    [0] => 19018216307
    [1] => Public
    [2] => \
    [3] => k]'=system1-system2
    [4] => 20230914143505.5
    [5] => 1-050000
    [6] => No
)
?>

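Consider the column value \,k]'=system1-system2. I expect it to be parsed as the single value ,k]'=system1-system2. Instead, PHP treats the escaped comma as a delimiter and splits the value into two columns: \ and k]'=system1-system2.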

Expected output:

<?php
Array
(
    [0] => 19018216307
    [1] => Public
    [2] => ,k]'=system1-system2
    [3] => 20230914143505.5
    [4] => 1-050000
    [5] => No
)
?>

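NOTE: The CSV file is raw data generated by an external website, so I cannot change its contents (for example, by wrapping the column values in double quotes).

Thanks in advance!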

php csv fgetcsv php-8.1
1 Answer
0
votes

One way to deal with this odd "csv format":

$str = "19018216307,Public,\,k]'=system1-system2,20230914143505.5,1-050000,No";

$pattern = <<<'REGEX'
~(?nxx)
    (?# modifiers:
        - inline n: parentheses act as non-capturing groups
        - inline xx: spaces are ignored even in character classes
        - global A: all the matches have to be contiguous
    )

    # pattern
    ( (?!\A) , \K | \A ) # either a comma when not at the start (dropped from the match by \K), or the start of the string
    [^ , \\ ]* ( \\ . [^ , \\ ]* )* # field content: runs of characters other than comma and backslash, plus backslash-escaped characters
    
    # check
    ( \z (*:END) )? # define a marker if the end of the string is reached
~A
REGEX;

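// The END marker is only set when a match reaches the end of the string,
// so isset($m['MARK']) confirms that the whole line was consumed.
if (preg_match_all($pattern, $str, $m) && isset($m['MARK'])) {
    // Turn each escaped comma back into a plain comma.
    $result = array_map(fn($s) => strtr($s, ['\\,' => ',']), $m[0]);
    print_r($result);
}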
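
If the regex is hard to maintain, the same idea can be written as a plain character scan. This is only a minimal sketch, not part of the answer above: the function name splitEscapedCsvLine and its parameters are hypothetical, and it assumes the file uses a backslash only as an escape and never uses quoted fields. Like the regex version, it removes the backslash only in front of a comma and leaves every other backslash sequence untouched.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: split one line on the delimiter, treating the escape
 * character as protecting whichever character follows it.
 * Assumes no quoted fields, as in the sample data from the question.
 */
function splitEscapedCsvLine(string $line, string $delimiter = ',', string $escape = '\\'): array
{
    $fields  = [];
    $current = '';
    $length  = strlen($line);

    for ($i = 0; $i < $length; $i++) {
        $char = $line[$i];

        if ($char === $escape && $i + 1 < $length) {
            $next = $line[$i + 1];
            // Escaped delimiter: keep the delimiter and drop the escape.
            // Any other escaped character: keep both characters unchanged.
            $current .= ($next === $delimiter) ? $delimiter : $char . $next;
            $i++; // skip the character that was just consumed
            continue;
        }

        if ($char === $delimiter) {
            // Unescaped delimiter: the current field is complete.
            $fields[] = $current;
            $current  = '';
            continue;
        }

        $current .= $char;
    }

    // The last field has no trailing delimiter.
    $fields[] = $current;

    return $fields;
}

$str = "19018216307,Public,\\,k]'=system1-system2,20230914143505.5,1-050000,No";
print_r(splitEscapedCsvLine($str));
```

For a whole file, the helper could be called on each line read with fgets(), after stripping the trailing line break with rtrim($line, "\r\n"). The scan works byte by byte, which is safe for UTF-8 input because the comma and the backslash are both single-byte ASCII characters.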