preg_replace returns NULL after encountering ó (o acute)

Question

I am reading and parsing a CSV file that is in ANSI. Before parsing, I want to remove every character that is not on a whitelist:

// remove any odd characters from string
$match_list = "\x{20}-\x{5f}\x{61}-\x{7e}"; // basic ascii chars excluding backtick
$match_list .= "\x{a1}-\x{ff}"; // extended latin 1 chars excluding control chars
$match_list .= "\x{20ac}\x{201c}\x{201d}"; // euro symbol & left/right double quotation mark (from Word)
$match_list .= "\x{2018}\x{2019}"; // left/right single quotation mark (from word)

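$cleaned_line = preg_replace("/[^$match_list]/u", "*", $linein);

The problem is that when it reaches a line containing an ó (o acute) character, it returns NULL. According to my text editor, this is xF3, so it should be allowed.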

Why does preg_replace fail here?

Update: it seems to be related to the file. If I copy the problem line from the CSV file and paste it into my PHP file, it works fine.

Update 2: using preg_last_error() I was able to determine that the error is:

PREG_BAD_UTF8_ERROR Returned by preg_last_error() if the last error was caused by malformed UTF-8 data (only when running a regex in UTF-8 mode).

My text editor simply reports the file as ANSI, but using the Unix file command I get:

% file PRICE_LIST_A.csv
PRICE_LIST_A.csv: Non-ISO extended-ASCII text, with CRLF line terminators
% file DOLLARS_PRICE_LIST.csv
DOLLARS_PRICE_LIST.csv: ISO-8859 text, with CRLF line terminators
% file PRICE_LIST_B.csv
PRICE_LIST_B.csv: Non-ISO extended-ASCII text, with CRLF line terminators
% file PRICE_LIST_TEST.csv
PRICE_LIST_TEST.csv: ASCII text, with CRLF line terminators

So it seems the same accounting application is giving me files in various encodings. I guess these are not valid Unicode.


Answer 1

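When you use /u (the PCRE_UTF8 modifier), make sure the subject string you pass is UTF-8. With that modifier both the pattern and the subject are treated as UTF-8, and a lone 0xF3 byte from a single-byte encoding is not a valid UTF-8 sequence, so preg_replace returns NULL and preg_last_error() reports PREG_BAD_UTF8_ERROR.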
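One way to apply this is to convert each line to UTF-8 before running the whitelist regex. The sketch below is a minimal illustration, not a definitive fix: the function name clean_line and the default source encoding Windows-1252 are assumptions (the question only says "ANSI", and the file output suggests the files are not all in the same encoding), and it relies on the mbstring extension being available.

```php
<?php
// Sketch only. clean_line() is a hypothetical helper, and Windows-1252 is an
// assumed source encoding; adjust it to whatever the files really contain.
function clean_line(string $linein, string $sourceEncoding = 'Windows-1252'): string
{
    // Convert only when the bytes are not already valid UTF-8.
    $utf8 = mb_check_encoding($linein, 'UTF-8')
        ? $linein
        : mb_convert_encoding($linein, 'UTF-8', $sourceEncoding);

    // Same whitelist as in the question.
    $match_list  = '\x{20}-\x{5f}\x{61}-\x{7e}';  // basic ASCII excluding backtick
    $match_list .= '\x{a1}-\x{ff}';               // extended Latin-1 excluding control chars
    $match_list .= '\x{20ac}\x{201c}\x{201d}';    // euro sign, left/right double quotes
    $match_list .= '\x{2018}\x{2019}';            // left/right single quotes

    $cleaned = preg_replace("/[^$match_list]/u", '*', $utf8);
    if ($cleaned === null) {
        // Surface the PCRE error code instead of silently passing NULL along.
        throw new RuntimeException('preg_replace failed, code ' . preg_last_error());
    }
    return $cleaned;
}
```

Note that the result is UTF-8, so any later output step has to expect UTF-8 as well. The validity check is a heuristic: a single-byte string whose bytes happen to form valid UTF-8 would be passed through unconverted, which is unlikely for typical text but not impossible.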
