I'm converting a user-submitted string from UTF-8 to (printable) ASCII:
$str = 'Thê qúïck 😈 brõwn fõx júmps?😈 Óvér thé lázy dõg?😈';
$out = iconv('UTF-8', 'ASCII//TRANSLIT', $str);
var_dump($out);
$out = 'The quick ? brown fox jumps?? Over the lazy dog??';
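I want to remove the extra ? characters from $out, that is, the question marks that iconv() inserted in place of characters it could not transliterate, while keeping the question marks that were already in the input.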
if ($out !== $str && strpos($out, '?') !== false) {
// The input string was modified and contains question marks...
//
// Not even really sure where to begin
//
// Do we need to compare the position of every character from the
// original string to every position of the new string and replace
// where the original string did not contain a question mark?
//
// That's all I can think of, but there has to be a better way.
}
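I want to keep every character that //TRANSLIT can transliterate, including the few in the example above (áéïõú = aeiou). There is no other nuance to this question. I think it boils down to a string comparison and replacement problem.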
I'm not necessarily looking for someone to write the whole thing, just a pointer in the right direction.
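The comparison idea from the comments above gets simpler if the string is transliterated one character at a time instead of being compared by position afterwards. What follows is only a minimal sketch under stated assumptions: translitDroppingUnknown() is a hypothetical helper name, and $translit is an assumed callable that converts one character, for example fn ($c) => iconv('UTF-8', 'ASCII//TRANSLIT', $c). What iconv() returns for an untransliterable character depends on the platform and locale, so the sketch treats both a "?" result and a false result as "drop this character".

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: transliterate one character at a time and drop any
 * character that only became "?" because it has no ASCII equivalent.
 * $translit is an assumed callable taking one UTF-8 character and
 * returning its ASCII form (or false on failure).
 */
function translitDroppingUnknown(string $str, callable $translit): string
{
    // Split into UTF-8 characters (not bytes).
    $chars = preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY);
    if ($chars === false) {
        return ''; // input was not valid UTF-8
    }

    $out = '';
    foreach ($chars as $char) {
        $ascii = $translit($char);
        // Skip failures, and "?" results that were not "?" to begin with.
        if ($ascii === false || ($ascii === '?' && $char !== '?')) {
            continue;
        }
        $out .= $ascii;
    }
    return $out;
}
```

A question mark typed by the user survives because its source character is already "?"; only substituted ones are dropped. Removed characters can leave doubled spaces behind, which this sketch does not clean up.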
A solution using transliterator_transliterate():
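// Latin-ASCII converts accented Latin letters to ASCII and leaves other characters (e.g. emoji) untouched.
$str = transliterator_transliterate('Latin-ASCII', 'Thê qúïck 😈 brõwn fõx júmps?😈 Óvér thé lázy dõg?😈');
// Remove every remaining non-ASCII byte.
$str = preg_replace('/[\x80-\xFF]/', '', $str);
echo $str;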
Output:
The quick  brown fox jumps? Over the lazy dog?
Note that transliterator_transliterate() leaves the emoji in place, so I use a regular expression to remove any remaining non-ASCII characters.
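The two steps can be wrapped into reusable functions. This is a rough sketch rather than a definitive implementation: the names stripNonPrintableAscii() and toPrintableAscii() are hypothetical, and the second one assumes the intl extension is loaded. Since the question asks for printable ASCII, the sketch keeps only bytes 0x20 to 0x7E (which also drops tabs and newlines) and collapses the doubled spaces left where a character was removed.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: keep only printable ASCII (0x20-0x7E) and collapse
 * the runs of spaces left behind where characters were removed.
 */
function stripNonPrintableAscii(string $str): string
{
    // Without the /u flag the pattern matches bytes, so every byte of a
    // multibyte sequence (such as an emoji) is removed.
    $ascii = preg_replace('/[^\x20-\x7E]/', '', $str) ?? '';

    // Collapse runs of spaces and trim the ends.
    return trim(preg_replace('/ {2,}/', ' ', $ascii) ?? '');
}

/**
 * Hypothetical wrapper: transliterate, then strip. Requires ext-intl.
 */
function toPrintableAscii(string $str): string
{
    $latin = transliterator_transliterate('Latin-ASCII', $str);
    if ($latin === false) {
        throw new RuntimeException('Transliteration failed');
    }
    return stripNonPrintableAscii($latin);
}
```

Calling toPrintableAscii() on the example string should keep the original question marks, because nothing in this pipeline substitutes "?" for unknown characters.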