How do I convert Unicode special characters to HTML entities?

Question   Votes: 0   Answers: 3

I have the following string:

$string = "★ This is some text ★";

I want to convert it to HTML entities:

$string = "&#9733; This is some text &#9733;";

The solution everyone suggests:

htmlentities("★ This is some text ★", "UTF-8");

But htmlentities cannot convert every Unicode character to an HTML entity, so it just gives me the same output as the input:

★ This is some text ★

I also tried combining this solution with both:

header('Content-Type: text/plain; charset=utf-8');

and:

mb_convert_encoding();

But this either prints an empty result, does not convert anything at all, or wrongly converts the star to:

Â

How do I convert ★ and all other Unicode characters to the correct HTML entities?

php unicode utf-8 html-entities
3 Answers
14
votes
In this case htmlentities does not work, but you can try encoding the string as UCS-4, for example:

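$string = "★ This is some text ★";
// Match every non-ASCII code point; the /u flag makes PCRE treat the subject as UTF-8.
$entity = preg_replace_callback('/[\x{80}-\x{10FFFF}]/u', function ($m) {
    $char = current($m);
    // UCS-4 stores the code point in 4 bytes.
    $utf = iconv('UTF-8', 'UCS-4', $char);
    // Hex-dump the bytes and drop the leading zeros to get the entity digits.
    return sprintf("&#x%s;", ltrim(strtoupper(bin2hex($utf)), "0"));
}, $string);
echo $entity;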

&#x2605; This is some text &#x2605;
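
To see why the UCS-4 step works: UCS-4 stores each character as its 32-bit code point, so a hex dump of the converted bytes is simply the code point padded with leading zeros. A minimal sketch, assuming the iconv extension is available and that the source file is saved as UTF-8. The explicit big-endian name `UCS-4BE` is a choice made for this illustration, so the result does not depend on the platform's default byte order:

```php
<?php
// Convert a single UTF-8 character to big-endian UCS-4 (4 bytes per character).
$utf = iconv('UTF-8', 'UCS-4BE', '★');

// Hex dump of the 4 bytes: code point U+2605, zero-padded to 8 hex digits.
$hex = bin2hex($utf);

// Stripping the leading zeros leaves the digits used in the entity.
$entity = '&#x' . ltrim(strtoupper($hex), '0') . ';';

echo $hex, "\n";    // 00002605
echo $entity, "\n"; // &#x2605;
```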



0
votes

A simpler version of @Pedro Lobito's answer:

$string = "★ This is some text ★";

$entitified = preg_replace_callback('/[\x{80}-\x{10FFFF}]/u', function ($m) {
    $char = $m[0];
    return sprintf("&#x%X;", mb_ord($char, 'UTF-8'));
}, $string);


echo $entitified;
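
The same callback should also cope with characters outside the Basic Multilingual Plane, because the `/u` flag makes the pattern match whole code points rather than bytes. A quick check, assuming PHP 7.2 or later (needed for `mb_ord`) and a UTF-8 source file. The helper name `to_hex_entities` is made up for this illustration:

```php
<?php
// Hypothetical helper wrapping the callback above; not part of PHP itself.
function to_hex_entities(string $string): string
{
    // Replace every non-ASCII code point with its hexadecimal numeric entity.
    return preg_replace_callback('/[\x{80}-\x{10FFFF}]/u', function ($m) {
        return sprintf('&#x%X;', mb_ord($m[0], 'UTF-8'));
    }, $string);
}

$result = to_hex_entities('★ zł 😀');
echo $result, "\n"; // &#x2605; z&#x142; &#x1F600;
```

ASCII characters such as `z` and the spaces pass through unchanged, since the character class starts at U+0080.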

Or, with the entities written in decimal instead of hexadecimal:

$string = "★ This is some text ★";

$entitified = preg_replace_callback('/[\x{80}-\x{10FFFF}]/u', function ($m) {
    $char = $m[0];
    return sprintf("&#%d;", mb_ord($char, 'UTF-8'));
}, $string);


echo $entitified;
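
As an alternative to the regex callback (a different technique from the one above, not a restatement of it), the mbstring extension has a built-in `mb_encode_numericentity` that does the same conversion. A minimal sketch, assuming mbstring is loaded. The map range is chosen here to mirror the `[\x{80}-\x{10FFFF}]` character class:

```php
<?php
$string = '★ This is some text ★';

// Map format: [first code point, last code point, offset, mask].
// This range covers everything from U+0080 up and leaves ASCII untouched.
$map = [0x80, 0x10FFFF, 0, 0x1FFFFF];

// Decimal entities (the default).
$decimal = mb_encode_numericentity($string, $map, 'UTF-8');

// Hexadecimal entities, enabled by the fourth argument.
$hex = mb_encode_numericentity($string, $map, 'UTF-8', true);

echo $decimal, "\n"; // &#9733; This is some text &#9733;
echo $hex, "\n";     // &#x2605; This is some text &#x2605;
```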

-1
votes

This is better:

html_entity_decode('z&#322;');

Output: zł
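
Note that html_entity_decode goes in the opposite direction: it turns entities back into characters, so it is more useful for checking encoded output than for producing it. A small sketch of that round trip, with the flags and charset passed explicitly (these argument choices are assumptions for the illustration):

```php
<?php
// Input mixing a hexadecimal and a decimal entity for the same character.
$encoded = '&#x2605; This is some text &#9733;';

// Decode both forms back to the UTF-8 character.
$decoded = html_entity_decode($encoded, ENT_QUOTES | ENT_HTML5, 'UTF-8');

echo $decoded, "\n"; // ★ This is some text ★
```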
