How should I understand the 'isPlaintext' method in HttpLoggingInterceptor.class?

Question

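I'm confused about how "human-readable text" is defined. I assumed that since Unicode covers the characters of almost every language, any valid code point would count as readable.

But in HttpLoggingInterceptor#isPlaintext(buffer):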

static boolean isPlaintext(Buffer buffer) {
    try {
        // Copy at most the first 64 bytes so the original buffer is not consumed.
        Buffer prefix = new Buffer();
        long byteCount = buffer.size() < 64 ? buffer.size() : 64;
        buffer.copyTo(prefix, 0, byteCount);
        // Inspect at most the first 16 code points.
        for (int i = 0; i < 16; i++) {
            if (prefix.exhausted()) {
                break;
            }
            int codePoint = prefix.readUtf8CodePoint();
            // A control character that is not whitespace suggests binary data.
            if (Character.isISOControl(codePoint) && !Character.isWhitespace(codePoint)) {
                return false;
            }
        }
        return true;
    } catch (EOFException e) {
        return false; // Truncated UTF-8 sequence.
    }
}

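This implies that if the bytes contain a non-whitespace control character, they are not considered readable.

What is the reason for this? Thanks.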

Answer 1

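As far as I can see, readUtf8CodePoint reads the next UTF-8 encoded code point from the given buffer.

From Wikipedia:

UTF-8 is a variable-width character encoding capable of encoding all 1,112,064 valid code points in Unicode using one to four 8-bit bytes.

So Unicode does not consist only of human-readable characters. UTF-8 text can also contain control characters, in the ranges \u0000 through \u001F and \u007F through \u009F, and these are not human readable. Whitespace controls such as tab and newline are exempted because they appear in ordinary text; any other control character near the start of the body suggests the content is probably binary.

Remember that Unicode is the standard, and UTF-8 is one of the ways to encode Unicode.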
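To make the condition concrete, here is a minimal sketch that applies the same two Character checks to a few sample code points. It uses only the JDK, not Okio; the class PlaintextCheckDemo and the helpers isNonWhitespaceControl and looksLikePlaintext are hypothetical names made up for this illustration, and looksLikePlaintext only approximates the real method by working on a String rather than a Buffer.

```java
public class PlaintextCheckDemo {

    // Same condition as in isPlaintext: a control character that is not whitespace.
    static boolean isNonWhitespaceControl(int codePoint) {
        return Character.isISOControl(codePoint) && !Character.isWhitespace(codePoint);
    }

    // Hypothetical String-based approximation: inspect at most the first 16 code points.
    static boolean looksLikePlaintext(String text) {
        int index = 0;
        for (int checked = 0; checked < 16 && index < text.length(); checked++) {
            int codePoint = text.codePointAt(index);
            if (isNonWhitespaceControl(codePoint)) {
                return false;
            }
            index += Character.charCount(codePoint);
        }
        return true;
    }

    public static void main(String[] args) {
        int[] samples = {0x00, 0x09, 0x0A, 0x41, 0x7F, 0x4E2D};
        for (int codePoint : samples) {
            System.out.printf("U+%04X rejected=%b%n", codePoint, isNonWhitespaceControl(codePoint));
        }
        // A JSON-like body passes; the first four bytes of a ZIP file do not.
        System.out.println(looksLikePlaintext("{\"name\": \"\u4E2D\u6587\"}\n"));
        System.out.println(looksLikePlaintext("PK\u0003\u0004"));
    }
}
```

NUL (U+0000) and DEL (U+007F) are rejected, while tab, line feed, the letter A, and a CJK character are accepted. That matches the intent of the check: ordinary text in any language passes, and content that begins like a binary format does not.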
