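I am getting a list of WiFi SSIDs from a Bluetooth characteristic. Each SSID is represented as a string, and some of them contain escaped UTF-8 bytes such as "\xc3\xa6".
I have tried several ways to decode these, for example: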
let s = "\\xc3\\xa6"
let dec = s.utf8
From this I expected
print(dec)
> æ
and so on. But it does not work; it only produces
print(dec)
> \xc3\xa6
How can I decode UTF-8 literals in a string in Swift 5?
You only need to parse the string, convert each hex pair to a UInt8, and then decode the resulting bytes with String.init(bytes:encoding:):
import Foundation

let s = "\\xc3\\xa6"
let bytes = s
    .components(separatedBy: "\\x")
    // components(separatedBy:) would produce an empty string as the first element
    // because the string starts with "\x". We drop this
    .dropFirst()
    .compactMap { UInt8($0, radix: 16) }
if let decoded = String(bytes: bytes, encoding: .utf8) {
    print(decoded)
} else {
    print("The UTF8 sequence was invalid!")
}
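One thing to be aware of with the snippet above: compactMap silently drops any component that is not valid hex, so malformed input can decode to something unexpected instead of failing. A minimal sketch of a stricter variant follows, assuming the whole input consists only of "\xNN" escapes. The function name decodeEscapedUTF8 is hypothetical and not part of any library.

```swift
import Foundation

/// Hypothetical helper: strict variant of the approach above.
/// Assumes the entire input is a sequence of "\xNN" escapes and
/// returns nil for anything else, or if the bytes are not valid UTF-8.
func decodeEscapedUTF8(_ escaped: String) -> String? {
    let tokens = escaped.components(separatedBy: "\\x")
    // The first component is whatever precedes the first "\x"; it must be empty.
    guard tokens.first == "" else { return nil }

    var bytes: [UInt8] = []
    for token in tokens.dropFirst() {
        // Each escape must be exactly two hex digits.
        guard token.count == 2,
              token.allSatisfy({ $0.isHexDigit }),
              let byte = UInt8(token, radix: 16) else { return nil }
        bytes.append(byte)
    }
    // All bytes are decoded together so multi-byte characters stay intact.
    return String(bytes: bytes, encoding: .utf8)
}
```

With this version, a caller can tell "could not decode" apart from a successful result by checking for nil, instead of getting a partially decoded string.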
To convert an entire string, we can use the new Regex Builder in Swift 5.7 and build on the accepted answer. We might do something like this:
import Foundation
import RegexBuilder

extension String {
    public func decodingUTF8Characters() throws -> String {
        // Match a whole run of consecutive "\xNN" escapes. A multi-byte
        // character such as "\xc3\xa6" must be decoded as one unit;
        // a lone 0xC3 byte is not valid UTF-8 by itself.
        let regex = Regex {
            Capture {
                OneOrMore {
                    #"\x"#
                    Repeat(count: 2) {
                        .hexDigit
                    }
                }
            }
        }
        return try self.replacing(regex) { match in
            let run = String(match.output.1)
            // Split the run back into hex pairs and collect the bytes.
            let bytes = run
                .components(separatedBy: #"\x"#)
                .dropFirst()
                .compactMap { UInt8($0, radix: 16) }
            guard let decodedString = String(bytes: bytes, encoding: .utf8) else {
                throw UTF8DecodingError(string: run)
            }
            return decodedString
        }
    }

    struct UTF8DecodingError: LocalizedError {
        let string: String
        var errorDescription: String? {
            "Couldn't decode '\(string)' as UTF8 string"
        }
    }
}
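Since the question targets Swift 5 and Regex Builder needs Swift 5.7 or later, here is a rough sketch of the same idea without Regex. It scans the string by hand, collects the bytes of consecutive "\xNN" escapes, and decodes each run as one UTF-8 sequence while leaving all other text untouched. The name decodingHexEscapes(in:) is hypothetical, and returning nil for an invalid byte run is a design assumption rather than something the answers above prescribe.

```swift
import Foundation

/// Hypothetical helper: decodes every run of "\xNN" escapes in `input`
/// as UTF-8 and copies all other characters through unchanged.
/// Returns nil if a run of escaped bytes is not valid UTF-8.
func decodingHexEscapes(in input: String) -> String? {
    let chars = Array(input)
    var result = ""
    var pending: [UInt8] = []   // bytes gathered from consecutive escapes
    var i = 0

    // Decodes the gathered bytes as one UTF-8 sequence and appends the text.
    func flush() -> Bool {
        guard !pending.isEmpty else { return true }
        guard let decoded = String(bytes: pending, encoding: .utf8) else { return false }
        result += decoded
        pending.removeAll()
        return true
    }

    while i < chars.count {
        // An escape is a backslash, an "x", and exactly two hex digits.
        if i + 3 < chars.count,
           chars[i] == "\\", chars[i + 1] == "x",
           chars[i + 2].isHexDigit, chars[i + 3].isHexDigit,
           let byte = UInt8(String(chars[i + 2 ... i + 3]), radix: 16) {
            pending.append(byte)
            i += 4
        } else {
            // Ordinary character: finish any pending byte run first.
            guard flush() else { return nil }
            result.append(chars[i])
            i += 1
        }
    }
    return flush() ? result : nil
}
```

For example, decodingHexEscapes(in: "Caf\\xc3\\xa9 WiFi") yields "Café WiFi", while an SSID without escapes is returned unchanged.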