I'm fetching Persian text from a URL. For example, my text is فروردین, but when I print it I see 'ÙرÙردÛÙ'. How can I print it correctly?

    import 'dart:convert';
    import 'package:http/http.dart';
    import 'package:html/parser.dart';
    import 'package:html/dom.dart';

    Future<void> initiate() async {
      var client = Client();
      // Recent versions of package:http expect a Uri here rather than a String.
      Response response = await client.get(Uri.parse('https://www.varzesh3.com/'));
      var document = parse(response.body);
      List<Element> links = document.querySelectorAll('tr.match-date > td.text-center');
      for (var link in links) {
        print(link.text);
        //var bytes = utf8.encode(link.text);
      }
    }

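The problem seems to be that the client cannot detect the page's charset and falls back to latin1. `response.body` decodes the raw bytes using the charset named in the Content-Type header, and uses latin1 when no charset is given. In the code below I force UTF-8: I take the response as bytes (`response.bodyBytes`) and convert them with the utf8 decoder.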

    import 'dart:convert';
    import 'package:http/http.dart';
    import 'package:html/parser.dart';
    import 'package:html/dom.dart';

    Future<void> main() async {
      var client = Client();
      // Recent versions of package:http expect a Uri here rather than a String.
      Response response = await client.get(Uri.parse('https://www.varzesh3.com/'));
      // Decode the raw bytes as UTF-8 instead of relying on response.body.
      var document = parse(utf8.decode(response.bodyBytes), encoding: "utf8");
      List<Element> links = document.querySelectorAll(
          'tr.match-date > td.text-center');
      for (var link in links) {
        print(link.text);
      }
      client.close();
    }
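
To see why the latin1 fallback produces that kind of output, here is a small offline sketch that needs only `dart:convert` and no network access. It encodes the sample word as UTF-8, decodes the bytes once as latin1 and once as UTF-8, and then repairs the garbled string. The helper name `repairMojibake` is my own, not part of any library, and the sketch assumes the garbled string came from exactly one wrong latin1 decode of UTF-8 bytes.

```dart
import 'dart:convert';

/// Hypothetical helper (not a library function): repairs a string that was
/// decoded as latin1 even though its bytes were really UTF-8.
String repairMojibake(String garbled) => utf8.decode(latin1.encode(garbled));

void main() {
  const original = 'فروردین';

  // The bytes a server would send for this word on a UTF-8 page.
  final bytes = utf8.encode(original);

  // latin1 turns every single byte into one character, so each multi-byte
  // UTF-8 sequence is split into several unrelated characters.
  final garbled = latin1.decode(bytes);
  print(garbled == original); // false

  // Decoding the same bytes as UTF-8 gives the word back.
  print(utf8.decode(bytes) == original); // true

  // A string that is already garbled can be recovered by reversing the
  // wrong step: back to bytes via latin1, then decode as UTF-8.
  print(repairMojibake(garbled) == original); // true
}
```

If you still have the `Response` object, decoding `response.bodyBytes` directly as in the answer is simpler. The repair helper is only useful when all you have left is the already-garbled string.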