Requests returns bytes and I'm unable to decode them [duplicate]

Question

Essentially, I made a request to a website and got a bytes response back:

b'[{"geonameId:"703448"}..........'.

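I'm confused because, although it is of type bytes, it is very human-readable and looks like a JSON list. I know the response is encoded in latin1, because running r.encoding on it returned ISO-8859-1. I tried to decode it, but that just returns an empty string. Here is what I have so far: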

r = response.content              # raw response body, as bytes
string = r.decode("ISO-8859-1")   # decode using the reported encoding
print(string)

This is where a blank line gets printed. However, when I run len(string), I do get a length back.

How can I decode these bytes without getting an empty string back?

Answer 1

Have you tried parsing it with the json module?

import json
parsed = json.loads(response.content)
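
To try this without making a network call, here is a minimal sketch. The payload is a made-up, well-formed stand-in for response.content, not the asker's actual data.

```python
import json

# Made-up payload standing in for response.content (bytes).
raw = b'[{"geonameId": "703448"}]'

# json.loads accepts bytes directly on Python 3.6+,
# so no explicit .decode() call is needed first.
parsed = json.loads(raw)

print(type(parsed).__name__)   # list
print(parsed[0]["geonameId"])  # 703448
```

If json.loads raises a JSONDecodeError on the real body, that usually means the bytes are not the JSON text they appear to be, which is worth checking before blaming the decode step.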

Answer 2

Another solution is to use response.text, which returns the content as unicode:

Type:        property
String form: <property object at 0x7f76f8c79db8>
Docstring:  
Content of the response, in unicode.

If Response.encoding is None, encoding will be guessed using
``chardet``.

The encoding of the response content is determined based solely on HTTP
headers, following RFC 2616 to the letter. If you can take advantage of
non-HTTP knowledge to make a better guess at the encoding, you should
set ``r.encoding`` appropriately before accessing this property.
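
The last paragraph of that docstring matters when the header-derived encoding is wrong. A minimal sketch of the effect, using a made-up byte string in place of r.content and plain bytes.decode in place of the library (so the variable names here are illustrative only):

```python
# Made-up body: the UTF-8 encoding of "café", standing in for r.content.
raw = "café".encode("utf-8")

# Decoding with the wrong charset still succeeds, because every byte
# maps to some ISO-8859-1 character, but the result is mojibake.
as_latin1 = raw.decode("iso-8859-1")

# Decoding with the charset the bytes were really written in.
as_utf8 = raw.decode("utf-8")

print(as_latin1)  # cafÃ©
print(as_utf8)    # café
```

With requests, the equivalent of picking the second branch is assigning r.encoding = "utf-8" before reading r.text, as the docstring suggests.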

Answer 3

There is r.text and there is r.content. The first is a string, the second is bytes.

You want:

import json

data = json.loads(r.text)

Answer 4

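I ran into a similar problem while scraping a web page with beautifulsoup4 and requests, except that response.text and response.content both looked like bytes.

The response headers declared the encoding as 'Content-Type': 'text/html; charset=UTF-8', and they also contained 'Content-Encoding': 'br'.

It turned out that I didn't have brotlipy installed in my environment, and running

pip install brotlipy

solved my problem. I had assumed that chardet or cchardet would be enough, but the data has to be decompressed properly first.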

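A similar question was solved the same way here; I'm linking it to this answer because it didn't turn up until I explicitly searched for brotli compression.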
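
Brotli is not in the Python standard library, so the sketch below uses gzip as a stand-in to show the same failure mode: a body that is still compressed decodes without error but is not readable text. The payload is made up for illustration.

```python
import gzip
import json

# Made-up body, compressed the way a server might send it.
payload = b'[{"geonameId": "703448"}]'
compressed = gzip.compress(payload)

# ISO-8859-1 maps every byte to a character, so this never raises,
# but the result is binary noise rather than JSON.
garbled = compressed.decode("iso-8859-1")
looks_like_json = garbled.startswith("[")

# Decompress first, then parse.
restored = gzip.decompress(compressed)
parsed = json.loads(restored)

print(looks_like_json)  # False
print(parsed)           # [{'geonameId': '703448'}]
```

Checking response.headers.get("Content-Encoding") is a quick way to see whether the body was compressed and with which scheme.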
