I'm working on a text OCR project.
My Python code:
import requests
url = "https://translate.yandex.net/ocr/v1.1/recognize?srv=tr-image&sid=&lang=ru&rotate=auto&yu=&yum=&sprvk="
headers = {
    "Connection": "keep-alive",
    "sec-ch-ua": '"Chromium";v="118", "Google Chrome";v="118", "Not=A?Brand";v="99"',
    "sec-ch-ua-platform": "Windows",
    "sec-ch-ua-mobile": "?0",
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/118.0.0.0 Safari/537.36",
    "Accept": "*/*",
    "Origin": "https://translate.yandex.ru",
    "Sec-Fetch-Site": "cross-site",
    "Sec-Fetch-Mode": "cors",
    "Sec-Fetch-Dest": "empty",
    "Referer": "https://translate.yandex.ru/ocr",
    "Accept-Encoding": "gzip, deflate, br",
    "Accept-Language": "ru-RU,ru;q=0.9,en-US;q=0.8,en;q=0.7",
}
files = {
    'fieldNameHere': ('da.webp', open('da.webp', 'rb'), 'image/webp')
}
response = requests.post(url, headers=headers, files=files, verify=False)
print(response.text)
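What goes wrong: when I run the script without any man-in-the-middle proxy or sniffing application, I get no response.
But when I start Fiddler (an application for sniffing HTTP/HTTPS requests) and run the same code, I get a good response with all the data I need.
What could be causing this?
I tried using different headers, but that did not help either.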
You can try suppressing the security warnings at the urllib3 level. Also, don't send any headers= at all:
import requests
from urllib3.exceptions import InsecureRequestWarning
requests.packages.urllib3.disable_warnings(category=InsecureRequestWarning)
url = "https://translate.yandex.net/ocr/v1.1/recognize?srv=tr-image&sid=&lang=ru&rotate=auto&yu=&yum=&sprvk="
files = {
    "fieldNameHere": ("russian_text.png", open("russian_text.png", "rb"), "image/png")
}
response = requests.post(url, files=files, verify=False)
print(response.json())
Prints:
{
    "status": "success",
    "data": {
        "detected_lang": "ru",
        "rotate": 0,
        "blocks": [
            {
                "angle": 0,
                "x": 0,
                "y": 45,
                "w": 686,
                "h": 42,
                "rx": 0,
                "ry": 44,
                "rw": 686,
                "rh": 47,
                "boxes": [
...
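To work with a response shaped like the output above, something along these lines could pull out the per-block geometry. This is only a sketch: the helper name block_rectangles is made up for illustration, reading x, y, w, h as a rectangle is an assumption based on the field names, and the contents of "boxes" are not shown above, so they are left alone.

```python
def block_rectangles(result):
    """Return (x, y, w, h) for every block in a parsed OCR response.

    Assumes the layout shown in the output above:
    result["data"]["blocks"] is a list of dicts with x, y, w, h keys.
    A response without that structure yields an empty list.
    """
    blocks = result.get("data", {}).get("blocks", [])
    return [(b["x"], b["y"], b["w"], b["h"]) for b in blocks]


# Sample built only from the fields visible in the printed output.
sample = {
    "status": "success",
    "data": {
        "detected_lang": "ru",
        "rotate": 0,
        "blocks": [{"angle": 0, "x": 0, "y": 45, "w": 686, "h": 42}],
    },
}

print(block_rectangles(sample))
```

With a live response you would pass response.json() instead of sample.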
Image used:
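If you want to see what requests sends when headers= is omitted, and wrap the call so the file gets closed, a sketch like the following may help. The helper names default_request_headers and recognize are hypothetical, the timeout value is my own addition, and verify=False simply mirrors the code above. The URL and the fieldNameHere form field are copied from the question.

```python
import os
import requests

# URL and form field name copied from the question.
OCR_URL = "https://translate.yandex.net/ocr/v1.1/recognize?srv=tr-image&sid=&lang=ru&rotate=auto&yu=&yum=&sprvk="
FIELD = "fieldNameHere"


def default_request_headers(image_bytes, filename="image.png", mime="image/png"):
    """Build the POST without sending it and return the headers requests generates."""
    with requests.Session() as session:
        req = requests.Request(
            "POST", OCR_URL, files={FIELD: (filename, image_bytes, mime)}
        )
        # prepare_request merges the session defaults (User-Agent, Accept, ...)
        # and sets the multipart Content-Type with its boundary.
        return dict(session.prepare_request(req).headers)


def recognize(path, mime="image/png", timeout=30):
    """Send the image with no custom headers and return the parsed JSON."""
    with open(path, "rb") as fh:  # the file is closed once the request is done
        response = requests.post(
            OCR_URL,
            files={FIELD: (os.path.basename(path), fh, mime)},
            verify=False,     # as in the code above; disables certificate checks
            timeout=timeout,  # assumption: avoid hanging forever on no answer
        )
    response.raise_for_status()  # surface HTTP errors instead of an empty body
    return response.json()
```

Calling default_request_headers(b"test") makes no network request, so it is a cheap way to compare the generated headers with the hand-written ones from the question. recognize("russian_text.png") would perform the actual upload.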