I have been using this library to translate text with Microsoft Cognitive Translator.
PHP Microsoft Translate package
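I have an Azure account, and I believe my endpoint and key are still valid. I am on the "free" tier, but I haven't changed anything in the meantime.
I have actually used this code to translate text into a number of languages, and it worked fine before. For some reason it recently stopped working. I edited the post to set:
$host = "https://api.cognitive.microsofttranslator.com"; (thank you, that was a typo).
If I set $text to something simple, it actually works: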
$text = 'Guten Morgen! Mit Corona zusammenraufen.Milde gesagt: Es sind keine besonders beruhigenden Nachrichten.';
It returns the English:
{"text":"Good morning! To put it mildly, it's not particularly reassuring news.","to":"en"}
If I make the string slightly longer:
$text = 'Guten Morgen! Mit Corona zusammenraufen. Milde gesagt: Es sind keine besonders beruhigenden Nachrichten. Milde gesagt: Es sind keine besonders beruhigenden Nachrichten.';
I get:
failed to open stream: HTTP request failed! HTTP/1.1 400 Bad Request
Much longer input strings used to work. I am on a "free" plan with Azure. Could my Azure account now be limiting the string length?
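As a debugging aid for the 400 above: by default file_get_contents() discards the response body on an HTTP error, so the reason for the failure is never shown. A minimal sketch follows, assuming the service returns its usual JSON error shape ({"error": {"code": ..., "message": ...}}). The helper names buildContextOptions and describeError are hypothetical and not part of the original script; 'ignore_errors' is a standard PHP HTTP context option.

```php
<?php
// Hypothetical helper: same headers as the script, plus 'ignore_errors'
// so the body of a 4xx/5xx response is returned instead of discarded.
function buildContextOptions(string $key, string $content): array
{
    $headers = "Content-type: application/json\r\n" .
        "Content-length: " . strlen($content) . "\r\n" .
        "Ocp-Apim-Subscription-Key: $key\r\n";

    return [
        'http' => [
            'header'        => $headers,
            'method'        => 'POST',
            'content'       => $content,
            'ignore_errors' => true, // keep the body even on HTTP 400
        ],
    ];
}

// Hypothetical helper: returns a readable message if the body is an
// error object, or null if it looks like a normal response.
function describeError(string $body): ?string
{
    $decoded = json_decode($body, true);
    if (!is_array($decoded) || !isset($decoded['error'])) {
        return null;
    }
    $code    = $decoded['error']['code'] ?? 'unknown';
    $message = $decoded['error']['message'] ?? 'no message';

    return "Translator error $code: $message";
}
```

Passing stream_context_create(buildContextOptions($key, $content)) to file_get_contents() and then running the result through describeError() should show what the service is actually objecting to.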
The code I have is below. It includes some extra bits for storing the translations in a database.
$key = 'MicrosoftKEY';
$host = "https://api.cognitive.microsofttranslator.com";
$path = "/translate?api-version=3.0";
$languages = $data['languagearray']; // array of languages to translate to
$params = '&from=' . $data["from"]; // language to translate from
foreach ($languages as $language) {
    $params .= "&to=" . $language;
}
$params .= "&textType=html";
$text = $data["text"]; // text to translate

// Fallback GUID generator for platforms without com_create_guid()
if (!function_exists('com_create_guid')) {
    function com_create_guid() {
        return sprintf( '%04x%04x-%04x-%04x-%04x-%04x%04x%04x',
            mt_rand( 0, 0xffff ), mt_rand( 0, 0xffff ),
            mt_rand( 0, 0xffff ),
            mt_rand( 0, 0x0fff ) | 0x4000,
            mt_rand( 0, 0x3fff ) | 0x8000,
            mt_rand( 0, 0xffff ), mt_rand( 0, 0xffff ), mt_rand( 0, 0xffff )
        );
    }
}

// Function to return the translated text; see below: $json = jsonp_decode($result, true)[0]["translations"];
function Translate ($host, $path, $key, $params, $content) {
    $headers = "Content-type: application/json\r\n" .
        "Content-length: " . strlen($content) . "\r\n" .
        "Ocp-Apim-Subscription-Key: $key\r\n" .
        "X-ClientTraceId: " . com_create_guid() . "\r\n";
    // NOTE: Use the key 'http' even if you are making an HTTPS request. See:
    // http://php.net/manual/en/function.stream-context-create.php
    $options = array (
        'http' => array (
            'header' => $headers,
            'method' => 'POST',
            'content' => $content
        )
    );
    $context = stream_context_create ($options);
    $result = file_get_contents ($host . $path . $params, false, $context);
    echo $result;
    return $result;
}

$requestBody = array (
    array (
        'Text' => $text,
    ),
);

function jsonp_decode($jsonp, $assoc = false) { // PHP 5.3 adds depth as third parameter to json_decode
    if($jsonp[0] !== '[' && $jsonp[0] !== '{') { // we have JSONP
        $jsonp = substr($jsonp, strpos($jsonp, '('));
    }
    return json_decode(trim($jsonp,'();'), $assoc);
}

$content = json_encode($requestBody);
$result = Translate ($host, $path, $key, $params, $content);
// Note: We convert result, which is JSON, to and from an object so we can pretty-print it.
// We want to avoid escaping any Unicode characters that result contains. See:
// http://php.net/manual/en/function.json-encode.php
$json = jsonp_decode($result, true)[0]["translations"];
$conn = DatabaseFactory::getFactory()->getConnection();
foreach ($json as $language) {
    $query = 'UPDATE kronen_translations SET translated_text = ? WHERE language_code = ?';
    $parameters = [$language['text'], $language['to']];
    $stmt = $conn->prepare($query);
    $stmt->execute($parameters);
    echo $language['to'] . '<br>';
    echo $language['text'] . '<br>';
}
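I investigated further and eventually opened a ticket with Microsoft Cognitive Services. This script worked fine a few months ago. In fact, I was using a bundled call to translate several paragraphs into about 40 different languages (nearly all the ones they support) in a single request, and the response came back with all of the translations.
However, the Translator API has a character limit per request:
See: Request limits for Translator Text
"Each translate request is limited to 5,000 characters, across all the target languages you are translating to. For example, sending a translate request of 1,500 characters to translate to 3 different languages results in a request size of 1,500x3 = 4,500 characters, which satisfies the request limit. You're charged per character, not by the number of requests. It's recommended to send shorter requests."
With 40 different languages and several paragraphs of a few hundred characters each, I was exceeding the limit. After trading emails with MS support, it turned out that, because of a bug dating from around the end of last year, the API request limit was not being enforced, and that this was apparently fixed recently.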
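To make the arithmetic in the quoted limit concrete, here is a small sketch that estimates the request size as characters times target languages. The function names and the REQUEST_CHAR_LIMIT constant are my own, and the count is only an approximation: it assumes the mbstring extension is available and that the service counts characters roughly the way mb_strlen() does (markup included when textType=html).

```php
<?php
// Limit taken from the quoted documentation; the constant name is an assumption.
const REQUEST_CHAR_LIMIT = 5000;

// Estimated request size: characters in the text times number of target languages.
function requestSize(string $text, int $targetCount): int
{
    return mb_strlen($text, 'UTF-8') * $targetCount;
}

// True if the estimated size stays within the documented limit.
function fitsInOneRequest(string $text, int $targetCount): bool
{
    return requestSize($text, $targetCount) <= REQUEST_CHAR_LIMIT;
}
```

With the documentation's own example, 1,500 characters and 3 targets give 4,500 and fit. A 300-character paragraph sent to 40 languages gives 12,000 and does not, which matches the failure I was seeing.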
FYI, the old code was something like the following, with a query string containing 40 or so &to=lang& entries:
$languages = $data['languagearray'];
$params = '&from=' . $data["from"];
foreach ($languages as $language) {
    $params .= "&to=" . $language;
}
$params .= "&textType=html";
$text = $data["text"];

if (!function_exists('com_create_guid')) {
    function com_create_guid() {
        return sprintf( '%04x%04x-%04x-%04x-%04x-%04x%04x%04x',
            mt_rand( 0, 0xffff ), mt_rand( 0, 0xffff ),
            mt_rand( 0, 0xffff ),
            mt_rand( 0, 0x0fff ) | 0x4000,
            mt_rand( 0, 0x3fff ) | 0x8000,
            mt_rand( 0, 0xffff ), mt_rand( 0, 0xffff ), mt_rand( 0, 0xffff )
        );
    }
}

function Translate ($host, $path, $key, $params, $content) {
    $headers = "Content-type: application/json\r\n" .
        "Content-length: " . strlen($content) . "\r\n" .
        "Ocp-Apim-Subscription-Key: $key\r\n" .
        "X-ClientTraceId: " . com_create_guid() . "\r\n";
    // NOTE: Use the key 'http' even if you are making an HTTPS request. See:
    // http://php.net/manual/en/function.stream-context-create.php
    $options = array (
        'http' => array (
            'header' => $headers,
            'method' => 'POST',
            'content' => $content
        )
    );
    $context = stream_context_create ($options);
    $result = file_get_contents ($host . $path . $params, false, $context);
    echo $result;
    return $result;
}

$requestBody = array (
    array (
        'Text' => $text,
    ),
);

function jsonp_decode($jsonp, $assoc = false) { // PHP 5.3 adds depth as third parameter to json_decode
    if($jsonp[0] !== '[' && $jsonp[0] !== '{') { // we have JSONP
        $jsonp = substr($jsonp, strpos($jsonp, '('));
    }
    return json_decode(trim($jsonp,'();'), $assoc);
}

$content = json_encode($requestBody);
$result = Translate ($host, $path, $key, $params, $content);
// Note: We convert result, which is JSON, to and from an object so we can pretty-print it.
// We want to avoid escaping any Unicode characters that result contains. See:
// http://php.net/manual/en/function.json-encode.php
$json = jsonp_decode($result, true)[0]["translations"];
$conn = DatabaseFactory::getFactory()->getConnection();
foreach ($json as $language) {
    $query = 'UPDATE kronen_translations SET translated_text = ? WHERE language_code = ?';
    $parameters = [$language['text'], $language['to']];
    $stmt = $conn->prepare($query);
    $stmt->execute($parameters);
    echo $language['to'] . '<br>';
    echo $language['text'] . '<br>';
}
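That let you make a bundled call like this and then decode the results that came back.
The solution is to make a series of calls, one per language, and process the results in sequence. The new code is below. There may be a better way, but at least I can now handle reasonably sized blocks of text. Running the script for 40 or so languages takes a little while, though still only about 30 seconds, depending on the size of the text.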
function Translate ($host, $path, $key, $params, $content) {
    $headers = "Content-type: application/json\r\n" .
        "Content-length: " . strlen($content) . "\r\n" .
        "Ocp-Apim-Subscription-Key: $key\r\n" .
        "X-ClientTraceId: " . com_create_guid() . "\r\n";
    // NOTE: Use the key 'http' even if you are making an HTTPS request. See:
    // http://php.net/manual/en/function.stream-context-create.php
    $options = array (
        'http' => array (
            'header' => $headers,
            'method' => 'POST',
            'content' => $content
        )
    );
    $context = stream_context_create ($options);
    $result = file_get_contents ($host . $path . $params, false, $context);
    echo $result;
    return $result;
}

function jsonp_decode($jsonp, $assoc = false) { // PHP 5.3 adds depth as third parameter to json_decode
    if($jsonp[0] !== '[' && $jsonp[0] !== '{') { // we have JSONP
        $jsonp = substr($jsonp, strpos($jsonp, '('));
    }
    return json_decode(trim($jsonp,'();'), $assoc);
}

$key = 'xxxxx';
$host = "https://api.cognitive.microsofttranslator.com";
$path = "/translate?api-version=3.0";
$languages = $data['languagearray'];
// print_r($languages);
$conn = DatabaseFactory::getFactory()->getConnection();
$text = $data["text"];

// The request body is the same for every language, so build it once.
$requestBody = array (
    array (
        'Text' => $text,
    ),
);
$content = json_encode($requestBody);

// One request per target language keeps each request under the size limit.
foreach ($languages as $language) {
    $params = '&from=' . $data["from"];
    $params .= "&to=" . $language;
    $params .= "&textType=html";
    $result = Translate ($host, $path, $key, $params, $content);
    $json = jsonp_decode($result, true)[0]["translations"];
    // Separate loop variable so the outer $language is not overwritten.
    foreach ($json as $translation) {
        $query = 'UPDATE kronen_translations SET translated_text = ? WHERE language_code = ?';
        $parameters = [$translation['text'], $translation['to']];
        $stmt = $conn->prepare($query);
        $stmt->execute($parameters);
        echo $translation['to'] . '<br>';
        echo $translation['text'] . '<br>';
    }
}
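One possible middle ground between a single bundled call and one call per language is to group the target languages so that each request stays under the limit. This is only a sketch and not what I actually run: chunkLanguages and buildParams are hypothetical helpers, the 5,000 default comes from the documentation quoted above, and the character count via mb_strlen() is an approximation of how the service measures request size.

```php
<?php
// Hypothetical helper: split the target languages into groups so that
// (characters in $text) x (languages per group) stays within $limit.
function chunkLanguages(array $languages, string $text, int $limit = 5000): array
{
    $length = max(1, mb_strlen($text, 'UTF-8'));
    $perRequest = intdiv($limit, $length);
    if ($perRequest < 1) {
        // Even a single target language would exceed the limit.
        throw new InvalidArgumentException('Text alone exceeds the limit; split the text first.');
    }

    return array_chunk($languages, $perRequest);
}

// Hypothetical helper: build the query string for one group of targets,
// in the same shape the script above uses.
function buildParams(string $from, array $targets): string
{
    $params = '&from=' . rawurlencode($from);
    foreach ($targets as $target) {
        $params .= '&to=' . rawurlencode($target);
    }

    return $params . '&textType=html';
}
```

Each group returned by chunkLanguages() could then be passed through buildParams() and the existing Translate() function, and the translations array in each response looped over exactly as before. For short texts this would cut the number of requests considerably; for long texts it degrades to one call per language, which is what the code above already does.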