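I am trying to update the Speak() method of the experimental DirectLineSpeech Echo Bot sample to use a neural voice, but it does not seem to work.
Here is the code I am trying to get working: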
public IActivity Speak(string message)
{
    var activity = MessageFactory.Text(message);
    // Wrap the message in SSML that requests the neural voice with the 'chat' style.
    string body = @"<speak version='1.0' xmlns='https://www.w3.org/2001/10/synthesis' xmlns:mstts='https://www.w3.org/2001/mstts' xml:lang='en-US'>
        <voice name='en-US-JessaNeural'><mstts:express-as type='chat'>" +
        $"{message}" + "</mstts:express-as></voice></speak>";
    activity.Speak = body;
    return activity;
}
This is based on the recommendations in the SSML Guide.
Here is the standard text-to-speech version for reference:
public IActivity Speak(string message)
{
    var activity = MessageFactory.Text(message);
    // Wrap the message in SSML that requests the standard (non-neural) voice.
    string body = @"<speak version='1.0' xmlns='https://www.w3.org/2001/10/synthesis' xml:lang='en-US'>
        <voice name='Microsoft Server Speech Text to Speech Voice (en-US, JessaRUS)'>" +
        $"{message}" + "</voice></speak>";
    activity.Speak = body;
    return activity;
}
Can someone help me understand how this works, or what I am doing wrong?
The exact name of the neural voice is Microsoft Server Speech Text to Speech Voice (en-US, JessaNeural), not en-US-JessaNeural (see the linked documentation).
So change the following:
<voice name='en-US-JessaNeural'><mstts:express-as type='chat'>" + $"{message}" + "</mstts:express-as></voice></speak>";
to:
<voice name='Microsoft Server Speech Text to Speech Voice (en-US, JessaNeural)'><mstts:express-as type='chat'>" + $"{message}" + "</mstts:express-as></voice></speak>";
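Since the only difference between the two snippets is the voice name, one way to make this kind of mistake easier to spot is to pass the voice name in as a parameter and let System.Xml.Linq build the SSML. The following is only a minimal sketch, not part of the sample: SsmlBuilder and Build are hypothetical names, the namespace URIs are copied from the question as-is, and the 'chat' style is assumed to be supported by the chosen voice. A side benefit is that XElement escapes characters such as & and < in the message, which the string concatenation above does not.

```csharp
using System.Xml.Linq;

// Hypothetical helper (not part of the Echo Bot sample).
public static class SsmlBuilder
{
    // Namespace URIs copied verbatim from the question.
    private static readonly XNamespace Ssml = "https://www.w3.org/2001/10/synthesis";
    private static readonly XNamespace Mstts = "https://www.w3.org/2001/mstts";

    // Builds a <speak> document that reads `message` with the given voice,
    // wrapped in an mstts:express-as element with the given style.
    public static string Build(string message, string voiceName, string style = "chat")
    {
        var speak = new XElement(Ssml + "speak",
            new XAttribute("version", "1.0"),
            // Declares the mstts prefix on the root element.
            new XAttribute(XNamespace.Xmlns + "mstts", Mstts.NamespaceName),
            new XAttribute(XNamespace.Xml + "lang", "en-US"),
            new XElement(Ssml + "voice",
                new XAttribute("name", voiceName),
                new XElement(Mstts + "express-as",
                    new XAttribute("type", style),
                    message))); // text content is XML-escaped automatically

        // Emit on a single line, without indentation.
        return speak.ToString(SaveOptions.DisableFormatting);
    }
}
```

Inside Speak(), the body could then be assigned with activity.Speak = SsmlBuilder.Build(message, "Microsoft Server Speech Text to Speech Voice (en-US, JessaNeural)");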