How can I get a human-sounding voice (WaveNet or SSML voice)?


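I am using Google Cloud Text-to-Speech as described in their codelab: https://codelabs.developer.google.com/codelabs/cloud-text-speech-csharp#6

However, the codelab does not explain how to get WaveNet (SSML) voice output. The code below produces the standard voice.

My question is: with this code, how can I get a human-sounding voice (WaveNet or SSML voice)?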

using Google.Cloud.TextToSpeech.V1;
using System;
using System.IO;

namespace TextToSpeechApiDemo
{
    class Program
    {
        static void Main(string[] args)
        {
            var client = TextToSpeechClient.Create();

            // The input to be synthesized, can be provided as text or SSML.
            var input = new SynthesisInput
            {
                Text = "This is a demonstration of the Google Cloud Text-to-Speech API"
            };
            // Build the voice request.
            var voiceSelection = new VoiceSelectionParams
            {
                LanguageCode = "en-US",
                SsmlGender = SsmlVoiceGender.Female
            };
            };

            // Specify the type of audio file.
            var audioConfig = new AudioConfig
            {
                AudioEncoding = AudioEncoding.Mp3
            };

            // Perform the text-to-speech request.
            var response = client.SynthesizeSpeech(input, voiceSelection, audioConfig);

            // Write the response to the output file.
            using (var output = File.Create("output.mp3"))
            {
                response.AudioContent.WriteTo(output);
            }
            Console.WriteLine("Audio content written to file \"output.mp3\"");
        }
    }
}
api google-cloud-platform text-to-speech
1 Answer

You can check the languages and voices supported by the Text-to-Speech API in its documentation. As the tutorial describes, a voice is characterized by three parameters: language_code, name, and ssml_gender.
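As a rough illustration of how the name parameter separates voice types, the sketch below picks the WaveNet entries out of a list of voice names. The helper `wavenet_voice_names` and the sample list are hypothetical, and the code assumes names follow the `<language_code>-<type>-<variant>` pattern seen in `en-GB-Standard-A`. With the client library, the real names could be collected from `client.list_voices()`, as noted in the comment.

```python
def wavenet_voice_names(voice_names, language_code):
    """Return the WaveNet voice names for one language, sorted.

    Assumes names look like '<language_code>-Wavenet-<variant>'.
    """
    prefix = language_code + "-Wavenet-"
    return sorted(name for name in voice_names if name.startswith(prefix))


# Hypothetical sample; with the client library the list could come from:
#   [v.name for v in client.list_voices().voices]
sample = [
    "en-GB-Standard-A",
    "en-GB-Wavenet-A",
    "en-US-Standard-C",
    "en-US-Wavenet-F",
]

print(wavenet_voice_names(sample, "en-GB"))
```

Passing one of the returned names as `name` is what switches the request from a standard voice to a WaveNet one; which names are actually available should be checked against the API's voice list.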

You can use the following Python code to synthesize the text "Hello my name is John. How are you?" as English speech with the en-GB-Standard-A voice.

def synthesize_text(text):
    """Synthesizes speech from the input string of text."""
    # Written for google-cloud-texttospeech < 2.0; newer releases drop the
    # `types` and `enums` modules and expose these classes on `texttospeech`.
    from google.cloud import texttospeech
    client = texttospeech.TextToSpeechClient()

    input_text = texttospeech.types.SynthesisInput(text=text)

    # Note: the voice can also be specified by name.
    # Names of voices can be retrieved with client.list_voices().
    voice = texttospeech.types.VoiceSelectionParams(
        language_code='en-GB',
        name='en-GB-Standard-A',
        ssml_gender=texttospeech.enums.SsmlVoiceGender.FEMALE)

    audio_config = texttospeech.types.AudioConfig(
        audio_encoding=texttospeech.enums.AudioEncoding.MP3)

    response = client.synthesize_speech(input_text, voice, audio_config)

    # The response's audio_content is binary.
    with open('output.mp3', 'wb') as out:
        out.write(response.audio_content)
        print('Audio content written to file "output.mp3"')


text = "Hello my name is John. How are you?"
synthesize_text(text)
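
To get a WaveNet voice instead of a standard one, the same request can name a WaveNet voice, and SSML markup can be sent through the `ssml` field instead of `text`. The following is a minimal sketch, assuming google-cloud-texttospeech 2.0 or later (keyword arguments, no `types`/`enums` modules) and assuming the voice `en-GB-Wavenet-A` is available to your project. The helpers `language_code_of`, `is_ssml`, and `synthesize` are hypothetical names introduced for illustration.

```python
def language_code_of(voice_name):
    """Derive 'en-GB' from 'en-GB-Wavenet-A'.

    Assumes the name starts with '<language>-<region>-'.
    """
    return "-".join(voice_name.split("-")[:2])


def is_ssml(text):
    """Treat input that starts with a <speak> element as SSML."""
    return text.lstrip().startswith("<speak")


def synthesize(text, voice_name="en-GB-Wavenet-A", out_path="output.mp3"):
    """Synthesize text or SSML with a named voice and write an MP3 file."""
    # Imported lazily so the helpers above work without the library installed.
    from google.cloud import texttospeech

    client = texttospeech.TextToSpeechClient()

    # SSML goes in the `ssml` field, plain text in the `text` field.
    if is_ssml(text):
        synthesis_input = texttospeech.SynthesisInput(ssml=text)
    else:
        synthesis_input = texttospeech.SynthesisInput(text=text)

    # Naming a WaveNet voice is what selects the WaveNet model.
    voice = texttospeech.VoiceSelectionParams(
        language_code=language_code_of(voice_name),
        name=voice_name,
    )
    audio_config = texttospeech.AudioConfig(
        audio_encoding=texttospeech.AudioEncoding.MP3
    )

    response = client.synthesize_speech(
        input=synthesis_input, voice=voice, audio_config=audio_config
    )
    with open(out_path, "wb") as out:
        out.write(response.audio_content)
    return out_path
```

A call such as `synthesize("<speak>Hello <break time='500ms'/> my name is John.</speak>")` would then send SSML to the WaveNet voice; it needs valid Google Cloud credentials to run.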

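I am not familiar with C#, but according to the C# and Java documentation, you should also be able to set the name parameter there to choose the voice.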
