Actually, this is an extension of a question I asked earlier. This is the working block that converts text into speech in an audio file.
import json
from ibm_watson import TextToSpeechV1
from ibm_watson import ApiException
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

authenticator = IAMAuthenticator("(API KEY)")
text_to_speech = TextToSpeechV1(
    authenticator=authenticator
)
text_to_speech.set_service_url(
    'https://api.us-south.text-to-speech.watson.cloud.ibm.com/instances/113cd664-f07b-44fe-a11d-a46cc50caf84')

try:
    # Synthesize the text and save the returned audio as a WAV file
    with open('hello_world.wav', 'wb') as audio_file:
        audio_file.write(
            text_to_speech.synthesize(
                'Hello world',
                voice='en-US_AllisonVoice',
                accept='audio/wav'
            ).get_result().content)
except ApiException as ex:
    print("Method failed with status code " + str(ex.code) + ": " + ex.message)
Using the code provided above, how can I combine the IBM API with Python so that I ask a question with my voice and get a different response back from the IBM voice depending on what I said? Right now the input text is where 'hello world' sits — that is what gets sent to IBM, and then, using the selected Allison voice, it saves a WAV file of that text read by Allison. If anyone has ideas on how I could do this, please leave a piece of code as an example. I'm not very comfortable with Python indentation.
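One way the loop described above could be wired together: recognize the spoken question first, map it to a reply, then hand that reply to `synthesize` instead of the hard-coded 'Hello world'. This is only a sketch — the rules inside `build_response` are placeholders to swap for your own logic, and the API key and service URL are the placeholders from the code above.

```python
def build_response(query):
    """Map the recognized text to a reply (placeholder rules -- swap in your own)."""
    query = query.lower()
    if 'hello' in query:
        return 'Hello! How can I help you?'
    if 'joke' in query:
        return 'I only know one joke, and this is it.'
    return "Sorry, I didn't understand that."

def listen():
    """Record one utterance from the microphone and return the recognized text."""
    import speech_recognition as sr  # pip install SpeechRecognition
    r = sr.Recognizer()
    with sr.Microphone() as source:
        print('Listening...')
        audio = r.listen(source)
    return r.recognize_google(audio, language='en-US')

def speak(text, filename='response.wav'):
    """Synthesize `text` with Allison's voice and save it as a WAV file."""
    from ibm_watson import TextToSpeechV1
    from ibm_cloud_sdk_core.authenticators import IAMAuthenticator
    tts = TextToSpeechV1(authenticator=IAMAuthenticator('(API KEY)'))
    tts.set_service_url('https://api.us-south.text-to-speech.watson.cloud.ibm.com/'
                        'instances/113cd664-f07b-44fe-a11d-a46cc50caf84')
    with open(filename, 'wb') as audio_file:
        audio_file.write(
            tts.synthesize(text, voice='en-US_AllisonVoice',
                           accept='audio/wav').get_result().content)

# To run the full loop (needs a microphone and valid credentials):
#     speak(build_response(listen()))
```

Because the response logic is a plain function, you can test and extend it without touching the audio side at all.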
Here is my attempt at merging the IBM APIs for Text to Speech and Speech to Text into one piece of code. It is very flawed and I don't know what to do. Any help would be great.
import json
import os
import wikipedia
import pyjokes
import speech_recognition as sr
from ibm_watson import TextToSpeechV1
from ibm_watson import SpeechToTextV1
from ibm_watson import ApiException
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

authenticator = IAMAuthenticator("API Key")
text_to_speech = TextToSpeechV1(
    authenticator=authenticator
)
authenticator = IAMAuthenticator('API KEY')
speech_to_text = SpeechToTextV1(
    authenticator=authenticator
)
voice = 'en-US_AllisonVoice'
speech_to_text.set_service_url('https://api.us-south.speech-to-text.watson.cloud.ibm.com/instances/7393db4a-82d8-40f8-a86d-09cb948589e2')
text_to_speech.set_service_url('https://api.us-south.text-to-speech.watson.cloud.ibm.com/instances/113cd664-f07b-44fe-a11d-a46cc50caf84')

r = sr.Recognizer()

def speak(text):
    # Synthesize the reply with Allison's voice and save it as a WAV file
    with open('response.wav', 'wb') as audio_file:
        audio_file.write(
            text_to_speech.synthesize(
                text, voice=voice, accept='audio/wav'
            ).get_result().content)

def takeCommand():
    # Record one utterance from the microphone and return the recognized text
    with sr.Microphone() as source:
        print("Listening...")
        r.pause_threshold = .5
        audio = r.listen(source)
    try:
        print("Recognizing...")
        query = r.recognize_google(audio, language='en-us')
        print(f"User said: {query}\n")
    except Exception as e:
        print(e)
        speak("I can't hear you sir.")
        print("I can't hear you sir.")
        return "None"
    return query

if __name__ == '__main__':
    # Clear the console before starting the loop
    clear = lambda: os.system('cls')
    clear()
    while True:
        # Every command the user says is stored in 'query' and
        # lower-cased so the keyword checks below match any casing
        query = takeCommand().lower()
        if 'wikipedia' in query:
            speak('Searching Wikipedia...')
            query = query.replace("wikipedia", "")
            results = wikipedia.summary(query, sentences=3)
            speak("According to Wikipedia")
            print(results)
            speak(results)
        elif "who made you" in query or "who created you" in query:
            speak("I have been created by you sir.")
        elif 'tell me a joke' in query or "make me laugh" in query:
            speak(pyjokes.get_joke())
I'm trying to build a simple question-and-answer system here. Some commands are just asked and answered; others, like the Wikipedia one, pop up a website. I'm very new to this, so any help is greatly appreciated. I would put a bounty on this to help people out, but I've only just started using StackOverflow.
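One gap in the attempt above: it creates a Watson `SpeechToTextV1` client but still recognizes speech with Google. A minimal sketch of swapping Watson in for the recognition step could look like the following — the API key and service URL are the placeholders from the code above, and `parse_transcript` assumes the standard JSON shape returned by the `recognize` method (`results` → `alternatives` → `transcript`).

```python
def parse_transcript(result):
    """Pull the best transcript out of the JSON returned by recognize()."""
    chunks = [r['alternatives'][0]['transcript'] for r in result.get('results', [])]
    return ' '.join(chunks).strip()

def recognize_with_watson(wav_path):
    """Send a recorded WAV file to Watson Speech to Text, return the transcript."""
    from ibm_watson import SpeechToTextV1
    from ibm_cloud_sdk_core.authenticators import IAMAuthenticator
    stt = SpeechToTextV1(authenticator=IAMAuthenticator('API KEY'))
    stt.set_service_url('https://api.us-south.speech-to-text.watson.cloud.ibm.com/'
                        'instances/7393db4a-82d8-40f8-a86d-09cb948589e2')
    with open(wav_path, 'rb') as audio_file:
        result = stt.recognize(audio=audio_file,
                               content_type='audio/wav').get_result()
    return parse_transcript(result)

# Inside takeCommand(), instead of r.recognize_google(audio, ...):
#     with open('command.wav', 'wb') as f:
#         f.write(audio.get_wav_data())   # `audio` comes from r.listen(source)
#     query = recognize_with_watson('command.wav')
```

The `speech_recognition` library's `AudioData.get_wav_data()` gives you WAV bytes you can hand straight to Watson, so the microphone capture code does not need to change.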
I suggest you take a look at the project TJBot. It is a cardboard robot based on a Raspberry Pi, with a microphone and a speaker. Using so-called recipes, you can interact with the microphone and speaker to build a voice-enabled robot. Many of them use Watson STT, WA, and TTS. An older example I used a couple of years ago is "Tell the time". Another recipe, "Conversation", uses the same services.