TorToiSe TTS: Wav2Vec2 model trains on every run


I am trying to use the tortoise tts lib to synthesize human speech. Here is my code:

import os
import torchaudio

from tortoise.api import TextToSpeech
from tortoise.utils.audio import load_voice


tts = TextToSpeech(use_deepspeed=True, kv_cache=True, autoregressive_batch_size=2)

voice = 'freeman'
voice_samples, conditioning_latents = load_voice(voice)
gen = tts.tts_with_preset(
    '''
    Hello, boys and girls, my name is Fatlip
    And this is my friend, Sammy the Salmon
    Today, we're gonna teach you some fun facts about salmon
    And a brand new dance
    ''',
    preset='fast',
    voice_samples=voice_samples,
    conditioning_latents=conditioning_latents,
)

# Write the generated audio as a 24 kHz WAV file (the results/ directory must already exist).
torchaudio.save(os.path.join('results/', 'synthesized_test.wav'), gen.squeeze(0).cpu(), 24000)

I get this warning on every run:

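Some weights of the model checkpoint at jbetker/wav2vec2-large-robust-ft-libritts-voxpopuli were not used when initializing Wav2Vec2ForCTC: ['wav2vec2.encoder.pos_conv_embed.conv.weight_g', 'wav2vec2.encoder.pos_conv_embed.conv.weight_v']

  • This IS expected if you are initializing Wav2Vec2ForCTC from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
  • This IS NOT expected if you are initializing Wav2Vec2ForCTC from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).

Some weights of Wav2Vec2ForCTC were not initialized from the model checkpoint at jbetker/wav2vec2-large-robust-ft-libritts-voxpopuli and are newly initialized: ['wav2vec2.encoder.pos_conv_embed.conv.parametrizations.weight.original1', 'wav2vec2.encoder.pos_conv_embed.conv.parametrizations.weight.original0']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.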
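The two lists in the warning name the same layer under two naming schemes: `weight_g` / `weight_v` are the parameter names created by the legacy `torch.nn.utils.weight_norm`, while `parametrizations.weight.original0` / `original1` are the names created by `torch.nn.utils.parametrizations.weight_norm`. The sketch below only illustrates how the names correspond. It assumes the mismatch comes from the installed torch build using the newer naming while the checkpoint was saved with the older one, which the question does not confirm. `rename_weight_norm_keys` is a hypothetical helper, not part of tortoise or Transformers, and the dictionary values are placeholder strings rather than tensors.

```python
# Legacy weight-norm parameter names mapped to the parametrized names.
LEGACY_TO_PARAMETRIZED = {
    "weight_g": "parametrizations.weight.original0",  # magnitude
    "weight_v": "parametrizations.weight.original1",  # direction
}


def rename_weight_norm_keys(state_dict):
    """Return a copy of state_dict with legacy weight-norm keys renamed.

    Hypothetical helper for illustration; keys that do not end in
    weight_g or weight_v are copied unchanged.
    """
    renamed = {}
    for key, value in state_dict.items():
        prefix, _, leaf = key.rpartition(".")
        new_leaf = LEGACY_TO_PARAMETRIZED.get(leaf)
        if new_leaf is None:
            renamed[key] = value
        else:
            new_key = f"{prefix}.{new_leaf}" if prefix else new_leaf
            renamed[new_key] = value
    return renamed


# Placeholder values stand in for the real tensors.
old_keys = {
    "wav2vec2.encoder.pos_conv_embed.conv.weight_g": "g",
    "wav2vec2.encoder.pos_conv_embed.conv.weight_v": "v",
    "wav2vec2.encoder.pos_conv_embed.conv.bias": "b",
}
new_keys = rename_weight_norm_keys(old_keys)
for name in sorted(new_keys):
    print(name)
```

If this assumption holds, the weights the warning calls "newly initialized" are the ones the checkpoint supplied under the old names, so comparing the two lists is a quick way to tell a naming mismatch from genuinely missing weights.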

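This model (jbetker/wav2vec2-large-robust-ft-libritts-voxpopuli) starts training, which takes about half an hour. The model is used in the tortoise library, in tortoise.utils.wav2vec_alignment.py. How can I fix this? Can I train the model once and then reuse it? Sorry if my question is silly; I am a backend engineer.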
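On the "train once and reuse" point: the warning is printed while the checkpoint is loaded, and loading a checkpoint through Transformers does not by itself run a training loop. The last line of the warning is generic advice. Downloaded checkpoints are normally kept in a local cache and reused on later runs. The sketch below checks whether that cache folder exists. It assumes the default Hugging Face hub cache location and its `models--<org>--<name>` folder naming, and `hub_cache_folder` is a hypothetical helper, not a library function.

```python
from pathlib import Path


def hub_cache_folder(repo_id: str, cache_root: Path) -> Path:
    """Build the cache folder path for a model repo (hypothetical helper).

    Assumes the hub cache layout: models--<org>--<name> under cache_root.
    """
    return cache_root / ("models--" + repo_id.replace("/", "--"))


# Assumption: default cache root; environment variables can relocate it.
default_root = Path.home() / ".cache" / "huggingface" / "hub"
folder = hub_cache_folder(
    "jbetker/wav2vec2-large-robust-ft-libritts-voxpopuli", default_root
)
print(folder, "exists" if folder.is_dir() else "not found")
```

If the folder is present on the second run, the half hour is being spent somewhere other than downloading, for example in inference on the CPU.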

Here are my specs: MacBook Air M1, 16 GB RAM

Requirements:

absl-py==2.0.0
accelerate==0.24.0
aiofiles==23.2.1
aiohttp==3.8.6
aiosignal==1.3.1
annotated-types==0.6.0
anyascii==0.3.2
appdirs==1.4.4
async-timeout==4.0.3
attrs==23.1.0
audioread==3.0.1
Babel==2.13.1
bangla==0.0.2
blinker==1.6.3
bnnumerizer==0.0.2
bnunicodenormalizer==0.1.6
cachetools==5.3.2
certifi==2023.7.22
cffi==1.16.0
chardet==3.0.4
charset-normalizer==3.3.1
clean-fid==0.1.35
click==8.1.7
clip-anytorch==2.5.2
coloredlogs==15.0.1
comtypes==1.2.0
contourpy==1.1.1
coqpit==0.0.17
cycler==0.12.1
Cython==0.29.30
dataclasses-json==0.6.1
dateparser==1.1.8
decorator==4.4.2
docker-pycreds==0.4.0
docopt==0.6.2
dtw-python==1.3.0
editdistance==0.6.2
einops==0.6.1
encodec==0.1.1
epitran==1.17
espeak-phonemizer==1.3.1
filelock==3.13.0
Flask==2.3.3
flatbuffers==23.5.26
fonttools==4.43.1
frozenlist==1.4.0
fsspec==2023.6.0
ftfy==6.1.1
g2pkk==0.1.2
gitdb==4.0.11
GitPython==3.1.40
google-auth==2.23.3
google-auth-oauthlib==1.1.0
googletrans==4.0.0rc1
grpcio==1.59.0
gruut==2.2.3
gruut-ipa==0.13.0
gruut-lang-de==2.0.0
gruut-lang-en==2.0.0
gruut-lang-es==2.0.0
gruut-lang-fr==2.0.2
gTTS==2.4.0
h11==0.9.0
h2==3.2.0
hpack==3.0.0
hstspreload==2023.1.1
httpcore==0.9.1
httpx==0.13.3
huggingface-hub==0.18.0
humanfriendly==10.0
hypercorn==0.14.4
hyperframe==5.2.0
idna==2.10
imageio==2.31.6
imageio-ffmpeg==0.4.9
inflect==5.6.2
itsdangerous==2.1.2
jamo==0.4.1
jieba==0.42.1
Jinja2==3.1.2
joblib==1.3.2
jsonlines==1.2.0
jsonmerge==1.9.2
jsonschema==4.19.1
jsonschema-specifications==2023.7.1
k-diffusion==0.0.16
kiwisolver==1.4.5
kornia==0.7.0
lazy_loader==0.3
librosa==0.10.0
llvmlite==0.40.1
marisa-trie==1.1.0
Markdown==3.5
MarkupSafe==2.1.3
marshmallow==3.20.1
matplotlib==3.7.3
more-itertools==10.1.0
moviepy==1.0.3
mpmath==1.3.0
msgpack==1.0.7
multidict==6.0.4
munkres==1.1.4
mycroft-mimic3-tts==0.2.4
mypy-extensions==1.0.0
networkx==2.8.8
nltk==3.8.1
num2words==0.5.13
numba==0.57.0
numpy==1.22.0
oauthlib==3.2.2
onnxruntime==1.16.1
openai-whisper==20230918
packaging==23.1
pandas==1.5.3
panphon==0.20.0
pathtools==0.1.2
phonemes2ids==1.2.2
Pillow==10.0.1
platformdirs==3.11.0
pooch==1.8.0
priority==2.0.0
proglog==0.1.10
progressbar==2.5
protobuf==4.23.4
psutil==5.9.6
pyasn1==0.5.0
pyasn1-modules==0.3.0
pycparser==2.21
pydantic==2.4.2
pydantic_core==2.10.1
pynndescent==0.5.10
pyobjc==10.0
pyobjc-core==10.0
pyobjc-framework-Accessibility==10.0
pyobjc-framework-Accounts==10.0
pyobjc-framework-AddressBook==10.0
pyobjc-framework-AdServices==10.0
pyobjc-framework-AdSupport==10.0
pyobjc-framework-AppleScriptKit==10.0
pyobjc-framework-AppleScriptObjC==10.0
pyobjc-framework-ApplicationServices==10.0
pyobjc-framework-AppTrackingTransparency==10.0
pyobjc-framework-AudioVideoBridging==10.0
pyobjc-framework-AuthenticationServices==10.0
pyobjc-framework-AutomaticAssessmentConfiguration==10.0
pyobjc-framework-Automator==10.0
pyobjc-framework-AVFoundation==10.0
pyobjc-framework-AVKit==10.0
pyobjc-framework-AVRouting==10.0
pyobjc-framework-BackgroundAssets==10.0
pyobjc-framework-BusinessChat==10.0
pyobjc-framework-CalendarStore==10.0
pyobjc-framework-CallKit==10.0
pyobjc-framework-CFNetwork==10.0
pyobjc-framework-ClassKit==10.0
pyobjc-framework-CloudKit==10.0
pyobjc-framework-Cocoa==10.0
pyobjc-framework-Collaboration==10.0
pyobjc-framework-ColorSync==10.0
pyobjc-framework-Contacts==10.0
pyobjc-framework-ContactsUI==10.0
pyobjc-framework-CoreAudio==10.0
pyobjc-framework-CoreAudioKit==10.0
pyobjc-framework-CoreBluetooth==10.0
pyobjc-framework-CoreData==10.0
pyobjc-framework-CoreHaptics==10.0
pyobjc-framework-CoreLocation==10.0
pyobjc-framework-CoreMedia==10.0
pyobjc-framework-CoreMediaIO==10.0
pyobjc-framework-CoreMIDI==10.0
pyobjc-framework-CoreML==10.0
pyobjc-framework-CoreMotion==10.0
pyobjc-framework-CoreServices==10.0
pyobjc-framework-CoreSpotlight==10.0
pyobjc-framework-CoreText==10.0
pyobjc-framework-CoreWLAN==10.0
pyobjc-framework-CryptoTokenKit==10.0
pyobjc-framework-DataDetection==10.0
pyobjc-framework-DeviceCheck==10.0
pyobjc-framework-DictionaryServices==10.0
pyobjc-framework-DiscRecording==10.0
pyobjc-framework-DiscRecordingUI==10.0
pyobjc-framework-DiskArbitration==10.0
pyobjc-framework-DVDPlayback==10.0
pyobjc-framework-EventKit==10.0
pyobjc-framework-ExceptionHandling==10.0
pyobjc-framework-ExecutionPolicy==10.0
pyobjc-framework-ExtensionKit==10.0
pyobjc-framework-ExternalAccessory==10.0
pyobjc-framework-FileProvider==10.0
pyobjc-framework-FileProviderUI==10.0
pyobjc-framework-FinderSync==10.0
pyobjc-framework-FSEvents==10.0
pyobjc-framework-GameCenter==10.0
pyobjc-framework-GameController==10.0
pyobjc-framework-GameKit==10.0
pyobjc-framework-GameplayKit==10.0
pyobjc-framework-HealthKit==10.0
pyobjc-framework-ImageCaptureCore==10.0
pyobjc-framework-InputMethodKit==10.0
pyobjc-framework-InstallerPlugins==10.0
pyobjc-framework-InstantMessage==10.0
pyobjc-framework-Intents==10.0
pyobjc-framework-IntentsUI==10.0
pyobjc-framework-IOBluetooth==10.0
pyobjc-framework-IOBluetoothUI==10.0
pyobjc-framework-IOSurface==10.0
pyobjc-framework-iTunesLibrary==10.0
pyobjc-framework-KernelManagement==10.0
pyobjc-framework-LatentSemanticMapping==10.0
pyobjc-framework-LaunchServices==10.0
pyobjc-framework-libdispatch==10.0
pyobjc-framework-libxpc==10.0
pyobjc-framework-LinkPresentation==10.0
pyobjc-framework-LocalAuthentication==10.0
pyobjc-framework-LocalAuthenticationEmbeddedUI==10.0
pyobjc-framework-MailKit==10.0
pyobjc-framework-MapKit==10.0
pyobjc-framework-MediaAccessibility==10.0
pyobjc-framework-MediaLibrary==10.0
pyobjc-framework-MediaPlayer==10.0
pyobjc-framework-MediaToolbox==10.0
pyobjc-framework-Metal==10.0
pyobjc-framework-MetalFX==10.0
pyobjc-framework-MetalKit==10.0
pyobjc-framework-MetalPerformanceShaders==10.0
pyobjc-framework-MetalPerformanceShadersGraph==10.0
pyobjc-framework-MetricKit==10.0
pyobjc-framework-MLCompute==10.0
pyobjc-framework-ModelIO==10.0
pyobjc-framework-MultipeerConnectivity==10.0
pyobjc-framework-NaturalLanguage==10.0
pyobjc-framework-NetFS==10.0
pyobjc-framework-Network==10.0
pyobjc-framework-NetworkExtension==10.0
pyobjc-framework-NotificationCenter==10.0
pyobjc-framework-OpenDirectory==10.0
pyobjc-framework-OSAKit==10.0
pyobjc-framework-OSLog==10.0
pyobjc-framework-PassKit==10.0
pyobjc-framework-PencilKit==10.0
pyobjc-framework-PHASE==10.0
pyobjc-framework-Photos==10.0
pyobjc-framework-PhotosUI==10.0
pyobjc-framework-PreferencePanes==10.0
pyobjc-framework-PushKit==10.0
pyobjc-framework-Quartz==10.0
pyobjc-framework-QuickLookThumbnailing==10.0
pyobjc-framework-ReplayKit==10.0
pyobjc-framework-SafariServices==10.0
pyobjc-framework-SafetyKit==10.0
pyobjc-framework-SceneKit==10.0
pyobjc-framework-ScreenCaptureKit==10.0
pyobjc-framework-ScreenSaver==10.0
pyobjc-framework-ScreenTime==10.0
pyobjc-framework-ScriptingBridge==10.0
pyobjc-framework-SearchKit==10.0
pyobjc-framework-Security==10.0
pyobjc-framework-SecurityFoundation==10.0
pyobjc-framework-SecurityInterface==10.0
pyobjc-framework-ServiceManagement==10.0
pyobjc-framework-SharedWithYou==10.0
pyobjc-framework-SharedWithYouCore==10.0
pyobjc-framework-ShazamKit==10.0
pyobjc-framework-Social==10.0
pyobjc-framework-SoundAnalysis==10.0
pyobjc-framework-Speech==10.0
pyobjc-framework-SpriteKit==10.0
pyobjc-framework-StoreKit==10.0
pyobjc-framework-SyncServices==10.0
pyobjc-framework-SystemConfiguration==10.0
pyobjc-framework-SystemExtensions==10.0
pyobjc-framework-ThreadNetwork==10.0
pyobjc-framework-UniformTypeIdentifiers==10.0
pyobjc-framework-UserNotifications==10.0
pyobjc-framework-UserNotificationsUI==10.0
pyobjc-framework-VideoSubscriberAccount==10.0
pyobjc-framework-VideoToolbox==10.0
pyobjc-framework-Virtualization==10.0
pyobjc-framework-Vision==10.0
pyobjc-framework-WebKit==10.0
pyparsing==3.1.1
pypinyin==0.49.0
pysbd==0.3.4
python-crfsuite==0.9.9
python-dateutil==2.8.2
pyttsx3==2.90
pyttsx4==3.0.15
pytz==2023.3.post1
PyYAML==6.0.1
Quart==0.19.3
quart-cors==0.7.0
referencing==0.30.2
regex==2023.10.3
requests==2.31.0
requests-oauthlib==1.3.1
resize-right==0.0.2
rfc3986==1.5.0
rotary-embedding-torch==0.3.5
rpds-py==0.10.6
rsa==4.9
safetensors==0.4.0
scikit-image==0.22.0
scikit-learn==1.3.0
scipy==1.11.3
sentry-sdk==1.32.0
setproctitle==1.3.3
six==1.16.0
smmap==5.0.1
sniffio==1.3.0
soundfile==0.12.1
soxr==0.3.7
swagger-ui-py==21.12.8
sympy==1.12
tensorboard==2.15.0
tensorboard-data-server==0.7.2
threadpoolctl==3.2.0
tifffile==2023.9.26
tiktoken==0.3.3
tokenizers==0.13.3
tomli==2.0.1
torch==2.2.0.dev20231029
torchaudio==2.2.0.dev20231029
torchdiffeq==0.2.3
torchsde==0.2.6
torchvision==0.17.0.dev20231029
tortoise-tts==3.0.0
tqdm==4.64.1
trainer==0.0.31
trampoline==0.1.2
transformers==4.33.3
TTS==0.19.0
typing-inspect==0.9.0
typing_extensions==4.8.0
tzlocal==5.2
umap-learn==0.5.4
unicodecsv==0.14.1
Unidecode==1.3.7
urllib3==2.0.7
wandb==0.15.12
wcwidth==0.2.8
Werkzeug==3.0.1
whisper-timestamped==1.12.20
wsproto==1.2.0
xdgenvpy==2.3.5
yarl==1.9.2

I tried to look for a solution to this problem, but it seems I am the only one who has run into it.

python pytorch text-to-speech huggingface
1 Answer

''' This is a multi-line comment in Python. So, did you just cancel out that argument? What do you think? '''

Try setting it to gen = tts.tts_with_preset('Just this message is said.' ...) instead. HTH
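For reference, a triple-quoted literal in argument position is an ordinary `str`, so the text is still passed to the function, but with the line breaks and leading indentation of the source file included. The sketch below shows one way to turn such a literal into the single-line form suggested above, using only the standard library. `normalize_prompt` is a hypothetical helper, and whether tortoise treats the two forms differently is not established here.

```python
import textwrap


def normalize_prompt(raw: str) -> str:
    """Collapse an indented multi-line literal into one line of text."""
    lines = textwrap.dedent(raw).strip().splitlines()
    return " ".join(line.strip() for line in lines if line.strip())


raw_prompt = '''
    Hello, boys and girls, my name is Fatlip
    And this is my friend, Sammy the Salmon
    '''
prompt = normalize_prompt(raw_prompt)

print(repr(raw_prompt))  # shows the embedded newlines and indentation
print(prompt)
```

The normalized `prompt` can then be passed as the first argument of `tts.tts_with_preset(...)` in place of the raw literal.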
