edX speech recognition systems course


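I am currently enrolled in the Speech Recognition Systems course on edX. So far I have completed two modules and am stuck on the third. I have extracted features from the given audio signals, and I am trying to train a DNN on those features with CNTK. This is the error I get: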

Selected GPU[0] GeForce 940MX as the process wide default device.
Training 1321080 parameters in 10 parameter tensors.

-------------------------------------------------------------------
Build info:

                Built time: Nov 22 2017 22:00:37
                Last modified date: Mon Oct 16 23:15:20 2017
                Build type: Release
                Build target: GPU
                With 1bit-SGD: no
                With ASGD: yes
                Math lib: mkl
                CUDA version: 9.0.10
                CUDNN version: 6.0.21
                Build Branch: HEAD
                Build SHA1: feade5b1bc1fbe6a8fc13b214f66aa89dac0db53
                MPI distribution: Microsoft MPI
                MPI version: 7.0.12437.6
-------------------------------------------------------------------
Redirecting log to file ../Experiments\am\DNN\log
Reading script file ../Experiments\lists\feat_train.rscp ... 1832 entries
HTKDeserializer: selected '1832' utterances grouped into '1' chunks, average chunk size: 1832.0 utterances, 73280.0 frames (for I/O: 1832.0 utterances, 73280.0 frames)
HTKDeserializer: determined feature kind as '584'-dimensional 'USER' with frame shift 10.0 ms
Total (121) state names in state list '../Experiments\am\labels.ciphones'
MLFDeserializer: '2559' utterances with '1809843' frames
Reading script file ../Experiments\lists\feat_dev.rscp ... 466 entries
HTKDeserializer: selected '466' utterances grouped into '1' chunks, average chunk size: 466.0 utterances, 18640.0 frames (for I/O: 466.0 utterances, 18640.0 frames)
HTKDeserializer: determined feature kind as '1466'-dimensional 'USER' with frame shift 10.0 ms
Total (121) state names in state list '../Experiments\am\labels.ciphones'
MLFDeserializer: '2559' utterances with '1809843' frames
attempt: setkind: inconsistent feature kind for file '../Experiments\lists\..\feat\1272-128104-0001.feat', retrying 2-th time out of 5...
attempt: setkind: inconsistent feature kind for file '../Experiments\lists\..\feat\1272-128104-0001.feat', retrying 3-th time out of 5...
attempt: setkind: inconsistent feature kind for file '../Experiments\lists\..\feat\1272-128104-0001.feat', retrying 4-th time out of 5...
attempt: setkind: inconsistent feature kind for file '../Experiments\lists\..\feat\1272-128104-0001.feat', retrying 5-th time out of 5...
Traceback (most recent call last):
    File "M3_Train_AM.py", line 283, in <module>
    main()
    File "M3_Train_AM.py", line 277, in main
    model_type=model_type
    File "M3_Train_AM.py", line 197, in train_network
    cv_source=cv_source
    File "M3_Train_AM.py", line 154, in train_and_test
    cv_config=cv_checkpoint_config
    File "C:\Users\Shivani\Anaconda3\lib\site-packages\cntk\internal\swig_helper.py", line 69, in wrapper
    result = f(*args, **kwds)
    File "C:\Users\Shivani\Anaconda3\lib\site-packages\cntk\train\training_session.py", line 333, in train
    super(TrainingSession, self).train(device)
    File "C:\Users\Shivani\Anaconda3\lib\site-packages\cntk\cntk_py.py", line 3361, in train
    return _cntk_py.TrainingSession_train(self, computeDevice)
    RuntimeError: setkind: inconsistent feature kind for file '../Experiments\lists\..\feat\1272-128104-0001.feat'

[CALL STACK]
    > CreateDeserializer
    - CreateDeserializer (x6)
    - CreateCompositeDataReader (x8)
    - Microsoft::MSR::CNTK::Matrix<float>::  __autoclassinit2

However, when I try to train a BLSTM on the same features, I get a different error:

Selected GPU[0] GeForce 940MX as the process wide default device.
Training 8691832 parameters in 3 parameter tensors.

-------------------------------------------------------------------
Build info:

                Built time: Nov 22 2017 22:00:37
                Last modified date: Mon Oct 16 23:15:20 2017
                Build type: Release
                Build target: GPU
                With 1bit-SGD: no
                With ASGD: yes
                Math lib: mkl
                CUDA version: 9.0.10
                CUDNN version: 6.0.21
                Build Branch: HEAD
                Build SHA1: feade5b1bc1fbe6a8fc13b214f66aa89dac0db53
                MPI distribution: Microsoft MPI
                MPI version: 7.0.12437.6
-------------------------------------------------------------------
Redirecting log to file ../Experiments\am\BLSTM\log
Reading script file ../Experiments\lists\feat_train.rscp ... 1832 entries
HTKDeserializer: selected '1832' utterances grouped into '1' chunks, average chunk size: 1832.0 utterances, 73280.0 frames (for I/O: 1832.0 utterances, 73280.0 frames)
HTKDeserializer: determined feature kind as '584'-dimensional 'USER' with frame shift 10.0 ms
Traceback (most recent call last):
    File "M3_Train_AM.py", line 283, in <module>
    main()
    File "M3_Train_AM.py", line 277, in main
    model_type=model_type
    File "M3_Train_AM.py", line 177, in train_network
    frame_mode=frame_mode)
    File "M3_Train_AM.py", line 53, in create_mb_source
    return C.io.MinibatchSource([fd, ld], frame_mode=frame_mode, max_sweeps=max_sweeps)
    File "C:\Users\Shivani\Anaconda3\lib\site-packages\cntk\io\__init__.py", line 226, in __init__
    source = cntk_py.create_composite_minibatch_source(config)
    RuntimeError: HTKDeserializer: model vector size is not multiple of input features

[CALL STACK]
    > CreateDeserializer
    - CreateDeserializer (x3)
    - CreateCompositeDataReader (x3)
    - CNTK::  UniversalLearner
    - CNTK::  CreateCompositeMinibatchSource
    - PyInit__cntk_py
    - PyCFunction_FastCallDict
    - PyObject_CallFunctionObjArgs
    - PyEval_EvalFrameDefault
    - Py_CheckFunctionResult
    - PyFunction_FastCallDict
    - PyObject_IsInstance
speech-recognition cntk edx
1 Answer

I took this course as well, and Jim Lewis proposed a solution there. See the link below.

https://trainingsupport.microsoft.com/en-us/tcmpd/forum/all/problem-running-blstm-problem-in-dev287x-module-03/126e3f7b-4150-44f0-bf46-9e871a198ed4?auth=1

See whether it works for you!
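Independently of the linked thread, the logs in the question hint at something that may be worth checking. HTKDeserializer reports 584-dimensional features for the training list and 1466-dimensional features for the dev list, yet both lists average exactly 40 frames per utterance (73280 / 1832 and 18640 / 466). One possible reading, which is an assumption and not confirmed anywhere in the question, is that the feature matrices were written transposed, so the frame count ended up as the dimension. The sketch below reads the 12-byte header of each feature file so the stored frame count and dimension can be inspected directly. The helper names read_htk_header and summarize_dims are made up for illustration, and the code assumes uncompressed big-endian float32 HTK files.

```python
import struct
from collections import Counter

# Standard 12-byte HTK header, big-endian:
# nSamples (int32), sampPeriod (int32, 100 ns units),
# sampSize (int16, bytes per frame), parmKind (int16).
HTK_HEADER = struct.Struct(">iihh")


def read_htk_header(path):
    """Return (n_frames, frame_shift_ms, dim, base_kind) for one HTK file.

    Assumption: uncompressed float32 features, so dim = sampSize / 4.
    """
    with open(path, "rb") as f:
        raw = f.read(HTK_HEADER.size)
    if len(raw) != HTK_HEADER.size:
        raise ValueError("%s: file too short for an HTK header" % path)
    n_frames, samp_period, samp_size, parm_kind = HTK_HEADER.unpack(raw)
    frame_shift_ms = samp_period / 10000.0  # 100 ns units -> milliseconds
    dim = samp_size // 4                    # 4 bytes per float32 value
    base_kind = parm_kind & 0x3F            # low 6 bits; 9 means USER
    return n_frames, frame_shift_ms, dim, base_kind


def summarize_dims(paths):
    """Count how many files have each feature dimension."""
    return Counter(read_htk_header(p)[2] for p in paths)
```

If summarize_dims over the files in feat_train.rscp and feat_dev.rscp returns more than one dimension, or every file reports the same frame count while the dimension varies, the features were probably saved as (dim, frames) rather than (frames, dim), and transposing the matrix before writing it in the feature extraction step would be the thing to try.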
