
speech_recognition's People

Contributors

xxbb1234021


speech_recognition's Issues

Data questions

Hello, OP. I'm a beginner with a few questions.
Does the test.word.txt file contain all the Chinese transcripts for the wav files? Does the order have to match? Also, when testing, should wave_files be file names, or something else? Passing file names does not seem to work for me.

Speech preprocessing question

Hello, I'd like to ask: training needs inputs x and y, where x is produced by preprocessing an utterance. How should y be processed? For a digit-classification problem you just turn y into the labels 0-9, but what should a whole sentence become? A sentence contains many Chinese characters, so how do they map to labels? I hope you can clear this up for me.
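
For context, a minimal sketch (not the repo's code) of the usual approach when a model is trained with a CTC-style loss over characters, as this project appears to be: build a character vocabulary over all transcripts, then turn each transcript into a variable-length sequence of integer ids; that id sequence is the label y for the utterance.

    # Illustrative helper names; the repo's own utils.create_dict plays a similar role.
    def build_vocab(transcripts):
        chars = sorted(set("".join(transcripts)))
        word_num_map = {c: i for i, c in enumerate(chars)}
        return chars, word_num_map

    def text_to_ids(text, word_num_map):
        # Characters outside the vocabulary could be skipped or mapped to a reserved id.
        return [word_num_map[c] for c in text if c in word_num_map]

    transcripts = ["山东省 烟台", "北京 丰台区 农民"]   # sample transcripts from these issues
    words, word_num_map = build_vocab(transcripts)
    y = [text_to_ids(t, word_num_map) for t in transcripts]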

test problem

I trained on the data under train for more than two days, and then wanted to evaluate on the data under test, but simply changing config.ini is not enough to make the test run.

about the data path

Hi, I don't quite understand the paths in your conf folder. What exactly is label_file? And is wave_path something you extracted from the Tsinghua (THCHS-30) dataset? Many thanks.

Running build_target_wav_file_test on CPU raises InvalidArgumentError

I modified the two places in the code that use the GPU:


# with tf.device('/gpu:0'):
with tf.device('/cpu:0'):

# gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.7)
# self.sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options))
self.sess = tf.Session()

After the changes, it fails with:
InvalidArgumentError (see above for traceback): Assign requires shapes of both tensors to match. lhs shape= [36] rhs shape= [1788]

The results are completely fake, obtained by cheating during training. Don't trust them.

Nobody should believe these results. The so-called "accuracy" is obtained by cheating. Test it for yourselves: with any audio you record yourself, even if you read the label text aloud, you will find the predictions are garbage, no matter what device you record with. I really don't see the point of producing a result this way. If you come across this speech recognition project, just walk away. It is useless, don't waste your time.

Problems with prediction results using the model provided by the author

读入语音文件: /opt/wav/test/D13/D13_992.wav
开始识别语音数据......
语音原始文本: 山东省 烟台 奥尔 呼 斯 药业 有限公司 近日 研制 成功 外用 降血压 新药 利 压 平 霜
识别出来的文本: 局内但内但阿内碗内碗但内碗但碗内碗局碗章来琼章罔汁碗章局罔碗汁碗章内汁局汁陈迷内碗扬但陈肥碗肥碗来内碗电罔汁来肥来据来罔碗汁章汁碗汁扭汁罔汁碗来碗来汁语汁语碗语碗罔电局琼电琼电章琼来碗汁碗内碗内电无碗章碗汁碗内碗内汁碗来内来陈汁陈内电阿语碗汁碗来碗来罔来罔来陈来陈罔电碗电碗电章碗来碗局碗局引罔来汁来碗来局支章汁碗汁电来碗殖电汁琼很章祖汁来内来罔电罔来罔来锦来肥电碗著碗章碗汁碗汁单来碗来电汁语汁陈碗陈来很碗肥汁碗罔电罔电来电汁西支音
读入语音文件: /opt/wav/test/D13/D13_823.wav
开始识别语音数据......
语音原始文本: 五月 的 一天 下大雨 阳 台上 漏 进 许多 雨水 可 又 没有 排 水洞 只好 一盆盆 往 外 端
识别出来的文本: 电内电内碗罔碗汁碗章来章碗局碗汁局汁碗来碗章碗章碗章碗章碗琼来碗章碗扬碗电碗电罔引碗局碗局来很来碗来碗电扬电扬碗罔碗肥碗内碗章碗局碗局碗章汁碗章碗罔琼汁来汁锦汁碗来局汁碗汁碗汁语碗很碗汁碗汁碗汁碗汁来汁碗汁语扬碗罔碗支碗汁碗汁碗局碗肥碗局碗汁碗来碗来碗罔碗扬碗肥引碗汁碗章碗罔碗罔来罔来支电碗来碗来单内碗汁肥很汁肥汁章碗汁碗汁碗汁碗局碗罔碗局汁局罔局很局汁局引支章碗罔琼罔汁来

Newly added corpus causes a mismatch in the output-layer nodes

On top of your model I added new audio files and the corresponding label files, but this enlarges the vocabulary, so loading your checkpoint fails because the layer6 nodes no longer match the new vocabulary. If I want to do transfer learning on top of your model, how should I go about it?
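
One possible transfer-learning recipe, sketched under the assumption that only the final vocabulary-sized layer changes and that its variables are the h6/b6 ones that appear in the tracebacks elsewhere in these issues; this is not code from the repo:

    import tensorflow as tf

    def partial_restore(sess, ckpt_path, exclude_prefixes=('h6', 'b6')):
        # Initialize everything (including the resized output layer), then
        # overwrite all remaining variables with the checkpoint values.
        sess.run(tf.global_variables_initializer())
        keep = [v for v in tf.global_variables()
                if not any(v.op.name.startswith(p) for p in exclude_prefixes)]
        tf.train.Saver(var_list=keep).restore(sess, ckpt_path)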

tensorflow.python.framework.errors_impl.NotFoundError:

tensorflow.python.framework.errors_impl.NotFoundError: FindFirstFile failed for: /home/kevin/workspaces/python/speech-new-test : 系统找不到指定的路径。
; No such process

I get this error when testing with your model. How can I fix it?
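
The garbled suffix in the error is the Windows message "The system cannot find the path specified", so a quick first check (assuming the failing path is the one named in the error) is simply whether that directory exists on the machine running the test:

    import os
    print(os.path.isdir("/home/kevin/workspaces/python/speech-new-test"))  # False would explain the FindFirstFile failure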

Do you have a trained model checkpoint?

Training on CPU is too slow. Could you share your trained model?

The results all look like this:
循环次数: 69 损失: 220.8472399030413
错误率: 0.87767243
语音原始文本: 风平浪静 的 漆黑 午夜 王云 州 王新 槐 王 羊子 首先 发现 怪物 闪 着 红光 的 眼睛
识别出来的文本: 这 人龚龚

Why is the error rate 0.8 during training?

This model goes directly from audio to characters, and the training error rate reaches 0.8. The OP mentioned an improved version of the code, but that one goes from audio to pinyin and then to characters, which is different.
My data is just the test set, used for both training and testing. Why is the error rate so high?

UnicodeDecodeError

config.read(os.path.join(current_dir, "conf", "%s.ini" % file_name))
File "C:\Users\zxchong\AppData\Local\Programs\Python\Python36\lib\configparser.py", line 697, in read
self._read(fp, filename)
File "C:\Users\zxchong\AppData\Local\Programs\Python\Python36\lib\configparser.py", line 1015, in _read
for lineno, line in enumerate(fp, start=1):
UnicodeDecodeError: 'gbk' codec can't decode byte 0xae in position 56: illegal multibyte sequence
What is causing this?
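
A likely fix, assuming the .ini file is saved as UTF-8: configparser otherwise falls back to the locale encoding (GBK on Chinese Windows), so pass the encoding explicitly.

    config.read(os.path.join(current_dir, "conf", "%s.ini" % file_name), encoding="utf-8")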

test problem

I have trained a model (50 epochs, 11,000 training wave files each with a transcript, batch size 16, learning rate 0.001).
During training:
train_err: 0.028019862
trans_array_to_text_ch len: 72
words [' ', '4', ':', 'A', 'B', 'C', 'D', 'E', 'G', 'H', 'I', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'X', 'Y', 'À', 'Á', 'Â', 'Ã', 'È', 'É', 'Ê', 'Ì', 'Í', 'Ò', 'Ó', 'Ô', 'Õ', 'Ù', 'Ú', 'Ý', 'Ă', 'Đ', 'Ĩ', 'Ũ', 'Ơ', 'Ư', 'Ạ', 'Ả', 'Ấ', 'Ầ', 'Ẩ', 'Ẫ', 'Ậ', 'Ắ', 'Ằ', 'Ẳ', 'Ẵ', 'Ặ', 'Ẹ', 'Ẻ', 'Ẽ', 'Ế', 'Ề', 'Ể', 'Ễ', 'Ệ', 'Ỉ', 'Ị', 'Ọ', 'Ỏ', 'Ố', 'Ồ', 'Ổ', 'Ỗ', 'Ộ', 'Ớ', 'Ờ', 'Ở', 'Ỡ', 'Ợ', 'Ụ', 'Ủ', 'Ứ', 'Ừ', 'Ử', 'Ữ', 'Ự', 'Ỳ', 'Ỵ', 'Ỷ', 'Ỹ'] value [20 18 15 14 8 0 20 9 77 10 0 8 10 3 14 0 42 50 21 0 14 8 46 77
10 0 13 76 10 0 42 62 14 0 5 9 46 3 0 4 10 62 20 0 5 26 5 9
0 12 25 13 0 22 10 66 5 0 9 10 66 21 0 17 21 48 -1 -1 -1 -1 -1 -1] results TRONG THỜI GIAN ĐẦU NGƯỜI MỚI ĐẾN CHƯA BIẾT CÁCH LÀM VIỆC HIỆU QUẢỸỸỸỸỸỸ
orig: TRONG THỜI GIAN ĐẦU NGƯỜI MỚI ĐẾN CHƯA BIẾT CÁCH LÀM VIỆC HIỆU QUẢ
decoded_str: TRONG THỜI GIAN ĐẦU NGƯỜI MỚI ĐẾN CHƯA BIẾT CÁCH LÀM VIỆC HIỆU QUẢỸỸỸỸỸỸ

But after training is done, when I test the model, the text is not recognized:
wav_files[0]: data_test\VIVOSDEV01_R117.wav
Testing......
trans_array_to_text_ch len: 65
words [' ', 'A', 'B', 'C', 'D', 'E', 'G', 'H', 'I', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'X', 'Y', 'À', 'Á', 'Â', 'Ã', 'È', 'É', 'Ê', 'Ì', 'Í', 'Ò', 'Ó', 'Ô', 'Õ', 'Ù', 'Ú', 'Ý', 'Ă', 'Đ', 'Ĩ', 'Ũ', 'Ơ', 'Ư', 'Ạ', 'Ả', 'Ấ', 'Ầ', 'Ẩ', 'Ẫ', 'Ậ', 'Ắ', 'Ằ', 'Ẳ', 'Ặ', 'Ẹ', 'Ẻ', 'Ẽ', 'Ế', 'Ề', 'Ể', 'Ễ', 'Ệ', 'Ỉ', 'Ị', 'Ọ', 'Ỏ', 'Ố', 'Ồ', 'Ổ', 'Ỗ', 'Ộ', 'Ớ', 'Ờ', 'Ở', 'Ỡ', 'Ợ', 'Ụ', 'Ủ', 'Ứ', 'Ừ', 'Ử', 'Ữ', 'Ự', 'Ỳ', 'Ỷ', 'Ỹ'] value [21 67 21 24 48 24 48 24 48 24 4 24 4 24 4 16 4 24 21 16 21 24 21 4
21 4 21 4 24 4 21 48 21 4 24 16 21 16 21 16 21 16 4 87 21 16 21 16
24 21 24 4 24 4 16 45 16 24 4 16 4 16 4 46 16] results XỎXÁẦÁẦÁẦÁDÁDÁDRDÁXRXÁXDXDXDÁDXẦXDÁRXRXRXRDỸXRXRÁXÁDÁDRẠRÁDRDRDẢR
orig: VIỆC MỘT QUYỂN TIỂU THUYẾT NHƯ THẾ
decoded_str: XỎXÁẦÁẦÁẦÁDÁDÁDRDÁXRXÁXDXDXDÁDXẦXDÁRXRXRXRDỸXRXRÁXÁDÁDRẠRÁDRDRDẢR
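
The two word lists above differ (length 72 during training vs 65 at test time), which suggests the vocabulary is rebuilt from whatever labels happen to be loaded. A hedged sketch of one way to keep them consistent is to persist the training vocabulary and reload it for testing instead of rebuilding it (the file name is illustrative; words is the list printed in the log):

    import json

    # At the end of training:
    with open("words.json", "w", encoding="utf-8") as f:
        json.dump(words, f, ensure_ascii=False)

    # At test time, instead of rebuilding the dictionary from the test labels:
    with open("words.json", encoding="utf-8") as f:
        words = json.load(f)
    word_num_map = {w: i for i, w in enumerate(words)}
    words_size = len(words)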

Error during training, please help

The error is as follows:
Original stack trace for 'b1/Initializer/random_normal/RandomStandardNormal':
File "C:/Users/Administrator/Desktop/speech_recognition-master/test.py", line 18, in
bi_rnn.build_test()
File "C:\Users\Administrator\Desktop\speech_recognition-master\model.py", line 334, in build_test
self.bi_rnn_layer()
File "C:\Users\Administrator\Desktop\speech_recognition-master\model.py", line 77, in bi_rnn_layer
b1 = self.variable_on_device('b1', [n_hidden_1], tf.random_normal_initializer(stddev=b_stddev))
File "C:\Users\Administrator\Desktop\speech_recognition-master\model.py", line 348, in variable_on_device
var = tf.get_variable(name=name, shape=shape, initializer=initializer)
File "D:\Anaconda3\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 1496, in get_variable
aggregation=aggregation)
File "D:\Anaconda3\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 1239, in get_variable
aggregation=aggregation)
File "D:\Anaconda3\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 562, in get_variable
aggregation=aggregation)
File "D:\Anaconda3\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 514, in _true_getter
aggregation=aggregation)
File "D:\Anaconda3\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 929, in _get_single_variable
aggregation=aggregation)
File "D:\Anaconda3\lib\site-packages\tensorflow\python\ops\variables.py", line 259, in call
return cls._variable_v1_call(*args, **kwargs)
File "D:\Anaconda3\lib\site-packages\tensorflow\python\ops\variables.py", line 220, in _variable_v1_call
shape=shape)
File "D:\Anaconda3\lib\site-packages\tensorflow\python\ops\variables.py", line 198, in
previous_getter = lambda **kwargs: default_variable_creator(None, **kwargs)
File "D:\Anaconda3\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 2511, in default_variable_creator
shape=shape)
File "D:\Anaconda3\lib\site-packages\tensorflow\python\ops\variables.py", line 263, in call
return super(VariableMetaclass, cls).call(*args, **kwargs)
File "D:\Anaconda3\lib\site-packages\tensorflow\python\ops\variables.py", line 1568, in init
shape=shape)
File "D:\Anaconda3\lib\site-packages\tensorflow\python\ops\variables.py", line 1698, in _init_from_args
initial_value(), name="initial_value", dtype=dtype)
File "D:\Anaconda3\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 901, in
partition_info=partition_info)
File "D:\Anaconda3\lib\site-packages\tensorflow\python\ops\init_ops.py", line 323, in call
shape, self.mean, self.stddev, dtype, seed=self.seed)
File "D:\Anaconda3\lib\site-packages\tensorflow\python\ops\random_ops.py", line 79, in random_normal
shape_tensor, dtype, seed=seed1, seed2=seed2)
File "D:\Anaconda3\lib\site-packages\tensorflow\python\ops\gen_random_ops.py", line 763, in random_standard_normal
seed2=seed2, name=name)
File "D:\Anaconda3\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 788, in _apply_op_helper
op_def=op_def)
File "D:\Anaconda3\lib\site-packages\tensorflow\python\util\deprecation.py", line 507, in new_func
return func(*args, **kwargs)
File "D:\Anaconda3\lib\site-packages\tensorflow\python\framework\ops.py", line 3616, in create_op
op_def=op_def)
File "D:\Anaconda3\lib\site-packages\tensorflow\python\framework\ops.py", line 2005, in init
self._traceback = tf_stack.extract_stack()

Process finished with exit code 1

What is the problem here?

My results are as follows. What is going on?

读入语音文件: E:\ASR\DeepSpeechRecognition-master\data\data_thchs30\test\D11_773.wav
开始识别语音数据......
语音原始文本: ** 婴幼儿 营养 专家 还 与 厦门市 同行 就 婴幼儿 的 生长 发育 进行 专题 探讨
识别出来的文本: 唢商唢诊礼输见输见前见前见输见输见诊输诊前姑锐含自慧自慧锐慧诊锐诊姑诊锐诊歪诊姑锐诊姑诊伸诊伸诊锐诊见 姑至自锐诊锐伸姑侧姑诊姑锐姑自姑自锐自锐究慧诊伸诊姑诊泥泉诊抛诊姑诊姑锐姑诊泥姑泥泡泥僻锐究锐靠究姑锐泡锐姑泡姑锐肿锐诊慧诊慧诊锐诊泡诊天诊姑锐姑锐姑锐姑僻输究靠究泥僻姑锐唢锐伸肿伸姑诊姑诊僻诊伸锐姑伸姑伸翻伸泥档伸姑伸姑诊姑泡姑伸桩究途玄泡锐泡姑诊姑诊姑伸右见前见诊姑归置

test.py reports an error

Hello, when I run your test.py the code throws an error and I don't know why. This is my test code:

wav_files = ['/home/zh/sda2/语音转文本/speech_recognition/data/test/D8/D8_999.wav']
txt_labels = ['国务委员 兼 国务院 秘书长 罗干 民政部 部长 多吉 才 让 也 一同 前往 延安 看望 人民群众']
words_size, words, word_num_map = utils.create_dict(txt_labels)
bi_rnn = BiRNN(wav_files, txt_labels, words_size, words, word_num_map)
bi_rnn.build_target_wav_file_test(wav_files, txt_labels)

Config file:

[FILE_DATA]
wav_path=/opt/wav/test/
label_file=/home/zh/sda2/语音转文本/speech_recognition/model/test.word.txt
savedir=/home/zh/sda2/语音转文本/speech_recognition/model
savefile=speech.cpkt
tensorboardfile=/home/zh/sda2/语音转文本/speech_recognition/model

Contents of the checkpoint file:

model_checkpoint_path: "/home/zh/sda2/语音转文本/speech_recognition/model/speech.cpkt-101"
all_model_checkpoint_paths: "/home/zh/sda2/语音转文本/speech_recognition/model/speech.cpkt-101"

But I get the following error:

Traceback (most recent call last):
  File "/atlas/home/hzhou/.local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1322, in _do_call
    return fn(*args)
  File "/atlas/home/hzhou/.local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1307, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "/atlas/home/hzhou/.local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1409, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Assign requires shapes of both tensors to match. lhs shape= [36] rhs shape= [1788]
         [[Node: save/Assign_14 = Assign[T=DT_FLOAT, _class=["loc:@b6"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](b6/Adam_1, save/RestoreV2/_29)]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "test.py", line 24, in <module>
    bi_rnn.build_target_wav_file_test(wav_files, txt_labels)
  File "/scratch2/hzhou/AI_voice/speech_recognition/model.py", line 344, in build_target_wav_file_test
    self.init_session()
  File "/scratch2/hzhou/AI_voice/speech_recognition/model.py", line 199, in init_session
    self.saver.restore(self.sess, ckpt)
  File "/atlas/home/hzhou/.local/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1752, in restore
    {self.saver_def.filename_tensor_name: save_path})
  File "/atlas/home/hzhou/.local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 900, in run
    run_metadata_ptr)
  File "/atlas/home/hzhou/.local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1135, in _run
    feed_dict_tensor, options, run_metadata)
  File "/atlas/home/hzhou/.local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1316, in _do_run
    run_metadata)
  File "/atlas/home/hzhou/.local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1335, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Assign requires shapes of both tensors to match. lhs shape= [36] rhs shape= [1788]
         [[Node: save/Assign_14 = Assign[T=DT_FLOAT, _class=["loc:@b6"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](b6/Adam_1, save/RestoreV2/_29)]]

Caused by op 'save/Assign_14', defined at:
  File "test.py", line 24, in <module>
    bi_rnn.build_target_wav_file_test(wav_files, txt_labels)
  File "/scratch2/hzhou/AI_voice/speech_recognition/model.py", line 344, in build_target_wav_file_test
    self.init_session()
  File "/scratch2/hzhou/AI_voice/speech_recognition/model.py", line 186, in init_session
    self.saver = tf.train.Saver(max_to_keep=1)  # 生成saver
  File "/atlas/home/hzhou/.local/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1284, in __init__
    self.build()
  File "/atlas/home/hzhou/.local/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1296, in build
    self._build(self._filename, build_save=True, build_restore=True)
  File "/atlas/home/hzhou/.local/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1333, in _build
    build_save=build_save, build_restore=build_restore)
  File "/atlas/home/hzhou/.local/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 781, in _build_internal
    restore_sequentially, reshape)
  File "/atlas/home/hzhou/.local/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 422, in _AddRestoreOps
    assign_ops.append(saveable.restore(saveable_tensors, shapes))
  File "/atlas/home/hzhou/.local/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 113, in restore
    self.op.get_shape().is_fully_defined())
  File "/atlas/home/hzhou/.local/lib/python3.6/site-packages/tensorflow/python/ops/state_ops.py", line 219, in assign
    validate_shape=validate_shape)
  File "/atlas/home/hzhou/.local/lib/python3.6/site-packages/tensorflow/python/ops/gen_state_ops.py", line 60, in assign
    use_locking=use_locking, name=name)
  File "/atlas/home/hzhou/.local/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/atlas/home/hzhou/.local/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3414, in create_op
    op_def=op_def)
  File "/atlas/home/hzhou/.local/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1740, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

InvalidArgumentError (see above for traceback): Assign requires shapes of both tensors to match. lhs shape= [36] rhs shape= [1788]
         [[Node: save/Assign_14 = Assign[T=DT_FLOAT, _class=["loc:@b6"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](b6/Adam_1, save/RestoreV2/_29)]]

Is the problem with the model you provided, or is it something else? Please give me some guidance. Thank you.
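
A hedged reading of the error: the checkpoint's output layer is 1788 units wide (the 1787-character training vocabulary plus, presumably, the CTC blank), while building the dictionary from the single txt_labels sentence yields only 36 symbols, so the restored tensors cannot fit the freshly built graph. One possible fix is to build the dictionary from the same label file used for training and only decode the single utterance; the sketch below modifies the test snippet above and assumes test.word.txt holds one transcript per line (adjust the parsing and encoding if each line also carries a file id):

    label_file = '/home/zh/sda2/语音转文本/speech_recognition/model/test.word.txt'
    with open(label_file, encoding='utf-8') as f:
        all_labels = [line.strip() for line in f if line.strip()]
    words_size, words, word_num_map = utils.create_dict(all_labels)
    bi_rnn = BiRNN(wav_files, txt_labels, words_size, words, word_num_map)
    bi_rnn.build_target_wav_file_test(wav_files, txt_labels)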

Which algorithm is the model based on?

Hello, which algorithm is your model based on? Is it DeepSpeech?

Unexpected behaviour during testing

I tested with the test set from THCHS-30, but the run did not go through all of the audio in test.
The whole test run is shown below:

C:\Users\zxchong\Desktop\speech_recognition-master> python test.py
C:\Users\zxchong\AppData\Local\Programs\Python\Python36\lib\site-packages\h5py\__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from float to np.floating is deprecated. In future, it will be treated as np.float64 == np.dtype(float).type.
from ._conv import register_converters as _register_converters
F:\xunlei\data\data_thchs30\test\D11_750.wav 东北军 的 一些 爱国 将士 马 占 山 李杜 唐 聚 伍 苏 炳 艾 邓 铁梅 等 也 奋起 抗战
wav: 2132 label 2132
字表大小: 1787
ckpt: C:\Users\zxchong\Desktop\speech_recognition-master\voice\model\speech.cpkt-101
101

读入语音文件: F:\xunlei\data\data_thchs30\test\D11_773.wav
开始识别语音数据......
语音原始文本: ** 婴幼儿 营养 专家 还 与 厦门市 同行 就 婴幼儿 的 生长 发育 进行 专题 探讨
识别出来的文本: ** 婴幼儿 营养 专家 还 厦门市 同行 就 婴幼儿 的 生长 发育 进行 专题 探讨
读入语音文件: F:\xunlei\data\data_thchs30\test\D11_774.wav
开始识别语音数据......
语音原始文本: 像 张 王村 八十年代 四百 户 拥有 耕牛 七百 多头 而今 五百 五十多 农户 仅 拥有 耕牛 十 一头
识别出来的文本: 像 张 王村 八十年代 四百 户 拥有 耕牛 七百 多头 今 五百 五十多 农户 仅 拥有 耕牛 十 一头
读入语音文件: F:\xunlei\data\data_thchs30\test\D11_775.wav
开始识别语音数据......
语音原始文本: 看 羊 狗 跑前跑后 一只 惊 飞 的 山雀 惹得 它 汪汪汪 咬 几 声 嗡 嗡嗡 的 在 山间 回荡
识别出来的文本: 看 羊 狗 跑前跑后 一只 惊 飞 的 山雀 惹得 汪汪汪 咬 几 声 省 嗡 嗡嗡 的 在 山间 回荡
读入语音文件: F:\xunlei\data\data_thchs30\test\D11_776.wav
开始识别语音数据......
语音原始文本: 承运人 有权 要求 托运 人 填写 航空 货运单 托运 人 有权 要求 承运人 接受 该 航空 货运单
识别出来的文本: 承运人 有权 要求 托运 人 填写 航空 货运单 托运 人 有权 要求 承运人 接受 该 航空 货运单
读入语音文件: F:\xunlei\data\data_thchs30\test\D11_779.wav
开始识别语音数据......
语音原始文本: 当然 他们 不 像 鳄鱼 那样 吞下 石块 当做 压 舱 石 而是 磨成 粉 以后 才 服用
识别出来的文本: 当 他 不 鳄鱼 鳄样 那 吞下 石块 做 压 石 到 多声 草 用
读入语音文件: F:\xunlei\data\data_thchs30\test\D11_780.wav
开始识别语音数据......
语音原始文本: 刀 光 枪 影 白骨 如 麻 我 仿佛 听到 五 十年前 三十万 冤魂 的 呐喊 痛 贯 心肝
识别出来的文本: 刀 光 枪 影 白骨 如 麻 我 仿佛 听到 五 十年前 三如万 冤魂 呐喊 痛 贯 心肝
读入语音文件: F:\xunlei\data\data_thchs30\test\D11_781.wav
开始识别语音数据......
语音原始文本: 不仅 要 宣传 少生 还要 宣传 晚婚晚育 优生优育 宣传 生 男生 女 都一样
识别出来的文本: 不仅 要 宣传 少生 还要 宣传 晚婚晚育 优生优育 宣传 生 男生 女 都一样
读入语音文件: F:\xunlei\data\data_thchs30\test\D11_782.wav
开始识别语音数据......
语音原始文本: 七 我国 培育 出 首 株 抗病毒 转基因 小麦 为 小麦 抗病 育种 奠定 了 坚实 基础
识别出来的文本: 七 我国 培育 出 首 株 抗病毒 转基因 小麦 为 为 小麦 抗病 育种 奠定 了 坚实 基础
读入语音文件: F:\xunlei\data\data_thchs30\test\D11_783.wav
开始识别语音数据......
语音原始文本: 天空 一些 云 忙 走 月亮 陷进 云 围 时 云和 烟 样 和 煤 山 样 快要 燃烧 似地
识别出来的文本: 天空 一些 云 忙 走 月亮 陷进 云 围 时 云和 烟 样 和 煤 山 样 快要 燃烧 似地
读入语音文件: F:\xunlei\data\data_thchs30\test\D11_785.wav
开始识别语音数据......
语音原始文本: 北京 丰台区 农民 自己 花钱 筹办 万 佛 延寿 寺 迎春 庙会 吸引 了 区内 六十 支 秧歌队 参赛
识别出来的文本: 北京 丰台区 农民 自己 花钱 筹办 万 佛 延寿 寺 迎春 庙会 吸引 了 区内 六十 支 秧歌队 参赛
PS C:\Users\zxchong\Desktop\speech_recognition-master>

Do you know what the problem is?
Please take a look. Thanks a lot.

Excellent code, thank you!

The precision and recall are amazing for a small corpus, though much of that is TensorFlow's own contribution.
The author delivered code with a clear architecture, concise and fully functional.
For me, this code is the best one compared to several other similar projects.
The author's code is very sincere. Thank you!
By comparison, there is some AS*-something project whose training code does not run at all, and the pretrained model it provides instead is not much use.

test problem

I've tried several ways to test my own audio file, including using the last four lines in test.py (actually it's five lines) and changing the file location in conf.ini, but none of them work for me. I think it's a code problem and I can't find the reason right now. You did a really good job building this Mandarin speech model! Please fix the test problem and make it better!

Can the model not run on CPU?

Hello, when testing, my environment forces me to run on CPU, so I changed the following code in model.py:

 def variable_on_device(self, name, shape, initializer):
        with tf.device('/cpu:0'):
            var = tf.get_variable(name=name, shape=shape, initializer=initializer)
        return var

Then when I run it I get the following error:

Traceback (most recent call last):
  File "tf_speech.py", line 27, in <module>
    re = X.speech_to_text('./test_voice/D4_750.wav')
  File "tf_speech.py", line 22, in speech_to_text
    res = bi_rnn.build_target_wav_file_test(wav_files, self.text_labels)
  File "/home/zh/sda2/sg-ai/Detox_AI/tf_recong/model.py", line 213, in build_target_wav_file_test
    self.init_session()
  File "/home/zh/sda2/sg-ai/Detox_AI/tf_recong/model.py", line 184, in init_session
    ckpt = tf.train.latest_checkpoint(self.savedir)
  File "/home/zh/sda3/Anaconda3/envs/deep2.0.0/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1805, in latest_checkpoint
    ckpt = get_checkpoint_state(checkpoint_dir, latest_filename)
  File "/home/zh/sda3/Anaconda3/envs/deep2.0.0/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1061, in get_checkpoint_state
    text_format.Merge(file_content, ckpt)
  File "/home/zh/sda3/Anaconda3/envs/deep2.0.0/lib/python3.6/site-packages/google/protobuf/text_format.py", line 536, in Merge
    descriptor_pool=descriptor_pool)
  File "/home/zh/sda3/Anaconda3/envs/deep2.0.0/lib/python3.6/site-packages/google/protobuf/text_format.py", line 590, in MergeLines
    return parser.MergeLines(lines, message)
  File "/home/zh/sda3/Anaconda3/envs/deep2.0.0/lib/python3.6/site-packages/google/protobuf/text_format.py", line 623, in MergeLines
    self._ParseOrMerge(lines, message)
  File "/home/zh/sda3/Anaconda3/envs/deep2.0.0/lib/python3.6/site-packages/google/protobuf/text_format.py", line 638, in _ParseOrMerge
    self._MergeField(tokenizer, message)
  File "/home/zh/sda3/Anaconda3/envs/deep2.0.0/lib/python3.6/site-packages/google/protobuf/text_format.py", line 763, in _MergeField
    merger(tokenizer, message, field)
  File "/home/zh/sda3/Anaconda3/envs/deep2.0.0/lib/python3.6/site-packages/google/protobuf/text_format.py", line 888, in _MergeScalarField
    value = tokenizer.ConsumeString()
  File "/home/zh/sda3/Anaconda3/envs/deep2.0.0/lib/python3.6/site-packages/google/protobuf/text_format.py", line 1251, in ConsumeString
    the_bytes = self.ConsumeByteString()
  File "/home/zh/sda3/Anaconda3/envs/deep2.0.0/lib/python3.6/site-packages/google/protobuf/text_format.py", line 1266, in ConsumeByteString
    the_list = [self._ConsumeSingleByteString()]
  File "/home/zh/sda3/Anaconda3/envs/deep2.0.0/lib/python3.6/site-packages/google/protobuf/text_format.py", line 1291, in _ConsumeSingleByteString
    result = text_encoding.CUnescape(text[1:-1])
  File "/home/zh/sda3/Anaconda3/envs/deep2.0.0/lib/python3.6/site-packages/google/protobuf/text_encoding.py", line 103, in CUnescape
    result = ''.join(_cescape_highbit_to_str[ord(c)] for c in result)
  File "/home/zh/sda3/Anaconda3/envs/deep2.0.0/lib/python3.6/site-packages/google/protobuf/text_encoding.py", line 103, in <genexpr>
    result = ''.join(_cescape_highbit_to_str[ord(c)] for c in result)
IndexError: list index out of range

How can I run this on CPU? Thank you.
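
The traceback fails while parsing the text-format checkpoint index file, and the model path recorded in it contains non-ASCII characters, so this looks like an encoding issue in that parse rather than a CPU limitation. A hedged workaround sketch: copy the model files to an ASCII-only directory and rewrite the index file to point at the new prefix (the /opt path below is illustrative):

    new_prefix = "/opt/speech_model/speech.cpkt-101"
    with open("/opt/speech_model/checkpoint", "w") as f:
        f.write('model_checkpoint_path: "%s"\n' % new_prefix)
        f.write('all_model_checkpoint_paths: "%s"\n' % new_prefix)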

data

I could not find test.word.txt in the Tsinghua (THCHS-30) dataset. Could you upload it? One more question: can I train on a GPU? Training is very slow.

train with gpu

When I use a GPU to train my own model, it seems the GPU is not actually used; training is very slow.
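
A hedged first check (not from the repo): confirm that this TensorFlow build can see the GPU at all and log where ops are placed; a CPU-only tensorflow package or a broken CUDA setup makes training silently fall back to the CPU.

    import tensorflow as tf

    print(tf.test.is_gpu_available())   # False means TensorFlow found no usable GPU
    sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))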

rr

What is the sample rate of the training data?

Hello, when I test your model on the data you provided, recognition is very accurate. But when I read the same content aloud myself and then run recognition on my own recording, it is not accurate at all. Why is that? Could it be caused by a mismatch in sample rate? My recordings use a 16 kHz sample rate.
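
One thing worth checking, under the assumption that the model expects recordings in the same format as the THCHS-30 training data (16 kHz, mono, 16-bit PCM): inspect the header of your own recording before feeding it in (the file name is illustrative).

    import wave

    with wave.open("my_recording.wav", "rb") as w:
        print(w.getframerate(), w.getnchannels(), w.getsampwidth())
        # Expected for THCHS-30-style input: 16000 1 2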

Accuracy

Hello OP, after training on all the D_**.wav files, accuracy on the training samples is very high but accuracy on the test samples is very low. Have you run into this problem? Thanks!

InvalidArgumentError

root@zp:/home/zp/Desktop/voice# python3 test.py
./train_data/wav/test/D21/D21_847.wav **队 首先 出场 的 是 赖 亚文 孙悦 李艳 吴 咏梅 崔 咏梅 和 何 琦 这 也是 首选 阵容
wav: 2132 label 2132
字表大小: 1787
Traceback (most recent call last):
File "/root/miniconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1322, in _do_call
return fn(*args)
File "/root/miniconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1305, in _run_fn
self._extend_graph()
File "/root/miniconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1340, in _extend_graph
tf_session.ExtendSession(self._session)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Cannot assign a device for operation 'h6/Adam_1': Operation was explicitly assigned to /device:GPU:0 but available devices are [ /job:localhost/replica:0/task:0/device:CPU:0 ]. Make sure the device specification refers to a valid device.
[[Node: h6/Adam_1 = VariableV2[_class=["loc:@h6"], container="", dtype=DT_FLOAT, shape=[512,1788], shared_name="", _device="/device:GPU:0"]()]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "test.py", line 18, in
bi_rnn.build_test()
File "/home/zp/Desktop/voice/model.py", line 336, in build_test
self.init_session()
File "/home/zp/Desktop/voice/model.py", line 192, in init_session
self.sess.run(tf.global_variables_initializer())
File "/root/miniconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 900, in run
run_metadata_ptr)
File "/root/miniconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1135, in _run
feed_dict_tensor, options, run_metadata)
File "/root/miniconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1316, in _do_run
run_metadata)
File "/root/miniconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1335, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Cannot assign a device for operation 'h6/Adam_1': Operation was explicitly assigned to /device:GPU:0 but available devices are [ /job:localhost/replica:0/task:0/device:CPU:0 ]. Make sure the device specification refers to a valid device.
[[Node: h6/Adam_1 = VariableV2[_class=["loc:@h6"], container="", dtype=DT_FLOAT, shape=[512,1788], shared_name="", _device="/device:GPU:0"]()]]

Caused by op 'h6/Adam_1', defined at:
File "test.py", line 18, in
bi_rnn.build_test()
File "/home/zp/Desktop/voice/model.py", line 335, in build_test
self.loss()
File "/home/zp/Desktop/voice/model.py", line 156, in loss
self.optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(self.avg_loss)
File "/root/miniconda3/lib/python3.6/site-packages/tensorflow/python/training/optimizer.py", line 424, in minimize
name=name)
File "/root/miniconda3/lib/python3.6/site-packages/tensorflow/python/training/optimizer.py", line 600, in apply_gradients
self._create_slots(var_list)
File "/root/miniconda3/lib/python3.6/site-packages/tensorflow/python/training/adam.py", line 132, in _create_slots
self._zeros_slot(v, "v", self._name)
File "/root/miniconda3/lib/python3.6/site-packages/tensorflow/python/training/optimizer.py", line 1150, in _zeros_slot
new_slot_variable = slot_creator.create_zeros_slot(var, op_name)
File "/root/miniconda3/lib/python3.6/site-packages/tensorflow/python/training/slot_creator.py", line 181, in create_zeros_slot
colocate_with_primary=colocate_with_primary)
File "/root/miniconda3/lib/python3.6/site-packages/tensorflow/python/training/slot_creator.py", line 155, in create_slot_with_initializer
dtype)
File "/root/miniconda3/lib/python3.6/site-packages/tensorflow/python/training/slot_creator.py", line 65, in _create_slot_var
validate_shape=validate_shape)
File "/root/miniconda3/lib/python3.6/site-packages/tensorflow/python/ops/variable_scope.py", line 1317, in get_variable
constraint=constraint)
File "/root/miniconda3/lib/python3.6/site-packages/tensorflow/python/ops/variable_scope.py", line 1079, in get_variable
constraint=constraint)
File "/root/miniconda3/lib/python3.6/site-packages/tensorflow/python/ops/variable_scope.py", line 425, in get_variable
constraint=constraint)
File "/root/miniconda3/lib/python3.6/site-packages/tensorflow/python/ops/variable_scope.py", line 394, in _true_getter
use_resource=use_resource, constraint=constraint)
File "/root/miniconda3/lib/python3.6/site-packages/tensorflow/python/ops/variable_scope.py", line 786, in _get_single_variable
use_resource=use_resource)
File "/root/miniconda3/lib/python3.6/site-packages/tensorflow/python/ops/variable_scope.py", line 2220, in variable
use_resource=use_resource)
File "/root/miniconda3/lib/python3.6/site-packages/tensorflow/python/ops/variable_scope.py", line 2210, in
previous_getter = lambda **kwargs: default_variable_creator(None, **kwargs)
File "/root/miniconda3/lib/python3.6/site-packages/tensorflow/python/ops/variable_scope.py", line 2193, in default_variable_creator
constraint=constraint)
File "/root/miniconda3/lib/python3.6/site-packages/tensorflow/python/ops/variables.py", line 235, in init
constraint=constraint)
File "/root/miniconda3/lib/python3.6/site-packages/tensorflow/python/ops/variables.py", line 349, in _init_from_args
name=name)
File "/root/miniconda3/lib/python3.6/site-packages/tensorflow/python/ops/state_ops.py", line 137, in variable_op_v2
shared_name=shared_name)
File "/root/miniconda3/lib/python3.6/site-packages/tensorflow/python/ops/gen_state_ops.py", line 1255, in variable_v2
shared_name=shared_name, name=name)
File "/root/miniconda3/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/root/miniconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3392, in create_op
op_def=op_def)
File "/root/miniconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1718, in init
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access

InvalidArgumentError (see above for traceback): Cannot assign a device for operation 'h6/Adam_1': Operation was explicitly assigned to /device:GPU:0 but available devices are [ /job:localhost/replica:0/task:0/device:CPU:0 ]. Make sure the device specification refers to a valid device.
[[Node: h6/Adam_1 = VariableV2[_class=["loc:@h6"], container="", dtype=DT_FLOAT, shape=[512,1788], shared_name="", _device="/device:GPU:0"]()]]
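
A hedged workaround for a machine with no visible GPU: rather than editing every tf.device() call, let TensorFlow fall back to the CPU when a pinned device is unavailable, for example where model.py creates its session:

    config = tf.ConfigProto(allow_soft_placement=True)
    self.sess = tf.Session(config=config)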

Model testing

After training for 25 epochs I tested the model by changing the configured data path to the test set, and got the following error:
InvalidArgumentError (see above for traceback): Assign requires shapes of both tensors to match. lhs shape= [1787] rhs shape= [2664]
