Comments (5)

yeyupiaoling avatar yeyupiaoling commented on September 3, 2024

from voiceprintrecognition-pytorch.

Heavenbest avatar Heavenbest commented on September 3, 2024

I haven't changed any other parameters; everything uses the default values.

yeyupiaoling avatar yeyupiaoling commented on September 3, 2024

Heavenbest avatar Heavenbest commented on September 3, 2024

Yes, two datasets. The training set is the extracted CN-Celeb_flac and CN-Celeb2_flac merged together (num_speakers: 2796), and the evaluation trials dataset is cn-celeb-test.

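# Dataset parameters
dataset_conf:
  # Minimum audio duration; shorter clips are filtered out
  min_duration: 0.5
  # Maximum audio duration; longer audio is cropped
  max_duration: 3
  # Whether to trim silent segments
  do_vad: False
  # Audio sample rate
  sample_rate: 16000
  # Whether to normalize the audio volume
  use_dB_normalization: True
  # Target volume in dB for volume normalization
  target_dB: -20
  # Path to the training data list
  #train_list: 'dataset/train_list.txt'
  train_list: '../../datasets/voiceprint/train_list.txt'
  # Path to the enrollment data list used for evaluation
  enroll_list: 'dataset/cn-celeb-test/enroll_list.txt'
  # Path to the trials data list used for evaluation
  trials_list: 'dataset/cn-celeb-test/trials_list.txt'
  # Evaluation data needs special handling
  eval_conf:
    # Evaluation batch size
    batch_size: 1
    # Maximum audio duration
    max_duration: 20
  # Data loader parameters
  dataLoader:
    # Training batch size
    batch_size: 128
    # Number of data-loading workers
    num_workers: 4
  # Data augmentation parameters
  aug_conf:
    # Whether to use speed perturbation
    speed_perturb: True
    # Whether speed perturbation triples the number of classes
    speed_perturb_3_class: True
    # Whether to use volume perturbation
    volume_perturb: False
    # Probability of volume augmentation
    volume_aug_prob: 0.2
    # Folder of noise files for noise augmentation
    noise_dir: 'dataset/noise'
    # Probability of noise augmentation
    noise_aug_prob: 0.2
  # Whether to use SpecAug
  use_spec_aug: True
  # SpecAug parameters
  spec_aug_args:
    # Range of the random frequency mask width
    freq_mask_width: [ 0, 8 ]
    # Range of the random time mask width
    time_mask_width: [ 0, 10 ]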

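# Data preprocessing parameters
preprocess_conf:
  # Audio feature method; supported: MelSpectrogram, Spectrogram, MFCC, Fbank
  feature_method: 'Fbank'
  # Arguments for the feature API; see the corresponding API for more parameters.
  # If unsure, delete this section to use the default values.
  method_args:
    sample_frequency: 16000
    num_mel_bins: 80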

optimizer_conf:
  # Optimizer; supported: Adam, AdamW, SGD
  optimizer: 'Adam'
  # Initial learning rate
  learning_rate: 0.001
  weight_decay: !!float 1e-5
  # Learning rate scheduler; supported: WarmupCosineSchedulerLR, CosineAnnealingLR
  scheduler: 'WarmupCosineSchedulerLR'
  # Learning rate scheduler arguments
  scheduler_args:
    min_lr: !!float 1e-5
    max_lr: 0.001
    warmup_epoch: 5

model_conf:
  backbone:
    # Pooling layer to use; supported: ASP, SAP, TSP, TAP
    pooling_type: 'ASP'
    embd_dim: 192
  classifier:
    # Number of speakers, i.e. the number of classes
    num_speakers: 2796
    #num_speakers: 200
    num_blocks: 0

loss_conf:
  # Loss function to use; supported: AAMLoss, AMLoss, ARMLoss, CELoss
  use_loss: 'AAMLoss'
  # Loss function arguments
  args:
    margin: 0.2
    scale: 32
    easy_margin: False
  # Whether to use the loss margin scheduler
  use_margin_scheduler: True
  # Margin scheduler arguments
  margin_scheduler_args:
    final_margin: 0.3

train_conf:
  # Whether to enable automatic mixed precision
  enable_amp: False
  # Whether to use the PyTorch 2.0 compiler
  use_compile: False
  # Number of training epochs
  max_epoch: 60
  log_interval: 100

# Model to use
use_model: 'EcapaTdnn'
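
For anyone reading the scheduler settings above, here is a rough sketch of what a warmup-plus-cosine schedule with these values (min_lr 1e-5, max_lr 0.001, warmup_epoch 5, max_epoch 60) could look like per epoch. The function `warmup_cosine_lr` is hypothetical and uses a generic formulation (linear warmup, then cosine decay); it is not taken from the project's WarmupCosineSchedulerLR, which may step per batch or use a different warmup shape.

```python
import math


def warmup_cosine_lr(epoch: float, max_epoch: int = 60, warmup_epoch: int = 5,
                     min_lr: float = 1e-5, max_lr: float = 1e-3) -> float:
    """Learning rate at a (possibly fractional) epoch.

    Assumed shape: linear warmup from min_lr to max_lr, then cosine decay
    back to min_lr. Generic formulation, not the project's own code.
    """
    if epoch < warmup_epoch:
        # Linear ramp over the warmup epochs.
        return min_lr + (max_lr - min_lr) * epoch / warmup_epoch
    # Fraction of the post-warmup phase completed, clamped to [0, 1].
    progress = min(1.0, (epoch - warmup_epoch) / (max_epoch - warmup_epoch))
    return min_lr + 0.5 * (max_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    # Print the assumed learning rate at a few points in training.
    for epoch in (0, 5, 30, 60):
        print(f"epoch {epoch:2d}: lr = {warmup_cosine_lr(epoch):.6f}")
```

Under these assumptions the rate starts at min_lr, reaches max_lr at the end of epoch 5, and falls back to min_lr by epoch 60.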

yeyupiaoling avatar yeyupiaoling commented on September 3, 2024