ASV-anti-spoofing-with-Res2Net

This repository provides the implementation of the paper: Replay and Synthetic Speech Detection with Res2Net architecture.

System Architecture

  1. ResNet blocks
  2. Overall model architecture

Main Results

  1. System performance on the ASVspoof2019 PA and LA datasets. (The input features for PA and LA are Spec and LFCC, respectively.)

  2. System performance on the ASVspoof2019 PA (left) and LA (right) of SE-Res2Net50 with different acoustic features.

Dependencies

  1. Python and packages

    This code was tested on Python 3.7 with PyTorch 1.6.0. Other packages can be installed by:

    pip install -r requirements.txt
  2. Kaldi

    This work uses Kaldi to extract features, so you need to install Kaldi before running our scripts.

  3. MATLAB

    The LFCC feature adopted in this work is extracted with the MATLAB code provided by the ASVspoof2019 organizers.

Dataset

This work is conducted on the ASVspoof2019 dataset, which can be downloaded from https://datashare.ed.ac.uk/handle/10283/3336. It consists of two subsets: physical access (PA) for replay attacks and logical access (LA) for synthetic speech attacks.

Start Your Project

This repository mainly consists of two parts: (i) feature extraction and (ii) system training and evaluation.

Feature extraction

Three features are used in this repo: Spec, LFCC and CQT. The top-level script for feature extraction is extract_feats.sh. Its first step (Stage 0) prepares the dataset and must be run before any feature extraction. The script also extracts Spec (Stage 1) and CQT (Stage 2). For LFCC extraction, you need to run the ./baseline/write_feature_kaldi_PA_LFCC.sh and ./baseline/write_feature_kaldi_LA_LFCC.sh scripts instead. All features must then be truncated by Stage 4 in extract_feats.sh.
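To illustrate what truncating a feature sequence to a fixed length can look like, here is a minimal sketch in plain Python. It does not reproduce Stage 4 of extract_feats.sh: the function name truncate_or_repeat, the target length, and the choice to repeat short utterances until they reach the target length are all assumptions made for illustration.

```python
def truncate_or_repeat(frames, target_len):
    """Return exactly target_len frames.

    frames: list of per-frame feature vectors (hypothetical layout).
    Long utterances are cut; short ones are repeated from the start.
    Repeat-padding is an assumption, not necessarily what Stage 4 does.
    """
    if target_len <= 0:
        raise ValueError("target_len must be positive")
    if not frames:
        raise ValueError("frames must not be empty")
    if len(frames) >= target_len:
        return frames[:target_len]
    # Ceiling division: how many copies are needed to cover target_len.
    repeats = -(-target_len // len(frames))
    return (frames * repeats)[:target_len]


if __name__ == "__main__":
    toy = [[0.1], [0.2], [0.3]]  # three frames, one coefficient each
    print(len(truncate_or_repeat(toy, 2)), len(truncate_or_repeat(toy, 7)))
```

Fixed-length inputs let utterances of different durations be stacked into one batch, which is the usual reason for a truncation step before training.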

After setting your dataset directory in extract_feats.sh, you can run any stage (e.g. NUM) of the script with

./extract_feats.sh --stage NUM

For LFCC extraction, you need to run

./baseline/write_feature_kaldi_LA_LFCC.sh
./baseline/write_feature_kaldi_PA_LFCC.sh
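The order of the steps above can be sketched as a small Python driver. This is only an illustration: build_commands and run are hypothetical helpers that are not part of the repository, and the assumption that Stage 4 runs once after every feature type has been extracted is inferred from the description above. By default the driver only prints the commands (dry run).

```python
import subprocess


def build_commands(features):
    """Return the shell commands, in order, for the requested feature types.

    features: iterable containing any of "Spec", "CQT", "LFCC".
    """
    features = set(features)
    unknown = features - {"Spec", "CQT", "LFCC"}
    if unknown:
        raise ValueError(f"unknown feature types: {sorted(unknown)}")
    # Stage 0 prepares the dataset and comes first.
    commands = [["./extract_feats.sh", "--stage", "0"]]
    if "Spec" in features:
        commands.append(["./extract_feats.sh", "--stage", "1"])
    if "CQT" in features:
        commands.append(["./extract_feats.sh", "--stage", "2"])
    if "LFCC" in features:
        commands.append(["./baseline/write_feature_kaldi_LA_LFCC.sh"])
        commands.append(["./baseline/write_feature_kaldi_PA_LFCC.sh"])
    # Stage 4 truncates the extracted features (assumed to run last).
    commands.append(["./extract_feats.sh", "--stage", "4"])
    return commands


def run(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)  # stop at the first failure


if __name__ == "__main__":
    run(build_commands(["Spec", "LFCC"]))
```

Passing dry_run=False executes the commands from the repository root and stops at the first script that exits with an error.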

System training and evaluation

This repo supports different system architectures, as configured in the conf/training_mdl directory. You can specify the system architecture and acoustic features in start.sh, then run the command below to train and evaluate your models.

./start.sh

Remember to change the runid in start.sh so that each configuration can be told apart. In experiments after our ICASSP 2021 submission, we observed that SERes2Net50 configured with 14w_8s and 26w_8s can achieve slightly better performance.

To evaluate a system, you can either use the Kaldi command compute-eer on the resulting *.eer file to compute the system EER, e.g.

. ./path.sh
compute-eer NameofScoringFile.txt.eer

or use the official ASVspoof2019 script scoring/evaluate_tDCF_asvspoof19.py on the resulting *.txt file to compute both the system EER and t-DCF, e.g. on the LA evaluation set, you need to run

python scoring/evaluate_tDCF_asvspoof19.py scoring/la_asv_scores/ASVspoof2019.LA.asv.eval.gi.trl.scores.txt NameofScoringFile.txt
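For readers who want to see what the EER number means, the following is a minimal pure-Python sketch of an equal error rate computation. It assumes the *.eer file follows the input format that Kaldi's compute-eer reads, i.e. one "score label" pair per line with the label being target or nontarget. The helpers parse_scores and compute_eer are hypothetical and not part of this repository, and the result may differ slightly from the Kaldi tool because of how ties between thresholds are resolved.

```python
from bisect import bisect_left


def parse_scores(lines):
    """Split 'score label' lines into target and nontarget score lists."""
    targets, nontargets = [], []
    for line in lines:
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        score, label = float(fields[0]), fields[1]
        if label == "target":
            targets.append(score)
        elif label == "nontarget":
            nontargets.append(score)
    return targets, nontargets


def compute_eer(targets, nontargets):
    """Return the EER as a fraction in [0, 1].

    A trial is accepted when its score is >= the threshold. The EER is
    taken at the threshold where the false acceptance rate (FAR) and the
    false rejection rate (FRR) are closest, as the mean of the two.
    """
    if not targets or not nontargets:
        raise ValueError("need at least one target and one nontarget score")
    tar, non = sorted(targets), sorted(nontargets)
    best_gap, eer = float("inf"), None
    for thr in sorted(set(tar) | set(non)):
        frr = bisect_left(tar, thr) / len(tar)               # targets below thr
        far = (len(non) - bisect_left(non, thr)) / len(non)  # nontargets >= thr
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


if __name__ == "__main__":
    toy = ["0.9 target", "0.8 target", "0.7 target", "0.3 target",
           "0.6 nontarget", "0.4 nontarget", "0.2 nontarget", "0.1 nontarget"]
    print(compute_eer(*parse_scores(toy)))
```

To apply it to a real scoring file, pass the open file object to parse_scores, since iterating over a file yields its lines.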

Citation

If this repo is helpful for your research or projects, please star our repo and cite our paper as follows:

@article{li2020replay,
  title={Replay and Synthetic Speech Detection with Res2net Architecture},
  author={Li, Xu and Li, Na and Weng, Chao and Liu, Xunying and Su, Dan and Yu, Dong and Meng, Helen},
  journal={arXiv preprint arXiv:2010.15006},
  year={2020}
}
