Conformer

About

This is an implementation of Conformer [1] in TensorFlow 2.

Note: This repository is still in development and constantly evolving. New features and updates will appear over time.

Motivation

I have seen many Conformer implementations ([2], [3]), but none of them is in TensorFlow. I therefore wanted to challenge myself to implement this model in my favorite framework.

Installation

You need Python 3.7 or higher. I highly recommend creating a virtual environment with a tool such as venv or conda.

The core of this project depends only on TensorFlow 2. TensorFlow I/O is also used for feature augmentation, but it is not required at the moment.

Install the dependencies with pip (installation via setuptools is not available yet):

pip install tensorflow
pip install tensorflow-io   # optional
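
To confirm the environment before running the example, a small check along these lines may help. It is only a sketch: the helper name `check_environment` is hypothetical and not part of this repository, and it assumes the packages are importable as `tensorflow` and `tensorflow_io`.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 7)  # minimum version stated in this README


def check_environment(version_info=None):
    """Report whether the requirements listed above appear to be met.

    Hypothetical helper for illustration; not part of the repository.
    """
    if version_info is None:
        version_info = sys.version_info
    return {
        "python_ok": tuple(version_info[:2]) >= MIN_PYTHON,
        # find_spec returns None when a top-level package is not installed
        "tensorflow": importlib.util.find_spec("tensorflow") is not None,
        "tensorflow_io": importlib.util.find_spec("tensorflow_io") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {ok}")
```

A `False` for `tensorflow_io` is fine for now, since that package is optional.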

Usage

import tensorflow as tf
from conformer import Conformer

batch_size, seq_len, input_dim = 3, 15, 256

model = Conformer(
    num_conv_filters=[512, 512], 
    num_blocks=1, 
    encoder_dim=512, 
    num_heads=8, 
    dropout_rate=0.4, 
    num_classes=10, 
    include_top=True
)

# Get sample input
inputs = tf.random.uniform((batch_size, seq_len, input_dim),
                            minval=-40,
                            maxval=40)

# Convert to 4-dimensional tensor to fit Conv2D
inputs = tf.expand_dims(inputs, axis=1)  

# Get output
outputs = model(inputs)     # [batch_size, 1, seq_len, num_classes]
outputs = tf.squeeze(outputs, axis=1)

References

[1] Conformer: Convolution-augmented Transformer for Speech Recognition

[2] @sooftware's PyTorch implementation

[3] @jaketae's PyTorch implementation

Issues

Error in computing RelativeMHA

At line 105 in attention.py, you've written
context = self.out_linear(tf.reshape(context, [batch_size, -1, seq_len, self.d_model]), training=training)
which should've been
context = self.out_linear(tf.reshape(context, [batch_size, -1, self.d_model]), training=training)

This is because the output of each multi-head attention layer should have shape [B, S, D_Model].
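
The shape arithmetic behind this report can be checked without TensorFlow. The sketch below resolves the `-1` in each target shape the way a reshape would. The helper `resolve_reshape`, the example sizes, and the assumed layout of `context` before the reshape ([B, S, num_heads, d_head]) are illustrative assumptions, not taken from attention.py.

```python
from functools import reduce
from operator import mul


def _product(dims):
    """Multiply all dimensions together (1 for an empty shape)."""
    return reduce(mul, dims, 1)


def resolve_reshape(input_shape, target_shape):
    """Return target_shape with a single -1 replaced by the inferred size.

    Hypothetical helper that mimics how a reshape infers one unknown axis.
    """
    if list(target_shape).count(-1) > 1:
        raise ValueError("only one dimension may be -1")
    total = _product(input_shape)
    known = _product(d for d in target_shape if d != -1)
    if -1 not in target_shape:
        if known != total:
            raise ValueError("element counts do not match")
        return list(target_shape)
    if total % known != 0:
        raise ValueError("target shape does not divide the input size")
    return [total // known if d == -1 else d for d in target_shape]


# Illustrative sizes (assumptions, not taken from the repository)
batch_size, num_heads, seq_len, d_model = 3, 8, 15, 512
d_head = d_model // num_heads

# Assumed layout of the attention context just before the reshape
context_shape = [batch_size, seq_len, num_heads, d_head]

reported = resolve_reshape(context_shape, [batch_size, -1, seq_len, d_model])
proposed = resolve_reshape(context_shape, [batch_size, -1, d_model])

print(reported)  # [3, 1, 15, 512]
print(proposed)  # [3, 15, 512]
```

Under these assumptions the current reshape yields a 4-dimensional tensor with an extra axis of size 1, while the proposed reshape yields the 3-dimensional [B, S, D_Model] shape described above.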

Repository: thanhtvt/conformer (GitHub)
