tihu-nlp / tihu

Persian Text-To-Speech

Home Page: http://lilak-project.com/tihu_demo.php

License: Other

Makefile 1.42% C++ 75.80% C 2.24% QMake 0.16% HTML 19.93% Dockerfile 0.44%

persian-language persian-nlp text-to-speech tihu morphological-analysis

tihu's Introduction

Tihu, Persian Text-To-Speech

Tihu is an open-source Persian text-to-speech engine. It is a cross-platform application written mostly in C++. Tihu uses Hazm for digesting Persian text and Tihu g2p-seq2seq for grapheme-to-phoneme conversion.

Compile the source

Make sure you have installed gcc (6.0 or higher). To check your gcc version, run gcc --version. You also need Python 2.7 to run Tihu.

Compiling Tihu library

You can compile the Tihu library by following these steps:

git clone https://github.com/tihu-nlp/tihu.git
cd tihu

apt install curl
apt install python2-pip
apt install libespeak-ng-dev
apt install libsamplerate0-dev

make ready
make release

Tihu console

Tihu console is a GUI application that allows you to work with the Tihu library. Before compiling Tihu console, make sure you have installed the Qt framework. You can also find a pre-compiled version of Tihu console for Linux x64 on the releases page.

To compile Tihu console run:

qmake --version
make console

Tihu gRPC

Make sure you have installed gRPC C++. Then type: make grpc

docker

To create the Tihu Docker image with the gRPC endpoint, type make docker

These commands are useful when working with Docker:

Run the docker: docker run --name tihu -p 50051:50051 tihu
Stop the docker: docker stop tihu
Start the docker: docker start tihu

How Tihu works

Check the wiki page to learn how Tihu converts Persian text to speech. The document is in Persian and explains how Tihu works with an example.

About the Name

Tihu is the Persian name for a partridge.

tihu's Issues

Android support

Hi,

I'm interested in integrating Lilak Project's Tihu demo into an Android application for Farsi-English education. This is an extremely valuable tool for English speakers because it allows us to read short vowels in Penglish (the romanization of Persian script). I would love to share this with the world in my app, and I am interested in splitting a percentage of the profits. Thank you

Sincerely,
Parizi

504 Gateway Time-out

Hello, thanks for this amazing project.
I was visiting the demo at lilak-project.com,
but it wasn't working and I got a 504 Gateway Time-out error. Please fix that.
Thank you <3

Compressing output audio

Using the LAME library, we can compress raw PCM data to MP3. Look at this function: lame_encode_buffer
I have built LAME inside Travis successfully.

List words with wrong tags or pron

طوطیان tutiAn
مسکوت moskoat
راهرویی تاریک rAhroyiye tArik

Help using the release

Hi, I would like to thank you for working on this project. I'm trying to use release v0.2.0, but I don't know how to use it. I tried something like "./tihu_play ./libtihu.so "سلام" test.mp3", but it gives the error "No module named hazm". I don't know much about programming, by the way. Any help on how to run this program is appreciated.

Please provide AppImage for download

tihu/.travis.yml

Line 61 in 8c61e1c

 - ./linuxdeployqt-continuous-x86_64.AppImage appdir/usr/share/applications/*.desktop -appimage 

Please provide the AppImage file for download on GitHub Releases, thanks.

how can i use tihu grpc services

Hello guys. How can I use Tihu gRPC for TTS? Thanks for working on this project.

g2p-seq2seq is deprecated.

g2p-seq2seq uses Python 2.7, which is now deprecated. There is an open issue in g2p-seq2seq about this: cmusphinx/g2p-seq2seq#177

Updating g2p-seq2seq is not easy, since it uses TensorFlow, and TensorFlow is known for instability and breaking changes to its codebase.
I emailed Nickolay, who maintains the project, to talk about this issue. Here is his response:

Hello Mostafa

Nice to meet you here. I've been following your project for quite some time actually, it is a great undertaking.

I do not have the plans to keep the project alive, you might take over it if you want. I'll just add you as developers.

As for future development, I given up on Tensorflow as industry also left it. Everyone is on Pytorch nowadays. Tensorflow really screwed it with unclear API and frequent changes which break everything. 1.10 is different from 1.12 and so on and so forth. You can also read comments here: https://news.ycombinator.com/item?id=21118018

I'm planning to move to Pytorch and use maybe fairseq or a different seq2seq framework in Pytorch, but that will take some work which I don't have time to do right now. Feel free to jump in and let me know if you have further questions.

We have two options:

Updating g2p-seq2seq to use python3 and tensorflow 2.0 or higher
Using other libraries like: https://github.com/Kyubyong/g2p or https://github.com/AdolfVonKleist/Phonetisaurus

LTS module is very inaccurate

The LTS module is inaccurate. Maybe the model should be trained again.

Windows version?

I can see only the Linux version in the release section.

Can we have a compiled Windows version of Tihu in the release section too?

Error on compiling

Error message:

Traceback (most recent call last):
  File "/usr/bin/pip2", line 9, in <module>
    from pip import main
ImportError: cannot import name main
Makefile:17: recipe for target 'ready' failed
make: *** [ready] Error 1

I need Help :(

Hello, first of all, thanks for this amazing project.
My question is: how can I import and use Persian text-to-speech in a Python project?

tihu server weird respond

I tried to build and run the Tihu Docker image. Creating and running the image using the current Dockerfile:

$ sudo docker build .
$ sudo docker run -p 50051:50051 ab1fb002c73f
error loading g2p model: -1mbrola: mbrola: No such file or directory
�
mbrowrap error: mbrola exited with status 1
Server listening on 0.0.0.0:50051

After checking the Dockerfile, I realized that make ready was missing, so the g2p model was not downloaded. Adding make ready to the Dockerfile changes the log to:

$ sudo docker build .
$ sudo docker run -p 50051:50051 ab1fb002c73f
error loading g2p model: -1mbrowrap: voice samplerate = 16000
mbrola started.
mbrowrap: voice samplerate = 22050
mbrola started.
Server listening on 0.0.0.0:50051

But in both cases the server responds with nonsense:

��@��@�� ‏��?��

Please help me run the Tihu server. @b00f
Regards

trouble on compile tihu and use it

Hi there,
I am trying to use Tihu on my Raspberry Pi (RPI) robot.
It seems that the documentation is not up to date;
some lines return errors:
cp -r espeak-1.48.04-source/espeak-data src/build/data/
src/build/data/ no such file or directory
There is no guide on how to call it from the command line instead of using the GUI.
I want to call it something like this:
tihu -speak "سلام"
and have it say سلام.
Best regards

Crash on Emojis

Emojis cause a crash in Tihu (for example: 😂😂😂).
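
Until the crash itself is fixed, one possible workaround is to filter the input before it reaches the engine. The sketch below is not part of Tihu: the function name strip_supplementary_chars is hypothetical, and it assumes the crash is triggered by code points above U+FFFF (where most emoji, including 😂, are encoded), which the report does not confirm. It drops every 4-byte UTF-8 sequence and leaves Persian text, which uses 2-byte sequences, untouched.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper, not a Tihu API.
// Returns a copy of `utf8` with every 4-byte UTF-8 sequence removed.
// 4-byte sequences encode code points above U+FFFF, which is where most
// emoji live. Assumes the input is valid UTF-8.
std::string strip_supplementary_chars(const std::string& utf8) {
    std::string out;
    out.reserve(utf8.size());
    std::size_t i = 0;
    while (i < utf8.size()) {
        const unsigned char lead = static_cast<unsigned char>(utf8[i]);
        std::size_t len = 1;                      // ASCII or stray byte
        if ((lead & 0xF8) == 0xF0) len = 4;       // 11110xxx
        else if ((lead & 0xF0) == 0xE0) len = 3;  // 1110xxxx
        else if ((lead & 0xE0) == 0xC0) len = 2;  // 110xxxxx
        if (len != 4) {
            // append() clamps the count if the sequence is truncated.
            out.append(utf8, i, len);
        }
        i += len;
    }
    return out;
}
```

Calling strip_supplementary_chars on the text before passing it to the library would remove the emoji while keeping the surrounding words. Emoji built from code points below U+10000 (some older symbols) would still pass through.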

Kasre-Ezafe is not applied to the pronunciation

Tihu can detect Kasre-Ezafe, but it has no effect on the pronunciation.

MbrolaMale has same voice as FEMALE

It looks like we can't initialize both the Mbrola male and female voices at the same time.

Voice data resampling

Currently only mbrol-ir1 works fine. For other voices we need to resample the output data.
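
To illustrate what resampling involves, here is a minimal linear-interpolation sketch for 16-bit mono PCM. It is a simplified stand-in, not Tihu's code and not libsamplerate (which the build installs and which gives much better quality); the function name resample_linear and the assumption of 16-bit mono samples are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: resample 16-bit mono PCM from in_rate to out_rate
// by linear interpolation between neighbouring input samples.
std::vector<int16_t> resample_linear(const std::vector<int16_t>& in,
                                     uint32_t in_rate, uint32_t out_rate) {
    assert(in_rate > 0 && out_rate > 0);
    if (in.empty()) return {};

    const std::size_t out_len = static_cast<std::size_t>(
        static_cast<uint64_t>(in.size()) * out_rate / in_rate);
    std::vector<int16_t> out(out_len);

    // Distance, in input samples, between two consecutive output samples.
    const double step = static_cast<double>(in_rate) / out_rate;
    for (std::size_t i = 0; i < out_len; ++i) {
        const double pos = i * step;
        const std::size_t left =
            std::min(static_cast<std::size_t>(pos), in.size() - 1);
        const std::size_t right = std::min(left + 1, in.size() - 1);
        const double frac = pos - static_cast<double>(left);
        out[i] = static_cast<int16_t>(in[left] * (1.0 - frac) +
                                      in[right] * frac);
    }
    return out;
}
```

For example, resample_linear(samples, 16000, 22050) would convert between the two sample rates that appear in the mbrowrap logs in this tracker. Linear interpolation does no low-pass filtering, so downsampling with it can alias; a production path would use a proper resampler.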

Wave headers

I'm having issues using the actual responses from the gRPC API. They don't have a WAVE header, from what I can tell, so I have to add one in order to be able to play them.

In my case: if I pre-process the uint8 response in Node.js and add a WAVE header, then send the full WAVE file over gRPC and deserialize it as base64 on Expo/Android, it works.

So far all my attempts to add a WAVE header dynamically in the app, as base64, have failed. This is because, even though it's a mobile app, I only have access to web APIs and the native APIs that Expo exposes. I can't process the data with Node as I did previously, or use iOS/Java code.

So I wanted to ask: (1) How did you process WAVE files on lilak-project to make them playable on the web? I might be able to reuse some of this. (2) Is it possible to add the WAVE header to the gRPC response, possibly with a parameter? This would make my life easier. But I'm not sure how this affects streaming.
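
For reference, a canonical PCM WAVE header is 44 bytes and can be prepended to raw samples with very little code. The sketch below is not taken from Tihu or the lilak-project demo; the function name wrap_pcm_in_wav is hypothetical, and the sample rate, channel count and bit depth are parameters the caller must set to match whatever the server actually emits, which this thread does not specify.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Append `bytes` bytes of `value` to `out` in little-endian order.
static void put_le(std::vector<uint8_t>& out, uint32_t value, int bytes) {
    for (int i = 0; i < bytes; ++i) {
        out.push_back(static_cast<uint8_t>((value >> (8 * i)) & 0xFF));
    }
}

// Hypothetical helper: wrap raw PCM in a canonical 44-byte RIFF/WAVE header.
// The format parameters are assumptions supplied by the caller; raw PCM
// carries no information about them.
std::vector<uint8_t> wrap_pcm_in_wav(const std::vector<uint8_t>& pcm,
                                     uint32_t sample_rate,
                                     uint16_t channels,
                                     uint16_t bits_per_sample) {
    assert(channels > 0 && bits_per_sample > 0 && bits_per_sample % 8 == 0);
    const uint32_t block_align = channels * bits_per_sample / 8u;
    const uint32_t byte_rate = sample_rate * block_align;
    const uint32_t data_size = static_cast<uint32_t>(pcm.size());
    const std::string riff = "RIFF", wave = "WAVE", fmt = "fmt ", data = "data";

    std::vector<uint8_t> wav;
    wav.reserve(44 + pcm.size());
    wav.insert(wav.end(), riff.begin(), riff.end());
    put_le(wav, 36 + data_size, 4);  // size of everything after this field
    wav.insert(wav.end(), wave.begin(), wave.end());
    wav.insert(wav.end(), fmt.begin(), fmt.end());
    put_le(wav, 16, 4);              // fmt chunk size for plain PCM
    put_le(wav, 1, 2);               // audio format 1 = linear PCM
    put_le(wav, channels, 2);
    put_le(wav, sample_rate, 4);
    put_le(wav, byte_rate, 4);
    put_le(wav, block_align, 2);
    put_le(wav, bits_per_sample, 2);
    wav.insert(wav.end(), data.begin(), data.end());
    put_le(wav, data_size, 4);
    wav.insert(wav.end(), pcm.begin(), pcm.end());
    return wav;
}
```

Because the header stores the total data size up front, this approach needs the complete audio before the header can be written. That is the tension with streaming raised in question (2): a streamed response would need either a placeholder size or client-side buffering.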

Memory Leaks

We should fix all memory leaks.

tihu_server doesn't close properly

When sent SIGINT (Ctrl+C), tihu_server does not close and gives an error.
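
A common pattern for clean shutdown is to have the signal handler only set a flag and let the main thread do the real teardown. The sketch below shows that pattern in isolation; it is not tihu_server's code, the names install_stop_handlers and stop_requested are hypothetical, and the report does not say what actually causes the error.

```cpp
#include <atomic>
#include <cassert>
#include <csignal>

// Set by the signal handler, polled by the main thread.
static std::atomic<bool> g_stop_requested{false};

// Signal handlers should do as little as possible, so this one only
// flips a flag.
extern "C" void handle_stop_signal(int) {
    g_stop_requested.store(true);
}

// Hypothetical helper: route SIGINT (Ctrl+C) and SIGTERM to the handler.
void install_stop_handlers() {
    std::signal(SIGINT, handle_stop_signal);
    std::signal(SIGTERM, handle_stop_signal);
}

// Hypothetical helper: true once a stop signal has been received.
bool stop_requested() {
    return g_stop_requested.load();
}
```

In a gRPC C++ server, the main thread could poll stop_requested() and then call grpc::Server::Shutdown() on the server object, so that the thread blocked in Wait() returns and destructors run normally.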

Using espeak-ng

eSpeak is now an obsolete project, and we can use eSpeak NG instead:
https://github.com/espeak-ng/espeak-ng

Invalid pronunciation for a suffix

All the pronunciations for ها are invalid here: https://github.com/tihu-nlp/tihu/blob/master/build/data/lexicon.aff

Merging Tokens and Punctuation files

We can merge the tokens.txt and punctuations.txt files into a new file that includes the reading status and pronunciations. That way we can read stand-alone characters, like: آ

Using latest version of g2p-seq2seq (python library)

Currently Tihu is using a compiled version of g2p-seq2seq, which is quite old, and we also need to train the model data with the new version. The current implementation can be found here.
We need to call and load this Python library directly. That will let us build new models and work with the latest code.
Look here to check how Tihu uses the Hazm library (which is in Python as well).

docker grpc crash segmentation fault after query

Hi @b00f, love the work you put into this

I'm building a small os react-native client and trying to run tihu as a grpc service through docker.

It builds fine and starts up, but when doing a query it crashes with:

Segmentation fault (core dumped)

My code: https://github.com/shotor/tihu-grpc-js/blob/master/index.js

Am I doing something wrong?

Could you post a few sample outputs of this code?

Greetings,
Could you please post a few samples of this code's output?
Thank you

error loading g2p model: -1mbrowrap:

Can anyone help me with this error?
After running:
./tihu_play ./libtihu.so "سلام" /tmp/raw1
I get this:
error loading g2p model: -1mbrowrap: voice samplerate = 16000
mbrola started.
mbrowrap: voice samplerate = 22050
mbrola started.

504 Gateway Time-out

The website lilak-project.com/tihu_demo.php is not working.
When I enter my Persian text and click Submit, the website crashes and shows this error:
504 Gateway Time-out
So it is not working :(

rsa requires Python '>=3.5, <4' but the running Python is 2.7.16 make: *** [Makefile:27: ready] Error 1

Hi, I get this error when installing Tihu, on running this command:
make ready
Collecting rsa>=3.1.4 (from oauth2client->tensor2tensor==1.7.0)
Downloading https://files.pythonhosted.org/packages/2d/d3/41b3db87f262debadb153900d4e6f8d61aa87187dd6fedd855ed24e8526d/rsa-4.7.1.tar.gz
rsa requires Python '>=3.5, <4' but the running Python is 2.7.16
make: *** [Makefile:27: ready] Error 1

lsb_release -a
No LSB modules are available.
Distributor ID: Debian
Description: Debian GNU/Linux 10 (buster)
Release: 10
Codename: buster

Segmentation fault when running locally

Hi @b00f

I'm trying to get the gRPC server to run locally so I can find out why the generated JS client doesn't work. I think it has to do with Docker networking and what the generated client uses internally to make the gRPC calls. So I want to run it locally to confirm, but I get a segmentation fault when I try to run the server:

./tihu_server 0.0.0.0:50051 /tihu/build/libtihu.so ""
[1]    29866 segmentation fault (core dumped)  ./tihu_server 0.0.0.0:50051 /tihu/build/libtihu.so ""

Steps that I took:

Following instructions from the Dockerfile on an ubuntu 18.04 machine:

grpc on branch tags/v1.21.x
third_party/protobuf on branch tags/v3.7.0

Protobuf installation first. No errors.

When building grpc I got the error:

error adding symbols: DSO missing from command line

I solved that by modifying the makefile as suggested here: grpc/grpc#9549 (comment)

After that installation runs without error and I have libprotoc.so.18 in /usr/local/lib

Running make on the tihu repo goes fine except for this warning when running make grpc:

/usr/bin/ld: warning: libprotobuf.so.10, needed by /usr/local/lib/libgrpc++_reflection.so, may conflict with libprotobuf.so.18

I tried to follow the warning and uninstalled protobuf then installed protobuf 2.7.0 so I get libprotoc.so.10. But that doesn't seem to be the answer:

./tihu_server 0.0.0.0:50051 /tihu/build/libtihu.so ""
./tihu_server: error while loading shared libraries: libprotobuf.so.18: cannot open shared object file: No such file or directory

I don't know how to proceed from here

Mbrola 253 error

Hi,
I installed all the Tihu packages based on the MD file, and the installation was successful.
But when I run the sample code, I get "mbrolla exited with status 253"

Please let me know how to resolve it

Thanks