
vrct's Introduction


| English | 日本語 | 한국어 | 繁體中文 |

VRCT is software that supports VRChat conversations with translation and transcription.

Download & Install

Download it from whichever source you prefer.

Just download and run the exe.

What is VRCT?

VRCT is software that supports conversations between people who speak different languages by providing chat or voice translation. These features are designed for use within VRChat. *Although not officially supported, it is also used for other purposes, such as watching movies.

VRCT supports your conversations with

  • 💬 Send chat to VRChat
  • 🌐 Translation
  • 🎙 Transcription of audio from microphone
  • 🔈 Transcription of audio from speaker

Documents

The documentation covers initial setup, basic functions, and other features.

How to Use (YouTube)

If you want to run it with Python

  1. Install Python 3.11.5.
  2. Install the packages and run main.py:
    ./install.bat
    python main.py

Author

misyaguziya

VRCT is not endorsed by VRChat and does not reflect the views or opinions of VRChat or anyone officially involved in producing or managing VRChat properties. VRChat and all associated properties are trademarks or registered trademarks of VRChat Inc. VRChat © VRChat Inc.

vrct's People

Contributors

misyaguziya, shiinasakamoto, soumt-r, flizeee


vrct's Issues

Feature Request: Use local models on GPU

First of all, this is a beautiful program. Thank you for making it.

I think it would be useful to have a setting to use a GPU for local models. This would greatly reduce the latency for higher quality models, if the user has spare VRAM.

For users with multiple GPUs, such as AI enthusiasts, there should also be a setting to select which GPU to use.
That way, VRChat can use the main GPU, and models can be loaded on the secondary GPU.

This would also enable running even better models in the future.

(If I have some spare time, I will try to implement this. If it interests you, it may be faster for you to do it.)
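
A minimal sketch of the kind of GPU/device selection being requested, assuming the local models are served through faster-whisper (VRCT's actual backend and setting names may differ):

    # Hypothetical example: load a local Whisper model on a chosen GPU.
    from faster_whisper import WhisperModel

    # device_index picks which GPU to use on multi-GPU systems, so VRChat can
    # keep the primary GPU while the model runs on a secondary one.
    model = WhisperModel("large-v3", device="cuda", device_index=1, compute_type="float16")

    segments, info = model.transcribe("speech.wav")
    print(" ".join(segment.text for segment in segments))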


Translation doesn't seem to work well.

(screenshot attached)

Translating from English to Japanese seems to produce incorrect results. Sometimes I also get "NaN" or similar within the translation output. This has happened since v2.1.0.

I have been using VRCT since v1.3.2; translating from English to Japanese / Korean / Chinese worked correctly up to v2.0.2.
Since v2.1.0 with the new UI, the translation results are not the same: friends misunderstand me, and sometimes the output is simply wrong. My workaround is to revert to v2.0.2.

My Japanese and Korean friends using VRCT don't have this problem; whatever they type in Japanese translates correctly into English, as always.

I'm using a Deepl.com free API key (the same API key used in v2.0.2) and the high-accuracy 1.2 GB model.

If you need me to generate any logs for troubleshooting, please let me know where I can find them; I'll gladly find and upload them for you.

I can't open it

I tried using a VPN in Japan and America to download and install.
It stops while starting up, and then the window just disappears.
I don't know why or what happened.
Also, on my first download the translation failed.

I use the Python version with Python 3.11.5, and the option is enabled in VRChat.
On Windows 11.

My friend is having trouble opening the program

(screenshot attached)

Hi! My friend just downloaded VRCT and is having issues opening/running the program. I've attached an image so you can see what's happening. I have installed the program version 2.2.4 on Windows 11 and it works perfectly fine, so I tried to help her set up the program.

She has restarted her computer and followed the installation directions. When she opens the program it looks like it's hanging and is blank. We tried to search for an error log but can't find it. The log folder is empty. If you need it, could you tell us where to look?

Here is her operating system and hardware specifications:
(screenshots of her system and hardware specifications attached)

Update:
Not sure if it's related, but just got an error.log that reads:

Traceback (most recent call last):
  File "main.py", line 10, in <module>
  File "<frozen importlib._bootstrap>", line 1176, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1147, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 690, in _load_unlocked
  File "PyInstaller\loader\pyimod02_importers.py", line 419, in exec_module
  File "config.py", line 1115, in <module>
  File "config.py", line 34, in __new__
  File "config.py", line 1100, in load_config
  File "config.py", line 270, in SELECTED_TAB_NO
  File "config.py", line 22, in saveJson
  File "json\__init__.py", line 293, in load
  File "json\__init__.py", line 346, in loads
  File "json\decoder.py", line 337, in decode
  File "json\decoder.py", line 355, in raw_decode
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

Update 2:
She tried downloading older versions, and version 1.3.2 works for her. Version 2.0 and above does not work.
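
The JSONDecodeError above suggests the saved config file was empty or corrupted. A minimal sketch of the kind of defensive load that would avoid this crash (a hypothetical helper, not VRCT's actual config.py):

    import json

    def load_config_or_defaults(path, defaults):
        # Fall back to defaults if the file is missing, empty, or not valid JSON.
        try:
            with open(path, "r", encoding="utf-8") as f:
                return {**defaults, **json.load(f)}
        except (FileNotFoundError, json.JSONDecodeError):
            return dict(defaults)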

Feature Suggestion: Multilingual Chatbox Display


Hi misyaguziya!

First, I want to thank you for creating such an innovative program for VRChat. It has greatly enhanced my experience by facilitating real-time communication with players from different linguistic backgrounds.

I have a suggestion that I believe could further improve the usability and inclusiveness of the program. Currently, for the Japanese language, the chatbox displays the translated and English text, which is incredibly useful. However, I think it would be beneficial to expand the chatbox functionality to include:

Phonetic Transcription (Romaji for Japanese): This helps users who are learning Japanese or are not yet comfortable with Japanese characters to understand the pronunciation and sound of the translated text.

Implementing these features would make the program more versatile and user-friendly, especially for those learning new languages or who prefer to see the translation in multiple forms. It would also ensure that all users can cross-check translations for accuracy and learn correct pronunciations.

For example, if a user says "How are you?", the chatbox could display:

English: How are you?
Japanese: お元気ですか?
Romaji: Ogenki desu ka?

This would enable me to speak Japanese and allow me to read Japanese when someone responds.

Thank you for considering this suggestion. I look forward to seeing how the program evolves and continues to improve the VRChat experience for its users.
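
A minimal sketch of how a romaji line could be produced, using pykakasi as one possible library choice (not necessarily what VRCT would use):

    import pykakasi

    kks = pykakasi.kakasi()

    def to_romaji(japanese_text):
        # Convert Japanese text to Hepburn romanization, e.g. "お元気ですか"
        # becomes roughly "ogenki desuka" (exact segmentation depends on pykakasi).
        return " ".join(item["hepburn"] for item in kks.convert(japanese_text))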

Multilanguage chatbox suggestion

Hello, I was going to make this suggestion earlier, but at a quick glance I noticed Dr-Kakashi had already suggested this, so I didn't post it.
Today I revisited GitHub and read Dr-Kakashi's suggestion carefully: I noticed he was only suggesting adding Romaji, which is different from my original idea, so I'll post my suggestion as well.

I suggest supporting translation output for up to three languages at once.

More often than not, we encounter situations where, within the same instance, there are people who speak different languages, for example English, Japanese, German, and Korean. We try to speak a common language, which usually ends up being English, but occasionally translation is still needed to get the meaning across.

I understand the text needs to be submitted to the translation services multiple times, so the chatbox will be slower to display, but once everyone in the instance can see the result, the extra two seconds of wait time will help everyone understand each other better.
I also understand that chatbox characters are limited; this feature will need a warning about that, and it will be the user's responsibility to be aware of the limitation.

Example:
Your language : English
Primary translation : Japanese
2nd translation : German
3rd translation : Korean

When I type "Good Morning" in English, it would output something like this:

Good Morning / おはようございます / Guten Morgen / 좋은 아침

For the Speaker2Log function, only the primary translation language would be used.

I think this should be easy to implement, as the program is already able to define three different language settings. It would just need a good UI design to introduce the feature and a new routine to call all three translator services at once and output the results together. You could even do a character count: if it exceeds the VRChat chatbox character limit, show a red warning in the translation window so the user knows the message may be too long and letters may get cut off.

What are your thoughts on this?
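
A minimal sketch of the suggested multi-target output and character-count warning (hypothetical helper names; 144 is the commonly cited VRChat chatbox limit and should be verified):

    CHATBOX_LIMIT = 144  # assumed VRChat chatbox character limit

    def build_chatbox_message(source_text, translations):
        # Join the source text with up to three translations, e.g.
        # "Good Morning / おはようございます / Guten Morgen / 좋은 아침".
        message = " / ".join([source_text, *translations])
        # Flag messages that may be cut off so the UI can show a red warning.
        return message, len(message) > CHATBOX_LIMIT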

Feature request: Selectable output device.

I've recently been introduced to VRCT and generally find the app really nice.

However, there is one thing I am missing from it: the option to select which output device is used for monitoring, as at the moment it just uses the system default.

Being able to change the monitored output device would allow the app to be used in more complex setups.
For example a specific virtual audio cable that is just using the game audio.
(E.g. using OBS to capture Game Audio and outputting it on a dedicated device.)
This would mean that listening to background music or videos would not interfere with detecting conversations.

Or you could have it listen specifically to the VR headset output instead of the desktop audio, without having to switch default audio devices in Windows for more specific audio setups.

This would also open up use cases well beyond VRChat for someone who uses dedicated conversation outputs, like community calls on Discord.

For example, I have two separate audio devices in use on my PC; one is dedicated as a conversation output.
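
A minimal sketch of enumerating output devices so the monitored device could be user-selectable, using the sounddevice package purely as an illustration (VRCT's actual audio backend may differ):

    import sounddevice as sd

    def list_output_devices():
        # Return (index, name) pairs for every device that can play audio,
        # e.g. a virtual audio cable or the VR headset output.
        return [
            (index, device["name"])
            for index, device in enumerate(sd.query_devices())
            if device["max_output_channels"] > 0
        ]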

Speaker2Log not working for me?

Everything else works, but for some reason Speaker2Log isn't working for me at all.

It doesn't transcribe or translate what others are saying into VRCT. It also does not work on videos.
I wonder if it is how I route my audio:
Motherboard optical -> DAC -> Headphones

VR Overlay locks up, freezes Speaker2Log transcription

I noticed that incoming translation would only show on the VR overlay for about 10-15 minutes before it stopped showing up, and it would no longer appear in the log window of the main program either, while outgoing chatbox transcription and translation kept going. With some trial and error I was able to narrow the issue down to a problem with the VR overlay itself.

Even with translation toggled off, if the other speaker's transcription is being shown on the VR Overlay then after a few minutes it will stop working. This happens regardless of whether it's set to use Whisper or not. If I turn off the fade-out then the last line that it displayed on the overlay will get stuck on screen. No further lines will be shown in the main program window log until I turn off the overlay, then it will resume transcribing as normal again (but I'll have to go look at the main window to see it).

I'm using SteamVR and also have VRCX running, if that makes a difference. The overlay seems to float on top of VRCX without any issue, but I should still consider a possible collision there and try without VRCX running next time.

If you have any advice on what I could check to help you debug the overlay issue, let me know. There is an "error.log" in the AppData\Local\VRCT folder, but the error shown there is not from when the overlay froze up.

Feature Request: Use direct translation feature of whisper

The whisper models can transcribe and translate at the same time, if configured correctly.

The models were trained on either English-only data or multilingual data. The English-only models were trained on the task of speech recognition. The multilingual models were trained on both speech recognition and speech translation. For speech recognition, the model predicts transcriptions in the same language as the audio. For speech translation, the model predicts transcriptions to a different language to the audio.
source

If we add this feature to VRCT, combined with the feature of running these models on the GPU, it could result in very fast, high quality translations.
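
A minimal sketch of what Whisper's built-in translate task looks like, shown here with the faster-whisper package (how it would be wired into VRCT is left open):

    from faster_whisper import WhisperModel

    model = WhisperModel("medium")  # multilingual model; English-only variants cannot translate

    # task="translate" makes the model output English text directly,
    # instead of transcribing in the source language and translating afterwards.
    segments, info = model.transcribe("japanese_speech.wav", task="translate")
    print(" ".join(segment.text for segment in segments))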


VRCT fails to start when geolocation.onetrust.com cannot be reached for any reason

Summary

When geolocation.onetrust.com is blocked or otherwise unreachable, VRCT throws an error and, some of the time, fails to start.

Details

On my PC, for various reasons, all traffic is always routed through a VPN. The VPN I use has a built-in feature that blocks sites and domains used mainly to identify individuals.

That feature blocks geolocation.onetrust.com, so an exception is raised inside the get_region_of_server function in translators/server.py and VRCT fails to start.

The other URLs (ip.taobao.com, httpbin.org, ip-api.com) were reachable, so I believe geolocation.onetrust.com is the cause.

Also, this error does not occur on every launch: it occurs almost every time during the first one to three launches after booting the PC (or after logging on?), and after that it occurs only rarely or not at all. Launching VRCT with administrator privileges reliably avoided it.

Running software that normally does not need administrator privileges as administrator is not ideal, and my VPN's blocking feature cannot allow only specific sites or domains, so I know this is a selfish request, but it would help a lot if some kind of workaround could be added.
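
The error logs below show the failure path. A minimal sketch of the kind of defensive fallback being requested (a hypothetical helper, not the translators library's actual code; the JSON field name is an assumption):

    import requests

    def get_region_with_fallback(default_region="EN"):
        # Query the same geolocation endpoint the translators library uses,
        # but never let a blocked or unreachable domain prevent startup.
        try:
            response = requests.get(
                "https://geolocation.onetrust.com/cookieconsentpub/v1/geo/location",
                timeout=5,
            )
            return response.json().get("country", default_region)  # "country" key is assumed
        except (requests.RequestException, ValueError):
            return default_region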

Error logs

Error Log 1
  File "socket.py", line 962, in getaddrinfo
socket.gaierror: [Errno 11001] getaddrinfo failed

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "urllib3\connectionpool.py", line 793, in urlopen
  File "urllib3\connectionpool.py", line 491, in _make_request
  File "urllib3\connectionpool.py", line 467, in _make_request
  File "urllib3\connectionpool.py", line 1099, in _validate_conn
  File "urllib3\connection.py", line 616, in connect
  File "urllib3\connection.py", line 205, in _new_conn
urllib3.exceptions.NameResolutionError: <urllib3.connection.HTTPSConnection object at 0x000001C559F71510>: Failed to resolve 'geolocation.onetrust.com' ([Errno 11001] getaddrinfo failed)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "requests\adapters.py", line 486, in send
  File "urllib3\connectionpool.py", line 847, in urlopen
  File "urllib3\util\retry.py", line 515, in increment
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='geolocation.onetrust.com', port=443): Max retries exceeded with url: /cookieconsentpub/v1/geo/location (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x000001C559F71510>: Failed to resolve 'geolocation.onetrust.com' ([Errno 11001] getaddrinfo failed)"))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "translators\server.py", line 318, in get_region_of_server
  File "requests\api.py", line 73, in get
  File "requests\api.py", line 59, in request
  File "requests\sessions.py", line 589, in request
  File "requests\sessions.py", line 703, in send
  File "requests\adapters.py", line 519, in send
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='geolocation.onetrust.com', port=443): Max retries exceeded with url: /cookieconsentpub/v1/geo/location (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x000001C559F71510>: Failed to resolve 'geolocation.onetrust.com' ([Errno 11001] getaddrinfo failed)"))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "translators\server.py", line 326, in get_region_of_server
AttributeError: 'NoneType' object has no attribute 'get'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "main.py", line 25, in <module>
  File "<frozen importlib._bootstrap>", line 1176, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1147, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 690, in _load_unlocked
  File "PyInstaller\loader\pyimod02_importers.py", line 419, in exec_module
  File "controller.py", line 5, in <module>
  File "<frozen importlib._bootstrap>", line 1176, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1147, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 690, in _load_unlocked
  File "PyInstaller\loader\pyimod02_importers.py", line 419, in exec_module
  File "model.py", line 18, in <module>
  File "<frozen importlib._bootstrap>", line 1176, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1147, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 690, in _load_unlocked
  File "PyInstaller\loader\pyimod02_importers.py", line 419, in exec_module
  File "models\translation\translation_translator.py", line 4, in <module>
  File "<frozen importlib._bootstrap>", line 1176, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1147, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 690, in _load_unlocked
  File "PyInstaller\loader\pyimod02_importers.py", line 419, in exec_module
  File "translators\__init__.py", line 5, in <module>
  File "<frozen importlib._bootstrap>", line 1176, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1147, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 690, in _load_unlocked
  File "PyInstaller\loader\pyimod02_importers.py", line 419, in exec_module
  File "translators\server.py", line 5439, in <module>
  File "translators\server.py", line 5169, in __init__
  File "translators\server.py", line 332, in get_region_of_server
RuntimeError: input(): lost sys.stdin

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "main.py", line 32, in <module>
PermissionError: [Errno 13] Permission denied: 'error.log'
Error Log 2
Traceback (most recent call last):
  File "requests\models.py", line 971, in json
  File "json\__init__.py", line 346, in loads
  File "json\decoder.py", line 337, in decode
  File "json\decoder.py", line 355, in raw_decode
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "translators\server.py", line 325, in get_region_of_server
  File "requests\models.py", line 975, in json
requests.exceptions.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "main.py", line 25, in <module>
  File "<frozen importlib._bootstrap>", line 1176, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1147, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 690, in _load_unlocked
  File "PyInstaller\loader\pyimod02_importers.py", line 419, in exec_module
  File "controller.py", line 5, in <module>
  File "<frozen importlib._bootstrap>", line 1176, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1147, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 690, in _load_unlocked
  File "PyInstaller\loader\pyimod02_importers.py", line 419, in exec_module
  File "model.py", line 18, in <module>
  File "<frozen importlib._bootstrap>", line 1176, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1147, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 690, in _load_unlocked
  File "PyInstaller\loader\pyimod02_importers.py", line 419, in exec_module
  File "models\translation\translation_translator.py", line 4, in <module>
  File "<frozen importlib._bootstrap>", line 1176, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1147, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 690, in _load_unlocked
  File "PyInstaller\loader\pyimod02_importers.py", line 419, in exec_module
  File "translators\__init__.py", line 5, in <module>
  File "<frozen importlib._bootstrap>", line 1176, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1147, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 690, in _load_unlocked
  File "PyInstaller\loader\pyimod02_importers.py", line 419, in exec_module
  File "translators\server.py", line 5439, in <module>
  File "translators\server.py", line 5169, in __init__
  File "translators\server.py", line 332, in get_region_of_server
RuntimeError: input(): lost sys.stdin

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "main.py", line 32, in <module>
PermissionError: [Errno 13] Permission denied: 'error.log'

Other

Because the error log is long, I reported this via a GitHub Issue instead of the Google Form.

If reporting via the Google Form is preferred, I will send future reports there.

Improvements

Hello! This is a really cool project! I have some questions and suggestions.
Have you looked into OpenAI's Whisper model? It can run locally and transcribe (and translate) voice! The benefit is that it's free (and probably faster).
Another option for translating text could be the google actions api (which is free for up to 5,000 requests a day).
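
A minimal sketch of local transcription with the open-source whisper package, as one possible approach (hypothetical file name; larger models trade speed for accuracy):

    import whisper

    model = whisper.load_model("base")       # small multilingual model, runs locally
    result = model.transcribe("speech.wav")  # no API key required
    print(result["text"])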

Feature: Hotkeys / Keybinds

(Translated from English)

I don't know whether this has been brought up before or is already planned (this is my first time on GitHub), but I think it would be very convenient to have hotkeys/keybinds to enable/disable specific VRCT features (such as Voice2Chatbox).

Thank you either way, even if this feature isn't added.
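
A minimal sketch of the requested toggle, using the keyboard package as an illustration (hypothetical callback and key combination; VRCT's actual implementation may differ):

    import keyboard

    def toggle_voice2chatbox():
        # Placeholder: flip whatever flag enables/disables Voice2Chatbox.
        print("Voice2Chatbox toggled")

    # Register a global hotkey; the key combination is just an example.
    keyboard.add_hotkey("ctrl+shift+v", toggle_voice2chatbox)
    keyboard.wait()  # keep listening for hotkeys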
