Comments (5)
Hi Alexandro!
The constructor for SpkModel should already be available, but the KaldiRecognizer setter, Recognizer::SetSpkModel,
still needs to be exposed in src/bindings.cc. Once it is, the speaker x-vector should be returned together with the result.
The constructor Recognizer::Recognizer(Model *model, float sample_frequency, SpkModel *spk_model)
should also be exposed for completeness.
Go ahead if you want to give it a try. Otherwise, I will make some time for it next week.
from vosk-browser.
Preparing Builder Environment:
apt update && apt -y upgrade
apt install -y build-essential git sudo screen curl
curl -sSL https://get.docker.com | sh
sudo usermod -aG docker $(whoami)
docker run hello-world
cd $HOME
git clone --recursive https://github.com/ccoreilly/vosk-browser
cd vosk-browser
screen
time make builder
time make binary
Updated files:
src/vosk.d.ts
export declare class Model {
  constructor(path: string);
  public delete(): void;
}

export declare class SpkModel {
  constructor(path: string);
  public delete(): void;
}

export declare class KaldiRecognizer {
  constructor(model: Model, sampleRate: number);
  constructor(model: Model, sampleRate: number, grammar: string);
  constructor(model: Model, sampleRate: number, spkModel: SpkModel);
  public SetSpkModel(spkModel: SpkModel): void;
  public SetWords(words: boolean): void;
  public AcceptWaveform(address: number, length: number): boolean;
  public Result(): string;
  public PartialResult(): string;
  public FinalResult(): string;
  public delete(): void;
}

export declare interface Vosk {
  FS: {
    mkdir: (dirName: string) => void;
    mount: (fs: any, opts: any, path: string) => void;
  };
  MEMFS: Record<string, any>;
  IDBFS: Record<string, any>;
  WORKERFS: Record<string, any>;
  HEAPF32: any;
  downloadAndExtract: (url: string, localPath: string) => void;
  syncFilesystem: (fromPersistent: boolean) => void;
  Model;
  KaldiRecognizer;
  SetLogLevel(level: number): void;
  GetLogLevel(): number;
  _malloc: (size: number) => number;
  _free: (buffer: number) => void;
}

export default function LoadVosk(): Promise<Vosk>;
src/bindings.cc
// Copyright 2020 Denis Treskunov
// Copyright 2021 Ciaran O'Reilly

#include <emscripten/bind.h>
#include "utils.h"
#include "../vosk/src/kaldi_recognizer.h"
#include "../vosk/src/model.h"
#include "../vosk/src/spk_model.h"

using namespace emscripten;

namespace emscripten {
    namespace internal {
        template<> void raw_destructor<Model>(Model* ptr) { /* do nothing */ }
        template<> void raw_destructor<SpkModel>(SpkModel* ptr) { /* do nothing */ }
    }
}

struct ArchiveHelperWrapper : public wrapper<ArchiveHelper> {
    EMSCRIPTEN_WRAPPER(ArchiveHelperWrapper);
    void onsuccess() {
        return call<void>("onsuccess");
    }
    void onerror(const std::string &what) {
        return call<void>("onerror", what);
    }
};

static Model *makeModel(const std::string &model_path) {
    try {
        return new Model(model_path.c_str());
    } catch (std::exception &e) {
        KALDI_ERR << "Exception in Model ctor: " << e.what();
        throw;
    }
}

static SpkModel *makeSpkModel(const std::string &model_path) {
    try {
        return new SpkModel(model_path.c_str());
    } catch (std::exception &e) {
        KALDI_ERR << "Exception in SpkModel ctor: " << e.what();
        throw;
    }
}

static KaldiRecognizer* makeRecognizerWithGrammar(Model *model, float sample_frequency, const std::string &grammar) {
    try {
        KALDI_VLOG(2) << "Creating model with grammar";
        return new KaldiRecognizer(model, sample_frequency, grammar.c_str());
    } catch (std::exception &e) {
        KALDI_ERR << "Exception in KaldiRecognizer ctor: " << e.what();
        throw;
    }
}

static KaldiRecognizer* makeRecognizerWithSpk(Model *model, float sample_frequency, SpkModel *spk_model) {
    try {
        KALDI_VLOG(2) << "Creating model with spk";
        return new KaldiRecognizer(model, sample_frequency, spk_model);
    } catch (std::exception &e) {
        KALDI_ERR << "Exception in KaldiRecognizer ctor: " << e.what();
        throw;
    }
}

static void KaldiRecognizer_SetSpkModel(KaldiRecognizer &self, SpkModel *spk_model) {
    KALDI_VLOG(2) << "Setting SpkModel";
    self.SetSpkModel(spk_model);
}

static void KaldiRecognizer_SetWords(KaldiRecognizer &self, int words) {
    KALDI_VLOG(2) << "Setting words to " << words;
    self.SetWords(words);
}

static bool KaldiRecognizer_AcceptWaveform(KaldiRecognizer &self, long jsHeapAddr, int len) {
    const float *fdata = (const float*) jsHeapAddr;
    KALDI_VLOG(3) << "AcceptWaveform received len=" << len << " 0=" << fdata[0] << " " << len-1 << "=" << fdata[len-1];
    return self.KaldiRecognizer::AcceptWaveform(fdata, len);
}

static string KaldiRecognizer_Result(KaldiRecognizer &self) {
    std::string s;
    s += self.KaldiRecognizer::Result();
    return s;
}

static string KaldiRecognizer_FinalResult(KaldiRecognizer &self) {
    std::string s;
    s += self.KaldiRecognizer::FinalResult();
    return s;
}

static string KaldiRecognizer_PartialResult(KaldiRecognizer &self) {
    std::string s;
    s += self.KaldiRecognizer::PartialResult();
    return s;
}

EMSCRIPTEN_BINDINGS(vosk) {
    class_<ArchiveHelper>("ArchiveHelper")
        .function("Extract", &ArchiveHelper::Extract)
        .allow_subclass<ArchiveHelperWrapper>("ArchiveHelperWrapper")
        .function("onsuccess", optional_override([](ArchiveHelper& self) {
            return self.ArchiveHelper::onsuccess();
        }))
        .function("onerror", optional_override([](ArchiveHelper& self, const std::string &what) {
            return self.ArchiveHelper::onerror(what);
        }))
        ;

    class_<Model>("Model")
        .constructor(&makeModel, allow_raw_pointers())
        ;

    class_<SpkModel>("SpkModel")
        .constructor(&makeSpkModel, allow_raw_pointers())
        ;

    class_<KaldiRecognizer>("KaldiRecognizer")
        .constructor(&makeRecognizerWithGrammar, allow_raw_pointers())
        .constructor<Model *, float>(allow_raw_pointers())
        .constructor(&makeRecognizerWithSpk, allow_raw_pointers())
        .constructor<SpkModel *, float>(allow_raw_pointers())
        .function("SetWords", &KaldiRecognizer_SetWords)
        .function("SetSpkModel", &KaldiRecognizer_SetSpkModel)
        .function("AcceptWaveform", &KaldiRecognizer_AcceptWaveform)
        .function("Result", &KaldiRecognizer_Result)
        .function("FinalResult", &KaldiRecognizer_FinalResult)
        .function("PartialResult", &KaldiRecognizer_PartialResult)
        ;

    emscripten::function("SetLogLevel", &SetVerboseLevel);
    emscripten::function("GetLogLevel", &GetVerboseLevel);
}
Compiling these changes fails with the following errors:
- no matching constructor for initialization of KaldiRecognizer
- static_assert failed due to requirement '!std::is_pointer<SpkModel *>::value' "Implicitly binding raw pointers is illegal. Specify allow_raw_pointer<arg<?>>"
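One possible reading of these two errors, offered as a sketch and not a confirmed fix: the first is consistent with the `.constructor<SpkModel *, float>(allow_raw_pointers())` line, since the only speaker-model constructor quoted in this thread takes `(Model *, float, SpkModel *)`. The second is consistent with `SetSpkModel` being bound without an `allow_raw_pointers()` policy even though its wrapper takes a `SpkModel *`. The classes below are hypothetical stand-ins that mirror only the signatures quoted above, not the Vosk sources, and the binding name `spk_sketch` is made up for illustration. The embind part is guarded so the file also compiles natively.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-ins: they copy only the signatures quoted in this thread.
struct Model {};
struct SpkModel {};

class KaldiRecognizer {
public:
    KaldiRecognizer(Model *model, float sample_frequency)
        : model_(model), sample_frequency_(sample_frequency) {}
    KaldiRecognizer(Model *model, float sample_frequency, SpkModel *spk_model)
        : model_(model), spk_model_(spk_model), sample_frequency_(sample_frequency) {}

    void SetSpkModel(SpkModel *spk_model) { spk_model_ = spk_model; }
    bool HasSpkModel() const { return spk_model_ != nullptr; }
    bool HasModel() const { return model_ != nullptr; }
    float SampleFrequency() const { return sample_frequency_; }

private:
    Model *model_ = nullptr;
    SpkModel *spk_model_ = nullptr;
    float sample_frequency_ = 0.0f;
};

// These signatures exist on the stand-in, so binding them can compile.
static_assert(std::is_constructible<KaldiRecognizer, Model *, float>::value,
              "(Model *, float) is constructible");
static_assert(std::is_constructible<KaldiRecognizer, Model *, float, SpkModel *>::value,
              "(Model *, float, SpkModel *) is constructible");
// No (SpkModel *, float) constructor exists, which matches
// "no matching constructor for initialization of KaldiRecognizer".
static_assert(!std::is_constructible<KaldiRecognizer, SpkModel *, float>::value,
              "(SpkModel *, float) is not constructible");

// Free-function wrapper with the same shape as the one in the post.
void KaldiRecognizer_SetSpkModel(KaldiRecognizer &self, SpkModel *spk_model) {
    self.SetSpkModel(spk_model);
}

#ifdef __EMSCRIPTEN__
#include <emscripten/bind.h>

EMSCRIPTEN_BINDINGS(spk_sketch) {
    using namespace emscripten;
    class_<Model>("Model").constructor<>();
    class_<SpkModel>("SpkModel").constructor<>();
    class_<KaldiRecognizer>("KaldiRecognizer")
        .constructor<Model *, float>(allow_raw_pointers())
        .constructor<Model *, float, SpkModel *>(allow_raw_pointers())
        // The raw-pointer argument needs an explicit policy here as well.
        .function("SetSpkModel", &KaldiRecognizer_SetSpkModel, allow_raw_pointers());
}
#endif
```

Note also that embind overloads only on the number of arguments, not on their types, so registering both the grammar factory and the speaker-model factory as three-argument constructors may still conflict once the build succeeds. Exposing one of them under a different entry point would be one way around that.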
Thank you very much, Ciaran O'Reilly, for your answer.
Hi @arbdevml, sorry for my late reply. I'll check your changes. In the future, it'd be easier if you forked the repository and shared your changes in a branch of your fork. That way, it is pretty straightforward to check it out and test.