picovoice / rhino

On-device Speech-to-Intent engine powered by deep learning

Home Page: https://picovoice.ai/

License: Apache License 2.0

Languages: C 3.60%, Python 16.08%, Java 9.13%, Swift 7.86%, C# 12.01%, Ruby 0.60%, TypeScript 14.12%, Shell 0.49%, Dart 6.94%, Go 10.62%, Rust 10.88%, JavaScript 7.67%
Topics: natural-language-understanding, voice-recognition, nlu, spoken-language-understanding, voice-assistant, voice-ui, voice-user-interface, speech-recognition, voice-commands, voice-control

rhino's Issues

How to design the model?

Or more specifically: what trade-offs are there to consider?

Hi there! Here are some minimal examples to illustrate my questions:

  1. Are two intents "lightsOn" (expression: "turn lights on") and "lightsOff" (expression: "turn lights off") cheaper in terms of performance than one intent "switchLight" with the expression "turn lights $state:state", where the slot "state" has the elements "on" and "off"? (See the sketch after this list.)

  2. How about the equivalent, but less intuitive, option of a single intent "switchLight" with expressions "$dummy:on lights on" and "$dummy:off lights off", with the slot "dummy" having just the one element "turn"? This is admittedly a bad example, but I think the general idea of having an expression put a dummy value into a specifically named slot could come in handy sometimes - unless it's always better to create a separate intent for some reason...

  3. Is it helpful to define sub-slots (e.g. have a slot with all the days and a separate one with just the workdays) and use the more specific one where the other options are not valid? Or should I just use the general slot and filter the invalid results later, in my application, to avoid cluttering the model?

  4. Do I put everything into a single model, or does it make sense to have multiple smaller models and just let Rhino listen for the one that is expected/allowed in the current situation? If neither performance nor colliding expressions are an issue, a single model might be easier to maintain, but it's a bit hard to manage in the Console because you cannot re-order elements (at least not as far as I have seen).
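For concreteness, here is question 1 written out both ways in the Console's YAML context editor (a sketch following the documented $slot:name expression syntax; treat the exact layout as an assumption):

    context:
      expressions:
        # Option A: one intent per phrase
        lightsOn:
          - "turn lights on"
        lightsOff:
          - "turn lights off"
        # Option B: one intent with a slot
        switchLight:
          - "turn lights $state:state"
      slots:
        state:
          - "on"
          - "off"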

And while I'm here, a question regarding licensing: what do you mean by "# Voice Interactions (per month): 1000" on the pricing page? And can I at least switch between devices, since I am allowed just one? Or would it even be acceptable to run the software on multiple computers, as long as they are all my machines located in different rooms of my home? (That might be easier than having to send the data from all microphones to a single instance.)

Rhino Issue: Cannot open Intent Editor in Picovoice Console


Expected behaviour

Clicking on a newly created Rhino context (Empty Template) opens the Intent Editor.

Actual behaviour

The browser displays an empty page, with the following console errors:
TypeError: can't convert null to object
Uncaught (in promise) TypeError: can't convert null to object

Steps to reproduce the behaviour

Create a new Rhino context with the Empty Template and click on it to open the Intent Editor.
(Using the Firefox browser.)

Feature request: demo apps for 64bit ARM64/AARCH64

Hi,
we are using a plain Ubuntu Xenial Linux distro on both an RPi3+ board and a Cortex-A53 custom board, and in both cases we use a 64-bit version of the OS.
Is there any demo available that may run on our platforms?
The Python executable on our systems is a 64-bit binary, hence it fails to load the shared libraries that you provide.
Thanks,

Roberto

Define hardcoded/prefilled slots

Hi,

I'm controlling my Spotify streaming via Picovoice. I have expressions in the form music $musiccommand:command where the command can be something like 'play', 'pause', 'next', 'previous'.

Now some commands need extra parameters, like 'shuffle' (on/off) or 'volume' (double-digit integer). I want to use the same intent for those, and the code for that intent expects the command slot to be there. Therefore, for now, I also define the expressions music $musiccommand:command $pv.SingleDigitInteger:volumelevel and music $musiccommand:command $state:state.

Now I could also say music play off, which does not make sense. So what I would like to do is define expressions like music shuffle:command $state:state or music volume:command $pv.SingleDigitInteger:volumelevel, basically defining hardcoded values for the slots instead of using a list of possible values.

Of course I could define extra slots for shuffle/volume, but the proposed way would make things much easier.
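For reference, the extra-slots workaround mentioned above, sketched in the Console's YAML context format (the layout is an assumption; each single-phrase slot pins the command value):

    context:
      expressions:
        musicControl:
          - "music $musiccommand:command"
          - "music $shufflecommand:command $state:state"
          - "music $volumecommand:command $pv.SingleDigitInteger:volumelevel"
      slots:
        musiccommand:
          - "play"
          - "pause"
          - "next"
          - "previous"
        shufflecommand:
          - "shuffle"
        volumecommand:
          - "volume"
        state:
          - "on"
          - "off"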

Generally, Picovoice with Respeaker on Raspi - awesome!

Define keyword to separate multiple intents in one voice command

Is your feature request related to a problem? Please describe.
Sometimes I want to give multiple commands to Rhino without having to say the wake word again each time.

Describe the solution you'd like
I guess the title says it all: allow the user to define a keyword for a Rhino context that can be used to separate multiple intents in one voice command. It could be a keyword that is then forbidden from being used in any intent or slot, to make it easier.

Describe alternatives you've considered
No idea.

Additional context
As I have audio feedback when processing intents, I currently always have to wait for it to finish before I can say the wake word again, or it would interfere with the voice command.

permanently train Rhino inference models

Is your feature request related to a problem? Please describe.
I am working on permanently installing a voice-controlled component for our research lab, and would like to avoid re-training and re-downloading the Rhino inference model file every 30 days.

Describe the solution you'd like
Is there a way to permanently train the Rhino inference model (so it only needs to be downloaded once)?

Describe alternatives you've considered
If we can't permanently train the Rhino file, I'm wondering if there is a way to integrate a Porcupine wake word with the Picovoice Cheetah speech-to-text engine (again, to avoid re-downloading every month)?


Chaining rhino inferences together

Hi, I'm trying to run a series of chained Rhino inferences triggered by Porcupine. For example:

"Porcupine" -> "turn the lights on in the living room" -> "turn the lights in the living room blue" -> "turn off the lights in the kitchen" -> timeout -> Porcupine listening

I'm struggling with an error in AudioRecord start() with status code -38. My guess is that there are two things trying to use audio at the same time?

Currently, in the RhinoManager callback, depending on the inference results, I set an observable integer that indicates whether Rhino or Porcupine should be running. The chaining works a few times before I get this status code. Any suggestions?
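For comparison, a minimal single-recorder sketch of the same hand-off pattern, written against the current Python packages (pvporcupine, pvrhino, pvrecorder, which take a Picovoice AccessKey; the key and context path are placeholders). Because one recorder feeds both engines, they never contend for the microphone; this version returns to wake-word listening after each inference, and a timeout loop for chained commands would extend it the same way:

    import pvporcupine
    import pvrhino
    from pvrecorder import PvRecorder

    ACCESS_KEY = "..."  # placeholder; obtained from Picovoice Console

    porcupine = pvporcupine.create(access_key=ACCESS_KEY, keywords=["porcupine"])
    rhino = pvrhino.create(access_key=ACCESS_KEY,
                           context_path="smart_lighting.rhn")  # placeholder

    # One recorder shared by both engines (both consume 512-sample frames),
    # so Porcupine and Rhino never try to open the microphone at the same time.
    recorder = PvRecorder(frame_length=porcupine.frame_length)
    recorder.start()

    listening_for_wake_word = True
    try:
        while True:
            pcm = recorder.read()
            if listening_for_wake_word:
                if porcupine.process(pcm) >= 0:
                    listening_for_wake_word = False
            elif rhino.process(pcm):  # True once the inference is finalized
                inference = rhino.get_inference()
                if inference.is_understood:
                    print(inference.intent, inference.slots)
                listening_for_wake_word = True  # hand back to the wake word
    finally:
        recorder.delete()
        porcupine.delete()
        rhino.delete()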

Rhino Issue: not able to find intent

I am running the Picovoice Rhino demo for React Native. I have given it a context that contains 30-40 words; it is a 3-4 line sentence that I inserted as my context.


What is happening is that when I say the exact same sentence that is in the context, it sometimes returns "isUnderstood": false; even when I recite the exact same sentence, it sometimes cannot find the intent. Why is this happening, and how can I overcome this challenge?

Also, can you tell me how Rhino Speech-to-Intent actually works? Is it first converting the speech to text and then using NLU to find the intent?

Rhino C code on Rpi unable to parse the wav file

Hello Alireza,
I am experimenting with Rhino and tried it on an RPi 3. It takes input from the test_within_context.wav file from audio_samples and returns the detected intent. I recorded a few more audio files in the same format and expected Rhino to understand the intents, but it only gave slot/slot-value output for one of the audio files. The other three WAV files are not understood by Rhino, despite being recorded in the same environment as the one that is understood. What might be the error?

Comparing it with Snips: Snips is able to understand my voice commands spoken in the same environment, which makes me believe that my voice commands and background noise should not be the issue.
Please guide.

Thanks!
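One thing worth ruling out here: Rhino consumes 16 kHz, 16-bit, single-channel linear PCM, so a recording in any other format will quietly misbehave. A small standard-library sketch to verify a file before feeding its frames to the engine (the file name is a placeholder):

    import struct
    import wave

    def load_pcm(path):
        # Rhino expects 16 kHz, 16-bit, mono linear PCM.
        with wave.open(path, "rb") as f:
            assert f.getframerate() == 16000, "sample rate must be 16 kHz"
            assert f.getsampwidth() == 2, "samples must be 16-bit"
            assert f.getnchannels() == 1, "audio must be mono"
            frames = f.readframes(f.getnframes())
        return struct.unpack("%dh" % (len(frames) // 2), frames)

    pcm = load_pcm("my_command.wav")  # placeholder path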

Libraries for armhf and arm64

Hi,

I am working on a ROS wrapper for Porcupine and Rhino. The ROS build farm builds Linux packages for the armhf, arm64, and amd64 architectures. For now I am only using this library, which causes the arm64 and armhf builds to fail. What libraries should I use for the armhf and arm64 Linux builds? Or aren't these available?

Thanks!

Rhino Documentation Issue

What is the URL of the doc?

https://github.com/Picovoice/rhino

What's the nature of the issue? (e.g. steps do not work, typos/grammar/spelling, etc., out of date)

The note about self-service here ("self-service. Developers can train custom models using Picovoice Console"), paired with the 30-day expiration shown in the Console, makes it unclear to me what is possible on the free plan. Is the free model only available for download within that 30-day window, or does it actually expire in use every 30 days and need to be retrained and reinstalled? I don't see any options to create custom models outside the Console at this time, so I am not sure if that is a possibility either.

List of intents that the demo supports?

What is the list of intents that the demo files support?

I tried to recreate the working audio file using Audacity, but the only way I could get the demo to match my voice was to include silence at the beginning and the end of the WAV file. Is this expected behaviour?

Tony V
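One way to answer this yourself, assuming the Python binding: a Rhino handle exposes the full source of its context (intents, expressions, and slots) through the context_info property, so printing it lists everything a demo context file supports (the AccessKey and path are placeholders):

    import pvrhino

    rhino = pvrhino.create(
        access_key="...",  # placeholder
        context_path="coffee_maker_linux.rhn")  # placeholder demo context
    print(rhino.context_info)  # dumps the context source: intents, expressions, slots
    rhino.delete()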

[question] Can the context/model be trained on the end device?

Hi, we are thinking of trialling Rhino on our custom Android devices, but can you tell us if it is possible to retrain the models on the devices so they are personalized for the end users, please?
For example, if we wanted to design an intent that would let the user call the contacts on their device, i.e. something like "Call [contact_on_device]", would it be possible to inject all the contact names on the user's device into the model so that it can recognize them?
Thank you.

Using custom wake words and custom contexts

Hello, we just created custom wake words and custom contexts and trained models using our account in the Picovoice Console. We would like to use these instead of the default ones supplied with the Python demo. I see that the following variables need to be customized:

  • rhino_library_path=args.rhino_library_path
  • rhino_model_file_path=args.rhino_model_file_path,
  • rhino_context_file_path=args.rhino_context_file_path,
  • porcupine_library_path=args.porcupine_library_path,
  • porcupine_model_file_path=args.porcupine_model_file_path,
  • porcupine_keyword_file_path=args.porcupine_keyword_file_path

We were able to figure out rhino_library_path, rhino_model_file_path, porcupine_model_file_path, and porcupine_library_path. But what about the keyword file and the context file? How do we get those files for the custom keyword and context we trained?

Thanks
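For what it's worth, the custom keyword downloads from Picovoice Console as a .ppn file and the custom context as a .rhn file, and those supply the two remaining paths. As an illustration with the current picovoice Python package (the file names and AccessKey are placeholders):

    from picovoice import Picovoice

    pv = Picovoice(
        access_key="...",                 # from Picovoice Console
        keyword_path="my_wake_word.ppn",  # custom keyword file (placeholder)
        wake_word_callback=lambda: print("wake word detected"),
        context_path="my_context.rhn",    # custom context file (placeholder)
        inference_callback=print)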

Rhino Issue: React Native using voice to control video playback

Expected behaviour

When triggering play/pause of the video component, Picovoice should continue listening.

Actual behaviour

Currently, after several commands, Rhino encounters an error (call stack available in Xcode):
#6 0x000000010baadc2e in PvRhino.process(handle:pcm:resolver:rejecter:) at /Users/john.bowden/development/rnpvtest/node_modules/@picovoice/rhino-react-native/ios/Rhino.swift:102

Steps to reproduce the behaviour

Init a React Native project with Picovoice and @react-native-community/voice, and use Picovoice with a wake word and intent to toggle playback of a video.

I have a private repo that I can share to reproduce this; I can invite someone if given a GitHub username.

Note: we also have a corporate account, can provide details separately

Add support for React Native

If you are developing a cross-platform mobile application with React Native, please comment on this issue. The more requests we get, the faster we'd support this platform.

We are also looking for technical contributors who can help us expand our SDK support to a variety of platforms. If you are an expert in React Native or any of the other platforms listed, please do not hesitate to reach out to us via the form below:

https://forms.gle/7B58sj87QT8gbwrr8

Add Raspberry Pi support to Go binding


Expected behaviour

The demo should compile.

Actual behaviour

It doesn't compile.
---- OUTPUT
$ go run micdemo/rhino_mic_demo.go

github.com/gen2brain/malgo

In file included from miniaudio.c:4:
miniaudio.h: In function ‘ma_device_data_loop_wakeup__alsa’:
miniaudio.h:20930:9: warning: ignoring return value of ‘write’ declared with attribute ‘warn_unused_result’ [-Wunused-result]
20930 | write(pDevice->alsa.wakeupfdCapture, &t, sizeof(t));
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
miniaudio.h:20933:9: warning: ignoring return value of ‘write’ declared with attribute ‘warn_unused_result’ [-Wunused-result]
20933 | write(pDevice->alsa.wakeupfdPlayback, &t, sizeof(t));
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
miniaudio.h: In function ‘ma_device_wait__alsa’:
miniaudio.h:20760:13: warning: ignoring return value of ‘read’ declared with attribute ‘warn_unused_result’ [-Wunused-result]
20760 | read(pPollDescriptors[0].fd, &t, sizeof(t)); /* <-- Important that we read here so that the next write() does not block. */
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

github.com/Picovoice/rhino/binding/go

/home/ubuntu/go/pkg/mod/github.com/!picovoice/rhino/binding/[email protected]/rhino_posix.go:142:5: could not determine kind of name for C.bool
/home/ubuntu/go/pkg/mod/github.com/!picovoice/rhino/binding/[email protected]/rhino_posix.go:148:12: could not determine kind of name for C.pv_rhino_is_understood_wrapper
/home/ubuntu/go/pkg/mod/github.com/!picovoice/rhino/binding/[email protected]/rhino_posix.go:138:12: could not determine kind of name for C.pv_rhino_process_wrapper
cgo:
gcc errors for preamble:
/home/ubuntu/go/pkg/mod/github.com/!picovoice/rhino/binding/[email protected]/rhino_posix.go:34:67: error: unknown type name 'bool'
34 | typedef int32_t (*pv_rhino_process_func)(void *, const int16_t *, bool *);
| ^~~~
/home/ubuntu/go/pkg/mod/github.com/!picovoice/rhino/binding/[email protected]/rhino_posix.go:36:77: error: unknown type name 'bool'
36 | int32_t pv_rhino_process_wrapper(void *f, void *object, const int16_t *pcm, bool *is_finalized) {
| ^~~~
/home/ubuntu/go/pkg/mod/github.com/!picovoice/rhino/binding/[email protected]/rhino_posix.go:40:62: error: unknown type name 'bool'
40 | typedef int32_t (*pv_rhino_is_understood_func)(const void *, bool *);
| ^~~~
/home/ubuntu/go/pkg/mod/github.com/!picovoice/rhino/binding/[email protected]/rhino_posix.go:42:69: error: unknown type name 'bool'
42 | int32_t pv_rhino_is_understood_wrapper(void *f, const void *object, bool *is_understood) {
| ^~~~

Steps to reproduce the behaviour

Try to compile on a Raspberry Pi 4.


$ uname -a
Linux raspi4 5.8.0-1024-raspi #27-Ubuntu SMP PREEMPT Thu May 6 10:07:12 UTC 2021 aarch64 aarch64 aarch64 GNU/Linux

$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 20.10
Release: 20.10
Codename: groovy

$ go version
go version go1.16.4 linux/arm64

PvArgumentError when running the File Demo command using Node JS

Hi,

NodeJS: v12.13.0
I am using Node JS global packages. I installed: npm install -g @picovoice/rhino-node-demo

When I run the transcription command: rhn-file-demo --context_path resources/contexts/linux/coffee_maker_mac.rhn --input_audio_file_path resources/audio_samples/test_within_context.wav

I get the following error:

throw new PvArgumentError(
^

ReferenceError: PvArgumentError is not defined
at fileDemo (/home/jav/.nvm/versions/node/v12.13.0/lib/node_modules/@picovoice/rhino-node-demo/file.js:63:5)
at Object.<anonymous> (/home/jav/.nvm/versions/node/v12.13.0/lib/node_modules/@picovoice/rhino-node-demo/file.js:132:1)

Thanks in advance

React-native - Continuous processing

Is your feature request related to a problem? Please describe.
At the moment, when using RhinoManager, as soon as an intent is processed, the VoiceProcessor is stopped and we have to launch it again (via manager.process(), with a delay). So to listen continuously, the recording has to start and stop and start again, etc.

Describe the solution you'd like
It would be great to have an option to listen continuously, so that after an intent it does not call await this._voiceProcessor.stop();
It would just be a small option on the process method.

I'm willing to post a PR if needed.

Processing crashes on some commands

For some context: I'm writing a Java program (running on Windows) that combines Porcupine and Rhino, i.e., listens for a wake word and passes on to Rhino to interpret a single command. I tried to test it with the smart lighting demo context and an extended model based on the smart lighting one; both show the same problem.

When I say "Turn light in the kitchen to blue.", the program crashes somewhere inside the Rhino.process(short[] pcm) method and only the message "Process finished with exit code -1073740940 (0xC0000374)". No Java Exception/Error is thrown, so I think it has to be in the native code. The odd part is that similar commands like "Turn light in the kitchen to green." or "Turn light in the bathroom to blue." work fine and are interpreted correctly...

Do you have any hints as to what could cause this issue or is there any more useful information I should provide?

Libpv_rhino.dll - The specified module could not be found.

I am attempting to use Rhino in custom Python code on Windows. I can successfully run rhino_demo_mic.py, but when trying a minimal custom example:

    _rhino_library_path = "bin/pvrhino/libpv_rhino.dll"
    _rhino_model_file_path = "bin/pvrhino/rhino_params.pv"
    turn_on_context_file_path = "bin/pvrhino/model_2020-08-22_v1.5.0.rhn"

    toggle_display = Rhino(
        library_path=_rhino_library_path,
        model_path=_rhino_model_file_path,
        context_path=turn_on_context_file_path)

It throws the following error:

  File "C:\Projects\xxx\xxx\xxx\lib\Rhino\rhino.py", line 61, in __init__
    library = cdll.LoadLibrary(library_path)
  File "C:\Users\xxx\AppData\Local\Programs\Python\Python37\lib\ctypes\__init__.py", line 442, in LoadLibrary
    return self._dlltype(name)
  File "C:\Users\xxx\AppData\Local\Programs\Python\Python37\lib\ctypes\__init__.py", line 364, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: [WinError 126] The specified module could not be found
bin/pvrhino/libpv_rhino.dll

Process finished with exit code 1

It is getting past Rhino's check for the file's existence:

        if not os.path.exists(library_path):
            raise IOError("couldn't find Rhino's library at '%s'" % library_path)

But it fails immediately after that.

I have copied the rhino_params.pv and libpv_rhino.dll files from the repository to make sure that I am fully up to date and using the same files that were used during the rhino_demo_mic.py demo.

Any suggestions would be appreciated.
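Worth noting: [WinError 126] for a file that os.path.exists confirms usually points at either a dependent DLL that cannot be found or a relative path resolved against an unexpected directory. A hedged first step (a sketch, not a confirmed fix for this report) is to make every path absolute, anchored to the script rather than the working directory:

    import os

    _base = os.path.dirname(os.path.abspath(__file__))

    # Resolve everything relative to this script, not the working directory.
    _rhino_library_path = os.path.join(_base, "bin", "pvrhino", "libpv_rhino.dll")
    _rhino_model_file_path = os.path.join(_base, "bin", "pvrhino", "rhino_params.pv")
    turn_on_context_file_path = os.path.join(
        _base, "bin", "pvrhino", "model_2020-08-22_v1.5.0.rhn")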

.NET demo has missing dependency

Hello,

I tried to run your .NET demo on Win10 v1909 FR.

dotnet run -c MicDemo.Release -- --context_path "C:\Users\Username\Desktop\model.rhn"

It crashes with

Listening
System.DllNotFoundException: Could not load the dll 'openal32.dll' (this load is intercepted, specified in DllImport as 'AL').
   at OpenTK.Audio.OpenAL.ALLoader.ImportResolver(String libraryName, Assembly assembly, Nullable1 searchPath)
   at System.Runtime.InteropServices.NativeLibrary.LoadLibraryCallbackStub(String libraryName, Assembly assembly, Boolean hasDllImportSearchPathFlags, UInt32 dllImportSearchPathFlags)
   at OpenTK.Audio.OpenAL.ALC.CaptureOpenDevice(String devicename, Int32 frequency, ALFormat format, Int32 buffersize)
   at RhinoDemo.MicDemo.RunDemo(String contextPath, String modelPath, Single sensitivity, Nullable1 audioDeviceIndex, String outputPath) in  E:\rhino\demo\dotnet\RhinoDemo\MicDemo.cs:line 83
   at RhinoDemo.MicDemo.Main(String[] args) in E:\rhino\demo\dotnet\RhinoDemo\MicDemo.cs:line 296

Seems related to opentk/opentk#1169

non-English support for STM32 on GitHub

I'm a newbie in the Picovoice world and I don't know if this is the right place to ask a question, because I didn't find a forum for Picovoice.
Anyway, I created a Personal account to get started with the Picovoice solution on an STM32 device (the available demo) and it looks very cool.
My issue is that when I select the French language, I get this error:
[ERROR] context file belongs to 'fr' while model file belongs to 'en'
I had already selected the French language in the Rhino interface before generating the model!
Does the Personal account support languages other than English?
If yes, what is the issue? How can I test French with the current STM32 demo?
Thank you.

Rhino Issue: issue when French language is used

Hello,

I got a notification that different languages are now supported in rev 2.0.0, which is very nice (issue #160). Thank you a lot for that.
However, when I tried to do a test with the French language on an STM32F769-Disco board, I got the following error:
[ERROR] Keyword files (.PPN) and model file (.PV) should belong to the same language. Keyword files belongs to 'fr' while model file belongs to 'en'.
This is what I did:

  • In the Console, in the Language menu, I selected the French language.
  • Created a new context.
  • Filled the context with different commands in French.
  • Saved and trained the model.
  • Tested the commands with the PC microphone in the Console: the commands are detected perfectly.
  • Downloaded the latest version of Picovoice from GitHub and, in the demo under \demo\mcu\stm32f769\stm32f769i-disco, copied the newly generated context array into CONTEXT_ARRAY in pv_params.h, in the French section.
  • Activated the French language in the project preprocessor by replacing the flag PV_LANGUAGE_ENGLISH with PV_LANGUAGE_FRENCH, then compiled the project.
  • Loaded the project, and at the beginning, in the trace, I got the error above!

Am I missing something?
Thank you for your help.

Set explicit slot phrase values

Hi,

I want to keep the code that processes the intents as generic as possible. As described in #61, I am controlling Spotify via its API. Right now I have commands like music play or music pause, and I use spotifyApi[intent.slots.command] to call the corresponding method on the Spotify API. However, the command to skip to the next song is skipToNext, while in the voice command I'd like to say music next.

Would it be possible to configure a slot phrase value for each phrase defined in a slot? In the best case, I could even define whether it is a string, a boolean, or a number (e.g. for the shuffle command, where I'd like to convert "on" to true and "off" to false; the same is true for volume levels, e.g. music volume low or similar, which I'd like to translate into an integer). If a separate slot phrase value were available, it could then be exposed as intent.slot_values.xyz. That way, I could have slot names like param_1 and param_2 that I automatically pass into the Spotify API functions when present in an intent.

The advantage is that I'd only have to change the Rhino config in the Console, and not touch my intent-processing code, to add a new Spotify API voice call. Hope that makes sense.

Jonas
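Until something like slot_values exists in the Console, one app-side stopgap is a per-slot lookup table applied after Rhino returns an inference; a sketch (all names and mappings illustrative):

    # Hypothetical phrase -> value tables, applied to Rhino's returned slots.
    SLOT_VALUE_MAP = {
        "command": {"next": "skipToNext", "previous": "skipToPrevious",
                    "play": "play", "pause": "pause"},
        "state": {"on": True, "off": False},
        "volumelevel": int,  # a callable converts the raw phrase, e.g. "7" -> 7
    }

    def resolve_slots(slots):
        resolved = {}
        for name, phrase in slots.items():
            mapper = SLOT_VALUE_MAP.get(name)
            if callable(mapper):
                resolved[name] = mapper(phrase)       # numeric conversion
            elif mapper is not None:
                resolved[name] = mapper.get(phrase, phrase)
            else:
                resolved[name] = phrase               # pass through unmapped slots
        return resolved

    # resolve_slots({"command": "next", "state": "on"})
    # -> {"command": "skipToNext", "state": True}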

Possible to have a generic slot?

Hi, I'm very interested in using this package for a Flutter app I'm working on.

I'm wondering if it is possible to have a slot that works somewhat like a wildcard. For example, a "food" slot is naturally difficult to make as listing all foods in a slot isn't really feasible.

I tried to use the built-in $pv.Alphabetic slot but that seems to pick up individual letters instead of whole words.

Some examples to demonstrate exactly what I'd like to do:

How many calories are in $pv.TwoDigitInteger:quantity servings of $food:food?
What pairs well with $food:food_a and $food:food_b?

date and time as built-in slots

Hi,

I am using Rhino to create a model for scheduling a helper. Based on a task and a timing, it should extract what that task is, plus the date and time, as intents. I am having trouble with this and cannot find any resources online to guide me through adding time and date as slots, which would normally be regular expressions. Any advice/guidance?

Thanks
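As far as I can tell there are no built-in date or time slots, so contexts typically compose them from the numeric built-ins (like the $pv.TwoDigitInteger used elsewhere in these issues) plus small custom slots. A rough, unverified sketch of the idea in the Console's YAML format (slot names and phrases are assumptions):

    context:
      expressions:
        scheduleTask:
          - "schedule $task:task on $day:day at $pv.TwoDigitInteger:hour $meridiem:meridiem"
      slots:
        task:
          - "cleaning"
          - "watering"
        day:
          - "Monday"
          - "Tuesday"
          - "Wednesday"
        meridiem:
          - "a m"
          - "p m"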

Model used

What is the deep learning model used in the examples? I'm willing to run the C example on a SparkFun Edge, which is powered by an Ambiq Apollo3 Blue chip (featuring a Cortex-M4 MCU), but I'm not sure it would work right away.
However, I thought I could train the model myself and deploy it on my SparkFun Edge if I knew the model's architecture and the dataset it was trained on. Actually, if you trained it using TensorFlow, a model checkpoint would be sufficient, since the SparkFun Edge is meant to run TensorFlow Lite models.

Rhino Issue: Underscores not supported in slot names

As a way to overcome #133, I've built a YAML preprocessor that enables optional slots and adds a syntax to define aliases for slot values.

I noticed that slot names are very restrictive; they do not even allow underscores.

Expected behaviour

Allow underscores in slot names

Actual behaviour

"Slot command__close regex /^([A-Za-z]){1,32}$/"

Steps to reproduce the behaviour

Use underscore in slot name

Edit: maybe this is not a bug, but I don't know how to change the label now.

Where/How does rhino do silence detection

Hello,

I'm playing with the demos and I'm very impressed so far. One thing I am missing, though, is how to do silence / break-in-speech detection. You must be doing this in the Rhino coffee machine demo?

Thanks

Rhino using French on Flutter

Dear Picovoice team,

I am trying to run Picovoice in French, but I am having trouble loading a French custom model (created using the Console).
I get an "error loading model" without further info. Should I add a French context somewhere?
I only added the .rhn model to my assets.

pico_voice % flutter doctor
Doctor summary (to see all details, run flutter doctor -v):
[✓] Flutter (Channel stable, 2.2.0, on macOS 11.2.3 20D91 darwin-x64, locale fr-FR)
[✓] Android toolchain - develop for Android devices (Android SDK version 30.0.3)
[✓] Xcode - develop for iOS and macOS
[✓] Chrome - develop for the web
[✓] Android Studio (version 4.1)
[✓] Connected device (2 available)

• No issues found!

An Observatory debugger and profiler on iPhone 12 Pro Max is available at: http://127.0.0.1:54807/P1AkUQBJ8Oc=/
flutter: Failed to initialize Rhino: Failed to initialize Rhino.
The Flutter DevTools debugger and profiler on iPhone 12 Pro Max is available at:
http://127.0.0.1:9101?uri=http%3A%2F%2F127.0.0.1%3A54807%2FP1AkUQBJ8Oc%3D%2F
[VERBOSE-2:ui_dart_state.cc(199)] Unhandled Exception: LateInitializationError: Field '_rhinoManager@606092562' has not been initialized.

Problem while invoking Rhino from a foreground service

Hi Picovoice,
I am raising again an issue originally described in #153, as it is vital for my application.

Expected behaviour

I am using the Picovoice SDK and the Rhino SDK for Android. The Picovoice SDK works very well for me in a foreground service, and I use the Porcupine wake-word engine to invoke Rhino.

But I also need to use the Rhino SDK alone to implement a dialogue in a foreground service.
The scenario looks like this:

  1. Start service with Picovoice SDK and send the app in the background (or lock the phone)
  2. Use Porcupine to start recording audio
  3. Speak and get the intent back from Rhino
  4. Based on the intent coming from Rhino, I want to invoke Rhino again (programmatically, without the wake word, using rhinoManager.process()) so the user can speak again and give a second command/instruction

Actual behaviour

  1. I can successfully start service with Picovoice SDK and send the app in the background (lock the phone)
  2. I can successfully use Porcupine to start recording audio
  3. I can successfully speak and get the intent back from Rhino
  4. After the intent is returned from the Picovoice SDK, I am not able to invoke Rhino alone so that the user can speak again and give a second command/instruction (without using the wake-word engine), and I get the error below:

E/IAudioFlinger: createRecord returned error -1
E/AudioRecord: createRecord_l(1225279496): AudioFlinger could not create record track, status: -1
E/AudioRecord-JNI: Error creating AudioRecord instance: initialization check failed with status -1.
E/android.media.AudioRecord: Error code -20 when initializing native AudioRecord object.

Steps to reproduce the behaviour

The easiest way to reproduce would be to use my scenario above, or you can also try the steps below:

  1. Create an Android foreground service with a Rhino instance
  2. Set a timer (2 minutes, for example); after the time elapses, the Rhino engine should be invoked automatically while the app is still in the background (using rhinoManager.process())
  3. Send the app that contains the service in the background (or lock the phone)
  4. After 2 minutes the Rhino engine should be invoked; the user speaks and Rhino returns the intent

Important note: the Rhino engine works when the app is not in the background; I get the error only when the app is running in the background or the phone is locked.

Hope this explains my problem. Let me know if you need more info.

Many thanks
Adhurim


Train more Expressions

How can I train many expressions in an intent? In the docs, just one expression is trained by pressing the mic button and talking; but when an intent has more expressions, what should I say?

Optional slots?

Is your feature request related to a problem? Please describe.
I'm super impressed by your product. Really amazing offering, and great experience.

My question: is it possible to make a slot optional for matching, i.e. allow the expression to be matched even if the slot was not matched? My example:

[open, extend, slide out] (the, a) $zone:zone (hydraulic, hydraulics, slide out, slide outs, room, section, slide, slides, cylinder, cylinders, areas, areas) (completely, entirely, fully, to the max, hundred percent, to the limit, as much as possible)

So I have only one meaningful slot here, $zone, but I also have sets of words that I don't care much about, although they can appear in the command. Because I have a bunch of intents and commands reusing those sets, I'd love to turn them into slots for reuse, while basically ignoring them later. The problem is that I'd like the command to work even if they are not matched.

Describe the solution you'd like
($zone:zone)-type syntax for optional slots

Describe alternatives you've considered
I can use multiple separate commands with or without slots, but there's combinatorial complexity in supporting them all. I can also keep listing the words as-is, which makes commands hard to parse visually.
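As a stopgap for that combinatorial blow-up, the variants can be generated offline and pasted into the Console. A sketch that expands a hypothetical "(?$slot:name)" optional-slot marker (my own notation, not Rhino syntax) into plain expressions:

    from itertools import product
    import re

    # Hypothetical pre-processor notation: "(?$slot:name)" marks an optional
    # slot reference, which Rhino's own grammar does not support. We expand it
    # into plain expression variants that the Console will accept.
    OPTIONAL = re.compile(r"\(\?(\$[\w.]+:\w+)\)")

    def expand(expression):
        slots = OPTIONAL.findall(expression)
        for keep in product((True, False), repeat=len(slots)):
            expr = expression
            for slot_ref, keep_it in zip(slots, keep):
                expr = expr.replace("(?%s)" % slot_ref,
                                    slot_ref if keep_it else "", 1)
            yield " ".join(expr.split())  # collapse doubled spaces

    for variant in expand("[open, extend] (the, a) (?$zone:zone) [slide, slides]"):
        print(variant)
    # [open, extend] (the, a) $zone:zone [slide, slides]
    # [open, extend] (the, a) [slide, slides]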

Rhino Issue:


Expected behaviour

I expect to build iOS project with the custom Rhino context file.

Actual behaviour

The project cannot find my custom Rhino file, and the Terminal states that it cannot find the model file.
The error messages are shown below:

  • porcupineError(Porcupine.PorcupineError.invalidArgument(message: "Model file at does not exist at '' "))

Steps to reproduce the behaviour

Please guide me on how to solve this issue.


Rhino Issue: Audio error when running on a background service

Hi Picovoice,

I have been playing with Porcupine and Rhino on Android and recently ran into a situation where Rhino shows an error when the app runs in the background (foreground service). It works well when the app is not in the background.
Any idea if I am doing something wrong, or is this a bug?

This is the message it shows:
E/IAudioFlinger: createRecord returned error -1
E/AudioRecord: createRecord_l(1225279496): AudioFlinger could not create record track, status: -1
E/AudioRecord-JNI: Error creating AudioRecord instance: initialization check failed with status -1.
E/android.media.AudioRecord: Error code -20 when initializing native AudioRecord object.

handle non-ASCII chars when returning inference results.

Hello,

After testing the French language: the detection works fine, but the string returned for slots is "tricky" when the word contains characters such as 'é', 'è', etc.
For example, for the word "éteindre", the returned slot string contains the non-ASCII characters, which is not user-friendly!
I don't know how you manage this; I propose replacing any letter with a diacritic by its basic Latin letter, for example:
'é', 'è' -> e
'à' -> a
etc.

PS: I tried to put the text "eteindre" instead of "éteindre" in the Rhino console, but the dictionary rejected the first one :-(
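If the returned bytes are valid UTF-8 that merely get decoded or displayed with the wrong codec (a guess from the description, not confirmed), the string can be repaired app-side rather than changing the engine; otherwise, the accent-stripping proposed above is also easy to do client-side. A sketch of both:

    import unicodedata

    def fix_mojibake(s):
        """Repair a UTF-8 string that was mis-decoded as Latin-1 ('Ã©' -> 'é')."""
        try:
            return s.encode("latin-1").decode("utf-8")
        except (UnicodeEncodeError, UnicodeDecodeError):
            return s  # already clean

    def strip_accents(s):
        """The mapping the issue proposes: 'éteindre' -> 'eteindre'."""
        return "".join(c for c in unicodedata.normalize("NFD", s)
                       if not unicodedata.combining(c))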
