
https://travis-ci.com/dictation-toolbox/aenea.svg?branch=master

Aenea

A system to allow speech recognition via Dragonfly on one computer to send events to another.


With many thanks to Tavis Rudd for showing us it was practical for coding...
...and to Joel Gould for NatLink, making it possible...
...and to the current maintainers of Natlink: Rudiger Wilke, Mark Lillibridge, and Quintijn Hoogenboom...
...and to Christo Butcher and Dragonfly for making it easy...
...and to Nuance for being so awesomely hack friendly...
(even if it means we have to write the grammars ourselves :-) )

Summary

Aenea is a project to allow Dragonfly (a Python-based voice macro system for Windows) to send commands to another computer. Typically, this is used to run Dragonfly in a virtual machine while commands are sent to the host operating system. Currently only Linux and OS X hosts are fully supported, but a working subset of the functionality is available for Windows. The primary audience is system administrators and software engineers.

Current Features:

  • Control keyboard and mouse on one computer using your voice on another. (Supports proxy versions of Dragonfly's Key, Text, and Mouse commands)
  • Enable or disable voice commands based on the currently active window on the host. (Supports flexible proxy versions of Dragonfly's AppContext)
  • Access to Dragonfly's powerful grammar specification, using Dragon NaturallySpeaking (via Natlink) or Windows Speech Recognition (not tested).
  • Many Dragonfly grammars will work with only minor modification via proxy.
  • Dictate prose directly into any remote application via emulated keystrokes using the keystroke capture client.
  • Easily add custom Python RPCs run on the server (Linux host) that will be available from your grammars (see server/linux_x11/plugins and client/_server_plugin_example.py)
  • More of a toolkit to voice-enable your current workflow than an off the shelf development environment.

Primary limitations:

  • Limited ability to take advantage of context when dictating prose.
  • Lacks the ability to use context-dependent editing commands (select that, etc.)
  • Requires some knowledge of Python programming to edit or create new grammars, and in some cases to customize.
  • Relies on Windows and Dragon NaturallySpeaking, which are neither free nor gratis.
  • Somewhat complex to set up, depending on your background.

Missing features:

  • Currently no encryption or authentication for the remote control protocol (not a huge issue, since it is typically used on single-user systems via loopback).
  • Currently only fully supports a Linux X11 and OS X host. Partial support for Windows is available.

The primary focus of this project is writing code, system administration, terminal use, etc, and it works quite well for those tasks. For writing prose, word processing, etc., this project is quite limited compared to using Dragon natively on Windows, though it is still usable for those tasks.

Current Status

This project currently does everything I need well enough, and no new features are currently planned. I remain astounded by what Nuance has made possible via Natlink, and the flexibility and power of Dragonfly. I'm happy to help with troubles getting set up and take a look at bugs you may encounter, and still welcome pull requests.

Overview

The system consists of a client, Dragon NaturallySpeaking with Natlink and Dragonfly and any voice grammars the user wishes to use running on a Windows virtual machine, and a server, running on the host computer. The client listens on a microphone, recognizes commands based on what you say and the current context on the server, and then sends commands to the server via JSON-RPC to perform actions such as pressing keys and clicking the mouse.
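The wire protocol is plain JSON-RPC. As a rough sketch of the kind of request the client sends (the method name and parameters here are illustrative, not the exact RPC signatures; see the server sources for the real ones):

```python
import json

def make_jsonrpc_request(method, params, request_id=1):
    """Build a JSON-RPC request body of the general shape the client
    sends to the server, e.g. to press a key or click the mouse."""
    return json.dumps({
        "jsonrpc": "2.0",
        "method": method,
        "params": params,
        "id": request_id,
    })

# Hypothetical example: ask the server to press the "a" key once.
request = make_jsonrpc_request("key_press", {"key": "a", "count": 1})
```

In the real system, jsonrpclib handles this serialization for you; the point is only that every voice command ultimately becomes a small JSON message sent from the VM to the host.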

Aenea provides proxy versions of Dragonfly actions such as Key, Text, and Mouse (ProxyKey, ProxyText, ProxyMouse), which take the same specification language as Dragonfly's actions but forward the action to the host for execution. There are also wrapper classes that check whether the proxy is enabled and delegate execution either to Dragonfly (locally) or to the server, as well as wrappers that take different actions depending on whether the proxy is enabled and/or which OS is running on the server.

Getting Started

Windows VM Software (versions given are ones I used, others likely work too):

Some notes:

  • NatLink has some problems on Windows 10 64-bit related to the msvcr100.dll file. It is unclear whether the 64-bit architecture or Windows 10 itself causes the problem, so it is recommended that you use the version of Windows mentioned above.
  • If you have problems installing NatLink, this page may help: http://qh.antenna.nl/unimacro/installation/problemswithinstallation.html
  • A previous version of these instructions recommended python-jsonrpclib-0.1.3, but I ran into a bug in it that was fixed in a later version.

Setup Instructions

Note: poppe1219 has other installation instructions at https://github.com/dictation-toolbox/dragonfly-scripts; take a look if you're having trouble getting everything working in the VM.

Operating system, Dragon, Natlink, and Dragonfly

  1. Install VirtualBox.
  2. Install Windows. It works well with 1 GB of RAM, two processors, and a 35 GB dynamically-sized hard disk, of which it uses about 17-20 GB. (You can increase the RAM to speed up the installation process, and then lower it later to spare system resources.) While it's installing, I suggest you skim the Dragonfly documentation at http://dragonfly2.readthedocs.org/en/latest/
  3. Install Dragon, and create your profile according to their directions. IMPORTANT: Ensure that you select BestMatchIV when creating your profile. Recent versions of Dragon default to BestMatchV, which has substantially worse performance with the sorts of grammars we will be using with Dragonfly.

Note: I and others have had problems when creating a profile: only a few seconds into the volume check, a pop-up appears complaining about the microphone. To get around this, I memorized the text and continued reading, clicking OK on the dialog as soon as it appeared. I had to read the text seven or eight times in an unnaturally loud voice to get past this step; you may have to try a few times. I believe this may be a side effect of the USB microphone being routed through the virtual machine, so you might consider creating your profile on a native Windows installation and then moving it over; however, I have not tried this. I also had trouble getting past the microphone quality check, but everything worked fine after that.

  4. Install the other software mentioned above, and enable Natlink (by selecting GUI configuration from its start menu entry with Dragon closed). Make sure you install Python and Dragonfly into paths with no spaces in them.
  5. In VirtualBox's networking settings, set the network to host-only adapter so the VM can't access the network and gets its own subnet. If you don't do this, you will need to modify the client and server config files to specify the correct interface to connect to.
  6. When you now start Dragon, a second small window titled Messages from NatLink should pop up. If you have issues with this, take a look at the various forums that discuss using NatLink/Dragonfly on Windows.
  7. You should now be able to run Natlink and Dragonfly grammars in the VM. Grammars are, by default, located in C:\\NatLink\\NatLink\\MacroSystem. NatLink will load any file named _*.py (where * is a wildcard). If your grammars depend on libraries, you can place them here as well (with names not starting with an _); your grammars will be able to import them, but NatLink will not attempt to load them directly.
  8. Test that NatLink is working correctly. Copy aenea/client/_hello_world_natlink.py to C:\\NatLink\\NatLink\\MacroSystem and restart Dragon. In the Messages from NatLink window, you should see NatLink hello world module successfully loaded, and the text All it does is print this message :-) should be typed out into Notepad. This means that NatLink successfully loaded your grammar. You can now delete the file you just copied into C:\\NatLink\\NatLink\\MacroSystem along with its corresponding .pyc file.
  9. Copy aenea/client/_hello_world_dragonfly.py into the MacroSystem folder, and turn your microphone off and on again. Now open Notepad (or similar) and say test hello world grammar. The phrase Hello world grammar: recognition successful! should be typed into the active window. (If you are curious to see how it works, open the aenea/client/_hello_world_dragonfly.py file to have a look - this will be good preparation for your future grammar-writing career :P). If this doesn't work, try switching Dragon to command mode first. If it still doesn't work, try restarting Dragon. If it still doesn't work, then there is an issue with the setup of Dragon/NatLink/Dragonfly. Once the recognition-successful text has been typed out into Notepad, you can delete the file you just copied into C:\\NatLink\\NatLink\\MacroSystem along with its corresponding .pyc file.
  10. You're ready to move on to real grammars in the next section! Jump to the server section that corresponds to your host operating system.

Server (Linux X11)

  1. Go to aenea/server/linux_x11
  2. Copy config.py.example to config.py. Edit to suit. The default assumes you are using a host-only adapter for the VM, which is NOT VirtualBox's default network setting. Note that the HOST/PORT here must work with those specified in the client-side config (in most cases they will need to be identical).
  3. Install the dependencies. Versions I used are in parentheses for reference; you probably don't need these exact versions for it to work. Install jsonrpclib (0.1.7), xdotool (3.20140213.1), xprop (1.2.3), xsel (1.2.0; optional but recommended), and yapsy (1.10.223-1; optional, but recommended if you want server-side plugin support). Some window managers (e.g., xmonad) may require you to enable extended window manager hints for get_context to work properly. On Awesome, it works out of the box.
  4. Edit the server's config.py to specify the host and port it should listen on.
  5. Run server_x11.py. Specify -d if you want it to daemonize; default is to run in foreground.
  6. In a separate terminal (or the same one if you daemonized), cd to the linux_x11 dir and run test_client.py. This should type out some text like AABB and a dict describing the context of your terminal, move the mouse around, right click and drag, etc, to test it's all working. I tried not to make it too invasive but just in case, best not have anything you care about on screen! If this works, then the server is operational and accepting commands from clients. No point trying to get it to work with Dragon and the VM until it can accept local commands!

Server (Windows)

Windows server by @grayjay

Note that the Windows server only supports a subset of the commands (key_press, write_text, and pause; get_context currently only returns the title of the foreground window as "title" and the title of the foreground window's ancestor as "name").

Installation:

  • Install the Haskell Platform for Windows from http://www.haskell.org/platform.
  • Run the command cabal update.
  • Run cabal install in the folder aenea\WindowsServer\aenea-windows-server to create aenea.exe in cabal's bin folder.
  • aenea.exe takes a required --security-token flag. It also takes optional flags specifying the IP address and port. These should match those on C:\\NatLink\\NatLink\\MacroSystem\\aenea.json.
  • Set use_multiple_actions to false in aenea.json.

Server (OS X)

Install:
  • python2
  • pip install pyobjc (this is required for py-applescript and will take a while. no, a really, really long while)
  • pip install py-applescript

Enable access for assistive devices in your system preferences

Aenea client-side library

At this point, the folder C:\\NatLink\\NatLink\\MacroSystem should contain a folder named core (which would have been created after installing and enabling Natlink).

  1. Close Dragon and then copy aenea/client/aenea into C:\\NatLink\\NatLink\\MacroSystem.
  2. Copy aenea/aenea.json.example to aenea/aenea.json and edit to suit.
  3. Copy aenea/aenea.json into C:\\NatLink\\NatLink\\MacroSystem.

2a) Optional Step: For aenea itself you have a choice -- you can either store its state and configuration files (these are used for keeping track of which dynamic vocabularies are currently active, which server to send commands to, etc.) in C:\\NatLink\\NatLink\\MacroSystem, or you can store them elsewhere. If you store them in MacroSystem, just edit aenea.json to suit and you're done. If you want to store them elsewhere (I put them on a shared folder mounted as the E drive so I can manage them from the host), then delete all the lines except project_root, and set its value to whatever directory you want to manage the config from. Then, in that directory, copy the full aenea.json.example and edit to taste. Basically, on startup we first load C:\\NatLink\\NatLink\\MacroSystem\\aenea.json (hardcoded); then, if the project_root specified is another directory, we load aenea.json from that directory, overwriting any settings, and repeat until an aenea.json specifies its own path (a cycle is an error). All other config files are relative to the project_root.
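The chained loading described in 2a can be sketched as follows (a simplified model, not Aenea's actual code; the read_json hook is introduced here purely so the logic can be exercised without files on disk):

```python
import json
import os

def load_chained_config(start_dir, read_json=None):
    """Follow project_root pointers: load aenea.json from start_dir,
    then from each successive project_root, overwriting settings,
    until a config points at its own directory. Revisiting a
    directory means a cycle, which is an error."""
    if read_json is None:
        def read_json(d):
            # Default behavior: read <dir>/aenea.json from disk.
            with open(os.path.join(d, "aenea.json")) as f:
                return json.load(f)
    settings = {}
    seen = set()
    current = start_dir
    while True:
        if current in seen:
            raise ValueError("project_root cycle detected at %r" % current)
        seen.add(current)
        settings.update(read_json(current))
        nxt = settings.get("project_root", current)
        if nxt == current:
            return settings
        current = nxt
```

Later configs in the chain override earlier ones, which is what lets the hardcoded MacroSystem copy stay tiny while the real configuration lives on a shared folder.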

2b) If not using VirtualBox host only adapter as described above, you will need to set the host and port to the correct settings in all of the aenea.json files.

  4. Copy aenea/client/_hello_world_aenea.py into C:\\NatLink\\NatLink\\MacroSystem, and restart Dragon. Now try saying test hello world remote grammar. The text Aenea remote setup operational should be typed through the server into whatever window is in the foreground (unless it is the VM itself). The server will also print updates for every command received and executed, to aid in debugging setup issues. If it doesn't work, check the NatLink window for backtraces as well. Note that the JSON-RPC library will serialize and return Python exceptions from the server to print in the NatLink window, so a backtrace in that window can be from either the client or the server.
  5. If all's well, delete _hello_world_aenea.py from MacroSystem.

Built-In Optional Modules

While optional, Aenea comes with two very useful modules.

_aenea.py allows you to dynamically switch between local (i.e., in the VM) and remote (i.e., send to server) execution, as well as change which server commands are sent to (if you're using several different computers). It will also print useful information, such as the current networking settings, when the module is loaded. To install, just copy client/_aenea.py into the MacroSystem directory. It is configured in ROOT\\grammar_config\\aenea.json, where you can rebind commands and add or remove servers to connect to. It reads and writes ROOT\\server_state.json to keep track of which server is currently active.

_vocabulary.py is used by most of my grammars, and allows multiple grammars to make use of the same set of vocabulary. (For example, one may want access to Python vocabulary both in a VIM grammar and a generic edit grammar). It makes use of ROOT\\vocabulary_config. ROOT\\vocabulary_config\\static contains vocabularies that are always enabled, and ROOT\\vocabulary_config\\dynamic contains vocabularies that may be switched on and off by the user at will. ROOT\\vocabulary_config\\enabled.json (read and written) keeps track of the current state of dynamic vocabularies. You can rebind the commands used to control vocabulary in ROOT\\grammar_config\\vocabulary.json. To install, just copy client/_vocabulary.py into the MacroSystem dir.

Aenea Dictation Client (optional)

Also available is a dictation capture client that @poppe1219 wrote. This is simply a window that captures all keystrokes typed into it and relays them to the Linux host. If you disable Dragon's dictation box, you can dictate in Dragon's normal mode with the capture client in the foreground in Windows. Dragon will then type into the client, which will send the keystrokes to the server. You can still use grammars with the client in the foreground. To use it, just copy client/aenea_client.py to MacroSystem and run it. By default, all grammars will only work when the client is in the foreground. You can change this behavior in aenea.json by setting restrict_proxy_to_aenea_client to false.

Aenea Recognition Results Plugin (optional, for linux_x11)

While using Aenea, the results of the recognition can be viewed by inspecting the server's output in a terminal. However, this requires constant task-switching. The Recognition Results plugin displays the latest recognized phrase in a small GUI that stays on top of all the windows.

https://raw.githubusercontent.com/facundoq/aenea_recognition_results/master/demo_small.gif

The Recognition Results plugin requires Tk and the corresponding bindings installed in the same Python distribution used for Aenea. These can be installed with sudo apt install python-tk on Ubuntu/Debian or sudo pacman -S tk on Arch-based distributions.

To install the plugin:

  1. Copy client/_recognition_results_observer.py to MacroSystem in the guest VM. This grammar sends the recognition results to the server.
  2. Check that the following files exist on the server:
     • server/linux_x11/plugins/recognition_results.yapsy-plugin
     • server/linux_x11/plugins/recognition_results/config.py
     • server/linux_x11/plugins/recognition_results/__init__.py
     • server/linux_x11/plugins/recognition_results/recognitionbar_tk.py
  3. Give execute permissions to the file recognitionbar_tk.py (chmod +x recognitionbar_tk.py).
  4. Open the file config.py and set the variable enabled to True (enabled = True).
  5. Configure the appearance of the bar in config.py (optional). By default, the assigned window type is dock and the window is set to always show on top. You can use your window manager to provide additional customization, such as showing the window on all desktops.

If you need to reopen the recognition bar window, just execute recognitionbar_tk.py.

Please note that the plugin uses a local file (~/.aenea_phrases.log by default) in the server to communicate the results from Aenea to the Recognition Bar GUI application. This file stores in plain text the results of your speech. If security or privacy of what you input is a concern, you should delete this file regularly or take other appropriate measures. The location of the file can be modified in config.py.

VirtualBox USB passthrough to reduce latency (optional)

Using PulseAudio can lead to significant latency and audio drift between the host and the VM. This can be avoided either by switching to ALSA audio or by passing your audio device directly through to the VM. The latter can easily be achieved by installing the VirtualBox extensions on the host machine and adding your user to the vboxusers group: sudo adduser $USER vboxusers

Snapshot and backup (MANDATORY)

This is a brittle setup. Part of why I went with a Windows VM and remote connection rather than something like Platypus and/or Wine is for the encapsulation. Several times, my VM has broken for no clear reason, with Dragon permacrashing, NatLink not starting, etc., and I was unable to fix it. Reverting to a snapshot easily and quickly fixed my problem, and in the year-plus I've used this I've never had more than a few minutes of downtime thanks to snapshots and backups. Once you have it working, take a snapshot AND back up your VM image. You don't want to have to go through that setup process ever again. Seriously, do it now. I'll wait. Don't think of this VM as an OS; think of it as an embedded device that just does one thing.

Security

Virtual machines have a nasty tendency not to be up to date, and at any rate they increase the attack surface. Therefore I recommend that you select Host-only adapter in VirtualBox so that the virtual machine can only connect to your computer and not to the Internet, thus limiting its potential to get compromised.

Please remember that the server simply accepts any connection made to it and will execute the commands given, that command streams are neither authenticated nor encrypted, and that the server is not written to deal with untrusted clients. I hope to address authentication and encryption in the future (I see little point to dealing with untrusted clients given they literally control your computer), but for now I strongly suggest only running the system on a network interface you trust (i.e., VirtualBox's subnet). Be careful that other virtual machines you may run on the same system cannot access it, if you are concerned about security.

Using Aenea-Aware Modules

Drop them in C:\\NatLink\\NatLink\\MacroSystem along with anything they depend on. In theory you can just say force natlink to reload all grammars (if you are using the _aenea.py module mentioned further above), but if anything goes wrong just restart Dragon.

Using Dragonfly Modules

To make a dragonfly module work with Aenea, add the line:

from aenea.strict import *

to the top of the file, below the rest of the imports. This will replace Dragonfly's action and context classes with those from Aenea. Some Dragonfly modules make use of actions or context features that require modification to work with Aenea, or will not work at all. This of course assumes the module imported dragonfly using the * import style.

Non-exhaustive list of Dragonfly modules that should work (with the above change):

  • multiedit
  • cmdmemory
  • kbbreak
  • firefox (except save_now command)
  • audacity

Writing Your Own Modules

Writing your own modules is quite easy and the Dragonfly documentation is extensive. This section details what you will need to know to make your modules work via a proxy, and does not duplicate the Dragonfly documentation.

Aenea provides several classes which take an action via the proxy server. Their class names start with Proxy:

  • ProxyAppContext -- provides an AppContext that lets you match on the title, window class/window class name, etc of the currently active window on the host. This tries to be a drop-in replacement for AppContext, but can't quite work the same way since we need to take X11 properties into account.
  • ProxyCustomAppContext -- provides a custom context that allows querying by any value the server provides. See the docstring for details.
  • ProxyCrossPlatformContext -- chooses between one of several contexts based on what OS the server reports is running. Pass in a dict-like from OS to Context. Note that the OS is queried dynamically -- whenever we use the context, so you can use this if you need to switch between servers.
  • ProxyPlatformContext -- chooses between one of two contexts based on whether or not we are currently sending commands to the proxy server -- so you can use the same grammar on the VM/local machine and via proxy.
  • ProxyKey, ProxyMouse, ProxyText -- very similar to Dragonfly's, but support additional functionality (e.g., the Key can accept Linux keysyms as well as Dragonfly ones). See their docstrings for details.
  • ProxyMousePhantomClick -- Move mouse to a location, click, return. From the user's perspective, click without moving the mouse.

Additionally, there are two wrapper layers to make it easier to write a grammar that works both locally and via proxy -- aenea.lax and aenea.strict. They are identical except in how they handle errors. Strict (default) is useful when you want to write one grammar that works both locally and remotely. When the grammar is loaded, it creates a Dragonfly and Proxy object (for each OS if appropriate), and if any errors occur, it raises.

The lax version will ignore errors at grammar load time and only raise them if you attempt to actually use an invalid object. So, for example, if you have a Key object press a Linux keysym, it will only error if you attempt to execute the action on the local host; with the strict version, your grammar would be prevented from loading. Both layers provide:

  • AeneaAction -- performs one of two actions based on whether the proxy server is currently enabled.
  • AeneaContext -- uses one of two contexts based on whether the proxy server is currently enabled.
  • AlwaysContext, NeverContext, NoAction -- useful for combining actions/contexts -- support combinator operators but do nothing.
  • ContextAction -- takes a different action based on which context is currently active. Takes a list of (context, action) pairs. Whenever executed, all actions whose context matches are executed.
  • Key, Text, Mouse -- Executes either on proxy or locally based on whether proxy server is currently enabled.
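The strict-versus-lax distinction can be illustrated with a toy sketch (this is not Aenea's actual implementation, just the error-handling pattern described above): strict fails at construction time, lax defers the failure until the action is used.

```python
class StrictAction:
    """Toy model of aenea.strict: an invalid spec fails as soon as
    the object is constructed (i.e., when the grammar loads)."""
    def __init__(self, spec, valid_keys):
        if spec not in valid_keys:
            raise ValueError("unknown key: %r" % spec)
        self.spec = spec

    def execute(self):
        return "pressed %s" % self.spec


class LaxAction:
    """Toy model of aenea.lax: construction always succeeds; the
    error surfaces only if the invalid action is actually executed."""
    def __init__(self, spec, valid_keys):
        self.spec = spec
        self._error = (None if spec in valid_keys
                       else ValueError("unknown key: %r" % spec))

    def execute(self):
        if self._error is not None:
            raise self._error
        return "pressed %s" % self.spec
```

With the lax pattern, a grammar containing a Linux-only keysym still loads in the VM; it only errors if you actually trigger that one action locally.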

Taking advantage of the vocabulary system

I noticed that many of my grammars had similar vocabulary but wanted to put it in different places, leading to duplication. In particular, both vim and multiedit should be usable for programming, and as such duplicated a great deal of both language-specific and general vocabulary. Since both of these grammars make use of nested trees and chain commands together in the grammar, I wanted to separate vocabulary from grammar.

Inspired by the dynamics system @nirvdrum wrote, I also wanted the ability to dynamically disable and enable certain vocabulary as appropriate (e.g., disable Python vocabulary when not using Python). The vocabulary system allows you to define vocabulary items that grammars can then hook into. Currently, multiedit, vim, and _vocabulary use them.

There are two types of vocabulary, due to Dragonfly/NatLink limitations. Static vocabularies are loaded at system start, cannot be dynamically enabled/disabled, and you need to restart Dragon to reload them. On the plus side, they can use more complex specifications such as "reload [all] (configuration|config)".

Dynamic vocabulary is limited to straight key-value pairs -- what you say and what is typed. However _vocabulary.py lets you dynamically turn them on/off as necessary.

Writing a Vocabulary

The format is identical for both static and dynamic vocabularies. You create a JSON file in ROOT/vocabulary_config/static or ROOT/vocabulary_config/dynamic, containing several properties. "name" is what you will say to enable/disable the grammar. "tags" is a list of tags, explained below. "shortcuts" is a mapping from what you say to what KEY(s) are pressed (i.e., the string is used as the spec for a Key object). "vocabulary" is a mapping from what you say to what you get.

In addition to plain text, the value may also specify Text, Key, and Mouse actions (see the end of python.json for an example of this).
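A minimal, illustrative vocabulary file might look like this (the name and entries are made up; see the files in aenea-grammars' vocabulary_config for real, authoritative examples):

```json
{
    "name": "python",
    "tags": ["multiedit", "global"],
    "shortcuts": {
        "dedent": "s-tab"
    },
    "vocabulary": {
        "deaf": "def",
        "dot format": ".format"
    }
}
```

Here "dedent" would press shift-tab (the string is a Key spec), while the vocabulary entries are plain say-this-get-that mappings.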

Using Vocabularies in your Grammar

Vocabularies are attached to grammars by use of the tag system. Your grammars may request one or more tags, which are simply hooks vocabularies can attach to. So for example, multiedit creates "multiedit" and "multiedit.count" hooks, which are simply things which may be chained together. The .count hook means you can say a number after it to do it N times. The dynamic Eclipse vocabulary is a good example of this. For example, my Python vocabulary says it should be active in "vim.insertions.code", "multiedit", and "global". This is best explained by examining the example vocabularies at https://github.com/dictation-toolbox/aenea-grammars/tree/master/vocabulary_config.

The "global" tag is special -- it's used by _vocabulary.py for things you should be able to say anywhere. The reason it's a special case is because we want to make sure that there aren't multiple grammars competing to recognize an entry. Thus, a grammar may suppress a tag in the global context (multiedit and vim do this), so that whenever they are in use, _vocabulary won't recognize the tags they've taken over. See multiedit and vim for examples of this.
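As a rough sketch of the tag-matching idea (a hypothetical helper, not Aenea's actual code), the selection logic might look like this: a vocabulary is active if it shares a tag with the requesting grammar, except that vocabularies reached only via the special "global" tag can be suppressed by grammars that have taken them over.

```python
def active_vocabularies(vocabularies, requested_tags, suppressed_global=()):
    """Return the names of vocabularies whose tags intersect
    requested_tags. A vocabulary matched only through "global" is
    skipped if it has been suppressed, mirroring what multiedit
    and vim do with tags they take over."""
    active = []
    for name, tags in vocabularies.items():
        direct = set(tags) & (set(requested_tags) - {"global"})
        via_global = ("global" in tags
                      and "global" in requested_tags
                      and name not in suppressed_global)
        if direct or via_global:
            active.append(name)
    return sorted(active)
```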

The whole system can sound quite intimidating at first (much like Dragonfly) but it's not as bad as it sounds to use, I promise! Just take a look at the example grammars and vocabularies and you'll be writing your own in no time! (example grammars: https://github.com/dictation-toolbox/aenea-grammars)

Grammar Configuration

configuration.py is designed to provide easy-to-use code for grammars to read config files under PROJECT_ROOT/grammar_config. In particular, it makes it easy for a grammar to allow users to override its keybindings. This is similar to the idea behind Dragonfly's configuration system, but simpler and less powerful -- you can't include arbitrary code. Grammars need not use this system, but all mine do.

Documentation

The API and core are extensively documented via pydoc. I tried to provide a high-level description of how it all fits together in this README, but for the latest details, see the pydoc. aenea should import on Linux even though Dragonfly isn't installed there (this is necessary for running tests), so you should be able to browse/read the docs.

Server Plugins

You can add custom RPCs to the server using the plugin system (using yapsy). Take a look at the example plugin and corresponding grammar for details.

Writing Your Own Server

Writing your own server should be fairly straightforward. All you would need to do is implement the JSON-RPC calls from server_x11.py. The protocol as of this writing should be reasonably stable, although I do intend to add encryption and authentication support in the future; this will likely occur via TLS.
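For illustration, the dispatch side of such a server might look like the following minimal sketch, using only the standard library (the handler behavior here is a placeholder; consult server_x11.py for the authoritative RPC list and semantics, and note a real server must also bind a socket and speak HTTP as jsonrpclib expects):

```python
import json

# Hypothetical handler table; a real server would implement the same
# RPCs server_x11.py exposes (key_press, write_text, pause, etc.).
HANDLERS = {
    "write_text": lambda params: "typed %s" % params["text"],
    "pause": lambda params: "paused %s" % params["amount"],
}

def handle_request(raw):
    """Dispatch one JSON-RPC request body and return the response body."""
    request = json.loads(raw)
    handler = HANDLERS.get(request["method"])
    if handler is None:
        # Standard JSON-RPC "method not found" error code.
        return json.dumps({"jsonrpc": "2.0", "id": request.get("id"),
                           "error": {"code": -32601,
                                     "message": "Method not found"}})
    result = handler(request.get("params", {}))
    return json.dumps({"jsonrpc": "2.0", "id": request.get("id"),
                       "result": result})
```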

Help!

Please feel free to post in the Dragonfly Google group https://groups.google.com/forum/#!forum/dragonflyspeech or to email me if you have questions about this system or issues getting it working. I don't use it as much as I used to, but I'm still happy to discuss getting it to work and improving it, particularly the setup instructions, and I've learned a great deal from other users already.

Contributors

c-nichols, calmofthestorm, chep, dependabot[bot], djr7c4, drmfinlay, dylan-chong, edk, facundoq, fgaray, grahamc, grayjay, kereyroper, kn0rk, nirvdrum, poppe1219, seananderson, sweetmandm


Issues

Make it clear that server plugins are optional

After I pulled from master and started up the server, I was greeted with the message "Cannot import yapsy; server plugin functionality not available." I assumed this was a new requirement and something I needed to have running, rather than something optional. A more softly worded message, indicating that the functionality is optional, would probably be better.

Basic integration test for server_x11.py

I'd like to see a simple integration test for the server. This could be as simple as firing up xev, setting it full screen, running the script, and verifying that the correct events were produced.

aenea_client batching sometimes gets text "stuck"

While it generally works perfectly, the batching code I wrote for @poppe1219's awesome dictation capture client has the property that occasionally text gets caught in the buffer.

I'll investigate this more when I have a chance, but I just wanted to make a note that if you are using it and something you say doesn't show up, saying something else usually should flush the buffer and get both the old and new speech.

"force natlink to reload all grammars" occasionally causes errors

Specifically, sometimes modules used in grammars are set to None after running this command but this does not always happen. This causes obvious problems when grammars attempt to refer to those modules...

Presumably, this command sometimes unloads modules it shouldn't (modules imported from grammars are unloaded and then reloaded in the implementation). It's also possible that it reloads things in the wrong order, which could also cause problems (this seems like the most likely cause).

Plugin-based chaining grammars

Grammars such as multiedit and verbal_emacs work by building up a bunch of commands, each of which the user would think of as a single command, and then creating a repetition rule. This allows the user to string together commands, rather than having to wait for each to be recognized before speaking the next (or at the very least without pausing between commands). To Dragon, this looks like one large command each time the user speaks.
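The chaining idea can be sketched as a command table applied to one long utterance; everything here is illustrative only, not aenea's actual grammar machinery:

```python
# Illustrative sketch (not aenea's API): Dragon sees one long utterance;
# the grammar splits it into known single commands and executes each in
# order, without a pause between them.
def make_chainer(command_table):
    def execute(utterance):
        return [command_table[word]() for word in utterance.split()
                if word in command_table]
    return execute

# Hypothetical single commands the user thinks of individually:
commands = {"save": lambda: ":w", "quit": lambda: ":q"}
chain = make_chainer(commands)
# chain("save quit") executes both commands from a single utterance.
```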

This is great for usability, and it is wonderful that Dragon can support such complicated commands. I would argue that it is this capability alone that makes editing text possible in this way. Unfortunately, it tends to conflate the natural concept of a command from a user's perspective with the logic necessary to support the chaining. This leads to issues such as the duplication of code between verbal_emacs and multiedit. As the project grows, and more people want to use different languages, different programs, etc., I feel this architecture will become more and more unwieldy.

I propose the following as a long-term design for how to solve this. I basically see three distinct problems that need to be composed in various combinations to make any particular grammar along the lines of verbal_emacs, multiedit, shell commands, etc:

  • The actual logic of chaining (take a list of commands, put them together in a repetition rule, delegate the actual command processing as appropriate). You can also think of this as the interface to dragonfly/natlink.
  • Various "plug-ins" that each constitute a family of commands. Examples of this include the variable name formatting functions in multiedit and verbal_emacs, shell functionality such as pipes and redirects, editing commands in VIM, and the Python-specific and Eclipse-specific commands in multiedit.
  • The tie together that determines which plug-ins should be enabled in which contexts, possibly also with some glue logic (such as managing the mode you are in in VIM). This could possibly also integrate with the work in https://github.com/poppe1219/dragonfly-scripts/tree/master/dynamics to allow dynamic loading of plug-ins as appropriate.

This is how I would build this in the absence of constraints. Unfortunately, I need to look into what constraints Natlink/Dragonfly/Dragon impose. In particular, I suspect some of the issues with nested grammars being thrown out for being too complex that I ran into with verbal_emacs will reappear here; Dragon seems to object more to grammars that are too deep (even when unambiguous) than to grammars that recognize a large number of commands, which is what I would expect based on how most parsing tends to go. I suspect those constraints are likely to dominate the design, unfortunately.

This is also complicated by #17.

I welcome feedback, both on my design in ideal land and on the realities anyone has run into with similar projects. @poppe1219 @nirvdrum

Grammar configuration needs to be written

Grammars may have configuration files (NAME.py) placed in grammars_config. These files need to be copied to the VM whenever configuration is reloaded. Additionally, the existing grammars which would benefit from external configuration need to be rewritten to use this format.

Long action chains take a significant amount of time to execute

After playing with aenea for some time I started digging into trying to tighten the latency between "say something" and "do something".

Occasionally I noticed a delay AFTER recognition when a long chain of commands was to be executed so I came up with this test

import time

from aenea import Text

def time_actions(count=20, data=None):
    s = time.time()
    for _ in range(0, count):
        Text('abc').execute(data=data)
    return time.time() - s

# >>> time_actions(count=20)
# 0.43 seconds

That means aenea takes almost half a second to perform 20 actions, client to server. This is on the high end of what we might see with a long-ish dictation, but not too uncommon. Latency is high enough as it is; we can definitely shave down that ~0.4 seconds.

I noticed I wasn't passing the optional data parameter to action.execute. So I tried again.

# >>> time_actions(count=20, data=ensure_execution_context({}))
#0.20 seconds

Explicitly passing the execution context prevents aenea from having to query it before nearly every action. This saves the server from having to shell out repeatedly to get context.

I did one last test to see just how much making multiple rpc calls hurts on the longer chains

import aenea.communications

class BatchTextActionExecutor(object):
    def __init__(self):
        self.batch = []
        self.original_write_text = None

    def __enter__(self):
        batch = self.batch

        def write_text_patch(*args, **kwargs):
            batch.append(('write_text', args, kwargs))

        self.original_write_text = aenea.communications.server.write_text
        aenea.communications.server.write_text = write_text_patch

    def __exit__(self, exc_type, exc_val, exc_tb):
        aenea.communications.server.multiple_actions(self.batch)
        aenea.communications.server.write_text = self.original_write_text

def time_batch_actions(count=20, data=None):
    s = time.time()
    with BatchTextActionExecutor():
        for _ in range(0, count):
            Text('abc').execute(data=data)
    return time.time() - s

# time_batch_actions(count=20, data=ensure_execution_context())
#0.10 seconds 

The above context manager is a proof of concept. It patches aenea's proxy to wait for multiple actions to be executed; upon leaving the context manager, all commands are flushed to the aenea server in a single RPC. By doing so, the post-recognition latency is reduced even further. This could be improved further still by not shelling out 20 times to xdotool on the server.

I'd like to know if anyone else thinks batch action execution is something worth investing more time in. I feel that recognition is slow enough as it is and that any reduction in latency is a win.

TL;DR Post-recognition latency runs nearly 0.5 seconds on long-ish action chains. We can optimize it to ~0.1 seconds. Is it worth the effort?

gist containing some of the test hacks:
https://gist.github.com/mzizzi/fd2924a8d68a33ef0ec2

aenea_client should clear window sometimes

It would be nice if text that appeared in the client window cleared itself periodically. This could be either time-based or after N flushes; I prefer the latter.
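The after-N-flushes variant could be sketched like this; class, attribute names, and the threshold are illustrative, not the client's real structure:

```python
# Sketch of flush-count-based clearing for the capture window.
class CaptureDisplay:
    def __init__(self, clear_every=5):
        self.clear_every = clear_every
        self.flush_count = 0
        self.text = ""

    def on_flush(self, new_text):
        self.flush_count += 1
        if self.flush_count > self.clear_every:
            self.text = ""        # wipe the window's text area
            self.flush_count = 1  # count the current flush
        self.text += new_text
```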

Error connecting to server

Hello, I have everything set up until testing with _hello_world_aenea.py.
I have a winxp vm with the host-only adapter running dragon.

I get an error in the natlink window:

Socket error connecting to aenea server. To avoid slowing dictation, we won't try again for 5 seconds.

Which is the result of:

[Errno 10065] A socket operation was attempted to an unreachable host

I have tried many things but have not been able to get the guest to connect to the server.
Any help - exactly what configuration to use for host/port in config.py and aenea.json, would be appreciated.
Thank you

Where is numbers defined?

On trying the 'zip 00' sample command, I get the following error:

...
  File "C:\NatLink\NatLink\MacroSystem\_stopgap.py", line 30, in _process_recognition
    x = numbers.get(x, x)
NameError: global name 'numbers' is not defined

I don't see where numbers is defined, even when grepping the python files.

But I guess this is evidence that at least it's working!

Recent folders "choose" does not work

When the recent folders dialog appears and I say "choose one", nothing happens. I use a Chinese GUI, and I have added my language to the explorer select function; I see the function gotoRecentFolder, but I do not see the caller.

OSX test-client.py hangs

I'm trying to set up the aenea server on OS X El Capitan. When I run server_osx.py followed by test-client.py, I get aAB printed to my terminal, after which the script seems to hang, all while seemingly keeping the Shift key pressed.

Is this a known issue? How good is OS X support for aenea? It is not really clear from the README files.

PS: Great project; I think this impacts/helps a lot of people in a meaningful way. I hope it will include me as well in the near future.

Grammars don't work if mic turned off then on

Natlink reloads modules when the microphone is turned off and then back on. Unfortunately, it only reloads grammars -- those that begin with _. This leads to two issues:

  • Modules that grammars may import will not be reloaded.
  • Due to reliance on isinstance and similar, after the reload, you run into x-is-not-an-x type of errors. This occurs whether or not you attempt to manually reload modules that grammars rely on.

For now, the workaround is to exit and restart Dragon whenever you make a change to a grammar. I agree this is not ideal, but at this time the only way I know to fix this would be to make every grammar a standalone file. This would result in an unacceptable degree of code duplication, and make configuration a nightmare.

Allow matching by command args

I'm not sure if the new refactoring handles this, but I ran into a case where it would be nice to match against an application's command-line arguments. In this case, I wrote a grammar for Terminator. Terminator is a Python app and is apparently launched via /usr/bin/python /usr/bin/terminator. This means the app context needs to match "python", which is rather broad. Being able to match on the full procline would allow the grammar to be scoped properly.
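A sketch of what matching by "procline" could look like; the helper below is hypothetical, not an existing aenea API:

```python
# Hypothetical predicate: match a window's full command line rather than
# the executable name alone.
import re

def cmdline_matches(cmdline, pattern):
    """Return True if the process command line matches the regex."""
    return re.search(pattern, cmdline) is not None

# A Terminator window launched as "/usr/bin/python /usr/bin/terminator"
# matches on "terminator" even though the executable is just python.
```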

Help getting started

Having trouble getting this started. Followed instructions, and installed the packages on an xp virtual box. I am trying to connect to an Ubuntu 12.04 system. I think I have the natlink side of things working.

screen shot 2014-06-28 at 10 54 59 pm

The server seems to start properly when I run it from util: python server_x11.py

However, when i start util/aenea.py in the windows vm it exits immediately with a 0 error code.
The experimental aenea_client.py brings up a window, but it doesn't seem to send the data properly to the server.

On the ubuntu server:

enea$ python server_x11.py
eddyks-MacBook-Pro.local - - [28/Jun/2014 22:37:28] "POST / HTTP/1.1" 200 -
type: unrecognized option '--file'
Usage: type [--window windowid] [--delay milliseconds] <things to type>
--window <windowid>    - specify a window to send keys to
--delay <milliseconds> - delay between keystrokes
--clearmodifiers       - reset active modifiers (alt, etc) while typing
--args N  - how many arguments to expect in the exec command. This is
            useful for ending an exec and continuing with more xdotool
            commands
--terminator TERM - similar to --args, specifies a terminator that
                    marks the end of 'exec' arguments. This is useful
                    for continuing with more xdotool commands.
-h, --help             - show this help output
eddyks-MacBook-Pro.local - - [28/Jun/2014 22:37:49] "POST / HTTP/1.1" 200 -

screen shot 2014-06-28 at 10 43 57 pm

When I'm in normal or dictation mode, with the capture window focused, the Dragon dictation dialog pops up. Then I need to tell it to transfer, or manually click it myself. Is this right?

Thanks for any help getting this going. It's made more challenging because I'm not familiar with python yet, and I'm in the middle of a worsening rsi episode.

Socket error connecting to aenea server

First of all, thanks for this awesome project; I highly appreciate the work and the thorough documentation!

I basically completed all the steps of the short tutorial, and the host-only adapter has been set up and can be pinged from the guest as well as the host (aenea.json and config.py have been adapted with the same IP accordingly). However, when I try to start DNS, I get the following error:

Socket error connecting to aenea server. To avoid slowing dictation, we won't try again for 5 seconds.

It makes tracking down the point of failure quite hard since everything else seems to work fine.

Does anyone have an idea?

Thanks a lot in advance!

Mangled text over the wire

I need to track the problem down, but I wanted to file the issue here so it doesn't get forgotten. The basic issue is that certain text forms sometimes come across mangled.

E.g., I have this mapped rule:

"cap deploy to ": Text("SKIP_ASSETS=true RUBBER_ENV=") + Function(lib.format.lowercase_text) + Text(" cap deploy")

And when I say "cap deploy to kevin" with > 0.5 probability, I'll get something like the following typed:

SKIP_ASSETS=true RUBBER_ENV=ke cap deployvin

configuration refactoring

We were talking about it in the ssl pull request, and well, I did a thing... It's early stages right now, and the documentation looks atrocious, but I thought it might come in handy here. If you want any features added to make it more suitable for aenea, I'm all ears.

how to get active window title etc.

I tried a function action like

def diagnostics_command():
    window = Window.get_foreground()
    print (window.executable, window.title, window.handle)

"diagnostics": Function(diagnostics_command),

but now when I say diagnostics, it prints the window title of the client (my Windows VM) instead of that of my X11 app. Can I somehow access the window title of the X11 app?

Grammar does not get loaded

I changed my project_root to a VirtualBox shared folder, but the grammars do not get loaded. They do if they are in the MacroSystem folder.

I searched the issues and did not find anything about it.

How do you organize your grammars? :)

Local keys very slow if remote server not up

If the configured remote server is not up and the proxy server is enabled, local keys are severely impacted. The out-of-the-box config, for example, will point at an IP that almost certainly will fail for everyone. I'm seeing that commands like "disable proxy server" and dictating into notepad (both local actions) take ~12s to process. While I'd certainly expect issues sending commands to the aenea server, I'd expect local keys to work unimpacted.

observations on setup

Having spent days setting up (Virtualbox with Windows 7 Ultimate 64-bit, Natlink 4.1papa, Python 2.7, OSX 10.8.5), I'd like to relay some of my problems and solutions for possible inclusion in the readme.

  1. Natlink required installing Microsoft Visual C++ 2010 Service Pack 1 Redistributable Package (ONLY 32 BIT VERSION WILL WORK), which isn't in the natlink documentation.
  2. I wasted some time trying to test the server when the VM was not running, which causes an error. Start the VM or set the ip in the server config to your ip on your LAN.
  3. wrappers.py was unable to import everything from dragonfly and so was loading the mock instead, causing an error in the hello_world_aenea script. The fix was to comment out FormatState (line 48) and Word (line 87). I don't know yet if this will cause problems later on. My version of dragonfly is 0.6.6b1. Do people think my solution is OK?
  4. The readme states you need to install pyobjc for mac, but this is already included with the native python install on 10.8.5. I actually tried to add it to my brew python as well but couldn't get that to work. Just make sure you execute the server with the Apple-installed python.
  5. I think the applescript commands applicable to my OSX (10.8.5) must be different as the server-test does not perform all actions correctly. I have not been able to look into this yet.

Server plugins don't work out-of-the-box

If yapsy is installed, server plugins are automatically enabled. However, the default configuration file doesn't set the plugin directory and it's assumed that that configuration item exists. The net result is things just break upon server start up until that configuration entry is explicitly added.
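A defensive lookup would avoid the crash. A sketch, assuming the configuration is a dict; the key name "plugin_path" is illustrative, not necessarily the real entry:

```python
# Hypothetical defensive lookup: a missing plugin-directory entry disables
# plugins instead of crashing the server at startup.
def plugin_path_or_none(config):
    path = config.get("plugin_path")
    if not path:
        print("No plugin_path configured; server plugins disabled.")
    return path
```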

OSX : Aenea server setup : test-client.py does not work for 192.168.x.x but works for loopback (127.0.0.1)

Hi All,

First of all, thanks for setting up this project. I am really looking forward to using this to get rid of my RSI.
I have followed all the steps in the server setup. When I try to test it out, I can connect to the server with the loopback address in config.py, but it does not work when using a 192.168.x.x IP. I tried both 'Host only adapter' and 'NAT' settings, but no luck. The server starts, but the client says 'Operation timed out'.

Please help.

Here are logs.
Terminal 1 - Server

:: python server_osx.py
started on host = 192.168.1.104 port = 8240

Terminal 2 - Client

::python test-client.py
HOST=
192.168.1.104
PORT=
8240
Traceback (most recent call last):
  File "test-client.py", line 129, in <module>
    main()
  File "test-client.py", line 123, in main
    all_tests(communication.server)
  File "test-client.py", line 107, in all_tests
    test_key_press(distobj)
  File "test-client.py", line 44, in test_key_press
    distobj.key_press(key='a')
  File "/Library/Python/2.7/site-packages/jsonrpclib/jsonrpc.py", line 290, in __call__
    return self.__send(self.__name, kwargs)
  File "/Library/Python/2.7/site-packages/jsonrpclib/jsonrpc.py", line 237, in _request
    response = self._run_request(request)
  File "/Library/Python/2.7/site-packages/jsonrpclib/jsonrpc.py", line 255, in _run_request
    verbose=self.__verbose
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/xmlrpclib.py", line 1264, in request
    return self.single_request(host, handler, request_body, verbose)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/xmlrpclib.py", line 1292, in single_request
    self.send_content(h, request_body)
  File "/Library/Python/2.7/site-packages/jsonrpclib/jsonrpc.py", line 126, in send_content
    connection.endheaders()
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 969, in endheaders
    self._send_output(message_body)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 829, in _send_output
    self.send(msg)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 791, in send
    self.connect()
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 772, in connect
    self.timeout, self.source_address)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.py", line 571, in create_connection
    raise err
socket.error: [Errno 60] Operation timed out

What to do after _hello_world_aenea.py?

Really awesome project! I got the whole setup running, and I am able to run the _hello_world_aenea command and see it show up on my OSX system, but I am confused about what to do next.

I can't seem to dictate anything, and I don't really know how to write my own grammars with dragonfly.

I have also copied in _aenea.py.

What is the easiest way to get started? :)

Bad path for server_state.json

I've been updating the dragonfly-scripts repo to work with aenea. It took a while to figure everything out, but I think I got there. However, I'm currently running into an issue with a bad path for the server_state.json file. At start-up, I get:

Error writing config file .\.\server_state.json: [Errno 13]
Permission denied: '.\\.\\server_state.json'.

Port server integration test to xvfb

It would be nice if we could run the server tests in xvfb for isolation from user desktop, better control of window geometry so we can test absolute mouse movements more reliably, and (wishlist) integration with CI like Travis.

I took a quick look at this today, but ran into a number of problems -- getactivewindow fails with unsupported, and key_press events were not arriving (possibly the window was not activated).

Key translations requires a rethink

Right now, the way we handle key translations is less than ideal. The fundamental issue is simple: Dragonfly, Windows, X11, and OS X all use different names for the various keys on the keyboard. We want to make things work with each other as much as possible while avoiding mapping things that aren't equivalent.

Currently, the X11 server maps most Dragonfly keysyms into X11 keysyms. #64 adds a translation layer in the client for < and >. Additionally, Aenea will accept X11 keysyms directly and pass them along to the server.

So the questions I see are:

  • Do we need two levels of translation (client and server) or should it entirely be the responsibility of one?
  • Should Aenea accept the union of all keys across all platforms, just the keys for the currently active one (strict and lax do some of this already), or only Dragonfly keys?
    • If only Dragonfly keys, how do we handle platform specific keys (meta, hyper, super, command, etc)?
  • Should the client prefer to send Dragonfly symbols, or attempt to translate to the active platform keysyms?

In general, my instinct is it's better for Aenea to pass along anything it can than for it to police what a grammar and the server are allowed to do, insofar as it mediates communication between them. But I do want the same grammar to be able to work across platforms where possible.

My favored solution would probably be to accept the union of all platforms on the client, translate all that can be to Dragonfly syms and pass along the rest as-is; then the servers translate Dragonfly syms to local platform ones and execute the rest natively. This means Aenea will let you send a Linux keysym to a Windows server, and the latter will experience an error, but I think this is an acceptable price to pay.
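A tiny sketch of that two-stage scheme. The mapping entries below are illustrative fragments, not the project's real translation tables:

```python
# Stage 1 (client): normalize whatever it can to Dragonfly names and pass
# unknown keys through untouched.
DRAGONFLY_FROM_ANY = {"Control_R": "rcontrol", "<": "langle", ">": "rangle"}

# Stage 2 (per server): map Dragonfly names to local keysyms; anything
# already native to the platform executes as-is.
X11_FROM_DRAGONFLY = {"rcontrol": "Control_R", "langle": "less"}

def client_normalize(key):
    return DRAGONFLY_FROM_ANY.get(key, key)  # pass unknowns through

def x11_server_translate(key):
    return X11_FROM_DRAGONFLY.get(key, key)  # execute the rest natively
```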

I welcome your thoughts, especially if you have any experience with internationalization and localization -- I've only worked with various layouts of US keyboards, and ideally this would work well for those on other hardware (or at the very least not actively get in their way).

Better cross-platform support

We should add some additional classes to the core to make working across platforms easier. You can currently do it with ProxyPlatformContext, and that may be necessary for more complex cases, but simple things like different keystrokes should be much easier.

  • ProxyPlatformAction -- performs a different action based on whether currently active is local or a server, and if a server based on what platform (and possibly WM?) it reports.
  • ProxyPlatformKey/Mouse -- sugar for ProxyPlatformAction with Key objects (IE, enter different keystrokes based on platform).
  • Some kind of custom RPC context that makes it easier to work with plugins, maybe? For example, ProxyRPCContext("get_vim_mode") would call the get_vim_mode RPC (added by a plugin, presumably) and match based on the value. Maybe we don't need this and should let plugin authors do it themselves; I welcome opinions here.
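A minimal dispatch sketch of the ProxyPlatformAction idea; the class shape, platform strings, and callables are assumptions for illustration, not an existing aenea API:

```python
# Hypothetical per-platform dispatch: pick a different action depending on
# what the currently active target (local or a server) reports.
class PlatformAction:
    def __init__(self, actions, default=None):
        self._actions = actions  # e.g. {"linux": ..., "darwin": ..., "local": ...}
        self._default = default

    def execute(self, platform):
        action = self._actions.get(platform, self._default)
        if action is not None:
            return action()

# Sugar along the lines of ProxyPlatformKey: different keystrokes per platform.
copy = PlatformAction({"darwin": lambda: "cmd-c", "linux": lambda: "ctrl-c"})
```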

Anything else in the same vein?

json-rpc ProtocolError

Hey,
I'm trying to set up aenea on a Windows 10 machine in order to take advantage of aenea's added functionality and the grammars contributed by the community.

Currently I'm stuck at setting up aenea itself. If you have a spare moment and could look at the trace, I would be grateful.

I launch the Windows server:
aenea.exe -a 127.0.0.1

`C:\NatLink\NatLink\MacroSystem\aenea.json`  
{
    "host": "127.0.0.1",
    "port": 8240,
    "platform": "proxy",
    "use_multiple_actions": false,
    "screen_resolution": [6400, 1440],
    "project_root": "C:\\NatLink\\NatLink\\MacroSystem",
    "restrict_proxy_to_aenea_client": false
}

The _hello_world_aenea.py test fails upon attempting to send the RPC request. Trace:

 Vocola not active
Aenea hello world grammar: Loaded.

--- WARNING: Speech Model BestMatch V is used for this User Profile
The performance of many NatLink grammars is not good with this model.
Please choose another User Profile with for example Speech Model BestMatch IV.
See http://unimacro.antenna.nl/installation/speechmodel.html
----
natlinkmain started from C:\NatLink\NatLink\MacroSystem\core:
  NatLink version: 4.1lima
  DNS version: 12
  Python version: 27
  Windows Version: 8

Traceback (most recent call last):
  File "C:\Python27\lib\site-packages\dragonfly\engines\engine_natlink.py", line 248, in results_callback
    r.process_recognition(root)
  File "C:\Python27\lib\site-packages\dragonfly\grammar\rule_mapping.py", line 181, in process_recognition
    self._process_recognition(item_value, extras)
  File "C:\Python27\lib\site-packages\dragonfly\grammar\rule_mapping.py", line 201, in _process_recognition
    value.execute(extras)
  File "C:\Python27\lib\site-packages\dragonfly\actions\action_base.py", line 127, in execute
    if self._execute(data) == False:
  File "C:\NatLink\NatLink\MacroSystem\aenea\wrappers.py", line 201, in _execute
    self._data = ensure_execution_context(data)
  File "C:\NatLink\NatLink\MacroSystem\aenea\wrappers.py", line 111, in ensure_execution_context
    data['_server_info'] = aenea.proxy_contexts._server_info()
  File "C:\NatLink\NatLink\MacroSystem\aenea\proxy_contexts.py", line 82, in _server_info
    _refresh_server()
  File "C:\NatLink\NatLink\MacroSystem\aenea\proxy_contexts.py", line 64, in _refresh_server
    _last_server_info = aenea.communications.server.server_info()
  File "C:\NatLink\NatLink\MacroSystem\aenea\communications.py", line 104, in call
    return self._execute_batch([(meth, a, kw)])
  File "C:\NatLink\NatLink\MacroSystem\aenea\communications.py", line 87, in _execute_batch
    batch[0][0])(*batch[0][1], **batch[0][2])
  File "C:\Python27\lib\site-packages\jsonrpclib\jsonrpc.py", line 276, in __call__
    return self.__send(self.__name, kwargs)
  File "C:\Python27\lib\site-packages\jsonrpclib\jsonrpc.py", line 225, in _request
    check_for_errors(response)
  File "C:\Python27\lib\site-packages\jsonrpclib\jsonrpc.py", line 529, in check_for_errors
    raise ProtocolError((code, message))
jsonrpclib.jsonrpc.ProtocolError: (-32601, u'Method not found: server_info')

I have the python libs in the global-sites packages.

dragonfly==0.6.5
jsonrpclib==0.1.3
pygame==1.9.1
pyparsing==2.0.3
pywin32==218
PyXML==0.8.4

I noticed that if I print the address from the Proxy class's _execute_batch, it points to u'localhost' no matter what I put in aenea.json.

If I set the server to 192.168.56.1 it gives the error pasted above.

If the server is set to localhost, I get the error below.

Socket error connecting to aenea server. To avoid slowing dictation, we won't try again for 5 seconds.

Any help much appreciated.

Key("control:down") not working

Aenea is super slick, thanks again for releasing this!

I see that Key("c:down").execute() works but repeats 'c'. I have been unable to get control:down to work, although perhaps I'm not trying the right keyword (this is the word for Dragonfly).

Let me know your thoughts? Thanks!

Windows VM config

This isn't really an issue with aenea, but rather a discussion about configurations for Windows VMs for those using aenea.

I first tried to use aenea with a Windows XP VM with 2 cores, 2G RAM, and Dragon using BestMatch IV. Dictation was accurate but terribly slow due to XP not supporting multi-core systems well. I then updated the Windows kernel, removed support for multiple cores, and got a significant performance boost. Unfortunately, Dragon's BestMatch IV algorithm is terribly inaccurate in single-core mode.

So I'm currently left configuring a windows 7 ultimate vm to attempt to use aenea again.

I'm curious to hear what configs work best for others.

Aenea Dictation Client keystroke errors with OS X server

Thank you for creating and maintaining this excellent set of tools. I'm excited to get back to programming after many months of RSI.

I have everything working with a Windows 10 client and OS X 10.11.3 server except that I'm having trouble with the Aenea Dictation Client.

With the current code, when I start capturing dictation with the Aenea Dictation Client I get the error:

ProtocolError: (-32603, u'Server error:   File "server_osx.py", line 414, in key_press | RuntimeError: Don\'t know how to handle keystroke Control_R')

I assume the purpose of pressing control is to test that the client is working. If I comment out line 74 of aenea_client.py then dictation is captured and transferred to the OS X server until a key outside of LITERAL_KEYS is pressed, say through a command from a grammar. Then I get errors such as:

ProtocolError: (-32603, u'Server error:   File "server_osx.py", line 414, in key_press | RuntimeError: Don\'t know how to handle keystroke Num_Lock')

or

jsonrpclib.jsonrpc.ProtocolError: (-32603, u'Server error:   File "server_osx.py", line 414, in key_press | RuntimeError: Don\'t know how to handle keystroke ctrl')

And dictation no longer transfers from the client to server until I restart Dragon on Windows and restart the Aenea Dictation Client.

I assume this is a problem with key translation. Is there a way that I can fix this so that it works with the OS X server?

Here's a full traceback:

Exception in Tkinter callback
Traceback (most recent call last):
  File "C:\Python27\lib\lib-tk\Tkinter.py", line 1486, in __call__
    return self.func(*args)
  File "aenea_client.py", line 217, in start_capture
    self.proxy_buffer.start_capture()
  File "aenea_client.py", line 74, in start_capture
    aenea.ProxyKey('Control_R').execute()
  File "C:\Python27\lib\site-packages\dragonfly\actions\action_base.py", line 127, in execute
    if self._execute(data) == False:
  File "C:\Python27\lib\site-packages\dragonfly\actions\action_base.py", line 179, in _execute
    self._execute_events(self._events)
  File "C:\NatLink\NatLink\MacroSystem\aenea\proxy_actions.py", line 130, in _execute_events
    aenea.communications.server.execute_batch(commands)
  File "C:\NatLink\NatLink\MacroSystem\aenea\communications.py", line 100, in execute_batch
    self._execute_batch(batch, aenea.config.USE_MULTIPLE_ACTIONS)
  File "C:\NatLink\NatLink\MacroSystem\aenea\communications.py", line 87, in _execute_batch
    batch[0][0])(*batch[0][1], **batch[0][2])
  File "C:\Python27\lib\site-packages\jsonrpclib\jsonrpc.py", line 290, in __call__
    return self.__send(self.__name, kwargs)
  File "C:\Python27\lib\site-packages\jsonrpclib\jsonrpc.py", line 238, in _request
    check_for_errors(response)
  File "C:\Python27\lib\site-packages\jsonrpclib\jsonrpc.py", line 567, in check_for_errors
    raise ProtocolError((code, message))
ProtocolError: (-32603, u'Server error:   File "server_osx.py", line 414, in key_press | RuntimeError: Don\'t know how to handle keystroke Control_R')

Project root aenea.json not optional

I was trying to use aenea with all the default config, but this unfortunately meant there was no way for me to point at my Linux X11 server. So I copied aenea.json.example into the Natlink directory, made the host change, and made project_root point at my aenea installation. In particular, I wanted the settings I had in aenea's grammar_config directory to be picked up.

In any event, my project root didn't have an aenea.json file in it, and so things blow up. If a config file is found, the logic goes to the project root and starts recursively trying to load additional configuration. However, it doesn't seem to handle the case where the file doesn't exist.

https://github.com/dictation-toolbox/aenea/blob/master/client/aenea/config.py#L37

is the current line that's problematic.

If the configuration isn't optional, then handling the missing file and printing a useful help message would be preferable. If it should be optional, then it should just silently fail. I can pull together a PR for either case, but please advise on the correct behavior.
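A minimal sketch of the "optional" behavior, assuming the project config is plain JSON; the function name and messages here are hypothetical and do not match config.py's actual internals:

```python
import json
import os


def load_project_config(project_root):
    # Hypothetical helper illustrating the proposed fix; the real logic
    # lives in client/aenea/config.py and uses different names.
    path = os.path.join(project_root, 'aenea.json')
    if not os.path.exists(path):
        # Missing file: warn and fall back to defaults instead of crashing.
        print('aenea: no aenea.json in project root %r; using defaults'
              % project_root)
        return {}
    with open(path) as f:
        return json.load(f)
```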

stopgap doesn't load on Windows

I'm not really sure what the stopgap grammar is; it looks like it may not even be meant to run on Windows. In any case, it fails to load because the "h" modifier doesn't work with the built-in Key implementation. Knowing a bit more about what it should be doing would make fixing it a lot easier, so any help is appreciated.

Changing commands in vocabulary.json results in KeyError

I think I've fixed this here but it's really hacky:
sweetmandm@d9f3be3

Basically, it looks like in most cases a vocabulary should specify {spoken: executed}, but _vocabulary.py expects {executed: spoken}. If you change vocabulary.json to use a different spoken command, the original key is still referenced later and the lookup breaks with a KeyError.

This is what I get when I change vocabulary.json to include "enable vocab <vocabulary>": "enable vocabulary <vocabulary>"

Error loading _vocabulary from C:\NatLink\NatLink\MacroSystem\_vocabulary.py
Traceback (most recent call last):
  File "C:\NatLink\NatLink\MacroSystem\core\natlinkmain.py", line 340, in loadFile
    imp.load_module(modName,fndFile,fndName,fndDesc)
  File "C:\NatLink\NatLink\MacroSystem\_vocabulary.py", line 62, in <module>
    class EnableRule(dragonfly.CompoundRule):
  File "C:\NatLink\NatLink\MacroSystem\_vocabulary.py", line 63, in EnableRule
    spec = command_table['enable vocabulary <vocabulary>']
KeyError: 'enable vocabulary <vocabulary>'
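A sketch of the mapping inversion described above; `vocabulary` here is a stand-in for the parsed vocabulary.json, not aenea's actual loader code:

```python
# The vocabulary file maps {spoken: executed}, but _vocabulary.py looks
# commands up by the canonical (executed) form, so invert the table once
# at load time into {executed: spoken}.
vocabulary = {'enable vocab <vocabulary>': 'enable vocabulary <vocabulary>'}

command_table = {executed: spoken for spoken, executed in vocabulary.items()}

# Lookup by the canonical form now succeeds even after the spoken form
# was customized.
spec = command_table['enable vocabulary <vocabulary>']
```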

Grammars not working in some apps

I have a vim grammar 'quit': Key('escape') that works when using MacVim, but not when using IntelliJ with IDEAVim:

  • The word quit usually isn't recognised when IntelliJ is the current window, but when MacVim is the current window it is recognised much more reliably.
  • When it is recognised, the text quit is typed instead of the escape key being pressed.

Network protocol should support authentication

Given that we allow full remote control of the keyboard and mouse, an attacker who could connect to the server could do rather evil things. Investigate TLS, Keyczar, and srp. The priority should be simplicity, so people don't just disable it, and ideally it shouldn't rely on certificate authorities.

Text is mangled when sent to a windows VM via xdotool

This may be related to #28, but I wanted to make this a separate issue since I did not experience any problems prior to commit 775af7c.

Basically, some characters appear to get dropped so that only a subset gets through. It's not always a problem but seems to happen intermittently.
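One possible mitigation (an assumption, not a confirmed fix for this issue): xdotool's `type` command accepts a `--delay` option giving the pause in milliseconds between keystrokes, and slowing it down often prevents dropped characters when typing into a VM window. This hypothetical helper only builds the argv, so server code could run it with subprocess:

```python
def xdotool_type_args(text, delay_ms=100):
    # Build the xdotool command line; '--' stops option parsing so that
    # text beginning with '-' is still typed literally.
    return ['xdotool', 'type', '--delay', str(delay_ms), '--', text]


# e.g. subprocess.check_call(xdotool_type_args('hello world'))
```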

bug: aenea raises WindowsError when Userdirectory is not set in natlink

pull request #131 introduced a bug: userDirectory can be ''

Error loading _translated_grammars from C:\NatLink\NatLink\MacroSystem\_translated_grammars.py
Traceback (most recent call last):
  File "C:\NatLink\NatLink\MacroSystem\core\natlinkmain.py", line 309, in loadFile
    imp.load_module(modName,fndFile,fndName,fndDesc)
  File "C:\NatLink\NatLink\MacroSystem\_translated_grammars.py", line 2, in <module>
    import dragonfly_grammars
  File "c:\users\nihlaeth\git\dragonfly-grammars\dragonfly_grammars\__init__.py", line 5, in <module>
    import dragonfly_grammars.aenea_
  File "c:\users\nihlaeth\git\dragonfly-grammars\dragonfly_grammars\aenea_.py", line 26, in <module>
    import aenea
  File "c:\users\nihlaeth\git\aenea\client\aenea\__init__.py", line 18, in <module>
    import aenea.communications
  File "c:\users\nihlaeth\git\aenea\client\aenea\communications.py", line 30, in <module>
    _server_config.write()
  File "c:\users\nihlaeth\git\aenea\client\aenea\configuration.py", line 56, in write
    os.makedirs(os.path.split(self._path)[0])
  File "C:\Python27-32bit\Lib\os.py", line 157, in makedirs
    mkdir(name, mode)
WindowsError: [Error 3] The system cannot find the path specified: ''
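A minimal sketch of a guard for this case, assuming the fix belongs around the `os.makedirs` call in configuration.py's write(); the helper name is hypothetical. When the configured path has no directory component, `os.path.split(path)[0]` is `''`, and `os.makedirs('')` raises the WindowsError above, so the call should be skipped:

```python
import os


def ensure_parent_dir(path):
    # os.path.split('server.json') yields ('', 'server.json'), and
    # os.makedirs('') raises OSError/WindowsError, so only create the
    # parent when there actually is a directory component.
    parent = os.path.split(path)[0]
    if parent and not os.path.isdir(parent):
        os.makedirs(parent)
```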
