Comments (18)

dylan-chong avatar dylan-chong commented on July 17, 2024

from aenea.

calmofthestorm avatar calmofthestorm commented on July 17, 2024

@t4ngo hasn't been seen in a few years, but that was also true a while back when they showed up, answered a bunch of our questions, joined the dictation-toolbox org, moved dragonfly to GitHub, then disappeared again :-)

While I think that dragonfly is a mostly stable, well tested, and complete project that's usable as is, this isn't the first issue we've wanted to fix but been unable to. I would support moving to a fork. Since most of the changes we want are fairly lightweight (indeed, already committed just not released in this case), they could always be upstreamed in the future if it makes sense.

AFAICT (please correct me if I'm wrong) the most active Dragonfly fork is Danesprite/dragonfly. It seems like the main intent is to support the Sphinx engine*, but it may be worth reaching out to @Danesprite as to whether it makes sense to try to standardize on that as a fork. The main request there I think would be to distribute a package we could depend on in requirements.txt. From a quick skim it's not clear to me whether this fork is already heavily integrated with Sphinx. That would not necessarily be a problem provided it is a soft dependency and does not add additional complexity to the parts of the stack we use.
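To make the "soft dependency" idea concrete, here is a minimal sketch of the usual Python pattern: try to import the optional backend and carry on without it if it is missing. The helper names and the engine-to-module mapping are hypothetical and for illustration only; nothing here is taken from the fork's actual code.

```python
import importlib


def load_optional(module_name):
    """Return the imported module, or None if it is not installed."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


def available_engines(candidates):
    """Map each engine label to True/False depending on importability."""
    return {label: load_optional(module) is not None
            for label, module in candidates.items()}


# Hypothetical mapping (an assumption for this sketch): a label for each
# optional backend and the module it would need in order to be usable.
CANDIDATES = {"sphinx": "pocketsphinx", "stdlib-demo": "json"}

if __name__ == "__main__":
    print(available_engines(CANDIDATES))
```

With this shape, code that never touches the optional backend runs the same whether or not the extra package is installed.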

I think the process forward would be to do some looking around to see if anyone already forked it, and if not we could add one to the dictation-toolbox organization. I think this would be as simple as forking, changing the package name so we can upload packages, etc, and making regular releases. I'm willing to admin it, co-admin it, or if someone more active wants to that'd be fine too.

* As an aside, I think the long term future of voice coding, especially on Linux, needs to be based on a fully open source stack. Copyright issues are a huge hassle with Aenea -- if it were legal to do so I'd just distribute a pre-configured VM image that would make this project much more widely accessible to people who don't have the background or time to devote to getting this Rube Goldberg machine working. I did look briefly into fully automated installation workflows but there's just too much variance in which version of Dragon, Windows, etc to make that workable, and I still was not sure of the legality. I think that Sphinx is nowhere near Dragon in terms of core recognition capabilities, but in the long run flexibility will prevail. Cloud-based speech to text APIs are also a possibility, but they may be geared toward a different feature set, and it may be legally difficult for me to get involved with that. Projects like Silvius are promising.


calmofthestorm avatar calmofthestorm commented on July 17, 2024

A further point of discussion: if we do end up forking it, what are your thoughts on whether we should minimize our changes to fixing bugs/adding features vs major refactors/modernization/etc?

The Dragonfly code is very, very old, and I do think a refactor would be in order if someone wanted to. But I also think there is value in minimizing the changes we make to simplify upstreaming them if @t4ngo returns.


oneyb avatar oneyb commented on July 17, 2024


calmofthestorm avatar calmofthestorm commented on July 17, 2024

I appreciate you mentioning the work-around, but I am uneasy putting something so brittle into the requirements. I think the best long term plan will be to move to a fork of dragonfly and depend on that. I'll try to look into that at some point, but I have a lot going on right now and it may be a while. I agree tests would be nice, but I don't think they need to block the move, and I'm not sure when I'd find the time.

It would be good to see your setup in more detail. I never really finished mine to any state it's especially worth sharing -- it's worked well enough for the few years I've used it, but is full of bugs and quirks. I'm also on spacemacs these days, which makes this especially interesting. Do you happen to use org and have voice bindings for it?

Armchair speculation on the future and general rambling:

Nuance may be able to stick around by targeting niche use cases (medical and legal editions, for example) and cases where compliance or legal issues make cloud-based services more problematic. Or maybe they get acquired by someone wanting to do the same, or offer their tech on top of a different platform as a branding/licensing/support/support for niche cases kind of deal. Software Engineer Edition may happen, who knows, but I doubt it :-) I think their position is that the Visual Basic built into the more expensive edition should be good enough for anybody.

I haven't really been following cloud providers' efforts in the voice API space that closely. If they build general purpose APIs that allow custom grammars that'd be extremely powerful, but I wonder if they'll focus on that vs unstructured natural language, worry about latency, and to the extent that they do focus on custom grammars whether it will be less powerful than what you can do with Natlink (e.g., if the intent is just to implement a phone menu system or similar). There are also of course security concerns with hooking something like Aenea up to a remote system, and the pricing. And it may just stop working some day. If you're big enough and/or mainstream enough of a use case you can count on there being a clear path forward I think, but we're not.

Personally I also find value in an offline system for a tool critical to my life (though this stuff no longer is); cloud services can be discontinued, change focus, etc. This is especially true if your use case is a niche one. I can freeze Dragon + Dragonfly + Aenea on a VM and have confidence I'll still be able to run it in decades, so long as we have the ability to virtualize x86 (and I don't see us losing that ability in my lifetime).

We'll see; predicting the future is hard. If this were my life's work or I were dependent on it for income/etc, I would probably be worried about future and change. Because it's simply a tool that I need, I can afford not to worry. The setup may be fragile and complex but it's also very well isolated from the rest of the world moving along without it, meaning I'm confident I can keep it running as long as I need to. To be honest I'm surprised it's still used by as many people as it is many years later. They'll keep using it as long as it's the best available solution to their problem, and while I'm proud to have made something useful to others, I have no direct stake in it continuing to be used.


oneyb avatar oneyb commented on July 17, 2024

Yeah, that's the thing. I can freeze it, forget it, and raise it from the dead if I need it. I have an old version of Dragon running on a slimmed-down XP VM with your first version of Aenea. It works great! Not touching that thing; I may need it to work someday.

I can see how hardware companies have ancient software running on time-tested devices with a grumpy old guy that's responsible for its care and use. Don't touch a running system.

I will get my stuff up soon. I use holy-mode for voice-coding and evil-mode for typing. Nothing special there.


drmfinlay avatar drmfinlay commented on July 17, 2024

Hello all!

@calmofthestorm The original intent for my fork was indeed to write the Sphinx engine implementation, although it is not heavily integrated and the dependencies for it are also optional. There are a number of changes unrelated to the Sphinx engine in my fork too.

You're right, Sphinx is not near Dragon in terms of its capabilities. I believe the general inaccuracy with it is due to the acoustic models being trained with general prose in mind rather than commands for voice coding. The way Pocket Sphinx grammar searches work internally could also contribute to that. Silvius (at least with the model I used) does seem to do a better job accuracy wise.

@oneyb I guess I have taken over development of dragonfly, yeah. Some contributors from the Caster project and I have been fixing a few bugs with it this year. We also started versioning the changes with version 0.7.0 a few months ago. There's a changelog here if you're interested. I should be releasing another version soon. I'm happy for Aenea to use my fork of dragonfly, or to treat it as an upstream version. Feel free to make issues/PRs over there if you do decide to use it.

I agree with both of your points on offline and cloud-based speech recognition. Removing Windows as a requirement was partly why I wrote the Sphinx dragonfly engine implementation in the first place. It could still use some work of course.


wolfmanstout avatar wolfmanstout commented on July 17, 2024

For anyone who missed it: this discussion was continued on the official Dragonfly mailing list, and the general consensus was to switch to using Danesprite/dragonfly. I was able to update my repository to become "forked from" Danesprite by sending an email to GitHub support (there doesn't appear to be a way to do this yourself). I've been very happy with the switch! It is great to have an active Gitter channel to discuss improvements, and the folks involved have been very responsive to pull requests. And in case you are worried, this new repository does not force any Sphinx dependencies by default. For the most part, if you aren't interested in Sphinx, what you are getting is Dragonfly with some much needed basic fixes + Python 3 support.


LexiconCode avatar LexiconCode commented on July 17, 2024

On behalf of the Caster project: we've been pleased with the switch to Danesprite's fork since approximately Apr 8th this year. @Danesprite has been very helpful regarding a number of issues, including a Windows DPI scaling bug and Unicode decode errors in dragonfly discovered through Caster. I'm excited to see innovation with additional speech recognition engines and progress on cross-platform support. Not only is Sphinx integration maturing, but Google Speech API support is a WIP. There are additional plans by dwk to integrate Silvius as well. Looks like a bright future ahead :)


calmofthestorm avatar calmofthestorm commented on July 17, 2024

Thanks for the thoughts everyone, I appreciated the discussion here and on the list. I think the plan of record for Aenea is to set this up to depend on Danesprite/dragonfly once they start building a package. Or if for whatever reason they don't want to build a package, I guess we can depend on their github repo.

I will do this, but I can't commit to doing it in a timely manner. I have way too much going on in Oct and Nov to make commitments at this time.


dylan-chong avatar dylan-chong commented on July 17, 2024

@calmofthestorm There is an easy way to install the package. I could try installing it on my VM and see how it goes over a few days. It would be good to test before we officially depend on it.


dylan-chong avatar dylan-chong commented on July 17, 2024

Just had a look. https://github.com/Danesprite/dragonfly#installation Will have a go installing that now.


calmofthestorm avatar calmofthestorm commented on July 17, 2024

Oh great, it's a package already. That'll make this easier. Thanks for volunteering to test it! If all I need to do is add a 2 to requirements.txt I think I can manage that in the near term:-)
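For readers wondering what "add a 2" refers to, a minimal sketch of the change, assuming the fork is published on PyPI under the name `dragonfly2` (the package name is an assumption here, and no version pin is implied by the thread):

```text
# requirements.txt (sketch): depend on the fork's package rather than the original
dragonfly2
```

Under that assumption, `pip install -r requirements.txt` would pull in the fork.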


dylan-chong avatar dylan-chong commented on July 17, 2024

@calmofthestorm I am going to have a go installing it now. I might as well make a pull request with the requirements.txt change once I have proven it is working well after a few days.


dylan-chong avatar dylan-chong commented on July 17, 2024

@calmofthestorm I don't remember using requirements.txt to install the packages in the VM. Is the requirements.txt there more for documentation than for easy install? requirements.txt is not mentioned in the README for this repo.


dylan-chong avatar dylan-chong commented on July 17, 2024

This issue should have been automatically closed in #173

Can someone close this?


oneyb avatar oneyb commented on July 17, 2024

Sounds good! Thanks again.

