mobile-artificial-intelligence / maid
Maid is a cross-platform Flutter app for interfacing with GGUF / llama.cpp models locally, and with Ollama and OpenAI models remotely.
License: MIT License
When using a local GGUF model (TinyLlama in my case), changing the character makes the app shut itself down. When it is reopened, it shows the assistant loading indicator just as it was before the shutdown. The theme also reverts to the dark theme.
Will add more info later after investigating.
I want to integrate stable-diffusion.cpp into Maid at some point. I'm submitting this issue as an expression of interest for anyone who wants to implement it for me.
Possibly related to #253
Hi
I am trying to set up Maid as an Ollama client. I select "ollama" as "API Type" and set the remote URL to "http://192.168.1.53:11434". I tested the URL in a browser and got the generic OK message from my Ollama server. But when I click on the "Remote Model" list, it is empty.
Please, see attached screenshot:
I sniffed the network, and Maid is not querying the Ollama server when I enter the URL or when I click the dropdown list.
OS: Android 11
RAM: 5 GB
Maid: 1.1.7
Right now on the desktop build (Linux), Enter doesn't send the message but inserts a newline instead. Enter should send, and Shift+Enter / Ctrl+Enter should insert a newline.
Hi, would you mind adding Tasker integration, perhaps with the help of intents? Thanks a lot in advance.
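The requested key behavior can be sketched as a small decision function. This is only an illustration in Python, not Maid's actual Dart code; `KeyAction` and `enter_key_action` are hypothetical names.

```python
from enum import Enum


class KeyAction(Enum):
    SEND = "send"
    NEWLINE = "newline"


def enter_key_action(shift: bool = False, ctrl: bool = False) -> KeyAction:
    """Decide what pressing Enter should do on a desktop build.

    Plain Enter sends the message; Enter combined with Shift or Ctrl
    inserts a newline, as requested in the issue above.
    """
    if shift or ctrl:
        return KeyAction.NEWLINE
    return KeyAction.SEND
```

A mobile build could bypass this function entirely and always insert a newline, leaving sending to the on-screen button.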
I found a bug which is apparently reproducible.
Step 1: have a longer conversation. Step 2: while the model is outputting text, scroll up (the output must be out of sight) and back down. That's it: the text box has now reverted to the three dots and is stuck in this state.
LLM model targeted for mobile devices, available in 1.4B and 2.7B
https://github.com/Meituan-AutoML/MobileVLM
Apparently only "MobileLLaMA" is available right now and not "MobileVLM"
Model links:
https://huggingface.co/mtgv/MobileLLaMA-1.4B-Base
https://huggingface.co/mtgv/MobileLLaMA-1.4B-Chat
https://huggingface.co/mtgv/MobileLLaMA-2.7B-Base
https://huggingface.co/mtgv/MobileLLaMA-2.7B-Chat
I know that #226 asked for that on Desktop, but that behavior is bad on mobile because you don't have ctrl on your usual mobile keyboard and pressing the send button of the app is simple enough. So we either need an option in the settings or this feature needs to change depending on the build. I tried the latest build which isn't released, and it does add the behavior to the Android apk.
Further testing that build, it also seems to mess with the model somehow.
It took me a while to notice the correlation, but I did some tests and am now sure about this: pressing the reload button (which lets the model replace its answer with a new one) deletes the whole context. This means that it forgets everything that was said before the last prompt (which obviously leads to hallucinations).
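For comparison, the expected behavior of a reload could look like the sketch below: only the last assistant answer is dropped and everything before it is kept. The message structure (a list of dicts with `role` and `content`) and the function name are assumptions made for illustration, not Maid's data model.

```python
def context_for_regeneration(history):
    """Return the messages to feed the model when regenerating the last answer.

    Only the trailing assistant message is removed; all earlier turns,
    including the last user prompt, stay in the context.
    """
    if history and history[-1]["role"] == "assistant":
        return history[:-1]
    return list(history)
```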
Originally posted by Slavczays December 20, 2023
Hello. First of all, thanks for your great app, fantastic work. Is there any chance of adding support for Phi-2 (2.7B), the newest small LLM from Microsoft? A GGUF model (quantized by TheBloke) is available on HF, but it doesn't seem to be compatible with your app.
Maybe this will be helpful: it looks like Phi-2 support was just recently added to llama.cpp https://github.com/ggerganov/llama.cpp/pull/4490#pullrequestreview-1787346569
As #193 (comment) says, it's already planned, so this issue is just to follow the development. My tests with the AI show that this is really needed, and I expect big improvements from it. Hopefully ChatML will become the standard and no more new formats will be needed.
I was hoping that I wouldn't need to dig into this once #203 was finished. Since that's closed, it probably makes sense to report this bug properly. I still don't fully understand it.
To reproduce the crash all you need to do is set user alias to <|im_start|>user and response alias to <|im_start|>assistant.
After that write to the model and it will crash.
However, if you use user and assistant instead, no crash happens. The most confusing part is that if you only put <|im_start|> in front of one of the two aliases, it won't crash. The app crashes only if both have <|im_start|>.
These behaviors are 100% reproducible.
It's happening with every model I have (shearedllama2.7b, stablelm3b). I'm running the latest app from Actions.
Getting "app not installed" on Android version 10.
Hi,
When I run “flutter run” to build and run the project on my Android phone, it fails with the message below. It seems that the llama.cpp submodule is missing a file, or possibly something else. I did some research and found ggerganov/llama.cpp#3902; however, it seems that Maid's llama.cpp submodule is already updated to the most recent llama.cpp.
Does anybody have any idea what the problem could be and how to fix it? And is anyone able to run the project?
Thanks!
Error:
`FAILURE: Build failed with an exception.
com.android.ide.common.process.ProcessException: ninja: Entering directory `/home/dev/maid/android/app/.cxx/Debug/6o266g5k/arm64-v8a'
C++ build system [build] failed while executing:
/usr/bin/ninja
-C
/home/dev/maid/android/app/.cxx/Debug/6o266g5k/arm64-v8a
core
ggml_shared
llama
from /home/dev/maid/android/app
ninja: error: '/home/dev/maid/src/llama.cpp/.git/modules/src/llama.cpp/index', needed by '/home/dev/maid/src/llama.cpp/common/build-info.cpp', missing and no known rule to make it
Run with --stacktrace option to get the stack trace.
Run with --info or --debug option to get more log output.
Run with --scan to get full insights.
BUILD FAILED in 19s
Running Gradle task 'assembleDebug'... 19.7s
Exception: Gradle task assembleDebug failed with exit code 1`
I noticed that when a user attempts to run the app with an unsupported local model, the app tries to generate something and subsequently crashes, even if the device has enough RAM.
I suggest implementing error handling at a low level to address this issue.
llama.cpp already handles this, so it should be simple to fix.
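One cheap pre-check that could run before a model is handed to the backend is to look at the file header. GGUF files start with the four magic bytes `GGUF` followed by a little-endian 32-bit version number. The sketch below only detects files that are not GGUF at all; it does not detect an unsupported architecture, which would still need the error reported by llama.cpp. The function name is hypothetical.

```python
import struct

GGUF_MAGIC = b"GGUF"


def gguf_version_from_header(header: bytes):
    """Return the GGUF version from the first 8 bytes of a file, or None.

    None means the data does not look like a GGUF file, so the app
    could show an error instead of trying to generate.
    """
    if len(header) < 8 or header[:4] != GGUF_MAGIC:
        return None
    (version,) = struct.unpack("<I", header[4:8])
    return version
```

In practice the header would come from something like `open(path, "rb").read(8)`.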
As the title says, I would prefer to add the examples manually. The way it is now, I need to delete them every time.
This is something I have known about for a while, because I reinstalled the app a few times.
The thing is, Android apparently has a "feature" that automagically saves app data to Google Drive. In the case of Maid, this means that if you uninstall the app and reinstall it, you will probably see your old conversation plus the bot's profile picture, which Android silently saved in the cloud.
You can read more about this behavior here:
https://developer.android.com/guide/topics/data/autobackup
I would really like it if Maid opted out of this feature.
Tokens per second and the amount of time it took to generate response would be good metrics to have.
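Both metrics follow directly from a token count and two timestamps. A minimal sketch, assuming the app can count generated tokens and record start and end times in seconds (all names here are hypothetical):

```python
def generation_stats(token_count: int, start_time: float, end_time: float) -> dict:
    """Compute elapsed generation time and tokens per second.

    Times are in seconds, for example from time.monotonic().
    A zero or negative duration yields 0.0 tokens per second
    instead of raising a division error.
    """
    elapsed = end_time - start_time
    tokens_per_s = token_count / elapsed if elapsed > 0 else 0.0
    return {"elapsed_s": elapsed, "tokens_per_s": tokens_per_s}
```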
I tried to run flutter create . on my Mac but got this error:
"Maid" is not a valid Dart package name.
The name should be all lowercase, with underscores to separate words, "just_like_this". Use only basic Latin letters and Arabic digits: [a-z0-9_]. Also, make sure the name is a valid Dart identifier—that it doesn't start with digits and isn't a reserved word.
See https://dart.dev/tools/pub/pubspec#name for more information.
Try "maid" instead.
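The rules in that error message can be approximated with a short check. This is a sketch of the stated rules only, not the validator that Flutter or pub actually uses, and the reserved-word set below is a small, deliberately incomplete sample.

```python
import re

# Partial sample of Dart reserved words, for illustration only.
SAMPLE_RESERVED_WORDS = {"class", "if", "else", "for", "while", "return",
                         "void", "var", "new", "null", "true", "false"}

_NAME_PATTERN = re.compile(r"[a-z_][a-z0-9_]*")


def looks_like_valid_package_name(name: str) -> bool:
    """Check a name against the rules quoted in the error message above.

    Lowercase letters, digits and underscores only, not starting with
    a digit, and not one of the sampled reserved words.
    """
    if _NAME_PATTERN.fullmatch(name) is None:
        return False
    return name not in SAMPLE_RESERVED_WORDS
```

With these rules, "Maid" is rejected because of the uppercase letter, while "maid" passes.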
The stop generation button disappeared on Android.
Originally posted by @Ar57m in #203 (comment)
As the title says, I want to set a system message but I'm not sure. I like the raw input field of gpt4all, character cards are a bit cryptic to me.
Actually, I need ChatML for openhermes.
I also don't know if I need to indicate the new line after user or if that's added automatically.
Well, forget everything above and let's change this to a bug report, because the preprompt can't be changed at all; even saving the JSON, editing it manually, and loading it again won't work. There isn't a workaround either, because the Maid preprompt is automatically added as soon as you create a new character, so there is no way of getting rid of it.
I guess this is a known missing feature, but once the app is closed, the image for the character is gone. The app should save a copy in its internal memory.
Please add min_p to user parameters!
Thanks
I haven't written Dart code until now and I'm viewing the code on mobile, but I'm pretty sure that "user" is hardcoded into the string here, right? It is never replaced by the alias.
chatml_pfx = ::llama_tokenize(ctx, "\n<|im_start|>user\n", add_bos, true);
This means that the alias should have no effect at all if you use ChatML? (That's not a bad thing.)
By looking at the code, I found out that instruct mode adds the "###" and that ChatML is meant to be used without instruct mode. That was probably one reason why I had problems with ChatML in the past (I had both active).
So am I right that editing the alias for the user should have no effect at all? Or is the alias string still added somewhere even if instruct mode is disabled?
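For reference, ChatML wraps each turn as `<|im_start|>role`, a newline, the content, and `<|im_end|>`, with fixed role names rather than user-chosen aliases. The sketch below shows that layout; it is an illustration of the format and not a description of what Maid or llama.cpp do internally. The function name and message structure are assumptions.

```python
def format_chatml(messages):
    """Render a list of {"role", "content"} dicts as a ChatML prompt.

    The prompt ends with an opened assistant turn so that the model
    continues with its reply.
    """
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
        for m in messages
    ]
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)
```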
I am using ngrok and I saw that chat was not working and the main reason was
{
"error": "option \"penalize_newline\" must be of type boolean"
}
And in the request it was sending "penalize_newline": 1
But when I removed the parameter, it worked fine. Otherwise, at least make it true or false instead of 0 or 1.
I'm on a Pixel 7 (not Pro) with 7 GB of RAM.
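A client-side fix could coerce such flags to real booleans before the request body is serialized. A minimal sketch; the function name and the set of boolean options are assumptions, with `penalize_newline` taken from the error above.

```python
import json

# Options the server expects as JSON booleans (only the one reported above).
BOOLEAN_OPTIONS = {"penalize_newline"}


def normalize_options(options: dict) -> dict:
    """Return a copy of the options with 0/1 flags converted to bool."""
    fixed = dict(options)
    for key in BOOLEAN_OPTIONS:
        if key in fixed:
            fixed[key] = bool(fixed[key])
    return fixed


if __name__ == "__main__":
    # The integer 1 becomes the JSON literal true.
    print(json.dumps(normalize_options({"penalize_newline": 1})))
```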
1.1.3 crashes out after prompting for the first time anything over a 3B model.
1.1.2 crashes out after prompting any model for the first time.
1.1.1 and 1.1.0 work with anything up to 7B but are considerably slower than 1.0.9, which is the version I'm currently using because of its speed and reliability.
Normal bugs I come across are that if I scroll down while the text is streaming it bugs out and goes back to the typing animation and doesn't regenerate, and changing basically any of the parameters results in a crash when prompting.
Just wanted to give feedback.
On an SD865 with 12 GB of RAM plus 4 GB of swap, I still have 7.3 GB of RAM available with Mistral 7B loaded.
Considering that the Android OS easily takes 2 GB, this means Maid uses less than 4 GB of RAM.
Maybe add a setting to force the model to stay in RAM?
Or implement these to make it faster on devices with limited RAM:
"PowerInfer"
https://huggingface.co/papers/2312.12456
"LLM in a flash"
https://huggingface.co/papers/2312.11514
On macOS 14.2.1
"ollama serve" entered in Terminal. Menu bar app not running.
MAID 1.1.8 on Pixel 8 Pro Android
All Permissions granted
Disabled firewall too
Both are on same Wifi network
ollama serve
2024/01/17 14:12:47 images.go:808: total blobs: 15
2024/01/17 14:12:47 images.go:815: total unused blobs removed: 0
2024/01/17 14:12:47 routes.go:930: Listening on 127.0.0.1:11434 (version 0.1.20)
In Maid: Model-> API Type: Ollama -> Remote URL http://127.0.0.1:11434
No Remote Models load in the dropdown (I have three on my MacBook Air that I use on the Mac normally)
Settings Log:
Model created with name: null
Character created with name: Maid
Nearby Devices - Permission granted
Error: ClientException with SocketException: Connection refused (OS Error: Connection refused, errno = 111), address = 127.0.0.1, port = 35932, uri=http://127.0.0.1:11434/api/tags
Nearby Devices - Permission granted
Error: ClientException with SocketException: Connection refused (OS Error: Connection refused, errno = 111), address = 127.0.0.1, port = 35356, uri=http://127.0.0.1:11434/api/tags
Model Saved:
Am I doing anything incorrectly? Is it looking on the wrong port: "address = 127.0.0.1, port = 35356" ? IDK I've tried many times but it seems to look at ports other than 11434.
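One thing worth checking in a setup like this: 127.0.0.1 is the loopback address, so on the phone it refers to the phone itself, not to the Mac. The server log above also shows Ollama listening on 127.0.0.1 only. As far as I know, Ollama's bind address can be changed with the `OLLAMA_HOST` environment variable, but treat that as something to verify. The port in the error message (35932, 35356) is most likely the local source port of the connection, while the request URI still targets 11434. A client could warn about loopback URLs with a check like this sketch (the function name is hypothetical):

```python
import ipaddress
from urllib.parse import urlsplit


def is_loopback_url(url: str) -> bool:
    """Return True if the URL's host points back at the local device."""
    host = urlsplit(url).hostname
    if host is None:
        return False
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal, e.g. a regular hostname.
        return False
```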
Implement Android native TTS for dyslexic people. Kind of like how Google Bard is doing.
Hello, would it be possible to tell llama.cpp to add <|im_end|> as an additional stop sequence if ChatML is selected? The reason is that some models aren't trained for any proper template. I can use them with ChatML mode, but they will end their output with <|im_end|> <|im_start|>. Using instruct mode instead of ChatML also brings some inconveniences, because the model doesn't know any of the templates. So it would be a simple solution to force the model to stop whenever it writes <|im_end|>.
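The effect of such a stop sequence can be sketched as a post-processing step on the generated text. This only illustrates the idea; a real implementation would stop generation as soon as the sequence appears rather than trimming afterwards. The function name is hypothetical.

```python
def truncate_at_stop(text: str, stop_sequences=("<|im_end|>",)) -> str:
    """Cut the text at the earliest occurrence of any stop sequence."""
    cut = len(text)
    for stop in stop_sequences:
        idx = text.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]
```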
First off, this app is great, so thank you for creating something like this!
But here is the issue: once I load the model and go into my session, after I enter a prompt and it finishes the first response, I go to respond and get a pop-up telling me to choose a model. I then choose a model and hit the back arrow at the top of the screen, but I still get the same message when I try to respond. And when it inevitably takes me back to the model settings page, there's no model loaded. Even if I load a model, go back to my chat window, and then go back to the model settings, the model isn't loaded.
I'm on an unrooted Pixel 7 Pro with 11 GB of RAM, if that matters, and the latest version of the app.
The context seems to be kept even if the session is deleted and a new session is started (and no, it's not because of the example dialogue saved by the character; I made sure to clean it). This bug seems to be new; I didn't have it with older versions.
In a clean installation (API 33 and API 34), not all permissions are requested.
As a result, when the file manager opens, only folders are visible.
Maybe it comes from this line:
https://github.com/MaidFoundation/maid/blob/a03e9bacd54db76a0472566b13bc49332ae23edc/lib/static/file_manager.dart#L13
After successfully loading a local GGUF model, the app cannot respond to prompts. Exception at line:
https://github.com/MaidFoundation/maid/blob/49c4b6cf55f5cfde62d45499b6ee275b1fd16f57/lib/core/bindings.dart#L31C16-L31C17
[ERROR:flutter/runtime/dart_vm_initializer.cc(41)] Unhandled Exception: Invalid argument(s): Failed to lookup symbol 'core_init': dlsym(RTLD_DEFAULT, core_init): symbol not found
#0 DynamicLibrary.lookup (dart:ffi-patch/ffi_dynamic_library_patch.dart:33:70)
#1 NativeLibrary._core_initPtr (package:maid/core/bindings.dart:31:78)
#2 NativeLibrary._core_initPtr (package:maid/core/bindings.dart)
#3 NativeLibrary._core_init (package:maid/core/bindings.dart:34:7)
#4 NativeLibrary._core_init (package:maid/core/bindings.dart)
#5 NativeLibrary.core_init (package:maid/core/bindings.dart:25:12)
#6 LocalGeneration._init (package:maid/core/local_generation.dart:113:20)
#7 LocalGeneration.prompt (package:maid/core/local_generation.dart:137:7)
Does saving the character image really encode the character information into the PNG? If so, that's really cool. But it seems to have a problem with emojis inside the preprompt, so that needs a fix, unfortunately.
Hi,
The only way currently to use the app on Android is to manually unzip the folder and run it on an Android device.
If possible, please implement the following -
Thanks!
Running Maid on a Moto G9 with Android 11. I tried to run two 1B models obtained from Hugging Face, this one and another one. The model loads successfully, but the app crashes when I submit a prompt. I'm running it locally. Another 3B model works fine, but the 1B models that I have tried so far crash on my device.