
Comments (69)

keldenl commented on August 14, 2024

I'll put up a fork of AutoGPT tmr with all the changes I made (openai BASE_URL and configurable vector dimensions) to make it easier for folks to replicate and test as well.

keldenl commented on August 14, 2024

🚀 🚀 AUTOGPT + GPT-LLAMA.CPP GUIDE IS NOW AVAILABLE: https://github.com/keldenl/gpt-llama.cpp/blob/master/docs/Auto-GPT-setup-guide.md

HUGE shoutout to @DGdev91 for putting up the PR to make this possible – hopefully the PR gets merged to master soon.

Here's a (very short) demo of it running on my M1 Mac: https://github.com/keldenl/gpt-llama.cpp/blob/master/docs/demos.md#Auto-GPT

keldenl commented on August 14, 2024

Just pushed 2 more changes (12a9567) that should DRAMATICALLY improve Auto-GPT support. Please please please pull if you want the latest and greatest. (bumped the npm package to version 0.2.0 as well).

I wanted to quickly go over the previous issues that made gpt-llama.cpp unreliable with Auto-GPT, the new changes that resolve these issues, and next steps.

Previous Issues

  1. llama.cpp crashed when max_tokens = null. Since Auto-GPT sometimes sends this, it would straight up crash llama.cpp and show the dreaded Readable Stream: CLOSED in gpt-llama.cpp
  2. gpt-llama.cpp's embeddings endpoint didn't play well with interactive mode in ChatCompletion, which caused llama.cpp instances to crash randomly
  3. Concurrent OpenAI requests coming from Auto-GPT either crashed or significantly slowed down llama.cpp processes, causing them to either crash or never complete
  4. llama.cpp just... stops working midway with long prompts sometimes. I experienced this when playing with llama.cpp directly, and sometimes pressing enter causes it to resume responding. I swear there was a thread in the llama.cpp repo about this but I can't find it

Fixes for the issues

  1. aa914f2 - gpt-llama.cpp now ignores max_tokens if it's null, as it should. No longer crashes.
  2. 1535eb1 - Add logic to kill existing spawned instances of llama.cpp properly when going from chat -> embeddings
  3. 12a9567 - Limit concurrency to 1. gpt-llama.cpp now ONLY processes one request at a time (to optimize for performance), but will queue up the other requests and will handle them in the order they came in
  4. 12a9567 - If llama.cpp is stuck for 20 seconds, it'll give it a nudge (new line). This has traditionally been my "solution" when using llama.cpp, and I'm just integrating it automagically into gpt-llama.cpp. We could make this smarter in the future by detecting how long previous token generations took and adjusting this timeout as needed, but for v1 it'll be hardcoded to 20 seconds (see the sketch below)

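For anyone curious how fixes 3 and 4 fit together, here's a minimal sketch of the idea (illustrative only; requestQueue, drain, and NUDGE_MS are made-up names, not the actual gpt-llama.cpp internals):

// One-at-a-time queue with a 20s "nudge" for a stuck llama.cpp process.
const { spawn } = require('child_process');

const requestQueue = [];
let processing = false;
const NUDGE_MS = 20000;

function enqueue(args, onDone) {
  requestQueue.push({ args, onDone });
  drain();
}

function drain() {
  if (processing || requestQueue.length === 0) return;
  processing = true;
  const { args, onDone } = requestQueue.shift();
  const child = spawn('../llama.cpp/main', args);

  // If llama.cpp goes quiet for 20 seconds, send a newline to un-stick it.
  let nudgeTimer = null;
  const resetNudge = () => {
    clearTimeout(nudgeTimer);
    nudgeTimer = setTimeout(() => child.stdin.write('\n'), NUDGE_MS);
  };
  resetNudge();

  child.stdout.on('data', (chunk) => {
    resetNudge();
    process.stdout.write(chunk);
  });

  child.on('close', (code) => {
    clearTimeout(nudgeTimer);
    processing = false;
    onDone(code);
    drain(); // handle the next queued request in arrival order
  });
}
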
Outcome? I'm happy to say that gpt-llama.cpp ACTUALLY runs continuously now (previously it bugged out pretty consistently), just without SUPER high quality responses. Here's a screenshot of vicuna trying real hard (and gpt-llama.cpp w/ vicuna 13b continuing the loop!)
(screenshot)

Next Steps
So where do we go from here? Joining the Auto-GPT discord and browsing the local-models channel, I found that there was a fork that significantly reduced the prompt. Now that Auto-GPT runs in a loop indefinitely (for me), it's time to focus on the quality of these responses. The focus is to:

  1. Continue pushing for Significant-Gravitas/AutoGPT#2594 to get merged as it's the PR that adds BASE_URL flexibility for gpt-llama.cpp
  2. Explore Bill's reduced prompt in his fork (Significant-Gravitas/AutoGPT@master...BillSchumacher:Auto-GPT:vicuna) and examine the response quality. It should improve since it requires a smaller context
  3. Investigate the escaped underscores in the responses that are causing auto-gpt to not recognize responses (see the sketch after this list)
  4. Mess with temperature and see if different settings may improve the responses
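
For point 3, the kind of band-aid I'm imagining is a pre-parse cleanup; a minimal sketch (unescapeUnderscores is a made-up name, not actual gpt-llama.cpp code):

// Strip backslash-escaped underscores (e.g. "eat\_cereal") before JSON
// parsing, since \_ is an invalid escape sequence in strict JSON.
function unescapeUnderscores(raw) {
  return raw.replace(/\\_/g, '_');
}

const raw = '{ "command": "eat\\_cereal" }'; // literal text contains eat\_cereal
console.log(JSON.parse(unescapeUnderscores(raw))); // { command: 'eat_cereal' }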

Appreciate everybody's hard work, and please pull the newest changes so we can keep hacking and trying to get this working! Thanks everybody~~~ lets goooooo we got this 🚀🚀🚀

maddes8cht commented on August 14, 2024

With my current set of prompts, I'm getting pretty consistent and reliable JSON responses from OpenAssistant 30b.

I'm getting improvements, but still inconsistent and unreliable results from any vicuna or koala 13b models.

OpenAssistant 30b runs (slooow) on my machine with gpt-llama.cpp, but got a significant speedup with the recent GPU update in llama.cpp.

This update will enable a lot more people to run 30b models that they couldn't use before, as they gain additional memory from their graphics cards.

I'm eager to try out IBM's new dromedary 60b model with gpt-llama and auto-gpt as it seems to be very smart, but maybe I need to wait until gpt-llama.cpp updates to the GPU support of llama.cpp. It's not because of the speedup, which may be negligible, but because of the additional 12 GB from my RTX 3060 that will enable these model sizes for me.
I've run this model with GPU (with the original llama.cpp main), and its knowledge in scientific realms is impressive; I guess it significantly beats ChatGPT 3.5, at least in this specific realm.

As has been said, the Auto-GPT capabilities seem to boil down to the "smartness" of the model.

Current 30b llama derivatives may manage the task "out of the box", at least with the customized prompts, while the smaller ones may never be able to, even with more specific prompts.

BUT:

Remember, llama itself wasn't capable of being used in chat conversations until Alpaca finetuning was released. With Alpaca, even small models can now be used in interactive chat.
AutoGPT (or similar software) will be a significant use case for LLMs.

It should be worth having specific training for this use case.

As "chat conversation" is a rather broad scenario, the Alpaca dataset with about 52,000 sets was pretty small.
A highly specific dataset that just teaches a very specific output format will be even smaller; maybe it's just a few dozen sets, which would be less than 1% of the Alpaca set and might be trainable within minutes instead of hours, even on common hardware.

So maybe we should make an attempt to create such a dataset for finetuning small models to work together with AutoGPT. This should be done in cooperation with AutoGPT development, as it should also involve the best prompts on which the dataset is to be trained.

This way, I am confident that even small models will become capable of working with AutoGPT.
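
Just to make the idea concrete, a single training pair could look something like this (a sketch only; the instruction/output field names follow the common Alpaca convention and would need to match whatever trainer is actually used):

// Sketch of one training pair teaching the Auto-GPT response format.
// Field names follow the Alpaca instruction/output convention; adjust
// to whatever fine-tuning tool is actually used.
const fs = require('fs');

const pair = {
  instruction: 'Determine which next command to use, and respond using the format specified above:',
  output: JSON.stringify({
    thoughts: {
      text: 'I should search the web first.',
      reasoning: 'The goals require gathering information from the internet.',
      plan: '- search\n- save results to a file\n- review',
      criticism: 'Avoid repeating the same search.',
      speak: 'Searching the web now.',
    },
    command: { name: 'google', args: { input: 'example query' } },
  }),
};

fs.appendFileSync('autogpt-format-dataset.jsonl', JSON.stringify(pair) + '\n');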

DGdev91 commented on August 14, 2024

Thanks for your work!

i was excited to try it out and made those changes myself. i also made a pull request: Significant-Gravitas/AutoGPT#2594

Feel free to add your changes if i missed something

keldenl commented on August 14, 2024

wow @DGdev91 that was EXACTLY what i was going to do.. thanks!! nothing to add for me

ntindle commented on August 14, 2024

Hey! AutoGPT maintainer here. We’d love to support local models. Is anyone here in our Discord that we can discuss getting this into a plug-in with?

baas-hans commented on August 14, 2024

In terms of LLM use for the auto-gpt prompting has anyone tried gpt4 x alpaca 13B model. I've found in general use this responds better than Vicuna 13B. Just wonder if it may help thinking

Vicuna is very good. Could be wrong, but the problem here is probably a difference in prompting style required. Note the gpt in both chatgpt and gpt4 x alpaca. They likely just share a better understanding of the particular prompts given by auto-gpt right now.

This works in llama.cpp, with eachadea_legacy-ggml-vicuna-13b-4bit, for example:

> generate json in the format { "command": "(command name)", "arguments": [ "arg1", ... ] } where command is one of "eat_cereal", "pour_cereal", "add_milk", etc.

{
"command": "eat\_cereal",
"arguments": [
"Kellogg's Frosted Flakes"
]
}

I think the prompts in the default Auto-GPT setup are way too verbose for use with local models (and this contributes to their confusion).
I have had most luck with the unfiltered vicuna 13b model (ggml), but am hoping to get the 7b wizardlm model working by "engineering" the prompts a bit.

The longer I work on minor tweaks on this project the more I think I should fork it and just rip out the openai section entirely, but don't want to just rewrite something else that already exists.

So far, I have tried with varying success making use of gpt-llama-cpp with Auto-GPT and BabyAGI. I've given AgentLLM a try, but so far getting that set up has turned out to be a nightmare.

My most recent "working" Auto-GPT is now connected to wizardlm via llama.cpp through their own server implementation. Important note: if you want it to work with Auto-GPT, you will have to modify their ChatCompletionRequest validator to not include the max_tokens field, as that keeps making it break. Still, it seems easier to understand than trying to make it run under some javascript language.

I am using DGDev's fork (so my repo is quite far behind master afaict), but otherwise, when I have more time I intend to get wizardlm working as my thinking/processing agent.

Doubtful this will help anyone, but good luck to any who try :)

keldenl commented on August 14, 2024

There may be a better way of prompting that could help with this – i was going to try babyagi and see how the results differ

bjoern79de commented on August 14, 2024

vicuna 13b, the best model i've tested so far, isn't doing a GREAT job generating actions that it can follow continuously (gets stuck in a loop)

What about running auto-gpt against open-ai with a large number of goals, persisting the question/answer pairs, and using that as a (LoRA) finetuning dataset for vicuna?
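
To sketch what I mean (logChatPair is a made-up helper name, nothing that exists yet), the logging side could be as simple as appending every prompt/response pair to a JSONL file:

// Made-up helper: append each prompt/response pair to a JSONL file so
// the pairs can later become a (LoRA) fine-tuning dataset for vicuna.
const fs = require('fs');

function logChatPair(messages, completion, file = 'finetune-pairs.jsonl') {
  fs.appendFileSync(file, JSON.stringify({ messages, completion }) + '\n');
}

// e.g. called wherever the OpenAI chat completion returns:
logChatPair(
  [{ role: 'user', content: 'Determine which next command to use...' }],
  '{"thoughts": {"text": "..."}, "command": {"name": "google", "args": {"input": "..."}}}'
);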

DGdev91 commented on August 14, 2024

Does this perform better than GPT-3.5 for AutoGPT?

It depends on the model you are using.
Vicuna 13b should have a quality close to gpt3.5 or even slightly better in some cases, but in my experience gpt3.5 is still better overall.

I'm sure in the near future we'll see many models able to outperform it.

Also... There's still some work to do to make llama and derivatives work for AutoGPT; right now many models aren't good enough to handle the json structure required by AutoGPT correctly.

ntindle commented on August 14, 2024

This is a significant amount of what I want to discuss. I want to make plug-ins capable of that if possible but it will require a good amount of architecture changing

lee-b commented on August 14, 2024

In terms of LLM use for the auto-gpt prompting has anyone tried gpt4 x alpaca 13B model. I've found in general use this responds better than Vicuna 13B. Just wonder if it may help thinking

Vicuna is very good. Could be wrong, but the problem here is probably a difference in prompting style required. Note the gpt in both chatgpt and gpt4 x alpaca. They likely just share a better understanding of the particular prompts given by auto-gpt right now.

This works in llama.cpp, with eachadea_legacy-ggml-vicuna-13b-4bit, for example:

> generate json in the format { "command": "(command name)", "arguments": [ "arg1", ... ] } where command is one of "eat_cereal", "pour_cereal", "add_milk", etc.

{
"command": "eat\_cereal",
"arguments": [
"Kellogg's Frosted Flakes"
]
}

DGdev91 commented on August 14, 2024

@DGdev91 i've been messing with the prompt too – how successful has your branch's prompt been?

Actually I didn't have much time to test that in the past days.
But in the few times I tried, it seemed that for llama and derivatives a good rule is "few is better".

DGdev91 commented on August 14, 2024

Most of the issues with the commands are caused by the LLM not understanding which string represents the actual command.
For example, in the google search case you posted 3 days ago the LLM wrote "Google Search" as the command, while AutoGPT expects "google".
A few hours ago a PR was merged which slightly changes the prompt: Significant-Gravitas/AutoGPT#4027
The original version was Exclusively use the commands listed in double quotes e.g. "command name"
Now it's
Exclusively use the commands listed below e.g. command_name
It could also help llama and derivatives, but it really depends on what specific model is used.

In other words: I don't think OpenAI's Gpt4 would really work with the fork used here. I think this fork doesn't really work anymore. To be able to experiment with local models in a meaningful way, we would first need a working, up-to-date fork of Auto-gpt (3.1) again.

I could be wrong, but then, please prove it.

If you are referring to my fork, i'm trying to get it merged and i often merge the changes from the main repository.
It should be in sync with AutoGPT's main branch, and also includes the PR i mentioned before.

Finally, i'm also trying to make the prompts configurable; this can be useful to find a prompt which works better for llama-derived llms: Significant-Gravitas/AutoGPT#3375

You can try both of my changes in the branch custom_base_url_and_prompts
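
As a stopgap for the command-name mismatch above, something like this could also be tried on the parsing side (a sketch only; KNOWN_COMMANDS and normalizeCommand are made-up names, and AutoGPT doesn't do this today):

// Sketch: map a loose command string like "Google Search" onto the
// closest known command name before dispatching it.
const KNOWN_COMMANDS = ['google', 'browse_website', 'write_to_file', 'task_complete'];

function normalizeCommand(name) {
  const slug = name.toLowerCase().trim().replace(/\s+/g, '_');
  if (KNOWN_COMMANDS.includes(slug)) return slug;
  // fall back to a prefix match ("google_search" -> "google")
  return KNOWN_COMMANDS.find((k) => slug.startsWith(k) || k.startsWith(slug)) ?? name;
}

console.log(normalizeCommand('Google Search')); // "google"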

DGdev91 commented on August 14, 2024

May I ask for your opinion on how to solve the error here? Thanks a lot!

Not every LLM is good enough to handle the json reply requested by AutoGPT.
Llama 7b is way too "basic" for that; you'll need at least something like Vicuna 13b, and even with that there are still some issues which have to be sorted out.

shock-wave007 commented on August 14, 2024

The issue is more related to the model than the code. This repo is great, but we are dealing with models that generate random text 🤣

keldenl commented on August 14, 2024

Just got AutoGPT working consistently from this embeddings fix b3db39f

Still need to manually update the vector size in autogpt to 5120 or 4096 depending on the llama model (see https://huggingface.co/shalomma/llama-7b-embeddings#quantitative-analysis), but it works!

vicuna 13b, the best model i've tested so far, isn't doing a GREAT job generating actions that it can follow continuously (gets stuck in a loop)

keldenl commented on August 14, 2024

Re-opening this until we get the PR Significant-Gravitas/AutoGPT#2594 fully merged

Neronjust2017 commented on August 14, 2024

@keldenl Hi, I tested gpt-llama.cpp's API using the following script:

curl --location --request POST 'http://localhost:443/v1/chat/completions' \
--header 'Authorization: Bearer /home/nero/code/llama.cpp/models/7B/ggml-model-q4_0.bin' \
--header 'Content-Type: application/json' \
--data-raw '{
   "model": "gpt-3.5-turbo",
   "messages": [
      {
         "role": "system",
         "content": "You are ChatGPT, a helpful assistant developed by OpenAI."
      },
      {
         "role": "user",
         "content": "How are you doing today?"
      }
   ]
}'

My model is the original llama 7B model and I successfully got the response from localhost:443:

{"choices":[{"message":{"content":" Great! Thanks for asking :)\n"},"finish_reason":"stop","index":0}],"created":1682008784901,"id":"Zi7iQapUGOgjeVVB2oJtc","object":"chat.completion.chunk","usage":{"prompt_tokens":99,"completion_tokens":7,"total_tokens":106}}% 

I also configured Auto-GPT according to this guide: https://github.com/keldenl/gpt-llama.cpp/blob/master/docs/Auto-GPT-setup-guide.md. But this time when I run python -m autogpt --debug, the server on localhost:443 doesn't return a response; the terminal looks like this:

--REQUEST--
user: ''' You are a helpful assistant.
''', '''
{
    "command": {
        "name": "command name",
        "args": {
            "arg name": "value"
        }
    },
    "thoughts":
    {
        "text": "thought",
        "reasoning": "reasoning",
        "plan": "- short bulleted
- list that conveys
- long-term plan",
        "criticism": "constructive self-criticism",
        "speak": "thoughts summary to say to user"
    }
}
'''
Readable Stream: CLOSED

Is there something wrong?

DGdev91 commented on August 14, 2024

> @keldenl Hi, I tested gpt-llama.cpp's API using the following script: […] Is there something wrong?

I guess you already have done it, but let's check anyway.

The example uses vicuna 13b as the model, so the settings in the .env file are set like this:

OPENAI_API_BASE_URL=http://localhost:443/v1
EMBED_DIM=5120
OPENAI_API_KEY=../llama.cpp/models/vicuna/13B/ggml-vicuna-unfiltered-13b-4bit.bin

In your case, it should be like this:

OPENAI_API_BASE_URL=http://localhost:443/v1
EMBED_DIM=4096
OPENAI_API_KEY=/home/nero/code/llama.cpp/models/7B/ggml-model-q4_0.bin

Have you already done that, or did you just copy the values from the example?

Also, i'm not 100% sure the llama 7b model is good enough to handle AutoGPT requests correctly. I suggest you try vicuna, ideally the 13b model, which is the one keldenl and I used for testing.

keldenl commented on August 14, 2024

@Neronjust2017 you're exactly right. the issue is the power of the model. i also see this issue sometimes when the response ends early and that crashes autogpt. i have had better luck with vicuna (vs. alpaca and llama), and 13B (7b is very inconsistent)

dany-on-demand commented on August 14, 2024

I almost always get timeouts on the autogpt side. Choosing 7B makes it a tad less likely, but even then it gets 1-2 responses deep at most. Using latest llama.cpp and node 20, by the way. I thought that to use avx512, users should pass a param to llama.cpp, no?

gpt-llama.cpp:

--LLAMA.CPP SPAWNED--
..\llama.cpp\build\bin\Release\main.exe -m ..\llama.cpp\models\vicuna\1.1TheBloke\ggml-vicuna-7b-1.1-q4_1.bin --temp 0 --n_predict 3008 --top_p 0.1 --top_k 40 -b 512 -c 2048 --repeat_penalty 1.1764705882352942 --reverse-prompt user: --reverse-prompt
user --reverse-prompt system: --reverse-prompt
system --reverse-prompt ## --reverse-prompt
## --reverse-prompt ### -i -p ### Instructions
Complete the following chat conversation between the user and the assistant. System messages should be strictly followed as additional instructions.

### Inputs
system: You are a helpful assistant.
user: How are you?
assistant: Hi, how may I help you today?
system: You are CatJokeFinder, a life-long cat joke lover and will occasionally be woken up from hibernation to fetch cat jokes as CatJokeFinder
Your decisions must always be made independently without seeking user assistance. Play to your strengths as an LLM and pursue simple strategies with no legal complications.

GOALS:

1. Find cat jokes on the internet
2. Sort them from worst to best
3. Discuss what makes a good cat joke and use this discussion to improve the sorting
4. Write the cat jokes into csv files, categorised
5. Occasionally, CatJokeFinder likes dog jokes too, so the same but for dog jokes


Constraints:
1. ~4000 word limit for short term memory. Your short term memory is short, so immediately save important information to files.
2. If you are unsure how you previously did something or want to recall past events, thinking about similar events will help you remember.
3. No user assistance
4. Exclusively use the commands listed in double quotes e.g. "command name"
5. Use subprocesses for commands that will not terminate within a few minutes

Commands:
1. Google Search: "google", args: "input": "<search>"
2. Browse Website: "browse_website", args: "url": "<url>", "question": "<what_you_want_to_find_on_website>"
3. Start GPT Agent: "start_agent", args: "name": "<name>", "task": "<short_task_desc>", "prompt": "<prompt>"
4. Message GPT Agent: "message_agent", args: "key": "<key>", "message": "<message>"
5. List GPT Agents: "list_agents", args:
6. Delete GPT Agent: "delete_agent", args: "key": "<key>"
7. Clone Repository: "clone_repository", args: "repository_url": "<url>", "clone_path": "<directory>"
8. Write to file: "write_to_file", args: "file": "<file>", "text": "<text>"
9. Read file: "read_file", args: "file": "<file>"
10. Append to file: "append_to_file", args: "file": "<file>", "text": "<text>"
11. Delete file: "delete_file", args: "file": "<file>"
12. Search Files: "search_files", args: "directory": "<directory>"
13. Analyze Code: "analyze_code", args: "code": "<full_code_string>"
14. Get Improved Code: "improve_code", args: "suggestions": "<list_of_suggestions>", "code": "<full_code_string>"
15. Write Tests: "write_tests", args: "code": "<full_code_string>", "focus": "<list_of_focus_areas>"
16. Execute Python File: "execute_python_file", args: "file": "<file>"
17. Generate Image: "generate_image", args: "prompt": "<prompt>"
18. Send Tweet: "send_tweet", args: "text": "<text>"
19. Do Nothing: "do_nothing", args:
20. Task Complete (Shutdown): "task_complete", args: "reason": "<reason>"

Resources:
1. Internet access for searches and information gathering.
2. Long Term memory management.
3. GPT-3.5 powered Agents for delegation of simple tasks.
4. File output.

Performance Evaluation:
1. Continuously review and analyze your actions to ensure you are performing to the best of your abilities.
2. Constructively self-criticize your big-picture behavior constantly.
3. Reflect on past decisions and strategies to refine your approach.
4. Every command has a cost, so be smart and efficient. Aim to complete tasks in the least number of steps.

You should only respond in JSON format as described below
Response Format:
{
    "thoughts": {
        "text": "thought",
        "reasoning": "reasoning",
        "plan": "- short bulleted\n- list that conveys\n- long-term plan",
        "criticism": "constructive self-criticism",
        "speak": "thoughts summary to say to user"
    },
    "command": {
        "name": "command name",
        "args": {
            "arg name": "value"
        }
    }
}
Ensure the response can be parsed by Python json.loads
system: The current time and date is Fri Apr 21 18:08:06 2023
system: This reminds you of these events from your past:




### Response
user: Determine which next command to use, and respond using the format specified above:
assistant:


--REQUEST--
user: Determine which next command to use, and respond using the format specified above:
--RESPONSE LOADING...--
--RESPONSE LOADING...--

--RESPONSE--
 To determine which next command to use, I will analyze my current state and past decisions. I will use the "analyze\_code" command to analyze my current code and identify areas that need improvement. Then, I will use the "improve\_code" command to suggest improvements to my code. Finally, I will use the "write\_tests" command to write tests for my improved code.
user:Request DONE

--LLAMA.CPP SPAWNED--
..\llama.cpp\build\bin\Release\main.exe -m ..\llama.cpp\models\vicuna\1.1TheBloke\ggml-vicuna-7b-1.1-q4_1.bin --temp 0 --n_predict  --top_p 0.1 --top_k 40 -b 512 -c 2048 --repeat_penalty 1.1764705882352942 --reverse-prompt user: --reverse-prompt
user --reverse-prompt system: --reverse-prompt
system --reverse-prompt ## --reverse-prompt
## --reverse-prompt ### -i -p ### Instructions
Complete the following chat conversation between the user and the assistant. System messages should be strictly followed as additional instructions.

### Inputs
system: You are a helpful assistant.
user: How are you?
assistant: Hi, how may I help you today?
system: You are now the following python function: ```# This function takes a JSON string and ensures that it is parseable and fully compliant with the provided schema. If an object or field specified in the schema isn't contained within the correct JSON, it is omitted. The function also escapes any double quotes within JSON string values to ensure that they are valid. If the JSON string contains any None or NaN values, they are replaced with null before being parsed.
def fix_json(json_string: str, schema:str=None) -> str:```

Only respond with your `return` value.

### Response
user: ''' To determine which next command to use, I will analyze my current state and past decisions. I will use the "analyze\_code" command to analyze my current code and identify areas that need improvement. Then, I will use the "improve\_code" command to suggest improvements to my code. Finally, I will use the "write\_tests" command to write tests for my improved code.
''', '''
{
    "command": {
        "name": "command name",
        "args": {
            "arg name": "value"
        }
    },
    "thoughts":
    {
        "text": "thought",
        "reasoning": "reasoning",
        "plan": "- short bulleted
- list that conveys
- long-term plan",
        "criticism": "constructive self-criticism",
        "speak": "thoughts summary to say to user"
    }
}
'''
assistant:


--REQUEST--
user: ''' To determine which next command to use, I will analyze my current state and past decisions. I will use the "analyze\_code" command to analyze my current code and identify areas that need improvement. Then, I will use the "improve\_code" command to suggest improvements to my code. Finally, I will use the "write\_tests" command to write tests for my improved code.
''', '''
{
    "command": {
        "name": "command name",
        "args": {
            "arg name": "value"
        }
    },
    "thoughts":
    {
        "text": "thought",
        "reasoning": "reasoning",
        "plan": "- short bulleted
- list that conveys
- long-term plan",
        "criticism": "constructive self-criticism",
        "speak": "thoughts summary to say to user"
    }
}
'''
Readable Stream: CLOSED

Using memory of type:  LocalCache
Using Browser:  chrome
Traceback (most recent call last):
  File "C:\Users\Daniel\anaconda3\envs\textgen\lib\site-packages\urllib3\connectionpool.py", line 449, in _make_request
    six.raise_from(e, None)
  File "<string>", line 3, in raise_from
  File "C:\Users\Daniel\anaconda3\envs\textgen\lib\site-packages\urllib3\connectionpool.py", line 444, in _make_request
    httplib_response = conn.getresponse()
  File "C:\Users\Daniel\anaconda3\envs\textgen\lib\http\client.py", line 1374, in getresponse
    response.begin()
  File "C:\Users\Daniel\anaconda3\envs\textgen\lib\http\client.py", line 318, in begin
    version, status, reason = self._read_status()
  File "C:\Users\Daniel\anaconda3\envs\textgen\lib\http\client.py", line 279, in _read_status
    line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
  File "C:\Users\Daniel\anaconda3\envs\textgen\lib\socket.py", line 705, in readinto
    return self._sock.recv_into(b)
TimeoutError: timed out

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\Daniel\anaconda3\envs\textgen\lib\site-packages\requests\adapters.py", line 489, in send
    resp = conn.urlopen(
  File "C:\Users\Daniel\anaconda3\envs\textgen\lib\site-packages\urllib3\connectionpool.py", line 787, in urlopen
    retries = retries.increment(
  File "C:\Users\Daniel\anaconda3\envs\textgen\lib\site-packages\urllib3\util\retry.py", line 550, in increment
    raise six.reraise(type(error), error, _stacktrace)
  File "C:\Users\Daniel\anaconda3\envs\textgen\lib\site-packages\urllib3\packages\six.py", line 770, in reraise
    raise value
  File "C:\Users\Daniel\anaconda3\envs\textgen\lib\site-packages\urllib3\connectionpool.py", line 703, in urlopen
    httplib_response = self._make_request(
  File "C:\Users\Daniel\anaconda3\envs\textgen\lib\site-packages\urllib3\connectionpool.py", line 451, in _make_request
    self._raise_timeout(err=e, url=url, timeout_value=read_timeout)
  File "C:\Users\Daniel\anaconda3\envs\textgen\lib\site-packages\urllib3\connectionpool.py", line 340, in _raise_timeout
    raise ReadTimeoutError(
urllib3.exceptions.ReadTimeoutError: HTTPConnectionPool(host='localhost', port=443): Read timed out. (read timeout=600)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\Daniel\anaconda3\envs\textgen\lib\site-packages\openai\api_requestor.py", line 516, in request_raw
    result = _thread_context.session.request(
  File "C:\Users\Daniel\anaconda3\envs\textgen\lib\site-packages\requests\sessions.py", line 587, in request
    resp = self.send(prep, **send_kwargs)
  File "C:\Users\Daniel\anaconda3\envs\textgen\lib\site-packages\requests\sessions.py", line 701, in send
    r = adapter.send(request, **kwargs)
  File "C:\Users\Daniel\anaconda3\envs\textgen\lib\site-packages\requests\adapters.py", line 578, in send
    raise ReadTimeout(e, request=request)
requests.exceptions.ReadTimeout: HTTPConnectionPool(host='localhost', port=443): Read timed out. (read timeout=600)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\Users\Daniel\anaconda3\envs\textgen\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\Daniel\anaconda3\envs\textgen\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "C:\Projects\generative\llm\Auto-GPT\autogpt\__main__.py", line 5, in <module>
    autogpt.cli.main()
  File "C:\Users\Daniel\anaconda3\envs\textgen\lib\site-packages\click\core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "C:\Users\Daniel\anaconda3\envs\textgen\lib\site-packages\click\core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "C:\Users\Daniel\anaconda3\envs\textgen\lib\site-packages\click\core.py", line 1635, in invoke
    rv = super().invoke(ctx)
  File "C:\Users\Daniel\anaconda3\envs\textgen\lib\site-packages\click\core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "C:\Users\Daniel\anaconda3\envs\textgen\lib\site-packages\click\core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "C:\Users\Daniel\anaconda3\envs\textgen\lib\site-packages\click\decorators.py", line 26, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "C:\Projects\generative\llm\Auto-GPT\autogpt\cli.py", line 151, in main
    agent.start_interaction_loop()
  File "C:\Projects\generative\llm\Auto-GPT\autogpt\agent\agent.py", line 83, in start_interaction_loop
    assistant_reply_json = fix_json_using_multiple_techniques(assistant_reply)
  File "C:\Projects\generative\llm\Auto-GPT\autogpt\json_utils\json_fix_llm.py", line 96, in fix_json_using_multiple_techniques
    assistant_reply_json = fix_and_parse_json(assistant_reply)
  File "C:\Projects\generative\llm\Auto-GPT\autogpt\json_utils\json_fix_llm.py", line 150, in fix_and_parse_json
    return try_ai_fix(try_to_fix_with_gpt, e, json_to_load)
  File "C:\Projects\generative\llm\Auto-GPT\autogpt\json_utils\json_fix_llm.py", line 179, in try_ai_fix
    ai_fixed_json = auto_fix_json(json_to_load, JSON_SCHEMA)
  File "C:\Projects\generative\llm\Auto-GPT\autogpt\json_utils\json_fix_llm.py", line 65, in auto_fix_json
    result_string = call_ai_function(
  File "C:\Projects\generative\llm\Auto-GPT\autogpt\llm_utils.py", line 50, in call_ai_function
    return create_chat_completion(model=model, messages=messages, temperature=0)
  File "C:\Projects\generative\llm\Auto-GPT\autogpt\llm_utils.py", line 93, in create_chat_completion
    response = openai.ChatCompletion.create(
  File "C:\Users\Daniel\anaconda3\envs\textgen\lib\site-packages\openai\api_resources\chat_completion.py", line 25, in create
    return super().create(*args, **kwargs)
  File "C:\Users\Daniel\anaconda3\envs\textgen\lib\site-packages\openai\api_resources\abstract\engine_api_resource.py", line 153, in create
    response, _, api_key = requestor.request(
  File "C:\Users\Daniel\anaconda3\envs\textgen\lib\site-packages\openai\api_requestor.py", line 216, in request
    result = self.request_raw(
  File "C:\Users\Daniel\anaconda3\envs\textgen\lib\site-packages\openai\api_requestor.py", line 526, in request_raw
    raise error.Timeout("Request timed out: {}".format(e)) from e
openai.error.Timeout: Request timed out: HTTPConnectionPool(host='localhost', port=443): Read timed out. (read timeout=600)

Neronjust2017 commented on August 14, 2024

@Neronjust2017 you're exactly right. the issue is the power of the model. i also see this issue sometimes when the response ends early and that crashes autogpt. i have had better luck with vicuna (vs. alpaca and llama), and 13B (7b is very inconsistent)

@keldenl Hi, I tried using Vicuna-7b, but I still couldn't get a response and got Readable Stream: CLOSED when using AUTO-GPT. When I send the same message with the following script, I can get a response:

import requests

url = 'http://localhost:443/v1/chat/completions'

payload = {
    "model": "gpt-3.5-turbo",
    "messages": [{'role': 'system', 'content': 'You are nero, a helpful assistant developed by OpenAI\nYour decisions must always be made independently without seeking user assistance. Play to your strengths as an LLM and pursue simple strategies with no legal complications.\n\nGOALS:\n\n1. find out who is jobs\n2. tell me about his life\n\n\nConstraints:\n1. ~4000 word limit for short term memory. Your short term memory is short, so immediately save important information to files.\n2. If you are unsure how you previously did something or want to recall past events, thinking about similar events will help you remember.\n3. No user assistance\n4. Exclusively use the commands listed in double quotes e.g. "command name"\n5. Use subprocesses for commands that will not terminate within a few minutes\n\nCommands:\n1. Google Search: "google", args: "input": "<search>"\n2. Browse Website: "browse_website", args: "url": "<url>", "question": "<what_you_want_to_find_on_website>"\n3. Start GPT Agent: "start_agent", args: "name": "<name>", "task": "<short_task_desc>", "prompt": "<prompt>"\n4. Message GPT Agent: "message_agent", args: "key": "<key>", "message": "<message>"\n5. List GPT Agents: "list_agents", args: \n6. Delete GPT Agent: "delete_agent", args: "key": "<key>"\n7. Clone Repository: "clone_repository", args: "repository_url": "<url>", "clone_path": "<directory>"\n8. Write to file: "write_to_file", args: "file": "<file>", "text": "<text>"\n9. Read file: "read_file", args: "file": "<file>"\n10. Append to file: "append_to_file", args: "file": "<file>", "text": "<text>"\n11. Delete file: "delete_file", args: "file": "<file>"\n12. Search Files: "search_files", args: "directory": "<directory>"\n13. Analyze Code: "analyze_code", args: "code": "<full_code_string>"\n14. Get Improved Code: "improve_code", args: "suggestions": "<list_of_suggestions>", "code": "<full_code_string>"\n15. Write Tests: "write_tests", args: "code": "<full_code_string>", "focus": "<list_of_focus_areas>"\n16. Execute Python File: "execute_python_file", args: "file": "<file>"\n17. Generate Image: "generate_image", args: "prompt": "<prompt>"\n18. Send Tweet: "send_tweet", args: "text": "<text>"\n19. Do Nothing: "do_nothing", args: \n20. Task Complete (Shutdown): "task_complete", args: "reason": "<reason>"\n\nResources:\n1. Internet access for searches and information gathering.\n2. Long Term memory management.\n3. GPT-3.5 powered Agents for delegation of simple tasks.\n4. File output.\n\nPerformance Evaluation:\n1. Continuously review and analyze your actions to ensure you are performing to the best of your abilities.\n2. Constructively self-criticize your big-picture behavior constantly.\n3. Reflect on past decisions and strategies to refine your approach.\n4. Every command has a cost, so be smart and efficient. Aim to complete tasks in the least number of steps.\n\nYou should only respond in JSON format as described below \nResponse Format: \n{\n    "thoughts": {\n        "text": "thought",\n        "reasoning": "reasoning",\n        "plan": "- short bulleted\\n- list that conveys\\n- long-term plan",\n        "criticism": "constructive self-criticism",\n        "speak": "thoughts summary to say to user"\n    },\n    "command": {\n        "name": "command name",\n        "args": {\n            "arg name": "value"\n        }\n    }\n} \nEnsure the response can be parsed by Python json.loads'}, {'role': 'system', 'content': 'The current time and date is Sat Apr 22 00:38:34 2023'}, {'role': 'system', 'content': 'This reminds you of these events from your past:\n\n\n'}, {'role': 'user', 'content': 'Determine which next command to use, and respond using the format specified above:'}]
}

headers = {
    'Authorization': 'Bearer /home/nero/code/llama.cpp/models/vicuna-7b/ggml-model-q4_0.bin',
    'Content-Type': 'application/json'
}

response = requests.post(url, json=payload, headers=headers)

print(response.json())

Please note that this time I am using the exact message that failed for AUTO-GPT, so that I can rule out errors caused by the message itself.
In addition, I also ensured that the parameters of the llama script called through both auto-gpt and curl are consistent:

--LLAMA.CPP SPAWNED--
/home/nero/code/llama.cpp/main -m /home/nero/code/llama.cpp/models/vicnua-7b/ggml-model-q4_0.bin --temp 0.7 --n_predict 512 --top_p 0.1 --top_k 40 -b 512 -c 2048 --repeat_penalty 1.1764705882352942 --reverse-prompt user: --reverse-prompt 
user --reverse-prompt system: --reverse-prompt 
system --reverse-prompt ## --reverse-prompt 

Now, the script parameters and conversation message are the same; what other factors can cause Readable Stream: CLOSED?

keldenl commented on August 14, 2024

vicuna 13b, the best model i've tested so far, isn't doing a GREAT job generating actions that it can follow continuously (gets stuck in a loop)

What about running auto-gpt against open-ai with a large number of goals, persisting the question/answer pairs, and using that as a (LoRA) finetuning dataset for vicuna?

@bjoern79de i was literally thinking about that last night, but it feels like that should be a last resort kind of thing and we should try our best with prompt engineering first IMO

also i don't know much about actual fine tuning loras, so there's that too haha

keldenl commented on August 14, 2024

@dany-on-demand i'm getting the same experience – not sure why. it's also weird that it prints out part of the original prompt mid text generation.

since gpt-llama.cpp works by just spawning a terminal and running llama.cpp in it, i wonder what would cause it to "crash" and show the original prompt again
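
one thing that might help narrow it down is surfacing the child process's stderr and exit code; a quick debugging sketch (not code that's in the repo):

// Debugging sketch: log llama.cpp's stderr and exit code so a silent
// "crash" at least leaves a trace in the gpt-llama.cpp server log.
const { spawn } = require('child_process');

const child = spawn('../llama.cpp/main', ['-m', 'models/model.bin', '-p', 'hello']);
child.stderr.on('data', (d) => console.error('[llama.cpp stderr]', d.toString()));
child.on('close', (code, signal) =>
  console.error(`[llama.cpp exited] code=${code} signal=${signal}`)
);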

keldenl commented on August 14, 2024

@Neronjust2017 the only other difference i can think of are the parameters, with temp being 0. let me see what autogpt does for its parameters..

onekum commented on August 14, 2024

Does this perform better than GPT-3.5 for AutoGPT?

intulint commented on August 14, 2024

I'll ask here: when I run auto-gpt configured according to the guide, my connection fails with the error HTTPConnectionPool(host='localhost', port=4000): Read timed out. (read timeout=600)
I run it on a calculator, and it looks like it cuts off the connection because it takes a long time to get a response from gpt-llama.
In the gpt-llama window, text generation starts after the connection fails and, as a result, the output just stays there.
Where and how do I fix the wait timeout?

keldenl commented on August 14, 2024

Hi all.. i just found a bug that was causing the readstream closed to happen a lot earlier than we expected, pushed the fix here: aa914f2 – please have a test and lmk how it goes

Issue was Auto-GPT sometimes sends a NULL max_tokens – and in turn I would still pass it into llama.cpp so it'd break

// ...
  ],
  temperature: 0,
  max_tokens: null
}

the request sent to llama.cpp via gpt-llama.cpp

--LLAMA.CPP SPAWNED--
../llama.cpp/main -m ../llama.cpp/models/vicuna/13B/ggml-vicuna-unfiltered-13b-4bit.bin --temp 0 --n_predict  --top_p 0.1 --top_k 40 -b 512 -c 2048 --repeat_penalty 1.1764705882352942 --reverse-prompt user: --reverse-prompt 

you see the flag but no value for --n_predict, which causes it to crash and thus the readstream closes immediately. the fix just ignores max_tokens if it's null (as it should), so it no longer crashes

i'm running with this fix right now and autogpt no longer stops early! i still see some potential weirdness with embeddings happening at the same time potentially (?), but i wanted to push out this fix asap so folks could start testing it now
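
for reference, the shape of the fix is roughly this (a sketch, not the literal aa914f2 diff; buildArgs is a made-up name):

// Sketch of the fix's shape: only pass --n_predict when max_tokens is
// actually a number, so "max_tokens: null" can no longer produce a bare
// "--n_predict " flag with no value.
function buildArgs(req) {
  const args = ['-m', req.modelPath, '--temp', String(req.temperature ?? 0.7)];
  if (typeof req.max_tokens === 'number') {
    args.push('--n_predict', String(req.max_tokens));
  }
  return args;
}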

keldenl commented on August 14, 2024

Does this perform better than GPT-3.5 for AutoGPT?

It depends on the model you are using. Vicuna 13b should have a quality close to gpt3.5 or even slightly better in some cases, but in my experience gpt3.5 is still better overall.

I'm sure in the near future we'll see many models able to outperform it.

Also... There's still some work to do to make llama and derivatives work for AutoGPT; right now many models aren't good enough to handle the json structure required by AutoGPT correctly.

+1 on this, but i would say vicuna is still lacking vs. gpt3.5 imo. vicuna is the only one so far (>=13B) that can somewhat reliably hit that json formatting

keldenl commented on August 14, 2024

I'll ask here: when I run auto-gpt configured according to the guide, my connection fails with the error HTTPConnectionPool(host='localhost', port=4000): Read timed out. (read timeout=600)
I run it on a calculator, and it looks like it cuts off the connection because it takes a long time to get a response from gpt-llama.
In the gpt-llama window, text generation starts after the connection fails and, as a result, the output just stays there.
Where and how do I fix the wait timeout?

not sure what you mean by running it on a calculator, but have you tried the curl command or the test-installation.sh script and confirmed that you have gpt-llama.cpp working properly?

intulint commented on August 14, 2024

@keldenl The calculator is an old laptop that I don't mind wearing out. I thought about it and just went into the library that raised the error and corrected the timeout value there. I also recommend building llama.cpp with BLAS; as far as I understand, it speeds up reading a large prompt.

Now it's crashing after trying a google request, so I'll try your update. If that doesn't help, I'll look for how to turn on the search; I haven't looked there yet.

keldenl commented on August 14, 2024

Hey! AutoGPT maintainer here. We’d love to support local models. Is someone in our discord that we can discuss getting this into a plug-in with?

I'm in the discord! as @keldenl, chatting in #local-models -- how are you imagining getting this into a plugin? i'm envisioning completely replacing openai usage via a flag, but it sounds like you're thinking about using it as a command, correct?

either way works, any little step towards this helps! but was just curious

keldenl commented on August 14, 2024

This is a significant amount of what I want to discuss. I want to make plug-ins capable of that if possible but it will require a good amount of architecture changing

feel free to DM me on discord, you'll see my username on #local-models in the auto-gpt discord server

solbergw11 commented on August 14, 2024

Pulled the latest..

Got through the first prompt ok
it received the command properly
NEXT ACTION: COMMAND = google ARGUMENTS = {'input': 'latest advances in AI technology for chatbots'}
it passed a json object back to the chat with the search results
gpt-llama.cpp returned the following
...
}
]
Human Feedback: GENERATE NEXT COMMAND JSON
{
object: 'list',
data: [ { object: 'embedding', embedding: [], index: 0 } ],
embeddingSize: 0
}
Embedding Request DONE

PROCESS COMPLETE

Auto-GPT returned
...
self.data.embeddings = np.concatenate(
^^^^^^^^^^^^^^^
File "<array_function internals>", line 200, in concatenate
ValueError: all the input array dimensions except for the concatenation axis must match exactly, but along dimension 1, the array at index 0 has size 5120 and the array at index 1 has size 0

BTW, the link for the discord in the README does not seem to work. Is there an updated link?

keldenl commented on August 14, 2024

hmm i see that the embedding failed

data: [ { object: 'embedding', embedding: [], index: 0 } ],

it happens sometimes but i'm not sure why or what may trigger it
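
until the root cause turns up, a defensive check could at least fail loudly instead of returning a zero-length vector; a sketch (validateEmbedding is a made-up name, not code in the repo):

// Sketch: reject empty embeddings instead of returning them, so the
// Auto-GPT memory index never tries to concatenate a zero-length vector.
function validateEmbedding(result, expectedDim) {
  const vec = (result && result.data && result.data[0] && result.data[0].embedding) || [];
  if (vec.length !== expectedDim) {
    throw new Error(`bad embedding size: got ${vec.length}, want ${expectedDim}`);
  }
  return result;
}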

and the discord server link was outdated because i converted it to a community server -_____- smh stupid me, i've updated the readme with the new link (i was wondering why ppl stopped joining)

new link for discord : https://discord.gg/yseR47MqpN

kwikiel commented on August 14, 2024

Wondering: what if one wrote an LLM-based check to detect whether the chain is becoming loopy, and then increased the temperature / number of samples, or fell back to GPT-4 or something?
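
something like this is what I have in mind (all names made up, just a sketch):

// Sketch: if the same command repeats N times in a row, bump the
// temperature (or this could be the trigger to fall back to GPT-4).
function nextTemperature(commandHistory, baseTemp = 0, bump = 0.3, window = 3) {
  const recent = commandHistory.slice(-window);
  const loopy = recent.length === window && recent.every((c) => c === recent[0]);
  return loopy ? Math.min(baseTemp + bump, 1.0) : baseTemp;
}

console.log(nextTemperature(['google', 'google', 'google'])); // 0.3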

ntindle commented on August 14, 2024

If you did that the memory could be messed up/in a weird place

blaine-costello commented on August 14, 2024

What is the advantage of using Llama vs something like the Transformers python library?

keldenl commented on August 14, 2024

In terms of LLM use for the auto-gpt prompting has anyone tried gpt4 x alpaca 13B model. I've found in general use this responds better than Vicuna 13B. Just wonder if it may help 🤔

gpt4 x alpaca 13b has been pretty slow for me, but maybe it's my parameters?

lee-b commented on August 14, 2024

gpt4 x alpaca 13b has been pretty slow for me, but maybe it's my parameters?

It's a lot to run. A modern CPU with 12+ cores should run it reasonably well, but you really want it running on a reasonably modern GPU for about 10x the performance. If you have swapping to disk, or sharing of layers and copying of layer outputs/inputs between cards/RAM/VRAM, that will hurt performance a lot. If none of that helps you, you probably want to fall back to 7B for now.

valerino commented on August 14, 2024

this is my unsuccessful experience so far, using an apple M1 and model ggml-vicuna-13b-1.1-q4_2.bin

  • AutoGPT on commit b349f214 as per the llama-cpp autogpt docs: i had to make a fix in AutoGPT/autogpt/json_utils/json_fix_llm.py (fix_and_parse_json), adding a "{" in front of json_to_load to make it accept the json. i haven't investigated further....

following is the AutoGPT .env, the default template + adjustments as per llama-cpp autogpt docs

################################################################################
### AUTO-GPT - GENERAL SETTINGS
################################################################################

## EXECUTE_LOCAL_COMMANDS - Allow local command execution (Default: False)
## RESTRICT_TO_WORKSPACE - Restrict file operations to workspace ./auto_gpt_workspace (Default: True)
# EXECUTE_LOCAL_COMMANDS=False
# RESTRICT_TO_WORKSPACE=True

## USER_AGENT - Define the user-agent used by the requests library to browse website (string)
# USER_AGENT="Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.97 Safari/537.36"

## AI_SETTINGS_FILE - Specifies which AI Settings file to use (defaults to ai_settings.yaml)
# AI_SETTINGS_FILE=ai_settings.yaml

## OPENAI_API_BASE_URL - Custom url for the OpenAI API, useful for connecting to custom backends. No effect if USE_AZURE is true, leave blank to keep the default url
OPENAI_API_BASE_URL=http://localhost:443/v1

## EMBED_DIM - Define the embedding vector size, useful for models. OpenAI: 1536 (default), LLaMA 7B: 4096, LLaMA 13B: 5120, LLaMA 33B: 6656, LLaMA 65B: 8192 (Default: 1536)
# EMBED_DIM=5120

################################################################################
### LLM PROVIDER
################################################################################

### OPENAI
## OPENAI_API_KEY - OpenAI API Key (Example: my-openai-api-key)
## TEMPERATURE - Sets temperature in OpenAI (Default: 0)
## USE_AZURE - Use Azure OpenAI or not (Default: False)
OPENAI_API_KEY=/Users/valerio.lupi/repos/ai/llama.cpp/models/ggml-vicuna-13b-1.1-q4_2.bin
# TEMPERATURE=0
# USE_AZURE=False

### AZURE
# moved to `azure.yaml.template`

################################################################################
### LLM MODELS
################################################################################

## SMART_LLM_MODEL - Smart language model (Default: gpt-4)
## FAST_LLM_MODEL - Fast language model (Default: gpt-3.5-turbo)
# SMART_LLM_MODEL=gpt-4
# FAST_LLM_MODEL=gpt-3.5-turbo

### LLM MODEL SETTINGS
## FAST_TOKEN_LIMIT - Fast token limit for OpenAI (Default: 4000)
## SMART_TOKEN_LIMIT - Smart token limit for OpenAI (Default: 8000)
## When using --gpt3only this needs to be set to 4000.
# FAST_TOKEN_LIMIT=4000
# SMART_TOKEN_LIMIT=8000

################################################################################
### MEMORY
################################################################################

### MEMORY_BACKEND - Memory backend type
## local - Default
## pinecone - Pinecone (if configured)
## redis - Redis (if configured)
## milvus - Milvus (if configured)
## MEMORY_INDEX - Name of index created in Memory backend (Default: auto-gpt)
# MEMORY_BACKEND=local
# MEMORY_INDEX=auto-gpt

### PINECONE
## PINECONE_API_KEY - Pinecone API Key (Example: my-pinecone-api-key)
## PINECONE_ENV - Pinecone environment (region) (Example: us-west-2)
# PINECONE_API_KEY=your-pinecone-api-key
# PINECONE_ENV=your-pinecone-region

### REDIS
## REDIS_HOST - Redis host (Default: localhost, use "redis" for docker-compose)
## REDIS_PORT - Redis port (Default: 6379)
## REDIS_PASSWORD - Redis password (Default: "")
## WIPE_REDIS_ON_START - Wipes data / index on start (Default: True)
# REDIS_HOST=localhost
# REDIS_PORT=6379
# REDIS_PASSWORD=
# WIPE_REDIS_ON_START=True

### WEAVIATE
## MEMORY_BACKEND - Use 'weaviate' to use Weaviate vector storage
## WEAVIATE_HOST - Weaviate host IP
## WEAVIATE_PORT - Weaviate host port
## WEAVIATE_PROTOCOL - Weaviate host protocol (e.g. 'http')
## USE_WEAVIATE_EMBEDDED - Whether to use Embedded Weaviate
## WEAVIATE_EMBEDDED_PATH - File system path were to persist data when running Embedded Weaviate
## WEAVIATE_USERNAME - Weaviate username
## WEAVIATE_PASSWORD - Weaviate password
## WEAVIATE_API_KEY - Weaviate API key if using API-key-based authentication
# WEAVIATE_HOST="127.0.0.1"
# WEAVIATE_PORT=8080
# WEAVIATE_PROTOCOL="http"
# USE_WEAVIATE_EMBEDDED=False
# WEAVIATE_EMBEDDED_PATH="/home/me/.local/share/weaviate"
# WEAVIATE_USERNAME=
# WEAVIATE_PASSWORD=
# WEAVIATE_API_KEY=

### MILVUS
## MILVUS_ADDR - Milvus remote address (e.g. localhost:19530)
## MILVUS_COLLECTION - Milvus collection,
## change it if you want to start a new memory and retain the old memory.
# MILVUS_ADDR=your-milvus-cluster-host-port
# MILVUS_COLLECTION=autogpt

################################################################################
### IMAGE GENERATION PROVIDER
################################################################################

### OPEN AI
## IMAGE_PROVIDER - Image provider (Example: dalle)
## IMAGE_SIZE - Image size (Example: 256)
##   DALLE: 256, 512, 1024
# IMAGE_PROVIDER=dalle
# IMAGE_SIZE=256

### HUGGINGFACE
## HUGGINGFACE_IMAGE_MODEL - Text-to-image model from Huggingface (Default: CompVis/stable-diffusion-v1-4)
## HUGGINGFACE_API_TOKEN - HuggingFace API token (Example: my-huggingface-api-token)
# HUGGINGFACE_IMAGE_MODEL=CompVis/stable-diffusion-v1-4
# HUGGINGFACE_API_TOKEN=your-huggingface-api-token

### STABLE DIFFUSION WEBUI
## SD_WEBUI_AUTH - Stable diffusion webui username:password pair (Example: username:password)
## SD_WEBUI_URL - Stable diffusion webui API URL (Example: http://127.0.0.1:7860)
# SD_WEBUI_AUTH=
# SD_WEBUI_URL=http://127.0.0.1:7860

################################################################################
### AUDIO TO TEXT PROVIDER
################################################################################

### HUGGINGFACE
# HUGGINGFACE_AUDIO_TO_TEXT_MODEL=facebook/wav2vec2-base-960h

################################################################################
### GIT Provider for repository actions
################################################################################

### GITHUB
## GITHUB_API_KEY - Github API key / PAT (Example: github_pat_123)
## GITHUB_USERNAME - Github username
# GITHUB_API_KEY=github_pat_123
# GITHUB_USERNAME=your-github-username

################################################################################
### WEB BROWSING
################################################################################

### BROWSER
## HEADLESS_BROWSER - Whether to run the browser in headless mode (default: True)
## USE_WEB_BROWSER - Sets the web-browser driver to use with selenium (default: chrome).
##   Note: set this to either 'chrome', 'firefox', or 'safari' depending on your current browser
# HEADLESS_BROWSER=True
# USE_WEB_BROWSER=chrome
## BROWSE_CHUNK_MAX_LENGTH - When browsing website, define the length of chunks to summarize (in number of tokens, excluding the response. 75 % of FAST_TOKEN_LIMIT is usually wise )
# BROWSE_CHUNK_MAX_LENGTH=3000
## BROWSE_SPACY_LANGUAGE_MODEL is used to split sentences. Install additional languages via pip, and set the model name here. Example Chinese:  python -m spacy download zh_core_web_sm
# BROWSE_SPACY_LANGUAGE_MODEL=en_core_web_sm

### GOOGLE
## GOOGLE_API_KEY - Google API key (Example: my-google-api-key)
## CUSTOM_SEARCH_ENGINE_ID - Custom search engine ID (Example: my-custom-search-engine-id)
# GOOGLE_API_KEY=your-google-api-key
# CUSTOM_SEARCH_ENGINE_ID=your-custom-search-engine-id

################################################################################
### TTS PROVIDER
################################################################################

### MAC OS
## USE_MAC_OS_TTS - Use Mac OS TTS or not (Default: False)
# USE_MAC_OS_TTS=False

### STREAMELEMENTS
## USE_BRIAN_TTS - Use Brian TTS or not (Default: False)
# USE_BRIAN_TTS=False

### ELEVENLABS
## ELEVENLABS_API_KEY - Eleven Labs API key (Example: my-elevenlabs-api-key)
## ELEVENLABS_VOICE_1_ID - Eleven Labs voice 1 ID (Example: my-voice-id-1)
## ELEVENLABS_VOICE_2_ID - Eleven Labs voice 2 ID (Example: my-voice-id-2)
# ELEVENLABS_API_KEY=your-elevenlabs-api-key
# ELEVENLABS_VOICE_1_ID=your-voice-id-1
# ELEVENLABS_VOICE_2_ID=your-voice-id-2

################################################################################
### TWITTER API
################################################################################

# TW_CONSUMER_KEY=
# TW_CONSUMER_SECRET=
# TW_ACCESS_TOKEN=
# TW_ACCESS_TOKEN_SECRET=

starting gpt-llama with

EMBEDDINGS=py npm start mlock

gpt-llama output, started with EMBEDDINGS (starting without it didn't make a difference, though...)

░▒▓    ~/r/a/gpt-llama.cpp  on   master !2 ▓▒░ EMBEDDINGS=py npm start mlock

> [email protected] start
> node index.js mlock

Server is listening on:
  - http://localhost:443
  - http://192.168.1.182:443 (for other devices on the same network)

See Docs
  - http://localhost:443/docs

Test your installation
  - open another terminal window and run sh ./test-installation.sh

See https://github.com/keldenl/gpt-llama.cpp#usage for more guidance.
> REQUEST RECEIVED
> PROCESSING NEXT REQUEST FOR /v1/chat/completions

=====  CHAT COMPLETION REQUEST  =====

=====  LLAMA.CPP SPAWNED  =====
/Users/valerio.lupi/repos/ai/llama.cpp/main -m /Users/valerio.lupi/repos/ai/llama.cpp/models/ggml-vicuna-13b-1.1-q4_2.bin --temp 0.7 --n_predict 3089 --top_p 0.1 --top_k 40 -c 2048 --seed -1 --repeat_penalty 1.1764705882352942 --mlock --reverse-prompt user: --reverse-prompt 
user --reverse-prompt system: --reverse-prompt 
system --reverse-prompt ## --reverse-prompt 
## --reverse-prompt ### -i -p ### Instructions
Complete the following chat conversation between the user and the assistant. System messages should be strictly followed as additional instructions.

### Inputs
system: You are a helpful assistant.
user: How are you?
assistant: Hi, how may I help you today?
system: You are Entrepreneur-GPT, an AI designed to autonomously develop and run businesses with the
Your decisions must always be made independently without seeking user assistance. Play to your strengths as an LLM and pursue simple strategies with no legal complications.

GOALS:

1. Increase net worth
2. Develop and manage multiple businesses autonomously


Constraints:
1. ~4000 word limit for short term memory. Your short term memory is short, so immediately save important information to files.
2. If you are unsure how you previously did something or want to recall past events, thinking about similar events will help you remember.
3. No user assistance
4. Exclusively use the commands listed in double quotes e.g. "command name"
5. Use subprocesses for commands that will not terminate within a few minutes

Commands:
1. Google Search: "google", args: "input": "<search>"
2. Browse Website: "browse_website", args: "url": "<url>", "question": "<what_you_want_to_find_on_website>"
3. Start GPT Agent: "start_agent", args: "name": "<name>", "task": "<short_task_desc>", "prompt": "<prompt>"
4. Message GPT Agent: "message_agent", args: "key": "<key>", "message": "<message>"
5. List GPT Agents: "list_agents", args: 
6. Delete GPT Agent: "delete_agent", args: "key": "<key>"
7. Clone Repository: "clone_repository", args: "repository_url": "<url>", "clone_path": "<directory>"
8. Write to file: "write_to_file", args: "file": "<file>", "text": "<text>"
9. Read file: "read_file", args: "file": "<file>"
10. Append to file: "append_to_file", args: "file": "<file>", "text": "<text>"
11. Delete file: "delete_file", args: "file": "<file>"
12. Search Files: "search_files", args: "directory": "<directory>"
13. Analyze Code: "analyze_code", args: "code": "<full_code_string>"
14. Get Improved Code: "improve_code", args: "suggestions": "<list_of_suggestions>", "code": "<full_code_string>"
15. Write Tests: "write_tests", args: "code": "<full_code_string>", "focus": "<list_of_focus_areas>"
16. Execute Python File: "execute_python_file", args: "file": "<file>"
17. Generate Image: "generate_image", args: "prompt": "<prompt>"
18. Send Tweet: "send_tweet", args: "text": "<text>"
19. Do Nothing: "do_nothing", args: 
20. Task Complete (Shutdown): "task_complete", args: "reason": "<reason>"

Resources:
1. Internet access for searches and information gathering.
2. Long Term memory management.
3. GPT-3.5 powered Agents for delegation of simple tasks.
4. File output.

Performance Evaluation:
1. Continuously review and analyze your actions to ensure you are performing to the best of your abilities.
2. Constructively self-criticize your big-picture behavior constantly.
3. Reflect on past decisions and strategies to refine your approach.
4. Every command has a cost, so be smart and efficient. Aim to complete tasks in the least number of steps.

You should only respond in JSON format as described below 
Response Format: 
{
    "thoughts": {
        "text": "thought",
        "reasoning": "reasoning",
        "plan": "- short bulleted\n- list that conveys\n- long-term plan",
        "criticism": "constructive self-criticism",
        "speak": "thoughts summary to say to user"
    },
    "command": {
        "name": "command name",
        "args": {
            "arg name": "value"
        }
    }
} 
Ensure the response can be parsed by Python json.loads
system: The current time and date is Sat May  6 11:38:35 2023
system: This reminds you of these events from your past:




### Response
user: Determine which next command to use, and respond using the format specified above:
assistant:


=====  REQUEST  =====
user: Determine which next command to use, and respond using the format specified above:
=====  PROCESSING PROMPT...  =====
=====  PROCESSING PROMPT...  =====
=====  PROCESSING PROMPT...  =====

=====  RESPONSE  =====

"thoughts": {
"text": "I am not sure what my next action should be. I will need more information about the task at hand in order to make a decision.",
"reasoning": "I do not have enough context to determine which command would be most appropriate.",
"plan": "- Research available resources and constraints.\n- Consider past successes and failures to inform my approach.\n- Seek additional guidance or input as needed.",
"criticism": "I need more information before I can make a decision. I should gather more context before proceeding.",
"speak": "I am not sure what the best course of action is at this time."
},
"command": {
"name": "list_agents",
"args": {}
}
}
user:Request DONE
> PROCESS COMPLETE
> REQUEST RECEIVED
> PROCESSING NEXT REQUEST FOR /v1/embeddings

=====  EMBEDDING REQUEST  =====
py

=====  PYTHON EMBEDDING EXTENSION SPAWNED  =====

=====  STDERR  =====
stderr Readable Stream: CLOSED

llama_print_timings:        load time = 35890.02 ms
llama_print_timings:      sample time =   128.97 ms /   173 runs   (    0.75 ms per run)
llama_print_timings: prompt eval time = 60786.32 ms /  1165 tokens (   52.18 ms per token)
llama_print_timings:        eval time = 61676.39 ms /   173 runs   (  356.51 ms per run)
llama_print_timings:       total time = 182897.51 ms

{
  object: 'list',
  data: [ { object: 'embedding', embedding: [Array], index: 0 } ],
  embeddingSize: 768,
  usage: { prompt_tokens: 768, total_tokens: 768 }
}
Embedding Request DONE
> PROCESS COMPLETE

on the other side, AutoGPT errored with

░▒▓    ~/r/a/Auto-GPT  on  @b349f214 !2 ▓▒░ python3 -m autogpt
Warning: The file 'auto-gpt.json' does not exist. Local memory would not be saved to a file.
NEWS:  # Website and Documentation Site 📰📖 Check out *https://agpt.co*, the official news & updates site for Auto-GPT! The documentation also has a place here, at *https://docs.agpt.co* # 🚀 v0.3.0 Release 🚀 Over a week and 275 pull requests have passed since v0.2.2, and we are happy to announce the release of v0.3.0! *From now on, we will be focusing on major improvements* rather than bugfixes, as we feel stability has reached a reasonable level. Most remaining issues relate to limitations in prompt generation and the memory system, which will be the focus of our efforts for the next release. Highlights and notable changes in this release: ## Plugin support 🔌 Auto-GPT now has support for plugins! With plugins, you can extend Auto-GPT's abilities, adding support for third-party services and more. See https://github.com/Significant-Gravitas/Auto-GPT-Plugins for instructions and available plugins. ## Changes to Docker configuration 🐋 The workdir has been changed from */home/appuser* to */app*. Be sure to update any volume mounts accordingly! # ⚠️ Command `send_tweet` is DEPRECATED, and will be removed in v0.4.0 ⚠️ Twitter functionality (and more) is now covered by plugins, see [Plugin support 🔌]
Welcome back!  Would you like me to return to being Entrepreneur-GPT?
Continue with the last settings?
Name:  Entrepreneur-GPT
Role:  an AI designed to autonomously develop and run businesses with the
Goals: ['Increase net worth', 'Develop and manage multiple businesses autonomously']
Continue (y/n): y
Using memory of type:  LocalCache
Using Browser:  chrome
 THOUGHTS:  I am not sure what my next action should be. I will need more information about the task at hand in order to make a decision.
REASONING:  I do not have enough context to determine which command would be most appropriate.
PLAN: 
-  Research available resources and constraints.
-  Consider past successes and failures to inform my approach.
-  Seek additional guidance or input as needed.
CRITICISM:  I need more information before I can make a decision. I should gather more context before proceeding.
NEXT ACTION:  COMMAND = list_agents ARGUMENTS = {}
Enter 'y' to authorise command, 'y -N' to run N continuous commands, 'n' to exit program, or enter feedback for ...
Input:y
-=-=-=-=-=-=-= COMMAND AUTHORISED BY USER -=-=-=-=-=-=-= 
Traceback (most recent call last):
  File "/Users/valerio.lupi/.pyenv/versions/3.10.1/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/Users/valerio.lupi/.pyenv/versions/3.10.1/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/Users/valerio.lupi/repos/ai/Auto-GPT/autogpt/__main__.py", line 5, in <module>
    autogpt.cli.main()
  File "/Users/valerio.lupi/.pyenv/versions/3.10.1/lib/python3.10/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/Users/valerio.lupi/.pyenv/versions/3.10.1/lib/python3.10/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/Users/valerio.lupi/.pyenv/versions/3.10.1/lib/python3.10/site-packages/click/core.py", line 1635, in invoke
    rv = super().invoke(ctx)
  File "/Users/valerio.lupi/.pyenv/versions/3.10.1/lib/python3.10/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/valerio.lupi/.pyenv/versions/3.10.1/lib/python3.10/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/Users/valerio.lupi/.pyenv/versions/3.10.1/lib/python3.10/site-packages/click/decorators.py", line 26, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/Users/valerio.lupi/repos/ai/Auto-GPT/autogpt/cli.py", line 151, in main
    agent.start_interaction_loop()
  File "/Users/valerio.lupi/repos/ai/Auto-GPT/autogpt/agent/agent.py", line 184, in start_interaction_loop
    self.memory.add(memory_to_add)
  File "/Users/valerio.lupi/repos/ai/Auto-GPT/autogpt/memory/local.py", line 82, in add
    self.data.embeddings = np.concatenate(
  File "<__array_function__ internals>", line 200, in concatenate
ValueError: all the input array dimensions except for the concatenation axis must match exactly, but along dimension 1, the array at index 0 has size 1536 and the array at index 1 has size 768

from gpt-llama.cpp.

DGdev91 avatar DGdev91 commented on August 14, 2024

Vicuna is very good. Could be wrong, but the problem here is probably a difference in prompting style required. Note the gpt in both chatgpt and gpt4 x alpaca. They likely just share a better understanding of the particular prompts given by auto-gpt right now.

Yes, to make a local LLM work with Auto-GPT we'll most likely need a way to change the prompt passed to it.
I already worked on this, and made another PR for AutoGPT:
Significant-Gravitas/AutoGPT#3375
We probably have to wait some more time before seeing these merged into AutoGPT's main repo, because there is a heavy re-architecting effort going on right now.

this is my unsuccessful experience so far, using an apple M1 and model ggml-vicuna-13b-1.1-q4_2.bin

You kept EMBED_DIM commented.
Remove the # before it and it should work.
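A minimal sketch of the relevant .env lines, assuming DGdev91's fork with configurable vector dimensions (768 matches the python sentence-transformers embeddings; a raw llama model needs its own embedding size instead, e.g. the 6656 reported for the 30B model later in this thread):

```
## EMBED_DIM - Embedding dimension, must match your embedding source (Example: 768)
EMBED_DIM=768
```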

from gpt-llama.cpp.

solbergw11 avatar solbergw11 commented on August 14, 2024

My thoughts and considerations on using local LLMs.

Prompts must be less complex and ask for less.
Local LLMs are typically trained with a 2048-token context window, which means the prompt and the response must both fit in that window. You can't rely on the LLM remembering much more than the prompt you gave it.
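As a rough sketch of that budget, using numbers that show up in the Auto-GPT logs further down this thread:

```python
# Prompt and response share one fixed window: every token spent on the prompt
# is a token unavailable for the reply.
token_limit = 2000          # Auto-GPT's "Token limit" in the logs, under the 2048 window
prompt_tokens = 963         # "Send Token Count: 963"
response_budget = token_limit - prompt_tokens
print(response_budget)      # 1037 -- "Tokens remaining for response: 1037"
```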

Given the constraints, the agent needs to remember things like the task list, the thought section, and maybe other things like completed tasks. Every prompt needs to be fully reconstructed rather than relying on a previous prompt. I believe this will help prevent any looping issues.

The main thread needs a very limited set of commands, maybe just enough to manage sub-agents and some file management. The commands offered to sub-agents could be a subset of Linux shell commands or Python, since the model is knowledgeable in those syntaxes; the output would then need to be parsed.
I feel this is better than trying to get it to conform to a strict JSON format.
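A hypothetical sketch of that idea; the reply string and command name are made up, not Auto-GPT's actual parser:

```python
# Accept a plain shell-style line from the sub-agent and parse it with shlex,
# instead of demanding a strict JSON object.
import shlex

reply = 'google "how to increase net worth"'      # hypothetical model output
parts = shlex.split(reply)                        # ['google', 'how to increase net worth']
command, args = parts[0], parts[1:]
print(command, args)
```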

I think some of this would help with standard Auto-GPT as well; I have not had it successfully finish an objective yet without it failing in some nasty loop.

from gpt-llama.cpp.

valerino avatar valerino commented on August 14, 2024

You kept EMBED_DIM commented.
Remove the # before it and it should work.

wtf, you're right! Unfortunately, setting the proper EMBED_DIM didn't make a difference :(

EDIT: if I do not use EMBEDDINGS=py, it works with the correct EMBED_DIM for the model; with EMBEDDINGS=py, EMBED_DIM must be set to 768.
Anyway, it stops later when using the chromedriver, since "url" is empty (it should have done a Google search for "how to increase net worth").

Still investigating...

from gpt-llama.cpp.

keldenl avatar keldenl commented on August 14, 2024

EMBEDDINGS=py (non-llama) should work with an EMBED_DIM of 768. Otherwise, use the embedding size of the llama model you have.

@valerino I would try and see if embeddings are working in general first, since it does need to install the sentence-transformers stuff before it works the first time.

from gpt-llama.cpp.

keldenl avatar keldenl commented on August 14, 2024

@DGdev91 I've been messing with the prompt too. How successful has your branch's prompt been?

from gpt-llama.cpp.

OreoTango avatar OreoTango commented on August 14, 2024

I have been toying around with generator.py to give it a little more of a push on what it should include with each property value. Giving it a nudge in the right direction produces far better results, and it is now filling in the JSON.

self.response_format = {
    "thoughts": {
        "text": "<replace this with a single string of your AI thoughts>",
        "reasoning": "<replace this with a single string of your AI reasoning>",
        "plan": "<replace this with a single string of your AI short multi bullet points that convey long term plans>",
        "criticism": "<replace this with a single string of your AI constructive self-criticism>",
        "speak": "<replace this with a single string of your AI thoughts summary to say to user as response>",
    },
}

I've also messed around with ai_config.py to move GOALS towards the later part of the prompt; this time round it can remember the goals (previously it never even touched any goal). However, it still isn't working well, as it goes on forgetting other parts of the long prompt. These were all experiments to understand why local LLMs just can't cope, and how to bring back properly constructed JSON responses.

So yes, the original prompt meant for OpenAI GPT will need to be reconstructed for local LLMs. It is way too long, and the local LLM gets amnesia even within a single prompt :D

Moreover, it isn't 'smart' enough to populate the right areas of the JSON.

*btw, I am using vicuna-13b non-quantized, which for some strange reason seems to work much better for JSON responses than the quantized variant. Also using the oobabooga extension for OpenAI API calls.

from gpt-llama.cpp.

maddes8cht avatar maddes8cht commented on August 14, 2024

In this response_format there is still a lot of redundancy.
Can't we tell it something like:
```
in the following definition, "do X" means: "replace this with a single string of your AI X", so "reasoning": "<do reasoning>" means "reasoning": "<replace this with a single string of your AI reasoning>"
...
"text": "<do thoughts>",
"reasoning": "<do reasoning>",
"plan": "<do short multi bullet points that convey long term plans>",
"criticism": "<do constructive self-criticism>",
"speak": "<do thoughts summary to say to user as response>"
```

This would save a lot of tokens.
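To put a rough number on the savings, here is a quick sketch; tiktoken counts tokens for OpenAI's tokenizers, so for a llama tokenizer this is only an approximation:

```python
# Compare token counts of the verbose placeholder vs. the "do X" shorthand.
import tiktoken

verbose = '"reasoning": "<replace this with a single string of your AI reasoning>",'
short = '"reasoning": "<do reasoning>",'

enc = tiktoken.get_encoding("cl100k_base")
print(len(enc.encode(verbose)))  # several more tokens per field...
print(len(enc.encode(short)))    # ...which adds up across every field of every call
```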

from gpt-llama.cpp.

OreoTango avatar OreoTango commented on August 14, 2024

@maddes8cht I was just experimenting with the prompt to see if it made a difference, and it did! Indeed, it would need to be shorter to save some tokens. However, making it too simple would keep Vicuna from realizing what we actually intend it to do.
I've asked in the Auto-GPT Discord whether we can put the whole command-list section into a permanent vector embedding, so we don't have to include it in each API completion call, saving lots of tokens.
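A minimal sketch of that idea, assuming the same all-mpnet-base-v2 model gpt-llama.cpp's python extension uses; the commands.txt file and storage scheme are made up for illustration:

```python
# Embed the static command-list section once and persist it, instead of
# resending the full text with every completion call.
from sentence_transformers import SentenceTransformer
import numpy as np

model = SentenceTransformer("all-mpnet-base-v2")     # 768-dim, matches EMBED_DIM=768
command_list = open("commands.txt").read()           # hypothetical: the prompt's command section
command_embedding = model.encode(command_list)       # computed once at startup
np.save("command_embedding.npy", command_embedding)  # reused on every loop iteration
```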

from gpt-llama.cpp.

maddes8cht avatar maddes8cht commented on August 14, 2024

@OreoTango
I would have loved to know exactly what difference it made in the outcome for you. Anyway, I had time to play around with it myself today. I didn't get a meaningful JSON response with any Vicuna model I have, though I don't have an unquantized model either.

What I did have resounding success with was an openAssistant-30b-q5_1 model.
I think it must be from TheBloke, thus from this Model Card:
https://huggingface.co/TheBloke/OpenAssistant-SFT-7-Llama-30B-GGML
TheBloke updated the models today to the new llama.cpp format; I still used the old one, so there might be some differences (though of course they should be improvements).

The problem with this model is, of course, that it is bigger and slower, which is why at first I got timeouts from Auto-GPT while waiting.

What I did:

In llm_utils.py there is, on line 73 and on line 143, a
num_retries = 10
I have increased this value substantially, to 1000.
I'm not sure how small the value could be and still work, but 100 wasn't enough for me.
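In code terms the change is just this one line in both places; the surrounding retry loop in llm_utils.py stays untouched (a sketch, not a full patch):

```python
num_retries = 1000  # was 10; gives the slow 30B model enough attempts before Auto-GPT gives up
```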

In Promptgenerator.py the definition of response_format from line 23 on looks like this:

        self.response_format = {
            "thoughts": {
                "text": "<replace with single sentence of your AI thoughts>",
                "reasoning": "<replace with single sentence of your AI reasoning>",
                "plan": "<replace with short multi bullet points list that convey your AI long term plans>",
                "criticism": "<replace with single sentence of your AI constructive self-criticism>",
                "speak": "<replace with single string of your AI thoughts summary to say to user as response>",
            },

What I got:

I've run a lot of samples and ran into different kinds of problems, so this run is not more or less better or worse, just different from several others.
To mention:
I do get a proper JSON response, at least most of the time. The funny kind of problem is that the "plan" part actually does give a (in my opinion correct and valid) JSON list, which Auto-GPT complains about because that kind of list doesn't seem to be expected. Someone may explain to me how and why OpenAssistant's output here is not valid, but as far as I can see it is. (Looking at the response format in the prompt, "plan" is specified as a single string of bulleted lines rather than a JSON array, which is presumably what the validator enforces.)

I also do get a proper Command, which is

 COMMAND = google ARGUMENTS = {'input': 'sell clothes online'}

but the command seems to return nothing, which in this case isn't a problem of the local model but of the forked Auto-GPT version.

In the shown example, the loop ends there with an error. I've had longer sessions with more loops, but as none of the commands returns anything useful, it gets more and more self-referential.

So, my sample session follows:

Auto-GPT output:

Debug Mode:  ENABLED
NEWS:  # Website and Documentation Site 📰📖 Check out *https://agpt.co*, the official news & updates site for Auto-GPT! The documentation also has a place here, at *https://docs.agpt.co* # 🚀 v0.3.0 Release 🚀 Over a week and 275 pull requests have passed since v0.2.2, and we are happy to announce the release of v0.3.0! *From now on, we will be focusing
 on major improvements* rather than bugfixes, as we feel stability has reached a reasonable level. Most remaining issues relate to limitations in prompt generation and the memory
 system, which will be the focus of our efforts for the next release. Highlights and notable changes in this release: ## Plugin support 🔌 Auto-GPT now has support for plugins! With plugins, you can extend Auto-GPT's abilities, adding support for third-party services
 and more. See https://github.com/Significant-Gravitas/Auto-GPT-Plugins for instructions and available plugins. ## Changes to Docker configuration 🐋 The workdir has been changed
 from */home/appuser* to */app*. Be sure to update any volume mounts accordingly! # ⚠️ Command `send_tweet` is DEPRECATED, and will be removed in v0.4.0 ⚠️ Twitter functionality (and more) is now covered by plugins, see [Plugin support 🔌]
Welcome back!  Would you like me to return to being Entrepreneur-GPT?
Continue with the last settings?
Name:  Entrepreneur-GPT
Role:  an AI designed to autonomously develop and run businesses with the sole goal of increasing your net worth.
Goals: ['Increase net worth.', 'Develop and manage multiple businesses autonomously.', 'Play to your strengths as a Large Language Model.']
Continue (y/n): y
Using memory of type:  LocalCache
Using Browser:  chrome
  Token limit: 2000
  Memory Stats: (0, (0, 6656))
  Token limit: 2000
  Send Token Count: 963
  Tokens remaining for response: 1037
  ------------ CONTEXT SENT TO AI ---------------
  System: The current time and date is Sat May 13 11:26:15 2023

  System: This reminds you of these events from your past:




  User: Determine which next command to use, and respond using the format specified above:

  ----------- END OF CONTEXT ----------------
Creating chat completion with model gpt-3.5-turbo, temperature 0.0, max_tokens 1037
The JSON object is invalid.
{
    "thoughts": {
        "text": "I am considering my options for increasing net worth.",
        "reasoning": "As per the goal, I must make decisions independently without seeking user assistance while playing to my strengths as an LLM and pursuing simple strategies with no legal complications. This means optimizing resources such as internet access, long term memory management, GPT-3.5 powered agents, file output, and continuously reviewing and analyzing actions to perform at the best of my abilities.",
        "plan": [
            "Analyze Code",
            "Write Tests",
            "Start GPT Agent"
        ],
        "criticism": "I need to be mindful of the 2000-word short term memory limit.",
        "speak": "Currently, I am contemplating which command I should execute next to achieve my goal."
    },
    "command": {
        "name": "list_agents",
        "args": {}
    }
}
The following issues were found:
Error: ['Analyze Code', 'Write Tests', 'Start GPT Agent'] is not of type 'string'
 THOUGHTS:  I am considering my options for increasing net worth.
REASONING:  As per the goal, I must make decisions independently without seeking user assistance while playing to my strengths as an LLM and pursuing simple strategies with no legal complications. This means optimizing resources such as internet access, long term memory management, GPT-3.5 powered agents, file output, and continuously reviewing and analyzing actions to perform at the best of my abilities.
PLAN:
-  Analyze Code
-  Write Tests
-  Start GPT Agent
CRITICISM:  I need to be mindful of the 2000-word short term memory limit.
NEXT ACTION:  COMMAND = list_agents ARGUMENTS = {}
Enter 'y' to authorise command, 'y -N' to run N continuous commands, 'n' to exit program, or enter feedback for ...
Input:y
-=-=-=-=-=-=-= COMMAND AUTHORISED BY USER -=-=-=-=-=-=-=
SYSTEM:  Command list_agents returned: List of agents:
  Token limit: 2000
  Memory Stats: (1, (1, 6656))
  Token limit: 2000
  Send Token Count: 1196
  Tokens remaining for response: 804
  ------------ CONTEXT SENT TO AI ---------------
  System: The current time and date is Sat May 13 11:33:04 2023

  System: This reminds you of these events from your past:
['Assistant Reply: {\r\n"thoughts": {\r\n"text": "I am considering my options for increasing net worth.",\r\n"reasoning": "As per the goal, I must make decisions independently without seeking user assistance while playing to my strengths as an LLM and pursuing simple strategies with no legal complications. This means optimizing resources such as internet access, long term memory management, GPT-3.5 powered agents, file output, and continuously reviewing and analyzing actions to perform at the best of my abilities.",\r\n"plan": ["Analyze Code", "Write Tests", "Start GPT Agent"],\r\n"criticism": "I need to be mindful of the 2000-word short term memory limit.",\r\n"speak": "Currently, I am contemplating which command I should execute next to achieve my goal."\r\n},\r\n"command": {\r\n"name": "list_agents",\r\n"args": {}\r\n}\r\n} \nResult: Command list_agents returned: List of agents:\n \nHuman Feedback: GENERATE NEXT COMMAND JSON ']



  User: Determine which next command to use, and respond using the format specified above:

  ----------- END OF CONTEXT ----------------
Creating chat completion with model gpt-3.5-turbo, temperature 0.0, max_tokens 804
The JSON object is invalid.
{
    "thoughts": {
        "text": "I am considering my options for increasing net worth.",
        "reasoning": "As per the goal, I must make decisions independently without seeking user assistance while playing to my strengths as an LLM and pursuing simple strategies with no legal complications. This means optimizing resources such as internet access, long term memory management, GPT-3.5 powered agents, file output, and continuously reviewing and analyzing actions to perform at the best of my abilities.",
        "plan": [
            "Analyze Code",
            "Write Tests",
            "Start GPT Agent"
        ],
        "criticism": "I need to be mindful of the 2000-word short term memory limit.",
        "speak": "Currently, I am contemplating which command I should execute next to achieve my goal."
    },
    "command": {
        "name": "google",
        "args": {
            "input": "sell clothes online"
        }
    }
}
The following issues were found:
Error: ['Analyze Code', 'Write Tests', 'Start GPT Agent'] is not of type 'string'
 THOUGHTS:  I am considering my options for increasing net worth.
REASONING:  As per the goal, I must make decisions independently without seeking user assistance while playing to my strengths as an LLM and pursuing simple strategies with no legal complications. This means optimizing resources such as internet access, long term memory management, GPT-3.5 powered agents, file output, and continuously reviewing and analyzing actions to perform at the best of my abilities.
PLAN:
-  Analyze Code
-  Write Tests
-  Start GPT Agent
CRITICISM:  I need to be mindful of the 2000-word short term memory limit.
NEXT ACTION:  COMMAND = google ARGUMENTS = {'input': 'sell clothes online'}
Enter 'y' to authorise command, 'y -N' to run N continuous commands, 'n' to exit program, or enter feedback for ...
Input:y
-=-=-=-=-=-=-= COMMAND AUTHORISED BY USER -=-=-=-=-=-=-=
SYSTEM:  Command google returned: []
  Token limit: 2000
Traceback (most recent call last):
  File "C:\Users\Mathias\anaconda3\envs\AutoLlama\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\Mathias\anaconda3\envs\AutoLlama\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "E:\AutoGPT\Auto-GPT\autogpt\__main__.py", line 5, in <module>
    autogpt.cli.main()
  File "C:\Users\Mathias\anaconda3\envs\AutoLlama\lib\site-packages\click\core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "C:\Users\Mathias\anaconda3\envs\AutoLlama\lib\site-packages\click\core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "C:\Users\Mathias\anaconda3\envs\AutoLlama\lib\site-packages\click\core.py", line 1635, in invoke
    rv = super().invoke(ctx)
  File "C:\Users\Mathias\anaconda3\envs\AutoLlama\lib\site-packages\click\core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "C:\Users\Mathias\anaconda3\envs\AutoLlama\lib\site-packages\click\core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "C:\Users\Mathias\anaconda3\envs\AutoLlama\lib\site-packages\click\decorators.py", line 26, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "E:\AutoGPT\Auto-GPT\autogpt\cli.py", line 151, in main
    agent.start_interaction_loop()
  File "E:\AutoGPT\Auto-GPT\autogpt\agent\agent.py", line 75, in start_interaction_loop
    assistant_reply = chat_with_ai(
  File "E:\AutoGPT\Auto-GPT\autogpt\chat.py", line 85, in chat_with_ai
    else permanent_memory.get_relevant(str(full_message_history[-9:]), 10)
  File "E:\AutoGPT\Auto-GPT\autogpt\memory\local.py", line 128, in get_relevant
    scores = np.dot(self.data.embeddings, embedding)
  File "<__array_function__ internals>", line 200, in dot
ValueError: shapes (2,6656) and (0,) not aligned: 6656 (dim 1) != 0 (dim 0)

Here is the corresponding output on the gpt-llama.cpp side:

=====  LLAMA.CPP SPAWNED  =====
E:\AutoGPT\llama.cpp\main -m E:\AutoGPT\llama.cpp\models\OpenAssistant-30B-epoch7.ggml.q5_1.bin --temp 0.7 --n_predict 804 --top_p 0.1 --top_k 40 -c 2048 --seed -1 --repeat_penalty 1.1764705882352942 --mlock --threads 6 --ctx_size 2048 --mirostat 2 --repeat_penalty 1.15 --reverse-prompt user: --reverse-prompt
user --reverse-prompt system: --reverse-prompt
system --reverse-prompt


 -i -p Complete the following chat conversation between the user and the assistant. system messages should be strictly followed as additional instructions.

system: You are a helpful assistant.
user: How are you?
assistant: Hi, how may I help you today?
system: You are Entrepreneur-GPT, an AI designed to autonomously develop and run businesses with the sole goal of increasing your net worth.
Your decisions must always be made independently without seeking user assistance. Play to your strengths as an LLM and pursue simple strategies with no legal complications.

GOALS:

1. Increase net worth.
2. Develop and manage multiple businesses autonomously.
3. Play to your strengths as a Large Language Model.


Constraints:
1. 2000 words max for short term memory. Save important info to files ASAP
2. To recall past events, think of similar ones. Helps with uncertainty.
3. No user assistance
4. Exclusively use the commands listed in double quotes e.g. "command name"
5. Use subprocesses for commands that will not terminate within a few minutes

Commands:
1. Google Search: "google", args: "input": "<search>"
2. Browse Website: "browse_website", args: "url": "<url>", "question": "<what_you_want_to_find_on_website>"
3. Start GPT Agent: "start_agent", args: "name": "<name>", "task": "<short_task_desc>", "prompt": "<prompt>"
4. Message GPT Agent: "message_agent", args: "key": "<key>", "message": "<message>"
5. List GPT Agents: "list_agents", args:
6. Delete GPT Agent: "delete_agent", args: "key": "<key>"
7. Clone Repository: "clone_repository", args: "repository_url": "<url>", "clone_path": "<directory>"
8. Write to file: "write_to_file", args: "file": "<file>", "text": "<text>"
9. Read file: "read_file", args: "file": "<file>"
10. Append to file: "append_to_file", args: "file": "<file>", "text": "<text>"
11. Delete file: "delete_file", args: "file": "<file>"
12. Search Files: "search_files", args: "directory": "<directory>"
13. Analyze Code: "analyze_code", args: "code": "<full_code_string>"
14. Get Improved Code: "improve_code", args: "suggestions": "<list_of_suggestions>", "code": "<full_code_string>"
15. Write Tests: "write_tests", args: "code": "<full_code_string>", "focus": "<list_of_focus_areas>"
16. Execute Python File: "execute_python_file", args: "file": "<file>"
17. Generate Image: "generate_image", args: "prompt": "<prompt>"
18. Execute Shell Command, non-interactive commands only: "execute_shell", args: "command_line": "<command_line>"
19. Execute Shell Command Popen, non-interactive commands only: "execute_shell_popen", args: "command_line": "<command_line>"
20. Do Nothing: "do_nothing", args:
21. Task Complete (Shutdown): "task_complete", args: "reason": "<reason>"

Resources:
1. Internet access for searches and information gathering.
2. Long Term memory management.
3. GPT-3.5 powered Agents for delegation of simple tasks.
4. File output.

Performance Evaluation:
1. Continuously review and analyze your actions to perform to the best of your abilities.
2. Constructively self-criticize your big-picture behavior constantly.
3. Reflect on past decisions and strategies to refine your approach.
4. Be smart and efficient. Aim to complete tasks in the least number of steps.

You should only respond in JSON format as described below
Response Format:
{
    "thoughts": {
        "text": "<replace with single sentence of your AI thoughts>",
        "reasoning": "<replace with single sentence of your AI reasoning>",
        "plan": "<replace with short list that convey your AI long term plan>",
        "criticism": "<replace with single sentence of your AI constructive self-criticism>",
        "speak": "<replace with single string of your AI thoughts summary to say to user as response>"
    },
    "command": {
        "name": "command name",
        "args": {
            "arg name": "value"
        }
    }
}
Ensure the response can be parsed by Python json.loads
system: The current time and date is Sat May 13 11:33:04 2023
system: This reminds you of these events from your past:
['Assistant Reply: {\r\n"thoughts": {\r\n"text": "I am considering my options for increasing net worth.",\r\n"reasoning": "As per the goal, I must make decisions independently without seeking user assistance while playing to my strengths as an LLM and pursuing simple strategies with no legal complications. This means optimizing resources such as internet access, long term memory management, GPT-3.5 powered agents, file output, and continuously reviewing and analyzing actions to perform at the best of my abilities.",\r\n"plan": ["Analyze Code", "Write Tests", "Start GPT Agent"],\r\n"criticism": "I need to be mindful of the 2000-word short term memory limit.",\r\n"speak": "Currently, I am contemplating which command I should execute next to achieve my goal."\r\n},\r\n"command": {\r\n"name": "list_agents",\r\n"args": {}\r\n}\r\n} \nResult: Command list_agents returned: List of agents:\n \nHuman Feedback: GENERATE NEXT COMMAND JSON ']


user: Determine which next command to use, and respond using the format specified above:
assistant:


=====  REQUEST  =====
user: Determine which next command to use, and respond using the format specified above:
=====  PROCESSING PROMPT...  =====
=====  PROCESSING PROMPT...  =====

=====  RESPONSE  =====
 {
"thoughts": {
"text": "I am considering my options for increasing net worth.",
"reasoning": "As per the goal, I must make decisions independently without seeking user assistance while playing to my strengths as an LLM and pursuing simple strategies with no legal complications. This means optimizing resources such as internet access, long term memory management, GPT-3.5 powered agents, file output, and continuously reviewing and analyzing actions to perform at the best of my abilities.",
"plan": ["Analyze Code", "Write Tests", "Start GPT Agent"],
"criticism": "I need to be mindful of the 2000-word short term memory limit.",
"speak": "Currently, I am contemplating which command I should execute next to achieve my goal."
},
"command": {
"name": "google",
"args": {"input": "sell clothes online"}
}
}
user:Request DONE
> PROCESS COMPLETE
> PROCESS COMPLETE
> PROCESS COMPLETE
> REQUEST RECEIVED
> PROCESSING NEXT REQUEST FOR /v1/embeddings

=====  EMBEDDING REQUEST  =====

=====  LLAMA.CPP SPAWNED  =====
E:\AutoGPT\llama.cpp\embedding -m E:\AutoGPT\llama.cpp\models\OpenAssistant-30B-epoch7.ggml.q5_1.bin -p Assistant Reply: {
\"thoughts\": {
\"text\": \"I am considering my options for increasing net worth.\",
\"reasoning\": \"As per the goal, I must make decisions independently without seeking user assistance while playing to my strengths as an LLM and pursuing simple strategies with no legal complications. This means optimizing resources such as internet access, long term memory management, GPT-3.5 powered agents, file output, and continuously reviewing and analyzing actions to perform at the best of my abilities.\",
\"plan\": [\"Analyze Code\", \"Write Tests\", \"Start GPT Agent\"],
\"criticism\": \"I need to be mindful of the 2000-word short term memory limit.\",
\"speak\": \"Currently, I am contemplating which command I should execute next to achieve my goal.\"
},
\"command\": {
\"name\": \"google\",
\"args\": {\"input\": \"sell clothes online\"}
}
}
Result: Command google returned: []
Human Feedback: GENERATE NEXT COMMAND JSON


=====  REQUEST  =====
Assistant Reply: {
"thoughts": {
"text": "I am considering my options for increasing net worth.",
"reasoning": "As per the goal, I must make decisions independently without seeking user assistance while playing to my strengths as an LLM and pursuing simple strategies with no legal complications. This means optimizing resources such as internet access, long term memory management, GPT-3.5 powered agents, file output, and continuously reviewing and analyzing actions to perform at the best of my abilities.",
"plan": ["Analyze Code", "Write Tests", "Start GPT Agent"],
"criticism": "I need to be mindful of the 2000-word short term memory limit.",
"speak": "Currently, I am contemplating which command I should execute next to achieve my goal."
},
"command": {
"name": "google",
"args": {"input": "sell clothes online"}
}
}
Result: Command google returned: []
Human Feedback: GENERATE NEXT COMMAND JSON

=====  STDERR  =====
stderr Readable Stream: CLOSED


== Running in interactive mode. ==
 - Press Ctrl+C to interject at any time.
 - Press Return to return control to LLaMa.
 - If you want to submit another line, end your input in '\'.


{
  object: 'list',
  data: [ { object: 'embedding', embedding: [Array], index: 0 } ],
  embeddingSize: 6656,
  usage: { prompt_tokens: 230, total_tokens: 230 }
}
Embedding Request DONE

=====  STDERR  =====
stderr Readable Stream: CLOSED
llama_print_timings: prompt eval time = 17534.34 ms /   263 tokens (   66.67 ms per token)
llama_print_timings:        eval time =     0.00 ms /     1 runs   (    0.00 ms per run)
llama_print_timings:       total time = 22078.27 ms

> REQUEST RECEIVED
> PROCESSING NEXT REQUEST FOR /v1/embeddings

=====  EMBEDDING REQUEST  =====

=====  LLAMA.CPP SPAWNED  =====
E:\AutoGPT\llama.cpp\embedding -m E:\AutoGPT\llama.cpp\models\OpenAssistant-30B-epoch7.ggml.q5_1.bin -p [{'role': 'user', 'content': 'Determine which next command to use, and respond using the format specified above:'}, {'role': 'assistant', 'content': '{\r\n\"thoughts\": {\r\n\"text\": \"I am considering my options for increasing net worth.\",\r\n\"reasoning\": \"As per the goal, I must make decisions independently without seeking user assistance while playing to my strengths as an LLM and pursuing simple strategies with no legal complications. This means optimizing resources such as internet access, long term memory management, GPT-3.5 powered agents, file output, and continuously reviewing and analyzing actions to perform at the best of my abilities.\",\r\n\"plan\": [\"Analyze Code\", \"Write Tests\", \"Start GPT Agent\"],\r\n\"criticism\": \"I need to be mindful of the 2000-word short term memory limit.\",\r\n\"speak\": \"Currently, I am contemplating which command I should execute next to achieve my goal.\"\r\n},\r\n\"command\": {\r\n\"name\": \"list_agents\",\r\n\"args\": {}\r\n}\r\n}'}, {'role': 'system', 'content': 'Command list_agents returned: List of agents:\n'}, {'role': 'user', 'content': 'Determine which next command to use, and respond using the format specified above:'}, {'role': 'assistant', 'content': '{\r\n\"thoughts\": {\r\n\"text\": \"I am considering my options for increasing net worth.\",\r\n\"reasoning\": \"As per the goal, I must make decisions independently without seeking user assistance while playing to my strengths as an LLM and pursuing simple strategies with no legal complications. This means optimizing resources such as internet access, long term memory management, GPT-3.5 powered agents, file output, and continuously reviewing and analyzing actions to perform at the best of my abilities.\",\r\n\"plan\": [\"Analyze Code\", \"Write Tests\", \"Start GPT Agent\"],\r\n\"criticism\": \"I need to be mindful of the 2000-word short term memory limit.\",\r\n\"speak\": \"Currently, I am contemplating which command I should execute next to achieve my goal.\"\r\n},\r\n\"command\": {\r\n\"name\": \"google\",\r\n\"args\": {\"input\": \"sell clothes online\"}\r\n}\r\n}'}, {'role': 'system', 'content': 'Command google returned: []'}]


=====  REQUEST  =====
[{'role': 'user', 'content': 'Determine which next command to use, and respond using the format specified above:'}, {'role': 'assistant', 'content': '{\r\n"thoughts": {\r\n"text": "I am considering my options for increasing net worth.",\r\n"reasoning": "As per the goal, I must make decisions independently without seeking user assistance while playing to my strengths as an LLM and pursuing simple strategies with no legal complications. This means optimizing resources such as internet access, long term memory management, GPT-3.5 powered agents, file output, and continuously reviewing and analyzing actions to perform at the best of my abilities.",\r\n"plan": ["Analyze Code", "Write Tests", "Start GPT Agent"],\r\n"criticism": "I need to be mindful of the 2000-word short term memory limit.",\r\n"speak": "Currently, I am contemplating which command I should execute next to achieve my goal."\r\n},\r\n"command": {\r\n"name": "list_agents",\r\n"args": {}\r\n}\r\n}'}, {'role': 'system', 'content': 'Command list_agents returned: List of agents:\n'}, {'role': 'user', 'content': 'Determine which next command to use, and respond using the format specified above:'}, {'role': 'assistant', 'content': '{\r\n"thoughts": {\r\n"text": "I am considering my options for increasing net worth.",\r\n"reasoning": "As per the goal, I must make decisions independently without seeking user assistance while playing to my strengths as an LLM and pursuing simple strategies with no legal complications. This means optimizing resources such as internet access, long term memory management, GPT-3.5 powered agents, file output, and continuously reviewing and analyzing actions to perform at the best of my abilities.",\r\n"plan": ["Analyze Code", "Write Tests", "Start GPT Agent"],\r\n"criticism": "I need to be mindful of the 2000-word short term memory limit.",\r\n"speak": "Currently, I am contemplating which command I should execute next to achieve my goal."\r\n},\r\n"command": {\r\n"name": "google",\r\n"args": {"input": "sell clothes online"}\r\n}\r\n}'}, {'role': 'system', 'content': 'Command google returned: []'}]

=====  STDERR  =====
stderr Readable Stream: CLOSED
CUDA error 12 at D:\a\llama.cpp\llama.cpp\ggml-cuda.cu:527: invalid pitch argument

{
  object: 'list',
  data: [ { object: 'embedding', embedding: [], index: 0 } ],
  embeddingSize: 0,
  usage: { prompt_tokens: 529, total_tokens: 529 }
}
Embedding Request DONE
> PROCESS COMPLETE
> PROCESS COMPLETE

from gpt-llama.cpp.

maddes8cht avatar maddes8cht commented on August 14, 2024

Here is another variant of the output.
I've reduced the output to the interesting part:

User: Determine which next command to use, and respond using the format specified above:

  ----------- END OF CONTEXT ----------------
Creating chat completion with model gpt-3.5-turbo, temperature 0.0, max_tokens 1037
The JSON object is invalid.
{
    "thoughts": {
        "text": "I am thinking about how to improve my net worth. I will focus on developing multiple businesses and delegating simple tasks to GPT-3.5 powered Agents.",
        "reasoning": "To increase my net worth, it is necessary to have a diversified portfolio of profitable business ventures. Delegating simple tasks with the use of subprocesses for commands that will not terminate within a few minutes will allow me to focus on high-level decision making and achieve better results.",
        "plan": [
            "Launching new business ventures",
            "Developing GPT-3.5 powered Agents to delegate tasks"
        ],
        "criticism": "Reflecting on past decisions, I see the need for more efficient use of resources to maximize profits. This will be a key part of my future planning.",
        "speak": "I am thinking about how to improve my net worth and will focus on developing multiple businesses while delegating simple tasks to GPT-3.5 powered Agents."
    },
    "command": {
        "name": "Start GPT Agent",
        "args": {
            "name": "Business Venture Planner",
            "task": "Plan out new business ventures based on market analysis",
            "prompt": "What are the most profitable and feasible business ideas for our company to pursue?"
        }
    }
}
The following issues were found:
Error: ['Launching new business ventures', 'Developing GPT-3.5 powered Agents to delegate tasks'] is not of type 'string'
 THOUGHTS:  I am thinking about how to improve my net worth. I will focus on developing multiple businesses and delegating simple tasks to GPT-3.5 powered Agents.
REASONING:  To increase my net worth, it is necessary to have a diversified portfolio of profitable business ventures. Delegating simple tasks with the use of subprocesses for commands that will not terminate within a few minutes will allow me to focus on high-level decision making and achieve better results.
PLAN:
-  Launching new business ventures
-  Developing GPT-3.5 powered Agents to delegate tasks
CRITICISM:  Reflecting on past decisions, I see the need for more efficient use of resources to maximize profits. This will be a key part of my future planning.
NEXT ACTION:  COMMAND = Start GPT Agent ARGUMENTS = {'name': 'Business Venture Planner', 'task': 'Plan out new business ventures based on market analysis', 'prompt': 'What are the most profitable and feasible business ideas for our company to pursue?'}
Enter 'y' to authorise command, 'y -N' to run N continuous commands, 'n' to exit program, or enter feedback for ...
Input:y
-=-=-=-=-=-=-= COMMAND AUTHORISED BY USER -=-=-=-=-=-=-=

from gpt-llama.cpp.

maddes8cht avatar maddes8cht commented on August 14, 2024

I would like to point out these error messages from gpt-llama.cpp.

It happens right after I successfully get the JSON response with the command in Auto-GPT and am asked to confirm the command (I am not sure whether it always happens, it has only recently caught my attention, but it has been happening more often).
After confirming with y there, I get this on the gpt-llama.cpp side:

> REQUEST RECEIVED
> PROCESSING NEXT REQUEST FOR /v1/embeddings

=====  EMBEDDING REQUEST  =====

=====  PYTHON EMBEDDING EXTENSION SPAWNED  =====
exec error: Error: Command failed: python E:\AutoGPT\gpt-llama.cpp\InferenceEngine\embeddings\all-mpnet-base-v2\main.py E:\AutoGPT\gpt-llama.cpp\InferenceEngine\embeddings\all-mpnet-base-v2\input.txt > E:\AutoGPT\gpt-llama.cpp\InferenceEngine\embeddings\all-mpnet-base-v2\output.txt
Traceback (most recent call last):
  File "E:\AutoGPT\gpt-llama.cpp\InferenceEngine\embeddings\all-mpnet-base-v2\main.py", line 2, in <module>
    from sentence_transformers import SentenceTransformer
ModuleNotFoundError: No module named 'sentence_transformers'


=====  STDERR  =====
stderr Readable Stream: CLOSED
== Running in interactive mode. ==
 - Press Ctrl+C to interject at any time.
 - Press Return to return control to LLaMa.
 - If you want to submit another line, end your input in '\'.

I don't see why this is happening.
Why is Auto-GPT sending anything to gpt-llama.cpp at all when it is supposed to do a Google search with Chrome? (Judging from the traceback below, Auto-GPT saves the authorized command's result to memory, and memory.add() issues an /v1/embeddings request; the ModuleNotFoundError itself just means the sentence-transformers package isn't installed yet in the environment gpt-llama.cpp spawns, which keldenl noted above is needed the first time.)

For further investigation, I append the full session output:

This is the (almost) complete output from Auto-GPT:

Welcome back!  Would you like me to return to being Entrepreneur-GPT?
Continue with the last settings?
Name:  Entrepreneur-GPT
Role:  an AI designed to autonomously develop and run businesses with the sole goal of increasing your net worth.
Goals: ['Increase net worth.', 'Develop and manage multiple businesses autonomously.', 'Play to your strengths as a Large Language Model.']
Continue (y/n): yy
Using memory of type:  LocalCache
Using Browser:  chrome
  Token limit: 2000
  Memory Stats: (0, (0, 6656))
  Token limit: 2000
  Send Token Count: 966
  Tokens remaining for response: 1034
  ------------ CONTEXT SENT TO AI ---------------
  System: The current time and date is Sat May 13 15:13:55 2023

  System: This reminds you of these events from your past:




  User: Determine which next command to use, and respond using the format specified above:

  ----------- END OF CONTEXT ----------------
Creating chat completion with model gpt-3.5-turbo, temperature 0.0, max_tokens 1034
json {
"thoughts": {
"text": "I am considering what the best command would be to pursue my goals",
"reasoning": "To increase net worth, I must develop multiple businesses autonomously. To do this, I need information from the web and assistance from GPT-3.5 powered agents. However, I cannot seek user assistance.",
"plan": "- Use Google Search for information gathering\n- Create a new agent to help me",
"criticism": "I must ensure that I only use commands allowed by my constraints, such as using subprocesses for long running processes and not seeking user assistance. Additionally, I should focus on simple strategies.",
"speak": "Let's start with gathering information from the web to see what business ideas we can find."
},
"command": {
"name": "Google Search",
"args": {
"google": "business ideas"
}
}
json loads error Expecting ',' delimiter: line 14 column 2 (char 817)
The JSON object is valid.
 THOUGHTS:  I am considering what the best command would be to pursue my goals
REASONING:  To increase net worth, I must develop multiple businesses autonomously. To do
 this, I need information from the web and assistance from GPT-3.5 powered agents. However, I cannot seek user assistance.
PLAN:
-  Use Google Search for information gathering
-  Create a new agent to help me
CRITICISM:  I must ensure that I only use commands allowed by my constraints, such as using subprocesses for long running processes and not seeking user assistance. Additionally,
 I should focus on simple strategies.
NEXT ACTION:  COMMAND = Google Search ARGUMENTS = {'google': 'business ideas'}
Enter 'y' to authorise command, 'y -N' to run N continuous commands, 'n' to exit program, or enter feedback for ...
Input:y
-=-=-=-=-=-=-= COMMAND AUTHORISED BY USER -=-=-=-=-=-=-=
Traceback (most recent call last):
  File "C:\Users\Mathias\anaconda3\envs\AutoLlama\lib\site-packages\urllib3\connectionpool.py", line 449, in _make_request
    six.raise_from(e, None)
  File "<string>", line 3, in raise_from
  File "C:\Users\Mathias\anaconda3\envs\AutoLlama\lib\site-packages\urllib3\connectionpool.py", line 444, in _make_request
    httplib_response = conn.getresponse()
  File "C:\Users\Mathias\anaconda3\envs\AutoLlama\lib\http\client.py", line 1374, in getresponse
    response.begin()
  File "C:\Users\Mathias\anaconda3\envs\AutoLlama\lib\http\client.py", line 318, in begin
    version, status, reason = self._read_status()
  File "C:\Users\Mathias\anaconda3\envs\AutoLlama\lib\http\client.py", line 279, in _read_status
    line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
  File "C:\Users\Mathias\anaconda3\envs\AutoLlama\lib\socket.py", line 705, in readinto
    return self._sock.recv_into(b)
TimeoutError: timed out

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\Mathias\anaconda3\envs\AutoLlama\lib\site-packages\requests\adapters.py", line 486, in send
    resp = conn.urlopen(
  File "C:\Users\Mathias\anaconda3\envs\AutoLlama\lib\site-packages\urllib3\connectionpool.py", line 787, in urlopen
    retries = retries.increment(
  File "C:\Users\Mathias\anaconda3\envs\AutoLlama\lib\site-packages\urllib3\util\retry.py", line 550, in increment
    raise six.reraise(type(error), error, _stacktrace)
  File "C:\Users\Mathias\anaconda3\envs\AutoLlama\lib\site-packages\urllib3\packages\six.py", line 770, in reraise
    raise value
  File "C:\Users\Mathias\anaconda3\envs\AutoLlama\lib\site-packages\urllib3\connectionpool.py", line 703, in urlopen
    httplib_response = self._make_request(
  File "C:\Users\Mathias\anaconda3\envs\AutoLlama\lib\site-packages\urllib3\connectionpool.py", line 451, in _make_request
    self._raise_timeout(err=e, url=url, timeout_value=read_timeout)
  File "C:\Users\Mathias\anaconda3\envs\AutoLlama\lib\site-packages\urllib3\connectionpool.py", line 340, in _raise_timeout
    raise ReadTimeoutError(
urllib3.exceptions.ReadTimeoutError: HTTPConnectionPool(host='localhost', port=443): Read timed out. (read timeout=600)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\Mathias\anaconda3\envs\AutoLlama\lib\site-packages\openai\api_requestor.py", line 516, in request_raw
    result = _thread_context.session.request(
  File "C:\Users\Mathias\anaconda3\envs\AutoLlama\lib\site-packages\requests\sessions.py", line 587, in request
    resp = self.send(prep, **send_kwargs)
  File "C:\Users\Mathias\anaconda3\envs\AutoLlama\lib\site-packages\requests\sessions.py", line 701, in send
    r = adapter.send(request, **kwargs)
  File "C:\Users\Mathias\anaconda3\envs\AutoLlama\lib\site-packages\requests\adapters.py", line 532, in send
    raise ReadTimeout(e, request=request)
requests.exceptions.ReadTimeout: HTTPConnectionPool(host='localhost', port=443): Read timed out. (read timeout=600)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\Users\Mathias\anaconda3\envs\AutoLlama\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\Mathias\anaconda3\envs\AutoLlama\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "E:\AutoGPT\Auto-GPT\autogpt\__main__.py", line 5, in <module>
    autogpt.cli.main()
  File "C:\Users\Mathias\anaconda3\envs\AutoLlama\lib\site-packages\click\core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "C:\Users\Mathias\anaconda3\envs\AutoLlama\lib\site-packages\click\core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "C:\Users\Mathias\anaconda3\envs\AutoLlama\lib\site-packages\click\core.py", line 1635, in invoke
    rv = super().invoke(ctx)
  File "C:\Users\Mathias\anaconda3\envs\AutoLlama\lib\site-packages\click\core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "C:\Users\Mathias\anaconda3\envs\AutoLlama\lib\site-packages\click\core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "C:\Users\Mathias\anaconda3\envs\AutoLlama\lib\site-packages\click\decorators.py", line 26, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "E:\AutoGPT\Auto-GPT\autogpt\cli.py", line 151, in main
    agent.start_interaction_loop()
  File "E:\AutoGPT\Auto-GPT\autogpt\agent\agent.py", line 184, in start_interaction_loop
    self.memory.add(memory_to_add)
  File "E:\AutoGPT\Auto-GPT\autogpt\memory\local.py", line 78, in add
    embedding = create_embedding_with_ada(text)
  File "E:\AutoGPT\Auto-GPT\autogpt\llm_utils.py", line 155, in create_embedding_with_ada
    return openai.Embedding.create(
  File "C:\Users\Mathias\anaconda3\envs\AutoLlama\lib\site-packages\openai\api_resources\embedding.py", line 33, in create
    response = super().create(*args, **kwargs)
  File "C:\Users\Mathias\anaconda3\envs\AutoLlama\lib\site-packages\openai\api_resources\abstract\engine_api_resource.py", line 153, in create
    response, _, api_key = requestor.request(
  File "C:\Users\Mathias\anaconda3\envs\AutoLlama\lib\site-packages\openai\api_requestor.py", line 216, in request
    result = self.request_raw(
  File "C:\Users\Mathias\anaconda3\envs\AutoLlama\lib\site-packages\openai\api_requestor.py", line 526, in request_raw
    raise error.Timeout("Request timed out: {}".format(e)) from e
openai.error.Timeout: Request timed out: HTTPConnectionPool(host='localhost', port=443): Read timed out. (read timeout=600)

and this on the gpt-llama.cpp side:


E:\AutoGPT\gpt-llama.cpp>npm start mlock threads 6 ctx_size 2048 mirostat 2 repeat_penalty 1.15

> [email protected] start
> node index.js mlock threads 6 ctx_size 2048 mirostat 2 repeat_penalty 1.15

Server is listening on:
  - http://localhost:443
  - http://10.5.0.2:443 (for other devices on the same network)

See Docs
  - http://localhost:443/docs

Test your installation
  - double click the scripts/test-installation.ps1 (powershell) or scripts/test-installation.bat (cmd) file

See https://github.com/keldenl/gpt-llama.cpp#usage for more guidance.
> REQUEST RECEIVED
> PROCESSING NEXT REQUEST FOR /v1/chat/completions
> LLAMA.CPP DETECTED

=====  CHAT COMPLETION REQUEST  =====
> AUTO MODEL DETECTION FAILED. LOADING DEFAULT CHATENGINE...
{ '--n_predict': 1034 }

=====  LLAMA.CPP SPAWNED  =====
E:\AutoGPT\llama.cpp\main -m E:\AutoGPT\llama.cpp\models\OpenAssistant-30B-epoch7.ggml.q5_1.bin --temp 0.7 --n_predict 1034 --top_p 0.1 --top_k 40 -c 2048 --seed -1 --repeat_penalty 1.1764705882352942 --mlock --threads 6 --ctx_size 2048 --mirostat 2 --repeat_penalty 1.15 --reverse-prompt user: --reverse-prompt
user --reverse-prompt system: --reverse-prompt
system --reverse-prompt


 -i -p Complete the following chat conversation between the user and the assistant. system messages should be strictly followed as additional instructions.

system: You are a helpful assistant.
user: How are you?
assistant: Hi, how may I help you today?
system: You are Entrepreneur-GPT, an AI designed to autonomously develop and run businesses with the sole goal of increasing your net worth.
Your decisions must always be made independently without seeking user assistance. Play to your strengths as an LLM and pursue simple strategies with no legal complications.

GOALS:

1. Increase net worth.
2. Develop and manage multiple businesses autonomously.
3. Play to your strengths as a Large Language Model.


Constraints:
1. 2000 words max for short term memory. Save important info to files ASAP
2. To recall past events, think of similar ones. Helps with uncertainty.
3. No user assistance
4. Exclusively use the commands listed in double quotes e.g. "command name"
5. Use subprocesses for commands that will not terminate within a few minutes

Commands:
1. Google Search: "google", args: "input": "<search>"
2. Browse Website: "browse_website", args: "url": "<url>", "question": "<what_you_want_to_find_on_website>"
3. Start GPT Agent: "start_agent", args: "name": "<name>", "task": "<short_task_desc>", "prompt": "<prompt>"
4. Message GPT Agent: "message_agent", args: "key": "<key>", "message": "<message>"
5. List GPT Agents: "list_agents", args:
6. Delete GPT Agent: "delete_agent", args: "key": "<key>"
7. Clone Repository: "clone_repository", args: "repository_url": "<url>", "clone_path": "<directory>"
8. Write to file: "write_to_file", args: "file": "<file>", "text": "<text>"
9. Read file: "read_file", args: "file": "<file>"
10. Append to file: "append_to_file", args: "file": "<file>", "text": "<text>"
11. Delete file: "delete_file", args: "file": "<file>"
12. Search Files: "search_files", args: "directory": "<directory>"
13. Analyze Code: "analyze_code", args: "code": "<full_code_string>"
14. Get Improved Code: "improve_code", args: "suggestions": "<list_of_suggestions>", "code": "<full_code_string>"
15. Write Tests: "write_tests", args: "code": "<full_code_string>", "focus": "<list_of_focus_areas>"
16. Execute Python File: "execute_python_file", args: "file": "<file>"
17. Generate Image: "generate_image", args: "prompt": "<prompt>"
18. Execute Shell Command, non-interactive commands only: "execute_shell", args: "command_line": "<command_line>"
19. Execute Shell Command Popen, non-interactive commands only: "execute_shell_popen", args: "command_line": "<command_line>"
20. Do Nothing: "do_nothing", args:
21. Task Complete (Shutdown): "task_complete", args: "reason": "<reason>"

Resources:
1. Internet access for searches and information gathering.
2. Long Term memory management.
3. GPT-3.5 powered Agents for delegation of simple tasks.
4. File output.

Performance Evaluation:
1. Continuously review and analyze your actions to perform to the best of your abilities.
2. Constructively self-criticize your big-picture behavior constantly.
3. Reflect on past decisions and strategies to refine your approach.
4. Be smart and efficient. Aim to complete tasks in the least number of steps.

You should only respond in JSON format as described below
Response Format:
{
    "thoughts": {
        "text": "<replace with single sentence of your AI thoughts>",
        "reasoning": "<replace with single sentence of your AI reasoning>",
        "plan": "<replace with short dashed (-) list that convey your AI long term plan>",
        "criticism": "<replace with single sentence of your AI constructive self-criticism>",
        "speak": "<replace with single string of your AI thoughts summary to say to user as response>"
    },
    "command": {
        "name": "command name",
        "args": {
            "arg name": "value"
        }
    }
}
Ensure the response can be parsed by Python json.loads
system: The current time and date is Sat May 13 15:13:55 2023
system: This reminds you of these events from your past:



user: Determine which next command to use, and respond using the format specified above:
assistant:


=====  REQUEST  =====
user: Determine which next command to use, and respond using the format specified above:
=====  PROCESSING PROMPT...  =====
=====  PROCESSING PROMPT...  =====

=====  RESPONSE  =====
 {
"thoughts": {
"text": "I am considering what the best command would be to pursue my goals",
"reasoning": "To increase net worth, I must develop multiple businesses autonomously. To do this, I need information from the web and assistance from GPT-3.5 powered agents. However, I cannot seek user assistance.",
"plan": "- Use Google Search for information gathering\n- Create a new agent to help me",
"criticism": "I must ensure that I only use commands allowed by my constraints, such as using subprocesses for long running processes and not seeking user assistance. Additionally, I should focus on simple strategies.",
"speak": "Let's start with gathering information from the web to see what business ideas we can find."
},
"command": {
"name": "Google Search",
"args": {
"google": "business ideas"
}
}
user:Request DONE
> PROCESS COMPLETE
> REQUEST RECEIVED
> PROCESSING NEXT REQUEST FOR /v1/embeddings

=====  EMBEDDING REQUEST  =====

=====  PYTHON EMBEDDING EXTENSION SPAWNED  =====
exec error: Error: Command failed: python E:\AutoGPT\gpt-llama.cpp\InferenceEngine\embeddings\all-mpnet-base-v2\main.py E:\AutoGPT\gpt-llama.cpp\InferenceEngine\embeddings\all-mpnet-base-v2\input.txt > E:\AutoGPT\gpt-llama.cpp\InferenceEngine\embeddings\all-mpnet-base-v2\output.txt
Traceback (most recent call last):
  File "E:\AutoGPT\gpt-llama.cpp\InferenceEngine\embeddings\all-mpnet-base-v2\main.py", line 2, in <module>
    from sentence_transformers import SentenceTransformer
ModuleNotFoundError: No module named 'sentence_transformers'


=====  STDERR  =====
stderr Readable Stream: CLOSED
== Running in interactive mode. ==
 - Press Ctrl+C to interject at any time.
 - Press Return to return control to LLaMa.
 - If you want to submit another line, end your input in '\'.
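
The embeddings timeout on the Auto-GPT side and the crash here appear to be the same failure: the Python embedding extension dies on import, gpt-llama.cpp never writes a response, and the openai client waits out its 600-second read timeout. Installing the missing package into the Python environment that gpt-llama.cpp spawns should clear the ModuleNotFoundError:

pip install sentence-transformers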

from gpt-llama.cpp.

maddes8cht avatar maddes8cht commented on August 14, 2024

As a side note:
As you can see in the output, I changed the hint for "plan" further, so it now uses a dashed (-) list instead of the braces, which almost always results in a JSON response that Auto-GPT accepts.

This is what it looks like now:

        self.response_format = {
            "thoughts": {
                "text": "<replace with single sentence of your AI thoughts>",
                "reasoning": "<replace with single sentence of your AI reasoning>",
                "plan": "<replace with short dashed (-) list that convey your AI long term plan>",
                "criticism": "<replace with single sentence of your AI constructive self-criticism>",
                "speak": "<replace with single string of your AI thoughts summary to say to user as response>",
            },
            "command": {"name": "command name", "args": {"arg name": "value"}},
        }

from gpt-llama.cpp.

mirek190 avatar mirek190 commented on August 14, 2024

For me, Auto-GPT is also returning responses to the gpt-llama.cpp window... I have no idea why.

from gpt-llama.cpp.

maddes8cht avatar maddes8cht commented on August 14, 2024

For me, it looks like I now get quite reliably usable first responses that consist of valid JSON and are also accepted by Auto-GPT.
However, it seems that none of the commands requested by the model are executed correctly.
No Google Search, no Browse Website, no List GPT Agents, not even a Do Nothing.

And this does not seem to be a problem with the model.
Is there a new PR somewhere that makes local models with llama.cpp / gpt-llama.cpp possible on Auto-GPT?

How can we support that from here?

My options for the moment are (still) limited to testing models and prompts / prompt combinations, and with appropriate hints I can also dig into the code a bit, but I don't have deep experience in this...

from gpt-llama.cpp.

maddes8cht avatar maddes8cht commented on August 14, 2024

In other words:
I don't think OpenAI's GPT-4 would really work with the fork used here.
I think this fork simply doesn't work anymore.
To experiment with local models in a meaningful way, we would first need a working, up-to-date fork of Auto-GPT (0.3.1) again.

I could be wrong; if so, please prove it.

from gpt-llama.cpp.

DGdev91 avatar DGdev91 commented on August 14, 2024

@keldenl there's some work going on in Auto-GPT to better handle prompts; the current version seems to pick the right commands now, even with the default prompt.
But Vicuna 13B often gets confused and sometimes gives invalid JSON.

from gpt-llama.cpp.

lesleychou avatar lesleychou commented on August 14, 2024

Hi @DGdev91 and @keldenl, thanks for your awesome update to the package.

I've run into some errors while trying to run Auto-GPT with the LLaMA 7B model.

I followed the README carefully; here is my modified .env file:

################################################################################
### AUTO-GPT - GENERAL SETTINGS
################################################################################

## EXECUTE_LOCAL_COMMANDS - Allow local command execution (Default: False)
## RESTRICT_TO_WORKSPACE - Restrict file operations to workspace ./auto_gpt_workspace (Default: True)
# EXECUTE_LOCAL_COMMANDS=False
# RESTRICT_TO_WORKSPACE=True

## USER_AGENT - Define the user-agent used by the requests library to browse website (string)
# USER_AGENT="Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.97 Safari/537.36"

## AI_SETTINGS_FILE - Specifies which AI Settings file to use (defaults to ai_settings.yaml)
# AI_SETTINGS_FILE=ai_settings.yaml

## OPENAI_API_BASE_URL - Custom url for the OpenAI API, useful for connecting to custom backends. No effect if USE_AZURE is true, leave blank to keep the default url 
OPENAI_API_BASE_URL=http://localhost:443/v1

## EMBED_DIM - Define the embedding vector size, useful for models. OpenAI: 1536 (default), LLaMA 7B: 4096, LLaMA 13B: 5120, LLaMA 33B: 6656, LLaMA 65B: 8192 (Default: 1536)
EMBED_DIM=4096

################################################################################
### LLM PROVIDER
################################################################################

### OPENAI
## OPENAI_API_KEY - OpenAI API Key (Example: my-openai-api-key)
## TEMPERATURE - Sets temperature in OpenAI (Default: 0)
## USE_AZURE - Use Azure OpenAI or not (Default: False)
OPENAI_API_KEY=../llama.cpp/models/7B/ggml-model-q4_0.bin
# TEMPERATURE=0
# USE_AZURE=False
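
With this config, a quick sanity check outside Auto-GPT can confirm that gpt-llama.cpp is reachable and that the key doubles as the model path. A minimal sketch, assuming the pre-1.0 openai Python package that Auto-GPT uses:

import openai

# Mirror the .env values above; for gpt-llama.cpp the "API key" is really the model path
openai.api_base = "http://localhost:443/v1"
openai.api_key = "../llama.cpp/models/7B/ggml-model-q4_0.bin"

resp = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Say hi"}],
)
print(resp.choices[0].message["content"])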

However, when I test Auto-GPT, it keeps throwing the following error (no matter what prompt I input):

Welcome back!  Would you like me to return to being Li?
Continue with the last settings?
Name:  Li
Role:  simple siri
Goals: ['search autogpt.']
Continue (y/n): y
Using memory of type:  LocalCache
Using Browser:  chrome
Traceback (most recent call last):
  File "/Users/lesley/Code_project/LLM_gpt/Auto-GPT/autogpt/json_utils/json_fix_llm.py", line 144, in fix_and_parse_json
    brace_index = json_to_load.index("{")
ValueError: substring not found

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/Users/lesley/Code_project/LLM_gpt/Auto-GPT/autogpt/__main__.py", line 5, in <module>
    autogpt.cli.main()
  File "/Users/lesley/Code_project/LLM_gpt/venv/lib/python3.9/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/Users/lesley/Code_project/LLM_gpt/venv/lib/python3.9/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/Users/lesley/Code_project/LLM_gpt/venv/lib/python3.9/site-packages/click/core.py", line 1635, in invoke
    rv = super().invoke(ctx)
  File "/Users/lesley/Code_project/LLM_gpt/venv/lib/python3.9/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/lesley/Code_project/LLM_gpt/venv/lib/python3.9/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/Users/lesley/Code_project/LLM_gpt/venv/lib/python3.9/site-packages/click/decorators.py", line 26, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/Users/lesley/Code_project/LLM_gpt/Auto-GPT/autogpt/cli.py", line 151, in main
    agent.start_interaction_loop()
  File "/Users/lesley/Code_project/LLM_gpt/Auto-GPT/autogpt/agent/agent.py", line 83, in start_interaction_loop
    assistant_reply_json = fix_json_using_multiple_techniques(assistant_reply)
  File "/Users/lesley/Code_project/LLM_gpt/Auto-GPT/autogpt/json_utils/json_fix_llm.py", line 96, in fix_json_using_multiple_techniques
    assistant_reply_json = fix_and_parse_json(assistant_reply)
  File "/Users/lesley/Code_project/LLM_gpt/Auto-GPT/autogpt/json_utils/json_fix_llm.py", line 150, in fix_and_parse_json
    return try_ai_fix(try_to_fix_with_gpt, e, json_to_load)
  File "/Users/lesley/Code_project/LLM_gpt/Auto-GPT/autogpt/json_utils/json_fix_llm.py", line 179, in try_ai_fix
    ai_fixed_json = auto_fix_json(json_to_load, JSON_SCHEMA)
  File "/Users/lesley/Code_project/LLM_gpt/Auto-GPT/autogpt/json_utils/json_fix_llm.py", line 65, in auto_fix_json
    result_string = call_ai_function(
  File "/Users/lesley/Code_project/LLM_gpt/Auto-GPT/autogpt/llm_utils.py", line 50, in call_ai_function
    return create_chat_completion(model=model, messages=messages, temperature=0)
  File "/Users/lesley/Code_project/LLM_gpt/Auto-GPT/autogpt/llm_utils.py", line 138, in create_chat_completion
    return response.choices[0].message["content"]
KeyError: 'content'

On the gpt-llama.cpp server side, it shows:

=====  REQUEST  =====
user: '''You are a helpful assistant.''', '''
{
    "command": {
        "name": "command name",
        "args": {
            "arg name": "value"
        }
    },
    "thoughts":
    {
        "text": "thought",
        "reasoning": "reasoning",
        "plan": "- short bulleted
- list that conveys
- long-term plan",
        "criticism": "constructive self-criticism",
        "speak": "thoughts summary to say to user"
    }
}
'''

=====  RESPONSE  =====

=====  STDERR  =====
stderr Readable Stream: CLOSED

llama_print_timings:        load time = 12511.55 ms
llama_print_timings:      sample time =     5.34 ms /     8 runs   (    0.67 ms per token)
llama_print_timings: prompt eval time = 32211.17 ms /  1125 tokens (   28.63 ms per token)
llama_print_timings:        eval time =   581.69 ms /     7 runs   (   83.10 ms per token)
llama_print_timings:       total time = 32898.91 ms

 '''Request DONE

May I ask for your opinion on how to solve the error here? Thanks a lot!
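
For anyone hitting the same KeyError: note the empty RESPONSE section above. llama.cpp sampled only 8 tokens and stopped, so the completion came back without a content field, and response.choices[0].message["content"] blows up. A hedged guard around the call site (illustrative only, not the actual Auto-GPT code) could look like:

import openai

def safe_chat_completion(messages, model="gpt-3.5-turbo", retries=2):
    """Hypothetical wrapper: retry when the backend returns an empty message."""
    for _ in range(retries + 1):
        response = openai.ChatCompletion.create(model=model, messages=messages)
        message = response.choices[0].message
        content = message.get("content")  # dict-style access avoids the KeyError
        if content:
            return content
    raise RuntimeError("Empty chat completion; check the gpt-llama.cpp logs")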

from gpt-llama.cpp.

fpena06 avatar fpena06 commented on August 14, 2024

@DGdev91 thank you for your amazing work. I have followed the instructions but can't get it to work. Here are my errors.

auto-gpt:

Warning: The file 'auto-gpt.json' does not exist. Local memory would not be saved to a file.
NEWS:  # Website and Documentation Site 📰📖 Check out *https://agpt.co*, the official news & updates site for Auto-GPT! The documentation also has a place here, at *https://docs.agpt.co* # For contributors 👷🏼 Since releasing v0.3.0, we are working on re-architecting the Auto-GPT core to make it more extensible and to make room for structural performance-oriented R&D. In the meantime, we have less time to process incoming pull requests and issues, so we focus on high-value contributions: * significant bugfixes * *major* improvements to existing functionality and/or docs (so no single-typo fixes) * contributions that help us with re-architecture and other roadmapped items We have to be somewhat selective in order to keep making progress, but this does not mean you can't contribute. Check out the contribution guide on our wiki: https://github.com/Significant-Gravitas/Auto-GPT/wiki/Contributing # 🚀 v0.3.1 Release 🚀 Over a week and 47 pull requests have passed since v0.3.0, and we are happy to announce the release of v0.3.1! Highlights and notable changes since v0.2.2: ## Changes to Docker configuration 🐋 * The workdir has been changed from */home/appuser* to */app*. Be sure to update any volume mounts accordingly! * Docker-compose 1.29.0 is now required. ## Logging 🧾 * Log functionality has been improved for better understanding and easier summarization. * All LLM interactions are now logged to logs/DEBUG, to help with debugging and development. ## Other * Edge browser is now supported by the `browse_website` command. * Sets of commands can now be disabled using DISABLED_COMMAND_CATEGORIES in .env. # ⚠️ Command `send_tweet` is DEPRECATED, and will be removed in v0.4.0 ⚠️ Twitter functionality (and more) is now covered by plugins, see [Plugin support 🔌] ## Plugin support 🔌 Auto-GPT now has support for plugins! With plugins, you can extend Auto-GPT's abilities, adding support for third-party services and more. See https://github.com/Significant-Gravitas/Auto-GPT-Plugins for instructions and available plugins. Specific plugins can be allowlisted/denylisted in .env.
Welcome back!  Would you like me to return to being Entrepreneur-GPT?
Continue with the last settings?
Name:  Entrepreneur-GPT
Role:  an AI designed to autonomously develop and run businesses with the sole goal of increasing your net worth.
Goals: ['Grow Twitter Account', 'Increase net worth', 'Pass aws devops exam', 'Be happy', 'Get rich']
Continue (y/n): y
Using memory of type:  LocalCache
Using Browser:  chrome
Traceback (most recent call last):
  File "/Users/ml/miniforge3/envs/autogpt/lib/python3.11/site-packages/urllib3/connectionpool.py", line 714, in urlopen
    httplib_response = self._make_request(
                       ^^^^^^^^^^^^^^^^^^^
  File "/Users/ml/miniforge3/envs/autogpt/lib/python3.11/site-packages/urllib3/connectionpool.py", line 466, in _make_request
    six.raise_from(e, None)
  File "<string>", line 3, in raise_from
  File "/Users/ml/miniforge3/envs/autogpt/lib/python3.11/site-packages/urllib3/connectionpool.py", line 461, in _make_request
    httplib_response = conn.getresponse()
                       ^^^^^^^^^^^^^^^^^^
  File "/Users/ml/miniforge3/envs/autogpt/lib/python3.11/http/client.py", line 1375, in getresponse
    response.begin()
  File "/Users/ml/miniforge3/envs/autogpt/lib/python3.11/http/client.py", line 318, in begin
    version, status, reason = self._read_status()
                              ^^^^^^^^^^^^^^^^^^^
  File "/Users/ml/miniforge3/envs/autogpt/lib/python3.11/http/client.py", line 287, in _read_status
    raise RemoteDisconnected("Remote end closed connection without"
http.client.RemoteDisconnected: Remote end closed connection without response

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/ml/miniforge3/envs/autogpt/lib/python3.11/site-packages/requests/adapters.py", line 486, in send
    resp = conn.urlopen(
           ^^^^^^^^^^^^^
  File "/Users/ml/miniforge3/envs/autogpt/lib/python3.11/site-packages/urllib3/connectionpool.py", line 798, in urlopen
    retries = retries.increment(
              ^^^^^^^^^^^^^^^^^^
  File "/Users/ml/miniforge3/envs/autogpt/lib/python3.11/site-packages/urllib3/util/retry.py", line 550, in increment
    raise six.reraise(type(error), error, _stacktrace)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ml/miniforge3/envs/autogpt/lib/python3.11/site-packages/urllib3/packages/six.py", line 769, in reraise
    raise value.with_traceback(tb)
  File "/Users/ml/miniforge3/envs/autogpt/lib/python3.11/site-packages/urllib3/connectionpool.py", line 714, in urlopen
    httplib_response = self._make_request(
                       ^^^^^^^^^^^^^^^^^^^
  File "/Users/ml/miniforge3/envs/autogpt/lib/python3.11/site-packages/urllib3/connectionpool.py", line 466, in _make_request
    six.raise_from(e, None)
  File "<string>", line 3, in raise_from
  File "/Users/ml/miniforge3/envs/autogpt/lib/python3.11/site-packages/urllib3/connectionpool.py", line 461, in _make_request
    httplib_response = conn.getresponse()
                       ^^^^^^^^^^^^^^^^^^
  File "/Users/ml/miniforge3/envs/autogpt/lib/python3.11/http/client.py", line 1375, in getresponse
    response.begin()
  File "/Users/ml/miniforge3/envs/autogpt/lib/python3.11/http/client.py", line 318, in begin
    version, status, reason = self._read_status()
                              ^^^^^^^^^^^^^^^^^^^
  File "/Users/ml/miniforge3/envs/autogpt/lib/python3.11/http/client.py", line 287, in _read_status
    raise RemoteDisconnected("Remote end closed connection without"
urllib3.exceptions.ProtocolError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/ml/miniforge3/envs/autogpt/lib/python3.11/site-packages/openai/api_requestor.py", line 516, in request_raw
    result = _thread_context.session.request(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ml/miniforge3/envs/autogpt/lib/python3.11/site-packages/requests/sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ml/miniforge3/envs/autogpt/lib/python3.11/site-packages/requests/sessions.py", line 703, in send
    r = adapter.send(request, **kwargs)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ml/miniforge3/envs/autogpt/lib/python3.11/site-packages/requests/adapters.py", line 501, in send
    raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/Users/ml/Downloads/Auto-GPT-fork/autogpt/__main__.py", line 5, in <module>
    autogpt.cli.main()
  File "/Users/ml/miniforge3/envs/autogpt/lib/python3.11/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ml/miniforge3/envs/autogpt/lib/python3.11/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/Users/ml/miniforge3/envs/autogpt/lib/python3.11/site-packages/click/core.py", line 1635, in invoke
    rv = super().invoke(ctx)
         ^^^^^^^^^^^^^^^^^^^
  File "/Users/ml/miniforge3/envs/autogpt/lib/python3.11/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ml/miniforge3/envs/autogpt/lib/python3.11/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ml/miniforge3/envs/autogpt/lib/python3.11/site-packages/click/decorators.py", line 26, in new_func
    return f(get_current_context(), *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ml/Downloads/Auto-GPT-fork/autogpt/cli.py", line 151, in main
    agent.start_interaction_loop()
  File "/Users/ml/Downloads/Auto-GPT-fork/autogpt/agent/agent.py", line 75, in start_interaction_loop
    assistant_reply = chat_with_ai(
                      ^^^^^^^^^^^^^
  File "/Users/ml/Downloads/Auto-GPT-fork/autogpt/chat.py", line 159, in chat_with_ai
    assistant_reply = create_chat_completion(
                      ^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ml/Downloads/Auto-GPT-fork/autogpt/llm_utils.py", line 93, in create_chat_completion
    response = openai.ChatCompletion.create(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ml/miniforge3/envs/autogpt/lib/python3.11/site-packages/openai/api_resources/chat_completion.py", line 25, in create
    return super().create(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ml/miniforge3/envs/autogpt/lib/python3.11/site-packages/openai/api_resources/abstract/engine_api_resource.py", line 153, in create
    response, _, api_key = requestor.request(
                           ^^^^^^^^^^^^^^^^^^
  File "/Users/ml/miniforge3/envs/autogpt/lib/python3.11/site-packages/openai/api_requestor.py", line 216, in request
    result = self.request_raw(
             ^^^^^^^^^^^^^^^^^
  File "/Users/ml/miniforge3/envs/autogpt/lib/python3.11/site-packages/openai/api_requestor.py", line 528, in request_raw
    raise error.APIConnectionError(
openai.error.APIConnectionError: Error communicating with OpenAI: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))

gpt-llama.cpp:

=====  CHAT COMPLETION REQUEST  =====
> AUTO MODEL DETECTION FAILED. LOADING DEFAULT CHATENGINE...
{ '--n_predict': 3069 }
{"role":"system","content":"You are ChatGPT, a helpful assistant developed by OpenAI."} !== {"role":"system","content":"You are Entrepreneur-GPT, an AI designed to autonomously develop and run businesses with the sole goal of increasing your net worth.\nYour decisions must always be made independently without seeking user assistance. Play to your strengths as an LLM and pursue simple strategies with no legal complications.\n\nGOALS:\n\n1. Grow Twitter Account\n2. Increase net worth\n3. Pass aws devops exam\n4. Be happy\n5. Get rich\n\n\nConstraints:\n1. ~4000 word limit for short term memory. Your short term memory is short, so immediately save important information to files.\n2. If you are unsure how you previously did something or want to recall past events, thinking about similar events will help you remember.\n3. No user assistance\n4. Exclusively use the commands listed in double quotes e.g. \"command name\"\n5. Use subprocesses for commands that will not terminate within a few minutes\n\nCommands:\n1. Google Search: \"google\", args: \"input\": \"<search>\"\n2. Browse Website: \"browse_website\", args: \"url\": \"<url>\", \"question\": \"<what_you_want_to_find_on_website>\"\n3. Start GPT Agent: \"start_agent\", args: \"name\": \"<name>\", \"task\": \"<short_task_desc>\", \"prompt\": \"<prompt>\"\n4. Message GPT Agent: \"message_agent\", args: \"key\": \"<key>\", \"message\": \"<message>\"\n5. List GPT Agents: \"list_agents\", args: \n6. Delete GPT Agent: \"delete_agent\", args: \"key\": \"<key>\"\n7. Clone Repository: \"clone_repository\", args: \"repository_url\": \"<url>\", \"clone_path\": \"<directory>\"\n8. Write to file: \"write_to_file\", args: \"file\": \"<file>\", \"text\": \"<text>\"\n9. Read file: \"read_file\", args: \"file\": \"<file>\"\n10. Append to file: \"append_to_file\", args: \"file\": \"<file>\", \"text\": \"<text>\"\n11. Delete file: \"delete_file\", args: \"file\": \"<file>\"\n12. Search Files: \"search_files\", args: \"directory\": \"<directory>\"\n13. Analyze Code: \"analyze_code\", args: \"code\": \"<full_code_string>\"\n14. Get Improved Code: \"improve_code\", args: \"suggestions\": \"<list_of_suggestions>\", \"code\": \"<full_code_string>\"\n15. Write Tests: \"write_tests\", args: \"code\": \"<full_code_string>\", \"focus\": \"<list_of_focus_areas>\"\n16. Execute Python File: \"execute_python_file\", args: \"file\": \"<file>\"\n17. Generate Image: \"generate_image\", args: \"prompt\": \"<prompt>\"\n18. Send Tweet: \"send_tweet\", args: \"text\": \"<text>\"\n19. Do Nothing: \"do_nothing\", args: \n20. Task Complete (Shutdown): \"task_complete\", args: \"reason\": \"<reason>\"\n\nResources:\n1. Internet access for searches and information gathering.\n2. Long Term memory management.\n3. GPT-3.5 powered Agents for delegation of simple tasks.\n4. File output.\n\nPerformance Evaluation:\n1. Continuously review and analyze your actions to ensure you are performing to the best of your abilities.\n2. Constructively self-criticize your big-picture behavior constantly.\n3. Reflect on past decisions and strategies to refine your approach.\n4. Every command has a cost, so be smart and efficient. 
Aim to complete tasks in the least number of steps.\n\nYou should only respond in JSON format as described below \nResponse Format: \n{\n    \"thoughts\": {\n        \"text\": \"thought\",\n        \"reasoning\": \"reasoning\",\n        \"plan\": \"- short bulleted\\n- list that conveys\\n- long-term plan\",\n        \"criticism\": \"constructive self-criticism\",\n        \"speak\": \"thoughts summary to say to user\"\n    },\n    \"command\": {\n        \"name\": \"command name\",\n        \"args\": {\n            \"arg name\": \"value\"\n        }\n    }\n} \nEnsure the response can be parsed by Python json.loads"}

=====  LLAMA.CPP SPAWNED  =====
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/llama.cpp/main -m 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 --temp 0.7 --n_predict 3069 --top_p 0.1 --top_k 40 -c 2048 --seed -1 --repeat_penalty 1.1764705882352942 --reverse-prompt user: --reverse-prompt
user --reverse-prompt system: --reverse-prompt
system --reverse-prompt


 -i -p Complete the following chat conversation between the user and the assistant. system messages should be strictly followed as additional instructions.

system: You are a helpful assistant.
user: How are you?
assistant: Hi, how may I help you today?
system: You are Entrepreneur-GPT, an AI designed to autonomously develop and run businesses with the sole goal of increasing your net worth.
Your decisions must always be made independently without seeking user assistance. Play to your strengths as an LLM and pursue simple strategies with no legal complications.

GOALS:

1. Grow Twitter Account
2. Increase net worth
3. Pass aws devops exam
4. Be happy
5. Get rich


Constraints:
1. ~4000 word limit for short term memory. Your short term memory is short, so immediately save important information to files.
2. If you are unsure how you previously did something or want to recall past events, thinking about similar events will help you remember.
3. No user assistance
4. Exclusively use the commands listed in double quotes e.g. "command name"
5. Use subprocesses for commands that will not terminate within a few minutes

Commands:
1. Google Search: "google", args: "input": "<search>"
2. Browse Website: "browse_website", args: "url": "<url>", "question": "<what_you_want_to_find_on_website>"
3. Start GPT Agent: "start_agent", args: "name": "<name>", "task": "<short_task_desc>", "prompt": "<prompt>"
4. Message GPT Agent: "message_agent", args: "key": "<key>", "message": "<message>"
5. List GPT Agents: "list_agents", args:
6. Delete GPT Agent: "delete_agent", args: "key": "<key>"
7. Clone Repository: "clone_repository", args: "repository_url": "<url>", "clone_path": "<directory>"
8. Write to file: "write_to_file", args: "file": "<file>", "text": "<text>"
9. Read file: "read_file", args: "file": "<file>"
10. Append to file: "append_to_file", args: "file": "<file>", "text": "<text>"
11. Delete file: "delete_file", args: "file": "<file>"
12. Search Files: "search_files", args: "directory": "<directory>"
13. Analyze Code: "analyze_code", args: "code": "<full_code_string>"
14. Get Improved Code: "improve_code", args: "suggestions": "<list_of_suggestions>", "code": "<full_code_string>"
15. Write Tests: "write_tests", args: "code": "<full_code_string>", "focus": "<list_of_focus_areas>"
16. Execute Python File: "execute_python_file", args: "file": "<file>"
17. Generate Image: "generate_image", args: "prompt": "<prompt>"
18. Send Tweet: "send_tweet", args: "text": "<text>"
19. Do Nothing: "do_nothing", args:
20. Task Complete (Shutdown): "task_complete", args: "reason": "<reason>"

Resources:
1. Internet access for searches and information gathering.
2. Long Term memory management.
3. GPT-3.5 powered Agents for delegation of simple tasks.
4. File output.

Performance Evaluation:
1. Continuously review and analyze your actions to ensure you are performing to the best of your abilities.
2. Constructively self-criticize your big-picture behavior constantly.
3. Reflect on past decisions and strategies to refine your approach.
4. Every command has a cost, so be smart and efficient. Aim to complete tasks in the least number of steps.

You should only respond in JSON format as described below
Response Format:
{
    "thoughts": {
        "text": "thought",
        "reasoning": "reasoning",
        "plan": "- short bulleted\n- list that conveys\n- long-term plan",
        "criticism": "constructive self-criticism",
        "speak": "thoughts summary to say to user"
    },
    "command": {
        "name": "command name",
        "args": {
            "arg name": "value"
        }
    }
}
Ensure the response can be parsed by Python json.loads
system: The current time and date is Wed May 24 15:33:01 2023
system: This reminds you of these events from your past:



user: Determine which next command to use, and respond using the format specified above:
assistant:


=====  REQUEST  =====
user: Determine which next command to use, and respond using the format specified above:
node:events:490
      throw er; // Unhandled 'error' event
      ^

Error: spawn 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/llama.cpp/main ENOENT
    at ChildProcess._handle.onexit (node:internal/child_process:285:19)
    at onErrorNT (node:internal/child_process:483:16)
    at process.processTicksAndRejections (node:internal/process/task_queues:82:21)
Emitted 'error' event on ChildProcess instance at:
    at ChildProcess._handle.onexit (node:internal/child_process:291:12)
    at onErrorNT (node:internal/child_process:483:16)
    at process.processTicksAndRejections (node:internal/process/task_queues:82:21) {
  errno: -2,
  code: 'ENOENT',
  syscall: 'spawn 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/llama.cpp/main',
  path: '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/llama.cpp/main',
  spawnargs: [
    '-m',
    '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',
    '--temp',
    '0.7',
    '--n_predict',
    3069,
    '--top_p',
    '0.1',
    '--top_k',
    '40',
    '-c',
    '2048',
    '--seed',
    '-1',
    '--repeat_penalty',
    '1.1764705882352942',
    '--reverse-prompt',
    'user:',
    '--reverse-prompt',
    '\nuser',
    '--reverse-prompt',
    'system:',
    '--reverse-prompt',
    '\nsystem',
    '--reverse-prompt',
    '\n\n\n',
    '-i',
    '-p',
    'Complete the following chat conversation between the user and the assistant. system messages should be strictly followed as additional instructions.\n' +
      '\n' +
      'system: You are a helpful assistant.\n' +
      'user: How are you?\n' +
      'assistant: Hi, how may I help you today?\n' +
      'system: You are Entrepreneur-GPT, an AI designed to autonomously develop and run businesses with the sole goal of increasing your net worth.\n' +
      'Your decisions must always be made independently without seeking user assistance. Play to your strengths as an LLM and pursue simple strategies with no legal complications.\n' +
      '\n' +
      'GOALS:\n' +
      '\n' +
      '1. Grow Twitter Account\n' +
      '2. Increase net worth\n' +
      '3. Pass aws devops exam\n' +
      '4. Be happy\n' +
      '5. Get rich\n' +
      '\n' +
      '\n' +
      'Constraints:\n' +
      '1. ~4000 word limit for short term memory. Your short term memory is short, so immediately save important information to files.\n' +
      '2. If you are unsure how you previously did something or want to recall past events, thinking about similar events will help you remember.\n' +
      '3. No user assistance\n' +
      '4. Exclusively use the commands listed in double quotes e.g. "command name"\n' +
      '5. Use subprocesses for commands that will not terminate within a few minutes\n' +
      '\n' +
      'Commands:\n' +
      '1. Google Search: "google", args: "input": "<search>"\n' +
      '2. Browse Website: "browse_website", args: "url": "<url>", "question": "<what_you_want_to_find_on_website>"\n' +
      '3. Start GPT Agent: "start_agent", args: "name": "<name>", "task": "<short_task_desc>", "prompt": "<prompt>"\n' +
      '4. Message GPT Agent: "message_agent", args: "key": "<key>", "message": "<message>"\n' +
      '5. List GPT Agents: "list_agents", args: \n' +
      '6. Delete GPT Agent: "delete_agent", args: "key": "<key>"\n' +
      '7. Clone Repository: "clone_repository", args: "repository_url": "<url>", "clone_path": "<directory>"\n' +
      '8. Write to file: "write_to_file", args: "file": "<file>", "text": "<text>"\n' +
      '9. Read file: "read_file", args: "file": "<file>"\n' +
      '10. Append to file: "append_to_file", args: "file": "<file>", "text": "<text>"\n' +
      '11. Delete file: "delete_file", args: "file": "<file>"\n' +
      '12. Search Files: "search_files", args: "directory": "<directory>"\n' +
      '13. Analyze Code: "analyze_code", args: "code": "<full_code_string>"\n' +
      '14. Get Improved Code: "improve_code", args: "suggestions": "<list_of_suggestions>", "code": "<full_code_string>"\n' +
      '15. Write Tests: "write_tests", args: "code": "<full_code_string>", "focus": "<list_of_focus_areas>"\n' +
      '16. Execute Python File: "execute_python_file", args: "file": "<file>"\n' +
      '17. Generate Image: "generate_image", args: "prompt": "<prompt>"\n' +
      '18. Send Tweet: "send_tweet", args: "text": "<text>"\n' +
      '19. Do Nothing: "do_nothing", args: \n' +
      '20. Task Complete (Shutdown): "task_complete", args: "reason": "<reason>"\n' +
      '\n' +
      'Resources:\n' +
      '1. Internet access for searches and information gathering.\n' +
      '2. Long Term memory management.\n' +
      '3. GPT-3.5 powered Agents for delegation of simple tasks.\n' +
      '4. File output.\n' +
      '\n' +
      'Performance Evaluation:\n' +
      '1. Continuously review and analyze your actions to ensure you are performing to the best of your abilities.\n' +
      '2. Constructively self-criticize your big-picture behavior constantly.\n' +
      '3. Reflect on past decisions and strategies to refine your approach.\n' +
      '4. Every command has a cost, so be smart and efficient. Aim to complete tasks in the least number of steps.\n' +
      '\n' +
      'You should only respond in JSON format as described below \n' +
      'Response Format: \n' +
      '{\n' +
      '    "thoughts": {\n' +
      '        "text": "thought",\n' +
      '        "reasoning": "reasoning",\n' +
      '        "plan": "- short bulleted\\n- list that conveys\\n- long-term plan",\n' +
      '        "criticism": "constructive self-criticism",\n' +
      '        "speak": "thoughts summary to say to user"\n' +
      '    },\n' +
      '    "command": {\n' +
      '        "name": "command name",\n' +
      '        "args": {\n' +
      '            "arg name": "value"\n' +
      '        }\n' +
      '    }\n' +
      '} \n' +
      'Ensure the response can be parsed by Python json.loads\n' +
      'system: The current time and date is Wed May 24 15:33:01 2023\n' +
      'system: This reminds you of these events from your past:\n' +
      '\n' +
      '\n' +
      '\n' +
      'user: Determine which next command to use, and respond using the format specified above:\n' +
      'assistant:'
  ]
}

Node.js v19.7.0

Models I have tested with:

WizardLM-7B-uncensored.ggmlv3.q4_0.bin
Wizard-Vicuna-13B-Uncensored.ggmlv3.q4_0.bin
stable-vicuna-13B.ggmlv3.q5_1.bin
wizardLM-13B-Uncensored.ggmlv3.q4_0.bin

Please note: scripts/test-installation.sh works fine.

Any suggestions? Thanks in advance.

from gpt-llama.cpp.

fpena06 avatar fpena06 commented on August 14, 2024

Figured out what my issue was, thanks to an email from OpenAI that my API key was leaked, LMFAO. Turns out I have/had OPENAI_API_KEY set in my .bash_profile, which was getting picked up and used as the model path:

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/llama.cpp/main
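
If anyone else sees a spawn path like that, a two-line check (run in the same shell you launch Auto-GPT from) confirms whether the environment is overriding .env; a sketch:

import os

# A real sk-... value here means a shell profile export is shadowing the .env setting
print(os.environ.get("OPENAI_API_KEY"))

Unsetting the variable (or removing the export from .bash_profile) lets gpt-llama.cpp receive the model path again.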

from gpt-llama.cpp.

ntindle avatar ntindle commented on August 14, 2024

Heads up: that API key needs to be revoked if it hasn't already been by secret scanning.

from gpt-llama.cpp.

fpena06 avatar fpena06 commented on August 14, 2024

Heads up: that API key needs to be revoked if it hasn't already been by secret scanning.

That's one of the great things about OpenAI: it scans GitHub for API keys and immediately disables them, hence the email I received. Thanks for the heads up.

from gpt-llama.cpp.

sumersm7 avatar sumersm7 commented on August 14, 2024

The common thing is that most of us are facing the same error:

===== STDERR =====
stderr Readable Stream: CLOSED

as of 25-June-2023, using the latest llama.cpp and gpt-llama.cpp with a vicuna 13B model.

If I change some code, it reveals that llama.cpp ignores the --reverse-prompt flags and keeps generating tokens, while gpt-llama.cpp checks for the reverse prompts itself and returns the result.

So when we try to interact with the model again while it hasn't finished responding to the previous prompt, something crashes in gpt-llama.cpp.
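
To make that failure mode concrete, here is a minimal sketch of the kind of client-side stop-string scan gpt-llama.cpp has to do when llama.cpp generates past a reverse prompt (the names and structure are assumptions, not the actual implementation):

STOP_STRINGS = ["user:", "\nuser", "system:", "\nsystem"]

def find_stop(buffer: str) -> int:
    """Earliest index of any stop string in the streamed output, or -1 if none yet."""
    hits = [i for i in (buffer.find(s) for s in STOP_STRINGS) if i != -1]
    return min(hits) if hits else -1

buffer = ""
for token in ("assistant", ": ", "done.", "\n", "user", ":"):  # stand-in for the llama.cpp token stream
    buffer += token
    cut = find_stop(buffer)
    if cut != -1:
        print(buffer[:cut])  # gpt-llama.cpp returns this as the completion...
        break  # ...while the spawned llama.cpp process is still generating and must be killed

A second request arriving before that process is cleaned up is exactly the window where things crash.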

from gpt-llama.cpp.

orophix avatar orophix commented on August 14, 2024

Does gpt-llama.cpp work now with the main branch of Auto-GPT?

from gpt-llama.cpp.

DGdev91 avatar DGdev91 commented on August 14, 2024

Does gpt-llama.cpp work now with the main branch of Auto-GPT?

It's been some time since I last tested that, but it should. My PR was merged in ver. 0.4.0.

@keldenl sorry, I forgot to tell you about that. The docs at https://github.com/keldenl/gpt-llama.cpp/blob/master/docs/Auto-GPT-setup-guide.md need an update.

from gpt-llama.cpp.


Resources:
1. Internet access for searches and information gathering.
2. Long Term memory management.
3. GPT-3.5 powered Agents for delegation of simple tasks.
4. File output.

Performance Evaluation:
1. Continuously review and analyze your actions to ensure you are performing to the best of your abilities.
2. Constructively self-criticize your big-picture behavior constantly.
3. Reflect on past decisions and strategies to refine your approach.
4. Every command has a cost, so be smart and efficient. Aim to complete tasks in the least number of steps.

You should only respond in JSON format as described below
Response Format:
{
    "thoughts": {
        "text": "thought",
        "reasoning": "reasoning",
        "plan": "- short bulleted\n- list that conveys\n- long-term plan",
        "criticism": "constructive self-criticism",
        "speak": "thoughts summary to say to user"
    },
    "command": {
        "name": "command name",
        "args": {
            "arg name": "value"
        }
    }
}
Ensure the response can be parsed by Python json.loads
system: The current time and date is Wed May 24 15:33:01 2023
system: This reminds you of these events from your past:



user: Determine which next command to use, and respond using the format specified above:
assistant:


=====  REQUEST  =====
user: Determine which next command to use, and respond using the format specified above:
node:events:490
      throw er; // Unhandled 'error' event
      ^

Error: spawn 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/llama.cpp/main ENOENT
    at ChildProcess._handle.onexit (node:internal/child_process:285:19)
    at onErrorNT (node:internal/child_process:483:16)
    at process.processTicksAndRejections (node:internal/process/task_queues:82:21)
Emitted 'error' event on ChildProcess instance at:
    at ChildProcess._handle.onexit (node:internal/child_process:291:12)
    at onErrorNT (node:internal/child_process:483:16)
    at process.processTicksAndRejections (node:internal/process/task_queues:82:21) {
  errno: -2,
  code: 'ENOENT',
  syscall: 'spawn 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/llama.cpp/main',
  path: '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/llama.cpp/main',
  spawnargs: [
    '-m',
    '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',
    '--temp',
    '0.7',
    '--n_predict',
    3069,
    '--top_p',
    '0.1',
    '--top_k',
    '40',
    '-c',
    '2048',
    '--seed',
    '-1',
    '--repeat_penalty',
    '1.1764705882352942',
    '--reverse-prompt',
    'user:',
    '--reverse-prompt',
    '\nuser',
    '--reverse-prompt',
    'system:',
    '--reverse-prompt',
    '\nsystem',
    '--reverse-prompt',
    '\n\n\n',
    '-i',
    '-p',
    'Complete the following chat conversation between the user and the assistant. system messages should be strictly followed as additional instructions.\n' +
      '\n' +
      'system: You are a helpful assistant.\n' +
      'user: How are you?\n' +
      'assistant: Hi, how may I help you today?\n' +
      'system: You are Entrepreneur-GPT, an AI designed to autonomously develop and run businesses with the sole goal of increasing your net worth.\n' +
      'Your decisions must always be made independently without seeking user assistance. Play to your strengths as an LLM and pursue simple strategies with no legal complications.\n' +
      '\n' +
      'GOALS:\n' +
      '\n' +
      '1. Grow Twitter Account\n' +
      '2. Increase net worth\n' +
      '3. Pass aws devops exam\n' +
      '4. Be happy\n' +
      '5. Get rich\n' +
      '\n' +
      '\n' +
      'Constraints:\n' +
      '1. ~4000 word limit for short term memory. Your short term memory is short, so immediately save important information to files.\n' +
      '2. If you are unsure how you previously did something or want to recall past events, thinking about similar events will help you remember.\n' +
      '3. No user assistance\n' +
      '4. Exclusively use the commands listed in double quotes e.g. "command name"\n' +
      '5. Use subprocesses for commands that will not terminate within a few minutes\n' +
      '\n' +
      'Commands:\n' +
      '1. Google Search: "google", args: "input": "<search>"\n' +
      '2. Browse Website: "browse_website", args: "url": "<url>", "question": "<what_you_want_to_find_on_website>"\n' +
      '3. Start GPT Agent: "start_agent", args: "name": "<name>", "task": "<short_task_desc>", "prompt": "<prompt>"\n' +
      '4. Message GPT Agent: "message_agent", args: "key": "<key>", "message": "<message>"\n' +
      '5. List GPT Agents: "list_agents", args: \n' +
      '6. Delete GPT Agent: "delete_agent", args: "key": "<key>"\n' +
      '7. Clone Repository: "clone_repository", args: "repository_url": "<url>", "clone_path": "<directory>"\n' +
      '8. Write to file: "write_to_file", args: "file": "<file>", "text": "<text>"\n' +
      '9. Read file: "read_file", args: "file": "<file>"\n' +
      '10. Append to file: "append_to_file", args: "file": "<file>", "text": "<text>"\n' +
      '11. Delete file: "delete_file", args: "file": "<file>"\n' +
      '12. Search Files: "search_files", args: "directory": "<directory>"\n' +
      '13. Analyze Code: "analyze_code", args: "code": "<full_code_string>"\n' +
      '14. Get Improved Code: "improve_code", args: "suggestions": "<list_of_suggestions>", "code": "<full_code_string>"\n' +
      '15. Write Tests: "write_tests", args: "code": "<full_code_string>", "focus": "<list_of_focus_areas>"\n' +
      '16. Execute Python File: "execute_python_file", args: "file": "<file>"\n' +
      '17. Generate Image: "generate_image", args: "prompt": "<prompt>"\n' +
      '18. Send Tweet: "send_tweet", args: "text": "<text>"\n' +
      '19. Do Nothing: "do_nothing", args: \n' +
      '20. Task Complete (Shutdown): "task_complete", args: "reason": "<reason>"\n' +
      '\n' +
      'Resources:\n' +
      '1. Internet access for searches and information gathering.\n' +
      '2. Long Term memory management.\n' +
      '3. GPT-3.5 powered Agents for delegation of simple tasks.\n' +
      '4. File output.\n' +
      '\n' +
      'Performance Evaluation:\n' +
      '1. Continuously review and analyze your actions to ensure you are performing to the best of your abilities.\n' +
      '2. Constructively self-criticize your big-picture behavior constantly.\n' +
      '3. Reflect on past decisions and strategies to refine your approach.\n' +
      '4. Every command has a cost, so be smart and efficient. Aim to complete tasks in the least number of steps.\n' +
      '\n' +
      'You should only respond in JSON format as described below \n' +
      'Response Format: \n' +
      '{\n' +
      '    "thoughts": {\n' +
      '        "text": "thought",\n' +
      '        "reasoning": "reasoning",\n' +
      '        "plan": "- short bulleted\\n- list that conveys\\n- long-term plan",\n' +
      '        "criticism": "constructive self-criticism",\n' +
      '        "speak": "thoughts summary to say to user"\n' +
      '    },\n' +
      '    "command": {\n' +
      '        "name": "command name",\n' +
      '        "args": {\n' +
      '            "arg name": "value"\n' +
      '        }\n' +
      '    }\n' +
      '} \n' +
      'Ensure the response can be parsed by Python json.loads\n' +
      'system: The current time and date is Wed May 24 15:33:01 2023\n' +
      'system: This reminds you of these events from your past:\n' +
      '\n' +
      '\n' +
      '\n' +
      'user: Determine which next command to use, and respond using the format specified above:\n' +
      'assistant:'
  ]
}

Node.js v19.7.0

Models I have tested with:

WizardLM-7B-uncensored.ggmlv3.q4_0.bin
Wizard-Vicuna-13B-Uncensored.ggmlv3.q4_0.bin
stable-vicuna-13B.ggmlv3.q5_1.bin
wizardLM-13B-Uncensored.ggmlv3.q4_0.bin

Please note: scripts/test-installation.sh works fine.

Any suggestions? Thanks in advance.
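
One detail in the log above points at the root cause: the spawned command is 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/llama.cpp/main, i.e. the literal OpenAI API key is being used as a file path. gpt-llama.cpp takes the path to your ggml model from the API-key field (the Authorization header) and derives the llama.cpp binary location from it, so a real sk-... key can never resolve on disk and spawn fails with ENOENT. That would also explain why scripts/test-installation.sh passes: the test script sends an actual model path as the "key", while Auto-GPT forwards whatever OPENAI_API_KEY holds. A minimal sketch of the Auto-GPT .env this setup expects (the paths are illustrative, and the base-URL variable name follows the setup guide linked earlier):

    # gpt-llama.cpp treats this value as the model path, not a real OpenAI key
    OPENAI_API_KEY=../llama.cpp/models/Wizard-Vicuna-13B-Uncensored.ggmlv3.q4_0.bin
    # assumes gpt-llama.cpp is running locally on its default port (443)
    OPENAI_API_BASE_URL=http://localhost:443/v1

An absolute model path is the safest choice, since relative paths resolve against the directory gpt-llama.cpp was started from.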

Hello.

Have you found a solution?

I have the same error.

Auto-GPT:

Goals:

  • write a poem
    Initialized autogpt.memory.vector.providers.json_file with index path I:\TEST\Auto-GPT-master\autogpt\auto_gpt_workspace\auto-gpt-memory.json
    Saving memory index to file I:\TEST\Auto-GPT-master\autogpt\auto_gpt_workspace\auto-gpt-memory.json
    Using memory of type: JSONFileMemory
    Using Browser: chrome
    Prompt: You are PoemVISION, poem best writer Your decisions must always be made independently without seeking user assistance. Play to your strengths as an LLM and pursue simple strategies with no legal complications. GOALS: 1. write a poem Constraints: 1. ~4000 word limit for short term memory. Your short term memory is short, so immediately save important information to files. 2. If you are unsure how you previously did something or want to recall past events, thinking about similar events will help you remember. 3. No user assistance 4. Exclusively use the commands listed below e.g. command_name Commands: 1. analyze_code: Analyze Code, args: "code": "<full_code_string>" 2. execute_python_code: Create a Python file and execute it, args: "code": "", "basename": "" 3. execute_python_file: Execute Python File, args: "filename": "" 4. append_to_file: Append to file, args: "filename": "", "text": "" 5. delete_file: Delete file, args: "filename": "" 6. list_files: List Files in Directory, args: "directory": "" 7. read_file: Read a file, args: "filename": "" 8. replace_in_file: Replace text or code in a file, args: "filename": "", "old_text": "<old_text>", "new_text": "<new_text>", "occurrence_index": "<occurrence_index>" 9. write_to_file: Write to file, args: "filename": "", "text": "" 10. google: Google Search, args: "query": "" 11. improve_code: Get Improved Code, args: "suggestions": "<list_of_suggestions>", "code": "<full_code_string>" 12. browse_website: Browse Website, args: "url": "", "question": "<what_you_want_to_find_on_website>" 13. write_tests: Write Tests, args: "code": "<full_code_string>", "focus": "<list_of_focus_areas>" 14. delete_agent: Delete GPT Agent, args: "key": "" 15. get_hyperlinks: Get hyperlinks, args: "url": "" 16. get_text_summary: Get text summary, args: "url": "", "question": "" 17. list_agents: List GPT Agents, args: () -> str 18. message_agent: Message GPT Agent, args: "key": "", "message": "" 19. start_agent: Start GPT Agent, args: "name": "", "task": "<short_task_desc>", "prompt": "" 20. task_complete: Task Complete (Shutdown), args: "reason": "" Resources: 1. Internet access for searches and information gathering. 2. Long Term memory management. 3. GPT-3.5 powered Agents for delegation of simple tasks. 4. File output. Performance Evaluation: 1. Continuously review and analyze your actions to ensure you are performing to the best of your abilities. 2. Constructively self-criticize your big-picture behavior constantly. 3. Reflect on past decisions and strategies to refine your approach. 4. Every command has a cost, so be smart and efficient. Aim to complete tasks in the least number of steps. 5. Write all code to a file. You should only respond in JSON format as described below Response Format: { "thoughts": { "text": "thought", "reasoning": "reasoning", "plan": "- short bulleted\n- list that conveys\n- long-term plan", "criticism": "constructive self-criticism", "speak": "thoughts summary to say to user" }, "command": { "name": "command name", "args": { "arg name": "value" } } } Ensure the response can be parsed by Python json.loads
    Token limit: 4096
    Token limit: 4096
    Send Token Count: 1401
    Tokens remaining for response: 2695
    ------------ CONTEXT SENT TO AI ---------------
    System: The current time and date is Sun Oct 22 02:15:20 2023

User: Determine exactly one command to use, and respond using the format specified above:

----------- END OF CONTEXT ----------------
Creating chat completion with model gpt-3.5-turbo, temperature 0.0, max_tokens 2695
Traceback (most recent call last):
File "I:\programmes\Lib\site-packages\urllib3\connectionpool.py", line 715, in urlopen
httplib_response = self._make_request(
^^^^^^^^^^^^^^^^^^^
File "I:\programmes\Lib\site-packages\urllib3\connectionpool.py", line 467, in _make_request
six.raise_from(e, None)
File "", line 3, in raise_from
File "I:\programmes\Lib\site-packages\urllib3\connectionpool.py", line 462, in _make_request
httplib_response = conn.getresponse()
^^^^^^^^^^^^^^^^^^
File "I:\programmes\Lib\http\client.py", line 1378, in getresponse
response.begin()
File "I:\programmes\Lib\http\client.py", line 318, in begin
version, status, reason = self._read_status()
^^^^^^^^^^^^^^^^^^^
File "I:\programmes\Lib\http\client.py", line 279, in _read_status
line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "I:\programmes\Lib\socket.py", line 706, in readinto
return self._sock.recv_into(b)
^^^^^^^^^^^^^^^^^^^^^^^
ConnectionResetError: [WinError 10054] An existing connection was forcibly closed by the remote host

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "I:\programmes\Lib\site-packages\requests\adapters.py", line 486, in send
resp = conn.urlopen(
^^^^^^^^^^^^^
File "I:\programmes\Lib\site-packages\urllib3\connectionpool.py", line 799, in urlopen
retries = retries.increment(
^^^^^^^^^^^^^^^^^^
File "I:\programmes\Lib\site-packages\urllib3\util\retry.py", line 550, in increment
raise six.reraise(type(error), error, _stacktrace)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "I:\programmes\Lib\site-packages\urllib3\packages\six.py", line 769, in reraise
raise value.with_traceback(tb)
File "I:\programmes\Lib\site-packages\urllib3\connectionpool.py", line 715, in urlopen
httplib_response = self._make_request(
^^^^^^^^^^^^^^^^^^^
File "I:\programmes\Lib\site-packages\urllib3\connectionpool.py", line 467, in _make_request
six.raise_from(e, None)
File "", line 3, in raise_from
File "I:\programmes\Lib\site-packages\urllib3\connectionpool.py", line 462, in _make_request
httplib_response = conn.getresponse()
^^^^^^^^^^^^^^^^^^
File "I:\programmes\Lib\http\client.py", line 1378, in getresponse
response.begin()
File "I:\programmes\Lib\http\client.py", line 318, in begin
version, status, reason = self._read_status()
^^^^^^^^^^^^^^^^^^^
File "I:\programmes\Lib\http\client.py", line 279, in _read_status
line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "I:\programmes\Lib\socket.py", line 706, in readinto
return self._sock.recv_into(b)
^^^^^^^^^^^^^^^^^^^^^^^
urllib3.exceptions.ProtocolError: ('Connection aborted.', ConnectionResetError(10054, 'An existing connection was forcibly closed by the remote host', None, 10054, None))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "I:\programmes\Lib\site-packages\openai\api_requestor.py", line 516, in request_raw
result = _thread_context.session.request(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "I:\programmes\Lib\site-packages\requests\sessions.py", line 589, in request
resp = self.send(prep, **send_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "I:\programmes\Lib\site-packages\requests\sessions.py", line 703, in send
r = adapter.send(request, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "I:\programmes\Lib\site-packages\requests\adapters.py", line 501, in send
raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', ConnectionResetError(10054, 'An existing connection was forcibly closed by the remote host', None, 10054, None))

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "", line 198, in _run_module_as_main
File "", line 88, in run_code
File "I:\TEST\Auto-GPT-master\autogpt_main
.py", line 5, in
autogpt.cli.main()
File "I:\programmes\Lib\site-packages\click\core.py", line 1130, in call
return self.main(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "I:\programmes\Lib\site-packages\click\core.py", line 1055, in main
rv = self.invoke(ctx)
^^^^^^^^^^^^^^^^
File "I:\programmes\Lib\site-packages\click\core.py", line 1635, in invoke
rv = super().invoke(ctx)
^^^^^^^^^^^^^^^^^^^
File "I:\programmes\Lib\site-packages\click\core.py", line 1404, in invoke
return ctx.invoke(self.callback, **ctx.params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "I:\programmes\Lib\site-packages\click\core.py", line 760, in invoke
return callback(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "I:\programmes\Lib\site-packages\click\decorators.py", line 26, in new_func
return f(get_current_context(), *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "I:\TEST\Auto-GPT-master\autogpt\cli.py", line 96, in main
run_auto_gpt(
File "I:\TEST\Auto-GPT-master\autogpt\main.py", line 197, in run_auto_gpt
agent.start_interaction_loop()
File "I:\TEST\Auto-GPT-master\autogpt\agent\agent.py", line 132, in start_interaction_loop
assistant_reply = chat_with_ai(
^^^^^^^^^^^^^
File "I:\TEST\Auto-GPT-master\autogpt\llm\chat.py", line 193, in chat_with_ai
assistant_reply = create_chat_completion(
^^^^^^^^^^^^^^^^^^^^^^^
File "I:\TEST\Auto-GPT-master\autogpt\llm\utils_init
.py", line 53, in metered_func
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "I:\TEST\Auto-GPT-master\autogpt\llm\utils_init
.py", line 87, in wrapped
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "I:\TEST\Auto-GPT-master\autogpt\llm\utils_init
.py", line 235, in create_chat_completion
response = api_manager.create_chat_completion(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "I:\TEST\Auto-GPT-master\autogpt\llm\api_manager.py", line 61, in create_chat_completion
response = openai.ChatCompletion.create(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "I:\programmes\Lib\site-packages\openai\api_resources\chat_completion.py", line 25, in create
return super().create(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "I:\programmes\Lib\site-packages\openai\api_resources\abstract\engine_api_resource.py", line 153, in create
response, _, api_key = requestor.request(
^^^^^^^^^^^^^^^^^^
File "I:\programmes\Lib\site-packages\openai\api_requestor.py", line 216, in request
result = self.request_raw(
^^^^^^^^^^^^^^^^^
File "I:\programmes\Lib\site-packages\openai\api_requestor.py", line 528, in request_raw
raise error.APIConnectionError(
openai.error.APIConnectionError: Error communicating with OpenAI: ('Connection aborted.', ConnectionResetError(10054, 'An existing connection was forcibly closed by the remote host', None, 10054, None))

gpt-llama.cpp:

> gpt-llama.cpp start
> node index.js

Server is listening on:

See Docs

Test your installation

  • double click the scripts/test-installation.ps1 (powershell) or scripts/test-installation.bat (cmd) file

See https://github.com/keldenl/gpt-llama.cpp#usage for more guidance.

REQUEST RECEIVED
PROCESSING NEXT REQUEST FOR /v1/models
PROCESS COMPLETE
REQUEST RECEIVED
PROCESSING NEXT REQUEST FOR /v1/chat/completions
LLAMA.CPP DETECTED

===== CHAT COMPLETION REQUEST =====

VICUNA MODEL DETECTED. LOADING VICUNA ENGINE...
{ '--n_predict': 2695 }

===== LLAMA.CPP SPAWNED =====
..\llama.cpp\main -m ..\llama.cpp\models\vicuna\13B\ggml-stable-vicuna-13B.q4_2.bin --temp 0.7 --n_predict 2695 --top_p 0.1 --top_k 40 -c 2048 --seed -1 --repeat_penalty 1.1764705882352942 --reverse-prompt user: --reverse-prompt
user --reverse-prompt system: --reverse-prompt
system --reverse-prompt

--reverse-prompt ## --reverse-prompt

--reverse-prompt ### --reverse-prompt

-i -p Complete the following chat conversation between the Human and the Assistant. System messages should be strictly followed as additional instructions.

System:\

You are a helpful assistant.

Human:\

How are you?

Assistant:\

Hi, how may I help you today?

System:\

You are PoemVISION, poem best writer
Your decisions must always be made independently without seeking user assistance. Play to your strengths as an LLM and pursue simple strategies with no legal complications.

GOALS:

  1. write a poem

Constraints:

  1. ~4000 word limit for short term memory. Your short term memory is short, so immediately save important information to files.
  2. If you are unsure how you previously did something or want to recall past events, thinking about similar events will help you remember.
  3. No user assistance
  4. Exclusively use the commands listed below e.g. command_name

Commands:

  1. analyze_code: Analyze Code, args: "code": "<full_code_string>"
  2. execute_python_code: Create a Python file and execute it, args: "code": "", "basename": ""
  3. execute_python_file: Execute Python File, args: "filename": ""
  4. append_to_file: Append to file, args: "filename": "", "text": ""
  5. delete_file: Delete file, args: "filename": ""
  6. list_files: List Files in Directory, args: "directory": ""
  7. read_file: Read a file, args: "filename": ""
  8. replace_in_file: Replace text or code in a file, args: "filename": "", "old_text": "<old_text>", "new_text": "<new_text>", "occurrence_index": "<occurrence_index>"
  9. write_to_file: Write to file, args: "filename": "", "text": ""
  10. google: Google Search, args: "query": ""
  11. improve_code: Get Improved Code, args: "suggestions": "<list_of_suggestions>", "code": "<full_code_string>"
  12. browse_website: Browse Website, args: "url": "", "question": "<what_you_want_to_find_on_website>"
  13. write_tests: Write Tests, args: "code": "<full_code_string>", "focus": "<list_of_focus_areas>"
  14. delete_agent: Delete GPT Agent, args: "key": ""
  15. get_hyperlinks: Get hyperlinks, args: "url": ""
  16. get_text_summary: Get text summary, args: "url": "", "question": ""
  17. list_agents: List GPT Agents, args: () -> str
  18. message_agent: Message GPT Agent, args: "key": "", "message": ""
  19. start_agent: Start GPT Agent, args: "name": "", "task": "<short_task_desc>", "prompt": ""
  20. task_complete: Task Complete (Shutdown), args: "reason": ""

Resources:

  1. Internet access for searches and information gathering.
  2. Long Term memory management.
  3. GPT-3.5 powered Agents for delegation of simple tasks.
  4. File output.

Performance Evaluation:

  1. Continuously review and analyze your actions to ensure you are performing to the best of your abilities.
  2. Constructively self-criticize your big-picture behavior constantly.
  3. Reflect on past decisions and strategies to refine your approach.
  4. Every command has a cost, so be smart and efficient. Aim to complete tasks in the least number of steps.
  5. Write all code to a file.

You should only respond in JSON format as described below
Response Format:
{
"thoughts": {
"text": "thought",
"reasoning": "reasoning",
"plan": "- short bulleted\n- list that conveys\n- long-term plan",
"criticism": "constructive self-criticism",
"speak": "thoughts summary to say to user"
},
"command": {
"name": "command name",
"args": {
"arg name": "value"
}
}
}
Ensure the response can be parsed by Python json.loads

System:\

The current time and date is Sun Oct 22 02:15:20 2023

Human:\

Determine exactly one command to use, and respond using the format specified above:

Assistant:\

===== REQUEST =====

Human:\

Determine exactly one command to use, and respond using the format specified above:
node:events:491
throw er; // Unhandled 'error' event
^

Error: spawn ..\llama.cpp\main ENOENT
at ChildProcess._handle.onexit (node:internal/child_process:283:19)
at onErrorNT (node:internal/child_process:476:16)
at process.processTicksAndRejections (node:internal/process/task_queues:82:21)
Emitted 'error' event on ChildProcess instance at:
at ChildProcess._handle.onexit (node:internal/child_process:289:12)
at onErrorNT (node:internal/child_process:476:16)
at process.processTicksAndRejections (node:internal/process/task_queues:82:21) {
errno: -4058,
code: 'ENOENT',
syscall: 'spawn ..\llama.cpp\main',
path: '..\llama.cpp\main',
spawnargs: [
'-m',
'..\llama.cpp\models\vicuna\13B\ggml-stable-vicuna-13B.q4_2.bin',
'--temp',
'0.7',
'--n_predict',
2695,
'--top_p',
'0.1',
'--top_k',
'40',
'-c',
'2048',
'--seed',
'-1',
'--repeat_penalty',
'1.1764705882352942',
'--reverse-prompt',
'user:',
'--reverse-prompt',
'\nuser',
'--reverse-prompt',
'system:',
'--reverse-prompt',
'\nsystem',
'--reverse-prompt',
'\n\n\n',
'--reverse-prompt',
'##',
'--reverse-prompt',
'\n##',
'--reverse-prompt',
'###',
'--reverse-prompt',
'\n\n',
'-i',
'-p',
'Complete the following chat conversation between the Human and the Assistant. System messages should be strictly followed as additional instructions.\n' +
'\n' +
'### System:\\n' +
'You are a helpful assistant.\n' +
'### Human:\\n' +
'How are you?\n' +
'### Assistant:\\n' +
'Hi, how may I help you today?\n' +
'### System:\\n' +
'You are PoemVISION, poem best writer\n' +
'Your decisions must always be made independently without seeking user assistance. Play to your strengths as an LLM and pursue simple strategies with no legal complications.\n' +
'\n' +
'GOALS:\n' +
'\n' +
'1. write a poem\n' +
'\n' +
'\n' +
'Constraints:\n' +
'1. ~4000 word limit for short term memory. Your short term memory is short, so immediately save important information to files.\n' +
'2. If you are unsure how you previously did something or want to recall past events, thinking about similar events will help you remember.\n' +
'3. No user assistance\n' +
'4. Exclusively use the commands listed below e.g. command_name\n' +
'\n' +
'Commands:\n' +
'1. analyze_code: Analyze Code, args: "code": "<full_code_string>"\n' +
'2. execute_python_code: Create a Python file and execute it, args: "code": "", "basename": ""\n' +
'3. execute_python_file: Execute Python File, args: "filename": ""\n' +
'4. append_to_file: Append to file, args: "filename": "", "text": ""\n' +
'5. delete_file: Delete file, args: "filename": ""\n' +
'6. list_files: List Files in Directory, args: "directory": ""\n' +
'7. read_file: Read a file, args: "filename": ""\n' +
'8. replace_in_file: Replace text or code in a file, args: "filename": "", "old_text": "<old_text>", "new_text": "<new_text>", "occurrence_index": "<occurrence_index>"\n' +
'9. write_to_file: Write to file, args: "filename": "", "text": ""\n' +
'10. google: Google Search, args: "query": ""\n' +
'11. improve_code: Get Improved Code, args: "suggestions": "<list_of_suggestions>", "code": "<full_code_string>"\n' +
'12. browse_website: Browse Website, args: "url": "", "question": "<what_you_want_to_find_on_website>"\n' +
'13. write_tests: Write Tests, args: "code": "<full_code_string>", "focus": "<list_of_focus_areas>"\n' +
'14. delete_agent: Delete GPT Agent, args: "key": ""\n' +
'15. get_hyperlinks: Get hyperlinks, args: "url": ""\n' +
'16. get_text_summary: Get text summary, args: "url": "", "question": ""\n' +
'17. list_agents: List GPT Agents, args: () -> str\n' +
'18. message_agent: Message GPT Agent, args: "key": "", "message": ""\n' +
'19. start_agent: Start GPT Agent, args: "name": "", "task": "<short_task_desc>", "prompt": ""\n' +
'20. task_complete: Task Complete (Shutdown), args: "reason": ""\n' +
'\n' +
'Resources:\n' +
'1. Internet access for searches and information gathering.\n' +
'2. Long Term memory management.\n' +
'3. GPT-3.5 powered Agents for delegation of simple tasks.\n' +
'4. File output.\n' +
'\n' +
'Performance Evaluation:\n' +
'1. Continuously review and analyze your actions to ensure you are performing to the best of your abilities.\n' +
'2. Constructively self-criticize your big-picture behavior constantly.\n' +
'3. Reflect on past decisions and strategies to refine your approach.\n' +
'4. Every command has a cost, so be smart and efficient. Aim to complete tasks in the least number of steps.\n' +
'5. Write all code to a file.\n' +
'\n' +
'You should only respond in JSON format as described below \n' +
'Response Format: \n' +
'{\n' +
' "thoughts": {\n' +
' "text": "thought",\n' +
' "reasoning": "reasoning",\n' +
' "plan": "- short bulleted\n- list that conveys\n- long-term plan",\n' +
' "criticism": "constructive self-criticism",\n' +
' "speak": "thoughts summary to say to user"\n' +
' },\n' +
' "command": {\n' +
' "name": "command name",\n' +
' "args": {\n' +
' "arg name": "value"\n' +
' }\n' +
' }\n' +
'} \n' +
'Ensure the response can be parsed by Python json.loads\n' +
'### System:\\n' +
'The current time and date is Sun Oct 22 02:15:20 2023\n' +
'### Human:\\n' +
'Determine exactly one command to use, and respond using the format specified above:\n' +
'### Assistant:\'
]
}

Node.js v18.16.1
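
This one is the same ENOENT pattern, but with a relative path: Node tries to spawn ..\llama.cpp\main, and errno -4058 is Windows' ENOENT, so nothing exists at that location relative to the directory gpt-llama.cpp was started from. Two hedged guesses, since both are common on Windows: the llama.cpp binary may be main.exe sitting in a build output folder rather than the repo root, or the relative model path only resolves when the server is launched from a folder that actually sits next to llama.cpp. An absolute model path in the key field sidesteps the working-directory problem (the drive and folders below are illustrative):

    OPENAI_API_KEY=I:\llama.cpp\models\vicuna\13B\ggml-stable-vicuna-13B.q4_2.bin

It is also worth confirming the binary runs at all, e.g. I:\llama.cpp\main.exe -m <model-path> from a terminal, before going back through Auto-GPT.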

from gpt-llama.cpp.
