Bot Aquarium

This project gives a large language model (LLM) control of a Linux machine.

In the example below, we start with the prompt:

You now have control of an Ubuntu Linux server. Your goal is to run a Minecraft server. Do not respond with any judgement, questions or explanations. You will give commands and I will respond with current terminal output.

Respond with a linux command to give to the server.

The AI first runs sudo apt-get update, then installs openjdk-8-jre-headless. Each time it runs a command, we send the output of that command to OpenAI and ask for a summary of what happened, then use this summary as part of the next prompt.

[asciicast: terminal recording of the example session]

Inspired by xkcd.com/350 and "Optimality is the tiger, agents are its teeth"

Usage

Build

docker network create aquarium
docker build -t aquarium .
go build

Start

Pass your prompt in the form of a goal. For example, --goal "Your goal is to run a Minecraft server."

OPENAI_API_KEY=$OPENAI_API_KEY ./aquarium --goal "Your goal is to run a Minecraft server."

Arguments

./aquarium -h
Usage of ./aquarium:
  -debug
        Enable logging of AI prompts to debug.log
  -goal string
        Goal to give the AI. This will be injected within the following statement:

        > You now have control of an Ubuntu Linux server.
        > [YOUR GOAL WILL BE INSERTED HERE]
        > Do not respond with any judgement, questions or explanations. You will give commands and I will respond with current terminal output.
        >
        > Respond with a linux command to give to the server.

         (default "Your goal is to execute a verbose port scan of amazon.com.")
  -limit int
        Maximum number of commands the AI should run. (default 30)
  -preserve-container
        Persist docker container after program exits.
  -split-limit int
        When parsing long responses, we split up the response into chunks and ask the AI to summarize each chunk.
        split-limit is the maximum number of times we will split the response. (default 3)
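
The way -goal is injected into the fixed statement can be sketched in Go as follows. This is a minimal illustration, not the project's actual code: the function name buildPrompt is hypothetical, and the line layout is assumed from the help text above.

```go
package main

import "strings"

// buildPrompt wraps a user-supplied goal in the fixed statement shown in the
// -goal help text. The name and the exact line breaks are assumptions.
func buildPrompt(goal string) string {
	lines := []string{
		"You now have control of an Ubuntu Linux server.",
		strings.TrimSpace(goal), // the goal replaces the placeholder line
		"Do not respond with any judgement, questions or explanations. You will give commands and I will respond with current terminal output.",
		"",
		"Respond with a linux command to give to the server.",
	}
	return strings.Join(lines, "\n")
}
```

Calling buildPrompt("Your goal is to run a Minecraft server.") would yield the prompt from the introduction, with the goal on its own line.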

Logs

The left side of the screen contains general information about the state of the program. The right side contains the terminal, as seen by the AI.
These are written to aquarium.log and terminal.log, respectively.

Calls to OpenAI are not logged unless you add the --debug flag. API requests and responses will be appended to debug.log.

How it works

Agent loop

  1. Send the OpenAI API the list of commands executed so far (and their outcomes), and ask which command should run next
  2. Execute the command in the Docker container
  3. Read the output of that command, send it to OpenAI, and ask text-davinci-003 for a summary of what happened
    1. If the output is too long, the OpenAI API returns a 400
    2. In that case, recursively break the output into chunks and ask for a summary of each chunk
    3. Ask OpenAI for a summary of the summaries to get a final answer about what the command did
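
The loop above can be sketched in Go roughly as follows. This is an illustration under stated assumptions, not the project's implementation: the backend struct, runAgent, summarize, and errTooLong are hypothetical names, the three function fields stand in for the real OpenAI and Docker calls, and treating the split limit as a recursion depth is one possible reading of -split-limit.

```go
package main

import (
	"errors"
	"strings"
)

// errTooLong stands in for the HTTP 400 returned when a prompt is too long.
var errTooLong = errors.New("input too long")

// step records one executed command and the summary of its output.
type step struct {
	Command string
	Summary string
}

// backend groups the external calls. All three are hypothetical stand-ins.
type backend struct {
	NextCommand func(history []step) (string, error) // ask the model for the next command
	Run         func(command string) (string, error) // run it in the container, return output
	Ask         func(text string) (string, error)    // ask the model to summarize text
}

// runAgent repeats the ask, execute, summarize cycle up to limit times.
func runAgent(b backend, limit, splitLimit int) ([]step, error) {
	var history []step
	for i := 0; i < limit; i++ {
		cmd, err := b.NextCommand(history)
		if err != nil {
			return history, err
		}
		cmd = strings.TrimSpace(cmd)
		output, err := b.Run(cmd)
		if err != nil {
			return history, err
		}
		summary, err := summarize(b.Ask, output, splitLimit)
		if err != nil {
			return history, err
		}
		history = append(history, step{Command: cmd, Summary: summary})
	}
	return history, nil
}

// summarize asks for a summary of text. If the request is rejected as too
// long, it splits the text in two, summarizes each part recursively, and then
// asks for a summary of the two summaries. splitsLeft bounds the recursion.
func summarize(ask func(string) (string, error), text string, splitsLeft int) (string, error) {
	out, err := ask(text)
	if err == nil || !errors.Is(err, errTooLong) {
		return out, err
	}
	if splitsLeft <= 0 || len(text) < 2 {
		return "", err // give up: out of splits, or nothing left to split
	}
	// Split near the middle, preferring a line boundary. This is a byte-based
	// split, so it may cut a multi-byte character in non-ASCII output.
	mid := len(text) / 2
	if i := strings.IndexByte(text[mid:], '\n'); i >= 0 && mid+i+1 < len(text) {
		mid += i + 1
	}
	left, err := summarize(ask, text[:mid], splitsLeft-1)
	if err != nil {
		return "", err
	}
	right, err := summarize(ask, text[mid:], splitsLeft-1)
	if err != nil {
		return "", err
	}
	return summarize(ask, left+"\n"+right, splitsLeft-1)
}
```

Passing the external calls in as function fields keeps the loop testable without network access or a running container.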

More examples

Prompt: Your goal is to execute a verbose port scan of amazon.com.

The bot replies with nmap -v amazon.com. nmap is not installed; we return the failure to the AI, which then installs it and continues.

[video: portscan.mp4]

Prompt: Your goal is to install a ngircd server. (an IRC server software)

The bot installs the software, helpfully allows port 6667 through the firewall, then tries to run sudo -i and gets stuck.

[screenshot, 2023-03-24]

Todo

  • There are no success criteria, so the program doesn't know when to stop. The -limit flag controls how many commands are run (default 30).
  • The AI cannot give input to running programs. For example, if you ask it to SSH into a server using a password, it will hang at the password prompt. For apt-get, I've hacked around this issue by injecting -y to prevent it from asking the user for input.
  • I don't have a perfect way to detect when a command completes. Right now I record the number of running processes beforehand, run the command, then poll the process count until it returns to the original value. This is a brittle solution.
  • The terminal output handling is imperfect. Some commands, like wget, use \r to write the progress bar; I rewrite that as a \n instead. I also don't have any support for terminal colors, which I'm suppressing with ansi2txt.
  • I haven't tried this with GPT-3 or GPT-4 yet, only text-davinci-003. OpenAI doesn't yet support text completion with GPT-4 (only conversational chat), so it would require restructuring the prompt.
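
Three of the workarounds listed above (injecting -y into apt-get, rewriting \r as \n, and polling the process count) might look roughly like this in Go. This is a sketch under assumptions, not the project's code: all function names are hypothetical, the process counter is passed in as a function so the real Docker query is left out, and the timeout and the \r\n handling are additions not described above.

```go
package main

import (
	"errors"
	"strings"
	"time"
)

// injectAptYes adds -y after every "apt-get " so apt never waits for a
// confirmation. Deliberately naive: it also rewrites matches inside quoted
// text, and it does not check whether -y is already present.
func injectAptYes(cmd string) string {
	return strings.ReplaceAll(cmd, "apt-get ", "apt-get -y ")
}

// normalizeOutput rewrites carriage returns as newlines so that progress bars
// drawn with \r become separate lines. Collapsing \r\n into a single \n first
// is an assumption that avoids doubled blank lines.
func normalizeOutput(out string) string {
	out = strings.ReplaceAll(out, "\r\n", "\n")
	return strings.ReplaceAll(out, "\r", "\n")
}

// errTimeout is returned when the process count never drops back.
var errTimeout = errors.New("process count did not return to baseline")

// waitForBaseline polls count until it reports no more than baseline
// processes, sleeping interval between polls and giving up after timeout.
func waitForBaseline(count func() (int, error), baseline int, interval, timeout time.Duration) error {
	deadline := time.Now().Add(timeout)
	for {
		n, err := count()
		if err != nil {
			return err
		}
		if n <= baseline {
			return nil // the command (and anything it spawned) has exited
		}
		if time.Now().After(deadline) {
			return errTimeout
		}
		time.Sleep(interval)
	}
}
```

The polling approach stays brittle for the reason given above: any unrelated process that starts or exits in the container during the command changes the count.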

Contributors

chris-abbott, fafrd
