

⧐ Intent Pilot


What can be said can be solved.

Get early access to the PTA model | Scale on our shoulders



Intent Pilot

Intent Pilot orchestrates two tools for automation: AskUI's object detector and OpenAI's GPT-4V. It is designed to automate repetitive tasks and to help users perform complex tasks with ease. This repository is our attempt to understand GPT-4V's potential for automation and to build an end-to-end automation tool.

We are inspired by Self-Operating-Computer and improve upon it with a more accurate object detection model and an improved prompting strategy. We also provide a more user-friendly interface and a more intuitive way to interact with the tool. For example, a notification feature lets users know what is happening and what to do next. Our tool also works across keyboard layouts (US, German, etc.), which was one of the limitations of similar tools.

Demo

demo-eco.mp4

Quick Start

Setup

  • Python 3.11 or later
  • OpenAI Key
  • AskUI credentials
    • ASKUI_WORKSPACE_ID and ASKUI_TOKEN must be set in the .env file for the tool to run.
    • You can get your own AskUI credentials by signing up at AskUI.
  • You can either copy the .env.example file to .env and fill in the required details, or enter the credentials in the terminal when you start the app.

⚠️ IMPORTANT: If you saved the credentials with the flag -c, --config, you MUST delete them with the flag -d, --deleteconfig before the local .env file is read again.
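As a rough pre-flight check, the sketch below parses a .env-style file and reports which of the AskUI variables named above are missing or empty. The helper names (`parse_env_text`, `missing_keys`) are hypothetical and not part of Intent Pilot. The variable name for the OpenAI key is not stated in this README, so it is left out rather than guessed.

```python
from pathlib import Path

# Variable names taken from the setup list above.
REQUIRED_KEYS = ("ASKUI_WORKSPACE_ID", "ASKUI_TOKEN")


def parse_env_text(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(env: dict[str, str], required=REQUIRED_KEYS) -> list[str]:
    """Return the required keys that are absent or empty."""
    return [key for key in required if not env.get(key)]


if __name__ == "__main__":
    env_path = Path(".env")
    text = env_path.read_text() if env_path.exists() else ""
    missing = missing_keys(parse_env_text(text))
    print("Missing:", ", ".join(missing) if missing else "none")
```

Run it from the directory that contains your .env file. It only inspects the local file and does not look at credentials saved with -c.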

Linux

  • On Linux, you may need to install the following packages:
sudo apt-get install xsel xclip python3-tk python3-dev
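To confirm the packages are in place, a minimal sketch like the following can help. It assumes the xsel and xclip packages provide commands of the same name on PATH and that python3-tk makes the `tkinter` module importable. The function names are hypothetical and not part of Intent Pilot.

```python
import shutil

# Command-line tools installed by the apt-get line above.
CLIPBOARD_TOOLS = ("xsel", "xclip")


def missing_tools(tools=CLIPBOARD_TOOLS, which=shutil.which) -> list[str]:
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


def tkinter_available() -> bool:
    """Report whether the tkinter module (python3-tk) can be imported."""
    try:
        import tkinter  # noqa: F401
    except ImportError:
        return False
    return True


if __name__ == "__main__":
    print("Missing tools:", missing_tools() or "none")
    print("tkinter importable:", tkinter_available())
```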

macOS

  • On macOS, you have to grant the terminal permission to access the clipboard. Go to System Preferences -> Security & Privacy -> Privacy -> Accessibility and add the terminal to the list of apps that can control your computer.

Windows

We are currently working on the Windows version of the tool. It will be available soon.

Quick fix: The package also works on Windows, but Windows Defender deletes the src/intent_pilot/utils/screenshot.py file. Restore the file from quarantine and add it to the exclusion list.

Installation

pip install intent-pilot

Terminal

After installation, run intent in your terminal:

intent

If you are unable to run the command, try the following instead:

python -m intent_pilot
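The same fallback can be scripted. This is a minimal sketch, not part of the package: `launch_command` is a hypothetical helper that prefers the `intent` script when it is on PATH and otherwise runs the module with the current interpreter.

```python
import shutil
import sys


def launch_command(which=shutil.which) -> list[str]:
    """Prefer the `intent` script; fall back to running the module."""
    if which("intent"):
        return ["intent"]
    # Same fallback as above, using the interpreter running this script.
    return [sys.executable, "-m", "intent_pilot"]


if __name__ == "__main__":
    print(" ".join(launch_command()))
```

The returned list can be passed to `subprocess.run` if you want to start the tool from a script.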

Build from Source

We recommend using PDM to build and run Intent Pilot.

Step 0: Install PDM

Run the following command in a terminal to install PDM:

curl -sSL https://pdm-project.org/install-pdm.py | python3 -

Step 1: Install Dependencies for Intent Pilot

Run the following command to install all the dependencies needed for Intent Pilot:

pdm install

Step 2: Run Intent Pilot

Start Intent Pilot by running this command:

pdm run intent

Flags

  • --debug: Prints debug output to the console
  • -m, --model <modelname>: The model to use, either llava or the default gpt4v
  • -c, --config: Prompts to save the credentials configuration to ~/.askui/intent-pilot.env
  • -d, --deleteconfig: Prompts to delete the credentials from ~/.askui/intent-pilot.env

Use Local llava Model Instead of gpt4-Vision

Install Ollama for your system from their website.

  • Run ollama serve to start the API locally
  • Run ollama run llava to start the model locally
  • Start Intent Pilot with the flag -m llava
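Before starting Intent Pilot with -m llava, it can be useful to confirm that the local Ollama server is up and that the llava model has been pulled. The sketch below assumes Ollama is listening on its default address, http://localhost:11434, and queries its /api/tags endpoint, which lists local models. The helper names are hypothetical and not part of Intent Pilot.

```python
import json
import urllib.request

# Ollama's default local address; adjust if you changed OLLAMA_HOST.
OLLAMA_TAGS_URL = "http://localhost:11434/api/tags"


def has_model(tags_payload: dict, model: str = "llava") -> bool:
    """Check a /api/tags response for a model, ignoring the ':tag' suffix."""
    names = (entry.get("name", "") for entry in tags_payload.get("models", []))
    return any(name.split(":", 1)[0] == model for name in names)


def fetch_tags(url: str = OLLAMA_TAGS_URL, timeout: float = 3.0) -> dict:
    """Fetch the list of locally available models from a running Ollama server."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    try:
        print("llava available:", has_model(fetch_tags()))
    except OSError as error:
        # URLError and connection failures are subclasses of OSError.
        print("Could not reach Ollama:", error)
```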

Join Our Discord Community

For real-time discussions and community support, join our Discord server:

  • Join our Discord Server and then navigate to the #intent-pilot channel.

Contributing

Thank you for your interest in contributing! We welcome involvement from the community.

Roadmap

We are currently building the PTA (Prompt-to-Automation) model, a multimodal model that can understand and execute natural-language commands in real time, faster than any Virtual Personal Assistant (VPA) on the market.

Contributors

gitlost-murali, gitlost-murali-askui, johannesdienst-askui, programminx-askui, lumpin-askui
