
chatgpt-whatsapp-bot's Introduction

How to run

To run the bot, fill in the credentials in the corresponding JSON files inside the config folder. Examples are provided as JS files in the same folder.

After that, make sure a history.json file containing an empty JSON document exists in the bot/storage folder; if history.json does not exist, create it. This file stores the conversation history so that ChatGPT has a memory. The history setting can be configured in config.json in the config folder.

Create the following folders if they don't already exist:

  • bot/audio
  • bot/storage
  • bot/images
  • bot/videos
  • bot/videos/frames

Also, ffmpeg must be installed on the computer that runs this code.
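The folder and history-file steps above can be sketched as a small Node.js helper. This is only a sketch under assumptions: the function name `prepareWorkspace` is hypothetical, and "empty JSON" is assumed here to mean an empty object (`{}`); switch it to `[]` if the bot expects an array.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Folder names taken from the list in this README.
const REQUIRED_DIRS = [
  "bot/audio",
  "bot/storage",
  "bot/images",
  "bot/videos",
  "bot/videos/frames",
];

// Assumption: "empty JSON" means an empty object; use "[]" if an array is expected.
const EMPTY_HISTORY = "{}";

// Hypothetical helper: creates missing folders and an empty history.json under rootDir.
function prepareWorkspace(rootDir) {
  for (const dir of REQUIRED_DIRS) {
    // recursive: true creates parent folders and does nothing if the folder exists.
    fs.mkdirSync(path.join(rootDir, dir), { recursive: true });
  }
  const historyPath = path.join(rootDir, "bot", "storage", "history.json");
  // Only create the file when it is missing, so existing history is preserved.
  if (!fs.existsSync(historyPath)) {
    fs.writeFileSync(historyPath, EMPTY_HISTORY);
  }
  return historyPath;
}
```

Calling `prepareWorkspace(process.cwd())` from the repository root would prepare the folders in place. The helper does not check for ffmpeg, which still has to be installed separately.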

Fixes

When handling video, the following package version has a bug that prevents it from returning the video's base64-encoded data:

"whatsapp-web.js": "1.20.0"

The only way to fix this is to edit the installed package: in node_modules/whatsapp-web.js/src/structures/Message.js, line 406 must read

if (msg.mediaData.mediaStage.includes('ERROR')) {
    // media could not be downloaded
    return undefined;
}

The condition must not include the fetching check: the browser does download the video, so there is no need to cancel media that is still fetching.
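The effect of the change can be illustrated with a standalone stand-in for the patched condition. This is not the library's code: the function name `mediaDownloadFailed` is hypothetical, and the stage strings passed to it are illustrative assumptions rather than values taken from the whatsapp-web.js documentation.

```javascript
// Illustrative stand-in for the patched check; not part of whatsapp-web.js.
function mediaDownloadFailed(mediaStage) {
  // After the patch, only a stage containing "ERROR" counts as a failed download.
  return mediaStage.includes("ERROR");
}

// Stage strings below are assumptions chosen for illustration.
console.log(mediaDownloadFailed("ERROR_UNKNOWN")); // true
console.log(mediaDownloadFailed("FETCHING"));      // false
console.log(mediaDownloadFailed("RESOLVED"));      // false
```

With the fetching check gone, media that is still being fetched is no longer treated as a failure, which is what lets the video data come back.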

Features

  • [General Knowledge] Uses ChatGPT 3.5 to analyze text messages and reply the way ChatGPT would
    • Coding questions
    • General knowledge
    • and more ...
  • [Can speak] Responds to audio messages with a voice message of its own, using ElevenLabs and OpenAI's Whisper
    • Add the voice ID and the API keys in config.json
  • [Translation in an exclusive voice] When asked to translate something, it simply translates; when asked to translate something with your voice, it uses the "translate_id" voice from ElevenLabs
    • This lets you speak another language in your own voice, pre-trained on the ElevenLabs website
  • [Image Analysis] Uses a GPT-2 model with transformers to get the context of an image, then feeds that to ChatGPT 3.5 so it can reply accordingly
    • You can add text to the image to ask something related to it or to give more context
    • Because GPT-2 is much older than 3.5, the descriptions are only as accurate as that model allows. I will probably switch to another image-to-text model on Hugging Face that uses 3.5 as soon as one comes up
    • model "vit-gpt2-image-captioning": https://huggingface.co/nlpconnect/vit-gpt2-image-captioning
  • [Video Analysis] Uses the same GPT-2 image model to understand videos
    • It splits the video into a certain number of frames, converts each frame to text, and transcribes the video's audio using Whisper from OpenAI. All of this is fed to ChatGPT 3.5, which uses it to work out the content
    • Text can be added to the video to give more context or to ask something related to the video
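The video analysis steps above (frames, audio, combined prompt) can be sketched as follows. This is a minimal sketch, not the bot's actual implementation: the function names, the output file pattern, the 16 kHz mono WAV format, and the prompt layout are all assumptions made for illustration. Only standard ffmpeg options are used (`-vf fps=`, `-frames:v`, `-vn`, `-ac`, `-ar`).

```javascript
const path = require("node:path");

// Hypothetical helper: ffmpeg arguments that sample `frameCount` frames
// spread evenly over a clip of `durationSeconds`.
function buildFrameArgs(videoPath, framesDir, frameCount, durationSeconds) {
  if (!(frameCount > 0) || !(durationSeconds > 0)) {
    throw new RangeError("frameCount and durationSeconds must be positive");
  }
  // Output rate in frames per second, e.g. 5 frames over 10 s gives fps=0.5.
  const fps = frameCount / durationSeconds;
  return [
    "-y", "-i", videoPath,
    "-vf", `fps=${fps}`,
    "-frames:v", String(frameCount),
    path.join(framesDir, "frame_%03d.jpg"),
  ];
}

// Hypothetical helper: ffmpeg arguments that drop the video stream (-vn)
// and write the audio as 16 kHz mono, a format assumed suitable for transcription.
function buildAudioArgs(videoPath, audioPath) {
  return ["-y", "-i", videoPath, "-vn", "-ac", "1", "-ar", "16000", audioPath];
}

// Hypothetical helper: merges frame captions, the audio transcript, and any
// text the user sent with the video into a single prompt string.
function buildVideoPrompt(frameCaptions, transcript, userText = "") {
  const lines = [
    "Frame descriptions:",
    ...frameCaptions.map((caption, i) => `${i + 1}. ${caption}`),
    "Audio transcript:",
    transcript,
  ];
  if (userText) {
    lines.push("User message:", userText);
  }
  return lines.join("\n");
}
```

The argument arrays could be passed to ffmpeg with Node's `child_process.spawnSync("ffmpeg", args)`. The captions and the transcript would come from the image captioning model and Whisper respectively, which are outside this sketch.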

Future Features [Work in progress]

  • [Internet Connection] When the bot needs more up-to-date information from the internet, the goal is to scrape a certain number of Google links, convert them to text, and pass that to ChatGPT 3.5 as new information that can later be referred to, asked about, or used in any other way the model supports.
  • [Image Creation] If all works out, one of the next features will add the DALL-E API so the bot can create images from scratch.
  • [More random behavior] Currently the bot replies with a voice message when you send it one. In the future, a preliminary call to ChatGPT will let the bot itself decide whether to reply with a voice message or just text, including when you send it a voice message.

Examples

Two screen recordings (RPReplay_Final1684777570.MP4, RPReplay_Final1684897271.1.mp4) and seven screenshots taken on 2023-05-28.
