Comments (6)
And gpt-4o-mini, but it looks like that's already included, so this issue can be closed?
from open-interpreter.
This should work out of the box! As it's just a new model name, you can run interpreter --model openai/gpt-4o. By the next update, we will set this as the default model.
Let me know if that works @alperyilmaz, and thanks for opening this!
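For anyone curious how a provider-prefixed model string like openai/gpt-4o is typically interpreted, here is a minimal sketch of the LiteLLM-style provider/model split (illustrative only; LiteLLM's real routing handles many more cases and defaults):

```python
# Sketch: splitting a LiteLLM-style model string such as "openai/gpt-4o"
# into a provider prefix and a model name. This is an illustration of the
# convention, not LiteLLM's actual parsing code.
def split_model_string(model: str) -> tuple[str, str]:
    if "/" in model:
        provider, name = model.split("/", 1)
        return provider, name
    # Assumption for this sketch: bare names are treated as OpenAI models.
    return "openai", model

print(split_model_string("openai/gpt-4o"))  # ('openai', 'gpt-4o')
```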
It works!
I have a question about how things work, especially with images. When I paste an image and ask "what do you see?", I expected the image to be sent to the OpenAI model gpt-4o, which could then reply with what it sees in the image. But when I ask this question through open-interpreter, it lays out a plan like this:
- Load the Image: Load the image file from the specified path.
- Perform OCR: Use Tesseract to extract any text from the image.
- Analyze the Image: Provide information regarding the text and any detected features in the image.
So, if I understand correctly, it's not possible to take advantage of the vision capabilities of gpt-4o. Did I understand that correctly? Or am I using the wrong prompts?
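For context, sending the image straight to gpt-4o (the behavior expected above) means putting it into an OpenAI chat-completions message as an image_url content part. Here is a sketch that only builds that payload; actually sending it would require the openai package and an API key:

```python
import base64

# Sketch: the chat-completions message shape for sending an image to a
# vision-capable model like gpt-4o. Field names follow OpenAI's public
# vision message format; the image bytes here are a placeholder.
def build_vision_message(question: str, image_bytes: bytes) -> dict:
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": question},
            {"type": "image_url",
             "image_url": {"url": f"data:image/png;base64,{b64}"}},
        ],
    }

msg = build_vision_message("what do you see?", b"\x89PNG placeholder")
```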
@KillianLucas I think @alperyilmaz didn't run into a problem just chatting with the new model; rather, running OI with the --vision param overrides the gpt-4o model setting in the profile with gpt-4-vision-preview (luckily this only applies to the current conversation and doesn't actually modify the profile file, even though there's an unexpected prompt: "We have updated our profile file format. Would you like to migrate your profile file to the new format? No data will be lost."). This usually means the new model gpt-4o is not in the list of models that support vision mode defined in the OI source code.
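As a rough illustration of the override described above, a vision profile that pins the model might look like the following (hypothetical contents; the real vision.yaml ships inside OI and its exact schema may differ):

```yaml
# Hypothetical sketch of a vision profile, NOT the real vision.yaml.
version: 0.2.1
llm:
  model: gpt-4-vision-preview
  supports_vision: true
```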
I checked the source code and found that if the --vision param is passed at launch, OI loads settings from vision.yaml, a profile not normally accessible to users, which sets the model to gpt-4-vision-preview. And because the version code of vision.yaml is still 0.2.1 (the latest is 0.2.5), it prints the profile migration prompt. BTW, the check for whether a model is a vision model is done with litellm.supports_vision.
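The migration prompt described above boils down to a version comparison: a profile whose recorded version trails the current OI version triggers the prompt. A minimal sketch of that logic (illustrative; OI's actual implementation may differ):

```python
# Sketch: decide whether a profile needs migration by comparing its
# recorded version against the current version, numerically per component.
def needs_migration(profile_version: str, current_version: str) -> bool:
    parse = lambda v: tuple(int(x) for x in v.split("."))
    return parse(profile_version) < parse(current_version)

print(needs_migration("0.2.1", "0.2.5"))  # True: vision.yaml is outdated
```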
Just checked the latest version of litellm (1.37.16): they have added support for gpt-4o, so litellm.supports_vision should now work correctly. However, OI still tries to use libraries like PIL to analyze the image locally on the user's machine. Worth more investigation.
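The routing decision at stake can be sketched as follows. VISION_MODELS here is a stand-in for the model capability data that litellm.supports_vision consults, not litellm's real tables:

```python
# Sketch: once the vision check reports True for gpt-4o, the image should
# be sent to the model rather than analyzed locally with PIL/Tesseract.
VISION_MODELS = {"gpt-4o", "gpt-4-vision-preview"}  # stand-in data

def supports_vision(model: str) -> bool:
    return model in VISION_MODELS

def route_image_request(model: str) -> str:
    if supports_vision(model):
        return "send image to model"   # desired behavior for gpt-4o
    return "fall back to local OCR"    # what OI currently does instead

print(route_image_request("gpt-4o"))  # send image to model
```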