Comments (5)
Except for "non-redundant", your requirements can be met with some well-designed custom instructions: you need to tell the LLM the expected way to respond. However, "non-redundant" conflicts with "correct" in most cases. Because of the limited ability of current LLMs, they need to debug their code several times to reach a final correct version, just like human programmers, which means there is always some redundant code in the conversation history. You may need to find your own way to filter the history to get the final correct code.
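One way to filter the history is to keep only the most recent code message the assistant produced. A minimal sketch, assuming the message dicts use open-interpreter's `role`/`type`/`format`/`content` shape (treat that shape as an assumption and check it against your OI version):

```python
# Sketch: pick the final code block out of an OI-style message history,
# assuming the last code message is the debugged, correct version.

def last_code_block(messages, language="python"):
    """Return the content of the most recent assistant code message, or None."""
    for msg in reversed(messages):
        if (
            msg.get("role") == "assistant"
            and msg.get("type") == "code"
            and msg.get("format") == language
        ):
            return msg["content"]
    return None

# Illustrative history: a buggy first attempt followed by a fixed version.
history = [
    {"role": "user", "type": "message", "content": "Write is_even."},
    {"role": "assistant", "type": "code", "format": "python",
     "content": "def is_even(n): return n % 2 = 0"},          # buggy attempt
    {"role": "assistant", "type": "code", "format": "python",
     "content": "def is_even(n):\n    return n % 2 == 0"},    # corrected version
]

print(last_code_block(history))
```

This only drops the redundant intermediate attempts; whether the last block really is the correct one still depends on the model having converged.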
from open-interpreter.
Thank you for your quick response! Could you suggest any suitable prompt templates or methods for extracting code so I can test open-interpreter's performance on HumanEval? In my tests (where I've designed prompts to ensure the agent always outputs code), GPT-3.5 with open-interpreter seems to perform somewhat worse than using GPT-3.5 directly. Any advice would be greatly appreciated!
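For HumanEval-style scoring, one common extraction method is to take the last fenced Python block from the model's reply, on the assumption that the last block is the debugged version. A hedged sketch (the helper name and sample reply are invented for illustration, not part of open-interpreter's API):

```python
import re

FENCE = "`" * 3  # build the fence marker so this example stays self-contained

def extract_final_code(reply: str) -> str:
    """Return the last fenced python block in a reply, else the raw reply."""
    pattern = FENCE + r"python\n(.*?)" + FENCE
    blocks = re.findall(pattern, reply, flags=re.DOTALL)
    return blocks[-1].strip() if blocks else reply.strip()

# Illustrative reply containing a broken first attempt and a fixed version.
reply = (
    "First attempt:\n" + FENCE + "python\nreturn x  # broken\n" + FENCE + "\n"
    "Fixed:\n" + FENCE + "python\ndef add(a, b):\n    return a + b\n" + FENCE + "\n"
)
print(extract_final_code(reply))
```

The extracted string can then be concatenated with the HumanEval test harness and executed in a sandbox to compute pass@k.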
GPT-3.5 is provided as a RESTful API by OpenAI, so I'm not sure what "using GPT-3.5 directly" means. Curling the API directly? If you mean ChatGPT from OpenAI, then since ChatGPT's system prompts are the property of OpenAI, it's hard to compose something better than them. There are some tricks on the Internet for extracting ChatGPT's system prompt; maybe you can try those.
By the way, the default embedded system prompt of OI may not be suitable for your task; it focuses too much on telling the LLM how to parse OI's special message types. If custom instructions can't solve your problem, you can try modifying the embedded system prompt in the OI source code.
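Rather than patching the source, the same idea can be sketched as plain string surgery on the prompt: drop the message-type parsing section and append task-focused rules. Everything here is illustrative; the prompt text and the "## Message types" section name are invented, and in open-interpreter the real string would live on the interpreter's system-message setting:

```python
# Hypothetical embedded prompt, simulated as a plain string.
default_prompt = (
    "You are Open Interpreter, you can run code on the user's machine.\n"
    "## Message types\n"
    "How to emit and parse OI's special message types...\n"
)

# Benchmark-focused rules to append instead.
task_rules = (
    "For benchmark tasks, reply with exactly one complete Python function\n"
    "and no prose outside the code block.\n"
)

# Trim the parsing section, keep the identity line, append the new rules.
trimmed = default_prompt.split("## Message types")[0].rstrip()
custom_prompt = trimmed + "\n\n" + task_rules
print(custom_prompt)
```

The resulting string can then be assigned back to the interpreter's system message before running the benchmark.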
Thank you for your reply. I will try these methods and look forward to OI's continued updates.
Related Issues (20)
- Cannot run scripts that aren't python HOT 5
- After the task is completed, the task will be executed repeatedly and will not stop automatically; intermittent and continuous repeated output HOT 3
- Adding Groq Support HOT 1
- litellm.exceptions.ServiceUnavailableError: AnthropicException - anthropic does not support parameters: {'functions'
- You can see what's on the screen and go to My Downloads
- role inversion with llama 3 "You are Open Interpreter"
- Hosted multimodal models from Open Router currently don't work on Open Interpreter HOT 2
- Adding the 'Computer' destroyed open interpreter which was the best product i used HOT 9
- Installation fails without specifying full Python version in one-liner `oi-mac-installer.sh`
- Termux: tip and report, each time during upgrade
- In VSCode terminal, generated code blocks & errors progressively repeat in a flashing way
- "open terminal failed: not a terminal"
- Password-input prompt from OS was removed from the terminal when OI tries to run `sudo` commands. HOT 3
- "× This environment is externally managed" Error
- support for gpt-4o HOT 5
- Field "model_name" has conflict with protected namespace "model_". HOT 2
- Error reported when using a local ollama model HOT 1
- computer is not defined HOT 1
- Install for Noobs like me on windows 10 HOT 2
- `AttributeError: 'wrapper_descriptor' object has no attribute '__code__'` when attempting to use a custom language HOT 1