
uot's Issues

Questions about the experimental settings

Hello,

Thank you for sharing your awesome work! I have three questions about the experimental settings.

Q1: As shown in Table 1 of the paper, UoT (CS) consistently outperforms the DP (CS) approach for all models (e.g., Llama3-70B, Claude-3-Opus, GPT-4). I tried to reproduce these results using the scripts below, changing only the guesser (e.g., meta-llama/Meta-Llama-3.1-70B-Instruct, gpt-4o-mini, gpt-4o), the examiner (originally gpt-4; I changed it to gpt-4o, which might be more powerful), and the internal extract model (originally gpt-3.5-turbo; I changed it to gpt-4o-mini, which is more powerful). However, except when gpt-4o is the guesser, I cannot reproduce the same performance trend (i.e., UoT (CS) > DP (CS)) for most models. Am I running the scripts for DP (CS) and UoT (CS) correctly?

Here are the scripts with meta-llama/Meta-Llama-3.1-70B-Instruct as the guesser model:

# DP (CS)
python run.py --guesser_model="meta-llama/Meta-Llama-3.1-70B-Instruct" --examiner_model="gpt-4o" --task=20q --dataset=common --naive_run --inform
# UoT (CS)
python run.py --guesser_model="meta-llama/Meta-Llama-3.1-70B-Instruct" --examiner_model="gpt-4o" --task=20q --dataset=common

Q2: According to Table 11, in the DP (CS) setting the questioner is informed of the entire probability set twice (content marked in red): once as a first reminder before Q1, and again at Q14. However, in this repository there is only one place where the entire probability set is given to the questioner. Is the code incorrect, or am I missing something?

UoT/src/uot/method.py

Lines 48 to 51 in e08f17f

if ques_id > int(task.max_turn*0.7):
prompt += task.prompts.urge_prompt
if task.inform:
prompt += task.prompts.inform_prompt.format(item_list_str=', '.join(task.set))
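To make the timing in Q2 concrete, here is a minimal sketch of which turns satisfy the threshold condition `ques_id > int(task.max_turn*0.7)` from the snippet above. It assumes `max_turn` is 20 and that `ques_id` counts from 1; neither is confirmed by the quoted lines, and `turns_past_threshold` is a hypothetical helper, not part of the repository.

```python
def turns_past_threshold(max_turn, ratio=0.7):
    """Return the question ids for which `ques_id > int(max_turn * ratio)` holds."""
    threshold = int(max_turn * ratio)
    # Assumption: ques_id is 1-based and runs up to max_turn inclusive.
    return [q for q in range(1, max_turn + 1) if q > threshold]


# Assumption: a 20-question game, i.e. max_turn = 20.
late_turns = turns_past_threshold(20)
print(late_turns)
```

Under these assumptions the condition is true on every turn past the threshold rather than on a single turn, so how often the item list is shown depends on how the `task.inform` check is nested in the actual file.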

Q3: For the Things dataset, there are only 200 objects (THING200) in the current code. Was the paper's experiment conducted on these 200 objects or the original 1,854 items (as mentioned in the paper)?

The final reply does not match the target

Hello, have you encountered a situation where the final reply from GPT-4 does not match the target, and the same prompt triggers the issue consistently every time? Is this caused by a mistake on my side? The issue seems to occur only when a new tree is created.
In addition, may I ask whether a dataset is used to build a tree?

Setting args --task=md --dataset=DX does not work correctly

Traceback (most recent call last):
  File "D:\Program Files\JetBrains\PyCharm Community Edition 2024.1.4\plugins\python-ce\helpers\pydev\pydevd.py", line 1551, in _exec
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "D:\Program Files\JetBrains\PyCharm Community Edition 2024.1.4\plugins\python-ce\helpers\pydev\_pydev_imps\_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "D:\UoT\run.py", line 111, in <module>
    run(args)
  File "D:\UoT\run.py", line 14, in run
    task = get_task(args)
  File "D:\UoT\src\uot\tasks\__init__.py", line 7, in get_task
    return MDTask(args)
  File "D:\UoT\src\uot\tasks\medical_diagnosis.py", line 15, in __init__
    self.data = self.load_dataset(args.dataset)
  File "D:\UoT\src\uot\tasks\medical_diagnosis.py", line 32, in load_dataset
    return json.loads(os.path.join(os.path.dirname(__file__), f"../data/{name}.json").read())
AttributeError: 'str' object has no attribute 'read'
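The last frame of the traceback calls `.read()` on the result of `os.path.join(...)`, which is a plain string rather than a file object. A minimal sketch of one possible fix is to open the path first and parse the file with `json.load`. The standalone `load_dataset(data_dir, name)` signature and the sample file contents below are assumptions made for illustration, not the repository's actual method or data.

```python
import json
import os
import tempfile


def load_dataset(data_dir, name):
    """Read <data_dir>/<name>.json and return the parsed JSON content."""
    path = os.path.join(data_dir, f"{name}.json")
    # os.path.join returns a str, so the file must be opened before reading.
    with open(path, encoding="utf-8") as f:
        return json.load(f)


# Usage with a throwaway file; the record is placeholder data only.
with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "DX.json"), "w", encoding="utf-8") as f:
        json.dump([{"disease": "example"}], f)
    loaded = load_dataset(tmp, "DX")
    print(loaded)
```

In the repository's method, the equivalent change would presumably be to wrap the joined path in `open(...)` before reading it.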
