Discussion of uot from zhiyuanhubj

uot's Issues

Questions about the experimental settings

Hello,

Thank you for sharing your awesome work! I have three questions about the experimental settings.

Q1: As shown in Table 1 of the paper, UoT (CS) consistently outperforms DP (CS) for all models (e.g., Llama3-70B, Claude-3-Opus, GPT-4). I tried to reproduce these results using the scripts below, changing only the guesser (e.g., meta-llama/Meta-Llama-3.1-70B-Instruct, gpt-4o-mini, gpt-4o), the examiner (originally gpt-4; I changed it to gpt-4o, which might be more powerful), and the internal extraction model (originally gpt-3.5-turbo; I changed it to gpt-4o-mini, which is more powerful). However, except when gpt-4o is the guesser, I cannot reproduce the same performance trend (i.e., UoT (CS) > DP (CS)) for most models. Am I executing the scripts for DP (CS) and UoT (CS) correctly?

Here are the scripts with meta-llama/Meta-Llama-3.1-70B-Instruct as the guesser model:

DP (CS)

# DP (CS)
python run.py --guesser_model="meta-llama/Meta-Llama-3.1-70B-Instruct" --examiner_model="gpt-4o" --task=20q --dataset=common --naive_run --inform
# UoT (CS)
python run.py --guesser_model="meta-llama/Meta-Llama-3.1-70B-Instruct" --examiner_model="gpt-4o" --task=20q --dataset=common

Q2: According to Table 11, in the DP (CS) setting in particular, the questioner is informed of the entire probability set twice (content marked in red): once in the first reminder before Q1, and once at Q14. However, in this repository, there is only one place where the entire probability set is given to the questioner. Is the code incorrect, or am I missing something?

UoT/src/uot/method.py

Lines 48 to 51 in e08f17f

    if ques_id > int(task.max_turn*0.7):
        prompt += task.prompts.urge_prompt
        if task.inform:
            prompt += task.prompts.inform_prompt.format(item_list_str=', '.join(task.set))
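
To make the threshold in the snippet above concrete, here is a minimal sketch of which turns satisfy `ques_id > int(task.max_turn*0.7)`. It assumes `max_turn` is 20 and that turns are numbered from 1; neither value is confirmed by the repository excerpt, and `urge_turns` is a hypothetical helper, not part of the UoT code.

```python
def urge_turns(max_turn, ratio=0.7):
    """Return the turn numbers for which `ques_id > int(max_turn * ratio)` holds.

    Assumption: turns are numbered 1..max_turn (not confirmed by the source).
    """
    threshold = int(max_turn * ratio)
    return [ques_id for ques_id in range(1, max_turn + 1) if ques_id > threshold]


# Assumed value: 20 turns, as in a 20-questions game.
turns = urge_turns(20)
print(turns)
```

Under these assumptions the condition first holds at turn 15, not 14, because the comparison is strict. This may be worth checking against the Q14 reminder shown in Table 11.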

Q3: For the Things dataset, there are only 200 objects (THING200) in the current code. Was the paper's experiment conducted on these 200 objects or the original 1,854 items (as mentioned in the paper)?

A problem where the final reply does not match the target

Hello, may I ask whether you have encountered a situation where the final reply from GPT-4 does not match the target, and the same prompt triggers the issue consistently every time? Is this due to a mistake on my part? It seems to occur only when a new tree is created.
Also, may I ask whether a dataset is used to build a tree?

Setting args --task=md --dataset=DX does not work correctly

Traceback (most recent call last):
  File "D:\Program Files\JetBrains\PyCharm Community Edition 2024.1.4\plugins\python-ce\helpers\pydev\pydevd.py", line 1551, in _exec
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "D:\Program Files\JetBrains\PyCharm Community Edition 2024.1.4\plugins\python-ce\helpers\pydev\_pydev_imps\_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "D:\UoT\run.py", line 111, in <module>
    run(args)
  File "D:\UoT\run.py", line 14, in run
    task = get_task(args)
  File "D:\UoT\src\uot\tasks\__init__.py", line 7, in get_task
    return MDTask(args)
  File "D:\UoT\src\uot\tasks\medical_diagnosis.py", line 15, in __init__
    self.data = self.load_dataset(args.dataset)
  File "D:\UoT\src\uot\tasks\medical_diagnosis.py", line 32, in load_dataset
    return json.loads(os.path.join(os.path.dirname(__file__), f"../data/{name}.json").read())
AttributeError: 'str' object has no attribute 'read'
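
The error arises because `os.path.join(...)` returns a path string, and a string has no `.read()` method; the file has to be opened first. Below is a minimal sketch of one possible fix, not the maintainers' patch. The standalone `load_dataset(name, base_dir)` signature and the demo data are hypothetical, chosen so the example runs without the repository's data folder.

```python
import json
import os
import tempfile


def load_dataset(name, base_dir):
    """Load `<base_dir>/<name>.json` and return the parsed JSON object.

    Hypothetical standalone variant of the method named in the traceback.
    """
    path = os.path.join(base_dir, f"{name}.json")
    # os.path.join returns a plain str, so open the file before reading it.
    with open(path, encoding="utf-8") as f:
        return json.load(f)


# Self-contained demo with placeholder data in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "DX.json"), "w", encoding="utf-8") as f:
        json.dump([{"placeholder": "record"}], f)
    demo = load_dataset("DX", tmp)
```

Inside the repository, the same idea would presumably apply with `base_dir` derived from `os.path.dirname(__file__)` and the `../data` folder, as in the original line.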
