Comments (5)
set -x
export PATH=$HOME/.local/bin/:$PATH
ray job submit --address="http://127.0.0.1:8266" \
   --runtime-env-json='{"working_dir": "/openrlhf", "pip": "/openrlhf/requirements.txt"}' \
   -- python3 examples/train_ppo_ray.py \
   --ref_num_nodes 1 \
   --ref_num_gpus_per_node 1 \
   --reward_num_nodes 1 \
   --reward_num_gpus_per_node 1 \
   --critic_num_nodes 1 \
   --critic_num_gpus_per_node 1 \
   --actor_num_nodes 1 \
   --actor_num_gpus_per_node 1 \
   --vllm_num_engines 2 \
   --vllm_tensor_parallel_size 2 \
   --colocate_critic_reward \
   --colocate_actor_ref \
   --ref_reward_offload \
   --pretrain /openrlhf/OpenLLMAI/Llama-3-8b-sft-mixture \
   --reward_pretrain /openrlhf/examples/scripts/checkpoint/llama3-8b-rm \
   --save_path ./checkpoint/llama-3-8b-rlhf \
   --micro_train_batch_size 1 \
   --train_batch_size 128 \
   --micro_rollout_batch_size 1 \
   --rollout_batch_size 1024 \
   --max_samples 100000 \
   --max_epochs 1 \
   --prompt_max_len 1024 \
   --generate_max_len 1024 \
   --zero_stage 3 \
   --bf16 \
   --actor_learning_rate 5e-7 \
   --critic_learning_rate 9e-6 \
   --init_kl_coef 0.01 \
   --prompt_data /openrlhf/OpenLLMAI/prompt-collection-v0.1 \
   --input_key context_messages \
   --apply_chat_template \
   --normalize_reward \
   --adam_offload \
   --flash_attn \
   --gradient_checkpointing \
   --use_wandb {wandb_token}
This is my configuration.
from openrlhf.
Traceback (most recent call last):
  File "/root/.local/lib/python3.10/site-packages/ray/dashboard/modules/dashboard_sdk.py", line 262, in _check_connection_and_version_with_url
    r = self._do_request("GET", url)
  File "/root/.local/lib/python3.10/site-packages/ray/dashboard/modules/dashboard_sdk.py", line 303, in _do_request
    return requests.request(
  File "/root/.local/lib/python3.10/site-packages/requests/api.py", line 59, in request
    return session.request(method=method, url=url, **kwargs)
  File "/root/.local/lib/python3.10/site-packages/requests/sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
  File "/root/.local/lib/python3.10/site-packages/requests/sessions.py", line 703, in send
    r = adapter.send(request, **kwargs)
  File "/root/.local/lib/python3.10/site-packages/requests/adapters.py", line 700, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='127.0.0.1', port=8266): Max retries exceeded with url: /api/version (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f0fca25a3e0>: Failed to establish a new connection: [Errno 111] Connection refused'))
from openrlhf.
You need to start Ray first: ray start --head --node-ip-address 0.0.0.0
Then set the node and GPU counts in the script according to the number of GPUs you have.
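A quick way to confirm that the "Connection refused" error comes from nothing listening at the address passed to ray job submit is to try a plain TCP connection before submitting. The snippet below is only a sketch: the helper name dashboard_reachable is made up for illustration, and it checks only that the port accepts connections, not that a healthy Ray dashboard is behind it.

```python
import socket
from urllib.parse import urlparse


def dashboard_reachable(address: str, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the host:port in `address` succeeds.

    Hypothetical helper for illustration; it is not part of Ray or OpenRLHF.
    """
    parsed = urlparse(address)
    host = parsed.hostname or "127.0.0.1"
    # Fall back to 8265, Ray's default dashboard port, when no port is given.
    port = parsed.port or 8265
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused" as well as timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    addr = "http://127.0.0.1:8266"  # the address used in the script above
    if dashboard_reachable(addr):
        print(f"{addr} is accepting connections.")
    else:
        print(f"Nothing is listening at {addr}; start the head node or check the port.")
```

Note that the Ray dashboard listens on port 8265 by default. If the head node was started without --dashboard-port, the 8266 in the script may also need to be changed to match.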
from openrlhf.
Thanks... I had never used Ray before.
from openrlhf.
--colocate_critic_reward
--colocate_actor_ref
--ref_reward_offload
--colocate_critic_reward
--colocate_actor_ref
These two flags are used to merge nodes, so that each pair of models shares the same nodes.
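To see how these flags change the number of GPUs the script asks for, here is a rough back-of-the-envelope calculation. It is only a sketch under stated assumptions: a colocated pair is assumed to share one set of GPUs, each vLLM engine is assumed to take vllm_tensor_parallel_size GPUs of its own, and the function gpus_requested is hypothetical, not OpenRLHF's actual placement logic.

```python
def gpus_requested(cfg: dict) -> int:
    """Estimate the total GPUs a config asks for (illustrative only)."""

    def group(name: str) -> int:
        return cfg[f"{name}_num_nodes"] * cfg[f"{name}_num_gpus_per_node"]

    actor, ref = group("actor"), group("ref")
    critic, reward = group("critic"), group("reward")

    # Assumption: a colocated pair shares GPUs; otherwise each model gets its own.
    actor_ref = max(actor, ref) if cfg.get("colocate_actor_ref") else actor + ref
    critic_reward = (
        max(critic, reward) if cfg.get("colocate_critic_reward") else critic + reward
    )

    # Assumption: vLLM engines run on separate GPUs, one per tensor-parallel rank.
    vllm = cfg.get("vllm_num_engines", 0) * cfg.get("vllm_tensor_parallel_size", 1)
    return actor_ref + critic_reward + vllm


# Values copied from the script in the first comment.
cfg = {
    "ref_num_nodes": 1, "ref_num_gpus_per_node": 1,
    "reward_num_nodes": 1, "reward_num_gpus_per_node": 1,
    "critic_num_nodes": 1, "critic_num_gpus_per_node": 1,
    "actor_num_nodes": 1, "actor_num_gpus_per_node": 1,
    "vllm_num_engines": 2, "vllm_tensor_parallel_size": 2,
    "colocate_actor_ref": True, "colocate_critic_reward": True,
}

print(gpus_requested(cfg))  # 6 with both colocate flags
print(gpus_requested({**cfg, "colocate_actor_ref": False,
                      "colocate_critic_reward": False}))  # 8 without them
```

Under these assumptions the posted configuration needs about 6 GPUs, so on a smaller machine the vLLM engine count or tensor parallel size would have to be reduced.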
from openrlhf.