Tutorial on training and evaluating LLMs, and on building entertaining LLM applications with RAG, Agents, and Chains.
This repo, together with its videos, has been helpful to me.
Nevertheless, I still ran into some issues.
May the following tips save you time.
[Linux:Yes, Windows:No] bitsandbytes is officially maintained on Linux only. Hence, you cannot load the 4-bit or 8-bit model on a Windows platform.
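As a quick guard against this, a minimal sketch of a runtime check before requesting quantized loading (the `quantization_supported` helper and the `load_in_4bit` flag name here are illustrative assumptions, not part of this repo):

```python
import sys

def quantization_supported(platform: str = sys.platform) -> bool:
    """Return True when bitsandbytes 4-bit/8-bit loading is expected to work.

    bitsandbytes is officially maintained on Linux, so this illustrative
    helper only enables quantized loading there.
    """
    return platform.startswith("linux")

# Decide whether to request a quantized model before calling the loader.
load_in_4bit = quantization_supported()
```

On Windows you would then fall back to loading the model without quantization (or move to a Linux machine, as the tip above suggests).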
[Need fine-tune:Yes, Directly use:No] First, fine-tune the 4-bit model yourself. The fine-tuned model is then written to the output folder. Finally, you can check the results in the notebook and compare the performance with and without fine-tuning.
[P40x1-4-bit-model:Yes, V100x1-bfloat16-model:No] Currently, you can fine-tune the 4-bit model on a single P40 GPU (24 GB) under Linux, but you cannot fine-tune the bfloat16 model on a single V100 GPU (32 GB) under Windows.
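A rough back-of-the-envelope estimate shows why 24 GB is enough for the 4-bit model while 32 GB is tight for bfloat16 fine-tuning (the 7B parameter count below is an assumption for illustration, not taken from this repo):

```python
def weight_memory_gb(n_params: float, bytes_per_param: float) -> float:
    # Memory for the weights alone; gradients and Adam optimizer states
    # add a large multiple of this on top during full fine-tuning.
    return n_params * bytes_per_param / 1024**3

N = 7e9  # assumed 7B-parameter model, for illustration only
print(weight_memory_gb(N, 0.5))  # 4-bit weights: ~3.3 GB, fits a 24 GB P40
print(weight_memory_gb(N, 2.0))  # bf16 weights: ~13 GB before gradients
                                 # and optimizer states, so 32 GB runs out
```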
[Training time based on the default settings]: 4950 iterations in total; the first 300 iterations took about 55 min, hence the whole fine-tuning would take about 15 hours.
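The 15-hour figure follows from a simple linear extrapolation of the observed pace (a sketch; it assumes throughput stays constant over the whole run):

```python
def estimated_total_hours(total_iters: int,
                          sample_iters: int,
                          sample_minutes: float) -> float:
    # Linearly extrapolate the observed training pace to the full run.
    return total_iters / sample_iters * sample_minutes / 60

# 300 iterations took about 55 minutes; 4950 iterations in total.
print(estimated_total_hours(4950, 300, 55))  # ~15.1 hours
```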