Comments (9)

1049451037 commented on July 20, 2024

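layer_range takes a list as its argument: the indices of the layers that LoRA should be added to. In the script we provide, list(range(0, 28, 14)) is simply [0, 14]; that is plain Python syntax. If you also want LoRA on layers 15 and 16, just pass layer_range=[0, 14, 15, 16].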
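
To make the Python point concrete, here is a small check that can be run on its own, without loading any model code. It shows what list(range(0, 28, 14)) evaluates to, why extra indices cannot be passed to range(), and two ways to build the explicit list. The variable names are illustrative only and do not come from the training script.

```python
# Pure-Python check of the layer_range values discussed above; no model code is loaded.
default_layers = list(range(0, 28, 14))  # start=0, stop=28 (exclusive), step=14
print(default_layers)  # [0, 14]

# range() accepts at most three arguments (start, stop, step), so extra layer
# indices cannot be appended inside the call; doing so raises TypeError.
try:
    list(range(0, 28, 14, 15, 16))
    extra_args_ok = True
except TypeError:
    extra_args_ok = False
print(extra_args_ok)  # False

# Option 1: pass an explicit list of layer indices.
custom_layers = [0, 14, 15, 16]

# Option 2: extend the default list, de-duplicating and sorting the result.
extended_layers = sorted(set(default_layers) | {15, 16})
print(extended_layers)  # [0, 14, 15, 16]
```

Either list can then be passed as the layer_range argument in place of list(range(0, 28, 14)).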

from visualglm-6b.

1049451037 commented on July 20, 2024

Please change it to:

python3 cli_demo.py --from_pretrained /root/ai/VisualGLM-6B/checkpoints/finetune-visualglm-6b-05-25-15-02 --prompt_zh 请给这张图片写一段广告语?

magicwang1111 commented on July 20, 2024

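Please change it to:

python3 cli_demo.py --from_pretrained /root/ai/VisualGLM-6B/checkpoints/finetune-visualglm-6b-05-25-15-02 --prompt_zh 请给这张图片写一段广告语?

It runs successfully now, thank you.
Before training:
image
After training:
image

Clothing:
image

I hope web_demo will also support loading the fine-tuned model in a future update.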

magicwang1111 commented on July 20, 2024

image
The reasoning ability in multi-turn dialogue is also quite good.

1049451037 commented on July 20, 2024

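If you have a lot of data, consider increasing the number of trainable parameters. The current training script trains LoRA on only 2 layers; see the layer_range parameter at https://github.com/THUDM/VisualGLM-6B/blob/main/finetune_visualglm.py#L18. At the moment only layers [0, 14] are fine-tuned with LoRA.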
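
One way to scale the number of LoRA layers with the amount of data is to pick evenly spaced layer indices. The helper below is a hypothetical sketch, not part of the VisualGLM-6B code; it assumes the model has 28 transformer layers, as suggested by the upper bound in range(0, 28, 14), and that layer_range accepts any list of valid indices.

```python
from typing import List


def evenly_spaced_layers(num_layers: int, count: int) -> List[int]:
    """Return `count` distinct layer indices spread evenly over [0, num_layers).

    Hypothetical helper for choosing a layer_range value; integer arithmetic
    is used so the result is exact and strictly increasing.
    """
    if not 1 <= count <= num_layers:
        raise ValueError("count must be between 1 and num_layers")
    return [i * num_layers // count for i in range(count)]


NUM_LAYERS = 28  # assumption taken from range(0, 28, 14) in the script

print(evenly_spaced_layers(NUM_LAYERS, 2))   # [0, 14], the script's default
print(evenly_spaced_layers(NUM_LAYERS, 4))   # [0, 7, 14, 21]
print(evenly_spaced_layers(NUM_LAYERS, 28))  # every layer, 0 through 27
```

More layers mean more trainable parameters and more memory, so the right count depends on the dataset size and the available hardware.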

magicwang1111 commented on July 20, 2024

If you have a lot of data, consider increasing the number of trainable parameters. The current training script trains LoRA on only 2 layers; see the layer_range parameter at https://github.com/THUDM/VisualGLM-6B/blob/main/finetune_visualglm.py#L18. At the moment only layers [0, 14] are fine-tuned with LoRA.

Thank you so much for all the help you have given me during fine-tuning. I found this line of code:
self.add_mixin("lora", LoraMixin(args.num_layers, args.lora_rank, head_first=True, num_attention_heads=args.num_attention_heads, hidden_size_per_attention_head=args.hidden_size // args.num_attention_heads, layer_range=list(range(0, 28, 14))), reinit=True)

I know that in SD, LoRA has 26 layers:
BASE,IN00,IN01,IN02,IN03,IN04,IN05,IN06,IN07,IN08,IN09,IN10,IN11,M00,OUT00,OUT01,OUT02,OUT03,OUT04,OUT05,OUT06,OUT07,OUT08,OUT09,OUT10,OUT11
So if I want to change which layers are used, should I change it to self.add_mixin("lora", LoraMixin(args.num_layers, args.lora_rank, head_first=True, num_attention_heads=args.num_attention_heads, hidden_size_per_attention_head=args.hidden_size // args.num_attention_heads, layer_range=list(range(0, 28, 14, 15, 16))), reinit=True), like this?

magicwang1111 commented on July 20, 2024

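If you have a lot of data, consider increasing the number of trainable parameters. The current training script trains LoRA on only 2 layers; see the layer_range parameter at https://github.com/THUDM/VisualGLM-6B/blob/main/finetune_visualglm.py#L18. At the moment only layers [0, 14] are fine-tuned with LoRA.

Does visualglm support multi-turn dialogue training the way chatglm does?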

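{"prompt": "The fan on my Great Wall H3 won't spin. The relay is fine, the fuse is fine, the sensor is new, and the fan is new too. Why is this? The relay is just missing a signal wire.", "response": "Can you read the data stream with a scan tool? What is the coolant temperature?", "history": []}
{"prompt": "95", "response": "How is the temperature difference between the upper and lower hoses? Has all the air been bled out?", "history": [["The fan on my Great Wall H3 won't spin. The relay is fine, the fuse is fine, the sensor is new, and the fan is new too. Why is this? The relay is just missing a signal wire.", "Can you read the data stream with a scan tool? What is the coolant temperature?"]]}
{"prompt": "Yes. Both the upper and lower hoses are fine.", "response": "Then you need to check the wiring. The fan relay is normally engaged by the ECU, so if there is an open circuit or the ECU is faulty, the relay will not engage!", "history": [["The fan on my Great Wall H3 won't spin. The relay is fine, the fuse is fine, the sensor is new, and the fan is new too. Why is this? The relay is just missing a signal wire.", "Can you read the data stream with a scan tool? What is the coolant temperature?"], ["95", "How is the temperature difference between the upper and lower hoses? Has all the air been bled out?"]]}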
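
The three records above follow a simple pattern: each turn becomes one record, and its history holds every earlier (prompt, response) pair in order. The sketch below builds records in that layout from a list of turns. The function name and sample turns are made up for illustration, and the thread does not confirm that the VisualGLM fine-tuning script reads this exact format, so treat it as a description of the data layout only.

```python
import json
from typing import Dict, List, Tuple


def conversation_to_records(turns: List[Tuple[str, str]]) -> List[Dict]:
    """Expand one conversation into one record per turn.

    Each record carries the current prompt and response plus a history list
    of all earlier [prompt, response] pairs, matching the sample records.
    """
    records = []
    history: List[List[str]] = []
    for prompt, response in turns:
        records.append({
            "prompt": prompt,
            "response": response,
            "history": [list(pair) for pair in history],  # copy, not alias
        })
        history.append([prompt, response])
    return records


# Placeholder turns; real data would hold the actual dialogue text.
turns = [
    ("question 1", "answer 1"),
    ("question 2", "answer 2"),
    ("question 3", "answer 3"),
]
records = conversation_to_records(turns)

# One JSON object per line; ensure_ascii=False keeps non-ASCII text readable.
jsonl = "\n".join(json.dumps(r, ensure_ascii=False) for r in records)
print(jsonl)
```

The first record has an empty history, and each later record adds one more pair, so a conversation of n turns yields n records.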

magicwang1111 commented on July 20, 2024

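layer_range takes a list as its argument: the indices of the layers that LoRA should be added to. In the script we provide, list(range(0, 28, 14)) is simply [0, 14]; that is plain Python syntax. If you also want LoRA on layers 15 and 16, just pass layer_range=[0, 14, 15, 16].

Thanks for the guidance. Does visualglm support multi-turn dialogue training the way chatglm does?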

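{"prompt": "The fan on my Great Wall H3 won't spin. The relay is fine, the fuse is fine, the sensor is new, and the fan is new too. Why is this? The relay is just missing a signal wire.", "response": "Can you read the data stream with a scan tool? What is the coolant temperature?", "history": []}
{"prompt": "95", "response": "How is the temperature difference between the upper and lower hoses? Has all the air been bled out?", "history": [["The fan on my Great Wall H3 won't spin. The relay is fine, the fuse is fine, the sensor is new, and the fan is new too. Why is this? The relay is just missing a signal wire.", "Can you read the data stream with a scan tool? What is the coolant temperature?"]]}
{"prompt": "Yes. Both the upper and lower hoses are fine.", "response": "Then you need to check the wiring. The fan relay is normally engaged by the ECU, so if there is an open circuit or the ECU is faulty, the relay will not engage!", "history": [["The fan on my Great Wall H3 won't spin. The relay is fine, the fuse is fine, the sensor is new, and the fan is new too. Why is this? The relay is just missing a signal wire.", "Can you read the data stream with a scan tool? What is the coolant temperature?"], ["95", "How is the temperature difference between the upper and lower hoses? Has all the air been bled out?"]]}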

1049451037 commented on July 20, 2024

Yes, it is supported. See cli_demo.py for reference.
