
Comments (12)

zhangyebai commented on May 18, 2024
  1. Tencent -> Llama

convert_tencentpretrain_to_llama.py

  2. Llama -> Huggingface

convert_llama_weights_to_hf.py

As I understand it, this should be the conversion path.
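
The two-step path above can be sketched as a small helper that only assembles the two commands and prints them, without running anything. The script names come from this thread; the flag names (`--input_model_path`, `--output_model_path`, `--layers_num`, `--input_dir`, `--model_size`, `--output_dir`), the script locations, and every file path are assumptions for illustration, so check each script's `--help` before using them.

```python
import shlex


def build_conversion_commands(tp_checkpoint, llama_checkpoint, llama_dir,
                              hf_dir, model_size, layers_num):
    """Return the two conversion commands as argument lists; nothing is run."""
    # Step 1: TencentPretrain checkpoint -> original LLaMA checkpoint.
    # Flag names are assumptions; confirm them with the script's --help.
    to_llama = [
        "python3", "scripts/convert_tencentpretrain_to_llama.py",
        "--input_model_path", tp_checkpoint,
        "--output_model_path", llama_checkpoint,
        "--layers_num", str(layers_num),
    ]
    # Step 2: original LLaMA checkpoint -> Hugging Face format.
    # The directory layout this script expects should be checked separately.
    to_hf = [
        "python3", "convert_llama_weights_to_hf.py",
        "--input_dir", llama_dir,
        "--model_size", model_size,
        "--output_dir", hf_dir,
    ]
    return [to_llama, to_hf]


# All paths below are hypothetical placeholders.
commands = build_conversion_commands(
    tp_checkpoint="models/chatllama_zh_7b.bin",
    llama_checkpoint="models/llama/7B/consolidated.00.pth",
    llama_dir="models/llama",
    hf_dir="models/chatllama_zh_7b_hf",
    model_size="7B",
    layers_num=32,
)
for cmd in commands:
    print(shlex.join(cmd))
```

Printing the commands instead of executing them makes it easy to review the arguments first; each list could then be passed to `subprocess.run(cmd, check=True)`.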

from linly.

jamestch commented on May 18, 2024

Same problem here.


zhangyebai commented on May 18, 2024

TencentPretrain scripts


jamestch commented on May 18, 2024

TencentPretrain scripts

I couldn't find a script in that repository for converting LLaMA from the TencentPretrain format to the Huggingface format.


riverzhou commented on May 18, 2024

When converting to llama, how should the layer_num parameter be set? Should I use the default (12 layers)?


ydli-ai commented on May 18, 2024


zhangyebai commented on May 18, 2024

Looking forward to the author providing a conversion script from ChatLLaMA-zh-7B to ChatLLaMA-zh-7B-hf. Waiting online.


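riverzhou commented on May 18, 2024

Looking forward to the author providing a conversion script from ChatLLaMA-zh-7B to ChatLLaMA-zh-7B-hf. Waiting online.

Actually, being able to convert directly to llama suits me fine, because I use llama.cpp.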


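riverzhou commented on May 18, 2024

When converting to llama, how should the layer_num parameter be set? Should I use the default (12 layers)?

Answering my own question: the 7B model has 32 layers and the 13B model has 40 layers.
Please correct me if this is wrong.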
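
A minimal sketch of that mapping, so the layer count does not have to be remembered when converting. The function name `layers_for` is hypothetical, and only the two sizes mentioned in this thread are included.

```python
# Layer counts as stated in this thread: 7B -> 32 layers, 13B -> 40 layers.
LLAMA_LAYERS = {"7B": 32, "13B": 40}


def layers_for(model_size: str) -> int:
    """Return the layer count for a model size such as "7B" or "13b"."""
    key = model_size.strip().upper()
    if key not in LLAMA_LAYERS:
        known = ", ".join(LLAMA_LAYERS)
        raise ValueError(f"unknown model size {model_size!r}; known sizes: {known}")
    return LLAMA_LAYERS[key]


print(layers_for("7B"))   # 32
print(layers_for("13b"))  # 40
```

Raising an error for unknown sizes avoids silently falling back to the 12-layer default mentioned above.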


Minami-su commented on May 18, 2024

How is the quality after converting to huggingface? Is there any loss?


hepj987 commented on May 18, 2024

@riverzhou
How do you run it with llama.cpp?


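riverzhou commented on May 18, 2024

@riverzhou How do you run it with llama.cpp?

First, use the conversion script in the TencentPretrain project to convert the author's Tencent-format weights to the original llama format (layer_num parameter: 32 layers for the 7B model, 40 layers for the 13B model),
then use the conversion script in the llama.cpp project to convert them to the ggml format.
Finally, quantization is optional; Q4, Q5, and Q8 all work.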
