Comments (5)
Hello, my fine-tuning snippet is working now and you may close this ticket. Thank you for the help!
from starcoder.
By the way, this is because our team works mostly with game-development languages such as C#.
Hi, we fine-tuned on the Python dataset with Megatron-LM, the same framework we used for pre-training. You can use it with any other dataset, or use the code provided in this repository with the dataset of your choice.
Thanks a lot for the update. However, I am not familiar with the code in Megatron-LM. May I know which file(s) I should follow to understand the fine-tuning logic?
Thanks in advance!
Hi,
I am looking into the examples directory and will update this ticket once I have it running properly. Thanks.