Comments (1)
The algorithm is designed to be lossless if the computation is done at infinite precision. In practice the computation is done at 32-bit or 16-bit precision, so rounding errors can cause occasional discrepancies.
In our tests, such cases are extremely rare. If you see frequent differences between the greedy and LLMA outputs, it must be due to a bug or error that we are not aware of.
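To illustrate the kind of rounding effect described above, here is a minimal, standard-library-only sketch. It is not taken from the LLMA code: the helper names (`to_fp16`, `fp16_sum`, `argmax`) and the numbers are made up for illustration. It assumes a simplified model in which every addition is rounded to 16-bit precision, and shows that summing the same terms in a different order can change the result enough to flip a greedy argmax when two logits are close.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def fp16_sum(terms):
    """Accumulate left to right, rounding to float16 after every addition."""
    total = 0.0
    for t in terms:
        total = to_fp16(total + to_fp16(t))
    return total


def argmax(values):
    """Index of the largest value (first index wins on ties)."""
    return max(range(len(values)), key=values.__getitem__)


# Hypothetical contributions to one token's logit. Mathematically the
# sum is 2052 regardless of order.
terms = [2048.0, 1.0, 1.0, 1.0, 1.0]

# Large term first: each +1 is lost, because float16 values near 2048
# are spaced 2 apart and the halfway case rounds back down.
forward = fp16_sum(terms)

# Small terms first: they add up to 4 before meeting the large term.
backward = fp16_sum(list(reversed(terms)))

# Hypothetical logit of a competing token, exactly representable in float16.
rival = 2050.0

print("forward sum :", forward)
print("backward sum:", backward)
print("greedy pick (forward) :", argmax([forward, rival]))
print("greedy pick (backward):", argmax([backward, rival]))
```

Real kernels do not round after every scalar addition like this, but the same effect (a different accumulation order giving a slightly different logit) is one plausible way two mathematically equivalent decoding paths could pick different tokens on a near-tie.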
from lmops.