Comments (7)
The tokenizer does this:
return mx.array([self.tokenize(t, prepend_bos, append_eos) for t in text])
when text is a List[str], and mx.array does not accept a list of token lists with non-uniform lengths.
I can "hack" this to work by padding each sequence with EOT here. I checked the HF implementation, and it seems to "hack" this as well:
pad_token="<|endoftext|>", # hack to enable padding
so I did the same to stay consistent with it.
I have these changes done locally. I am not sure whether this repo is meant for educational purposes, showing how to use MLX, rather than being a completely robust and faithful port of Hugging Face. If this fix helps, I can open a PR.
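The padding idea described above can be sketched in a few lines. This is only an illustration, not the change that was made locally: the helper name pad_token_ids and the pad_id parameter are hypothetical, and the token ids in the usage lines are placeholders rather than real CLIP vocabulary ids.

```python
from typing import List


def pad_token_ids(batch: List[List[int]], pad_id: int) -> List[List[int]]:
    """Right-pad every token list to the length of the longest one.

    pad_id is assumed to be the id of the "<|endoftext|>" token,
    mirroring the pad_token choice quoted from HF above.
    """
    if not batch:
        return []
    max_len = max(len(ids) for ids in batch)
    # Append pad_id until each row reaches max_len; rows already at
    # max_len are returned unchanged (as new lists).
    return [ids + [pad_id] * (max_len - len(ids)) for ids in batch]


# Placeholder ids, for illustration only.
padded = pad_token_ids([[1, 5, 6, 2], [1, 7, 2]], pad_id=2)
print(padded)
```

Because every row now has the same length, the result could be handed to mx.array in place of the ragged list.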
from mlx-examples.
I think it would be fine to pad in the tokenizer. Usually we try to keep the examples simple so they are useful for educational purposes / starting points for building from. In this case I don't think it will add much complexity to pad in the tokenizer.
But I guess the clip_loss would be a bit off if you don't consider the padding?
Understood. Along the way, I took a quick look at the tokenizer and found that replicating HF completely probably wasn't the goal. I followed what HF does closely and fixed it locally, and it worked fine. The HF tokenizer uses end-of-text to pad, so this is at least as correct as HF appears to be.
I haven't gotten into training yet, so I'm not sure whether the CLIP loss needs to take the padding into account.
> I haven't gotten into training yet, so I'm not sure whether the CLIP loss needs to take the padding into account.
Right, but we do evaluate the loss in the current example even when not training. I'm not entirely sure why. Maybe we should remove it, since otherwise it would be incorrect in the padded case.
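Whether the loss is actually off in the padded case depends on how the text embedding is pooled. One possibility, stated here as an assumption and not checked against this example's model code, is to pool at the first EOT position in each row, so that padding appended after it is not selected. The helper first_eot_positions below is hypothetical and the ids are placeholders; it only shows how those positions could be located in an EOT-padded batch.

```python
from typing import List


def first_eot_positions(batch: List[List[int]], eot_id: int) -> List[int]:
    """Return the index of the first eot_id in each row.

    Assumption: every row contains at least one eot_id (for example
    because append_eos was set); list.index raises ValueError otherwise.
    """
    return [ids.index(eot_id) for ids in batch]


# Placeholder ids: 9 stands in for the end-of-text id, and the second
# row has been padded with it.
positions = first_eot_positions([[5, 1, 2, 9], [5, 3, 9, 9]], eot_id=9)
print(positions)
```

If the model instead pools over all positions, or attends to the padded positions, the padding would need to be masked out before the loss is meaningful.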
@kechan where do you add the pad_token="<|endoftext|>"? Thanks