Comments (2)
This happened to me with GPT-2, and I solved it by adding the following line:
if self.config['token_prefix'] is not None and token[0] == self.config['token_prefix']: token = token[1:]
right after the first line of the loop in the nmf.explore() method:
    for idx, token in enumerate(self.tokens[input_sequence]):  # self.tokens[:-1]
        if self.config['token_prefix'] is not None and token[0] == self.config['token_prefix']: token = token[1:]
        type = "input" if idx < self.n_input_tokens else 'output'
        tokens.append({'token': token,
                       'token_id': int(self.token_ids[input_sequence][idx]),
                       # 'token_id': int(self.token_ids[idx]),
                       'type': type,
                       # 'value': str(components[0][comp_num][idx]),  # because json complains of floats
                       'position': idx
                       })
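The same check can also be written as a small standalone helper, which makes it easy to try outside the library. This is only a sketch: `strip_token_prefix` is a hypothetical name, not part of ecco, and it assumes the prefix is a plain string such as the `token_prefix` config value used above. Using `str.startswith` also avoids the `IndexError` that `token[0]` would raise on an empty token.

```python
from typing import Optional


def strip_token_prefix(token: str, prefix: Optional[str]) -> str:
    """Drop a leading tokenizer marker (e.g. 'Ġ') from a token, if one is configured.

    Hypothetical helper for illustration; not part of ecco's API.
    """
    # A missing or empty prefix means there is nothing to strip.
    if prefix and token.startswith(prefix):
        return token[len(prefix):]
    return token


if __name__ == "__main__":
    # Hand-written example tokens, not real tokenizer output.
    for tok in ["Hello", "Ġworld", ""]:
        print(repr(strip_token_prefix(tok, "Ġ")))
```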
from ecco.
Yeah, that shouldn't happen. Many tokenizers put a marker character such as Ġ at the start of a token to encode how it connects to the token before it in the sequence; in GPT-2's byte-level BPE, Ġ stands for a leading space. That is why rendering the output needs to run in tandem with the tokenizer and its settings.
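To illustrate why the renderer has to know about the marker, here is a rough sketch of turning GPT-2-style tokens back into display text. `render_tokens` and its `space_marker` parameter are made up for this example (they are not ecco functions), and the token list is hand-written rather than produced by a real tokenizer. It assumes the marker only ever means "this token starts with a space".

```python
from typing import Iterable


def render_tokens(tokens: Iterable[str], space_marker: str = "Ġ") -> str:
    """Join tokens into text, turning a leading marker back into a space.

    Hypothetical helper; assumes a GPT-2-style marker that encodes a leading space.
    """
    pieces = []
    for tok in tokens:
        if tok.startswith(space_marker):
            # The marker stands for a space, so the token begins a new word.
            pieces.append(" " + tok[len(space_marker):])
        else:
            # No marker: the token continues the previous one without a space.
            pieces.append(tok)
    return "".join(pieces)


if __name__ == "__main__":
    print(render_tokens(["The", "Ġtoken", "izer", "Ġworks"]))
```

Other tokenizers use different conventions (for example a suffix or a continuation marker), so a renderer like this would need the marker and its meaning passed in from the tokenizer's settings.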