Comments (6)
@younesbelkada Thanks for the confirmation.
Just FYI, the difference between the original implementation and the transformers
version is in this function:
def forward(
    self,
    input_ids=None,  # <---- input_ids is the instruction for the Qformer
    position_ids=None,
    query_embeds=None,  # <---- query_embeds is the embedding of the Qformer's pretrained query_tokens
    past_key_values_length=0,
):
    if input_ids is not None:
        seq_length = input_ids.size()[1]
    else:
        seq_length = 0

    if position_ids is None:
        position_ids = self.position_ids[
            :, past_key_values_length : seq_length + past_key_values_length
        ].clone()

    if input_ids is not None:
        embeddings = self.word_embeddings(input_ids)
        if self.position_embedding_type == "absolute":
            position_embeddings = self.position_embeddings(position_ids)
            embeddings = embeddings + position_embeddings

        if query_embeds is not None:
            # <---- if input_ids exists, the final embeddings are the concatenation of both.
            # The query embedding part and the instruction embedding part are treated
            # differently in the later processing.
            embeddings = torch.cat((query_embeds, embeddings), dim=1)
    else:
        embeddings = query_embeds

    embeddings = self.LayerNorm(embeddings)
    embeddings = self.dropout(embeddings)
    return embeddings
@amyeroberts I would like to make the change. However, some unit tests break, so I will open a PR once I figure out how to fix them.
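To make the behaviour of the quoted function concrete, here is a minimal, self-contained sketch of the same embedding step. It is not the LAVIS or transformers code: the class name `QFormerEmbeddingsSketch`, the vocabulary and hidden sizes, and the tensor shapes are all hypothetical, and dropout and `past_key_values_length` are left out to keep the output deterministic. It only illustrates that passing `input_ids` together with `query_embeds` yields a sequence of query tokens followed by instruction tokens.

```python
import torch
from torch import nn


class QFormerEmbeddingsSketch(nn.Module):
    """Toy version of the embedding step quoted above (hypothetical sizes)."""

    def __init__(self, vocab_size=100, hidden_size=16, max_positions=32):
        super().__init__()
        self.word_embeddings = nn.Embedding(vocab_size, hidden_size)
        self.position_embeddings = nn.Embedding(max_positions, hidden_size)
        self.LayerNorm = nn.LayerNorm(hidden_size)
        # Fixed position ids, shape (1, max_positions).
        self.register_buffer("position_ids", torch.arange(max_positions).unsqueeze(0))

    def forward(self, input_ids=None, query_embeds=None):
        if input_ids is None:
            # No instruction: only the learned query embeddings are used.
            embeddings = query_embeds
        else:
            seq_length = input_ids.size(1)
            position_ids = self.position_ids[:, :seq_length]
            embeddings = self.word_embeddings(input_ids) + self.position_embeddings(position_ids)
            if query_embeds is not None:
                # Query tokens first, instruction tokens second.
                embeddings = torch.cat((query_embeds, embeddings), dim=1)
        return self.LayerNorm(embeddings)


torch.manual_seed(0)
module = QFormerEmbeddingsSketch()
query_embeds = torch.randn(2, 4, 16)        # (batch, num_query_tokens, hidden)
input_ids = torch.randint(0, 100, (2, 5))   # (batch, instruction_length)

queries_only = module(query_embeds=query_embeds)
with_instruction = module(input_ids=input_ids, query_embeds=query_embeds)

print(queries_only.shape)      # torch.Size([2, 4, 16])
print(with_instruction.shape)  # torch.Size([2, 9, 16])
```

Because layer normalization acts on each token independently, the first four positions of `with_instruction` in this sketch match `queries_only`; the two parts only start to interact in the attention layers that follow.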
from transformers.
I agree that this makes sense, although if we do that it won't be backward compatible, as we're going to change the way the layers are designed. I think our implementation follows the original implementation quite closely: https://github.com/salesforce/LAVIS/blob/main/lavis/models/blip2_models/Qformer.py, which also implements it that way (one needs to pass already-processed query_embeds).
Thanks for raising this issue. It is actually being addressed in #29261.
Hi @tongda, thanks for raising this issue! Would you like to open a PR with this suggestion?
Thanks for double-checking. This is clearer to me now; indeed, this could make it backward compatible (BC). Let us know when you open the PR!
Cool! I have checked the PR; it indeed includes the change that I need, and even more. Good job.