Comments (6)

tongda commented on June 25, 2024

@younesbelkada Thanks for the confirmation.
Just FYI, the difference between the original implementation and the transformers version is in this method:

    def forward(
        self,
        input_ids=None,  # <---- input_ids holds the instruction tokens for the Qformer
        position_ids=None,
        query_embeds=None,  # <---- query_embeds holds the embeddings of the Qformer's pretrained query_tokens
        past_key_values_length=0,
    ):
        if input_ids is not None:
            seq_length = input_ids.size()[1]
        else:
            seq_length = 0

        if position_ids is None:
            position_ids = self.position_ids[
                :, past_key_values_length : seq_length + past_key_values_length
            ].clone()

        if input_ids is not None:
            embeddings = self.word_embeddings(input_ids)
            if self.position_embedding_type == "absolute":
                position_embeddings = self.position_embeddings(position_ids)
                embeddings = embeddings + position_embeddings

            if query_embeds is not None:
                # <---- If input_ids is given, the final embeddings are the
                # concatenation of both. The query part and the instruction part
                # are treated differently in later processing.
                embeddings = torch.cat((query_embeds, embeddings), dim=1)
        else:
            embeddings = query_embeds

        embeddings = self.LayerNorm(embeddings)
        embeddings = self.dropout(embeddings)
        return embeddings
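
To make the concatenation order concrete, here is a minimal sketch of the same control flow. It is only an illustration under stated assumptions: plain Python lists stand in for tensors so it runs without PyTorch, there is no batch dimension, and LayerNorm and dropout are left out. The names `embed`, `select_position_ids`, `word_table`, and `pos_table` are hypothetical and do not exist in LAVIS or transformers.

```python
def select_position_ids(seq_length, past_key_values_length=0):
    """Position ids for the instruction tokens only.

    Mirrors the slice [past : seq_length + past] in the quoted forward().
    Query embeddings never receive position embeddings.
    """
    return list(range(past_key_values_length, seq_length + past_key_values_length))


def embed(input_ids=None, query_embeds=None, word_table=None, pos_table=None,
          past_key_values_length=0):
    """List-based stand-in for the quoted forward() (hypothetical helper).

    word_table maps a token id to its embedding row, pos_table maps a
    position id to its embedding row. Both are assumptions for this sketch.
    """
    if input_ids is None:
        # No instruction: the output is just the query embeddings.
        return list(query_embeds)

    position_ids = select_position_ids(len(input_ids), past_key_values_length)
    # Word embedding plus absolute position embedding, element by element.
    text_embeds = [
        [w + p for w, p in zip(word_table[tok], pos_table[pos])]
        for tok, pos in zip(input_ids, position_ids)
    ]

    if query_embeds is not None:
        # Query rows come first, instruction rows follow, as in torch.cat(..., dim=1).
        return list(query_embeds) + text_embeds
    return text_embeds
```

Because the query rows always come first, later code can recover the two parts by slicing the combined sequence at the number of query tokens.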

@amyeroberts I would like to make the change. However, some unit tests break; I will open a PR once I figure out how to fix them.

from transformers.

younesbelkada commented on June 25, 2024

I agree that this makes sense, although doing so would not be backward compatible, since it changes the way the layers are designed. I think our implementation follows the original implementation quite closely: https://github.com/salesforce/LAVIS/blob/main/lavis/models/blip2_models/Qformer.py, which also implements it that way (one needs to pass already-processed query_embeds).

NielsRogge commented on June 25, 2024

Thanks for raising this issue. It is actually being addressed in #29261.

amyeroberts commented on June 25, 2024

Hi @tongda, thanks for raising this issue! Would you like to open a PR with this suggestion?

cc @younesbelkada

younesbelkada commented on June 25, 2024

Thanks for double-checking. This is clearer to me now, and indeed this could keep it backward compatible (BC). Let us know when you open the PR!

tongda commented on June 25, 2024

Cool! I have checked the PR; it indeed includes the change I need, and even more. Good job.
