
Comments (5)

adrienchaton commented on July 17, 2024

In case others face the same issue: one should also use the generate method at inference time to compute logits.

It goes as follows:

import torch

# The "<fold2AA>" prefix tells the encoder to translate 3Di -> amino acids.
ids_backtranslation = tokenizer.batch_encode_plus(
    ["<fold2AA>" + " " + seq_3di],
    add_special_tokens=True,
    padding="longest",
    return_tensors="pt",
).to(model.device)

# +1 on the lengths, otherwise the output came out one residue short (see note below).
outputs = model.generate(
    ids_backtranslation.input_ids,
    attention_mask=ids_backtranslation.attention_mask,
    max_length=len(seq_3di) + 1,
    min_length=len(seq_3di) + 1,
    output_scores=True,
    return_dict_in_generate=True,
    repetition_penalty=repetition_penalty,
)

# outputs.scores is a tuple with one logits tensor per generated position.
logits = torch.cat(outputs.scores).cpu()

One thing that seemed off in your example: if I didn't add +1 to the expected length (here only one example), the output was one residue shorter than the length expected from the 3Di encoding.

Any corrections on what I came up with would be greatly appreciated. As a sanity check, the recovery against the sequence I computed the 3Di from is reasonable, i.e. >40%, so it does not seem buggy to me.
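For reference, each entry of outputs.scores is the logits over the vocabulary for one generated position. A minimal pure-Python sketch (synthetic numbers, no model or torch needed) of how such per-position logits map to probabilities and a per-residue confidence:

```python
import math

def softmax(logits):
    """Convert one position's logits to probabilities."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

# generate() returns one logits vector per generated token; here we
# fake two positions over a tiny 4-token vocabulary.
scores = [
    [2.0, 0.5, 0.1, -1.0],  # logits for residue 1
    [0.2, 3.0, 0.0, 0.5],   # logits for residue 2
]

probs = [softmax(pos) for pos in scores]
# Per-residue confidence = probability assigned to the argmax token.
confidences = [max(p) for p in probs]
print(confidences)
```

With real outputs you would apply the same softmax along the vocabulary dimension of the concatenated logits tensor.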

from prostt5.

mheinzinger commented on July 17, 2024

Thanks for sharing the details on how you got the scores. From what I remember, I used a similar logic at one point, so I would not immediately see anything to change.

Only one thing on the +1 offset: maybe double-check, but the decoder should not need those special prefixes which indicate the direction of translation ("<fold2AA>" etc.). Those prefixes are only added to the encoder input, to tell the model up front how to interpret its input and how to optimally embed it for the translation direction you are interested in.
That being said: I think there is a special token added to the decoder to kick off the translation (<s> if I am not mistaken), but this should get stripped off automatically when you do something like decoded_translations = tokenizer.batch_decode(translations, skip_special_tokens=True).
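The stripping mentioned here is just the skip_special_tokens=True behaviour of batch_decode. As a toy illustration of what that flag does (hypothetical 5-token vocabulary, not the real ProstT5 tokenizer):

```python
# Toy vocabulary: id -> token; ids 0 and 1 play the role of the
# special tokens (<s> / </s>) that frame the decoder output.
vocab = {0: "<s>", 1: "</s>", 2: "A", 3: "L", 4: "G"}
special_ids = {0, 1}

def toy_batch_decode(batches, skip_special_tokens=True):
    """Mimic tokenizer.batch_decode: drop special ids, join tokens."""
    out = []
    for ids in batches:
        toks = [vocab[i] for i in ids
                if not (skip_special_tokens and i in special_ids)]
        out.append("".join(toks))
    return out

translations = [[0, 2, 3, 4, 1]]  # <s> A L G </s>
print(toy_batch_decode(translations))  # the framing tokens are gone
```

The real tokenizer does the same filtering internally, which is why the decoder start token never shows up in the decoded sequences.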


mheinzinger commented on July 17, 2024

Sounds interesting, let me know in case you hit any problems along the way.
Regarding finetuning, I would recommend considering a parameter-efficient version, with which we have had good experience previously.
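One widely used parameter-efficient scheme is LoRA, which freezes a pretrained weight matrix W and learns only a low-rank update B·A. A minimal numeric sketch of that idea (synthetic 2x2 weights, not tied to any particular library or to ProstT5 itself):

```python
def matmul(A, B):
    """Naive matrix multiply for small nested-list matrices."""
    return [[sum(a * b for a, b in zip(row, col))
             for col in zip(*B)] for row in A]

# Frozen pretrained weight (never updated during finetuning).
W = [[1.0, 0.0],
     [0.0, 1.0]]

# Trainable low-rank factors with rank r = 1; for large layers this
# means training far fewer parameters than the full matrix.
B = [[0.5], [0.0]]   # shape (2, r)
A = [[1.0, 2.0]]     # shape (r, 2)
scale = 1.0          # the alpha / r scaling factor

# Effective weight used at inference: W + scale * (B @ A).
delta = matmul(B, A)
W_eff = [[w + scale * d for w, d in zip(wr, dr)]
         for wr, dr in zip(W, delta)]
print(W_eff)
```

In practice one would use a library such as peft rather than hand-rolling this, but the underlying update is the same.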


adrienchaton commented on July 17, 2024

@mheinzinger Thanks for the advice. Here I am specifically thinking about finetuning the encoder-decoder model jointly on sequence-3Di pairs, not about e.g. supervised fitness prediction with the encoder alone (as with ESM2). It could be interesting to tune the model to antibodies, for example, or to retrain from scratch, though I guess starting from the general pretrained model would still be an advantage.


adrienchaton commented on July 17, 2024

For inference it looks like it is easiest to stick to model.generate(), along with the special tokens that indicate which translation direction is expected (e.g. sequence-to-structure or structure-to-sequence).

Additionally, thanks for sharing the batch_decode method, which takes care of dropping special tokens from the output sequences.

I am considering trying to finetune ProstT5 for other protein types; I might come back with a few questions, if you don't mind!


