
Comments (6)

cwognum commented on June 13, 2024

@rohanvarm I took a bit of a different route than @HannesStark is suggesting, but with his help I got a working solution:

  1. You can adapt the QMDataset class in 3DInfomax/datasets/qm9_dataset.py. I stripped all functionality that was not needed for inference and made it possible to provide a custom list of SMILES strings. You can see my implementation of this class in my fork of this repo.
  2. Using a similar process, I adapted the load_model() function in train.py and finally created a little CLI using Click that loads a list of SMILES from a .npy file, creates a dataset and feeds the datapoints through the model. You can see all of that here.
  3. I still need to figure out how to get the fingerprints from this, but I think we should be able to change the forward() method of the PNA class to do so. I'll look into that next and can let you know if I figure it out.

Please note that this code assumes you use the provided checkpoint. For other models I might have stripped too much functionality. At the same time, I have not checked whether more code could be removed, so it could possibly be simplified further or made more efficient.
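A minimal sketch of the first step, assuming nothing about the real QMDataset beyond what is described above: an inference-only dataset that accepts a custom list of SMILES strings. The class name `SmilesInferenceDataset` and the `featurize` callable are hypothetical stand-ins; the class in the fork builds molecular graphs, which is not reproduced here. The sketch only implements `__len__` and `__getitem__`, the two methods a PyTorch map-style dataset needs.

```python
from typing import Callable, List, Sequence


class SmilesInferenceDataset:
    """Hypothetical inference-only dataset over a custom list of SMILES strings."""

    def __init__(self, smiles: Sequence[str], featurize: Callable[[str], object]):
        # Keep the inputs in order so outputs can be matched back to molecules.
        self.smiles: List[str] = list(smiles)
        self.featurize = featurize

    def __len__(self) -> int:
        return len(self.smiles)

    def __getitem__(self, idx: int):
        # No labels are returned: for inference only the model input is needed.
        return self.featurize(self.smiles[idx])


def toy_featurize(smiles: str) -> int:
    # Placeholder only: the real code would turn the SMILES into a graph.
    return len(smiles)


dataset = SmilesInferenceDataset(["CCO", "c1ccccc1"], featurize=toy_featurize)
features = [dataset[i] for i in range(len(dataset))]
```

Dropping the label handling is what makes a class like this usable for molecules that have no measured targets.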

@HannesStark We could consider merging this back into your repo and writing a bit about it in the README once it is finished? Let me know if that would interest you! (Right now my fork is too different, I think, because I restructured it a bit to my liking, but it should be easy to merge just the relevant files once they're done.)

from 3dinfomax.

HannesStark commented on June 13, 2024

Thank you @cwognum !
I will make some changes soon so that fingerprint extraction is easier!


rohanvarm commented on June 13, 2024


HannesStark commented on June 13, 2024

Hi Rohan,
You could use the uploaded GNN weights that were obtained by pre-training on GEOM-Drugs.
To do so, add a PyTorch dataset containing the molecules for which you want to generate fingerprints to the datasets directory.
Then you can use a config file like tune_QM9_homo.yml in which you set eval_on_test: True, dataset: 'ClassNameOfDataset', and num_epochs: 0, so that the model is run directly on the test molecules.


cwognum commented on June 13, 2024

but I think we should be able to change the forward() method of the PNA class to do so

This is exactly what I ended up doing. You can see the changes here. This should be a non-breaking, backward-compatible change. It could be cleaned up a bit more, but this is the gist of it! 🙂
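As an illustration of what a backward-compatible change to a forward() method can look like, here is a toy sketch in plain Python. It is not the actual PNA code: the class `ToyEncoder`, its `encode` and `readout` methods, and the `return_fingerprint` flag are all hypothetical names chosen for this example. The idea is an opt-in keyword argument whose default leaves the existing behaviour untouched.

```python
class ToyEncoder:
    """Toy stand-in for a model class; not the actual PNA implementation."""

    def encode(self, x):
        # Pretend intermediate representation (the "fingerprint").
        return [2 * v for v in x]

    def readout(self, h):
        # Pretend prediction head on top of the representation.
        return sum(h)

    def forward(self, x, return_fingerprint=False):
        h = self.encode(x)
        if return_fingerprint:
            # New, opt-in behaviour: expose the representation itself.
            return h
        # Default path is unchanged, so existing callers keep working.
        return self.readout(h)


model = ToyEncoder()
prediction = model.forward([1, 2, 3])
fingerprint = model.forward([1, 2, 3], return_fingerprint=True)
```

Because the flag defaults to False, code that already calls forward() without it gets the same result as before.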


HannesStark commented on June 13, 2024

I now made it a bit easier:
Just place your SMILES into the file dataset/inference_smiles.txt and run

python inference.py --config=configs_clean/fingerprint_inference.yml

Your fingerprints are saved as a pickle file in the dataset_directory.

And in the config file you can specify different pre-trained models if you want.
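A small sketch of reading such a pickle file back with Python's standard library. The file name `fingerprints.pkl` and the list-of-lists contents are assumptions made only so the example runs on its own; the name and structure of the real output file are not stated in this thread.

```python
import os
import pickle
import tempfile

# Hypothetical file name and contents, used only so the example is self-contained.
path = os.path.join(tempfile.mkdtemp(), "fingerprints.pkl")
with open(path, "wb") as f:
    pickle.dump([[0.1, 0.2], [0.3, 0.4]], f)

# Loading step: this is the part that would apply to the real output file.
with open(path, "rb") as f:
    fingerprints = pickle.load(f)

# Inspect the type and size before assuming anything about the layout.
print(type(fingerprints).__name__, len(fingerprints))
```

Only unpickle files from a source you trust, since loading a pickle can execute arbitrary code.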

