Comments (6)
@rohanvarm I took a bit of a different route than @HannesStark is suggesting, but with his help I got a working solution:
- You can adapt the `QMDataset` class in `3DInfomax/datasets/qm9_dataset.py`. I stripped all functionality that was not needed for inference and made it possible to provide a custom list of SMILES strings. You can see my implementation of this class in my fork of this repo.
- Using a similar process, I adapted the `load_model()` function in `train.py` and finally created a little CLI using Click that loads a list of SMILES from a `.npy` file, creates a dataset and feeds the datapoints through the model. You can see all of that here.
- I still need to figure out how to get the fingerprints from this, but I think we should be able to change the `forward()` method of the PNA class to do so. I'll look into that next and can let you know if I figure it out.
Please note that this code assumes you use the provided checkpoint. For other models I might have stripped too much functionality. At the same time, I am not 100% sure if any more code could be removed, so the code could possibly be further simplified / made more efficient.
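For readers who want the rough shape of such a stripped-down dataset, here is a minimal sketch of a map-style dataset that accepts a custom list of SMILES strings. It is not the code from the fork: `SmilesInferenceDataset` and `placeholder_featurize` are hypothetical names, and the featurizer is only a stand-in for the real SMILES-to-graph conversion. A map-style PyTorch dataset only needs `__len__` and `__getitem__`, so the sketch does not import PyTorch.

```python
from typing import Callable, List, Sequence


def placeholder_featurize(smiles: str) -> dict:
    """Hypothetical stand-in for the real SMILES-to-graph conversion."""
    return {"smiles": smiles, "n_chars": len(smiles)}


class SmilesInferenceDataset:
    """Map-style dataset over a user-supplied list of SMILES strings.

    Hypothetical sketch: it only implements __len__ and __getitem__,
    which is the protocol a map-style PyTorch dataset needs.
    """

    def __init__(self, smiles: Sequence[str],
                 featurize: Callable[[str], dict] = placeholder_featurize):
        # Strip surrounding whitespace and drop blank entries; keep input order.
        self.smiles: List[str] = [s.strip() for s in smiles if s.strip()]
        self.featurize = featurize

    def __len__(self) -> int:
        return len(self.smiles)

    def __getitem__(self, idx: int) -> dict:
        return self.featurize(self.smiles[idx])


dataset = SmilesInferenceDataset(["CCO", " c1ccccc1 ", ""])
```

Passing the featurizer in as an argument keeps the inference-only class small and lets the real graph construction be swapped in without touching the dataset logic.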
@HannesStark We could consider merging this back into your repo and writing a bit about it in the README once finished? Let me know if that would interest you! (Right now I think my fork is too different because I restructured it a bit to my liking, but it should be easy to merge just the relevant files once they're done.)
from 3dinfomax.
Thank you @cwognum!
I will make some changes soon so that the fingerprint extraction is easier!
Hi Rohan,
You could use the uploaded GNN weights that were obtained by pre-training on GEOM-Drugs.
To do so, you could add a PyTorch dataset with the molecules for which you want to generate fingerprints to the `datasets` directory.
Then you can use a config file like `tune_QM9_homo.yml` where you set `eval_on_test: True`, `dataset: 'ClassNameOfDataset'`, and `num_epochs: 0` such that you directly run the model on the test molecules.
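The three settings can be read as overrides on top of an existing config. The sketch below uses a plain dict in place of the parsed YAML; the base values are made up for illustration, and only the three overridden keys come from the comment above.

```python
# Hypothetical base values standing in for a parsed tune_QM9_homo.yml.
base_config = {"dataset": "qm9", "num_epochs": 1000, "eval_on_test": False}

# The three settings named in the comment above.
inference_overrides = {
    "eval_on_test": True,             # evaluate on the test split
    "dataset": "ClassNameOfDataset",  # placeholder for your dataset class name
    "num_epochs": 0,                  # no training epochs
}

# Later keys win, so the overrides replace the base values.
config = {**base_config, **inference_overrides}
```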
> but I think we should be able to change the `forward()` method of the PNA class to do so

This is exactly what I ended up doing. You can see the changes here. This should be a non-breaking, backward-compatible change. It could be cleaned up a bit more, but this is the gist of it!
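A non-breaking change of this kind usually means adding an optional argument whose default keeps the old behavior. The toy class below illustrates that pattern in plain Python. It is not the PNA code from the repository, and `ToyModel`, `readout`, `head`, and `return_fingerprint` are hypothetical names chosen for the illustration.

```python
from typing import List, Union


class ToyModel:
    """Hypothetical stand-in for a GNN: pool node features, then predict."""

    def readout(self, node_features: List[List[float]]) -> List[float]:
        # Mean-pool the node feature vectors into one fixed-size vector.
        n = len(node_features)
        return [sum(col) / n for col in zip(*node_features)]

    def head(self, pooled: List[float]) -> float:
        # Placeholder prediction head: sum of the pooled features.
        return sum(pooled)

    def forward(self, node_features: List[List[float]],
                return_fingerprint: bool = False) -> Union[float, List[float]]:
        pooled = self.readout(node_features)
        if return_fingerprint:
            # New code path: expose the pooled representation.
            return pooled
        # Default path is unchanged, so existing callers are unaffected.
        return self.head(pooled)


model = ToyModel()
nodes = [[1.0, 2.0], [3.0, 4.0]]
prediction = model.forward(nodes)
fingerprint = model.forward(nodes, return_fingerprint=True)
```

Existing calls such as `model.forward(nodes)` keep returning the prediction, while callers that want the intermediate representation opt in explicitly.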
I now made it a bit easier:
Just place your SMILES into the file `dataset/inference_smiles.txt` and run
`python inference.py --config=configs_clean/fingerprint_inference.yml`
Your fingerprints are saved as a pickle file into the dataset_directory.
In the config file you can also specify different pre-trained models if you want.
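The steps around that command can be scripted. The helpers below are a minimal sketch using only the standard library: `write_smiles_file` and `load_fingerprints` are hypothetical names, the one-SMILES-per-line layout is an assumption, and the name of the output pickle is not given in this thread, so it has to be looked up after the run.

```python
import pickle
from pathlib import Path
from typing import Any, Iterable


def write_smiles_file(smiles: Iterable[str], path: Path) -> None:
    """Write one SMILES string per line (assumed input format)."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text("\n".join(smiles) + "\n", encoding="utf-8")


def load_fingerprints(path: Path) -> Any:
    """Load the pickled fingerprints; only unpickle files you trust."""
    with path.open("rb") as handle:
        return pickle.load(handle)
```

A typical use would be `write_smiles_file(["CCO", "c1ccccc1"], Path("dataset/inference_smiles.txt"))`, then the `python inference.py ...` command above, then `load_fingerprints(...)` on whichever pickle file the run produced.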