rosettacommons / rfdiffusion
Code for running RFdiffusion
License: Other
Some proteins need to form a dimer to function as an enzyme. How can such enzymes be designed?
Hi, amazing repo!
Will icosahedral symmetric design (as mentioned in the preprint) also be made available on GitHub?
Thanks!
Hello!
Thank you for publishing your work! I'm currently experimenting with your software, and I have some ideas in mind, but I am not sure to what extent they are practical and possible with RFdiffusion. As I am new to the protein design field, feel free to correct any assumptions I may have made; I would greatly appreciate any guidance.
I have a target protein and a highly stable scaffold protein; however, the scaffold protein was not designed to bind to this specific target, so no actual binding motif is present. Is it possible (and effective) to use the RFdiffusion protocol, such as motif scaffolding, to redesign some parts of the existing scaffold that face the target hotspot to create a binding interface? Or would it be more effective to use partial diffusion/fold conditioned binder design to create new, but structurally similar scaffolds? If I understand correctly, the latter approach will cause loss of the binder sequence, which might lead to a possible loss of stability compared to the original scaffold.
I apologize if I am missing something about the presented pipelines.
Thank you!
On the Colab notebook, I am using the mirror image of a naturally occurring protein (an uploaded PDB file) as the template for binder generation. Generation of the poly-glycine backbone works well, but when this input is used for ProteinMPNN and AlphaFold evaluation, the outputs revert the stereochemistry of the input protein to the natural enantiomer. It's not clear whether this happens at the ProteinMPNN step or whether it is a step taken by AlphaFold. Is there a way to discern whether ProteinMPNN is assigning side chains based on the mirror-image protein input (or whether ProteinMPNN reverts the stereochemistry)?
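One way to pin down where the flip happens is to check handedness at each stage: compute the signed volume of the Cα-centered (N, C, Cβ) frame for each residue. Mirroring a structure flips this sign, so comparing the sign for the RFdiffusion input against the ProteinMPNN/AlphaFold outputs shows which step reverted the stereochemistry. A minimal sketch with illustrative (not physically accurate) coordinates:

```python
import numpy as np

def chirality_sign(n, ca, c, cb):
    """Sign of the signed volume spanned by the CA->N, CA->C, CA->CB vectors.

    A mirror reflection of the structure flips this sign, so comparing the
    sign for the input backbone against the redesigned/refolded model tells
    you whether the enantiomer was preserved.
    """
    v = np.stack([n - ca, c - ca, cb - ca])
    return float(np.sign(np.linalg.det(v)))

# Illustrative coordinates for one residue:
ca = np.array([0.0, 0.0, 0.0])
n  = np.array([1.5, 0.0, 0.0])
c  = np.array([0.0, 1.5, 0.0])
cb = np.array([0.0, 0.0, 1.5])

s = chirality_sign(n, ca, c, cb)
mirror = np.array([-1.0, 1.0, 1.0])  # reflect through the yz plane
s_mirror = chirality_sign(n * mirror, ca * mirror, c * mirror, cb * mirror)
assert s == -s_mirror  # reflection flips the chirality sign
```

Running this on the poly-glycine backbone won't work directly (glycine has no Cβ), but any designed output with side chains can be checked residue by residue.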
Any way to get this to run on multiple GPUs simultaneously?
Right now it only runs on a single GPU even when multiple are present. Any flags I might try?
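As far as I know, RFdiffusion runs each trajectory on a single GPU. A common workaround is to split inference.num_designs across independent processes, each pinned to one GPU via CUDA_VISIBLE_DEVICES and writing to its own output prefix. A sketch (the contig and output paths are placeholders):

```python
import os
import subprocess

def split_designs(total, n_gpus):
    """Split a design budget as evenly as possible across GPUs."""
    base, extra = divmod(total, n_gpus)
    return [base + (i < extra) for i in range(n_gpus)]

def launch(total_designs=100, n_gpus=4, dry_run=True):
    procs = []
    for gpu, n in enumerate(split_designs(total_designs, n_gpus)):
        cmd = [
            "./scripts/run_inference.py",
            "contigmap.contigs=[150-150]",
            f"inference.output_prefix=test_outputs/gpu{gpu}/test",  # distinct prefix per GPU
            f"inference.num_designs={n}",
        ]
        env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(gpu))  # pin to one GPU
        if dry_run:
            print(gpu, n, " ".join(cmd))
        else:
            procs.append(subprocess.Popen(cmd, env=env))
    for p in procs:
        p.wait()

launch()  # dry run: prints one command line per GPU
```

Because the trajectories are independent, this scales essentially linearly with GPU count.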
Any way to get RFdiffusion to connect an N to C terminus to form a circularized protein?
If possible, this would be phenomenally useful functionality.
Many thanks.
I've been using RFdiffusion on Colab Pro for a few weeks now, using Google's GPUs. Realistically, we want to generate hundreds to thousands of designs and then filter through them. How is this supposed to be implemented when the maximum I can do in Colab Pro is 32 designs? I'm not sure how running RFdiffusion locally on my own GPU (RTX 3060) would work out; any thoughts? I have never done any serious computing since I'm mostly at the bench, so technically I am very naive.
Thanks for RFdiffusion!
OS: CentOS Linux 7
GPU: gtx 1080
Hi! I get the following error running any of the example scripts:
RuntimeError: NVTX functions not installed. Are you sure you have a CUDA build?
When using the current SE3nv.yml I get the following versions
pytorch 1.9.1 cpu_py39hc5866cc_3 conda-forge
torchaudio 0.9.1 py39 pytorch
torchvision 0.14.1 cpu_py39h39206e8_1 conda-forge
I did a clean install running pip3 install --force-reinstall torch torchvision torchaudio
torch 2.0.0 pypi_0 pypi
torchaudio 2.0.1 pypi_0 pypi
torchvision 0.15.1 pypi_0 pypi
That seems to run every example without issue. I've run into issues before with conda installs of PyTorch when not using the most recent version. Is there a known issue keeping RFdiffusion from moving to PyTorch 2.0?
Hi there! I've been using the Google Colab version of the program and wanted to know where to enter the denoiser.noise_scale and denoiser.noise_scale_frame overrides in the code. I would also like to know how to filter i_pae results down to < 10. Thank you! Love the program, keep up the great work! :)
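On the filtering part: assuming you can collect each design's i_pae from the AF2 evaluation step into a mapping, the cutoff is a one-liner (the scores below are made up for illustration):

```python
def filter_by_ipae(scores, cutoff=10.0):
    """Keep designs whose interface PAE is below the cutoff (lower is better)."""
    return {name: ipae for name, ipae in scores.items() if ipae < cutoff}

# Hypothetical scores gathered from the AF2 evaluation step:
scores = {"design_0": 7.2, "design_1": 14.8, "design_2": 9.9}
good = filter_by_ipae(scores, cutoff=10.0)
print(sorted(good))  # ['design_0', 'design_2']
```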
Dear Authors,
Thank you for your great work! I am writing to inquire if there are any plans to release the pretraining code for both the modified RosettaFold and RF diffusion on GitHub. As someone with a keen interest in this field, I am particularly curious about this aspect and would appreciate any information or updates you could provide.
Really amazing work, firstly. I had a suggestion: enable Discussions in the GitHub repository so that interested users could discuss potential uses etc. https://docs.github.com/en/repositories/managing-your-repositorys-settings-and-features/enabling-features-for-your-repository/enabling-or-disabling-github-discussions-for-a-repository
When I am trying to run inference to yield an unconditional monomer as described in the README, I get the following error:
[2023-04-01 20:22:08,689][__main__][INFO] - Making design test_outputs/test_0
[2023-04-01 20:22:08,692][inference.model_runners][INFO] - Using contig: ['150-150']
Error executing job with overrides: ['contigmap.contigs=[150-150]', 'inference.output_prefix=test_outputs/test', 'inference.num_designs=10']
Traceback (most recent call last):
File "C:\Users\Norb\RFdiffusion\run_inference.py", line 76, in main
x_init, seq_init = sampler.sample_init()
File "C:\Users\Norb\RFdiffusion\inference\model_runners.py", line 341, in sample_init
seq_t[contig_map.hal_idx0] = seq_orig[contig_map.ref_idx0]
RuntimeError: Index put requires the source and destination dtypes match, got Long for the destination and Int for the source.
Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.
(SE3nv)
Sorry in case I am missing something basic, I am an absolute beginner. Thank you so much in advance.
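For what it's worth, the message means the destination tensor (seq_t, int64/"Long") and the source (seq_orig, int32/"Int") disagree in dtype for the indexed assignment; casting the source with .long() first makes the error go away. A minimal standalone reproduction (these are toy tensors, not RFdiffusion's actual ones):

```python
import torch

seq_t = torch.zeros(10, dtype=torch.long)      # destination: int64 ("Long")
seq_orig = torch.arange(3, dtype=torch.int32)  # source: int32 ("Int")
idx = torch.tensor([2, 5, 7])

try:
    seq_t[idx] = seq_orig       # indexed assignment requires matching dtypes
except RuntimeError as e:
    print(e)

seq_t[idx] = seq_orig.long()    # fix: cast the source to int64 first
```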
Hey! Thanks for sharing this :)
I just had the following problem—maybe I'm doing something wrong on my end—but by trying both (i) RFdiffusion/examples/design_motifscaffolding.sh and (ii) RFdiffusion/examples/design_unconditional.sh, the randomly sampled ranges only contain Gs (e.g., for the motif example: "GGGGGGGGGGGGGGGGGGEVNKIKSALLSTNKAVVSLGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG")
I've just set this up in WSL2 and had no issues during setup. I just copied the unconditional design example to quickly test if it works but I'm getting the error below. I'm not sure how to interpret this error so any help would be great!
~/RFdiffusion$ scripts/run_inference.py inference.output_prefix=example_outputs/design_unconditional 'contigmap.contigs=[100-200]' inference.num_designs=10
[2023-06-26 22:44:09,438][__main__][INFO] - Found GPU with device_name NVIDIA GeForce RTX 3060. Will run RFdiffusion on NVIDIA GeForce RTX 3060
Reading models from /home/usr/RFdiffusion/rfdiffusion/inference/../../models
[2023-06-26 22:44:09,439][rfdiffusion.inference.model_runners][INFO] - Reading checkpoint from /home/usr/RFdiffusion/rfdiffusion/inference/../../models/Base_ckpt.pt
This is inf_conf.ckpt_path
/home/usr/RFdiffusion/rfdiffusion/inference/../../models/Base_ckpt.pt
Error executing job with overrides: ['inference.output_prefix=example_outputs/design_unconditional', 'contigmap.contigs=[100-200]', 'inference.num_designs=10']
Traceback (most recent call last):
File "/home/usr/RFdiffusion/scripts/run_inference.py", line 54, in main
sampler = iu.sampler_selector(conf)
File "/home/usr/RFdiffusion/rfdiffusion/inference/utils.py", line 511, in sampler_selector
sampler = model_runners.SelfConditioning(conf)
File "/home/usr/RFdiffusion/rfdiffusion/inference/model_runners.py", line 37, in init
self.initialize(conf)
File "/home/usr/RFdiffusion/rfdiffusion/inference/model_runners.py", line 103, in initialize
self.load_checkpoint()
File "/home/usr/RFdiffusion/rfdiffusion/inference/model_runners.py", line 181, in load_checkpoint
self.ckpt = torch.load(
File "/home/usr/anaconda3/envs/SE3nv/lib/python3.9/site-packages/torch/serialization.py", line 594, in load
with _open_file_like(f, 'rb') as opened_file:
File "/home/usr/anaconda3/envs/SE3nv/lib/python3.9/site-packages/torch/serialization.py", line 230, in _open_file_like
return _open_file(name_or_buffer, mode)
File "/home/usr/anaconda3/envs/SE3nv/lib/python3.9/site-packages/torch/serialization.py", line 211, in init
super(_open_file, self).init(open(name, mode))
FileNotFoundError: [Errno 2] No such file or directory: '/home/usr/RFdiffusion/rfdiffusion/inference/../../models/Base_ckpt.pt'
Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.
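For anyone hitting this: the FileNotFoundError just means the model weights were never downloaded into the models/ directory; the download step in the README has to be run first. A small pre-flight check (the names listed are only the two checkpoints mentioned in these threads; see the README for the full set):

```python
from pathlib import Path

def missing_checkpoints(model_dir, names=("Base_ckpt.pt", "Complex_beta_ckpt.pt")):
    """Return the expected checkpoint files that are absent from model_dir."""
    model_dir = Path(model_dir)
    return [n for n in names if not (model_dir / n).is_file()]

missing = missing_checkpoints("models")
if missing:
    print("Download these into models/ (see the README):", missing)
```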
Thanks a lot for making the RFDiffusion project available! I am trying to wrap my head around what is needed to get the whole design workflow set up locally.
RFdiffusion as described here only seems to output "poly-glycine" PDBs, so we still need to run ProteinMPNN and AF2 filtering on all candidate solutions. The Colab version of RFdiffusion seems to perform these steps through a call to colabdesign/rf/designability_test.py. However, that script doesn't seem to exist in either this repo or the colabdesign/rf one.
Could you please add this script to this repo here so that one can really reproduce the workflow described in your paper?
Hello, how can I solve the following error?
conda error.txt
Hi,
For the past week, whenever I try to run partial diffusion, I get the following error:
FileNotFoundError: [Errno 2] No such file or directory: 'outputs/traj/test_0_pX0_traj.pdb'
I attached a picture with my input. After setting those, I only did Runtime -> Run All.
Could you help me to solve this problem?
Thank you!
Is it possible to set deterministic in conf/inference/base.yaml to True and set its seed value from a command-line argument?
Lines 42 to 43 in 1a39202
Or is it preferable to simply set deterministic to True and a large value of inference.num_designs?
But in that case, it will take longer to get the result because it will be executed sequentially instead of in parallel.
Is there a big difference in output between running in parallel with many seeds and inference.num_designs = 1, versus running sequentially with a large value of inference.num_designs?
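Setting aside RFdiffusion's exact seeding machinery (I haven't verified what the deterministic flag wires up), the underlying principle is just explicit seeding: with a fixed seed a run reproduces the same noise draws, so N parallel runs with inference.num_designs=1 and N distinct seeds sample the same distribution as one sequential run of N designs; only wall-clock time differs. A stdlib sketch of the principle:

```python
import random

def noise_draw(seed, n=5):
    """Draw pseudo-random noise reproducibly from an explicit seed."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]

assert noise_draw(1234) == noise_draw(1234)   # same seed -> same trajectory noise
assert noise_draw(1234) != noise_draw(5678)   # different seeds -> independent designs
```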
I found out, after several failed attempts to run the inference script, that on Windows it will not work if one uses
'contigmap.contigs=[B1-100/0 100-100]'
or 'ppi.hotspot_res=[A30,A33,A34]'.
One needs to use double quotes (") rather than single quotes (') around the arguments; otherwise the arguments are not parsed correctly and an error is raised.
E.g. 'contigmap.contigs=[B1-100/0 100-100]' becomes "contigmap.contigs=[B1-100/0 100-100]"
Not sure if this is because the default language of my Windows is not English.
The Colab notebook is great, but Google Colab has recently been particularly unreliable on free accounts. Would it be possible for someone to translate the Colab notebook into a Jupyter notebook?
Today I got the following error in Google Colab when running a diffusion; last Friday it was working well:
[Errno 2] No such file or directory: 'outputs/traj/test_0_pX0_traj.pdb'
Dear all, please help me with this error. Thank you very much.
./scripts/run_inference.py 'contigmap.contigs=[150-150]' inference.output_prefix=test_outputs/test inference.num_designs=10
[2023-06-04 19:49:19,789][__main__][INFO] - Found GPU with device_name NVIDIA GeForce RTX 3090. Will run RFdiffusion on NVIDIA GeForce RTX 3090
Reading models from /home/hesong/local/RFdiffusion/rfdiffusion/inference/../../models
[2023-06-04 19:49:19,790][rfdiffusion.inference.model_runners][INFO] - Reading checkpoint from /home/hesong/local/RFdiffusion/rfdiffusion/inference/../../models/Base_ckpt.pt
This is inf_conf.ckpt_path
/home/hesong/local/RFdiffusion/rfdiffusion/inference/../../models/Base_ckpt.pt
Assembling -model, -diffuser and -preprocess configs from checkpoint
USING MODEL CONFIG: self._conf[model][n_extra_block] = 4
USING MODEL CONFIG: self._conf[model][n_main_block] = 32
USING MODEL CONFIG: self._conf[model][n_ref_block] = 4
USING MODEL CONFIG: self._conf[model][d_msa] = 256
USING MODEL CONFIG: self._conf[model][d_msa_full] = 64
USING MODEL CONFIG: self._conf[model][d_pair] = 128
USING MODEL CONFIG: self._conf[model][d_templ] = 64
USING MODEL CONFIG: self._conf[model][n_head_msa] = 8
USING MODEL CONFIG: self._conf[model][n_head_pair] = 4
USING MODEL CONFIG: self._conf[model][n_head_templ] = 4
USING MODEL CONFIG: self._conf[model][d_hidden] = 32
USING MODEL CONFIG: self._conf[model][d_hidden_templ] = 32
USING MODEL CONFIG: self._conf[model][p_drop] = 0.15
USING MODEL CONFIG: self._conf[model][SE3_param_full] = {'num_layers': 1, 'num_channels': 32, 'num_degrees': 2, 'n_heads': 4, 'div': 4, 'l0_in_features': 8, 'l0_out_features': 8, 'l1_in_features': 3, 'l1_out_features': 2, 'num_edge_features': 32}
USING MODEL CONFIG: self._conf[model][SE3_param_topk] = {'num_layers': 1, 'num_channels': 32, 'num_degrees': 2, 'n_heads': 4, 'div': 4, 'l0_in_features': 64, 'l0_out_features': 64, 'l1_in_features': 3, 'l1_out_features': 2, 'num_edge_features': 64}
USING MODEL CONFIG: self._conf[model][freeze_track_motif] = False
USING MODEL CONFIG: self._conf[model][use_motif_timestep] = True
USING MODEL CONFIG: self._conf[diffuser][T] = 50
USING MODEL CONFIG: self._conf[diffuser][b_0] = 0.01
USING MODEL CONFIG: self._conf[diffuser][b_T] = 0.07
USING MODEL CONFIG: self._conf[diffuser][schedule_type] = linear
USING MODEL CONFIG: self._conf[diffuser][so3_type] = igso3
USING MODEL CONFIG: self._conf[diffuser][crd_scale] = 0.25
USING MODEL CONFIG: self._conf[diffuser][so3_schedule_type] = linear
USING MODEL CONFIG: self._conf[diffuser][min_b] = 1.5
USING MODEL CONFIG: self._conf[diffuser][max_b] = 2.5
USING MODEL CONFIG: self._conf[diffuser][min_sigma] = 0.02
USING MODEL CONFIG: self._conf[diffuser][max_sigma] = 1.5
USING MODEL CONFIG: self._conf[preprocess][sidechain_input] = False
USING MODEL CONFIG: self._conf[preprocess][motif_sidechain_input] = True
USING MODEL CONFIG: self._conf[preprocess][d_t1d] = 22
USING MODEL CONFIG: self._conf[preprocess][d_t2d] = 44
USING MODEL CONFIG: self._conf[preprocess][prob_self_cond] = 0.5
USING MODEL CONFIG: self._conf[preprocess][str_self_cond] = True
USING MODEL CONFIG: self._conf[preprocess][predict_previous] = False
[2023-06-04 19:49:22,270][rfdiffusion.inference.model_runners][INFO] - Loading checkpoint.
[2023-06-04 19:49:24,666][rfdiffusion.diffusion][INFO] - Using cached IGSO3.
Error executing job with overrides: ['contigmap.contigs=[150-150]', 'inference.output_prefix=test_outputs/test', 'inference.num_designs=10']
Traceback (most recent call last):
File "/home/hesong/local/RFdiffusion/./scripts/run_inference.py", line 54, in main
sampler = iu.sampler_selector(conf)
File "/home/hesong/local/RFdiffusion/rfdiffusion/inference/utils.py", line 511, in sampler_selector
sampler = model_runners.SelfConditioning(conf)
File "/home/hesong/local/RFdiffusion/rfdiffusion/inference/model_runners.py", line 37, in init
self.initialize(conf)
File "/home/hesong/local/RFdiffusion/rfdiffusion/inference/model_runners.py", line 130, in initialize
self.diffuser = Diffuser(**self._conf.diffuser, cache_dir=schedule_directory)
File "/home/hesong/local/RFdiffusion/rfdiffusion/diffusion.py", line 582, in init
self.so3_diffuser = IGSO3(
File "/home/hesong/local/RFdiffusion/rfdiffusion/diffusion.py", line 198, in init
self.igso3_vals = self._calc_igso3_vals(L=L)
File "/home/hesong/local/RFdiffusion/rfdiffusion/diffusion.py", line 233, in _calc_igso3_vals
igso3_vals = read_pkl(cache_fname)
File "/home/hesong/local/RFdiffusion/rfdiffusion/diffusion.py", line 144, in read_pkl
raise (e)
File "/home/hesong/local/RFdiffusion/rfdiffusion/diffusion.py", line 140, in read_pkl
return pickle.load(handle)
EOFError: Ran out of input
Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.
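For reference, `EOFError: Ran out of input` from pickle.load almost always means the cached file is empty or truncated (for instance, a previous run was killed mid-write). Deleting the stale schedule cache so it gets regenerated usually resolves this. A hedged sketch (the cache location is an assumption; check where your install writes its schedule .pkl files):

```python
from pathlib import Path
import pickle

def prune_bad_caches(cache_dir):
    """Delete cached schedule pickles that are empty or unreadable."""
    removed = []
    for pkl in Path(cache_dir).glob("*.pkl"):
        try:
            with open(pkl, "rb") as fh:
                pickle.load(fh)
        except (EOFError, pickle.UnpicklingError):
            pkl.unlink()  # stale/truncated cache; it will be recomputed on the next run
            removed.append(pkl.name)
    return removed
```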
Hi
Thank you for providing this excellent tool for protein design.
I have a question: is it possible to use RFdiffusion to generate a small-molecule-binding protein, assuming there is a designed pocket (only separate amino acids that have good interactions with the molecule)?
Thanks
Hi all,
May I check whether it is possible to run the code in this repo on an Intel MacBook without an Nvidia GPU? I installed PyTorch, but this error keeps coming up:
I installed PyTorch like this:
I searched Nvidia's website for CUDA toolkit 11.1, but it seems there isn't an option for Mac.
If it is possible, may I know how I can install the missing packages?
Greatly appreciate any help! Thank you!
Hi RFdiffusion team, thank you for this great project and taking the time to make it public and write comprehensive documentation!
I was wondering if it is possible to provide multiple sequence ranges to provide_seq when doing partial diffusion? For instance, the following does not throw an error:
'contigmap.provide_seq=[0-383,498-580,692-821]'
However only AAs 0-383 appear to be unmasked. Am I missing something?
Thank you!
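If only the first range ends up unmasked, the override string may be getting truncated at the first comma somewhere in parsing. As a sanity check on the intended behavior, here is an illustrative parser (not RFdiffusion's actual code) that expands a provide_seq-style string into a per-residue mask:

```python
def seq_mask(ranges_str, length):
    """Turn '0-383,498-580' into a boolean keep-sequence mask (inclusive ranges)."""
    mask = [False] * length
    for part in ranges_str.split(","):
        start, end = (int(x) for x in part.split("-"))
        for i in range(start, end + 1):
            mask[i] = True
    return mask

mask = seq_mask("0-3,6-8", 10)
print([i for i, m in enumerate(mask) if m])  # [0, 1, 2, 3, 6, 7, 8]
```

With all three ranges handled this way, residues 0-383, 498-580, and 692-821 would all stay unmasked; comparing against the actual output can confirm whether the later ranges are being dropped.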
Hi, in the 'Generation of Symmetric Oligomers' section,
I saw the command is:
"./scripts/run_inference.py --config-name symmetry inference.symmetry=tetrahedral 'contigmap.contigs=[360]' inference.output_prefix=test_sample/tetrahedral inference.num_designs=1"
, but an error is reported when running it:
File ".../RFdiffusion/rfdiffusion/contigs.py", line 137, in get_sampled_mask
contig_list = self.contigs[0].strip().split()
AttributeError: 'int' object has no attribute 'strip'
The symmetry.yaml in config/inference is set as follows:
contigmap:
  # Specify a single integer value to sample unconditionally.
  # Must be evenly divisible by the number of chains in the symmetry.
  contigs: ['100']
So, is there a problem in this part?
Finally, please standardize the content of the README.md; there seem to be some differences between the README.md and the code. Thanks!
I've tried to use the cyclic symmetric mode to generate a set of symmetric loops between symmetric structural units (e.g. loops to join together helices in a helical bundle with cyclic symmetry).
However, while this kind of operation works for generating non-symmetric loops, when switching to symmetry it always assumes chain breaks after the newly generated loops, so they are never positioned to actually join, e.g., the helices of the symmetric helical bundle.
Based on the text in the nickel design example, which says that chain breaks don't strictly need to be given in the contigs when using symmetry, it seems this might be a known limitation of the symmetry mode. Is this true? Is it possible to circumvent this limitation?
This is really amazing!
Not sure what I am doing wrong, but conda is telling me that it found some conflicts when creating the environment. So I removed the version constraints, changed the channel from defaults to conda-forge, and that solved my issue:
name: SE3nv
channels:
  - conda-forge
  - pytorch
  - dglteam
dependencies:
  - python
  - pytorch
  - torchaudio
  - torchvision
  - cudatoolkit
  - dgl-cuda11.1
  - pip
  - pip:
      - hydra-core
      - pyrsistent
      - icecream
Not sure if this is the way to go either, but it seems to have worked for me! And all the examples I have tried so far run smoothly.
Computer config:
... again amazing!
Hello! I'm having a lot of fun playing around with this!
One feature that would be convenient to have is a constant directory where all of the diffuser's pre-computed schedules live. Right now the inference code recomputes the schedule pkl if it doesn't exist in ./. To avoid recomputing whenever you're running from a different directory, a constant cache directory would ensure that each schedule is only computed once.
I've implemented a simple version of this by replacing line 116 in inference/model_runners.py with
self.diffuser = Diffuser(**self._conf.diffuser, cache_dir=f'{SCRIPT_DIR}/../schedules')
instead of
self.diffuser = Diffuser(**self._conf.diffuser)
This assumes that there is a schedules directory in the RFdiffusion folder where all pre-computed schedules will live. The only downside to this simple approach is that the cache directory can't be overridden (but this is technically consistent with the model checkpoints).
Hi,
I've noticed that when using a symmetric mode (cyclic, dihedral, etc), it's not possible to supply a length range for the new diffused regions. A single value must be used, otherwise 'ValueError: Sequence length must be divisble by n' is returned.
I'm guessing it's because the lengths of the diffused regions in each chain/monomer aren't tied, so the total sequence length usually ends up indivisible by the oligomeric state?
Is this something that might be possible in future updates?
Thanks!
Ali
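Until length ranges are supported in the symmetric modes, one workaround is to precompute which total lengths in the desired range satisfy the divisibility constraint and launch one run per valid length (each passed as a single-value contig, e.g. 'contigmap.contigs=[102-102]'):

```python
def valid_lengths(lo, hi, n):
    """Total sequence lengths in [lo, hi] compatible with n-fold symmetry."""
    return [L for L in range(lo, hi + 1) if L % n == 0]

lengths = valid_lengths(100, 130, 3)
print(lengths)  # [102, 105, 108, 111, 114, 117, 120, 123, 126, 129]
```

This recovers length diversity across designs even though each individual run uses a fixed length.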
Dear developers, I am repeating the analysis of "Design of C3-symmetric oligomers to scaffold the binding interface of the designed ACE2 mimic", but I found that RFdiffusion may change the orientation of the motif protomer.
Data link: https://drive.google.com/drive/folders/19BZTqTx-uKEjVqGp06Ez2zufb7hgva-q?usp=share_link
The file 7uhc.pdb, which I have centered, can be accessed from this link. I used Chimera to verify that it is C3-symmetric along the z axis:
open 7uhc.pdb
delete #0:.B #0:.C #0:.E #0:.F
sym #0 group C3 axis z
open 7uhc.pdb
Then I used RFdiffusion to design the C3-symmetric oligomers with the following command:
run_inference.py \
inference.symmetry=C3 \
inference.num_designs=1 \
inference.output_prefix=Spike_Symmetric_PPI/1_structure_design/Spike_Symmetric_PPI_1.0_0.1_Base \
'potentials.guiding_potentials=["type:olig_contacts,weight_intra:1.0,weight_inter:0.1"]' \
potentials.olig_intra_all=True \
potentials.olig_inter_all=True \
potentials.guide_scale=2 \
diffuser.T=50 \
potentials.guide_decay=quadratic \
inference.input_pdb=7uhc.pdb \
'contigmap.contigs=[D1-55/120/0 E1-55/120/0 F1-55/120/0]' \
inference.ckpt_override_path=models/Base_ckpt.pt
The output file can be accessed from the same link. I opened 7uhc.pdb and Spike_Symmetric_PPI_1.0_0.1_Base_0.pdb with Chimera:
open 7uhc.pdb
open Spike_Symmetric_PPI_1.0_0.1_Base_0.pdb
delete #0:.A #0:.B #0:.C
display @CA
~ribbon
Then I aligned model #1 to model #0:
mm #0 #1
Only one of the motifs matches perfectly.
After building RFdiffusion's Docker container and pulling it to our HPC, I ran the following test using Singularity:
singularity run --env TF_FORCE_UNIFIED_MEMORY=1,XLA_PYTHON_CLIENT_MEM_FRACTION=4.0,OPENMM_CPU_THREADS=10,HYDRA_FULL_ERROR=1 \
-B $HOME/outputs,$HOME/models,$HOME/inputs \
--pwd /app/RFdiffusion \
--nv $HOME/rfdiffusion/rfdiffusion_v1.1.0.sif \
inference.output_prefix=$HOME/outputs/motifscaffolding \
inference.model_directory_path=$HOME/models \
inference.input_pdb=$HOME/inputs/5TPN.pdb \
inference.num_designs=3 \
'contigmap.contigs=[10-40/A163-181/10-40]'
However, I got errors as follows:
Traceback (most recent call last):
File "/usr/lib/python3.9/pathlib.py", line 1313, in mkdir
self._accessor.mkdir(self, mode)
FileNotFoundError: [Errno 2] No such file or directory: 'outputs/2023-06-24/15-54-50'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib/python3.9/pathlib.py", line 1313, in mkdir
self._accessor.mkdir(self, mode)
FileNotFoundError: [Errno 2] No such file or directory: 'outputs/2023-06-24'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/app/RFdiffusion/scripts/run_inference.py", line 194, in <module>
main()
File "/usr/local/lib/python3.9/dist-packages/hydra/main.py", line 94, in decorated_main
_run_hydra(
File "/usr/local/lib/python3.9/dist-packages/hydra/_internal/utils.py", line 394, in _run_hydra
_run_app(
File "/usr/local/lib/python3.9/dist-packages/hydra/_internal/utils.py", line 457, in _run_app
run_and_report(
File "/usr/local/lib/python3.9/dist-packages/hydra/_internal/utils.py", line 223, in run_and_report
raise ex
File "/usr/local/lib/python3.9/dist-packages/hydra/_internal/utils.py", line 220, in run_and_report
return func()
File "/usr/local/lib/python3.9/dist-packages/hydra/_internal/utils.py", line 458, in <lambda>
lambda: hydra.run(
File "/usr/local/lib/python3.9/dist-packages/hydra/_internal/hydra.py", line 119, in run
ret = run_job(
File "/usr/local/lib/python3.9/dist-packages/hydra/core/utils.py", line 146, in run_job
Path(str(output_dir)).mkdir(parents=True, exist_ok=True)
File "/usr/lib/python3.9/pathlib.py", line 1317, in mkdir
self.parent.mkdir(parents=True, exist_ok=True)
File "/usr/lib/python3.9/pathlib.py", line 1317, in mkdir
self.parent.mkdir(parents=True, exist_ok=True)
File "/usr/lib/python3.9/pathlib.py", line 1313, in mkdir
self._accessor.mkdir(self, mode)
OSError: [Errno 30] Read-only file system: 'outputs'
Can you please let me know what I did incorrectly?
Thanks a lot in advance.
First of all, thanks to the RFDiffusion team for this tool and making it open source! I'm excited to use it!
I've begun the installation process and tried to create the conda environment from the SE3nv.yml file with conda env create -f env/SE3nv.yml, but I get the following error:
ResolvePackageNotFound:
- icecream
- cudatoolkit=11.1
I resolved this issue by adding - nvidia to the channels and moving icecream to the pip installs. The final file looks like this:
name: SE3nv
channels:
  - defaults
  - pytorch
  - dglteam
  - nvidia
dependencies:
  - python=3.9
  - pytorch=1.9
  - torchaudio
  - torchvision
  - cudatoolkit=11.1
  - dgl-cuda11.1
  - pip
  - pip:
      - icecream
      - hydra-core
      - pyrsistent
Not sure if this is the best way to handle creating the environment, but it seems to have worked for me!
First, thank you to the authors for releasing this code and model!
When I run RFDiffusion for binder design, the output .pdbs show the binder as polyglycine (expected) and the target protein with the original sequence (also expected). However, when you look at the structure, the target protein no longer has side chains, but only the backbone atoms are preserved (not what I expected). Is this a problem if I intend to use these .pdbs as inputs to ProteinMPNN? Or should I take the RFDiffusion backbone and make a new pdb with the original target structure with sidechains and all?
Hi, RF diffusion team
This is great work.
I am trying to make a binder to a beta sheet, and I tried using Complex_beta_ckpt.pt.
The results showed that the binder sequences were GGGGGGGG....
What should I do to solve this problem?
My command is: python ${script_path}/run_inference.py inference.output_prefix=out/design_ppi inference.input_pdb=input/target.pdb 'contigmap.contigs=[C1-62/0 50-79]' 'ppi.hotspot_res=[C2,C3,C14,C15,C16,C17,C18,C20]' inference.ckpt_override_path=${script_path}/models/Complex_beta_ckpt.pt inference.num_designs=10 denoiser.noise_scale_ca=0 denoiser.noise_scale_frame=0
Thank you.
Is there any estimate for when RFdiffusion will be extended to allow design of ligand-protein interactions?
Really awesome program so far!
Thanks!
Trying to use the tool in WSL2 with my RTX 4090. The Windows version doesn't work (see issue #13).
Everything loads fine, but then I see an error:
(SE3nv) pavel@Gigabyte-PC:~/RFdiffusion$ python scripts/run_inference.py inference.model_directory_path=/mnt/d/Models/RFdiffusion 'contigmap.contigs=[150-150]' inference.output_prefix=test_outputs/test inference.num_designs=10
Reading models from /mnt/d/Models/RFdiffusion
[2023-04-06 15:59:06,421][rfdiffusion.inference.model_runners][INFO] - Reading checkpoint from /mnt/d/Models/RFdiffusion/Base_ckpt.pt
This is inf_conf.ckpt_path
/mnt/d/Models/RFdiffusion/Base_ckpt.pt
Assembling -model, -diffuser and -preprocess configs from checkpoint
USING MODEL CONFIG: self._conf[model][n_extra_block] = 4
USING MODEL CONFIG: self._conf[model][n_main_block] = 32
USING MODEL CONFIG: self._conf[model][n_ref_block] = 4
USING MODEL CONFIG: self._conf[model][d_msa] = 256
USING MODEL CONFIG: self._conf[model][d_msa_full] = 64
USING MODEL CONFIG: self._conf[model][d_pair] = 128
USING MODEL CONFIG: self._conf[model][d_templ] = 64
USING MODEL CONFIG: self._conf[model][n_head_msa] = 8
USING MODEL CONFIG: self._conf[model][n_head_pair] = 4
USING MODEL CONFIG: self._conf[model][n_head_templ] = 4
USING MODEL CONFIG: self._conf[model][d_hidden] = 32
USING MODEL CONFIG: self._conf[model][d_hidden_templ] = 32
USING MODEL CONFIG: self._conf[model][p_drop] = 0.15
USING MODEL CONFIG: self._conf[model][SE3_param_full] = {'num_layers': 1, 'num_channels': 32, 'num_degrees': 2, 'n_heads': 4, 'div': 4, 'l0_in_features': 8, 'l0_out_features': 8, 'l1_in_features': 3, 'l1_out_features': 2, 'num_edge_features': 32}
USING MODEL CONFIG: self._conf[model][SE3_param_topk] = {'num_layers': 1, 'num_channels': 32, 'num_degrees': 2, 'n_heads': 4, 'div': 4, 'l0_in_features': 64, 'l0_out_features': 64, 'l1_in_features': 3, 'l1_out_features': 2, 'num_edge_features': 64}
USING MODEL CONFIG: self._conf[model][d_time_emb] = 0
USING MODEL CONFIG: self._conf[model][d_time_emb_proj] = 10
USING MODEL CONFIG: self._conf[model][freeze_track_motif] = False
USING MODEL CONFIG: self._conf[model][use_motif_timestep] = True
USING MODEL CONFIG: self._conf[diffuser][T] = 50
USING MODEL CONFIG: self._conf[diffuser][b_0] = 0.01
USING MODEL CONFIG: self._conf[diffuser][b_T] = 0.07
USING MODEL CONFIG: self._conf[diffuser][schedule_type] = linear
USING MODEL CONFIG: self._conf[diffuser][so3_type] = igso3
USING MODEL CONFIG: self._conf[diffuser][crd_scale] = 0.25
USING MODEL CONFIG: self._conf[diffuser][so3_schedule_type] = linear
USING MODEL CONFIG: self._conf[diffuser][min_b] = 1.5
USING MODEL CONFIG: self._conf[diffuser][max_b] = 2.5
USING MODEL CONFIG: self._conf[diffuser][min_sigma] = 0.02
USING MODEL CONFIG: self._conf[diffuser][max_sigma] = 1.5
USING MODEL CONFIG: self._conf[preprocess][sidechain_input] = False
USING MODEL CONFIG: self._conf[preprocess][motif_sidechain_input] = True
USING MODEL CONFIG: self._conf[preprocess][d_t1d] = 22
USING MODEL CONFIG: self._conf[preprocess][d_t2d] = 44
USING MODEL CONFIG: self._conf[preprocess][prob_self_cond] = 0.5
USING MODEL CONFIG: self._conf[preprocess][str_self_cond] = True
USING MODEL CONFIG: self._conf[preprocess][predict_previous] = False
[2023-04-06 15:59:10,778][rfdiffusion.inference.model_runners][INFO] - Loading checkpoint.
[2023-04-06 15:59:13,459][rfdiffusion.diffusion][INFO] - Calculating IGSO3.
Successful diffuser __init__
[2023-04-06 15:59:17,256][__main__][INFO] - Making design test_outputs/test_0
[2023-04-06 15:59:17,260][rfdiffusion.inference.model_runners][INFO] - Using contig: ['150-150']
With this beta schedule (linear schedule, beta_0 = 0.04, beta_T = 0.28), alpha_bar_T = 0.00013696048699785024
[2023-04-06 15:59:17,271][rfdiffusion.inference.model_runners][INFO] - Sequence init: ------------------------------------------------------------------------------------------------------------------------------------------------------
Error executing job with overrides: ['inference.model_directory_path=/mnt/d/Models/RFdiffusion', 'contigmap.contigs=[150-150]', 'inference.output_prefix=test_outputs/test', 'inference.num_designs=10']
Traceback (most recent call last):
File "/home/pavel/RFdiffusion/scripts/run_inference.py", line 85, in main
px0, x_t, seq_t, plddt = sampler.sample_step(
File "/home/pavel/RFdiffusion/rfdiffusion/inference/model_runners.py", line 665, in sample_step
msa_prev, pair_prev, px0, state_prev, alpha, logits, plddt = self.model(msa_masked,
File "/home/pavel/.local/share/miniconda3/envs/SE3nv/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/home/pavel/RFdiffusion/rfdiffusion/RoseTTAFoldModel.py", line 114, in forward
msa, pair, R, T, alpha_s, state = self.simulator(seq, msa_latent, msa_full, pair, xyz[:,:,:3],
File "/home/pavel/.local/share/miniconda3/envs/SE3nv/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/home/pavel/RFdiffusion/rfdiffusion/Track_module.py", line 420, in forward
msa_full, pair, R_in, T_in, state, alpha = self.extra_block[i_m](msa_full,
File "/home/pavel/.local/share/miniconda3/envs/SE3nv/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/home/pavel/RFdiffusion/rfdiffusion/Track_module.py", line 332, in forward
R, T, state, alpha = self.str2str(msa, pair, R_in, T_in, xyz, state, idx, motif_mask=motif_mask, top_k=0)
File "/home/pavel/.local/share/miniconda3/envs/SE3nv/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/home/pavel/.local/share/miniconda3/envs/SE3nv/lib/python3.9/site-packages/torch/cuda/amp/autocast_mode.py", line 141, in decorate_autocast
return func(*args, **kwargs)
File "/home/pavel/RFdiffusion/rfdiffusion/Track_module.py", line 266, in forward
shift = self.se3(G, node.reshape(B*L, -1, 1), l1_feats, edge_feats)
File "/home/pavel/.local/share/miniconda3/envs/SE3nv/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/home/pavel/RFdiffusion/rfdiffusion/SE3_network.py", line 83, in forward
return self.se3(G, node_features, edge_features)
File "/home/pavel/.local/share/miniconda3/envs/SE3nv/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/home/pavel/.local/share/miniconda3/envs/SE3nv/lib/python3.9/site-packages/se3_transformer-1.0.0-py3.9.egg/se3_transformer/model/transformer.py", line 140, in forward
basis = basis or get_basis(graph.edata['rel_pos'], max_degree=self.max_degree, compute_gradients=False,
File "/home/pavel/.local/share/miniconda3/envs/SE3nv/lib/python3.9/site-packages/se3_transformer-1.0.0-py3.9.egg/se3_transformer/model/basis.py", line 167, in get_basis
spherical_harmonics = get_spherical_harmonics(relative_pos, max_degree)
File "/home/pavel/.local/share/miniconda3/envs/SE3nv/lib/python3.9/site-packages/se3_transformer-1.0.0-py3.9.egg/se3_transformer/model/basis.py", line 58, in get_spherical_harmonics
sh = o3.spherical_harmonics(all_degrees, relative_pos, normalize=True)
File "/home/pavel/.local/share/miniconda3/envs/SE3nv/lib/python3.9/site-packages/e3nn/o3/_spherical_harmonics.py", line 180, in spherical_harmonics
return sh(x)
File "/home/pavel/.local/share/miniconda3/envs/SE3nv/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/home/pavel/.local/share/miniconda3/envs/SE3nv/lib/python3.9/site-packages/e3nn/o3/_spherical_harmonics.py", line 82, in forward
sh = _spherical_harmonics(self._lmax, x[..., 0], x[..., 1], x[..., 2])
RuntimeError: nvrtc: error: invalid value for --gpu-architecture (-arch)
nvrtc compilation failed:
#define NAN __int_as_float(0x7fffffff)
#define POS_INFINITY __int_as_float(0x7f800000)
#define NEG_INFINITY __int_as_float(0xff800000)
template<typename T>
__device__ T maximum(T a, T b) {
return isnan(a) ? a : (a > b ? a : b);
}
template<typename T>
__device__ T minimum(T a, T b) {
return isnan(a) ? a : (a < b ? a : b);
}
extern "C" __global__
void fused_pow_pow_pow_su_9196483836509741110(float* tz_1, float* ty_1, float* tx_1, float* aten_mul, float* aten_mul_1, float* aten_mul_2, float* aten_sub, float* aten_add, float* aten_mul_3, float* aten_pow) {
{
if (512 * blockIdx.x + threadIdx.x<22350 ? 1 : 0) {
float ty_1_1 = __ldg(ty_1 + 3 * (512 * blockIdx.x + threadIdx.x));
aten_pow[512 * blockIdx.x + threadIdx.x] = ty_1_1 * ty_1_1;
float tz_1_1 = __ldg(tz_1 + 3 * (512 * blockIdx.x + threadIdx.x));
float tx_1_1 = __ldg(tx_1 + 3 * (512 * blockIdx.x + threadIdx.x));
aten_mul_3[512 * blockIdx.x + threadIdx.x] = (float)((double)(tz_1_1 * tz_1_1 - tx_1_1 * tx_1_1) * 0.8660254037844386);
aten_add[512 * blockIdx.x + threadIdx.x] = tx_1_1 * tx_1_1 + tz_1_1 * tz_1_1;
aten_sub[512 * blockIdx.x + threadIdx.x] = ty_1_1 * ty_1_1 - (float)((double)(tx_1_1 * tx_1_1 + tz_1_1 * tz_1_1) * 0.5);
aten_mul_2[512 * blockIdx.x + threadIdx.x] = (float)((double)(ty_1_1) * 1.732050807568877) * tz_1_1;
aten_mul_1[512 * blockIdx.x + threadIdx.x] = (float)((double)(tx_1_1) * 1.732050807568877) * ty_1_1;
aten_mul[512 * blockIdx.x + threadIdx.x] = (float)((double)(tx_1_1) * 1.732050807568877) * tz_1_1;
}
}
}
Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.
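For anyone hitting the nvrtc `invalid value for --gpu-architecture` error above, one plausible cause (an assumption, not a confirmed diagnosis) is that the installed PyTorch build's bundled NVRTC was not compiled for this GPU's compute capability. A minimal stdlib-guarded sketch to compare the device's capability against the architectures the torch build supports:

```python
# Hedged diagnostic sketch: report whether the current torch build was compiled
# for this GPU's architecture. Guarded so it also runs without torch installed.
import importlib.util

def arch_report():
    if importlib.util.find_spec("torch") is None:
        return "torch not installed"
    import torch
    if not torch.cuda.is_available():
        return "CUDA not available"
    cap = torch.cuda.get_device_capability(0)   # e.g. (8, 6) for an RTX 30xx
    built = torch.cuda.get_arch_list()          # e.g. ['sm_37', ..., 'sm_86']
    wanted = f"sm_{cap[0]}{cap[1]}"
    status = "supported" if wanted in built else "NOT in this build; consider upgrading torch/CUDA"
    return f"device needs {wanted}; build has {built}; {status}"

print(arch_report())
```

If the device's `sm_XX` is missing from `get_arch_list()`, upgrading to a PyTorch build matching a newer CUDA toolkit is the usual remedy.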
How can I perform partial diffusion for all residues except three (the residues of an active site)? I tried to fix them in 'contigs', but they moved anyway. I also tried using models/ActiveSite_ckpt.pt, but it ruined the fold.
I got an error with parameters like this; I wonder if it's an incompatibility problem because I'm trying to design a binder against more than one molecule? ['contigmap.contigs=[C/0 F/0 100-120]', 'ppi.hotspot_res=[C80,C82,C86,C87,C90,C92,C93,E128,C138,C185,C187]', 'denoiser.noise_scale_ca=0', 'denoiser.noise_scale_frame=0'
Traceback (most recent call last):
File "/Share/app/RFdiffusion/scripts/run_inference.py", line 194, in
main()
File "/Share/app/miniconda3.9/envs/SE3nv/lib/python3.9/site-packages/hydra/main.py", line 94, in decorated_main
_run_hydra(
File "/Share/app/miniconda3.9/envs/SE3nv/lib/python3.9/site-packages/hydra/_internal/utils.py", line 394, in _run_hydra
_run_app(
File "/Share/app/miniconda3.9/envs/SE3nv/lib/python3.9/site-packages/hydra/_internal/utils.py", line 457, in _run_app
run_and_report(
File "/Share/app/miniconda3.9/envs/SE3nv/lib/python3.9/site-packages/hydra/_internal/utils.py", line 223, in run_and_report
raise ex
File "/Share/app/miniconda3.9/envs/SE3nv/lib/python3.9/site-packages/hydra/_internal/utils.py", line 220, in run_and_report
return func()
File "/Share/app/miniconda3.9/envs/SE3nv/lib/python3.9/site-packages/hydra/_internal/utils.py", line 458, in
lambda: hydra.run(
File "/Share/app/miniconda3.9/envs/SE3nv/lib/python3.9/site-packages/hydra/_internal/hydra.py", line 132, in run
_ = ret.return_value
File "/Share/app/miniconda3.9/envs/SE3nv/lib/python3.9/site-packages/hydra/core/utils.py", line 260, in return_value
raise self._return_value
File "/Share/app/miniconda3.9/envs/SE3nv/lib/python3.9/site-packages/hydra/core/utils.py", line 186, in run_job
ret.return_value = task_function(task_cfg)
File "/Share/app/RFdiffusion/scripts/run_inference.py", line 84, in main
x_init, seq_init = sampler.sample_init()
File "/Share/app/RFdiffusion/rfdiffusion/inference/model_runners.py", line 278, in sample_init
self.contig_map = self.construct_contig(self.target_feats)
File "/Share/app/RFdiffusion/rfdiffusion/inference/model_runners.py", line 240, in construct_contig
return ContigMap(target_feats, **self.contig_conf)
File "/Share/app/RFdiffusion/rfdiffusion/contigs.py", line 78, in init
) = self.expand_sampled_mask()
File "/Share/app/RFdiffusion/rfdiffusion/contigs.py", line 225, in expand_sampled_mask
int(subcon.split("-")[0][1:]), int(subcon.split("-")[1]) + 1
ValueError: invalid literal for int() with base 10: ''
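The `ValueError: invalid literal for int() with base 10: ''` above comes from the line in contigs.py that does `int(subcon.split("-")[0][1:])` on each chain token: a bare chain letter like `C` or `F` carries no residue range, so `int()` receives an empty string. It appears chain segments need explicit ranges, e.g. `C1-210/0 F1-120/0 100-120`. A simplified stdlib sketch of that check (an approximation of the parser, not the actual RFdiffusion code):

```python
# Simplified approximation of the failing parse in rfdiffusion/contigs.py:
# chain tokens must look like 'C1-210', not just 'C'.
def check_chain_token(subcon: str) -> bool:
    """Return True if a contig token parses the way expand_sampled_mask expects."""
    if not subcon[0].isalpha():
        return True  # a plain length range like '100-120' is handled elsewhere
    parts = subcon.split("-")
    try:
        int(parts[0][1:])   # residue start after the chain letter ('' for a bare 'C')
        int(parts[1])       # residue end
        return True
    except (IndexError, ValueError):
        return False

print(check_chain_token("C1-210"))  # True
print(check_chain_token("C"))       # False -- reproduces the ValueError's cause
```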
Hi, RFdiffusion team,
The results show that every designed residue is output as glycine, and that we should use ProteinMPNN to assign these residues.
My question is: if we want to fix some residues from the input structure (e.g., for enzyme design or scaffolding a functional motif, I want to fix the active site or functional motif), how do we specify this in ProteinMPNN?
Thank you.
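For fixing residues during ProteinMPNN redesign, ProteinMPNN accepts a fixed-positions JSONL (its helper_scripts include make_fixed_positions_dict.py; the mapping shape sketched below is an assumption based on that helper, so check the ProteinMPNN README for the exact format). A minimal sketch that pins a hypothetical active-site triad on chain A:

```python
# Hedged sketch: write a ProteinMPNN-style fixed_positions JSONL.
# "my_design" and the positions [45, 72, 105] are hypothetical placeholders.
import json

fixed = {"my_design": {"A": [45, 72, 105]}}  # {pdb_name: {chain: [1-indexed residues]}}

with open("fixed_positions.jsonl", "w") as fh:
    fh.write(json.dumps(fixed) + "\n")
```

The file would then be passed to ProteinMPNN via its fixed-positions input option so those residues keep their input identities while everything else is redesigned.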
Hi,
thank you for providing this code and the examples! Suppose I have a .pdb file of a protein with a long alpha helix at the N-terminus. Is it possible to extend this helix by a helix bundle, i.e. to generate an N-terminal fusion of a three helix bundle to my target protein? The helical bundles that are generated by the design_ppi script would be perfect.
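For an N-terminal fusion like the one described, the contig grammar in the README suggests (this is my reading, not a confirmed recipe) that a generated length range placed before the chain segment yields new residues N-terminal to it. A sketch of how such an override string could be composed, with all lengths hypothetical:

```python
# Hedged sketch: build a hydra override that would prepend 60-90 generated
# residues (e.g. a helix bundle) N-terminal to chain A of a 200-residue target.
# The chain range and lengths are placeholders, not values from the question.
target_chain = "A1-200"   # existing protein, chain A residues 1-200
new_nterm = "60-90"       # length range for the generated N-terminal fusion
contig = f"contigmap.contigs=[{new_nterm}/{target_chain}]"
print(contig)  # contigmap.contigs=[60-90/A1-200]
```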
Thanks so much to the RosettaCommons team for open-sourcing RFdiffusion! I'm looking forward to getting started.
Unfortunately, something appears to have gone wrong with the folder organization of my installation. I'm trying to execute the first example in the README, generating unconstrained backbones with 150 residues. Here is my attempt to execute, and my output:
(SE3nv) john@john-Desktop:~/RFdiffusion$ ./scripts/run_inference.py 'contigmap.contigs=[150-150]' inference.output_prefix=test_outputs/test inference.num_designs=10
Traceback (most recent call last):
File "/home/john/RFdiffusion/./scripts/run_inference.py", line 24, in <module>
from rfdiffusion.util import writepdb_multi, writepdb
ModuleNotFoundError: No module named 'rfdiffusion'
So run_inference.py is being found -- but it's looking for the util.py file somewhere other than where it is, which is /home/john/RFdiffusion/rfdiffusion.
I know how to reorganize Python folders and/or to provide a folder link, to hack my way around this problem, but I'm concerned that I would cause other problems by doing so.
Please advise, thanks.
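For a `ModuleNotFoundError: No module named 'rfdiffusion'`, one workaround that avoids reorganizing folders (assuming the repo lives at ~/RFdiffusion and was not installed with `pip install -e .`) is to put the repo root on the import path:

```python
# Hedged workaround sketch: make `import rfdiffusion` resolve without moving
# folders, by prepending the repo root (assumed path) to sys.path.
import os
import sys

repo_root = os.path.expanduser("~/RFdiffusion")
if repo_root not in sys.path:
    sys.path.insert(0, repo_root)  # now repo_root/rfdiffusion is importable
```

The cleaner fix, if the repo ships a setup.py, is `pip install -e .` from the repo root inside the SE3nv environment.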
I'd like to finetune RFdiffusion to fit my data.
However, there seems to be no training code provided at the moment.
Is there any plan to release training code?
Thanks!
Hi team, congratulations on the great paper and milestone results.
I've collected multiple remarks on the code while I was studying it; hopefully this will help in improving the codebase.
* terminology
* dependencies
* ignored parameters/code
  * self.guide_scale is missing here: RFdiffusion/rfdiffusion/potentials/manager.py, line 202 in 92b83de
* distributing weights
* computations
  * Sergey's one-hot trick (used in multiple places) - strangely, the created embedding layer is never used. Better to just create a linear module, or only a trainable parameter: RFdiffusion/rfdiffusion/kinematics.py, lines 293 to 295 in 92b83de
* checkpointing
  * lambda *args: module(*args, topk=...)
* duplicated code
* Minor:
Hello authors,
Thank you for providing this fantastic code for protein design! I was wondering if there is a smart way to incorporate some sequence information into the binder design process, or to add a hotspot potential to the partial denoising process. I was trying to optimize a binder. If I use the partial denoising pipeline, I cannot define the hotspots; if I use the binder design pipeline, I cannot fix the part of the sequence of interest. I assume there should be a way to combine both features, because they are simply two different potential functions. Could you let me know if that's possible, or give me some clue for implementing this feature? Thank you very much!
Best,
Shuhao
Here is a partial comparison between the original structure and the one after diffusion. Confusingly, I always get wrong connections within the parts that I did not declare to be designed.
I use /0 as the sign for a chain break, but it seems to have failed, since chains E and F were connected incorrectly.
I haven't figured out the exact rule behind this; any advice would be appreciated!
A different driver version (in my case: Driver Version: 510.108.03, CUDA Version: 11.6) fails to install the CUDA version of PyTorch, resulting in a runtime error when trying to run any of the examples.
This is probably because there is no cudatoolkit version 11.1 (which the original SE3nv yaml requires) for this driver.
To solve it, I installed cudatoolkit 11.6 and pytorch 1.12.1. I have attached an export of my environment in case someone else encounters this problem.
Thanks for sharing RFDiffusion.
I'm using it on a T4 GPU. It takes 30-60 seconds for this:
RFdiffusion-main/scripts/run_inference.py 'contigmap.contigs=[10-20]' inference.output_prefix=test_outputs/test inference.num_designs=1
CUDA is also available.
How can I make sure it is using the GPU?
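A quick way to check whether PyTorch actually sees the GPU inside the container is to query CUDA availability directly (guarded here so the snippet also runs where torch is absent):

```python
# Minimal sketch: report whether torch can see a CUDA device.
import importlib.util

def cuda_status():
    """Return a short description of CUDA availability, or a note if torch is absent."""
    if importlib.util.find_spec("torch") is None:
        return "torch not installed"
    import torch
    if torch.cuda.is_available():
        return f"CUDA available: {torch.cuda.get_device_name(0)}"
    return "CUDA not available; running on CPU"

print(cuda_status())
```

If this reports CPU inside the container, the final `pip install --force-reinstall torch ...` line in the Dockerfile below is a likely culprit, since it can replace the CUDA build of torch with a CPU-only wheel.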
This is my dockerfile:
FROM nvidia/cuda:11.1.1-cudnn8-runtime-ubuntu20.04
ENV PATH="/root/miniconda3/bin:${PATH}"
ARG PATH="/root/miniconda3/bin:${PATH}"
RUN apt-get update
RUN apt-get install -y wget git && rm -rf /var/lib/apt/lists/*
RUN wget \
https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh \
&& mkdir /root/.conda \
&& bash Miniconda3-latest-Linux-x86_64.sh -b \
&& rm -f Miniconda3-latest-Linux-x86_64.sh
RUN conda --version
COPY RFdiffusion-main RFdiffusion-main
RUN conda env create -f RFdiffusion-main/env/SE3nv.yml
RUN echo "conda activate SE3nv" >> ~/.bashrc
SHELL ["/bin/bash", "--login", "-c"]
SHELL ["conda", "run", "--no-capture-output", "-n", "SE3nv", "/bin/bash", "-c"]
RUN pip install --no-cache-dir -r RFdiffusion-main/env/SE3Transformer/requirements.txt
RUN python RFdiffusion-main/env/SE3Transformer/setup.py install
RUN pip install -e RFdiffusion-main
COPY requirements.txt requirements.txt
RUN pip install --no-cache-dir -r requirements.txt
COPY entrypoint.sh entrypoint.sh
RUN chmod +x entrypoint.sh
RUN pip install se3-transformer-pytorch
RUN pip install -e RFdiffusion-main/env/SE3Transformer
RUN pip install --force-reinstall torch torchvision torchaudio