ml4bio / rna-fm
RNA foundation model
Home Page: https://ml4bio.github.io/RNA-FM/
License: MIT License
Hi,
Great work!
I am trying to reproduce the SS prediction results (attached image) for ArchiveII600 (3911 sequences) and TS0 (1305 sequences).
While I could exactly reproduce UFold's scores, I could not reproduce RNAFM's scores in the same way. I used the model weights for RNAFM from here.
For TS0, I got an F1 score of 0.666 using "RNA-FM-ResNet_bpRNA.pth"; the paper reported 0.704. For ArchiveII600, I got 0.933 using "RNA-FM-ResNet_RNAStralign.pth"; the paper reported 0.941.
I was wondering whether the evaluation in your paper was done differently from how UFold did it.
I'd really appreciate any help. Thank you!
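One thing that may be worth ruling out when comparing F1 numbers is the aggregation: averaging per-sequence F1 and pooling base pairs over the whole test set generally give different values. The sketch below shows both. It is only an illustration; how the paper or UFold actually aggregates is not stated in this thread, and the function names and toy base pairs are hypothetical.

```python
from typing import Iterable, List, Set, Tuple

Pair = Tuple[int, int]


def normalize(pairs: Iterable[Pair]) -> Set[Pair]:
    """Store each pair as (i, j) with i < j so that (3, 9) and (9, 3) match."""
    return {(min(i, j), max(i, j)) for i, j in pairs}


def f1_from_counts(tp: int, n_pred: int, n_ref: int) -> float:
    """F1 from true positives, number of predicted pairs and number of reference pairs."""
    if tp == 0:
        return 0.0
    precision = tp / n_pred
    recall = tp / n_ref
    return 2 * precision * recall / (precision + recall)


def pair_f1(predicted: Iterable[Pair], reference: Iterable[Pair]) -> float:
    """Exact-match base-pair F1 for a single sequence."""
    pred, ref = normalize(predicted), normalize(reference)
    return f1_from_counts(len(pred & ref), len(pred), len(ref))


def mean_f1(samples: List[Tuple[List[Pair], List[Pair]]]) -> float:
    """Average of per-sequence F1 scores (one assumed aggregation)."""
    scores = [pair_f1(pred, ref) for pred, ref in samples]
    return sum(scores) / len(scores)


def pooled_f1(samples: List[Tuple[List[Pair], List[Pair]]]) -> float:
    """F1 computed once from counts pooled over all sequences (another assumed aggregation)."""
    tp = n_pred = n_ref = 0
    for predicted, reference in samples:
        pred, ref = normalize(predicted), normalize(reference)
        tp += len(pred & ref)
        n_pred += len(pred)
        n_ref += len(ref)
    return f1_from_counts(tp, n_pred, n_ref)


# Toy data (hypothetical): (predicted pairs, reference pairs) per sequence.
toy_samples = [
    ([(1, 10)], [(1, 10)]),
    ([(1, 5)], [(1, 5), (2, 4), (6, 9)]),
]
print(mean_f1(toy_samples), pooled_f1(toy_samples))
```

On the toy data the two aggregations already disagree, so it may be worth checking which one each evaluation script uses before comparing against the reported scores.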
Is the training code going to be available? Thanks
Great job! I would like to continue training your model on new data. I would be grateful if you could provide the training script.
Hi,
I'm reaching out with a question related to the applicability of your models for tasks with a smaller dataset. Specifically, I'm interested in whether there are available scripts for fine-tuning the RNA-FM + ResNet model with few data points, similar to how the RNA-FM (TL) model was adjusted in the paper's RNA 3D closeness prediction task.
Any guidance or resources would be greatly appreciated.
Hi there, I haven't been able to install RNA-FM. I've tried the recommended conda install from the GitHub repo and pip, on both a Mac and a Linux cluster, and nothing works. Could you provide a little more detail on the system requirements?
Thank you. Looks like nice work and I'd love to try it out.
Would you be able to release mRNA-FM's/RNA-FM's training, tutorial and evaluation data? For instance, as a Drive download link or on HuggingFace? Thanks!
Hi guys! I am curious whether I can access the preprocessed RNAcentral100 dataset that was used to pre-train the foundation model. If not, should I download the data directly from the RNAcentral website (https://ftp.ebi.ac.uk/pub/databases/RNAcentral/releases/19.0/)?
Thanks a lot!
Thank you for the great work. I would like to know which datasets you used for RNA-FM-ResNet_bpRNA.pth and RNA-FM-ResNet_RNAStralign.pth respectively. My guess is that you used the TR0 training set to obtain RNA-FM-ResNet_bpRNA.pth and the RNAStralign training set to obtain RNA-FM-ResNet_RNAStralign.pth. Is that right? If not, could you tell me which datasets these two checkpoints were trained on?
Hi,
Thank you for your work.
May I ask what the training data for mRNA-FM is?
Could you share the full dataset?
Hi, the RNA-FM paper mentions supplementary tables, but I could not find any tables beyond the main text in the arXiv version https://arxiv.org/abs/2204.00300. Can we access the supplementary material?
Thanks!
Hi,
Thanks for developing RNA-FM. I ran the embedding extraction step using python launch/predict.py, and it raised an error:
ImportError: cannot import name 'make_data_loader' from 'data'
Thanks.
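An ImportError of this shape can occur when the name `data` resolves to a different module or package than the one the script expects, for example another `data` directory earlier on `sys.path`. Whether that is the cause here is an assumption; the sketch below only shows a way to check where Python would load a module from. The helper name `locate_module` is hypothetical.

```python
import importlib.util
from typing import Optional


def locate_module(name: str) -> Optional[str]:
    """Return the file or directory Python would import `name` from, or None if not found."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    if spec.origin is not None:
        # Regular modules and packages report the file that defines them.
        return spec.origin
    # Namespace packages have no single origin; report their search locations.
    locations = spec.submodule_search_locations
    return ", ".join(locations) if locations else None


# Run this from the same working directory and environment as launch/predict.py.
print(locate_module("data"))
```

If the printed path does not point inside the RNA-FM checkout, a shadowing module would be one candidate explanation.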
Hi,
I used the archiveII_all data from e2efold for the secondary structure prediction task and selected pairs using L/5, but the accuracy was 0.7138. Could you please give more details on how to reproduce the results?
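For reference, one common reading of "using L/5" is to keep the L/5 highest-scoring candidate pairs for a sequence of length L and report the fraction of them that are true pairs. The sketch below implements that reading only. It is an assumption about the protocol, not the repository's evaluation code, and the function name, the `min_sep` parameter and the toy scores are hypothetical.

```python
from typing import Iterable, List, Tuple


def top_l_over_k_precision(
    scores: List[List[float]],
    true_pairs: Iterable[Tuple[int, int]],
    k: int = 5,
    min_sep: int = 1,
) -> float:
    """Precision of the L // k highest-scoring pairs (i, j) with j - i >= min_sep."""
    length = len(scores)
    n_keep = max(1, length // k)
    # Only the upper triangle is scanned, so each pair is counted once.
    candidates = [
        (scores[i][j], i, j)
        for i in range(length)
        for j in range(i + min_sep, length)
    ]
    candidates.sort(key=lambda item: item[0], reverse=True)
    top = {(i, j) for _, i, j in candidates[:n_keep]}
    if not top:
        return 0.0
    truth = {(min(i, j), max(i, j)) for i, j in true_pairs}
    return len(top & truth) / len(top)


# Toy example (hypothetical): L = 10, so the top 2 pairs are kept.
toy_scores = [[0.0] * 10 for _ in range(10)]
toy_scores[0][9] = 0.9
toy_scores[1][8] = 0.8
toy_scores[2][7] = 0.7
print(top_l_over_k_precision(toy_scores, [(0, 9), (2, 7)]))
```

Differences in `min_sep`, in rounding of L/5, or in using precision rather than F1 would each change the reported number, so those are worth confirming against the original protocol.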
Hi,
Thanks for open-sourcing this awesome work.
I ran into some errors when trying to install RNA-FM. Can you help me out?
Here is part of the error log.
pip install .
Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com
Processing /workspace/work/CLIP/RNA-FM
Preparing metadata (setup.py) ... done
Collecting numpy==1.22.0 (from rna-fm==0.1.2)
Downloading numpy-1.22.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (2.0 kB)
Collecting pandas==1.3.1 (from rna-fm==0.1.2)
Downloading pandas-1.3.1.tar.gz (4.7 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 4.7/4.7 MB 11.0 MB/s eta 0:00:00
Installing build dependencies ... error
error: subprocess-exited-with-error
× pip subprocess to install build dependencies did not run successfully.
│ exit code: 1
╰─> [824 lines of output]
Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com, https://pypi.ngc.nvidia.com
Ignoring numpy: markers 'python_version == "3.7" and (platform_machine != "arm64" or platform_system != "Darwin") and platform_machine != "aarch64"' don't match your environment
Ignoring numpy: markers 'python_version == "3.8" and (platform_machine != "arm64" or platform_system != "Darwin") and platform_machine != "aarch64"' don't match your environment
Ignoring numpy: markers 'python_version == "3.7" and platform_machine == "aarch64"' don't match your environment
Ignoring numpy: markers 'python_version == "3.8" and platform_machine == "aarch64"' don't match your environment
Ignoring numpy: markers 'python_version == "3.8" and platform_machine == "arm64" and platform_system == "Darwin"' don't match your environment
Ignoring numpy: markers 'python_version == "3.9" and platform_machine == "arm64" and platform_system == "Darwin"' don't match your environment
Collecting setuptools>=38.6.0
Downloading setuptools-69.1.1-py3-none-any.whl.metadata (6.2 kB)
Collecting wheel
Downloading wheel-0.43.0-py3-none-any.whl.metadata (2.2 kB)
Collecting Cython<3,>=0.29.21
Downloading Cython-0.29.37-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl.metadata (3.1 kB)
Collecting numpy==1.19.3
Downloading numpy-1.19.3.zip (7.3 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 7.3/7.3 MB 11.3 MB/s eta 0:00:00
Installing build dependencies: started
Installing build dependencies: finished with status 'done'
Getting requirements to build wheel: started
Getting requirements to build wheel: finished with status 'done'
Preparing metadata (pyproject.toml): started
Preparing metadata (pyproject.toml): finished with status 'done'
Downloading setuptools-69.1.1-py3-none-any.whl (819 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 819.3/819.3 kB 11.8 MB/s eta 0:00:00
Downloading wheel-0.43.0-py3-none-any.whl (65 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 65.8/65.8 kB 13.1 MB/s eta 0:00:00
Downloading Cython-0.29.37-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl (1.9 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.9/1.9 MB 11.8 MB/s eta 0:00:00
Building wheels for collected packages: numpy
Building wheel for numpy (pyproject.toml): started
Building wheel for numpy (pyproject.toml): finished with status 'error'
error: subprocess-exited-with-error
× Building wheel for numpy (pyproject.toml) did not run successfully.
│ exit code: 1
╰─> [782 lines of output]
setup.py:67: RuntimeWarning: NumPy 1.19.3 may not yet support Python 3.10.
warnings.warn(
Running from numpy source directory.
/tmp/pip-install-9x5z1ps4/numpy_11ee67fb2b2142c4bcbc63b744069658/tools/cythonize.py:67: DeprecationWarning: The distutils package is deprecated and slated for removal in Python 3.12. Use setuptools or check PEP 632 for potential alternatives
from distutils.version import LooseVersion
numpy/random/_bounded_integers.pxd.in has not changed
numpy/random/_pcg64.pyx has not changed
numpy/random/_philox.pyx has not changed
numpy/random/bit_generator.pyx has not changed
numpy/random/_common.pyx has not changed
numpy/random/_bounded_integers.pyx.in has not changed
numpy/random/mtrand.pyx has not changed
numpy/random/_mt19937.pyx has not changed
numpy/random/_sfc64.pyx has not changed
numpy/random/_generator.pyx has not changed
Processing numpy/random/_bounded_integers.pyx
Cythonizing sources
blas_opt_info:
blas_mkl_info:
customize UnixCCompiler
FOUND:
libraries = ['mkl_rt', 'pthread']
library_dirs = ['/usr/local/lib']
define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]
include_dirs = ['/usr/local/include', '/usr/include']
FOUND:
libraries = ['mkl_rt', 'pthread']
library_dirs = ['/usr/local/lib']
define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]
include_dirs = ['/usr/local/include', '/usr/include']
non-existing path in 'numpy/distutils': 'site.cfg'
lapack_opt_info:
lapack_mkl_info:
FOUND:
libraries = ['mkl_rt', 'pthread']
library_dirs = ['/usr/local/lib']
define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]
include_dirs = ['/usr/local/include', '/usr/include']
FOUND:
libraries = ['mkl_rt', 'pthread']
library_dirs = ['/usr/local/lib']
define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]
include_dirs = ['/usr/local/include', '/usr/include']
/usr/lib/python3.10/distutils/dist.py:274: UserWarning: Unknown distribution option: 'define_macros'
warnings.warn(msg)
running bdist_wheel
running build
running config_cc
unifing config_cc, config, build_clib, build_ext, build commands --compiler options
running config_fc
unifing config_fc, config, build_clib, build_ext, build commands --fcompiler options
running build_src
build_src
building py_modules sources
building library "npymath" sources
adding 'build/src.linux-x86_64-3.10/numpy/core/src/npymath' to include_dirs.
None - nothing done with h_files = ['build/src.linux-x86_64-3.10/numpy/core/src/npymath/npy_math_internal.h']
building library "npysort" sources
adding 'build/src.linux-x86_64-3.10/numpy/core/src/common' to include_dirs.
None - nothing done with h_files = ['build/src.linux-x86_64-3.10/numpy/core/src/common/npy_sort.h', 'build/src.linux-x86_64-3.10/numpy/core/src/common/npy_partition.h', 'build/src.linux-x86_64-3.10/numpy/core/src/common/npy_binsearch.h']
building library "npyrandom" sources
building extension "numpy.core._multiarray_tests" sources
building extension "numpy.core._multiarray_umath" sources
adding 'build/src.linux-x86_64-3.10/numpy/core/src/umath' to include_dirs.
adding 'build/src.linux-x86_64-3.10/numpy/core/src/npymath' to include_dirs.
adding 'build/src.linux-x86_64-3.10/numpy/core/src/common' to include_dirs.
numpy.core - nothing done with h_files = ['build/src.linux-x86_64-3.10/numpy/core/src/umath/funcs.inc', 'build/src.linux-x86_64-3.10/numpy/core/src/umath/simd.inc', 'build/src.linux-x86_64-3.10/numpy/core/src/umath/loops.h', 'build/src.linux-x86_64-3.10/numpy/core/src/umath/matmul.h', 'build/src.linux-x86_64-3.10/numpy/core/src/umath/clip.h', 'build/src.linux-x86_64-3.10/numpy/core/src/npymath/npy_math_internal.h', 'build/src.linux-x86_64-3.10/numpy/core/src/common/templ_common.h', 'build/src.linux-x86_64-3.10/numpy/core/include/numpy/config.h', 'build/src.linux-x86_64-3.10/numpy/core/include/numpy/_numpyconfig.h', 'build/src.linux-x86_64-3.10/numpy/core/include/numpy/__multiarray_api.h', 'build/src.linux-x86_64-3.10/numpy/core/include/numpy/__ufunc_api.h']
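The excerpt above is cut off before the final compiler error, so the root cause is not visible here. The one explicit hint in it is the RuntimeWarning that NumPy 1.19.3 may not yet support Python 3.10. A minimal sketch for pulling such warnings out of a saved pip log is shown below; the function name and the regular expression are illustrative and only match the warning wording seen in this log.

```python
import re
from typing import List, Tuple

# Matches lines such as:
# "setup.py:67: RuntimeWarning: NumPy 1.19.3 may not yet support Python 3.10."
_WARNING = re.compile(
    r"RuntimeWarning: (\S+) (\d+(?:\.\d+)*) may not yet support Python (\d+\.\d+)"
)


def find_unsupported_python_warnings(log_text: str) -> List[Tuple[str, str, str]]:
    """Return (package, package_version, python_version) for each matching warning."""
    return _WARNING.findall(log_text)


sample_log = (
    "Building wheel for numpy (pyproject.toml): finished with status 'error'\n"
    "setup.py:67: RuntimeWarning: NumPy 1.19.3 may not yet support Python 3.10.\n"
    "Running from numpy source directory.\n"
)
for package, version, python in find_unsupported_python_warnings(sample_log):
    print(f"{package} {version} reports it may not support Python {python}")
```

If the full log confirms this, trying an interpreter version that the pinned numpy and pandas releases were built for would be a reasonable next step.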