powerfit's Introduction

PowerFit

About PowerFit

PowerFit is a Python package and simple command-line program to automatically fit high-resolution atomic structures in cryo-EM densities. To this end it performs a fully exhaustive 6-dimensional cross-correlation search between the atomic structure and the density. It takes as input an atomic structure in PDB format and a cryo-EM density with its resolution, and outputs positions and rotations of the atomic structure corresponding to high correlation values. PowerFit uses the local cross-correlation function as its base score. The score can optionally be enhanced by a Laplace pre-filter and/or a core-weighted version to minimize overlapping densities from neighboring subunits. It can further be hardware-accelerated by leveraging multi-core CPU machines out of the box or by GPU via the OpenCL framework. PowerFit is Free Software and has been successfully installed and used on Linux and MacOSX machines.
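
To give a feel for the translational part of such a search, here is a minimal NumPy sketch that scores every cyclic shift of a template against a density in one pass using FFTs. It is an illustration only: it uses the plain (not local) cross-correlation for a single fixed rotation, and the function name translational_cc and the toy data are hypothetical, not part of PowerFit.

```python
import numpy as np


def translational_cc(density, template):
    """Return the cross-correlation of template with density for all cyclic shifts.

    cc[s] = sum over x of density[x + s] * template[x], computed via FFT.
    """
    ft_density = np.fft.rfftn(density)
    ft_template = np.fft.rfftn(template)
    # Multiplying by the complex conjugate turns convolution into correlation.
    return np.fft.irfftn(ft_density * np.conj(ft_template), s=density.shape)


# Toy data: a small random blob placed in an otherwise empty grid.
rng = np.random.default_rng(0)
template = np.zeros((16, 16, 16))
template[2:5, 3:6, 1:4] = rng.random((3, 3, 3))

# The "density" is the same blob moved by a known shift.
shift = (4, 2, 7)
density = np.roll(template, shift, axis=(0, 1, 2))

cc = translational_cc(density, template)
best = np.unravel_index(np.argmax(cc), cc.shape)
print("recovered shift:", tuple(int(i) for i in best))
```

A full 6-dimensional search would repeat this translational scan for every sampled rotation of the template and keep the best score per grid position.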

Requirements

Minimal requirements for the CPU version:

  • Python2.7
  • NumPy 1.8+
  • SciPy
  • GCC (or another C-compiler)

Optional requirements for a faster CPU version:

  • FFTW3
  • pyFFTW 0.10+

To offload computations to the GPU, the following are also required:

  • OpenCL1.1+
  • pyopencl
  • clFFT
  • gpyfft

Recommended for installation

  • git
  • pip

Installation

If you have already fulfilled the requirements, installation should be as easy as opening a shell and typing

git clone https://github.com/haddocking/powerfit.git
cd powerfit
sudo python setup.py install

If you are starting from a clean system, follow the instructions for your particular operating system as described below. They should get you up and running in no time.

Docker

First install Docker by following its official installation instructions.

A Docker image containing PowerFit and its CPU/GPU dependencies can be built and run on your compute platform as follows

docker build -t haddocking/powerfit:v2.1.0 -f Dockerfile .
docker run haddocking/powerfit:v2.1.0 <map> <resolution> <pdb>

Linux

Linux systems usually already include a Python2.7 distribution. First make sure the Python header files, NumPy, SciPy, and git are available. On Debian and Ubuntu systems, open a terminal and type

sudo apt-get install python-dev python-numpy python-scipy git

If you are working on Fedora, this should be replaced by

sudo yum install python-devel numpy scipy git

Sit back and wait until the compilation and installation have finished. Your system is now prepared; follow the general instructions above to install PowerFit.

MacOSX

First install git by following the instructions on its website, or by using a package manager such as brew

brew install git

Next install pip, the Python package manager, by following the installation instructions on its website, or by opening a terminal and typing

sudo easy_install pip

Next, install NumPy and SciPy by typing

sudo pip install numpy scipy

Wait for the installation to finish. Follow the general instructions above to install PowerFit.

Installing pyFFTW for the faster CPU version can be done as follows using brew

brew install fftw
sudo pip install pyfftw

Windows

First install git for Windows, as it comes with a handy bash shell. Go to git-scm, download git and install it. Next, install a Python distribution with NumPy and SciPy included, such as Anaconda. After installation, open up the bash shell shipped with git and follow the general instructions written above.

Usage

After installing PowerFit the command line tool powerfit should be at your disposal. The general pattern to invoke powerfit is

powerfit <map> <resolution> <pdb>

where <map> is a density map in CCP4 or MRC-format, <resolution> is the resolution of the map in ångstrom, and <pdb> is an atomic model in the PDB-format. This performs a 10° rotational search using the local cross-correlation score on a single CPU-core. During the search, powerfit will update you about the progress of the search if you are using it interactively in the shell.

Running PowerFit from a Docker image named powerfit on data located at a hypothetical /path/to/data on your machine can be done as follows

docker run --rm -v /path/to/data:/data powerfit \
    powerfit /data/<map> <resolution> /data/<pdb> -d /data

Options

First, to see all options and their descriptions type

powerfit --help

The help text should explain all options adequately. In addition, here are some examples for common operations.

To perform a search with an approximate 24° rotational sampling interval

powerfit <map> <resolution> <pdb> -a 24

To use multiple CPU cores with the Laplace pre-filter and a 5° rotational interval

powerfit <map> <resolution> <pdb> -p 4 -l -a 5

To off-load computations to the GPU and use the core-weighted scoring function and write out the top 15 solutions

powerfit <map> <resolution> <pdb> -g -cw -n 15

Note that all options can be combined except for the -g and -p flags: calculations are performed either on the CPU or on the GPU.
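
When scripting many runs, the flags shown above can be assembled programmatically. The following is a small sketch, assuming only the flags that appear in the examples in this document (-a, -p, -l, -g, -cw, -n, -d); the helper name build_powerfit_command and its keyword names are hypothetical and not part of PowerFit.

```python
def build_powerfit_command(map_path, resolution, pdb_path, angle=None,
                           nproc=None, laplace=False, gpu=False,
                           core_weighted=False, num_solutions=None,
                           directory=None):
    """Build an argument list for the powerfit command-line tool."""
    # The CPU (-p) and GPU (-g) modes are mutually exclusive.
    if gpu and nproc is not None:
        raise ValueError("-g and -p cannot be combined")

    cmd = ["powerfit", str(map_path), str(resolution), str(pdb_path)]
    if angle is not None:
        cmd += ["-a", str(angle)]          # rotational sampling interval
    if nproc is not None:
        cmd += ["-p", str(nproc)]          # number of CPU cores
    if laplace:
        cmd.append("-l")                   # Laplace pre-filter
    if gpu:
        cmd.append("-g")                   # off-load to the GPU
    if core_weighted:
        cmd.append("-cw")                  # core-weighted scoring
    if num_solutions is not None:
        cmd += ["-n", str(num_solutions)]  # number of top solutions to write
    if directory is not None:
        cmd += ["-d", str(directory)]      # output directory
    return cmd


# File names below are placeholders.
cmd = build_powerfit_command("map.mrc", 9.8, "model.pdb",
                             nproc=4, laplace=True, angle=5)
print(" ".join(cmd))
# To actually run it: import subprocess; subprocess.run(cmd, check=True)
```

Returning a list rather than a single string lets the result be passed directly to subprocess.run without shell quoting issues.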

Output

When the search is finished, several output files are created

  • fit_N.pdb: the top N best fits.
  • solutions.out: all the non-redundant solutions found, ordered by their correlation score. The first column shows the rank, column 2 the correlation score, columns 3 and 4 the Fisher z-score and the number of standard deviations (see N. Volkmann 2009, and Van Zundert and Bonvin 2016); columns 5 to 7 are the x, y and z coordinates of the center of the chain; columns 8 to 16 are the nine rotation matrix values.
  • lcc.mrc: a cross-correlation map, showing at each grid position the highest correlation score found during the rotational search.
  • powerfit.log: a log file, including the input parameters with date and timing information.
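
For downstream analysis, the solutions.out columns described above can be read with a few lines of Python. This is a sketch under stated assumptions: whitespace-separated columns, comment lines starting with #, nine rotation matrix values in columns 8 to 16, and row-major ordering of the matrix. The Solution tuple, the parse_solutions name and the sample rows are hypothetical.

```python
from collections import namedtuple

Solution = namedtuple("Solution", "rank cc fisher_z n_std center rotation")


def parse_solutions(text):
    """Parse the text of a solutions.out file into a list of Solution tuples."""
    solutions = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        fields = line.split()
        if len(fields) < 16:
            raise ValueError("expected at least 16 columns, got %d" % len(fields))
        values = [float(f) for f in fields[1:16]]
        solutions.append(Solution(
            rank=int(float(fields[0])),
            cc=values[0],
            fisher_z=values[1],
            n_std=values[2],
            center=tuple(values[3:6]),
            # Assumed row-major: three consecutive values per matrix row.
            rotation=[values[6:9], values[9:12], values[12:15]],
        ))
    return solutions


# Hypothetical rows, made up purely to exercise the parser.
sample = """\
# hypothetical example rows
1 0.512 0.565 12.3 10.0 20.0 30.0 1 0 0 0 1 0 0 0 1
2 0.430 0.460 9.8 11.0 21.0 29.0 0 -1 0 1 0 0 0 0 1
"""
solutions = parse_solutions(sample)
print(solutions[0].center, solutions[1].rotation[0])
```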

Creating an image-pyramid

The use of multi-scale image pyramids can significantly increase the speed of fitting. PowerFit comes with a script to quickly build a pyramid called image-pyramid. The calling signature of the script is

image-pyramid <map> <resolution> <target-resolutions ...>

where <map> is the original cryo-EM data, <resolution> is the original resolution, and <target-resolutions> is a sequence of resolutions for the resulting maps. The following example will create an image-pyramid with resolutions of 12, 13 and 20 ångström

image-pyramid EMD-1884/1884.map 9.8 12 13 20

To see the other options type

image-pyramid --help

Licensing

If this software was useful to your research, please cite us

G.C.P. van Zundert and A.M.J.J. Bonvin. Fast and sensitive rigid-body fitting into cryo-EM density maps with PowerFit. AIMS Biophysics 2, 73-87 (2015).

For the use of image-pyramids and reliability measures for fitting, please cite

G.C.P. van Zundert and A.M.J.J. Bonvin. Defining the limits and reliability of rigid-body fitting in cryo-EM maps using multi-scale image pyramids. J. Struct. Biol. 195, 252-258 (2016).

Apache License Version 2.0

The elements.py module is licensed under MIT License (see header). Copyright (c) 2005-2015, Christoph Gohlke

Tested platforms

Operating System   CPU single   CPU multi   GPU
Linux              Yes          Yes         Yes
MacOSX             Yes          Yes         Yes
Windows            Yes          Fail        No

The GPU version has been tested on:

  • NVIDIA GeForce GTX 680 and AMD Radeon HD 7730M for Linux
  • NVIDIA GeForce GTX 775M for MacOSX 10.10

powerfit's Issues

Install on Nvidia GPU

Thanks for developing this useful tool. I have used it on CPUs and it works well. But when I tried to run it on GPUs, I ran into some problems, maybe related to OpenCL on the GPU? I actually don't know what causes the problem. When I tried to run test_gpyfft.py, I got the error message ERROR: setUpClass (main.TestCLKernels), and the traceback message is AttributeError: 'NoneType' object has no attribute 'context'. I installed "pocl 1.2 h6bb024c_1002 conda-forge/label/gcc7". Could you help me fix that? Many thanks!

Containerize application

To facilitate portability and distribution we could containerize the powerfit as a singularity image.

how to install powerfit without sudo?

Hi,
We tried running the install like this: python2.7 setup.py install --prefix=install
and we got the warning below. How is it possible to install it?
Thanks.

Warning: Extension name 'powerfit._powerfit' does not match fully qualified name '_powerfit' of 'src/_powerfit.pyx'
running install
Checking .pth file support in install/lib/python2.7/site-packages/
/usr/bin/python2.7 -E -c pass
TEST FAILED: install/lib/python2.7/site-packages/ does NOT support .pth files
error: bad install directory or PYTHONPATH

You are attempting to install a package to a directory that is not
on PYTHONPATH and which Python does not read ".pth" files from. The
installation directory you specified (via --install-dir, --prefix, or
the distutils default setting) was:

install/lib/python2.7/site-packages/

and your PYTHONPATH environment variable currently contains:

'/vol/sci/bio/bio3d/meravb/project_dina/powerfit'

Here are some of your options for correcting the problem:

  • You can choose a different installation directory, i.e., one that is
    on PYTHONPATH or supports .pth files

  • You can add the installation directory to the PYTHONPATH environment
    variable. (It must then also be on PYTHONPATH whenever you run
    Python and want to use the package(s) you are installing.)

  • You can set up the installation directory to support ".pth" files by
    using one of the approaches described here:

    https://setuptools.readthedocs.io/en/latest/easy_install.html#custom-installation-locations

Please make the appropriate changes for your system and try again.

Local fine sampling around selected solutions

A feature request from a user: would it be possible to perform a local fine search around a specific solution? That is, finer rotational and translational sampling around the starting configuration, instead of a full search.

Installing powerfit in a Centos 8 machine

Hi, I am trying to install powerfit on a new CentOS 8 machine. The install script ran successfully within a conda environment that I created specifically for this purpose, but when I try to execute powerfit, I get the following error:

ImportError: /my/home/anaconda3/envs/emlearning/lib/python3.6/site-packages/powerfit-2.0.0-py3.6-linux-x86_64.egg/powerfit/_extensions.cpython-36m-x86_64-linux-gnu.so: undefined symbol: Py_InitModule

the conda list command is as follows

packages in environment at /..../emlearning:

Name Version Build Channel

_libgcc_mutex 0.1 main
blas 1.0 mkl
ca-certificates 2020.10.14 0
certifi 2020.6.20 pyhd3eb1b0_3
intel-openmp 2020.2 254
ld_impl_linux-64 2.33.1 h53a641e_7
libedit 3.1.20191231 h14c3975_1
libffi 3.3 he6710b0_2
libgcc-ng 9.1.0 hdf63c60_0
libgfortran-ng 7.3.0 hdf63c60_0
libstdcxx-ng 9.1.0 hdf63c60_0
mkl 2020.2 256
mkl-service 2.3.0 py36he904b0f_0
mkl_fft 1.2.0 py36h23d657b_0
mkl_random 1.1.1 py36h0573a6f_0
ncurses 6.2 he6710b0_1
numpy 1.19.2 py36h54aff64_0
numpy-base 1.19.2 py36hfa32c7d_0
openssl 1.1.1h h7b6447c_0
pip 20.2.4 py36h06a4308_0
powerfit 2.0.0 pypi_0 pypi
python 3.6.12 hcff3b4d_2
readline 8.0 h7b6447c_0
scipy 1.5.4 pypi_0 pypi
setuptools 50.3.1 py36h06a4308_1
six 1.15.0 py_0
sqlite 3.33.0 h62c20be_0
tk 8.6.10 hbc83047_0
wheel 0.35.1 py_0
xz 5.2.5 h7b6447c_0
zlib 1.2.11 h7b6447c_3

It seems this may be a Python 2/3 related problem, but I am not sure how to proceed, so I am hoping you can offer some useful advice. Please let me know if there is any extra information you need.

Unit testing is broken

The current setup of the unit tests is likely to break if you try to run run_tests.py. I did not test this on powerfit (don't have dependencies installed) but it should break because tests is not a package (lacks a __init__.py file) and so, test discovery comes up empty. Adding one should fix the problem and run the tests.
