demianriccardi / p5-hackamol
Object-Oriented Perl 5, Moose Library for Molecular Hacking
License: Other
HackaMol needs a PDB reader that does not depend on the element symbol being in columns 77-78, as in the PDB specification. Files without that field are a common type of PDB. The atom type will need to be derived from the atom name via a hash table look-up. Something similar was implemented for pdbqt, just not to this extreme.
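A minimal sketch of the hash table look-up, in core Perl only. The table `%ELEMENT_FOR_NAME` and the function `element_from_name` are hypothetical names, not part of HackaMol, and the table holds only a few illustrative entries. The first-letter fallback is an assumption that will be wrong for some heteroatoms.

```perl
use strict;
use warnings;

# Hypothetical lookup table: PDB atom name => element symbol.
# Only a few illustrative entries; a real table would be much larger.
my %ELEMENT_FOR_NAME = (
    CA  => 'C',     # alpha carbon in a protein residue (not calcium)
    CB  => 'C',
    N   => 'N',
    O   => 'O',
    OXT => 'O',
    SG  => 'S',
    ZN  => 'Zn',
    FE  => 'Fe',
);

# Return an element symbol for an atom name, or undef if none can be guessed.
sub element_from_name {
    my ($name) = @_;
    $name =~ s/^\s+|\s+$//g;    # strip PDB column padding
    my $key = uc $name;
    return $ELEMENT_FOR_NAME{$key} if exists $ELEMENT_FOR_NAME{$key};

    # Fallback assumption: the first letter is the element
    # (e.g. "HD21" -> H, "1HB" -> H, "CG2" -> C).
    return $1 if $key =~ /^\d*([A-Z])/;
    return;
}

printf "%-4s => %s\n", $_, element_from_name($_) // 'unknown'
    for qw(CA ZN HD21 1HB);
```

The table is consulted first so that ambiguous names such as CA can be pinned down; a real reader would probably also need the residue name to tell an alpha carbon from a calcium ion.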
Currently missing docs on read_file_mol and read_file_atoms
This seems to be very convenient.
Creating groups of unique cysteine residues required too much Perl. The trouble is residues with the same resid but different chains; group_by_atom_attr would need another attr. Perhaps a group_by_atom_attrs with an internal union would work.
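One way the multi-attribute grouping could look, sketched on plain hash refs rather than HackaMol atoms. The function name `group_by_attrs` and the data layout are assumptions for illustration only.

```perl
use strict;
use warnings;

# Group records by the combination of several attributes.
# Atoms are plain hash refs here (a stand-in for HackaMol::Atom objects).
sub group_by_attrs {
    my ( $atoms, @attrs ) = @_;
    my ( %groups, @order );
    for my $atom (@$atoms) {
        # A composite key makes (chain A, resid 10) differ from (chain B, resid 10).
        my $key = join '|', map { $atom->{$_} // '' } @attrs;
        push @order, $key unless exists $groups{$key};
        push @{ $groups{$key} }, $atom;
    }
    return map { $groups{$_} } @order;    # groups in first-seen order
}

my @atoms = (
    { resname => 'CYS', resid => 10, chain => 'A', name => 'SG' },
    { resname => 'CYS', resid => 10, chain => 'B', name => 'SG' },
    { resname => 'CYS', resid => 10, chain => 'A', name => 'CB' },
);
my @cys = group_by_attrs( \@atoms, qw(chain resid) );
printf "%d groups\n", scalar @cys;
```

With real atom objects the hash access would become a method call on each attribute name; the composite key is the part that separates identical resids on different chains.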
read_file '1L2Y.pdb' - sysopen: No such file or directory at t/Roles/HackaMol-FileFetchRole.t line 41.
# Tests were run but no plan was declared and done_testing() was not seen.
# Looks like your test exited with 2 just after 6.
This is not an issue with most OSs; see the tester reports on MetaCPAN.
It would be nice to have a way to print all the coordinates into a single PDB file without model breaks (MODEL 1, ENDMDL, etc.). Useful for things like crystals and capsids.
$mol->charge(0) should make molecule at time $mol->t have charge = 0.
It may not make sense to have it.
There are currently no tests for the Z-matrix manipulation
need to fix up
see Bio3d, mdtraj, etc...
Currently, the charge method does not allow the current (t) charge to be set. This was to safe-guard atom data, but it is annoying in practice for molecules. (both atoms and molecules consume the role that provides charge). Override the method in the molecule to make it easier to use for molecules.
is currently, wrongly listed as atomgroups
It would be helpful to have a function that can generate multiple molecules from the configurations in a given file, as if they had been read in from separate files. For example, for a PDB with different models, each model can be loaded into a new molecule. Similar for xyz files.
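A rough sketch of the splitting step for the PDB case, in core Perl. `split_pdb_models` is a hypothetical helper, and it assumes models are delimited by MODEL and ENDMDL records. Each returned chunk could then be handed to whatever reader builds a molecule.

```perl
use strict;
use warnings;

# Split the text of a multi-model PDB file into one array of lines per model.
# A file without MODEL records is returned as a single chunk.
sub split_pdb_models {
    my ($text) = @_;
    my ( @models, @current );
    for my $line ( split /\n/, $text ) {
        next if $line =~ /^MODEL\b/;
        if ( $line =~ /^ENDMDL\b/ ) {
            push @models, [@current] if @current;
            @current = ();
            next;
        }
        # Keep only coordinate records in this sketch.
        push @current, $line if $line =~ /^(?:ATOM|HETATM)/;
    }
    push @models, [@current] if @current;
    return @models;
}

my $pdb = join "\n",
    'MODEL        1', 'ATOM      1  N   ASN A   1', 'ENDMDL',
    'MODEL        2', 'ATOM      1  N   ASN A   1', 'ENDMDL', 'END';
my @models = split_pdb_models($pdb);
printf "%d models\n", scalar @models;
```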
use Modern::Perl;
use HackaMol;
use Math::Trig;
my $bldr = HackaMol->new();
my $pdbid = shift || '5umo';
my $add_2 = shift;
my $sel1 = 'resid 62-75';
my $sel2 = 'resid 121-140';
if ($add_2) {
    $sel1 = 'resid 64-77';
    $sel2 = 'resid 123-142';
}
ERK_0P: {
    my $mol     = $bldr->pdbid_mol($pdbid);
    my $bb      = $mol->select_group('protein')->select_group('backbone');
    my $helix_a = $bb->select_group($sel1);
    my $helix_b = $bb->select_group($sel2);
    my @ca_a    = $helix_a->select_group('name CA')->all_atoms;
    my @ca_b    = $helix_b->select_group('name CA')->all_atoms;
    my $dist_a  = $ca_a[0]->distance( $ca_a[-1] );
    my $dist_b  = $ca_b[0]->distance( $ca_b[-1] );
    my $vec_a   = $helix_a->centered_vector;
    my $vec_b   = $helix_b->centered_vector;

    #$helix_a->print_pdb;
    #$helix_b->print_pdb;
    say $pdbid;
    say rad2deg( atan2( $vec_a, $vec_b ) );

    # to visually debug
    my $catom_a1 = HackaMol::Atom->new(
        symbol  => 'Pb',
        name    => 'PB',
        resname => 'FK1',
        coords  => [ $helix_a->center - $vec_a * $dist_a / 2 ]
    );
    my $catom_a2 = HackaMol::Atom->new(
        symbol  => 'Au',
        name    => 'AU',
        resname => 'FK1',
        coords  => [ $helix_a->center + $vec_a * $dist_a / 2 ]
    );
    my $catom_b1 = HackaMol::Atom->new(
        symbol  => 'Pb',
        name    => 'PB',
        resname => 'FK2',
        coords  => [ $helix_b->center - $vec_b * $dist_b / 2 ]
    );
    my $catom_b2 = HackaMol::Atom->new(
        symbol  => 'Au',
        name    => 'AU',
        resname => 'FK2',
        coords  => [ $helix_b->center + $vec_b * $dist_b / 2 ]
    );
    my $dihe = HackaMol::Dihedral->new(
        atoms => [ $catom_a1, $catom_a2, $catom_b1, $catom_b2 ] );
    say $dihe->dihe_deg;
    my $mol2 = HackaMol::Molecule->new(
        atoms => [
            $helix_a->all_atoms, $catom_a1, $catom_a2,
            $helix_b->all_atoms, $catom_b1, $catom_b2
        ]
    );
    my $i = $bb->count_atoms;
    $_->iatom( $i++ ) foreach $mol2->all_atoms;
    $_->resid(999)    foreach $mol2->all_atoms;
    $mol2->print_pdb("shit.pdb");
}
The PDB format is fussy about how atom names are printed. Most of the time a name has 1-3 characters, but there can be four, and the names should print like this:
'\s\w\s\s'
'\s\w\w\s'
'\s\w\w\w'
'\w\w\w\w'
Programs that are particular about reading PDBs will choke if you send them "%-4s" for all of them. So we need conditional formatting. Bleh.
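The conditional formatting could be as small as this sketch, which follows the four patterns listed above. `pdb_atom_name` is a hypothetical helper name. It keys only on the length of the name, which is an assumption; it does not try to align two-letter element symbols.

```perl
use strict;
use warnings;

# Format an atom name for the 4-character PDB atom-name field:
# names of 1-3 characters get a leading space and are left-justified,
# names of 4 characters (or more, truncated) fill the whole field.
sub pdb_atom_name {
    my ($name) = @_;
    return substr( $name, 0, 4 ) if length($name) >= 4;
    return sprintf ' %-3s', $name;
}

printf "[%s]\n", pdb_atom_name($_) for qw(N CA CG2 HD21);
# prints:
# [ N  ]
# [ CA ]
# [ CG2]
# [HD21]
```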
Think about activation and deactivation of rotatable groups, and whether there is a tricky way to tie into an existing pdbqt file.
Much of the core remains unchanged as roles are added and enhanced. They may need to be extracted from the core repo and developed more modularly to improve releases. Think on this.
Using the LWP user agent on a machine where it failed uncovered the dependencies that need to be installed for it to work.
If you download a lot of PDB files with this method, the website (pdb.org) will refuse your requests.
The website treats the downloads as coming from a robot rather than a human.
One solution:
try Mojo or LWP and present the client as a browser.
Currently, print_pdb, print_xyz, etc. will either print to the screen or to a file handle. A better approach would be to return a string (or lines or an iterator) so that the user can decide where the output should go.
A string could then be used for easy clones (not really clones, but gens of similar objects) via reading in strings.
Perhaps adding the new method would be a good start:
form_string('pdb', [xyz .. etc])
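To illustrate the string-returning idea, here is a sketch for the xyz case in core Perl. `xyz_string` is a hypothetical function, not the proposed form_string method itself, and atoms are represented as plain array refs.

```perl
use strict;
use warnings;

# Build an XYZ-format string instead of printing it.
# Atoms are [symbol, x, y, z] array refs (a stand-in for atom objects).
sub xyz_string {
    my ( $atoms, $comment ) = @_;
    my $out = sprintf "%d\n%s\n", scalar @$atoms, $comment // '';
    $out .= sprintf "%-3s %10.6f %10.6f %10.6f\n", @$_ for @$atoms;
    return $out;
}

my $str = xyz_string( [ [ 'O', 0, 0, 0 ], [ 'H', 0.96, 0, 0 ] ], 'fragment' );

print $str;                                 # to the screen
open my $fh, '>', \my $buffer or die $!;    # or to any filehandle
print {$fh} $str;
close $fh;
```

The caller decides where the string goes, and the same string could later be fed to a reader to generate a similar object.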
Programs such as Orca and Gaussian allow coordinates to be fixed in calculations. Adding the attribute would allow groups of atoms to be flagged when using these programs.
The test suite needs refactoring.
http://blogs.perl.org/users/ovid/2013/10/the-problem-with-perl-testing.html
Can we support 4-letter resnames without messing up the PDB spec for writing/reading?
Found by @tlfobe.
PDB chains are typically letters; big complexes use numbers.
$mol->select("chain 1") gives a warning when the value is compared against letter chains, because of how the regex method sets up the comparison (numeric -> == ). A possible fix is to use strings as attrs and revisit the type. The selector still works, since ('A' == 1) -> '' is still false, but the warning is annoying.
real world example.
5l8r:
Argument "F" isn't numeric in numeric eq (==) at (eval 865) line 1.
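A possible shape for the fix, sketched outside of HackaMol: pick the comparison operator from the values instead of always using ==. `attr_matches` is a hypothetical helper; only `looks_like_number` from the core Scalar::Util module is a real call.

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# Compare an attribute value to a selection value without numeric warnings:
# use == only when both sides look numeric, otherwise fall back to eq.
sub attr_matches {
    my ( $value, $wanted ) = @_;
    return looks_like_number($value) && looks_like_number($wanted)
        ? $value == $wanted
        : $value eq $wanted;
}

for my $chain (qw(A F 1)) {
    printf "chain %s matches 1: %s\n", $chain,
        attr_matches( $chain, 1 ) ? 'yes' : 'no';
}
```

For attributes that are always identifiers, such as chain, comparing as strings unconditionally may be the simpler choice.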
and revisit how it's done.
http://www.rcsb.org/pdb/software/rest.do
$mol->push_groups($some_mol);
is this the intended behavior?
How good is POPS?
The YAML molecular format should allow for variables to be fed in as arrays.
currently debating simple vs flexible representations:
simple:
---
atoms:
- N 0 0.483824 1.697569 -0.701935
- C 0 0.027824 0.314569 -0.780935
- C 0 0.152824 -0.407431 0.563065
- O 0 -0.559176 -1.416431 0.778065
- C 2 CC 3 CCC 4 CCCC
vars:
- CC : 1.54
- CCC : 106.42
- CCCC : [-81.90, 89, 09]
flexible:
---
atoms :
- N : { '0' : [[0,0,0], [0.483824, 1.697569, -0.701935]] }
- C : { '0' : [[0.027824, 0.314569, -0.780935]] }
- C : { '0' : [[0.152824, -0.407431, 0.563065]] }
- O : { '0' : [[-0.559176, -1.416431, 0.778065]] }
- C : { '2' : CC, '3' : CCC, '4' : CCCC }
vars:
- CC : [1.54]
- CCC : [106.42]
- CCCC : [-81.90, 80, 90]
using grep is awesome, but challenging for newbies.
charmm, pymol, and VMD selections could serve as inspirations.
$atomgroup = $mol->vmd_select("same residue as within 5 of element Zn");
that one seems annoying
however,
$backbone = $mol->select_group("backbone");
doesn't seem so bad.
Currently, a PDB will be downloaded only if it doesn't exist in the current directory.
Rather than shift, maybe we should just export some functions for common tasks.
e.g.
my $mol = HackaMol->new()->pdbid_mol("2cba");
could be called via
my $mol = pdbid_mol "2cba"
is there a DRY way to do this?
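One DRY possibility is to generate the exported wrappers in a loop. The sketch below runs on its own by using a dummy builder class; `My::Builder` and `My::Easy` are hypothetical stand-ins, and in practice the wrappers would call HackaMol->new instead.

```perl
use strict;
use warnings;

# Stand-in for a builder class such as HackaMol; method bodies are dummies.
package My::Builder;
sub new           { return bless {}, shift }
sub pdbid_mol     { my ( $self, $id )   = @_; return "mol from PDB id $id" }
sub read_file_mol { my ( $self, $file ) = @_; return "mol from file $file" }

# Hypothetical function-exporting front end.
package My::Easy;
use Exporter 'import';
our @EXPORT_OK = qw(pdbid_mol read_file_mol);

# Generate one wrapper per method name instead of writing each by hand.
for my $method (@EXPORT_OK) {
    no strict 'refs';
    *{ __PACKAGE__ . "::$method" } = sub { My::Builder->new->$method(@_) };
}

package main;

# In a separate file this would be: use My::Easy qw(pdbid_mol);
My::Easy->import(qw(pdbid_mol));
my $mol = pdbid_mol('2cba');
print "$mol\n";
```

Adding another exported function then only means adding its name to the list.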
the api may be cleaner with another class that contains collections of molecules. The molecule collection should have methods for adjusting and comparing structures, such as superposition, packing, etc.
One useful attribute would be that of symmetry operators, and lattice parameters. with one molecule and a handful of symmetry operators one could have a faithful representation of a crystal lattice.
you just gotta have Z matrices. We need support for variables.
The parsing of Z-matrices should occur in two stages:
Stage 1. substitute variables
Stage 2. build the molecule using NERF
Dream: implement a YAML specification for molecular geometries that stores variable as arrays. Looping over the arrays of Z-matrix variables can compactly populate a large ensemble of configurations.
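Stage 1 and the looping idea can be sketched in a few lines of core Perl. `substitute_vars` is a hypothetical helper, and the assumption that a short array wraps around (so a single-valued variable applies to every configuration) is mine. Stage 2, the NERF build, is not shown.

```perl
use strict;
use warnings;

# Stage 1 only: replace variable names in Z-matrix lines with numeric values.
# Each variable maps to an array of values; index $i picks one configuration.
sub substitute_vars {
    my ( $lines, $vars, $i ) = @_;
    my @out;
    for my $line (@$lines) {
        my @fields = split ' ', $line;
        for my $field (@fields) {
            next unless exists $vars->{$field};
            my $values = $vars->{$field};
            $field = $values->[ $i % @$values ];    # short arrays wrap around
        }
        push @out, join ' ', @fields;
    }
    return @out;
}

my @zmat = ('C 2 CC 3 CCC 4 CCCC');
my %vars = ( CC => [1.54], CCC => [106.42], CCCC => [ -81.90, 80, 90 ] );

# Looping over the dihedral values yields one configuration per value.
for my $i ( 0 .. $#{ $vars{CCCC} } ) {
    print "$_\n" for substitute_vars( \@zmat, \%vars, $i );
}
```

Because substitution matches whole whitespace-separated fields, a variable that shares its name with an element symbol would clash; a real parser would need to know which columns may hold variables.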
Need to coerce coordinates into a new memory location....
what is the expected behavior? document and test
This would allow a little more information to be stored in the file
Can HackaMol parse out the resolution of an X-ray PDB?
For example, given a PDB id, could it tell us which technique was used (X-ray or NMR) and the resolution?
The resname field is not properly formatted with respect to that typically expected by programs that render secondary structure.
see Bio::PDB::Structure for some relevant code needed to get the names right.