mdx's People

Contributors

ashenoy463

mdx's Issues

Special chunks are handled awkwardly

They can be put in a separate directory with a structure isomorphic to /home/Work/sim_data, and a different mdx.ingest.Simulation object can be initialized for them, since they are always analysed separately from the regular data anyway.

Rework meta file format

Reworks needed for xarray:

  • Execution dates need to be extracted (through #5)
  • Current nested format is not supported by xarray
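
One possible direction, sketched below, is to flatten the nested metadata into dotted keys before attaching it to an xarray object. This assumes the meta file is loaded as a nested Python dict; the helper name `flatten_meta` and the example keys and values are hypothetical, not the project's actual format.

```python
from typing import Any, Dict


def flatten_meta(nested: Dict[str, Any], sep: str = ".", prefix: str = "") -> Dict[str, Any]:
    """Collapse a nested dict into a single-level dict with joined keys."""
    flat: Dict[str, Any] = {}
    for key, value in nested.items():
        name = f"{prefix}{sep}{key}" if prefix else str(key)
        if isinstance(value, dict):
            # Recurse into sub-dictionaries, carrying the joined key as prefix.
            flat.update(flatten_meta(value, sep=sep, prefix=name))
        else:
            flat[name] = value
    return flat


# Hypothetical metadata, for illustration only.
meta = {"box": {"lx": 10.0, "ly": 12.0}, "run": {"steps": 1000}}
flat = flatten_meta(meta)
```

The flat dict could then be assigned to a dataset's `attrs`. Execution dates would still need to be extracted separately, as noted in the first bullet.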

Expand thermo parsing to entire LAMMPS spec

one args = none
multi args = none
yaml args = none
custom args = list of keywords
  possible keywords = step, elapsed, elaplong, dt, time,
                      cpu, tpcpu, spcpu, cpuremain, part, timeremain,
                      atoms, temp, press, pe, ke, etotal,
                      evdwl, ecoul, epair, ebond, eangle, edihed, eimp,
                      emol, elong, etail,
                      enthalpy, ecouple, econserve,
                      vol, density,
                      xlo, xhi, ylo, yhi, zlo, zhi,
                      xy, xz, yz,
                      avecx, avecy, avecz,
                      bvecx, bvecy, bvecz,
                      cvecx, cvecy, cvecz,
                      lx, ly, lz,
                      xlat, ylat, zlat,
                      cella, cellb, cellc, cellalpha, cellbeta, cellgamma,
                      pxx, pyy, pzz, pxy, pxz, pyz,
                      bonds, angles, dihedrals, impropers,
                      fmax, fnorm, nbuild, ndanger,
                      c_ID, c_ID[I], c_ID[I][J],
                      f_ID, f_ID[I], f_ID[I][J],
                      v_name, v_name[I]
    step = timestep
    elapsed = timesteps since start of this run
    elaplong = timesteps since start of initial run in a series of runs
    dt = timestep size
    time = simulation time
    cpu = elapsed CPU time in seconds since start of this run
    tpcpu = time per CPU second
    spcpu = timesteps per CPU second
    cpuremain = estimated CPU time remaining in run
    part = which partition (0 to Npartition-1) this is
    timeremain = remaining time in seconds on timer timeout.
    atoms = # of atoms
    temp = temperature
    press = pressure
    pe = total potential energy
    ke = kinetic energy
    etotal = total energy (pe + ke)
    evdwl = van der Waals pairwise energy (includes etail)
    ecoul = Coulombic pairwise energy
    epair = pairwise energy (evdwl + ecoul + elong)
    ebond = bond energy
    eangle = angle energy
    edihed = dihedral energy
    eimp = improper energy
    emol = molecular energy (ebond + eangle + edihed + eimp)
    elong = long-range kspace energy
    etail = van der Waals energy long-range tail correction
    enthalpy = enthalpy (etotal + press*vol)
    ecouple = cumulative energy change due to thermo/baro statting fixes
    econserve = pe + ke + ecouple = etotal + ecouple
    vol = volume
    density = mass density of system
    xlo,xhi,ylo,yhi,zlo,zhi = box boundaries
    xy,xz,yz = box tilt for restricted triclinic (non-orthogonal) simulation boxes
    avecx,avecy,avecz = components of edge vector A of the simulation box
    bvecx,bvecy,bvecz = components of edge vector B of the simulation box
    cvecx,cvecy,cvecz = components of edge vector C of the simulation box
    lx,ly,lz = box lengths in x,y,z
    xlat,ylat,zlat = [lattice](https://docs.lammps.org/lattice.html) spacings as calculated by lattice command
    cella,cellb,cellc = periodic cell lattice constants a,b,c
    cellalpha, cellbeta, cellgamma = periodic cell angles alpha,beta,gamma
    pxx,pyy,pzz,pxy,pxz,pyz = 6 components of pressure tensor
    bonds,angles,dihedrals,impropers = # of these interactions defined
    fmax = max component of force on any atom in any dimension
    fnorm = length of force vector for all atoms
    nbuild = # of neighbor list builds
    ndanger = # of dangerous neighbor list builds
    c_ID = global scalar value calculated by a compute with ID
    c_ID[I] = Ith component of global vector calculated by a compute with ID, I can include wildcard (see below)
    c_ID[I][J] = I,J component of global array calculated by a compute with ID
    f_ID = global scalar value calculated by a fix with ID
    f_ID[I] = Ith component of global vector calculated by a fix with ID, I can include wildcard (see below)
    f_ID[I][J] = I,J component of global array calculated by a fix with ID
    v_name = value calculated by an equal-style variable with name
    v_name[I] = value calculated by a vector-style variable with name, I can include wildcard (see below)
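
As a starting point for covering the whole list, the keywords above could be classified with a small lookup plus one regular expression. This is only a sketch: the function name `classify_thermo_keyword` and the returned dict layout are assumptions, and wildcard indices are kept as raw strings rather than expanded.

```python
import re
from typing import Dict, Optional

# Built-in keywords copied from the thermo_style custom list above.
BUILTIN_KEYWORDS = frozenset(
    """
    step elapsed elaplong dt time cpu tpcpu spcpu cpuremain part timeremain
    atoms temp press pe ke etotal evdwl ecoul epair ebond eangle edihed eimp
    emol elong etail enthalpy ecouple econserve vol density
    xlo xhi ylo yhi zlo zhi xy xz yz
    avecx avecy avecz bvecx bvecy bvecz cvecx cvecy cvecz
    lx ly lz xlat ylat zlat
    cella cellb cellc cellalpha cellbeta cellgamma
    pxx pyy pzz pxy pxz pyz
    bonds angles dihedrals impropers fmax fnorm nbuild ndanger
    """.split()
)

# c_ID, f_ID, v_name with an optional [I] (digits and '*') and optional [J].
_REFERENCE = re.compile(r"(?P<kind>[cfv])_(?P<name>\w+)(?:\[(?P<i>[0-9*]+)\])?(?:\[(?P<j>\d+)\])?")
_KINDS = {"c": "compute", "f": "fix", "v": "variable"}


def classify_thermo_keyword(keyword: str) -> Dict[str, Optional[str]]:
    """Return the kind, name and raw indices of one thermo keyword."""
    if keyword in BUILTIN_KEYWORDS:
        return {"kind": "builtin", "name": keyword, "i": None, "j": None}
    match = _REFERENCE.fullmatch(keyword)
    # Variables have no [I][J] form in the list above, so reject it.
    if match is None or (match["kind"] == "v" and match["j"] is not None):
        raise ValueError(f"unrecognised thermo keyword: {keyword!r}")
    return {
        "kind": _KINDS[match["kind"]],
        "name": match["name"],
        "i": match["i"],
        "j": match["j"],
    }


parsed = [classify_thermo_keyword(k) for k in ("step", "temp", "c_msd[4]", "f_ave[*]", "v_press")]
```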

Extend trajectory parsing to entire LAMMPS spec

custom or custom/gz or custom/zstd or cfg or cfg/gz or cfg/zstd or cfg/uef or netcdf or netcdf/mpiio or yaml attributes:

id = atom ID
mol = molecule ID
proc = ID of processor that owns atom
procp1 = ID+1 of processor that owns atom
type = atom type
element = name of atom element, as defined by dump_modify command
mass = atom mass
x,y,z = unscaled atom coordinates
xs,ys,zs = scaled atom coordinates
xu,yu,zu = unwrapped atom coordinates
xsu,ysu,zsu = scaled unwrapped atom coordinates
ix,iy,iz = box image that the atom is in
vx,vy,vz = atom velocities
fx,fy,fz = forces on atoms
q = atom charge
mux,muy,muz = orientation of dipole moment of atom
mu = magnitude of dipole moment of atom
radius,diameter = radius, diameter of spherical particle
omegax,omegay,omegaz = angular velocity of spherical particle
angmomx,angmomy,angmomz = angular momentum of aspherical particle
tqx,tqy,tqz = torque on finite-size particles
c_ID = per-atom vector calculated by a compute with ID
c_ID[I] = Ith column of per-atom array calculated by a compute with ID, I can include wildcard (see below)
f_ID = per-atom vector calculated by a fix with ID
f_ID[I] = Ith column of per-atom array calculated by a fix with ID, I can include wildcard (see below)
v_name = per-atom vector calculated by an atom-style variable with name
i_name = custom integer vector with name
d_name = custom floating point vector with name
i2_name[I] = Ith column of custom integer array with name, I can include wildcard (see below)
d2_name[I] = Ith column of custom floating point array with name, I can include wildcard (see below)
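
One way to handle arbitrary attribute lists is to read the column names from the frame header instead of hard-coding them. The sketch below assumes the plain-text `custom` dump layout, where each frame is a sequence of `ITEM:` sections and the `ITEM: ATOMS` line names the columns. The function name `parse_dump_frame` is hypothetical, and per-atom values are left as strings rather than converted.

```python
from typing import Any, Dict, List


def parse_dump_frame(text: str) -> Dict[str, Any]:
    """Parse a single text-format dump frame into a plain dict."""
    lines: List[str] = [ln.strip() for ln in text.strip().splitlines()]
    frame: Dict[str, Any] = {}
    i = 0
    while i < len(lines):
        line = lines[i]
        if line == "ITEM: TIMESTEP":
            frame["timestep"] = int(lines[i + 1])
            i += 2
        elif line == "ITEM: NUMBER OF ATOMS":
            frame["n_atoms"] = int(lines[i + 1])
            i += 2
        elif line.startswith("ITEM: BOX BOUNDS"):
            # Three lines of bounds, one per dimension.
            frame["box"] = [tuple(float(v) for v in lines[i + k].split()) for k in (1, 2, 3)]
            i += 4
        elif line.startswith("ITEM: ATOMS"):
            # Everything after "ITEM: ATOMS" is the list of dumped attributes.
            columns = line.split()[2:]
            rows = lines[i + 1 : i + 1 + frame["n_atoms"]]
            frame["columns"] = columns
            frame["atoms"] = [dict(zip(columns, row.split())) for row in rows]
            i += 1 + frame["n_atoms"]
        else:
            raise ValueError(f"unexpected line in dump frame: {line!r}")
    return frame


# Small hand-written frame, for illustration only.
example = """ITEM: TIMESTEP
0
ITEM: NUMBER OF ATOMS
2
ITEM: BOX BOUNDS pp pp pp
0.0 10.0
0.0 10.0
0.0 10.0
ITEM: ATOMS id type x y z
1 1 0.0 0.0 0.0
2 1 1.5 1.5 1.5
"""
frame = parse_dump_frame(example)
```

Because the columns come from the header, attributes such as `c_ID[I]` or `d2_name[I]` would pass through as column names without special handling.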

local or local/gz or local/zstd attributes:

possible attributes = index, c_ID, c_ID[I], f_ID, f_ID[I]
  index = enumeration of local values
  c_ID = local vector calculated by a compute with ID
  c_ID[I] = Ith column of local array calculated by a compute with ID, I can include wildcard (see below)
  f_ID = local vector calculated by a fix with ID
  f_ID[I] = Ith column of local array calculated by a fix with ID, I can include wildcard (see below)

grid or grid/vtk attributes:

possible attributes = c_ID:gname:dname, c_ID:gname:dname[I], f_ID:gname:dname, f_ID:gname:dname[I]
  gname = name of grid defined by compute or fix
  dname = name of data field defined by compute or fix
  c_ID = per-grid vector calculated by a compute with ID
  c_ID[I] = Ith column of per-grid array calculated by a compute with ID, I can include wildcard (see below)
  f_ID = per-grid vector calculated by a fix with ID
  f_ID[I] = Ith column of per-grid array calculated by a fix with ID, I can include wildcard (see below)

`mdx.ingest.Simulation` should hold data states

Instead of just being a holder for parsing methods, the Simulation object should hold trajectory/bonds/species data in its attributes.

  • Formulate a list of attributes to have

  • Make class methods update those attributes

  • Handlers from mdx.io can then be initialized separately by passing the Simulation object to a get_writer() function
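
A rough sketch of what this could look like follows. The class below is a hypothetical stand-in, not the real mdx.ingest.Simulation, and the attribute names, `read_trajectory`, and the signature of `get_writer()` are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional


@dataclass
class Simulation:
    """Hypothetical stand-in for mdx.ingest.Simulation; not the real class."""

    data_dir: str
    # Data states start empty and are filled in by the parsing methods.
    trajectory: Optional[List[Dict[str, Any]]] = None
    bonds: Optional[List[Dict[str, Any]]] = None
    species: Optional[List[Dict[str, Any]]] = None

    def read_trajectory(self, frames: List[Dict[str, Any]]) -> "Simulation":
        """Store parsed frames on the object instead of only returning them."""
        self.trajectory = list(frames)
        return self


def get_writer(sim: Simulation) -> Callable[[str], int]:
    """Return a writer bound to one Simulation (assumed shape of the mdx.io hook)."""

    def write(attribute: str) -> int:
        data = getattr(sim, attribute)
        if data is None:
            raise ValueError(f"{attribute!r} has not been parsed yet")
        # A real handler would serialise `data`; this sketch only counts records.
        return len(data)

    return write


sim = Simulation(data_dir="/home/Work/sim_data")
sim.read_trajectory([{"timestep": 0}, {"timestep": 100}])
writer = get_writer(sim)
n_written = writer("trajectory")
```

Keeping the writer outside the class means output handlers depend only on the attributes the Simulation exposes, not on how they were parsed.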

Adopting xarray as sole canonical format

Reasons for:

  • One-time investment: no more dealing with text-stream overhead, only optimised operations.

  • Respects the poly-indexability of our data. We can index by timestep, box-time or atom_id, and we could even form and track groups.

  • Naturally supports binning, chunking, averaging and mapping functions over arrays.

  • Thanks to Dask, we get efficient parallel and out-of-core processing, and optimised analysis functions can be developed with minimal effort. We do not need to reinvent the wheel.
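
To make the indexing argument concrete, here is a minimal sketch of a trajectory held in an `xarray.Dataset`. The dimension names, the `position` variable, the timestep size `dt_fs` and the toy values are assumptions for illustration, not the project's actual schema.

```python
import numpy as np
import xarray as xr

timesteps = np.array([0, 100, 200])
atom_ids = np.array([1, 2, 3, 4])
dt_fs = 0.5  # assumed timestep size, used to derive a box-time coordinate

# Toy positions with shape (frames, atoms, xyz).
positions = np.arange(3 * 4 * 3, dtype=float).reshape(3, 4, 3)

ds = xr.Dataset(
    data_vars={"position": (("timestep", "atom_id", "xyz"), positions)},
    coords={
        "timestep": timesteps,
        "atom_id": atom_ids,
        "xyz": ["x", "y", "z"],
        # Box-time is a second coordinate along the same dimension as timestep.
        "time": ("timestep", timesteps * dt_fs),
    },
)

frame = ds.sel(timestep=100)                                  # one frame by timestep
same_frame = ds.swap_dims({"timestep": "time"}).sel(time=50.0)  # same frame by box-time
atom_track = ds["position"].sel(atom_id=2)                    # one atom across all frames
mean_pos = ds["position"].mean(dim="timestep")                # time average per atom
```

The same dataset can be handed to Dask through xarray's chunking support when it no longer fits in memory, which is what the last bullet refers to.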
