42's Introduction

Open Chemistry

Introduction

The Open Chemistry project is a collection of open source, cross-platform libraries and applications for the exploration, analysis and generation of chemical data. The project builds upon various efforts by collaborators and innovators in open chemistry such as the Blue Obelisk, Quixote and associated projects. We aim to improve the state of the art and facilitate the open exchange of ideas and chemical data, leveraging the best technologies in quantum chemistry codes, molecular dynamics, informatics and visualization.

This repository contains git submodules for the Open Chemistry projects: Avogadro, MoleQueue and MongoChem. It can be used to download all relevant source files and to build many of the necessary dependencies. Please see the documentation in the submodules for more details about each project.

Kitware, Inc.

Installing

We provide nightly binaries built by our dashboards for Mac OS X and Windows. If you would like to build from source, we recommend that you follow our building Open Chemistry guide, which will take care of building most dependencies.

Contributing

Our project uses the standard GitHub pull request process for code review and integration. Please check our development guide for more details on developing and contributing to the project. The GitHub issue tracker can be used to report bugs, make feature requests, etc.

Our wiki is used to document features, flesh out designs and host other documentation. Our API is documented using Doxygen, with updated documentation generated nightly. We have several mailing lists to coordinate development and to provide support.

42's Issues

Remote Open Chemistry server from local JupyterLab

We want to be able to connect to a specified Open Chemistry server using the server URL for public API/searches, and a username/API key for privileged endpoints. This can be used with a local JupyterLab instance, or with central resources such as those provided at NERSC.
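As a rough sketch of what this could look like from a notebook, the snippet below builds (but does not send) requests against a server URL, attaching credentials only when a username and API key are supplied. The class name, the `X-Username` and `X-Api-Key` header names, and the example URL are all assumptions made for illustration; the real server may well use a different authentication scheme.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlencode
from urllib.request import Request


@dataclass
class ServerConnection:
    """Connection settings for a remote server (hypothetical helper)."""

    base_url: str
    username: Optional[str] = None
    api_key: Optional[str] = None

    def build_request(self, path: str, params: Optional[dict] = None) -> Request:
        """Build, without sending, a GET request for an endpoint path."""
        url = self.base_url.rstrip("/") + "/" + path.lstrip("/")
        if params:
            url += "?" + urlencode(params)
        request = Request(url, method="GET")
        # Assumed header names: public endpoints get no credentials,
        # privileged ones get the username and API key.
        if self.username and self.api_key:
            request.add_header("X-Username", self.username)
            request.add_header("X-Api-Key", self.api_key)
        return request


# Placeholder URL and credentials, for illustration only.
public = ServerConnection("https://example.org/api/v1")
private = ServerConnection("https://example.org/api/v1", "alice", "secret-key")
```

A request built this way could then be passed to `urllib.request.urlopen`, so the same notebook code would work against a local deployment or a central one by changing only `base_url`.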

Search API and interfaces to it

Currently, endpoints such as molecules and calculations either return every element in the database, which will not scale, or search on a single field such as name or InChI. Our search capabilities are also quite limited at present. On the backend we want to support search with some common features across our different collections/endpoints:

  • Automatically limit to 25 search results
  • Ensure returned data is useful, but not too big
  • Support for changing the limit
  • Support for specifying an offset
  • Support for ordering by parameters (default to most recent first)
  • Support for changing the sort order/direction

The returned data should follow a similar pattern too, with a JSON object containing a high-level summary of the results and a results array containing result objects:

{
  "matches": 42,
  "limit": 2,
  "offset": 0,
  "results": [ { ... }, { ... } ]
}
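To make the list of features and the response shape above concrete, here is a minimal in-memory sketch. The function name, the `created` sort field, and the parameter names are assumptions for illustration; a real backend would push the limit, offset and sort down into the database query rather than sorting in Python.

```python
from typing import Any, Dict, List

DEFAULT_LIMIT = 25  # default number of results, as proposed above


def search_page(
    records: List[Dict[str, Any]],
    limit: int = DEFAULT_LIMIT,
    offset: int = 0,
    sort: str = "created",  # hypothetical field name for "most recent"
    descending: bool = True,
) -> Dict[str, Any]:
    """Return one page of results wrapped in the summary object."""
    if limit < 0 or offset < 0:
        raise ValueError("limit and offset must be non-negative")
    ordered = sorted(records, key=lambda record: record[sort], reverse=descending)
    return {
        "matches": len(records),  # total matches, not just this page
        "limit": limit,
        "offset": offset,
        "results": ordered[offset:offset + limit],
    }
```

Calling `search_page(records, limit=2)` on a list of records would return the two most recent ones, with `matches` still reporting the size of the full result set so a client can page through it.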

I think we need to extend our concept of users to include useful data such as ORCID, Twitter username, etc. that can be made public, so that you might search on a name, ORCID, etc. to see results for that person in molecules, calculations, ... There are a few things we should try to get working in search too, including queries like USER AND heavy atom count = 10, > 10, etc. The same goes for molecular weight, formula, InChI, InChI key, and SMILES.
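One possible way to express such queries is to translate simple conditions into a MongoDB-style filter document. This is only a sketch under stated assumptions: the condition syntax, the field names (`heavy_atom_count`, `creator`, `molecular_weight`) and the helper functions are invented for illustration, and the only parts taken from MongoDB itself are the standard `$eq`, `$gt`, `$gte`, `$lt`, `$lte` and `$and` operators.

```python
import re
from typing import Any, Dict

# Map of comparison symbols to MongoDB query operators.
_OPERATORS = {"=": "$eq", ">": "$gt", ">=": "$gte", "<": "$lt", "<=": "$lte"}
_CONDITION = re.compile(r"^\s*(\w+)\s*(>=|<=|=|>|<)\s*(.+?)\s*$")


def parse_condition(text: str) -> Dict[str, Any]:
    """Turn 'field > 10' into {'field': {'$gt': 10}}."""
    match = _CONDITION.match(text)
    if match is None:
        raise ValueError("cannot parse condition: %r" % text)
    field, operator, raw = match.groups()
    try:
        # Numeric values become numbers; anything else stays a string.
        value: Any = float(raw) if "." in raw else int(raw)
    except ValueError:
        value = raw
    return {field: {_OPERATORS[operator]: value}}


def build_query(*conditions: str) -> Dict[str, Any]:
    """Combine conditions with a logical AND."""
    if not conditions:
        raise ValueError("at least one condition is required")
    return {"$and": [parse_condition(condition) for condition in conditions]}
```

For example, `build_query("creator = marcus", "heavy_atom_count > 10")` would produce a filter matching molecules from one user with more than ten heavy atoms.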

I think starting with molecules search is good as it is simpler; after that, it would be good to think about calculations, with queries such as calculations run by Marcus using NWChem or Psi4, sorted by most recent. These should come after the card work. This likely needs further discussion, but I am writing down some ideas.

Card and table view for molecules

Previous work featured a card view:
[image: card view]
and a table view:
[image: table view]
We need equivalents for displaying search results, and should look at possibilities for 3D structures that summarize a sequence of results. We need to think about this for the single page interface, and also about what we can do within Jupyter.

Run multiple SMILES in one docker container

In order to be able to do this, I think we will need to modify a little bit of our calculation workflow. @alesgenova can correct me if I'm wrong about any parts, but I think this is what happens:

  1. Before the docker container runs, a calculation is posted to the calculations collection. This calculation contains a single molecule ID and most of the needed input. It is considered to be a pending calculation.

  2. A taskflow is created using information from this calculation. It gets the molecule information via the molecule ID, converts it to the format needed by the docker container, and also gets the input from the calculation. The docker container is then run for this single molecule.

  3. When the docker container is finished, the output is written to the calculation and then put back into the database (overwriting the original calculation used for the input).

  4. When a user tries to run a calculation in the future, it checks to see if a calculation already exists that used that moleculeId, input parameters, and docker image (to avoid re-running the same calculation).
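The bookkeeping in steps 1 and 4 can be sketched as follows. This is an in-memory stand-in written for illustration, not the actual implementation: the class, the `calculation_key` helper and the `status` field are assumptions, and the real workflow stores calculations in a database collection.

```python
import hashlib
import json
from typing import Any, Dict


def calculation_key(molecule_id: str, input_parameters: Dict[str, Any], image: str) -> str:
    """Stable key for (molecule, input parameters, docker image)."""
    payload = json.dumps(
        {"moleculeId": molecule_id, "input": input_parameters, "image": image},
        sort_keys=True,  # make the key independent of parameter ordering
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class CalculationStore:
    """In-memory stand-in for the calculations collection."""

    def __init__(self) -> None:
        self._by_key: Dict[str, Dict[str, Any]] = {}

    def submit(self, molecule_id: str, input_parameters: Dict[str, Any], image: str) -> Dict[str, Any]:
        """Return the existing calculation if one matches, else post a pending one."""
        key = calculation_key(molecule_id, input_parameters, image)
        existing = self._by_key.get(key)
        if existing is not None:
            return existing  # step 4: avoid re-running the same calculation
        calculation = {
            "moleculeId": molecule_id,
            "input": input_parameters,
            "image": image,
            "status": "pending",  # step 1: posted before the container runs
        }
        self._by_key[key] = calculation
        return calculation
```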

To run multiple calculations in a single docker container in general (such as running multiple SMILES through chemml), I think we'll need to modify this workflow.

How? I'm not quite sure yet. But I have one proposal at least: what if we make it so that a calculation can have a list of molecule IDs instead of just a single molecule ID? And then the single docker container can run the same input parameters on all of the molecules in the list in one go.
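A minimal sketch of that proposal, assuming a calculation document with a `moleculeIds` list: the container is invoked once with every molecule, and the outputs are mapped back to their IDs. The field names and the `load_molecule` and `run_container` callables are placeholders invented here, standing in for the database lookup and the docker invocation.

```python
from typing import Any, Callable, Dict, List


def run_batch(
    calculation: Dict[str, Any],
    load_molecule: Callable[[str], Any],
    run_container: Callable[[str, Dict[str, Any], List[Any]], List[Any]],
) -> Dict[str, Any]:
    """Run one container over every molecule listed in the calculation."""
    molecule_ids = calculation["moleculeIds"]
    molecules = [load_molecule(molecule_id) for molecule_id in molecule_ids]
    # A single container invocation receives the shared input parameters
    # and the full list of molecules.
    outputs = run_container(calculation["image"], calculation["input"], molecules)
    if len(outputs) != len(molecules):
        raise ValueError("expected one output per molecule")
    finished = dict(calculation)  # leave the pending document untouched
    finished["results"] = dict(zip(molecule_ids, outputs))
    finished["status"] = "complete"
    return finished
```

One open question this leaves is how the existing duplicate check should treat a list: per molecule, or per exact list of molecule IDs.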

Thoughts, @alesgenova, @cjh1, and @cryos?

Notes on issues with data.openchemistry.org

I was just taking a look at the new deployment, and thought I would point out a few issues I saw as I was using it.

  • Pasting in a link from a view I was looking at results in a 404, e.g. here
  • The menu on the visualization for that page (or any of the menus on the visualization widget) is broken: it shows only text, with no actual menu
  • Need user API key management piece

I think that is all I have for now. It would be good to get both working soon.

Plots/graphs in notebooks

Plotly and bqplot offer significant out-of-the-box functionality for standard plots in notebooks. We should integrate some examples of their use, especially for simple plots where we just want to display scatter plots, bar charts, etc. This likely involves adding some optional dependencies and ensuring they are present in our demonstration installation.
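As a starting point for such an example, the sketch below builds a scatter plot as a plain dictionary in Plotly's figure layout (`data` traces plus a `layout`), so the helper itself needs no optional dependency. The function name and its parameters are assumptions for illustration; only the dictionary structure follows Plotly's figure format.

```python
from typing import Any, Dict, Sequence


def scatter_spec(
    x: Sequence[float],
    y: Sequence[float],
    title: str = "",
    x_label: str = "",
    y_label: str = "",
) -> Dict[str, Any]:
    """Build a scatter plot description in Plotly's figure dictionary format."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    return {
        "data": [
            {"type": "scatter", "mode": "markers", "x": list(x), "y": list(y)}
        ],
        "layout": {
            "title": {"text": title},
            "xaxis": {"title": {"text": x_label}},
            "yaxis": {"title": {"text": y_label}},
        },
    }
```

In a notebook with Plotly installed, the result could be displayed with `plotly.graph_objects.Figure(scatter_spec(...)).show()`.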
