
dpgen2's People

Contributors

amcadmus, angel-jia, angusezhang, huangjiameng, imgbot[bot], njzjz, pre-commit-ci[bot], wanghan-iapcm, wangzyphysics, zjgemi

dpgen2's Issues

Implement Gaussian

This issue asks for support of the first-principles (FP) software Gaussian.
The implementation is supposed to take advantage of the new interface for adding FP methods introduced by PR #98.

dpgen2 showkey does not seem to work in debug mode

urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='127.0.0.1', port=2746): Max retries exceeded with url: /api/v1/workflows/argo/dpgen-f05hl/ (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x2b372af03cd0>: Failed to establish a new connection: [Errno 111] Connection refused'))

automatic docs

including:

  • automatic Python API docs
  • automatic dargs docs
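A minimal Sphinx configuration sketch for this; autodoc/napoleon cover the Python API docs, while the dargs extension name below is an assumption and should be checked against the dargs documentation:

# conf.py -- hedged sketch, not the actual dpgen2 docs configuration
extensions = [
    "sphinx.ext.autodoc",   # automatic Python API docs
    "sphinx.ext.napoleon",  # numpy/google style docstrings
    "dargs.sphinx",         # assumed extension for rendering dargs argument docs
]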

Bug: workflow template size limit exceeded

The scheduler is a class with a large amount of data. Passing the scheduler as a dflow.Parameter causes an error like

workflow templates are limited to 128KB, this workflow is 172157 bytes

Quick fix: encode the class in a file and pass the scheduler as a dflow.Artifact.
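A minimal sketch of this quick fix, assuming the scheduler object is picklable; the helper names are illustrative and the actual artifact upload/download is left to dflow:

import pickle
from pathlib import Path

def dump_scheduler(scheduler, path="scheduler.pkl"):
    # serialize the scheduler to a file so it can be passed as a
    # dflow.Artifact instead of a size-limited workflow parameter
    with Path(path).open("wb") as f:
        pickle.dump(scheduler, f)
    return Path(path)

def load_scheduler(path="scheduler.pkl"):
    # restore the scheduler inside the OP that consumes the artifact
    with Path(path).open("rb") as f:
        return pickle.load(f)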

User interface for monitoring the dpgen2 workflow

Via the interface, the user should be able to check (a rough sketch is given after the list):

  • the stage of the exploration
  • the iteration of the exploration
  • the accurate, candidate, and failed ratios of the exploration
  • the number of data generated in each iteration, and the accumulated amount of data
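A rough sketch of the kind of summary such an interface could print; all names are hypothetical and not part of dpgen2's actual CLI:

def print_exploration_status(stage, iteration, n_accurate, n_candidate, n_failed,
                             n_new_data, n_total_data):
    # report the quantities listed above for one iteration
    n_total = n_accurate + n_candidate + n_failed
    print(f"stage {stage}, iteration {iteration}")
    print(f"  accurate  : {n_accurate / n_total:.2%}")
    print(f"  candidate : {n_candidate / n_total:.2%}")
    print(f"  failed    : {n_failed / n_total:.2%}")
    print(f"  data this iteration: {n_new_data}, accumulated: {n_total_data}")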

System.from does not support the mixed_type data

The lmp exploration does not support "fmt": "deepmd/npy/mixed", as used in the following configuration:

"convergence": {
         "type":                    "adaptive-lower",
        "conv_tolerance":            0.00,
        "numb_candi_f":             10,
        "_rate_candi_f":              0,
        "level_f_hi":                0.5,
        "n_checked_steps":           8,
        "_command":      "all"
     },
     "max_numb_iter" :       1,
     "fatal_at_max" :        false,
     "output_nopbc":         true,
     "configuration_prefix": null,
     "configurations":       [
         {
             "type" : "file",
             "files" : ["md.data/35"],
             "fmt" : "deepmd/npy/mixed",
             "remove_pbc" : true
         }

Implement the op RunVasp

Implement the op RunVasp. This OP runs a VASP DFT task prepared by PrepVasp, and outputs the labeled configuration (coordinates, simulation cell, DFT energy, forces, and virial) in the deepmd/npy data format provided by dpdata.

We have to implement the execute method of the class. What the op does is explained in the docstring of the class, and the interface of the execute method is provided in the docstring of the method.

One can also take the implementation of RunDPTrain as an example for RunVasp.
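A minimal sketch of what the execute method has to accomplish, assuming the task directory was prepared by PrepVasp; the function name and VASP command are illustrative, not dpgen2's actual API:

import subprocess
from pathlib import Path
import dpdata

def run_vasp_task(task_dir: Path, command: str = "mpirun vasp_std") -> Path:
    # run VASP in the prepared task directory
    subprocess.run(command, shell=True, cwd=task_dir, check=True)
    # parse coordinates, cell, energy, forces and virial from OUTCAR
    labeled = dpdata.LabeledSystem(str(task_dir / "OUTCAR"), fmt="vasp/outcar")
    # store the labeled configuration in the deepmd/npy format
    out_dir = task_dir / "data"
    labeled.to("deepmd/npy", str(out_dir))
    return out_dir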

Implement adaptive trust level stage scheduler

Implement the adaptive trust level scheduler. In each iteration this scheduler (see the sketch below):

  1. sorts the model deviations of all explored configurations,
  2. selects a certain number of configurations with the highest model deviations as candidates,
  3. sets the new trust level to the highest model deviation of the remaining configurations,
  4. considers the exploration stage converged when the trust level does not decrease.
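A sketch of the update rule described above; the function name and signature are illustrative, not dpgen2's actual API:

import numpy as np

def adaptive_trust_update(model_devis, numb_candi, prev_level):
    # model_devis: model deviations of all explored configurations
    model_devis = np.asarray(model_devis)
    order = np.argsort(model_devis)[::-1]      # highest deviation first
    candidates = order[:numb_candi]            # top-N deviations become candidates
    rest = order[numb_candi:]
    # new trust level: highest model deviation among the remaining configurations
    new_level = float(model_devis[rest].max()) if rest.size else prev_level
    # the stage converges when the trust level stops decreasing
    converged = new_level >= prev_level
    return candidates, new_level, converged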

User interface for downloading resultant files.

With the interface, a user should be able to download the resultant files of the dpgen iterations:

  • the result files of training, including the models, the learning curves, and the logs;
  • the result files of lmp exploration, including the input and output files of the lmp exploration tasks;
  • the result files of labeling, including the input and output files of the VASP tasks.

how to set group_size in run_train_config?

As described in the doc, step_configs/run_train_config/template_slice_config can be set, but doing so raises the following error:

Traceback (most recent call last):
  File "/home/user/miniconda3/envs/dp/bin/dpgen2", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/user/miniconda3/envs/dp/lib/python3.11/site-packages/dpgen2/entrypoint/main.py", line 329, in main
    submit_concurrent_learning(
  File "/home/user/miniconda3/envs/dp/lib/python3.11/site-packages/dpgen2/entrypoint/submit.py", line 621, in submit_concurrent_learning
    dpgen_step, finetune_step = workflow_concurrent_learning(
                                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/miniconda3/envs/dp/lib/python3.11/site-packages/dpgen2/entrypoint/submit.py", line 399, in workflow_concurrent_learning
    concurrent_learning_op = make_concurrent_learning_op(
                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/miniconda3/envs/dp/lib/python3.11/site-packages/dpgen2/entrypoint/submit.py", line 142, in make_concurrent_learning_op
    prep_run_train_op = PrepRunDPTrain(
                        ^^^^^^^^^^^^^^^
  File "/home/user/miniconda3/envs/dp/lib/python3.11/site-packages/dpgen2/superop/prep_run_dp_train.py", line 187, in __init__
    self = _prep_run_dp_train(
           ^^^^^^^^^^^^^^^^^^^
  File "/home/user/miniconda3/envs/dp/lib/python3.11/site-packages/dpgen2/superop/prep_run_dp_train.py", line 240, in _prep_run_dp_train
    prep_train = Step(
                 ^^^^^
TypeError: Step.__init__() got an unexpected keyword argument 'template_slice_config'

implement more OPs and workflows

DPGEN2 refactors DP-GEN based on dflow. The technical details can be found here. Currently, only the DP+LAMMPS+VASP workflow has been implemented.

The following features, which are part of DP-GEN, have not been implemented in DPGEN2 yet:

  • Workflows
    • simplify
    • init data
  • MD OPs
    • Gromacs
    • AMBER
    • calypso
    • extra features like enhanced sampling
  • FP OPs
    • cp2k
    • gaussian
    • siesta
    • abacus
  • Taking clusters for sampling

DPGEN2 also accepts features that were not implemented in DP-GEN.


Notes:

  1. Unit tests and examples should be added.
  2. PEP-8 should be obeyed.
  3. DP-GEN codes can be reused under the LGPL-3.0 license.
  4. DPGEN2 offers some useful utils, including dpgen2.utils.chdir.set_directory and dpgen2.utils.run_command.run_command (usage sketched below).
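A hedged usage sketch of these utilities; the exact signatures and return values should be checked against the dpgen2 source:

from pathlib import Path
from dpgen2.utils.chdir import set_directory
from dpgen2.utils.run_command import run_command

# run a command inside a task directory, then return to the original cwd
with set_directory(Path("task.000000")):
    ret, out, err = run_command(["dp", "train", "input.json"])
    if ret != 0:
        raise RuntimeError(f"command failed with return code {ret}:\n{err}")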

dpgen2 does not support CP2K

I attempted to use the CP2K package with DPGEN2, but encountered the following error:

raise ArgumentValueError(
dargs.dargs.ArgumentValueError: [at location `fp`] get invalid choice `cp2k` for flag key `type`.

It would be greatly beneficial if the developers could provide support for CP2K in DPGEN2, as CP2K is a valuable tool.

The CICD of the project

Set up continuous integration and continuous deployment for dpgen2:

  • unit tests on pull requests
  • distribution via pip and/or conda
  • building docker images

Implement the op PrepVasp

This op prepares the VASP DFT calculation tasks. It takes a VASP input template and a list of configurations, and outputs a list of paths, each containing all the files necessary to run a VASP DFT task.

We have to implement the execute method of the class. What the op does is explained in the docstring of the class, and the interface of the execute method is provided in the docstring of the method.

One can also take the implementation of PrepDPTrain as an example for PrepVasp.
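A minimal sketch of the preparation step, assuming POSCAR-format input configurations; the function name and file handling (e.g. KPOINTS and POTCAR are omitted) are simplified and illustrative:

import shutil
from pathlib import Path
import dpdata

def prep_vasp_tasks(incar_template, confs, task_prefix="task"):
    task_dirs = []
    for ii, conf in enumerate(confs):
        task_dir = Path(f"{task_prefix}.{ii:06d}")
        task_dir.mkdir(parents=True, exist_ok=True)
        # copy the VASP input template into the task directory
        shutil.copy(incar_template, task_dir / "INCAR")
        # write the configuration as POSCAR via dpdata
        sys = dpdata.System(str(conf), fmt="vasp/poscar")
        sys.to("vasp/poscar", str(task_dir / "POSCAR"))
        task_dirs.append(task_dir)
    return task_dirs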

CH4 example needs update

Some of the keys in the input file example/ch4/param_CH4_deepmd-kit-2.0.1.json are outdated. For example, run_fp_image is no longer supported. The input file needs to be updated.

Implement the op CollectData

This op:

  1. collects the labeled data stored in each FP task directory,
  2. stores the data in one directory,
  3. adds the directory to the training data.

The source code of the op is here. One may consult the docstring for the interface of the op.
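A sketch under the assumption that each FP task wrote its labels as deepmd/npy data in a data/ subdirectory; the function name and layout are illustrative:

from pathlib import Path
import dpdata

def collect_data(fp_task_dirs, out_dir="iter_data", train_dirs=None):
    ms = dpdata.MultiSystems()
    for task_dir in fp_task_dirs:
        # 1. collect the labeled data from each fp task directory
        ms.append(dpdata.LabeledSystem(str(Path(task_dir) / "data"), fmt="deepmd/npy"))
    # 2. store the merged data in one directory
    ms.to_deepmd_npy(out_dir)
    # 3. add the directory to the training data
    train_dirs = list(train_dirs or [])
    train_dirs.append(Path(out_dir))
    return train_dirs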

RFC: Refactor DPGEN2 with a new design

Hi community,

This RFC proposes to refactor the DPGEN workflow with a new design based on dflow.

A typical DPGEN2 configuration looks like the following:
https://github.com/deepmodeling/dpgen2/blob/master/examples/chno/input.json

IMHO there are some issues with this configuration:

  1. The context configuration (executor, container, etc.) is mixed with the configuration of the algorithm.
  2. It is hard to validate such a configuration with a tool like pydantic, which makes it error-prone.
  3. Data files are not allowed to carry their own configuration, which makes it hard to train different systems at the same time.

A suggested pseudo-configuration design is shown below; it borrows some ideas from the ai2-kit project.
This configuration is intended to be more formal and cleaner to maintain.

# executor configuration
executor:
  bohrium: ...

# dflow configuration for each software
dflow:
  python:
    container: ai2-kit/0.12.10
    python_cmd: python3
  deepmd:
    container: deepmd/2.7.1
    dp_cmd: dp
  lammps:
    container: deepmd/2.7.1
    lammps_cmd: lmp
  cp2k:
    container: cp2k/2023.1
    cp2k_cmd: mpirun cp2k.psmp

# declare file resources as datasets before use them
# so that we can assign extra attributes to them
datasets:
  dpdata-Ni13Pd12:
    url: /path/to/data
    format:  deepmd/npy

  sys-Ni13Pd12:
    url: /path/to/data
    includes: POSCAR*
    format: vasp
    attrs:
    # allow users to define system-wise configuration
    # so that we can explore multiple types of systems in an iteration
      lammps:
        plumed_config: !load_text plumed.inp # use custom yaml tags to embed data from another file
      cp2k:
        input_template: !load_text cp2k.inp

workflow:
  general:
    type_map: [C, O, H]
    mass_map: [12, 16, 1]
    max_iters: 5

  train:
    deepmd:
      init_dataset: [dpdata-Ni13Pd12]
      input_template: !load_yaml deepmd.json  # use custom yaml tags to embed data from another file

  explore:
    # instead of using `type: lammps` to specify different software,
    # use a dedicated entry for each software of the same stage,
    # so that we can use pydantic to validate the configuration items
    # and arrive at a better code structure:
    # https://github.com/chenggroup/ai2-kit/blob/main/ai2_kit/workflow/cll_mlp.py#L163-L293
    lammps:
      nsteps: 10
      systems: [ sys-Ni13Pd12 ]  # reference dataset via key
      # support different variable combination strategies to avoid combinatorial explosion:
      # vars defined in `explore_vars` combine with system_files via Cartesian product,
      # vars defined in `broadcast_vars` just broadcast to system_files;
      # this design is useful when there are a lot of files
      explore_vars:
        TEMP: [330, 430, 530]
      broadcast_vars:
        LAMBDA_f: [0.0, 0.25, 0.5, 0.75, 1.0]
      template_vars:
        POST_INIT:  |
          neighbor bin 2.0
      plumed_config: !load_text plumed.inp

  # the select stage is isolated from explore so that more complex structure selection algorithms can be implemented
  select:
    model_devi:
      decent_f: [0.12, 0.18]
    limit: 50

  label:
    cp2k:
      input_template: !load_text cp2k.inp

next:
  # specify the configuration for the next iteration;
  # it will be merged with the current configuration to form the configuration file for the next round
  config: !load_yml iter-001.yml

The above configuration is easy to validate with pydantic, for example:
https://github.com/chenggroup/ai2-kit/blob/main/ai2_kit/workflow/cll_mlp.py#L32-L111
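For illustration, a minimal pydantic sketch of two of the sections above; the field names mirror the YAML and are not an actual ai2-kit or dpgen2 schema:

from typing import Dict, List, Optional
from pydantic import BaseModel

class GeneralConfig(BaseModel):
    type_map: List[str]
    mass_map: List[float]
    max_iters: int

class LammpsConfig(BaseModel):
    nsteps: int
    systems: List[str]
    explore_vars: Dict[str, List[float]] = {}
    broadcast_vars: Dict[str, List[float]] = {}
    plumed_config: Optional[str] = None

# validation fails loudly on typos or wrong types
general = GeneralConfig(type_map=["C", "O", "H"], mass_map=[12, 16, 1], max_iters=5)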

I believe a better configuration design will lead to a better software design.
I am posting my thoughts for the community to review, and I would appreciate any feedback.
