Docker SLURM Cluster

This repository is a fork of Data Driven HPC, which provides a set of containers for running a SLURM HPC cluster with Docker. The project consists of three components:

  1. docker-slurmctld provides a SLURM controller or "head node".

  2. docker-slurmd provides a SLURM compute node.

  3. docker-slurmbase is the base container from which both docker-slurmctld and docker-slurmd inherit.

This fork adds influxdb-plugin, which gathers accounting data about tasks and stores it in an external database (InfluxDB).

This repository contains the container source files. Pre-built container images are available on Docker Hub: https://hub.docker.com/r/gromr1.

The Docker SLURM cluster is configured with the following software packages:

  • Ubuntu 16.04 LTS
  • SLURM 16.05.9
  • GlusterFS 3.8
  • Open MPI 1.10.2

A user ddhpc is configured across all nodes for MPI job execution, and a shared GlusterFS volume ddhpc is mounted on all nodes at /data/ddhpc. The head node runs an SSH server for accessing the cluster.

Launch a New SLURM cluster

Create a new directory with a docker-compose.yml file:

version: '2'

services:
  slurmctld:
    container_name: slurmctld
    environment:
      SLURM_CLUSTER_NAME: ddhpc
      SLURM_CONTROL_MACHINE: slurmctld
      SLURM_NODE_NAMES: slurmd
      INFLUXDB_HOST: influxdb
      INFLUXDB_DATABASE_NAME: docker_slurm
    tty: true
    hostname: slurmctld
    networks:
      default:
        aliases:
          - slurmctld
    image: gromr1/slurmctld
    stdin_open: true
  slurmd:
    container_name: slurmd
    environment:
      SLURM_CONTROL_MACHINE: slurmctld
      SLURM_CLUSTER_NAME: ddhpc
      SLURM_NODE_NAMES: slurmd
      INFLUXDB_HOST: influxdb
      INFLUXDB_DATABASE_NAME: docker_slurm
    tty: true
    hostname: slurmd
    networks:
      default:
        aliases:
          - slurmd
    image: gromr1/slurmd
    depends_on:
      - slurmctld
    stdin_open: true

After that, you can create and start the configured containers with the command docker-compose up -d.

To stop them, run docker-compose down.
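The start and stop commands above can also be driven from a script. The following is a minimal sketch, not part of the original project: the helper names (compose_cmd, sinfo_cmd, run) are hypothetical, and it assumes that docker-compose and docker are on the PATH and that the sinfo command is available inside the slurmctld container.

```python
import subprocess


def compose_cmd(action):
    """Build the docker-compose command line for 'up' or 'down'."""
    if action == "up":
        # -d starts the containers in the background, as described above.
        return ["docker-compose", "up", "-d"]
    if action == "down":
        return ["docker-compose", "down"]
    raise ValueError("unsupported action: " + action)


def sinfo_cmd(container="slurmctld"):
    """Build a command that runs sinfo inside the controller container.

    Assumption: sinfo is on the PATH inside the container.
    """
    return ["docker", "exec", container, "sinfo"]


def run(cmd, cwd="."):
    """Run cmd in the directory that holds docker-compose.yml; return stdout."""
    result = subprocess.run(
        cmd,
        cwd=cwd,
        check=True,  # raise CalledProcessError on a non-zero exit code
        stdout=subprocess.PIPE,
        universal_newlines=True,
    )
    return result.stdout
```

With these helpers, run(compose_cmd("up")) followed by print(run(sinfo_cmd())) would start the cluster and show the partitions and nodes known to the controller, and run(compose_cmd("down")) would stop it again.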

Configuration variables:

  • SLURM_CLUSTER_NAME: the name of the SLURM cluster.
  • SLURM_CONTROL_MACHINE: the host name of the controller container. This should match hostname in the slurmctld section.
  • SLURM_NODE_NAMES: the host name of the compute node container. This should match hostname in the slurmd section.
  • INFLUXDB_HOST: the host name of the database host.
  • INFLUXDB_DATABASE_NAME: the name of an existing database on the InfluxDB host. The database should have a retention policy named 'default'.
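
The two "should match" rules above are easy to break when renaming a service, so a small check can help. The following is a minimal sketch under stated assumptions: check_compose is a hypothetical helper, and COMPOSE is a hand-written, trimmed copy of the compose file as a Python mapping. To check a real file, the mapping could be loaded with a YAML parser such as PyYAML (yaml.safe_load), which is left out here to keep the sketch dependency-free.

```python
def check_compose(compose):
    """Return a list of mismatches between hostnames and SLURM_* variables."""
    problems = []
    services = compose.get("services", {})
    ctld_host = services.get("slurmctld", {}).get("hostname")
    slurmd_host = services.get("slurmd", {}).get("hostname")

    for name in ("slurmctld", "slurmd"):
        env = services.get(name, {}).get("environment", {})
        # SLURM_CONTROL_MACHINE should match hostname in the slurmctld section.
        if env.get("SLURM_CONTROL_MACHINE") != ctld_host:
            problems.append(name + ": SLURM_CONTROL_MACHINE != slurmctld hostname")
        # SLURM_NODE_NAMES should match hostname in the slurmd section.
        if env.get("SLURM_NODE_NAMES") != slurmd_host:
            problems.append(name + ": SLURM_NODE_NAMES != slurmd hostname")
    return problems


# Trimmed, hand-written copy of the compose file (illustrative only).
COMPOSE = {
    "services": {
        "slurmctld": {
            "hostname": "slurmctld",
            "environment": {
                "SLURM_CONTROL_MACHINE": "slurmctld",
                "SLURM_NODE_NAMES": "slurmd",
            },
        },
        "slurmd": {
            "hostname": "slurmd",
            "environment": {
                "SLURM_CONTROL_MACHINE": "slurmctld",
                "SLURM_NODE_NAMES": "slurmd",
            },
        },
    }
}
```

Calling check_compose(COMPOSE) returns an empty list for the configuration shown in this README; changing one of the hostname values without updating the matching variable yields one message per affected service.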
