nomad-driver-singularity's Introduction

nomad-driver-singularity

Hashicorp Nomad driver plugin using Singularity containers to execute tasks.

Requirements

Building The Driver

Clone the repository to your preferred path

git clone git@github.com:sylabs/nomad-driver-singularity

Enter the repository directory and build the driver

cd nomad-driver-singularity
make dep
make build

Developing the Driver

If you wish to contribute to the project, you'll first need Go installed on your machine, and you'll also need Singularity installed.

To compile the driver, run make build. This builds the driver and puts the task driver binary in the Nomad plugin directory, which by default is <nomad-data-dir>/plugins/.

Check Nomad -data-dir and -plugin-dir flags for more information.

make build
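
As a rough sketch of how the plugin directory relates to the data directory, the snippet below derives the default <nomad-data-dir>/plugins path and shows, in a comment, how a development agent could be pointed at it. The plugin_dir_for helper and the /var/lib/nomad default are assumptions made for illustration; they are not part of this repository.

```shell
#!/bin/sh
# Hypothetical helper: derive the default plugin directory from a Nomad data dir.
plugin_dir_for() {
  # Strip one trailing slash, then append the default "plugins" subdirectory.
  printf '%s/plugins\n' "${1%/}"
}

# Assumed data directory; substitute the value you pass to -data-dir.
NOMAD_DATA_DIR="${NOMAD_DATA_DIR:-/var/lib/nomad}"
PLUGIN_DIR="$(plugin_dir_for "$NOMAD_DATA_DIR")"
echo "plugin dir: $PLUGIN_DIR"

# To start a development agent that loads plugins from that directory
# (left as a comment so the snippet runs without Nomad installed):
#   nomad agent -dev -plugin-dir="$PLUGIN_DIR"
```

If the driver binary ends up somewhere else, pass that directory to -plugin-dir instead.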

To test the driver, run make test.

make test

nomad-driver-singularity's People

Contributors

arangogutierrez, dtrudg

nomad-driver-singularity's Issues

[Error] Could not get exit code for failed program

@bilke reported an error on Slack:

Apr 24 15:40:15 singularity1 nomad[789]:     2019-04-24T15:40:15.196+0200 [INFO ] client.alloc_runner.task_runner.task_hook.logmon.nomad: opening fifo: alloc_id=f9bd314e-57d3-56bb-bbb8-8eb9ff6f87e5 task="moo moo" @module=logmon path="/var/nomad/alloc/f9bd314e-57d3-56bb-bbb8-8eb9ff6f87e5/alloc/logs/.moo moo.stdout.fifo" timestamp=2019-04-24T15:40:15.195+0200
Apr 24 15:40:15 singularity1 nomad[789]:     2019-04-24T15:40:15.196+0200 [INFO ] client.alloc_runner.task_runner.task_hook.logmon.nomad: opening fifo: alloc_id=f9bd314e-57d3-56bb-bbb8-8eb9ff6f87e5 task="moo moo" @module=logmon path="/var/nomad/alloc/f9bd314e-57d3-56bb-bbb8-8eb9ff6f87e5/alloc/logs/.moo moo.stderr.fifo" timestamp=2019-04-24T15:40:15.196+0200
Apr 24 15:40:15 singularity1 nomad[789]:     2019-04-24T15:40:15.210+0200 [INFO ] client.driver_mgr.nomad-driver-singularity: starting singularity task: driver=singularity driver_cfg="{Image:library://sylabsed/examples/lolcow:latest Args:[] Command:run Debug:true Verbose:true Binds:[] Security:[] KeepPrivs:false DropCaps: Contain:false NoHome:false Home: Workdir: Pwd: App: Overlay:[]}" @module=singularity timestamp=2019-04-24T15:40:15.210+0200
Apr 24 15:40:15 singularity1 nomad[789]:     2019-04-24T15:40:15.211+0200 [ERROR] client.driver_mgr.nomad-driver-singularity: Could not get exit code for failed program: : driver=singularity @module=singularity singularity="[-d -v run library://sylabsed/examples/lolcow:latest]" timestamp=2019-04-24T15:40:15.210+0200
Apr 24 15:40:15 singularity1 nomad[789]:     2019-04-24T15:40:15.215+0200 [INFO ] client.alloc_runner.task_runner: failed to start task because plugin shutdown unexpectedly; attempting to recover: alloc_id=f9bd314e-57d3-56bb-bbb8-8eb9ff6f87e5 task="moo moo"
Apr 24 15:40:15 singularity1 nomad[789]:     2019-04-24T15:40:15.215+0200 [WARN ] client.driver_mgr: received fingerprint error from driver: driver=singularity error="plugin is shut down"
Apr 24 15:40:15 singularity1 nomad[789]:     2019-04-24T15:40:15.230+0200 [INFO ] client.driver_mgr.nomad-driver-singularity: starting singularity task: driver=singularity @module=singularity driver_cfg="{Image:library://sylabsed/examples/lolcow:latest Args:[] Command:run Debug:true Verbose:true Binds:[] Security:[] KeepPrivs:false DropCaps: Contain:false NoHome:false Home: Workdir: Pwd: App: Overlay:[]}" timestamp=2019-04-24T15:40:15.229+0200
Apr 24 15:40:15 singularity1 nomad[789]:     2019-04-24T15:40:15.230+0200 [ERROR] client.driver_mgr.nomad-driver-singularity: Could not get exit code for failed program: : driver=singularity @module=singularity singularity="[-d -v run library://sylabsed/examples/lolcow:latest]" timestamp=2019-04-24T15:40:15.230+0200
Apr 24 15:40:15 singularity1 nomad[789]:     2019-04-24T15:40:15.234+0200 [ERROR] client.alloc_runner.task_runner: running driver failed: alloc_id=f9bd314e-57d3-56bb-bbb8-8eb9ff6f87e5 task="moo moo" error="failed to start task after driver exited unexpectedly: plugin is shut down"
Apr 24 15:40:15 singularity1 nomad[789]:     2019-04-24T15:40:15.234+0200 [WARN ] client.driver_mgr: received fingerprint error from driver: driver=singularity error="plugin is shut down"
Apr 24 15:40:15 singularity1 nomad[789]:     2019-04-24T15:40:15.237+0200 [INFO ] client.alloc_runner.task_runner: not restarting task: alloc_id=f9bd314e-57d3-56bb-bbb8-8eb9ff6f87e5 task="moo moo" reason="Error was unrecoverable"
Apr 24 15:40:15 singularity1 nomad[789]:     2019-04-24T15:40:15.243+0200 [INFO ] client.gc: marking allocation for GC: alloc_id=f9bd314e-57d3-56bb-bbb8-8eb9ff6f87e5
Apr 24 15:40:23 singularity1 nomad[789]:     2019-04-24T15:40:23.017+0200 [WARN ] client.host_stats: error fetching host disk usage stats: error="no such file or directory" partition=/var/nomad/alloc/9bd114e0-e5e3-d571-1dd9-f075e77d7b7e/moo\040moo/alloc
Apr 24 15:40:24 singularity1 nomad[789]:     2019-04-24T15:40:24.700+0200 [INFO ] client: node registration complete
Apr 24 15:40:27 singularity1 nomad[789]:     2019-04-24T15:40:27.091+0200 [INFO ] client.alloc_runner.task_runner.task_hook.logmon.nomad: opening fifo: alloc_id=9bd114e0-e5e3-d571-1dd9-f075e77d7b7e task="moo moo" @module=logmon path="/var/nomad/alloc/9bd114e0-e5e3-d571-1dd9-f075e77d7b7e/alloc/logs/.moo moo.stdout.fifo" timestamp=2019-04-24T15:40:27.091+0200
Apr 24 15:40:27 singularity1 nomad[789]:     2019-04-24T15:40:27.091+0200 [INFO ] client.alloc_runner.task_runner.task_hook.logmon.nomad: opening fifo: alloc_id=9bd114e0-e5e3-d571-1dd9-f075e77d7b7e task="moo moo" path="/var/nomad/alloc/9bd114e0-e5e3-d571-1dd9-f075e77d7b7e/alloc/logs/.moo moo.stderr.fifo" @module=logmon timestamp=2019-04-24T15:40:27.091+0200
Apr 24 15:40:27 singularity1 nomad[789]:     2019-04-24T15:40:27.112+0200 [INFO ] client.driver_mgr.nomad-driver-singularity: starting singularity task: driver=singularity driver_cfg="{Image:library://sylabsed/examples/lolcow:latest Args:[] Command:run Debug:true Verbose:true Binds:[] Security:[] KeepPrivs:false DropCaps: Contain:false NoHome:false Home: Workdir: Pwd: App: Overlay:[]}" @module=singularity timestamp=2019-04-24T15:40:27.112+0200
Apr 24 15:40:27 singularity1 nomad[789]:     2019-04-24T15:40:27.113+0200 [ERROR] client.driver_mgr.nomad-driver-singularity: Could not get exit code for failed program: : driver=singularity @module=singularity singularity="[-d -v run library://sylabsed/examples/lolcow:latest]" timestamp=2019-04-24T15:40:27.112+0200
Apr 24 15:40:27 singularity1 nomad[789]:     2019-04-24T15:40:27.116+0200 [INFO ] client.alloc_runner.task_runner: failed to start task because plugin shutdown unexpectedly; attempting to recover: alloc_id=9bd114e0-e5e3-d571-1dd9-f075e77d7b7e task="moo moo"
Apr 24 15:40:27 singularity1 nomad[789]:     2019-04-24T15:40:27.117+0200 [WARN ] client.driver_mgr: received fingerprint error from driver: driver=singularity error="plugin is shut down"
Apr 24 15:40:27 singularity1 nomad[789]:     2019-04-24T15:40:27.133+0200 [INFO ] client.driver_mgr.nomad-driver-singularity: starting singularity task: driver=singularity @module=singularity driver_cfg="{Image:library://sylabsed/examples/lolcow:latest Args:[] Command:run Debug:true Verbose:true Binds:[] Security:[] KeepPrivs:false DropCaps: Contain:false NoHome:false Home: Workdir: Pwd: App: Overlay:[]}" timestamp=2019-04-24T15:40:27.133+0200
Apr 24 15:40:27 singularity1 nomad[789]:     2019-04-24T15:40:27.133+0200 [ERROR] client.driver_mgr.nomad-driver-singularity: Could not get exit code for failed program: : driver=singularity @module=singularity singularity="[-d -v run library://sylabsed/examples/lolcow:latest]" timestamp=2019-04-24T15:40:27.133+0200
Apr 24 15:40:27 singularity1 nomad[789]:     2019-04-24T15:40:27.137+0200 [WARN ] client.driver_mgr: received fingerprint error from driver: driver=singularity error="plugin is shut down"
Apr 24 15:40:27 singularity1 nomad[789]:     2019-04-24T15:40:27.137+0200 [ERROR] client.alloc_runner.task_runner: running driver failed: alloc_id=9bd114e0-e5e3-d571-1dd9-f075e77d7b7e task="moo moo" error="failed to start task after driver exited unexpectedly: plugin is shut down"
Apr 24 15:40:27 singularity1 nomad[789]:     2019-04-24T15:40:27.140+0200 [INFO ] client.alloc_runner.task_runner: not restarting task: alloc_id=9bd114e0-e5e3-d571-1dd9-f075e77d7b7e task="moo moo" reason="Error was unrecoverable"
Apr 24 15:40:27 singularity1 nomad[789]:     2019-04-24T15:40:27.146+0200 [INFO ] client.gc: marking allocation for GC: alloc_id=9bd114e0-e5e3-d571-1dd9-f075e77d7b7e
Apr 24 15:40:27 singularity1 nomad[789]:     2019-04-24T15:40:27.150+0200 [ERROR] client.alloc_runner.task_runner.task_hook.logmon.nomad: reading plugin stderr: alloc_id=9bd114e0-e5e3-d571-1dd9-f075e77d7b7e task="moo moo" error="read |0: file already closed"
Apr 24 15:40:35 singularity1 nomad[789]:     2019-04-24T15:40:35.705+0200 [INFO ] client: node registration complete

documentation outdated on config options?

hi there,
in the hashicorp docs for this plugin i see an example mentioning a singularity_path setting, which i also see mentioned in #41 here.

now, when i try to set this i am met with an error:

No argument or block type is named "singularity_path"

to be fair, given the source code i don't see support for such a setting at present either, tho that makes me wonder what happened given it appeared to be documented before.

for context, i am running NixOS which changes a bunch of default paths, for which i would prefer to have such a setting to point to the binary (to prevent workarounds involving symlinks).

example.hcl: no license

example.hcl has "Copyright" and "All rights reserved" notices but no license, which suggests a non-free, non-distributable file.

This is probably unintentional so please consider adding the following license grant, like in all other files:

// This Source Code Form is subject to the terms of the Mozilla Public
// License, v. 2.0. If a copy of the MPL was not distributed with this
// file, You can obtain one at http://mozilla.org/MPL/2.0/.

Thanks.

needless "Building Task Directory" in "contain" mode

Even when contain is set to true, Nomad builds a chroot in the task directory, which would be expected for the "exec" driver but is redundant for Singularity in "contain" mode.

Avoiding the chroot in "contain" mode can dramatically speed up container start.

Inconsistent Use of Vendoring

This module includes a checked-in vendor/ directory, and portions of the Makefile appear to aim to help maintain it. For example, make dep runs go mod vendor, among other things.

On the other hand, the build target explicitly enables module mode via GO111MODULE=on, which, if I'm reading this post correctly, means the vendor directory is ignored (assuming Go >= 1.11). The test and cover targets don't explicitly set GO111MODULE, so their behaviour depends on whether the code is being compiled inside or outside of $GOPATH/src.
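
To make the difference concrete, here is a minimal sketch of choosing the dependency source explicitly rather than relying on GO111MODULE and the checkout location. The mod_flag helper is hypothetical and is not taken from the Makefile; -mod=vendor and -mod=readonly are standard go build flags.

```shell
#!/bin/sh
# Hypothetical helper (not part of the repository's Makefile): pick the -mod
# flag so that every target resolves dependencies the same way.
mod_flag() {
  if [ -d "${1:-.}/vendor" ]; then
    echo "-mod=vendor"    # resolve dependencies from the checked-in vendor/ tree
  else
    echo "-mod=readonly"  # resolve from the module cache without rewriting go.mod
  fi
}

# Print the build command instead of running it, so the choice is visible.
echo "GO111MODULE=on go build $(mod_flag .) ./..."
```

Passing the same flag to the build, test and cover targets would make them behave consistently whether or not vendor/ is kept.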

Is there a compelling reason that vendor/ is required? The Go maintainers appear to be signalling a move away from vendoring going forward (ref), and removing it would make the repo smaller and the Makefile simpler.

Improve CI Workflow

There are some recommendations about how best to pass around source code and dependencies on the CircleCI blog. Use checkout to retrieve source, and restore_cache to fetch cached Go modules.

The check_formatting and vet_source jobs can be folded into lint_source, since golangci-lint is capable of running these tools.

I strongly believe we should also be reporting code coverage (even if it's unflattering before #5 is actioned).

unable to run example.hcl: Driver Failure | plugin is shut down

Attempt to run verbatim example.hcl failed as follows:

Driver Failure | failed to start task after driver exited unexpectedly: plugin is shut down

Here is Nomad log:

2019-11-05T03:00:36.204+1100 [INFO]  client.alloc_runner.task_runner.task_hook.logmon.nomad: opening fifo: alloc_id=f1f36ca9-1864-dd55-777f-570e6adf5eb2 task=mooo @module=logmon path=/var/lib/nomad/alloc/f1f36ca9-1864-dd55-777f-570e6adf5eb2/alloc/logs/.mooo.stdout.fifo timestamp=2019-11-05T03:00:36.204+1100
2019-11-05T03:00:36.205+1100 [INFO]  client.alloc_runner.task_runner.task_hook.logmon.nomad: opening fifo: alloc_id=f1f36ca9-1864-dd55-777f-570e6adf5eb2 task=mooo @module=logmon path=/var/lib/nomad/alloc/f1f36ca9-1864-dd55-777f-570e6adf5eb2/alloc/logs/.mooo.stderr.fifo timestamp=2019-11-05T03:00:36.204+1100
2019-11-05T03:00:36.225+1100 [INFO]  client.driver_mgr.nomad-driver-singularity: starting singularity task: driver=singularity @module=singularity driver_cfg="{Image:library://sylabsed/examples/lolcow:latest Args:[] Command:run Debug:true Verbose:true Binds:[] Security:[] KeepPrivs:false DropCaps: Contain:false NoHome:false Home: Workdir: Pwd: App: Overlay:[]}" timestamp=2019-11-05T03:00:36.224+1100
2019-11-05T03:00:36.226+1100 [ERROR] client.driver_mgr.nomad-driver-singularity: Could not get exit code for failed program: : driver=singularity @module=singularity singularity=[-d, -v, run, library://sylabsed/examples/lolcow:latest] timestamp=2019-11-05T03:00:36.226+1100
2019-11-05T03:00:36.229+1100 [WARN]  client.driver_mgr: received fingerprint error from driver: driver=singularity error="plugin is shut down"
2019-11-05T03:00:36.230+1100 [INFO]  client.alloc_runner.task_runner: failed to start task because plugin shutdown unexpectedly; attempting to recover: alloc_id=f1f36ca9-1864-dd55-777f-570e6adf5eb2 task=mooo
2019-11-05T03:00:36.255+1100 [INFO]  client.driver_mgr.nomad-driver-singularity: starting singularity task: driver=singularity @module=singularity driver_cfg="{Image:library://sylabsed/examples/lolcow:latest Args:[] Command:run Debug:true Verbose:true Binds:[] Security:[] KeepPrivs:false DropCaps: Contain:false NoHome:false Home: Workdir: Pwd: App: Overlay:[]}" timestamp=2019-11-05T03:00:36.254+1100
2019-11-05T03:00:36.255+1100 [ERROR] client.driver_mgr.nomad-driver-singularity: Could not get exit code for failed program: : driver=singularity @module=singularity singularity=[-d, -v, run, library://sylabsed/examples/lolcow:latest] timestamp=2019-11-05T03:00:36.255+1100
2019-11-05T03:00:36.262+1100 [WARN]  client.driver_mgr: received fingerprint error from driver: driver=singularity error="plugin is shut down"
2019-11-05T03:00:36.263+1100 [ERROR] client.alloc_runner.task_runner: running driver failed: alloc_id=f1f36ca9-1864-dd55-777f-570e6adf5eb2 task=mooo error="failed to start task after driver exited unexpectedly: plugin is shut down"
2019-11-05T03:00:36.263+1100 [INFO]  client.alloc_runner.task_runner: not restarting task: alloc_id=f1f36ca9-1864-dd55-777f-570e6adf5eb2 task=mooo reason="Error was unrecoverable"
2019-11-05T03:00:36.271+1100 [INFO]  client.gc: marking allocation for GC: alloc_id=f1f36ca9-1864-dd55-777f-570e6adf5eb2

Alloc directory appears to have prepared chroot (Nomad was "Building Task Directory" as with driver = "exec")...

Nomad 0.10.0 (current release), Singularity 3.4.2 (current release), driver-singularity built from HEAD of master.

Direct invocation of singularity run -C --net library://sylabsed/examples/lolcow:latest works as expected.

apptainer

Hi, does the driver work with Apptainer? It is the new version of Singularity, right?
