This repository defines a utility for running workers.
It handles:
- Getting Taskcluster credentials
- Interacting with the worker manager
- Gathering configuration from various sources
- Polling for interruptions of cloud instances (e.g., spot termination)
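For the interruption polling, an EC2 spot termination notice is a useful concrete case. The sketch below parses the JSON served by AWS's documented `spot/instance-action` metadata endpoint; the type and helper names are illustrative, not worker-runner's actual code.

```go
// Sketch of spot-interruption detection. When an interruption is
// scheduled, AWS serves a JSON notice at
// http://169.254.169.254/latest/meta-data/spot/instance-action
// (the endpoint returns 404 when no interruption is pending).
package main

import (
	"encoding/json"
	"fmt"
)

// instanceAction mirrors the documented instance-action JSON.
type instanceAction struct {
	Action string `json:"action"` // "stop" or "terminate"
	Time   string `json:"time"`   // when the interruption will occur
}

func parseInstanceAction(body []byte) (*instanceAction, error) {
	var ia instanceAction
	if err := json.Unmarshal(body, &ia); err != nil {
		return nil, err
	}
	return &ia, nil
}

func main() {
	sample := []byte(`{"action": "terminate", "time": "2024-01-01T00:00:00Z"}`)
	ia, err := parseInstanceAction(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println(ia.Action, ia.Time)
}
```

A runner would poll this endpoint periodically and, on a notice, inform the worker of the impending shutdown.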
In operation, this tool performs the following steps to determine the parameters for a run of a worker:
- Read the runner configuration (`<runnerConfig>`).
- Load the given provider and ask it to add settings to the run. This step provides:
  - Taskcluster credentials for the worker,
  - worker identification information (worker pool, worker ID, etc.),
  - the location of the worker, and
  - worker configuration.
- Using the Taskcluster credentials, load configuration from the secrets service.
- Load support for the given worker implementation and ask it to add settings to the run.
With all of this complete, the run parameters are fully determined:
- Taskcluster contact information (rootURL, credentials)
- Worker identity
- Metadata from the provider (useful for user debugging)
- Configuration for the worker (see below for details)
The final step, then, is to start the worker with the derived configuration. The worker is run as a subprocess, with a simple, text-based protocol between start-worker and the worker itself. The protocol is defined in protocol.md. This protocol is used to communicate events such as an impending shutdown.
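As a rough illustration of such a line-oriented protocol (protocol.md is the authoritative definition; the `~`-prefixed JSON framing and the message name shown here are assumptions for the sketch):

```go
// Illustrative sketch of parsing protocol messages out of a worker's
// output stream: protocol lines carry a `~`-prefixed JSON object, and
// anything else is ordinary log output to pass through unchanged.
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// message is a generic protocol message: a type (e.g. an
// impending-shutdown notification) plus type-specific properties.
type message struct {
	Type       string                 `json:"type"`
	Properties map[string]interface{} `json:"properties,omitempty"`
}

// parseLine returns (msg, true) for a protocol message, or
// (nil, false) for an ordinary output line.
func parseLine(line string) (*message, bool) {
	if !strings.HasPrefix(line, "~") {
		return nil, false
	}
	var m message
	if err := json.Unmarshal([]byte(line[1:]), &m); err != nil {
		return nil, false
	}
	return &m, true
}

func main() {
	m, ok := parseLine(`~{"type": "graceful-termination"}`)
	fmt.Println(ok, m.Type)
}
```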
Worker configuration comes from a number of sources; in order from lowest to highest precedence, these are:
- The worker runner config file
- The configuration defined by the provider, if any
- Configuration stored in the secrets service
Providers can supply configuration to the worker via whatever means makes sense. For example, an EC2 or GCP provider would read configuration from the instance's userData.
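The precedence rules above amount to an overlay merge, applied lowest-precedence source first. A minimal sketch (the helper names are hypothetical, not worker-runner's actual implementation):

```go
// Overlay merge of worker configuration sources: later (higher
// precedence) sources overwrite earlier ones; nested objects are
// merged recursively rather than replaced wholesale.
package main

import "fmt"

type config = map[string]interface{}

// merge overlays src onto dst; src wins on conflict.
func merge(dst, src config) config {
	out := config{}
	for k, v := range dst {
		out[k] = v
	}
	for k, v := range src {
		if sv, ok := v.(config); ok {
			if dv, ok := out[k].(config); ok {
				out[k] = merge(dv, sv)
				continue
			}
		}
		out[k] = v
	}
	return out
}

func main() {
	runnerFile := config{"shutdownMachineOnIdle": false, "logLevel": "info"}
	provider := config{"shutdownMachineOnIdle": true}
	secrets := config{"logLevel": "debug"}
	// lowest to highest precedence: runner config file, provider, secrets
	final := merge(merge(runnerFile, provider), secrets)
	fmt.Println(final["shutdownMachineOnIdle"], final["logLevel"]) // true debug
}
```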
The runner configuration file is described in more detail in the "Usage" section below.
Its `workerConfig` property can contain arbitrary worker configuration values.
For example:
provider: ..
worker: ..
workerConfig:
shutdownMachineOnIdle: true
Note that the deeply-nested format described in the next section is not available in the runner config file.
Providers that interact with the worker-manager service can get configuration from that service. That configuration formally has the form:
<workerImplementation>:
config:
workerConfigValue: ...
files:
- ...
All fields are optional. The `<workerImplementation>` placeholder is replaced with the worker implementation name, in camel case (`genericWorker`, `dockerWorker`, etc.).
The contents of `<workerImplementation>.config` are merged into the worker configuration.
Files are handled as described below.
For backward compatibility, configuration may be specified as a simple object with configuration properties at the top level. Support for this form will be removed in future versions.
Putting all of this together, a worker pool definition for a generic-worker instance might contain:
launchConfigs:
- ...
workerConfig:
genericWorker:
config:
shutdownMachineOnInternalError: true
Secrets are stored in the secrets service under a secret named `worker-pool:<workerPoolId>`, in the format:
config:
workerConfigValue: ...
files:
- ...
Here `config` is an object that is merged directly into the worker config.
Two backward-compatibility measures exist:
- A secret named `worker-type:<workerPoolId>` is also consulted, as used before RFC#145 landed.
- If a secret does not have properties `config` and `files`, then its top-level contents are assumed to be worker configuration, with no files.
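The second measure can be sketched as a small normalization step (the type and function names are illustrative):

```go
// Normalize a fetched secret: a new-style secret has `config` and/or
// `files` properties; an old-style secret's top-level contents are
// the worker configuration itself, with no files.
package main

import "fmt"

type secret struct {
	Config map[string]interface{}
	Files  []interface{}
}

func normalizeSecret(raw map[string]interface{}) secret {
	_, hasConfig := raw["config"]
	_, hasFiles := raw["files"]
	if !hasConfig && !hasFiles {
		// old-style secret: top level is the worker config
		return secret{Config: raw}
	}
	s := secret{}
	if c, ok := raw["config"].(map[string]interface{}); ok {
		s.Config = c
	}
	if f, ok := raw["files"].([]interface{}); ok {
		s.Files = f
	}
	return s
}

func main() {
	oldStyle := map[string]interface{}{"workerConfigValue": "x"}
	fmt.Println(normalizeSecret(oldStyle).Config["workerConfigValue"])
}
```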
Files can also be stored in the secrets service and in provider configuration, under the `files` properties described above.
These can be used to write (small) files to disk on the worker before it starts up.
For example:
files:
- content: U....x8j==
description: Secret Data!
encoding: base64
format: zip
path: 'C:\secrets'
This would unzip the zipfile represented by `content` at `C:\secrets`.
The only encoding supported is `base64`.
The formats supported are:
- `file` -- the content is decoded and written to the file named by `path`
- `zip` -- the content is treated as a ZIP archive and extracted at the directory named by `path`
This binary is configured to run at instance startup, with a configuration file as its argument. It logs to stdout.
start-worker <runnerConfig>
Configuration for taskcluster-worker-runner is in the form of a YAML file with the following fields:
- `provider`: (required) information about the provider for this worker
  - `providerType`: (required) the worker-manager providerType responsible for this worker; this generally indicates the cloud the worker is running in, or 'static' for a non-cloud-based worker; see below.
- `worker`: (required) information about the worker being run
  - `implementation`: (required) the name of the worker implementation; see below.
- `workerConfig`: arbitrary data which forms the basis of the config passed to the worker; this will be merged with several other sources of configuration. Note that the nested `<workerImplementation>.config` structure is not allowed here.
- `getSecrets`: if true (the default), then configuration is fetched from the secrets service and merged with the worker configuration. This option is generally only used in testing.
- `cacheOverRestarts`: if set to a filename, then the runner state is written to this JSON file at startup. On subsequent startups, if the file exists, it is loaded and the worker is started directly, without consulting worker-manager or any other external resources. This is useful for worker implementations that restart the system as part of their normal operation and expect to start up with the same config after a restart.
NOTE for Windows users: the configuration file must be a UNIX-style text file. DOS-style newlines and encodings other than utf-8 are not supported.
Provider configuration depends on the providerType:
The providerType "aws" is intended for workers provisioned with worker-manager providers using providerType "aws". It requires
provider:
providerType: aws
The $TASKCLUSTER_WORKER_LOCATION defined by this provider has the following fields:
- cloud: aws
- region
- availabilityZone
The providerType "azure" is intended for workers provisioned with worker-manager providers using providerType "azure". It requires
provider:
providerType: azure
The $TASKCLUSTER_WORKER_LOCATION defined by this provider has the following fields:
- cloud: azure
- region
The providerType "google" is intended for workers provisioned with worker-manager providers using providerType "google". It requires
provider:
providerType: google
The $TASKCLUSTER_WORKER_LOCATION defined by this provider has the following fields:
- cloud: google
- region
- zone
The providerType "standalone" is intended for workers that have all of their configuration pre-loaded. Such workers do not interact with the worker manager. This is not a recommended configuration; prefer the static provider.
It requires the following properties be included explicitly in the runner configuration:
provider:
providerType: standalone
rootURL: .. # note the Golang spelling with capitalized "URL"
clientID: .. # ..and similarly capitalized ID
accessToken: ..
workerPoolID: ..
workerGroup: ..
workerID: ..
# (optional) custom provider-metadata entries to be passed to worker
providerMetadata: {prop: val, ..}
# (optional) custom properties for TASKCLUSTER_WORKER_LOCATION
# (values must be strings)
workerLocation: {prop: val, ..}
The $TASKCLUSTER_WORKER_LOCATION defined by this provider has the following fields:
- cloud: standalone
as well as any worker location values from the configuration.
The providerType "static" is intended for workers provisioned with worker-manager providers using providerType "static". It requires
provider:
providerType: static
rootURL: .. # note the Golang spelling with capitalized "URL"
providerID: .. # ..and similarly capitalized ID
workerPoolID: ...
workerGroup: ...
workerID: ...
staticSecret: ... # shared secret configured for this worker in worker-manager
# (optional) custom provider-metadata entries to be passed to worker
providerMetadata: {prop: val, ..}
# (optional) custom properties for TASKCLUSTER_WORKER_LOCATION
# (values must be strings)
workerLocation: {prop: val, ..}
The $TASKCLUSTER_WORKER_LOCATION defined by this provider has the following fields:
- cloud: static
as well as any worker location values from the configuration.
The following worker implementations are supported:
The "docker-worker" worker implementation starts docker-worker (https://github.com/taskcluster/docker-worker). It takes the following values in the 'worker' section of the runner configuration:
worker:
implementation: docker-worker
# path to the root of the docker-worker repo clone
path: /path/to/docker-worker/repo
# path where taskcluster-worker-runner should write the generated
# docker-worker configuration.
configPath: ..
The "dummy" worker implementation does nothing but dump the state instead of "starting" anything. It is intended for debugging.
worker:
implementation: dummy
The "generic-worker" worker implementation starts generic-worker (https://github.com/taskcluster/generic-worker). It takes the following values in the 'worker' section of the runner configuration:
worker:
implementation: generic-worker
# path to the root of the generic-worker executable
# can also be a wrapper script to which args will be passed
path: /usr/local/bin/generic-worker
# (Windows only) service name to start
service: "Generic Worker"
# (Windows only) named pipe (\\.\pipe\<something>) with which generic-worker
# will communicate with worker-runner; default value is as shown here:
protocolPipe: \\.\pipe\generic-worker
# path where taskcluster-worker-runner should write the generated
# generic-worker configuration.
configPath: /etc/taskcluster/generic-worker/config.yaml
Specify either 'path' to run the executable directly, or 'service' to name a Windows service that will run the worker. In the latter case, the configPath must match the path configured within the service definition. See windows-services for details. Note that running as a service requires at least generic-worker v16.6.0.
See deployment for advice on deploying worker-runner itself.
This application requires go1.12.
Test with `go test ./...`.
To make a new release, run `./release.sh <version>`.
Examine the resulting commit and tag for completeness, then push to the upstream repository.