cirruslabs / cirrus-cli


CLI for executing Cirrus tasks locally and in any CI

License: GNU Affero General Public License v3.0

Go 99.89% Dockerfile 0.05% Starlark 0.01% Shell 0.04%

ci-cd containers devops golang

cirrus-cli's Issues

Validate .cirrus.star when running cirrus validate

Currently only YAML configuration is validated.

Logic from cirrus run can probably be re-used for this one.

CLI should respect custom agent version

The version can be set via the CIRRUS_AGENT_VERSION environment variable.

Support Docker Builder Instances

Should be pretty similar to persistent worker tasks. Instead of running a Docker container, just shell out and execute the agent locally.

Starlark templating guide

We need a guide on writing and testing Starlark templates.

Support GitHub Action Environment

CLI can be smart about the environment in which it is being executed.

GitHub Actions have an HTTP cache. See https://github.com/actions/cache
Output can have special instructions to group things. Can implement a custom Echelon renderer for Actions.

Cirrus Template Testing

Now that configurations can be created in Starlark, we need to think about how to test them.

A simple template repository looks like:

lib.star

One thing we can do is to have a tests folder with test projects and check that evaluating the template yields the expected tasks.

tests/foo-project/.cirrus.env # optional file with environment variables for evaluation
tests/foo-project/.cirrus.star # will contain load("../../lib.star", "foo") and main for testing
tests/foo-project/.cirrus.expected.yml # expected result to diff against
tests/foo-project/package.json # some other files for testing `fs` for example
lib.star

Then we can have a separate internal subcommand like cirrus internal test tests/* that will take a list of folders, run the evaluation for each, and check the result against the expected YML.
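A rough Go sketch of what the comparison step could look like. Evaluate is a hypothetical stand-in for whatever function ends up evaluating .cirrus.star (it is not an existing CLI API), and the check is a plain line-by-line comparison that ignores trailing whitespace:

```go
package main

import (
	"fmt"
	"strings"
)

// Evaluate is a hypothetical stand-in for the Starlark evaluator: it takes the
// template source and environment variables and returns the generated YAML.
type Evaluate func(source string, env map[string]string) (string, error)

// normalize trims trailing whitespace on every line and drops trailing blank
// lines, so purely cosmetic differences do not fail a test.
func normalize(yaml string) []string {
	lines := strings.Split(strings.ReplaceAll(yaml, "\r\n", "\n"), "\n")
	for i, line := range lines {
		lines[i] = strings.TrimRight(line, " \t")
	}
	for len(lines) > 0 && lines[len(lines)-1] == "" {
		lines = lines[:len(lines)-1]
	}
	return lines
}

// checkTemplate evaluates source and reports the first line where the result
// differs from the expected YAML, or nil when they match.
func checkTemplate(eval Evaluate, source string, env map[string]string, expected string) error {
	actual, err := eval(source, env)
	if err != nil {
		return fmt.Errorf("evaluation failed: %w", err)
	}
	want, got := normalize(expected), normalize(actual)
	for i := 0; i < len(want) || i < len(got); i++ {
		var w, g string
		if i < len(want) {
			w = want[i]
		}
		if i < len(got) {
			g = got[i]
		}
		if w != g {
			return fmt.Errorf("line %d: expected %q, got %q", i+1, w, g)
		}
	}
	return nil
}

func main() {
	// A fake evaluator that ignores its input; a real one would run Starlark.
	fake := func(string, map[string]string) (string, error) {
		return "task:\n  name: foo  \n", nil
	}
	fmt.Println(checkTemplate(fake, "", nil, "task:\n  name: foo\n")) // matches
	fmt.Println(checkTemplate(fake, "", nil, "task:\n  name: bar\n")) // differs
}
```

The subcommand would then only need to read .cirrus.env, .cirrus.star and .cirrus.expected.yml from each folder and feed them to a function like this.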

Complete the parser

#58 introduces a YAML parser that's able to run most of the CLI tasks without network connection.

However, this results in at least two parsers in-the-wild (the other being the one that processes .cirrus.yml in the cloud). To avoid the divergence in the future, it might be a good idea to consolidate everything as a CLI package, which is available for use in any Go program.

To do this, the following things need to be implemented:

support all execution environments
support JSON Schema
integrate already existing tests

Option for non-emoji output

In some cases my terminals aren't set up for emoji support, so it'd be nice to have a way (maybe an environment variable?) to disable emoji.

One way to do this is instead of ✅ and ❌ you could go with [OKAY] and [FAIL]. I'm not sure what would make the most sense to replace the clocks, but one possibility could be something like cycling through these:

[-   ]
[ -  ]
[  - ]
[   -]
[  - ]
[ -  ]

I've encountered this problem when using urxvt with certain fonts and when using the local console directly (as opposed to a terminal emulator in Xorg/Wayland) on Manjaro Linux.

(EDIT: Changed wording because the problem is specifically emoji.)
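A small Go sketch of the suggestion above, purely as an illustration: the function names are made up for this example, and how the plain-text mode gets switched on (flag, environment variable) is left open, so it is passed in as a boolean.

```go
package main

import "fmt"

// asciiSpinner holds the frames proposed above, in cycling order.
var asciiSpinner = []string{"[-   ]", "[ -  ]", "[  - ]", "[   -]", "[  - ]", "[ -  ]"}

// spinnerFrame returns the frame to show at the given tick, wrapping around
// (negative ticks wrap as well).
func spinnerFrame(tick int) string {
	n := len(asciiSpinner)
	return asciiSpinner[((tick%n)+n)%n]
}

// statusLabel returns the marker for a finished step; emoji selects between
// the current emoji output and the proposed plain-text one.
func statusLabel(succeeded, emoji bool) string {
	switch {
	case succeeded && emoji:
		return "✅"
	case succeeded:
		return "[OKAY]"
	case emoji:
		return "❌"
	default:
		return "[FAIL]"
	}
}

func main() {
	for tick := 0; tick < 8; tick++ {
		fmt.Println(spinnerFrame(tick), "'Lint' Task")
	}
	fmt.Println(statusLabel(true, false), "docker pull")
	fmt.Println(statusLabel(false, false), "'main' script")
}
```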

Add instructions how to run on Google Cloud Build

Ideally including the use of an HTTP cache backed by Google Cloud Storage.

--container-pull-policy instead of --container-no-pull

With the latest Docker rate limits it doesn't seem reasonable to always try to pull the container image, since such requests seem to count against individual rate limits even if the request ends up pulling nothing.

Let's instead have a pull policy similar to Kubernetes' imagePullPolicy with a default value of "IfNotPresent", which will check whether the image is present locally and only attempt to pull if it is missing.
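As a sketch of the decision logic in Go: the three values mirror Kubernetes' imagePullPolicy (Always, IfNotPresent, Never), but only IfNotPresent is named in this issue, so treating the other two as accepted values of the flag is an assumption, as are the function names.

```go
package main

import "fmt"

// PullPolicy mirrors the values of Kubernetes' imagePullPolicy.
type PullPolicy string

const (
	PullAlways       PullPolicy = "Always"
	PullIfNotPresent PullPolicy = "IfNotPresent"
	PullNever        PullPolicy = "Never"
)

// parsePullPolicy validates a flag value; an empty value selects the
// proposed default, IfNotPresent.
func parsePullPolicy(value string) (PullPolicy, error) {
	switch p := PullPolicy(value); p {
	case "":
		return PullIfNotPresent, nil
	case PullAlways, PullIfNotPresent, PullNever:
		return p, nil
	default:
		return "", fmt.Errorf("unknown pull policy %q", value)
	}
}

// shouldPull decides whether a pull should be attempted, given whether the
// image is already present locally.
func shouldPull(policy PullPolicy, presentLocally bool) bool {
	switch policy {
	case PullAlways:
		return true
	case PullNever:
		return false
	default: // IfNotPresent
		return !presentLocally
	}
}

func main() {
	policy, err := parsePullPolicy("") // flag not given, default applies
	if err != nil {
		panic(err)
	}
	fmt.Println(policy, shouldPull(policy, true), shouldPull(policy, false))
}
```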

Reuse already prebuilt images from GCR

Right now PrebuiltInstance always builds the image. Instead it should first check whether the image is already available locally and then whether it is available remotely. It should build the image only if it is available neither locally nor remotely.
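The lookup order could be sketched in Go like this. The two checks are passed in as functions (hypothetical names) so that the remote registry is only contacted when the local check comes back negative:

```go
package main

import "fmt"

// imageAction describes what to do to obtain a prebuilt image.
type imageAction string

const (
	useLocal   imageAction = "use local image"
	pullRemote imageAction = "pull from registry"
	buildImage imageAction = "build image"
)

// resolvePrebuilt checks locally first, then remotely, and falls back to
// building. existsRemotely is not called when the image exists locally.
func resolvePrebuilt(existsLocally, existsRemotely func() bool) imageAction {
	if existsLocally() {
		return useLocal
	}
	if existsRemotely() {
		return pullRemote
	}
	return buildImage
}

func main() {
	yes := func() bool { return true }
	no := func() bool { return false }
	fmt.Println(resolvePrebuilt(yes, no))
	fmt.Println(resolvePrebuilt(no, yes))
	fmt.Println(resolvePrebuilt(no, no))
}
```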

Docker Pull Progress

It would be great to show the progress of docker pull in the console so it's clear why things are not progressing.

TeamCity Foldable Log integration

CLI should detect if it's running in TeamCity the same way it detects GH Actions and Travis CI and use special instructions to fold output: https://teamcity-support.jetbrains.com/hc/en-us/community/posts/206879375-How-to-structure-output-for-build-log-so-it-is-foldable-expandable
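A minimal Go sketch of emitting fold markers. It uses TeamCity's blockOpened/blockClosed service messages (with "|" as the escape character) and GitHub Actions' ::group::/::endgroup:: commands; the ci identifiers and the function name are invented for this example, and environment detection is left out.

```go
package main

import (
	"fmt"
	"strings"
)

// teamCityEscaper applies TeamCity's service-message escaping, where "|" is
// the escape character.
var teamCityEscaper = strings.NewReplacer(
	"|", "||",
	"'", "|'",
	"[", "|[",
	"]", "|]",
	"\n", "|n",
	"\r", "|r",
)

// foldMarkers returns the lines that open and close a foldable section. The
// ci argument is a hypothetical identifier for the detected environment;
// unknown values yield empty markers (no folding).
func foldMarkers(ci, name string) (start, end string) {
	switch ci {
	case "teamcity":
		escaped := teamCityEscaper.Replace(name)
		return "##teamcity[blockOpened name='" + escaped + "']",
			"##teamcity[blockClosed name='" + escaped + "']"
	case "github-actions":
		return "::group::" + name, "::endgroup::"
	default:
		return "", ""
	}
}

func main() {
	start, end := foldMarkers("teamcity", "docker pull")
	fmt.Println(start)
	fmt.Println("...task output...")
	fmt.Println(end)
}
```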

Parse YAML config on CLI side

Right now the CLI uses the gRPC endpoint of Cirrus CI (https://grpc.cirrus-ci.com/) to parse the .cirrus.yml config file. The backend is written in Kotlin, which makes it difficult to reuse in Go. To make the CLI 100% independent from Cirrus CI and have the option to run it offline or in a network without public internet access, we should eventually implement parsing of the YAML config in Go on the CLI side.

This is not a priority at the moment, but let's put it on the table and keep it in mind.

Use rsync to speed up "clone"

Right now cp can take a lot of time. For example, it took almost 3 minutes to execute it for https://github.com/cirruslabs/cirrus-ci-web when I had node_modules locally.

One option would be to use rsync if the repository has a .gitignore available. It seems rsync can respect .gitignore, which should significantly speed up "clone" time.

This issue can be coupled with #13 to make sure rsync is available inside a container we control.
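For illustration, a Go sketch that picks the copy command. It relies on rsync's per-directory merge filter (--filter=':- .gitignore'), which treats each .gitignore as a list of exclude patterns; this only approximates Git's own matching rules. The function name and the cp fallback are assumptions for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// cloneCommand returns the command line for copying the project from src to
// dst. With a .gitignore present it uses rsync with a per-directory merge
// filter so ignored files (for example node_modules) are skipped; otherwise
// it falls back to a plain recursive copy.
func cloneCommand(src, dst string, hasGitignore bool) []string {
	// A trailing slash makes both tools copy the directory's contents.
	src = strings.TrimRight(src, "/") + "/"
	if hasGitignore {
		return []string{"rsync", "-a", "--filter=:- .gitignore", src, dst}
	}
	return []string{"cp", "-a", src + ".", dst}
}

func main() {
	fmt.Println(strings.Join(cloneCommand("/project", "/tmp/working-dir", true), " "))
	fmt.Println(strings.Join(cloneCommand("/project", "/tmp/working-dir", false), " "))
}
```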

Config was parsed correctly but no tasks were found!

AlexWayfer/flame-cli#57

Probably because of cirruslabs/cirrus-ci-docs#689

Override RPC endpoint for persistent worker mode

It would be great to support passing an http or https endpoint to the cirrus worker run command. This will be useful for integration tests and for Cirrus On-Prem in the future.

Option to run a task by id

It will be nice to be able to run a task by id locally like cirrus run -task-id 123. We'll need to implement a separate RPC method to get a configuration and environment variables. Can be handy in addition to #8

Support --version flag

cirrus --version should output the CLI's version for easier identification. Came across this issue while investigating installation via brew.

This StackOverflow answer can be useful: https://stackoverflow.com/a/47665780

Failed to create a volume from working directory

> cirrus run -v rubocop
gRPC server is listening at http://172.17.0.1:36311
running task rubocop (3)
Started 'rubocop'
Started 'Preparing execution environment...'
Preparing volume to work with...
Failed to create a volume from working directory: working volume creation failed: Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Post "http://%2Fvar%2Frun%2Fdocker.sock/v1.24/images/create?fromImage=gcr.io%2Fcirrus-ci-community%2Fcirrus-ci-agent&tag=v1.8.0": dial unix /var/run/docker.sock: connect: permission denied
'Preparing execution environment...' failed in 0.0s!
Error: working volume creation failed: Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Post "http://%2Fvar%2Frun%2Fdocker.sock/v1.24/images/create?fromImage=gcr.io%2Fcirrus-ci-community%2Fcirrus-ci-agent&tag=v1.8.0": dial unix /var/run/docker.sock: connect: permission denied
2020/09/10 00:29:03 working volume creation failed: Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Post "http://%2Fvar%2Frun%2Fdocker.sock/v1.24/images/create?fromImage=gcr.io%2Fcirrus-ci-community%2Fcirrus-ci-agent&tag=v1.8.0": dial unix /var/run/docker.sock: connect: permission denied

Repository: https://github.com/AlexWayfer/alexwayfer.name

Environment: Arch Linux. Regular users have no Docker permissions here, so sudo is required. When and how should I use sudo correctly?

Starlark Configuration Testing

Since Starlark is a real language we need to figure out a testing scenario for .cirrus.star file. Ideally there should be a way to provide environment variables and see what kind of YAML config Starlark will output.

Should it be a part of validate command? How to mock file system access?

Creating this issue to brainstorm possible options.

My main use case can be described like this: we'll have a github.com/cirrus-templates/node template that will generate either yarn-specific or npm-specific tasks depending on the files in the repository. How can we test this template?

`--version` option

Hello.

I think it'd be handy to have a -v, --version or version option/command to check exactly which cirrus-cli you're using.

Current output:

> cirrus help
Cirrus CLI

Usage:
  cirrus [command]

Available Commands:
  help        Help about any command
  run         Execute Cirrus CI tasks locally
  validate    Validate Cirrus CI configuration file

Flags:
  -h, --help   help for cirrus

Use "cirrus [command] --help" for more information about a command.

GitLab CI Support

Add support for running GitLab CI jobs.

Don’t fail the whole build if only one task has failed

Currently the logic in the executor's main loop is quite simple and looks like this:

cirrus-cli/internal/executor/executor.go

Lines 132 to 135 in 3b9cb68

	// Bail-out if the task has failed
	if task.Status() != taskstatus.Succeeded {
		return fmt.Errorf("%w: task %s %s", ErrBuildFailed, task.String(), task.Status().String())
	}

It would be nice to preserve the same behavior as when running tasks in the cloud.

It also seems that to do that one needs to calculate the dependencies in Build.GetNextTask() in a bit more complicated fashion than it currently gets away with.
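One possible shape for the more involved dependency calculation, sketched in Go. The types and names below are invented for the example and are not the executor's actual Build or Task types; the assumed semantics are that a failed task only blocks the tasks that depend on it, while the build as a whole is still reported as failed at the end.

```go
package main

import "fmt"

// Status is a simplified task status for this sketch.
type Status int

const (
	Pending Status = iota
	Succeeded
	Failed
	Skipped
)

// Task is a hypothetical, stripped-down task with named dependencies.
type Task struct {
	Name      string
	DependsOn []string
	Status    Status
}

// nextTask marks pending tasks as Skipped when a dependency failed, was
// skipped, or is unknown, then returns the first pending task whose
// dependencies have all succeeded. It returns nil when nothing can run.
func nextTask(tasks []*Task) *Task {
	byName := make(map[string]*Task, len(tasks))
	for _, t := range tasks {
		byName[t.Name] = t
	}
	// Propagate failures through the dependency graph until nothing changes.
	for changed := true; changed; {
		changed = false
		for _, t := range tasks {
			if t.Status != Pending {
				continue
			}
			for _, dep := range t.DependsOn {
				d, ok := byName[dep]
				if !ok || d.Status == Failed || d.Status == Skipped {
					t.Status = Skipped
					changed = true
					break
				}
			}
		}
	}
	for _, t := range tasks {
		if t.Status != Pending {
			continue
		}
		ready := true
		for _, dep := range t.DependsOn {
			if byName[dep].Status != Succeeded {
				ready = false
				break
			}
		}
		if ready {
			return t
		}
	}
	return nil
}

// buildFailed reports whether any task failed, so the overall result can
// still be an error after all runnable tasks have been executed.
func buildFailed(tasks []*Task) bool {
	for _, t := range tasks {
		if t.Status == Failed {
			return true
		}
	}
	return false
}

func main() {
	tasks := []*Task{
		{Name: "lint", Status: Failed},
		{Name: "test"},
		{Name: "deploy", DependsOn: []string{"lint", "test"}},
	}
	for t := nextTask(tasks); t != nil; t = nextTask(tasks) {
		fmt.Println("running", t.Name)
		t.Status = Succeeded // pretend the task passed
	}
	fmt.Println("build failed:", buildFailed(tasks))
}
```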

cirrus run should allow to filter by labels

Related to #62. When there are a bunch of tasks with the same name but different labels (container image, some env variables), it is not possible to target a particular task. cirrus run should be able to match by UniqueName too.

Option to override default registry

Right now Docker as a CI environment always uses gcr.io as the registry. For local runs with the CLI it might be useful to override it. Consider the following use case: the CLI is used in a private on-premise Jenkins cluster where there is a local Docker registry under hub.mycorp.internal. It would be great to be able to override gcr.io with hub.mycorp.internal, as well as to push a newly built PrebuiltImage to the registry directly from the CLI.
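The rewrite itself is simple; a Go sketch under the assumption that the image reference starts with the registry host followed by a slash (the function name is made up for the example):

```go
package main

import (
	"fmt"
	"strings"
)

// overrideRegistry replaces the registry host at the start of an image
// reference. References that do not start with "from/" are returned as-is,
// as is everything when no replacement registry is given.
func overrideRegistry(image, from, to string) string {
	prefix := from + "/"
	if to == "" || !strings.HasPrefix(image, prefix) {
		return image
	}
	return to + "/" + strings.TrimPrefix(image, prefix)
}

func main() {
	fmt.Println(overrideRegistry("gcr.io/my-project/my-image:latest", "gcr.io", "hub.mycorp.internal"))
	fmt.Println(overrideRegistry("alpine:latest", "gcr.io", "hub.mycorp.internal"))
}
```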

Run "clone" in a separate container

Let's run CLI's clone instruction in a separate container (for example, alpine:latest) since the task container might not have cp or rsync on the $PATH.

For example Kaniko executor image gcr.io/kaniko-project/executor doesn't have anything.

Document Podman

How to install and use it as an alternative container backend.

Cirrus Config via Starlark

YAML is pretty limiting: it doesn't easily allow custom logic and makes it harder to be concise in some cases.

It will be great to have an option to generate tasks via .cirrus.star written in Starlark.

def main(ctx):
  return {
    "name": "Test Task",
    "instance": container("alpine:latest"),
    "steps": [
      clone(),
      script(name = "env",  content = "printenv")
    ]
  }

ctx will include ctx.env for getting environment variables and ctx.functions to invoke helper functions while generating a set of tasks.

Dirty mode for running tasks

Right now the run command rsyncs the working directory to a container, respecting .gitignore. This takes time and might not always be necessary because some tasks, like linting, are read-only. It would be good to have a --dirty flag so that the project folder is mounted directly instead of being rsynced.

Run tests on Windows

Previous take on this: https://github.com/cirruslabs/cirrus-cli/tree/run-on-windows.

Set CIRRUS_TAG if Git ref is for tag

Respect CIRRUS_WORKING_DIR in dirty mode

--dirty should mount the project dir into the specified CIRRUS_WORKING_DIR. Here is an example .cirrus.yml which should pass:

task:
  container:
    image: alpine:latest
  env:
    CIRRUS_WORKING_DIR: /defs
  test_script: test -f .cirrus.yml
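In Go the mount target could be chosen roughly like this. The result uses Docker's "host:container" bind format; the function name and the fallbackDir parameter (whatever default working directory applies when the variable is unset) are assumptions for the sketch.

```go
package main

import "fmt"

// dirtyBind returns a Docker bind-mount specification ("host:container")
// that mounts projectDir at the task's working directory. fallbackDir is
// used when CIRRUS_WORKING_DIR is not set in the task's environment.
func dirtyBind(projectDir string, env map[string]string, fallbackDir string) string {
	workingDir := fallbackDir
	if dir, ok := env["CIRRUS_WORKING_DIR"]; ok && dir != "" {
		workingDir = dir
	}
	return projectDir + ":" + workingDir
}

func main() {
	env := map[string]string{"CIRRUS_WORKING_DIR": "/defs"}
	fmt.Println(dirtyBind("/home/user/project", env, "/tmp/cirrus-ci-build"))
}
```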

Unable to install

By these instructions: https://github.com/cirruslabs/cirrus-cli/blob/2f6cd75/INSTALL.md#golang

> go get github.com/cirruslabs/cirrus-cli/...
# github.com/go-git/go-git/plumbing/transport/ssh
../go/src/github.com/go-git/go-git/plumbing/transport/ssh/common.go:147:15: undefined: proxy.Dial
# google.golang.org/grpc/internal/transport
../go/src/google.golang.org/grpc/internal/transport/http_util.go:603:6: f.fr.SetReuseFrames undefined (type *http2.Framer has no field or method SetReuseFrames)

> ls $GOPATH/bin/cirrus*
fish: No matches for wildcard “$GOPATH/bin/cirrus*”.

// there are many other binaries

> go version
go version go1.15.1 linux/amd64

Productionize Starlark Configuration

Basic support for Starlark configuration was added in #48. We still need a couple of things in order to productionize the Starlark format and align it with the vision. Things like:

List of builtins (filesystem plus https://github.com/qri-io/starlib for now?)
Template loading
Template testing primitives
gRPC service for evaluating config
Documentation

This will be an issue for tracking overall progress.

Option to keep containers after execution for debugging

Generate JSON Schema

The only follow up of #59

Cache cleanup strategy based on overall cache size

Related to #17

Support Dockerfile as CI environment

https://cirrus-ci.org/guide/docker-builder-vm/#dockerfile-as-a-ci-environment

Enforce CPU and memory limits

Currently these limits are available to the executor (#6) at hand for each task.

The CPU limit needs to be translated from the Kubernetes format first, since it's not a 1-to-1 mapping to Docker container's resources configuration.
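A possible translation, sketched in Go under the assumption that the limit arrives as a Kubernetes-style quantity string such as "500m", "2" or "0.5". The result is expressed in billionths of a CPU, the unit of Docker's NanoCPUs resource setting; the function name is made up for the example.

```go
package main

import (
	"fmt"
	"math"
	"strconv"
	"strings"
)

// nanoCPUs converts a Kubernetes-style CPU quantity into billionths of a
// CPU. "500m" means 500 millicores; a plain number means whole or
// fractional cores.
func nanoCPUs(quantity string) (int64, error) {
	q := strings.TrimSpace(quantity)
	if strings.HasSuffix(q, "m") {
		milli, err := strconv.ParseInt(strings.TrimSuffix(q, "m"), 10, 64)
		if err != nil || milli < 0 {
			return 0, fmt.Errorf("invalid CPU quantity %q", quantity)
		}
		return milli * 1_000_000, nil
	}
	cores, err := strconv.ParseFloat(q, 64)
	if err != nil || cores < 0 || math.IsNaN(cores) || math.IsInf(cores, 0) {
		return 0, fmt.Errorf("invalid CPU quantity %q", quantity)
	}
	return int64(math.Round(cores * 1e9)), nil
}

func main() {
	for _, q := range []string{"500m", "2", "0.5", "lots"} {
		n, err := nanoCPUs(q)
		fmt.Println(q, n, err)
	}
}
```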

Flag to use local YAML parser instead of the remote one

Let's add an --experimental-local-parser flag to use the experimental parser package instead of rpcparser, which makes an RPC call to the Kotlin backend.

Build for more platforms

Since we are about to introduce persistent worker functionality to support Arm, Solaris platforms and so on, we need to build CLI's binaries for them.

`cirrus run` hangs forever; keeps failing to connect to 172.17.0.1

When running cirrus run --verbose --dirty lint[1] in a fresh clone of https://github.com/awooos/flail on my system, it prints this and then hangs until I Ctrl-c it:

~/dev/os/awooos/flail$ cirrus run --verbose --dirty lint
🕚 'Lint' Task 05:25
   ✅ Preparing execution environment... 3.2s
   ✅ docker pull 2.7s
   ...
   creating container using working volume cirrus-working-volume-83694402-93ab-4d7d-b7f2-4507dea658ae
   starting container 177f7516fc6d18567e73cfc1bce0b4a1621d9b5b4ce943f0805d287c468be516
   waiting for container 177f7516fc6d18567e73cfc1bce0b4a1621d9b5b4ce943f0805d287c468be516 to finish

If I check the logs of the container (see below), I get this warning repeated a bunch:

WARNING: 2020/10/10 22:59:42 [core] grpc: addrConn.createTransport failed to connect to {172.17.0.1:46131 172.17.0.1:46131 <nil> 0 <nil>}. Err: connection error: desc = "transport: Error while dialing dial tcp 172.17.0.1:46131: connect: no route to host". Reconnecting...

I've had it sit there for over 20 minutes, at least once.

Various information found while debugging that seems relevant:

Output of `ip addr` on host system, minus parts I know for-sure are not docker-related

7: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default 
    link/ether 02:42:d3:3e:95:56 brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
       valid_lft forever preferred_lft forever
    inet6 fe80::42:d3ff:fe3e:9556/64 scope link 
       valid_lft forever preferred_lft forever
8: br-bf7c4bd9284c: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default 
    link/ether 02:42:15:3e:9c:77 brd ff:ff:ff:ff:ff:ff
    inet 172.20.0.1/16 brd 172.20.255.255 scope global br-bf7c4bd9284c
       valid_lft forever preferred_lft forever

Output of `ip addr` in a Docker container

root@be108f6f8e96:/# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
55: eth0@if56: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default 
    link/ether 02:42:ac:11:00:02 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 172.17.0.2/16 brd 172.17.255.255 scope global eth0
       valid_lft forever preferred_lft forever
root@be108f6f8e96:/#

Output of `docker logs -f <container name>`

~$ docker logs -f charming_pare                                                                                        
INFO: 2020/10/10 22:59:42 [core] parsed scheme: ""                                                                     
INFO: 2020/10/10 22:59:42 [core] scheme "" not registered, fallback to default scheme                                  
INFO: 2020/10/10 22:59:42 [core] ccResolverWrapper: sending update to cc: {[{172.17.0.1:46131  <nil> 0 <nil>}] <nil> <n
il>}                                                                                                                   
INFO: 2020/10/10 22:59:42 [core] ClientConn switching balancer to "pick_first"
INFO: 2020/10/10 22:59:42 [core] Channel switches to new LB policy "pick_first"
INFO: 2020/10/10 22:59:42 [core] Subchannel Connectivity change to CONNECTING
INFO: 2020/10/10 22:59:42 [core] Subchannel picks a new address "172.17.0.1:46131" to connect
INFO: 2020/10/10 22:59:42 [core] Channel Connectivity change to CONNECTING
WARNING: 2020/10/10 22:59:42 [core] grpc: addrConn.createTransport failed to connect to {172.17.0.1:46131 172.17.0.1:4$
131 <nil> 0 <nil>}. Err: connection error: desc = "transport: Error while dialing dial tcp 172.17.0.1:46131: connect: no route to host". Reconnecting...
WARNING: 2020/10/10 22:59:42 [core] grpc: addrConn.createTransport failed to connect to {172.17.0.1:46131 172.17.0.1:4$
131 <nil> 0 <nil>}. Err: connection error: desc = "transport: Error while dialing dial tcp 172.17.0.1:46131: connect: no route to host". Reconnecting...
INFO: 2020/10/10 22:59:42 [core] Subchannel Connectivity change to TRANSIENT_FAILURE
INFO: 2020/10/10 22:59:42 [core] Channel Connectivity change to TRANSIENT_FAILURE
INFO: 2020/10/10 22:59:43 [core] Subchannel Connectivity change to CONNECTING
INFO: 2020/10/10 22:59:43 [core] Subchannel picks a new address "172.17.0.1:46131" to connect
INFO: 2020/10/10 22:59:43 [core] Channel Connectivity change to CONNECTING
WARNING: 2020/10/10 22:59:43 [core] grpc: addrConn.createTransport failed to connect to {172.17.0.1:46131 172.17.0.1:4$
131 <nil> 0 <nil>}. Err: connection error: desc = "transport: Error while dialing dial tcp 172.17.0.1:46131: connect: no route to host". Reconnecting...

Operating system is Fedora 31.

~$ uname -mnrsv
Linux cerberus.fox 5.8.9-101.fc31.x86_64 #1 SMP Mon Sep 14 19:29:57 UTC 2020 x86_64
~$

[1]: WRT needing to use --dirty, I'm still trying to figure that one out.

task script "fails" even though it had exit code 0?

When running cirrus run --verbose --dirty lint[1] in a fresh clone of https://github.com/awooos/flail on my system, it says container exited with <nil> error and exit code 0, then tells me it failed.

I suspect having logs from the actual running commands would help, but I'm not sure if/how I can get those.

~/dev/os/awooos/flail$ cirrus run --verbose --dirty lint
🕙 'Lint' Task 02:45
   ✅ Preparing execution environment... 3.8s
   ✅ docker pull 6.5s
   ✅ 'install' script 02:27
   ❌ 'main' script 1.0s
      begin streaming logs
      command failed

   creating Docker client
   creating container using working volume cirrus-working-volume-9ca71a13-5127-4bd1-bc50-ed060f8b3519
   starting container 0b05f1ae00845dc2b8012cff07494d31a01f4929380f6d98f87bec0b7b2e8635
   received heartbeat
   agent signal: urgent I/O condition
   received heartbeat
   waiting for container 0b05f1ae00845dc2b8012cff07494d31a01f4929380f6d98f87bec0b7b2e8635 to finish
   received heartbeat
   received heartbeat
   container exited with <nil> error and exit code 0
   cleaning up container 0b05f1ae00845dc2b8012cff07494d31a01f4929380f6d98f87bec0b7b2e8635

Error: build failed: task Lint (0) failed
2020/10/10 19:38:27 build failed: task Lint (0) failed
~/dev/os/awooos/flail$

When printing logs from RPC don’t print an extraneous newline

Can be reproduced with cirrus run -o simple -v.

Starlark hooks

Since we're now supporting Starlark, some build-time decisions can be handled there instead of writing the logic in the YAML.

There are a couple of use cases that we can cover by letting users define their own hooks:

customizing re-run logic: cirruslabs/cirrus-ci-docs#238
handling GitHub events: #45 (comment)

The hook name for the first use case will probably be defined explicitly in the YAML to avoid limiting all tasks to a single hook, while the latter will be defined implicitly based on whether a function with a specific name exists in .cirrus.star or not.

$ cirrus run
panic: listen tcp 172.17.0.1:0: bind: cannot assign requested address

goroutine 1 [running]:
github.com/cirruslabs/cirrus-cli/internal/executor/rpc.(*RPC).Start(0xc00009e060)
	/home/jaq/go/pkg/mod/github.com/cirruslabs/[email protected]/internal/executor/rpc/rpc.go:119 +0x359
github.com/cirruslabs/cirrus-cli/internal/executor.(*Executor).Run(0xc0001ed2f0, 0xf4e920, 0xc0003d8300, 0x0, 0x0)
	/home/jaq/go/pkg/mod/github.com/cirruslabs/[email protected]/internal/executor/executor.go:82 +0x62
github.com/cirruslabs/cirrus-cli/internal/commands.run(0xc0003ccb00, 0x1582820, 0x0, 0x0, 0x0, 0x0)
	/home/jaq/go/pkg/mod/github.com/cirruslabs/[email protected]/internal/commands/run.go:158 +0x5f5
github.com/spf13/cobra.(*Command).execute(0xc0003ccb00, 0x1582820, 0x0, 0x0, 0xc0003ccb00, 0x1582820)
	/home/jaq/go/pkg/mod/github.com/spf13/[email protected]/command.go:842 +0x453
github.com/spf13/cobra.(*Command).ExecuteC(0xc0003cc580, 0xe410d8, 0xc00029b440, 0xc0003dc6a0)
	/home/jaq/go/pkg/mod/github.com/spf13/[email protected]/command.go:950 +0x349
github.com/spf13/cobra.(*Command).Execute(...)
	/home/jaq/go/pkg/mod/github.com/spf13/[email protected]/command.go:887
github.com/spf13/cobra.(*Command).ExecuteContext(...)
	/home/jaq/go/pkg/mod/github.com/spf13/[email protected]/command.go:880
main.main()
	/home/jaq/go/pkg/mod/github.com/cirruslabs/[email protected]/cmd/cirrus/main.go:27 +0x137
$ echo $?
2

I can't figure out why it's trying to bind to this address; it doesn't exist on my machine.
