plexsystems / sinker
A tool to sync images from one container registry to another
License: MIT License
I'm trying to use sinker with the AWS ECR credential helper for docker, but getting errors:
AWS_PROFILE=k8s sinker copy -i quay.io/argoproj/argocd:v2.8.3 -t 555337501170.dkr.ecr.us-east-1.amazonaws.com
INFO[0000] Finding images that need to be copied ...
INFO[0001] Copying image quay.io/argoproj/argocd:v2.8.3 to 552312313.dkr.ecr.us-east-1.amazonaws.com/argoproj/argocd:v2.8.3
Error: copy: copy image: copying system image from manifest list: trying to reuse blob sha256:3153aa388d026c26a2235e1ed0163e350e451f41a8a313e1804d7e1afb857ab4 at destination: checking whether a blob sha256:3153aa388d026c26a2235e1ed0163e350e451f41a8a313e1804d7e1afb857ab4 exists in 552312313.dkr.ecr.us-east-1.amazonaws.com/argoproj/argocd: authentication required
I am able to push/pull images to/from this repo using docker command line.
I notice the docs say that
All auth is handled by looking at the clients Docker auth. If the client can perform a docker push or docker pull, sinker will be able to as well.
But I wonder if that extends to the use of credential helpers?
If I switch to storing credentials in the docker config, it works again.
Sinker expects the following format when args contain a Docker image:
spec:
  serviceAccountName: "argo"
  containers:
    - name: controller
      image: "argoproj/workflow-controller:v2.9.5"
      imagePullPolicy: IfNotPresent
      command: [ "workflow-controller" ]
      args:
        - "--configmap"
        - "release-name-workflow-controller-configmap"
        - "--executor-image=argoproj/argoexec:v2.9.5"
and when I have the following in my manifest, it panics:
spec:
  serviceAccountName: "argo"
  containers:
    - name: controller
      image: "argoproj/workflow-controller:v2.9.5"
      imagePullPolicy: IfNotPresent
      command: [ "workflow-controller" ]
      args:
        - "--configmap"
        - "release-name-workflow-controller-configmap"
        - "--executor-image"
        - "argoproj/argoexec:v2.9.5"
The cause is the missing = sign: this manifest passes the flag and its value as separate list items instead.
panic: runtime error: index out of range [1] with length 1
goroutine 1 [running]:
github.com/plexsystems/sinker/internal/manifest.getImagesFromContainers(0xc0000e8b00, 0x1, 0x4, 0x0, 0x0, 0x0)
/Users/daniel.megyesi/go/pkg/mod/github.com/plexsystems/[email protected]/internal/manifest/kubernetes.go:236 +0x4c2
github.com/plexsystems/sinker/internal/manifest.getImagesFromResource(0xc0003eac37, 0x532, 0x1, 0x1, 0xc00030b5b0, 0x1, 0x1)
/Users/daniel.megyesi/go/pkg/mod/github.com/plexsystems/[email protected]/internal/manifest/kubernetes.go:166 +0x251
github.com/plexsystems/sinker/internal/manifest.GetImagesFromKubernetesResources(0xc00025faf8, 0x1, 0x1, 0x3, 0x0, 0x3744, 0x0, 0x0)
/Users/daniel.megyesi/go/pkg/mod/github.com/plexsystems/[email protected]/internal/manifest/kubernetes.go:45 +0x17b
github.com/plexsystems/sinker/internal/manifest.GetImagesFromStandardInput(0x7ffeefbff943, 0x4, 0x7ffeefbff943, 0x4, 0x4)
/Users/daniel.megyesi/go/pkg/mod/github.com/plexsystems/[email protected]/internal/manifest/manifest.go:276 +0x319
github.com/plexsystems/sinker/internal/commands.runCreateCommand(0x7ffeefbff93e, 0x1, 0x0, 0x0, 0x0, 0x0)
/Users/daniel.megyesi/go/pkg/mod/github.com/plexsystems/[email protected]/internal/commands/create.go:68 +0x3d6
github.com/plexsystems/sinker/internal/commands.newCreateCommand.func1(0xc000398580, 0xc0003da7e0, 0x1, 0x3, 0x0, 0x0)
/Users/daniel.megyesi/go/pkg/mod/github.com/plexsystems/[email protected]/internal/commands/create.go:38 +0x2f1
github.com/spf13/cobra.(*Command).execute(0xc000398580, 0xc0003da7b0, 0x3, 0x3, 0xc000398580, 0xc0003da7b0)
/Users/daniel.megyesi/go/pkg/mod/github.com/spf13/[email protected]/command.go:842 +0x453
github.com/spf13/cobra.(*Command).ExecuteC(0xc0003982c0, 0x0, 0x0, 0xc0000400b8)
/Users/daniel.megyesi/go/pkg/mod/github.com/spf13/[email protected]/command.go:950 +0x349
github.com/spf13/cobra.(*Command).Execute(...)
/Users/daniel.megyesi/go/pkg/mod/github.com/spf13/[email protected]/command.go:887
main.main()
/Users/daniel.megyesi/go/pkg/mod/github.com/plexsystems/[email protected]/main.go:10 +0x27
Passing args as separate list items (without the = sign) is very common in the Kubernetes world, so it would be very nice if this were also supported and parsed correctly.
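A minimal sketch of flag parsing that accepts both forms without indexing past the end of a slice. The function name flagValues is hypothetical and this is not sinker's actual parser; it also assumes that a flag followed by a non-flag element takes that element as its value, which misreads boolean flags followed by positional arguments.

```go
package main

import (
	"fmt"
	"strings"
)

// flagValues pairs each flag in a container's args with its value,
// accepting both "--flag=value" and "--flag", "value" forms.
// Flags that carry no value map to an empty string.
func flagValues(args []string) map[string]string {
	values := make(map[string]string)
	for i := 0; i < len(args); i++ {
		arg := args[i]
		if !strings.HasPrefix(arg, "-") {
			continue // a bare value with no preceding flag
		}
		// strings.Cut reports whether "=" was found, so no slice index is needed.
		if name, value, ok := strings.Cut(arg, "="); ok {
			values[name] = value
			continue
		}
		// Separate-item form: take the next element as the value,
		// unless it is itself a flag or there is no next element.
		if i+1 < len(args) && !strings.HasPrefix(args[i+1], "-") {
			values[arg] = args[i+1]
			i++
			continue
		}
		values[arg] = ""
	}
	return values
}

func main() {
	args := []string{
		"--configmap",
		"release-name-workflow-controller-configmap",
		"--executor-image",
		"argoproj/argoexec:v2.9.5",
	}
	fmt.Println(flagValues(args)["--executor-image"])
}
```

Each value could then be checked for an image reference, whichever form the manifest used.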
I'd love to see pull/push progress with human readable sizes so I can easily understand it without having to count the digits.
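One possible way to format the sizes, sketched with binary (1024-based) units. The helper name humanSize and the byte counts in main are illustrative assumptions, not part of sinker.

```go
package main

import "fmt"

// humanSize formats a byte count using binary (1024-based) units.
func humanSize(n int64) string {
	const unit = 1024
	if n < unit {
		return fmt.Sprintf("%d B", n)
	}
	// Find the largest unit that keeps the value at or above 1.
	div, exp := int64(unit), 0
	for v := n / unit; v >= unit; v /= unit {
		div *= unit
		exp++
	}
	return fmt.Sprintf("%.1f %ciB", float64(n)/float64(div), "KMGTPE"[exp])
}

func main() {
	// Example byte counts only.
	fmt.Printf("Processing %s of %s\n", humanSize(937640), humanSize(11251687))
}
```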
Command(s) impacted: List
In later versions of the prometheus-operator (the operator that listens for Prometheus and Alertmanager kinds), the baseImage field has been deprecated in favor of just image.
As of v0.1.0, imagesync only looks for baseImage.
https://github.com/containers/skopeo allows pulling and pushing images without the need for the Docker stack. Adding compatibility with skopeo would allow running sinker where no Docker daemon is available.
EDIT: Just seen that skopeo is a CLI and should not be integrated directly into other software. Maybe this is the right way: https://github.com/containers/image https://pkg.go.dev/github.com/containers/image
If the list of images stored in the repository contains the mirrored URLs, it would be nice to be able to get the original URL of each image.
A use case for this could be scanning images before pushing.
imagesync sync operation to push the image to the registry.
I would like to be able to sync images from a quay repository to my docker hub account.
For example, I have this
target:
  repository: fairwindsops
sources:
  - repository: coreos/flannel
    host: quay.io
    tag: v0.12.0
As it is right now, this tries to push fairwindsops/coreos/flannel:v0.12.0, which is not valid. It would be nice to be able to rewrite the repository name somehow, perhaps to fairwindsops/coreos_flannel:v0.12.0 or something like that.
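A sketch of the rewrite suggested above, assuming the separator is configurable. flattenRepository is a hypothetical helper, not an existing sinker function.

```go
package main

import (
	"fmt"
	"strings"
)

// flattenRepository joins a target namespace and a source repository
// into a two-level name by replacing the slashes in the source.
func flattenRepository(namespace, source, sep string) string {
	return namespace + "/" + strings.ReplaceAll(source, "/", sep)
}

func main() {
	fmt.Println(flattenRepository("fairwindsops", "coreos/flannel", "_") + ":v0.12.0")
}
```

Two different sources can collapse to the same flattened name (for example a/b_c and a_b/c), so a real implementation would need to detect collisions.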
To allow users to not have to first generate a yaml file, sinker should support standard input:
kustomize build manifestDirectory | sinker update -
When working with a big list of images, getting the remote manifests synchronously can take minutes.
I believe this can easily be parallelized to get some better results.
This is the synchronous loop that takes time: https://github.com/plexsystems/sinker/blob/main/internal/commands/copy.go#L84
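A sketch of how such a loop could be bounded and parallelized with goroutines and a counting semaphore. checkAll and the check callback are hypothetical stand-ins for the remote manifest lookup; the limit of 4 is an arbitrary example.

```go
package main

import (
	"fmt"
	"sync"
)

// checkAll runs check for every image with at most limit calls in flight
// and returns the results in the same order as the input.
func checkAll(images []string, limit int, check func(string) (bool, error)) ([]bool, []error) {
	if limit < 1 {
		limit = 1 // a zero-capacity semaphore would block forever
	}
	exists := make([]bool, len(images))
	errs := make([]error, len(images))
	sem := make(chan struct{}, limit) // counting semaphore
	var wg sync.WaitGroup
	for i, image := range images {
		wg.Add(1)
		sem <- struct{}{} // blocks while limit checks are already running
		go func(i int, image string) {
			defer wg.Done()
			defer func() { <-sem }()
			// Each goroutine writes only to its own index, so no mutex is needed.
			exists[i], errs[i] = check(image)
		}(i, image)
	}
	wg.Wait()
	return exists, errs
}

func main() {
	// Stand-in for a remote manifest lookup; a real one would do network I/O.
	check := func(image string) (bool, error) { return image != "", nil }
	exists, _ := checkAll([]string{"quay.io/coreos/flannel:v0.12.0", "busybox"}, 4, check)
	fmt.Println(exists)
}
```

Keeping the limit small avoids hammering a registry with requests and tripping rate limits.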
There is probably value in including a dry-run command to let users know what images will be sync'd. This could also be useful for users to be able to perform additional tasks on newly sync'd images (e.g. locks, notifications, changelogs)
OS info: MacOS 11.6.2
Go version info: go version go1.17.5 darwin/amd64
I couldn't install sinker with the command provided in the readme file:
GO111MODULE=on go get github.com/plexsystems/sinker
go get: installing executables with 'go get' in module mode is deprecated.
Use 'go install pkg@version' instead.
For more information, see https://golang.org/doc/go-get-install-deprecation
or run 'go help get' or 'go help install'.
With the --dry-run flag enabled, Sinker looks for images that need syncing and prints them without performing the actual pull and push.
GitHub Container Registry also does not support nested paths.
Hello!
I started playing with Sinker today and it's great, definitely what we need for our use case.
However, I wanted to use it as part of our CI pipelines to keep project images synced to our internal registries and I can't find a Sinker Docker image anywhere... before I rolled out my own I wanted to ask, does it already exist somewhere?
Hi there, first of all, great tool!
I'm trying a basic use case locally: read from a company repo and push to docker.io, but I'm getting:
Error: push: pushing image to target: pushing image after tag: denied: requested access to the resource is denied
This is the dummy example:
target:
  registry: docker.io
  repository: torrespro
images:
  - repository: authentication-dev
    version: IPS-1.12.0-no-production
    source: company/docker-releases
I can both push and pull from the terminal using plain docker commands. I understood that if I can push and pull, the client could do it as well. Or do I need to set up credentials to push to docker.io? If so, how?
Thanks in advance
Hi,
We are using sinker 0.17 and recently started receiving this error:
INFO[0003] Copying image mcr.microsoft.com/oss/kubernetes-csi/csi-provisioner:v3.2.0 to ******.azurecr.io/oss/kubernetes-csi/csi-provisioner:v3.2.0
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x190 pc=0xea6032]
goroutine 89 [running]:
github.com/containers/image/v5/docker.(*dockerImageDestination).PutBlob(0xc0007548d0, {0x12bbd58, 0xc0000a5140}, {0x12aecc0?, 0xc0003fa320?}, {{0x0, 0x0}, 0xffffffffffffffff, {0x0, 0x0, ...}, ...}, ...)
/Users/john.reese/go/pkg/mod/github.com/containers/image/[email protected]/docker/docker_image_dest.go:136 +0xb2
github.com/containers/image/v5/internal/imagedestination.(*wrapped).PutBlobWithOptions(0xc0004d8998?, {0x12bbd58?, _}, {_, _}, {{0x0, 0x0}, 0xffffffffffffffff, {0x0, 0x0, ...}, ...}, ...)
/Users/john.reese/go/pkg/mod/github.com/containers/image/[email protected]/internal/imagedestination/wrapper.go:48 +0xe3
github.com/containers/image/v5/copy.(*copier).copyBlobFromStream(0xc0003e5290, {0x12bbd58, 0xc0000a5140}, {_, _}, {{0xc000272d20, 0x47}, 0x386800, {0x0, 0x0, ...}, ...}, ...)
/Users/john.reese/go/pkg/mod/github.com/containers/image/[email protected]/copy/copy.go:1570 +0x1fa4
github.com/containers/image/v5/copy.(*imageCopier).copyLayerFromStream(0xc00023e690, {0x12bbd58, 0xc0000a5140}, {0x12b0fc0, 0xc000515140}, {{0xc000272d20, 0x47}, 0x386800, {0x0, 0x0, ...}, ...}, ...)
/Users/john.reese/go/pkg/mod/github.com/containers/image/[email protected]/copy/copy.go:1306 +0x419
github.com/containers/image/v5/copy.(*imageCopier).copyLayer.func3(0xc00023e690, 0x0?, {{0xc000272d20, 0x47}, 0x386800, {0x0, 0x0, 0x0}, 0x0, {0xc000421770, ...}, ...}, ...)
/Users/john.reese/go/pkg/mod/github.com/containers/image/[email protected]/copy/copy.go:1241 +0x345
github.com/containers/image/v5/copy.(*imageCopier).copyLayer(0xc00023e690, {0x12bbd58, 0xc0000a5140}, {{0xc000272d20, 0x47}, 0x386800, {0x0, 0x0, 0x0}, 0x0, ...}, ...)
/Users/john.reese/go/pkg/mod/github.com/containers/image/[email protected]/copy/copy.go:1273 +0x6b4
github.com/containers/image/v5/copy.(*imageCopier).copyLayers.func1(0x0, {{0xc000272d20, 0x47}, 0x386800, {0x0, 0x0, 0x0}, 0x0, {0xc000421770, 0x2c}, ...}, ...)
/Users/john.reese/go/pkg/mod/github.com/containers/image/[email protected]/copy/copy.go:940 +0x41d
created by github.com/containers/image/v5/copy.(*imageCopier).copyLayers.func2
/Users/john.reese/go/pkg/mod/github.com/containers/image/[email protected]/copy/copy.go:977 +0x393
I see that containers/image is now on 5.23. Maybe updating it could solve the issue.
When connecting to a registry running Harbor, sinker is unable to check whether an image exists, because Harbor returns a 404 NOT_FOUND instead of MANIFEST_UNKNOWN.
Observed behaviour:
time="2020-09-17T21:07:13+02:00" level=info msg="Finding images that need to be pushed ..."
Error: push: image exists at remote: get image: GET https://harbor.myreg.org/v2/xyz/coreos/prometheus-operator/manifests/v0.38.0: NOT_FOUND: artifact xyz/coreos/prometheus-operator:v0.38.0 not found
Expected behaviour:
Image gets pushed
Container URLs are misidentified in certain arguments. A quick manifest that demonstrates the error is below:
apiVersion: apps/v1
kind: Deployment
metadata:
  labels: {}
  name: testdeployment
spec:
  selector:
    matchLabels: {}
  template:
    spec:
      containers:
        - args:
            - --proxyComponentLogLevel=misc:error
            - --log_output_level=default:info
          name: test
          image: busybox
The docker client will configure its HTTP client transport to use TLS if you configure it properly through environment variables:
However, sinker doesn't use the docker client for every registry interaction. When it doesn't, it uses the go container registry client with basic auth.
For example: https://github.com/plexsystems/sinker/blob/v0.17.0/internal/docker/docker.go#L178
Due to this, for container registries that are TLS enabled and where basic auth isn't possible, sinker won't completely work.
I think there may be a fix, though. The docker client exposes its TLS-configured HTTPClient.Transport, which can be provided to the go-containerregistry remote client. For example:
func (c Client) ImageExistsAtRemote(ctx context.Context, image string) (bool, error) {
	// ...
	if _, err := remote.Get(reference,
		remote.WithTransport(c.docker.HTTPClient().Transport),
		remote.WithAuthFromKeychain(authn.DefaultKeychain)); err != nil {
		// do something
	}
}
This way, if the environment is configured correctly then the docker client will load client certs which can be used in comms in sinker.
It's completely possible I'm approaching this incorrectly. If this is possible without code changes, please enlighten me.
I'm also happy to raise a PR. If my proposed solution seems reasonable, I'll raise one.
While the intent of the image manifest is to have a central source of truth for all images, there may exist use cases where users just need to perform operations on images ad-hoc.
It's a poor user experience to force a user to create a file to perform trivial actions.
It would be great to have a wildcard in tags, something like this
sources:
  - host: 111111111111.dkr.ecr.ap-southeast-2.amazonaws.com
    repository: config-monkey
    tag: dev-*
target:
  host: 222222222222.dkr.ecr.ap-southeast-2.amazonaws.com
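A sketch of how a pattern like dev-* could be matched, assuming the list of available tags has already been fetched from the source registry. matchTags is a hypothetical helper; path.Match from the standard library provides shell-style wildcards.

```go
package main

import (
	"fmt"
	"path"
)

// matchTags returns the tags that match a shell-style pattern such as "dev-*".
func matchTags(pattern string, tags []string) ([]string, error) {
	var matched []string
	for _, tag := range tags {
		ok, err := path.Match(pattern, tag)
		if err != nil {
			return nil, err // the pattern is malformed
		}
		if ok {
			matched = append(matched, tag)
		}
	}
	return matched, nil
}

func main() {
	// Example tags; a real list would come from the registry.
	tags := []string{"dev-1", "dev-2", "prod-1"}
	matched, err := matchTags("dev-*", tags)
	fmt.Println(matched, err)
}
```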
The Docker image for v0.14 was pushed to Docker Hub, but not the one for the latest release.
This looks like a really interesting project. We currently have some Jenkins jobs that sync specific images between registries using Skopeo, and this may be a nicer approach for us moving forward.
However, most of our images are multi-arch and, inspired by OpenShift 4, we reference everything by digest rather than tag.
Are multi-arch (manifest list) and digests something you currently support or plan to support?
I currently use lstags for syncing. It supports all kinds of fancy regexes for defining versions (but of course they come with their own caveats).
Maybe supporting a list of tags for syncing images would come in handy as well (I do not know how to handle digests, though):
target:
  host: myhost.com
  repository: my/repo
images:
  - repository: library/debian
    host: registry.hub.docker.com
    tag:
      - stretch
      - stretch-slim
      - buster
      - buster-slim
      - testing
      - sid
Originally posted by @mfriedenhagen in #8 (comment)
If the pull process hits a network issue while downloading a specific image, for whatever reason, the whole process is interrupted. This could be problematic in a CI environment. I am proposing a PR that does not block on the failure but still emits an error during execution.
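A sketch of the keep-going behaviour, assuming Go 1.20 or later for errors.Join. pullAll, the pull callback and errTimeout are hypothetical; they stand in for whatever performs the actual pull.

```go
package main

import (
	"errors"
	"fmt"
)

// errTimeout is an example failure used by the demonstration below.
var errTimeout = errors.New("network timeout")

// pullAll attempts every image and keeps going after a failure.
// It returns nil when all pulls succeed, otherwise the joined errors.
func pullAll(images []string, pull func(string) error) error {
	var errs []error
	for _, image := range images {
		if err := pull(image); err != nil {
			errs = append(errs, fmt.Errorf("pull %s: %w", image, err))
		}
	}
	return errors.Join(errs...) // nil when errs is empty
}

func main() {
	pull := func(image string) error {
		if image == "flaky" {
			return errTimeout
		}
		return nil
	}
	// All three images are attempted even though the second one fails.
	fmt.Println(pullAll([]string{"a", "flaky", "b"}, pull))
}
```

The caller can still exit non-zero when the returned error is non-nil, so CI sees the failure without losing the images that did sync.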
For example, I would like to be able to run something like:
kubectl get deploy,ds,statefulset -A -oyaml > all.yaml
sinker update all.yaml
This would enable generating a manifest from all the objects in a cluster.
The list format is something like this
apiVersion: v1
items:
  - apiVersion: apps/v1
    kind: Deployment
    metadata:
      annotations:
First thank you for the project! I'd like to make a suggestion:
Use case: you want to retrieve all images from one cluster to make it air-gapped.
Let's assume you're already running a cluster. It is probably convenient to get all images from it and then generate the config needed for sinker.
Something like this will get all the running images:
package main

import (
	"context"
	"flag"
	"fmt"
	"os"
	"path/filepath"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func AppendIfMissing(slice []string, i string) []string {
	for _, ele := range slice {
		if ele == i {
			return slice
		}
	}
	return append(slice, i)
}

func homeDir() string {
	if h := os.Getenv("HOME"); h != "" {
		return h
	}
	return os.Getenv("USERPROFILE")
}

func main() {
	var kubeconfig *string
	if home := homeDir(); home != "" {
		kubeconfig = flag.String("kubeconfig", filepath.Join(home, ".kube", "config"), "(optional) absolute path to the kubeconfig file")
	} else {
		kubeconfig = flag.String("kubeconfig", "", "absolute path to the kubeconfig file")
	}
	flag.Parse()

	// use the current context in kubeconfig
	config, err := clientcmd.BuildConfigFromFlags("", *kubeconfig)
	if err != nil {
		panic(err.Error())
	}

	// create the clientset
	clientset, err := kubernetes.NewForConfig(config)
	if err != nil {
		panic(err.Error())
	}

	pods, err := clientset.CoreV1().Pods("").List(context.TODO(), metav1.ListOptions{})
	if err != nil {
		panic(err.Error())
	}

	s := []string{}
	for _, p := range pods.Items {
		for _, c := range p.Spec.Containers {
			s = AppendIfMissing(s, c.Image)
		}
	}
	for _, image := range s {
		fmt.Println(image)
	}
}
From there we'd only need to generate the yaml for sinker and/or filter it. Thoughts?
Hey, love the tool!
However, it looks like a lot of repos are not recognized by the sinker create command. I had to manually add host: to the manifest YAML for the following public repos, which we use and wish to sync to our own private repo:
registry.opensource.zalan.do
docker.elastic.co
Example of the image: definition in k8s: docker.elastic.co/elasticsearch/elasticsearch:7.4.0
This was not properly added to the image manifest, and sinker was trying to pull it from the default (Docker Hub).
What's up with that? Am I doing something wrong?
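For reference, a simplified sketch of the convention Docker uses to decide whether an image reference starts with a registry host: the first path component counts as a host only if it contains a "." or ":" or is exactly "localhost". splitHost is a hypothetical helper and says nothing about how sinker create is actually implemented.

```go
package main

import (
	"fmt"
	"strings"
)

// splitHost separates the registry host from the rest of an image reference.
// An empty host means none was given, in which case Docker Hub is implied.
func splitHost(image string) (host, remainder string) {
	first, rest, found := strings.Cut(image, "/")
	if found && (strings.ContainsAny(first, ".:") || first == "localhost") {
		return first, rest
	}
	return "", image
}

func main() {
	host, remainder := splitHost("docker.elastic.co/elasticsearch/elasticsearch:7.4.0")
	fmt.Println(host, remainder)
}
```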
It would be great to support multiple targets, something like this
sources:
  - repository: vikas027/alpine
    host: docker.io
    tag: minimal
targets:
  - host: 111111111111.dkr.ecr.ap-southeast-2.amazonaws.com
  - host: 222222222222.dkr.ecr.ap-southeast-2.amazonaws.com
Allow filtering images to synchronize based on the name of the images.
eg: images.yaml
sources:
  - repository: coreos/prometheus-operator
    host: quay.io
    tag: v0.40.0
  - repository: super/secret
    tag: v0.3.0
    auth:
      username: DOCKER_USER_ENV
      password: DOCKER_PASSWORD_ENV
  - repository: nginx
    digest: sha256:bbda10abb0b7dc57cfaab5d70ae55bd5aedfa3271686bace9818bba84cd22c29
sinker -m images.yaml push --filter nginx
will push only the images matching the filter (as a regexp).
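A sketch of the proposed filter using the standard regexp package. filterRepositories is a hypothetical helper that works on plain repository names; how the --filter value would reach it is left open.

```go
package main

import (
	"fmt"
	"regexp"
)

// filterRepositories keeps the repositories that match the expression.
func filterRepositories(repositories []string, expr string) ([]string, error) {
	re, err := regexp.Compile(expr)
	if err != nil {
		return nil, fmt.Errorf("compile filter: %w", err)
	}
	var kept []string
	for _, repository := range repositories {
		// MatchString is unanchored: "nginx" also matches "library/nginx".
		if re.MatchString(repository) {
			kept = append(kept, repository)
		}
	}
	return kept, nil
}

func main() {
	repositories := []string{"coreos/prometheus-operator", "super/secret", "nginx"}
	kept, err := filterRepositories(repositories, "nginx")
	fmt.Println(kept, err)
}
```

Anchoring the expression (for example ^nginx$) restricts the match to the exact name.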
I want to be able to specify a new tag or tags that will be set (in addition to the existing ones) when syncing.
For example something like this:
target:
host: ghcr.io
repository: briantist
additional_tags:
- stable-2.11
sources:
- repository: ansible/base-test-container
host: quay.io
tag: 2.1.0
- repository: ansible/default-test-container
host: quay.io
tag: 5.4.0
So that the two source images would get synced to the target with both their original tag and the additional tag(s) entered; for example, ansible/base-test-container:2.1.0 would also be ansible/base-test-container:stable-2.11.
Running sinker push automatically pulls the image version matching the local machine. This is fine as long as the local machine (or CI build server) runs on the same architecture as the servers running the workload.
But if I happen to run sinker push on my M1 Mac, I'll silently push a bunch of linux/arm64 images to the repo. This can result in the runtime environment pulling bad images and workloads being unable to start.
It would be good to be able to specify the expected platform when pulling and pushing with sinker:
sinker push --platform linux/amd64
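A sketch of validating the value of such a flag before it is handed to whatever performs the pull. The platform type and parsePlatform are hypothetical, and the --platform flag itself is only the proposal above, not an existing sinker option.

```go
package main

import (
	"fmt"
	"strings"
)

// platform holds the parts of a string such as "linux/amd64" or "linux/arm/v7".
type platform struct {
	OS, Arch, Variant string
}

// parsePlatform splits "os/arch" or "os/arch/variant" and rejects anything else.
func parsePlatform(s string) (platform, error) {
	parts := strings.Split(s, "/")
	if len(parts) < 2 || len(parts) > 3 {
		return platform{}, fmt.Errorf("platform %q: want os/arch or os/arch/variant", s)
	}
	for _, part := range parts {
		if part == "" {
			return platform{}, fmt.Errorf("platform %q: empty component", s)
		}
	}
	p := platform{OS: parts[0], Arch: parts[1]}
	if len(parts) == 3 {
		p.Variant = parts[2]
	}
	return p, nil
}

func main() {
	p, err := parsePlatform("linux/amd64")
	fmt.Println(p.OS, p.Arch, err)
}
```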
A few questions have been asked about the ability to set the target on a case by case basis, and set the Auth for an entire registry.
To support the most common case of syncing images to a single target, the parent level target should still continue to exist.
Optionally, each image could override the parent target with a target of its own. A target would include a registry, repository, and auth.
The source field should also be renamed to host to more accurately reflect its contents.
Question: when using sinker create, we're seeing some false image repos in the output manifest...
Given that we have
- args:
    - --secure-listen-address=[$(IP)]:9100
    - --grpc-address=0.0.0.0:10901
    - --http-address=0.0.0.0:10902
we get
- repository: '[$(IP)]'
  tag: "9100"
- repository: 0.0.0.0
  host: 0.0.0.0:10901
  tag: "10901"
- repository: 0.0.0.0
  host: 0.0.0.0:10902
  tag: "10902"
Is there some way to get around this behaviour?
I'm on version 0.14.2.
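One heuristic that would screen out values like these, sketched under the assumption that a listen address has either an IP literal or a Kubernetes $(VAR) expansion as its host part. looksLikeAddress is hypothetical and not something sinker provides; it would not catch values such as misc:error, which are syntactically indistinguishable from a name:tag pair.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// looksLikeAddress reports whether a flag value is more plausibly a
// listen address (for example "0.0.0.0:10901") than an image reference.
func looksLikeAddress(value string) bool {
	host, _, err := net.SplitHostPort(value)
	if err != nil {
		return false // no port, or too many colons for a host:port pair
	}
	if net.ParseIP(host) != nil {
		return true // IP literal followed by a port
	}
	// "$(IP)" style expansion used in Kubernetes container args.
	return strings.Contains(host, "$(")
}

func main() {
	for _, v := range []string{"[$(IP)]:9100", "0.0.0.0:10901", "argoproj/argoexec:v2.9.5"} {
		fmt.Println(v, looksLikeAddress(v))
	}
}
```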
Great project!
One idea that I have for an enhancement is to add the possibility to sync a whole repository.
For example:
I have myuser at Docker Hub with some containers and myotheruser at Quay, and I want to sync all containers from Docker Hub to Quay just in case one of the platforms is down.
Just an idea which would be cool and possibly useful for companies that maintain multiple registries with the same containers. :)
Maybe I'm missing something obvious, but I had expected sinker to create repositories on the target for me.
target:
  host: <account-number>.dkr.ecr.ap-southeast-2.amazonaws.com
sources:
  - repository: coreos/prometheus-operator
    host: quay.io
    tag: v0.40.0
$ sinker --version
sinker version v0.15.0
$ sinker push
INFO[0000] Finding images that need to be pushed ...
Error: push: image exists at remote: get image: GET https://<account-number>.dkr.ecr.ap-southeast-2.amazonaws.com/v2/coreos/prometheus-operator/manifests/v0.40.0: NAME_UNKNOWN: The repository with name 'coreos/prometheus-operator' does not exist in the registry with id '<account-number>'
If I create the repository manually in ECR and then try again, it works:
$ sinker push
INFO[0000] Finding images that need to be pushed ...
INFO[0000] Pulling quay.io/coreos/prometheus-operator:v0.40.0
INFO[0010] Pulling quay.io/coreos/prometheus-operator:v0.40.0 (Processing 937640B of 11251687B)
INFO[0013] Pushing <account-number>.dkr.ecr.ap-southeast-2.amazonaws.com/coreos/prometheus-operator:v0.40.0
INFO[0016] Pushing <account-number>.dkr.ecr.ap-southeast-2.amazonaws.com/coreos/prometheus-operator:v0.40.0 (Processing 8291328B of 35590144B)
INFO[0024] All images have been pushed!
Is this expected, or am I using it incorrectly?
In the current implementation, it is assumed that the images.txt produced by the tool contains the mirrored images. Typically when you create a bundle of Kubernetes manifests, the image source will be your mirror.
Then when running commands like sync and check, the --mirror flag will figure out the original URL behind the scenes.
Though it does feel more natural to have the images.txt be the original images, and have --mirror mirror to the URL specified, rather than as a way of telling the tool "strip this URL away and figure out the source".