Comments (7)
Ah, sorry, this is actually caused by the automatic compression/decompression done during copying. It looks like `ShouldCompressLayers` isn't being handled properly, causing a copy from `docker-daemon` -> `docker-archive` to work, but `docker` -> `docker-archive` to not work.
from image.
Ah, it's because `copy/copy.go` doesn't actually appear to decompress layers if the target layer doesn't accept compression. Which is a bit ... odd.
So, to be explicit, am I correct that the issue is:

- `skopeo copy docker://… docker-archive:$x` creates a valid (i.e. consumable by `docker load`) tarball, which contains compressed layers and DiffIDs of uncompressed data [but no record of the digests of the compressed data];
- `skopeo copy docker-archive:$x …` then fails, because it uses the DiffIDs to identify layers but the actual files are compressed?
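To make the mismatch concrete, here is a minimal, self-contained sketch (the layer bytes and helper names are illustrative, not the real containers/image code): the DiffID is the digest of the *uncompressed* tar stream, while the blob stored in the archive is the *compressed* stream, so verifying the stored file against the DiffID can never succeed.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"crypto/sha256"
	"fmt"
)

// sha256Hex returns a "sha256:..." digest string for data; the same
// algorithm underlies both DiffIDs and blob digests.
func sha256Hex(data []byte) string {
	return fmt.Sprintf("sha256:%x", sha256.Sum256(data))
}

// gzipBytes compresses data the way a registry-pulled layer is compressed.
func gzipBytes(data []byte) []byte {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	zw.Write([]byte(data))
	zw.Close()
	return buf.Bytes()
}

func main() {
	layer := []byte("uncompressed layer tar contents")
	diffID := sha256Hex(layer)                // what the image config records
	blobDigest := sha256Hex(gzipBytes(layer)) // what the tarball actually stores
	fmt.Println("DiffID:     ", diffID)
	fmt.Println("blob digest:", blobDigest)
	fmt.Println("match:", diffID == blobDigest) // false: DiffID-based verification fails
}
```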
Yeah, that’s pretty ugly. (For the record, it is not an issue for `docker-daemon:`, because the daemon transparently unpacks the compressed layers and verifies DiffIDs on `docker load`, and on `docker save` it always creates uncompressed layers.)
Now, whether we should be creating and/or accepting tarballs which contain compressed layers is really a matter of definition; AFAIK the tarball format is not formally documented, so strictly speaking we can do anything, e.g. one (or more) of the following:

1. Always uncompress layers in `archiveImageDestination`, so that the DiffID matches in the tarball. May waste space.
2. Always uncompress layers in `archiveImageSource`, so that the DiffID matches during verification. Fairly likely wastes time.
3. Teach the `copy.go` digest verification code about some digests being of the compressed data / of the uncompressed data (we already have ugly code to compute DiffID values for compressed tarballs).
4. Teach `copy.go` that some sources have expected digest verification failures. (That’s pretty risky and ugly.)

I guess my weak preference would be 2. or 3., because that makes any tarball acceptable to `docker load` also acceptable to `docker-archive:`. OTOH 1. is probably a bit easier to implement.
A particular concern, of course, is signatures: whether we verify, what we verify, and who decides what we verify, all matters. Right now, in the case of the `docker-daemon:` format, this is academic because the format can’t record or provide signatures, but we do intend to integrate that $somehow fairly soon. I’m afraid right now I don’t have a completely clear idea what that would look like and what that would mean for the whole problem space. (It does seem attractive to be able to authenticate the `config.json` from the signature $somehow, and then use the DiffID values from there to authenticate layers; rebuilding compressed tarballs to match expected digests of compressed data is pretty much a non-starter, see #157.)
Thinking about this a bit more, the abstraction we impose on an `ImageSource` forces us to create an artificial manifest for `GetManifest`; and that manifest must refer to blobs using the same digests which we return in `GetBlob`. So, unless we invent a new manifest schema, options 3 and 4 are not viable.

(OTOH we also now have option 5: rework the abstraction, perhaps to expose the tar `manifest.json` as a new kind of manifest supported by `containers/image/image`, using the `UpdatedImage` mechanism for manifest conversion from/to schema2 and others. But that is a manifest of the tar file, which may contain several images; it is a poor fit for an “image manifest”, and defining a single `manifestItem` to be the tar image manifest would be inventing a completely new format.)
So right now I am leaning towards either 2 (silently uncompress blobs in `archiveImageSource.GetBlob`) or 2a (when collecting the data for `archiveImageSource.GetManifest`, use the available DiffID if we detect the blob as uncompressed, and compute a new digest of the compressed data if the blob is compressed or unrecognizable).
@mtrmac Alright, I'm back. Sorry for the extended silence.

I would prefer that we just decompress blobs when we're an `ImageSource` -- which means that there's no messing around with manifests or other structures -- we are just providing a different blob to the one we were originally given.

Handling signatures is a bit of an issue, but ultimately, because of the hacks done with DiffIDs, we will have to modify something. And that something will always be either the manifest or the actual blobs. I'd prefer if we maintain the invariant of layer identifiers being content-addressable...

Currently I'm reworking #193 to make my current solution better.
> I would prefer that we just decompress blobs when we're an `ImageSource`

ACK.