
Comments (7)

cyphar commented on July 20, 2024

Ah, sorry, this is actually caused by the automatic compression/decompression done during copying. It looks like ShouldCompressLayers isn't being handled properly, so a copy from docker-daemon -> docker-archive works, but docker -> docker-archive doesn't.


cyphar commented on July 20, 2024

Ah, it's because copy/copy.go doesn't actually appear to decompress layers if the target layer doesn't accept compression. Which is a bit ... odd.


mtrmac commented on July 20, 2024

So, to be explicit, am I correct that the issue is:

  • skopeo copy docker://… docker-archive:$x creates a valid (i.e. consumable by docker load) tarball, which contains compressed layers and DiffIDs of uncompressed data [but no record of the digests of the compressed data]
  • skopeo copy docker-archive:$x … then fails, because it uses the DiffIDs to identify layers but the actual files are compressed

?

Yeah, that’s pretty ugly. (For the record, it is not an issue for docker-daemon:, because the daemon transparently unpacks the compressed layers and verifies DiffIDs on docker load, and on docker save it always creates uncompressed layers.)

Now, whether we should be creating and/or accepting tarballs which contain compressed layers is really a matter of definition; AFAIK the tarball format is not formally documented, so strictly speaking we can do anything, e.g. one (or more) of the following:

  1. Always uncompress layers in archiveImageDestination so that the DiffID matches in the tarball. May waste space.
  2. Always uncompress layers in archiveImageSource so that the DiffID matches during verification. Fairly likely wastes time.
  3. Teach the copy.go digest verification code about some digests being of the compressed data / of the uncompressed data (we already have ugly code to compute DiffID values for compressed tarballs).
  4. Teach copy.go that some sources have expected digest verification failures. (That’s pretty risky and ugly.)

I guess my weak preference would be 2. or 3., because that makes any tarball acceptable to docker load also acceptable to docker-archive:. OTOH 1. is probably a bit easier to implement.


mtrmac commented on July 20, 2024

A particular concern, of course, is signatures: whether we verify, what we verify, and who decides what we verify, all matters. Right now, in the case of the docker-daemon: format, this is academic because the format can’t record or provide signatures, but we do intend to integrate that $somehow fairly soon. I’m afraid right now I don’t have a completely clear idea what that would look like and what that would mean for the whole problem space. (It does seem attractive to be able to authenticate the config.json from the signature $somehow, and then use the DiffID values from there to authenticate layers; rebuilding compressed tarballs to match expected digests of compressed data is pretty much a non-starter, see #157 .)


mtrmac commented on July 20, 2024

Thinking about this a bit more, the abstraction we impose on an ImageSource forces us to create an artificial manifest for GetManifest; and that manifest must refer to blobs using the same digests which we return in PutBlob. So, unless we invent a new manifest schema, options 3 and 4 are not viable.

(OTOH we also now have option 5: rework the abstraction, perhaps to expose the tar manifest.json as a new kind of manifest supported by containers/image/image and using the UpdatedImage mechanism for manifest conversion from/to schema2 and others. But that is a manifest of the tar file which may contain several images, it is a poor fit for an “image manifest”, and defining a single manifestItem to be the tar image manifest would be inventing a completely new format.)

So right now I am leaning towards either 2 (silently uncompress blobs in archiveImageSource.GetBlob) or 2a (when collecting the data for archiveImageSource.GetManifest, use the available DiffID if we detect the blob as uncompressed, and compute a new digest of the compressed data if the blob is compressed or unrecognizable).


cyphar commented on July 20, 2024

@mtrmac Alright I'm back. Sorry for the extended silence.

I would prefer that we just decompress blobs when we're an ImageSource -- which means that there's no messing around with manifests or other structures -- we are just providing a different blob to the one we were originally given.

Handling signatures is a bit of an issue, but ultimately, because of the hacks done with DiffIDs, we will have to modify something. And that something will always be either the manifest or the actual blobs. I'd prefer that we maintain the invariant that layer identifiers are content-addressable...

Currently I'm reworking #193 to make my current solution better.


mtrmac commented on July 20, 2024

I would prefer that we just decompress blobs when we're an ImageSource

ACK.

