
Comments (13)

runcom commented on July 20, 2024

/cc @mtrmac


aweiteka commented on July 20, 2024

related issue #120


mtrmac commented on July 20, 2024

Ouch.

Anyway, to be more specific, quoting that RFC:

These characters are called
"reserved" because they may (or may not) be defined as delimiters by
the generic syntax, by each scheme-specific syntax, or by the
implementation-specific syntax of a URI's dereferencing algorithm.
If data for a URI component would conflict with a reserved
character's purpose as a delimiter, then the conflicting data must be
percent-encoded before the URI is formed.

(emphasis mine)

Then, looking at http URI ( https://tools.ietf.org/html/rfc7230#section-2.7.1 ), we are back in https://tools.ietf.org/html/rfc3986#section-3.3 for path-abempty, and ultimately pchar, which explicitly contains :, and then

instead, each syntax rule lists the characters
allowed within that component (i.e., not delimiting it)

so : is allowed in principle, with

, and any of
those characters that are also in the reserved set are "reserved" for
use as subcomponent delimiters within the component.

which… changes nothing?

And then

URI producing applications should percent-encode data octets that
correspond to characters in the reserved set unless these characters
are specifically allowed by the URI scheme to represent data in that
component.

The HTTP specification says nothing in that respect, but

If a reserved character is found in a URI component and
no delimiting role is known for that character, then it must be
interpreted as representing the data octet corresponding to that
character's encoding in US-ASCII.

if we include : in the URL, it “must” be interpreted as a : byte.


Anyway, with

Percent-encoding the path (%3A) does not help.

the question of whether : is reserved by the URI spec is moot: any percent-encoded byte is allowed in the URI. AFAICS the issue is simply that Artifactory refuses a :, however it is represented.

(That’s not to say that this isn’t an issue and we can just define it away… but if we care about what Artifactory accepts, then we need to know what it refuses so that any possible replacement to the scheme isn’t rejected again.)
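To illustrate why percent-encoding does not help, here is a minimal Python sketch. It assumes the server percent-decodes the path before validating it; that is an assumption about Artifactory, not something confirmed in this thread, and the function name and shortened digest are hypothetical.

```python
from urllib.parse import unquote


def path_contains_after_decoding(raw_path: str, ch: str) -> bool:
    """Return True if ch appears in the path once percent-escapes are decoded."""
    return ch in unquote(raw_path)


# Hypothetical, shortened digest used only for illustration.
literal = "/busybox@sha256:817a"
escaped = "/busybox@sha256%3A817a"

# Both forms carry the same ':' byte after decoding.
print(path_contains_after_decoding(literal, ":"))
print(path_contains_after_decoding(escaped, ":"))
```

Under that assumption, a validator that rejects : on the decoded path rejects both spellings, which matches the observation that %3A does not help.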


rhatdan commented on July 20, 2024

I believe we care what Artifactory accepts. Should we open a conversation with them?


aweiteka commented on July 20, 2024

For reference, here's the code (line 101):
http://subversion.jfrog.org/artifactory/public/trunk/base/common/src/main/java/org/artifactory/util/PathValidator.java

I'll bump this thread.


mtrmac commented on July 20, 2024

I guess this could be because of https://msdn.microsoft.com/en-us/library/windows/desktop/aa365247(v=vs.85).aspx , i.e. colons cannot be used in file names on Windows? That does seem to be important for cross-platform portability.


rhatdan commented on July 20, 2024

So from my reading these characters are out.

< (less than)
> (greater than)
: (colon)
" (double quote)
/ (forward slash)
\ (backslash)
| (vertical bar or pipe)
? (question mark)
* (asterisk)
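
The list above can be turned into a quick check. This is a small Python sketch, not code from containers/image or Artifactory; the function name and the shortened digests are hypothetical.

```python
# Characters from the list above that Windows does not allow in file names.
WINDOWS_RESERVED = set('<>:"/\\|?*')


def windows_unsafe_chars(name: str) -> list:
    """Return the sorted reserved characters that occur in a single file name."""
    return sorted(set(name) & WINDOWS_RESERVED)


# Hypothetical, shortened digests used only for illustration.
print(windows_unsafe_chars("busybox@sha256:817a"))  # the ':' is flagged
print(windows_unsafe_chars("busybox@sha256=817a"))  # nothing is flagged
```

The check is meant for one path component at a time, since / is itself on the list.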


rhatdan commented on July 20, 2024

Suggestions:

!
#
&
+
-
_
@
~


aweiteka commented on July 20, 2024

# is a fragment delimiter, example.com#some-anchor
& is a query key=value delimiter, example.com?foo=bar&bar=foo
@ is an email address delimiter, as in user@example.com
~ is a user delimiter, example.com/~aweiteka
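
As a rough illustration of the delimiter concern, the following Python sketch uses the standard urllib.parse.urlsplit to show what ends up in the path when a candidate separator is pasted into a URL unescaped. The helper name and the shortened digest are hypothetical.

```python
from urllib.parse import urlsplit


def path_and_fragment(url: str) -> tuple:
    """Split a URL and return its (path, fragment) pair."""
    parts = urlsplit(url)
    return parts.path, parts.fragment


# '#' starts the fragment, so the digest never becomes part of the path.
print(path_and_fragment("https://example.com/busybox@sha256#817a"))

# '&' only separates key=value pairs after a '?'; inside a path it is plain data.
print(path_and_fragment("https://example.com/busybox@sha256&817a"))
```

So # is unusable as a path separator unless it is escaped, while & is only a delimiter within the query component.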


mtrmac commented on July 20, 2024

Based on https://github.com/docker/distribution/blob/master/reference/reference.go ,

  • / delimits components
  • _, . and - are allowed in reference components
  • +, _, . and - are allowed in digest algorithm
  • : and @ are used in the tag@algo:digest syntax

(This duplicates some of the exclusions mentioned above, repeating to explicitly associate them with the semantics / rationale.)


rhatdan commented on July 20, 2024

Does that leave only these from my original list?

!
~


mtrmac commented on July 20, 2024

So the URI reserved characters are really a red herring; containers/image does already percent-encode them in the path when sending the request, and HTTP servers obviously percent-decode. This can be demonstrated by applying #201 and configuring e.g.

docker:
    docker.io/library:
        sigstore: https://example.com/a:b/A%3AB/c[d/e]f/g@h

in a registries.d/something.yaml, which results in e.g.

GET /a:b/A:B/c%5Bd/e%5Df/g@h/docker.io/library/busybox@sha256:817a12c32a39bbe394944ba49de563e085f1d3c5266eb8e9723256bc4448680e/signature-1 HTTP/1.1

(note how both : and @ are unquoted, and how Go’s net/url parsing+quoting canonicalizes %3A to :; apparently the correct reading of RFC 3986 is that when : and @ are explicitly listed in the pchar rule,

URI producing applications should percent-encode data octets that
correspond to characters in the reserved set unless these characters
are specifically allowed by the URI scheme to represent data in that
component.

“specifically allowed” characters are not percent-encoded)
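
The canonicalization seen in that request can be approximated in Python. This is only a sketch of the same idea using urllib.parse, not Go's net/url and not the containers/image code; PCHAR_SAFE and canonicalize_path are hypothetical names, and the safe set is taken from the pchar rule (:, @ and the sub-delims) plus / as the segment separator.

```python
from urllib.parse import quote, unquote

# Characters left unescaped in a path: '/', the two explicitly allowed pchar
# characters ':' and '@', and the RFC 3986 sub-delims.
PCHAR_SAFE = "/:@!$&'()*+,;="


def canonicalize_path(path: str) -> str:
    """Decode existing escapes, then re-escape only what pchar does not allow."""
    # Note: decoding first would also turn %2F into a real '/', so this sketch
    # is not suitable for paths that rely on escaped slashes.
    return quote(unquote(path), safe=PCHAR_SAFE)


print(canonicalize_path("/a:b/A%3AB/c[d/e]f/g@h"))
# /a:b/A:B/c%5Bd/e%5Df/g@h
```

As in the GET line above, %3A collapses to :, @ stays as is, and only [ and ] are escaped.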

So, the binding constraints are only 1) Windows pathname restrictions, 2) characters which already have defined meanings in docker digest references, prohibiting ", *, +, -, ., :, <, >, ?, \, _, |.

That leaves us quite a few options.

To make things simple for naive clients constructing HTTP by pasting strings, we can use one of the “specifically allowed” characters for pchar, i.e. :, @ (explicitly allowed), !, $, &, ', (, ), *, +, ,, ;, = (sub-delims), -, ., _, ~, minus the characters prohibited above (:, *, +, -, ., _).

And to make things simple for us and copy&paste in shell, let’s only consider characters which are not treated by sh specially: %, ,, =, @, ^, {, }, ~ (the last three are treated specially only in specific positions).

The intersection is ,, =, @, ~, and we are already using @. In this set, = seems a fairly obvious choice: …/busybox@sha256=817a12c32a39bbe394944ba49de563e085f1d3c5266eb8e9723256bc4448680e.
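
The selection above is just set arithmetic, so it can be double-checked with a short Python sketch. The three sets are copied from this comment; the names and the signature_dir helper are hypothetical and are not part of containers/image.

```python
# Characters the pchar rule lets through unescaped: ':' and '@', the
# sub-delims, and '-', '.', '_', '~'.
PCHAR_ALLOWED = set(":@") | set("!$&'()*+,;=") | set("-._~")

# Prohibited by Windows file names or already meaningful in docker references.
PROHIBITED = set('"*+-.:<>?\\_|')

# Characters listed above as not treated specially by sh.
SHELL_PLAIN = set("%,=@^{}~")


def separator_candidates() -> list:
    """Return the characters that satisfy all three constraints, sorted."""
    return sorted((PCHAR_ALLOWED - PROHIBITED) & SHELL_PLAIN)


def signature_dir(name: str, algo: str, hex_digest: str) -> str:
    """Build the name@algo=digest form proposed above."""
    return f"{name}@{algo}={hex_digest}"


print(separator_candidates())
print(signature_dir("busybox", "sha256", "817a12c32a39bbe394944ba49de563e085f1d3c5266eb8e9723256bc4448680e"))
```

The first call yields the same four characters as the intersection stated above, and the second reproduces the …/busybox@sha256=… form.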


aweiteka commented on July 20, 2024

I like it. '=' it is.

