
Comments (4)

jaraco commented on June 2, 2024

Okay. Yes. I think perhaps I now see.

And to be sure, I'd like to honor RFCs as accurately as possible.

And now that I look into it more, I'm reminded that curl, unlike other HTTP clients, doesn't do any URL encoding.

$ curl 'https://httpbin.org/anything/foo bar' -v
*   Trying 54.243.202.193...
* TCP_NODELAY set
* Connected to httpbin.org (54.243.202.193) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/cert.pem
  CApath: none
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Client hello (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS change cipher, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
* ALPN, server accepted to use http/1.1
* Server certificate:
*  subject: CN=httpbin.org
*  start date: Nov 12 23:32:05 2017 GMT
*  expire date: Feb 10 23:32:05 2018 GMT
*  subjectAltName: host "httpbin.org" matched cert's "httpbin.org"
*  issuer: C=US; O=Let's Encrypt; CN=Let's Encrypt Authority X3
*  SSL certificate verify ok.
> GET /anything/foo bar HTTP/1.1
> Host: httpbin.org
> User-Agent: curl/7.54.0
> Accept: */*
> 
< HTTP/1.1 505 HTTP Version Not Supported
< Connection: close
< Server: Cowboy
< Date: Tue, 21 Nov 2017 02:10:29 GMT
< Content-Length: 0
< 
* Closing connection 0
* TLSv1.2 (OUT), TLS alert, Client hello (1):

So curl was a bad example for how URIs should be encoded.

So I'm now convinced you're right... and it's the application I'm troubleshooting that should be quoting its requests in tests, rather than cherrypy/cheroot that should be supporting those URIs.
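As a minimal sketch of what "quoting its requests" could look like in a test, assuming the client code is Python and builds the path itself: the standard library's `urllib.parse.quote` percent-encodes the space before the request is sent. The `raw_path` value is just the path from the curl run above; the variable names are illustrative.

```python
from urllib.parse import quote

# Path as a human would write it, containing a literal space.
raw_path = '/anything/foo bar'

# quote() keeps "/" unescaped by default (safe='/'), so only the
# space is percent-encoded and the path separators survive.
encoded_path = quote(raw_path)

print(encoded_path)  # /anything/foo%20bar
```

The encoded form is what belongs on the request line, so a server that follows RFC 3986 strictly can parse it.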

from cheroot.

jaraco commented on June 2, 2024

Thanks @webknjaz for the distilled references and the clarification. That made all the difference in helping me understand the issue.


webknjaz commented on June 2, 2024

@jaraco I cannot agree with you on this:

https://tools.ietf.org/html/rfc3986#section-3.3:

 path          = path-abempty    ; begins with "/" or is empty
               / path-absolute   ; begins with "/" but not "//"
               / path-noscheme   ; begins with a non-colon segment
               / path-rootless   ; begins with a segment
               / path-empty      ; zero characters

 path-abempty  = *( "/" segment )
 path-absolute = "/" [ segment-nz *( "/" segment ) ]
 path-noscheme = segment-nz-nc *( "/" segment )
 path-rootless = segment-nz *( "/" segment )
 path-empty    = 0<pchar>
 segment       = *pchar
 segment-nz    = 1*pchar
 segment-nz-nc = 1*( unreserved / pct-encoded / sub-delims / "@" )
               ; non-zero-length segment without any colon ":"

 pchar         = unreserved / pct-encoded / sub-delims / ":" / "@"

https://tools.ietf.org/html/rfc3986#section-2.1:

pct-encoded = "%" HEXDIG HEXDIG

https://tools.ietf.org/html/rfc3986#section-2.2:

  sub-delims  = "!" / "$" / "&" / "'" / "(" / ")"
              / "*" / "+" / "," / ";" / "="

https://tools.ietf.org/html/rfc3986#section-2.3:

unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
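To make the grammar above concrete, here is a rough sketch that checks a single path segment against `segment = *pchar` with a regular expression. The helper name `is_valid_segment` and the regex constants are hypothetical, written for illustration only; this is not how cheroot validates request targets.

```python
import re

# Character sets transcribed from the RFC 3986 rules quoted above.
UNRESERVED = r"A-Za-z0-9\-._~"
SUB_DELIMS = r"!$&'()*+,;="

# pchar = unreserved / pct-encoded / sub-delims / ":" / "@"
PCHAR = rf"(?:[{UNRESERVED}{SUB_DELIMS}:@]|%[0-9A-Fa-f]{{2}})"

# segment = *pchar (zero or more pchar)
SEGMENT_RE = re.compile(rf"{PCHAR}*")


def is_valid_segment(segment: str) -> bool:
    """Return True if the whole segment matches the *pchar rule."""
    return SEGMENT_RE.fullmatch(segment) is not None


print(is_valid_segment("foo bar"))    # False: a raw space is not a pchar
print(is_valid_segment("foo%20bar"))  # True: the space is pct-encoded
```

A raw space matches none of the alternatives in `pchar`, which is why `/anything/foo bar` is not a valid path on the wire.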

https://tools.ietf.org/html/rfc3986#section-2.4:

2.4. When to Encode or Decode

Under normal circumstances, the only time when octets within a URI
are percent-encoded is during the process of producing the URI from
its component parts. This is when an implementation determines which
of the reserved characters are to be used as subcomponent delimiters
and which can be safely used as data. Once produced, a URI is always
in its percent-encoded form.

When a URI is dereferenced, the components and subcomponents
significant to the scheme-specific dereferencing process (if any)
must be parsed and separated before the percent-encoded octets within
those components can be safely decoded, as otherwise the data may be
mistaken for component delimiters. The only exception is for
percent-encoded octets corresponding to characters in the unreserved
set, which can be decoded at any time. For example, the octet
corresponding to the tilde ("~") character is often encoded as "%7E"
by older URI processing implementations; the "%7E" can be replaced by
"~" without changing its interpretation.

Because the percent ("%") character serves as the indicator for
percent-encoded octets, it must be percent-encoded as "%25" for that
octet to be used as data within a URI. Implementations must not
percent-encode or decode the same string more than once, as decoding
an already decoded string might lead to misinterpreting a percent
data octet as the beginning of a percent-encoding, or vice versa in
the case of percent-encoding an already percent-encoded string.
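The warning about encoding or decoding more than once can be illustrated with a short sketch using Python's `urllib.parse`. The sample strings are made up for the demonstration.

```python
from urllib.parse import quote, unquote

# Encoding twice: the "%" produced by the first pass gets encoded again.
data = '100%'
once = quote(data)
twice = quote(once)
print(once)   # 100%25
print(twice)  # 100%2525

# Decoding twice: a literal "%41" sent as data turns into "A".
literal = '%41'
encoded = quote(literal)      # '%2541'
decoded = unquote(encoded)    # back to the original data
over_decoded = unquote(decoded)
print(decoded)       # %41
print(over_decoded)  # A
```

A single encode paired with a single decode round-trips cleanly; either extra pass changes the data.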


jaraco commented on June 2, 2024

I'm going to do some investigation in another app, but I'll plan to restore this test.

