http-types's Introduction

Generic HTTP types for Haskell (for both client and server code).

This library also contains some utility functions, e.g. related to URI
handling, that are not necessarily restricted in use to HTTP, but the scope is
restricted to things that are useful inside HTTP, i.e. no FTP URI parsing.

http-types's People

Contributors

aristidb, aslatter, chessai, chris-martin, fmaste, hvr, ianbamforth, ianbollinger, jkarni, julianbirch, kazu-yamamoto, lexi-lambda, mbbx6spp, mschristiansen, nsvedberg, pmlodawski, qrilka, simonmichael, singpolyma, snoyberg, sol, steve-chavez, tkvogt, ulikoehler, vincenthz, zohl

http-types's Issues

Typeable

It could be useful to derive Typeable on various types. HttpVersion comes to mind. a) do you agree, and b) would you accept a pull request with some changes (limited to sprinkling deriving Typeable around)?

Add support of URI Fragment

Hi @aristidb,

Problem statement

I found that the URI fragment is missing from the package.

I am talking about the secondary part of a URI (after the hash sign, #). Although it is not used in HTTP REST APIs, it is quite useful in HTTP (e.g. as an HTML anchor). I would like to have it as part of http-types so that it does not have to be introduced separately in several other packages that depend on this package (e.g. yesod-core).

Motivation

Consider GitHub, e.g. the following URL: https://github.com/aristidb/http-types/blob/master/Network/HTTP/Types/URI.hs#L340-L347

While the content of the page with and without the anchor remains the same, the page markup is different.

Imagine a situation where I want to introduce server-side rendering for an HTML page with similar (highlighting) functionality based on the fragment.

Would you accept a PR adding this missing part?

Thanks,
Andrey

Either a bug or an error in documentation for encodePathSegments

Here https://github.com/aristidb/http-types/blob/master/Network/HTTP/Types/URI.hs#L253 the documentation states "Performs percent encoding on all unreserved characters, as well as @\:\@\=\+\$@", while under the hood it calls urlEncodeBuilder False, which does not escape the characters :@&=+$,. Also, the docs call the characters :@&=+$, unreserved, but they are actually reserved: https://developers.google.com/maps/url-encoding. I have also sent a related PR: fizruk/http-api-data#119

`status308` is not exported from Network.HTTP.Types

It is defined and exported in Network.HTTP.Types.Status but is not exported from Network.HTTP.Types like most other status codes. This is also true for statuses 422, 428, 429, 431 and 511.

Is there a reason that Types.hs does not simply export "module Network.HTTP.Types.Status"?

is renderByteRanges doing the right thing?

In this function:

renderByteRangesBuilder :: ByteRanges -> Blaze.Builder

The ByteRanges are rendered with an = sign. But according to section 14.16 of RFC 2616:

https://www.ietf.org/rfc/rfc2616.txt

When rendering a byte range in a Content-Range header, there is no '=':

Content-Range = "Content-Range" ":" content-range-spec

       content-range-spec      = byte-content-range-spec
       byte-content-range-spec = bytes-unit SP
                                 byte-range-resp-spec "/"
                                 ( instance-length | "*" )

       byte-range-resp-spec = (first-byte-pos "-" last-byte-pos)
                                      | "*"
       instance-length           = 1*DIGIT

There is, of course, an = in the request. This seems like a senseless inconsistency in the RFC itself.

Perhaps renderByteRanges is only intended to render the byte ranges in requests? And there should be a different function for rendering byte ranges in a response? Or am I misunderstanding something?
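
To make the request/response distinction concrete, here is a minimal sketch of two separate renderers. It is not the http-types implementation: the ByteRange type below is a local stand-in, it uses String rather than Builder, and renderRequestRanges and renderContentRange are hypothetical names. It assumes all positions are non-negative.

```haskell
import Data.List (intercalate)

-- Local stand-in for the library's ByteRange type (assumption).
data ByteRange
  = ByteRangeFrom Integer
  | ByteRangeFromTo Integer Integer
  | ByteRangeSuffix Integer

-- Render a single range without any unit prefix.
renderRange :: ByteRange -> String
renderRange (ByteRangeFrom a)     = show a ++ "-"
renderRange (ByteRangeFromTo a b) = show a ++ "-" ++ show b
renderRange (ByteRangeSuffix n)   = "-" ++ show n

-- Request side: value of a Range header, with "bytes=".
renderRequestRanges :: [ByteRange] -> String
renderRequestRanges rs = "bytes=" ++ intercalate "," (map renderRange rs)

-- Response side: value of a Content-Range header, with "bytes " and no '='.
-- A total length of Nothing is rendered as "*".
renderContentRange :: Integer -> Integer -> Maybe Integer -> String
renderContentRange first lastPos total =
  "bytes " ++ show first ++ "-" ++ show lastPos ++ "/" ++ maybe "*" show total

main :: IO ()
main = do
  putStrLn (renderRequestRanges [ByteRangeFromTo 0 499, ByteRangeFrom 1000])
  putStrLn (renderContentRange 0 499 (Just 1234))
```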

HTTP method types

Instead of having a (fairly useless) type synonym for HTTP method, and then a (fairly restrictive) enumeration of so-called "standard methods", why not have:

data Method = GET | POST | PROPFIND | ... | OtherMethod ByteString
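
A sketch of how such a type could round-trip through its wire form. This is only an illustration of the proposal: it uses String instead of ByteString to stay within base, lists only a handful of methods, and parseMethod and renderMethod are hypothetical helper names.

```haskell
-- Hypothetical method type following the proposal above (not the library's API).
data Method
  = GET
  | POST
  | PUT
  | DELETE
  | PROPFIND
  | OtherMethod String
  deriving (Eq, Show)

-- Known names map to their constructor; anything else is kept verbatim.
parseMethod :: String -> Method
parseMethod s = case s of
  "GET"      -> GET
  "POST"     -> POST
  "PUT"      -> PUT
  "DELETE"   -> DELETE
  "PROPFIND" -> PROPFIND
  other      -> OtherMethod other

-- The nullary constructors are named exactly like their wire form,
-- so show is enough for them.
renderMethod :: Method -> String
renderMethod (OtherMethod s) = s
renderMethod m               = show m

main :: IO ()
main = mapM_ (putStrLn . renderMethod . parseMethod) ["GET", "PROPFIND", "PURGE"]
```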

renderQuery should not insert a ? on empty queries

After all, this is the whole reason why I added the useQuestionMark parameter in the first place. Otherwise, just adding a ? to the string if necessary would have been easier.

This is how my old renderQuery behaved, but Michael Snoyman's updated version changed the behavior to always include the ?.

This is a problem because I use the result of renderQuery in a message that has to be signed cryptographically for AWS.
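
The requested behavior can be sketched as follows. This is a simplified model, not the library's renderQuery: it works on String, performs no percent-encoding, and the names SimpleQuery and renderQuerySketch are made up for the illustration.

```haskell
import Data.List (intercalate)

-- Simplified stand-in for the Query type (assumption).
type SimpleQuery = [(String, Maybe String)]

-- Emit the leading '?' only when requested AND the query is non-empty.
renderQuerySketch :: Bool -> SimpleQuery -> String
renderQuerySketch _ [] = ""
renderQuerySketch useQuestionMark items =
  (if useQuestionMark then "?" else "") ++ intercalate "&" (map renderItem items)
  where
    renderItem (k, Nothing) = k
    renderItem (k, Just v)  = k ++ "=" ++ v

main :: IO ()
main = do
  print (renderQuerySketch True [])
  print (renderQuerySketch True [("a", Just "1"), ("b", Nothing)])
```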

Go back to uppercase URI encoding

The URI spec says:

For consistency, URI producers and
normalizers should use uppercase hexadecimal digits for all percent-
encodings.

Given how widely used http-types is in the Haskell ecosystem, I think it would be wise to follow the spec's recommendation on this.

Additionally, please note that we may keep http-types constrained to 0.10 on Stackage, due to 0.11's adverse effect on amazonka request signing (see brendanhay/amazonka#440) and other such packages.

See also: commercialhaskell/stackage#3226
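
For reference, a small sketch of percent-encoding with uppercase hexadecimal digits, as the spec recommends. It is an illustration only: percentEncode is a hypothetical name, it treats only the RFC 3986 unreserved characters as safe, and it assumes every input character is a single byte (code point below 256).

```haskell
import Data.Char (intToDigit, isAlphaNum, isAscii, ord, toUpper)

-- Percent-encode everything except RFC 3986 unreserved characters,
-- using uppercase hex digits. Assumes each Char is a byte (< 256).
percentEncode :: String -> String
percentEncode = concatMap encodeChar
  where
    encodeChar c
      | isAscii c && (isAlphaNum c || c `elem` "-_.~") = [c]
      | otherwise = ['%', hex (ord c `div` 16), hex (ord c `mod` 16)]
    -- intToDigit yields lowercase a-f, so uppercase it explicitly.
    hex = toUpper . intToDigit

main :: IO ()
main = putStrLn (percentEncode "k=v/~ x")
```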

surprising behavior of renderByteRange

I had expected (hRange, renderByteRange (ByteRangeFrom 1000)) to be an HTTP request Range header, but in fact renderByteRange does not include the "bytes=" part, so that generates an invalid "Range: 1000-" header.

renderByteRanges does include the "bytes=" part, so the workaround is to use
(hRange, renderByteRanges [ByteRangeFrom 1000])

The only way to learn of either behavior is to read the source code. This at least needs to be documented better.

It might be useful to parameterize the rendering functions with the role the range is being rendered for, e.g. a Request or a Response. See also #64.

include status code in the long status names

3 characters is a cheap price for easier comprehension when you are already typing long names like statusForbidden

I suggest changing to statusForbidden403 or status403Forbidden

provide pattern synonyms

I think it's a good idea to provide pattern synonyms for Header/Method/Status... alongside the constants, so people can match on them with ease.
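
A minimal sketch of what this could look like for Status, using GHC's PatternSynonyms extension. The Status type here is a local stand-in with a String message, and the synonym names Ok200 and NotFound404 are hypothetical. The patterns match on the code only and ignore the message.

```haskell
{-# LANGUAGE PatternSynonyms #-}

-- Local stand-in for the library's Status type (assumption).
data Status = Status { statusCode :: Int, statusMessage :: String }
  deriving (Eq, Show)

-- Explicitly bidirectional synonyms: matching looks at the code only,
-- construction fills in a default message.
pattern Ok200 :: Status
pattern Ok200 <- Status 200 _ where
  Ok200 = Status 200 "OK"

pattern NotFound404 :: Status
pattern NotFound404 <- Status 404 _ where
  NotFound404 = Status 404 "Not Found"

describe :: Status -> String
describe Ok200        = "success"
describe NotFound404  = "missing"
describe (Status c _) = "other: " ++ show c

main :: IO ()
main = mapM_ (putStrLn . describe) [Ok200, Status 404 "Gone fishing", Status 500 ""]
```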

Enum instance for Status

Would this be reasonable? toEnum would just be a giant case statement for all the status codes supported, with a fallthrough that used a null ByteString for the message.

Thoughts?
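
To show the shape of the idea, here is a sketch with only three known codes and the empty-message fallthrough. Status is a local stand-in with a String message, not the library type. Note that with this instance the default succ and pred go through the integer code, so succ of 200 would be a 201 with an empty message.

```haskell
-- Local stand-in for the library's Status type (assumption).
data Status = Status { statusCode :: Int, statusMessage :: String }
  deriving (Show)

instance Enum Status where
  fromEnum = statusCode
  -- The real instance would list every supported code here.
  toEnum 200 = Status 200 "OK"
  toEnum 404 = Status 404 "Not Found"
  toEnum 500 = Status 500 "Internal Server Error"
  -- Fallthrough: unknown codes get an empty message.
  toEnum c   = Status c ""

main :: IO ()
main = mapM_ (print . (toEnum :: Int -> Status)) [200, 404, 299]
```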

What are the goals of URL handling in http-types?

Is the goal to provide some URL handling utilities (like extract the path from a bytestring, which http-types already provides) or to eventually provide comprehensive URL handling (e.g. extracting the scheme and fragment from a URL as well)?

Clarifying this would help users understand the difference between http-types and other libraries like uri-bytestring and network-uri.

Unnecessary url encoding of some query parameters characters

I think the set of special symbols not url encoded in query parameters can be expanded from

ALPHA / DIGIT / "-" / "_" / "." / "~"

with

":" / "@" / "/" / "!" / "$" / "'" / "(" / ")" / "*" / ","

i.e. ":", "@", "/" (the extra pchar and query characters) and the sub-delims, leaving out "?" (the query component delimiter) and "&;=+" (the form URL encoding subcomponent delimiters).

This is based on the query ABNF in RFC 3986 Appendix A. The relevant rules are:

unreserved    = ALPHA / DIGIT / "-" / "." / "_" / "~"
sub-delims    = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "="
pct-encoded   = "%" HEXDIG HEXDIG

pchar         = unreserved / pct-encoded / sub-delims / ":" / "@"
query         = *( pchar / "/" / "?" ) 

and on RFC 3986 Section 2.4, "When to Encode or Decode", quoted in part below:

Under normal circumstances, the only time when octets within a URI
are percent-encoded is during the process of producing the URI from
its component parts. This is when an implementation determines which
of the reserved characters are to be used as subcomponent delimiters
and which can be safely used as data. Once produced, a URI is always
in its percent-encoded form.

But do check my working and I can send over a PR if you agree.
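
The proposed character set can be written down as a predicate. This is a sketch of the proposal only, not of what http-types currently does, and isUnreserved, extraQuerySafe and isQuerySafe are hypothetical names.

```haskell
import Data.Char (isAlphaNum, isAscii)

-- RFC 3986 unreserved characters.
isUnreserved :: Char -> Bool
isUnreserved c = isAscii c && (isAlphaNum c || c `elem` "-._~")

-- The extra characters the issue proposes to leave unencoded in queries.
extraQuerySafe :: String
extraQuerySafe = ":@/!$'()*,"

-- True if the character could stay unencoded under the proposal.
isQuerySafe :: Char -> Bool
isQuerySafe c = isUnreserved c || c `elem` extraQuerySafe

main :: IO ()
main = print [(c, isQuerySafe c) | c <- "a@/&=+?; "]
```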

blaze-builder-0.4.0.0

Hi! The new blaze-builder was released. Is there a chance to make this library work with it?

What is the type of encodePathSegments?

OK, if you take a look at:

https://hackage.haskell.org/package/http-types-0.9.1/docs/Network-HTTP-Types-URI.html#v:encodePathSegments

The Haddock says that it returns Builder from Data.Binary.Builder, and indeed if I write code that interacts with encodePathSegments it seems true. While according to the source code you export the builder from Blaze.ByteString.Builder and older versions of Haddock also indicate that. I tried to blame on GitHub and I don't see when the change happened. Data.Binary.Builder and Blaze.ByteString.Builder do not seem to export each other's Builder type (in fact the packages do not depend on each other).

Any idea what is going on here?

Parsing query strings with keys of the same name

The browser behavior (and standard if I can dig up the forms spec) of HTML forms with array-like values is to include a query string with repeated keys.

This multi-select:

<select name="colors" multiple>
  <option value="orange">orange</option>
  <option value="purple">purple</option>
</select>

Will result in ?colors=orange&colors=purple if both options are selected. Instead of parsing this as a list, parseQuery ignores the second value. Furthermore, the query is represented as a lookup map of bytestrings, so has no notion of multiple values per key.

The reason I bring this issue up is this use case: basic HTML form with a multi-select and no JS and a haskell backend using http-types. The http-types Query type and parsers fail that use case.
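
For illustration, when a query is kept as an association list, all values for a repeated key can be collected with a simple filter. The sketch below is self-contained and does not use http-types: parsePairs is a deliberately naive splitter without percent-decoding, and both it and lookupAll are hypothetical names.

```haskell
-- Split a string on a delimiter character.
splitOn :: Char -> String -> [String]
splitOn d s = case break (== d) s of
  (chunk, [])       -> [chunk]
  (chunk, _ : rest) -> chunk : splitOn d rest

-- Naive "k=v&k=v" parser; keeps duplicates and their order.
parsePairs :: String -> [(String, String)]
parsePairs "" = []
parsePairs s =
  [ (k, drop 1 v) | item <- splitOn '&' s, let (k, v) = break (== '=') item ]

-- Collect every value stored under the given key.
lookupAll :: Eq k => k -> [(k, v)] -> [v]
lookupAll key items = [v | (k, v) <- items, k == key]

main :: IO ()
main = print (lookupAll "colors" (parsePairs "colors=orange&colors=purple"))
```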

New Release?

Any plans to push a new release? There have been a few headers added since 0.9.1 (notably "Set-Cookie" would be nice to have).

Add changelog

It would be nice to be able to see what has changed in a new release (e.g. the latest 0.9), by adding a changelog file (which would automatically be shown on hackage).

Avoid creating new `ByteString`s

While tackling #66, I found an interesting problem: http-types creates too many ByteStrings:

  • parseQueryText/parseQuery use urlDecode, which creates new ByteStrings.
  • HeaderItem uses CI ByteString, which in turn creates a case-folded ByteString.

These ByteStrings are not created by slicing, so they will be slow and cause memory fragmentation. I would suggest simply using Text instead.

parseByteRanges support for unsatisfied ranges

parseByteRanges cannot parse a response like ("Content-Range","bytes */10000")

This can occur when a server does not support a requested range. In particular, I was resuming the download of a file, but the whole file was already actually downloaded. The server responds with partialContent206 and parsing the Content-Range header is the only way to detect if the whole file was already downloaded.

This is called an "unsatisfied-range" in the HTTP RFC, which says it SHOULD be used for a 206 response (leaving open the possibility that some servers send one of the other range formats). None of the ByteRange constructors seem appropriate to parse that into, so it seems that this would need a new ByteRangeUnsatisfied constructor.
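
A sketch of a Content-Range value parser that accepts the unsatisfied form. It is independent of http-types: the ContentRange type, its constructors and parseContentRange are hypothetical, and it operates on String for brevity.

```haskell
import Data.Char (isDigit)
import Data.List (stripPrefix)

-- Hypothetical result type for a Content-Range header value.
data ContentRange
  = ContentRange Integer Integer (Maybe Integer) -- first, last, total ("*" = Nothing)
  | ContentRangeUnsatisfied Integer              -- "bytes */N"
  deriving (Eq, Show)

parseContentRange :: String -> Maybe ContentRange
parseContentRange s = do
  rest <- stripPrefix "bytes " s
  let (range, slashLen) = break (== '/') rest
  lenStr <- stripPrefix "/" slashLen
  case range of
    "*" -> ContentRangeUnsatisfied <$> number lenStr
    _ -> do
      let (a, dashB) = break (== '-') range
      b <- stripPrefix "-" dashB
      first <- number a
      lastPos <- number b
      total <- if lenStr == "*" then Just Nothing else Just <$> number lenStr
      Just (ContentRange first lastPos total)
  where
    -- Accept only non-empty runs of ASCII digits.
    number :: String -> Maybe Integer
    number str
      | not (null str) && all isDigit str = Just (read str)
      | otherwise = Nothing

main :: IO ()
main = mapM_ (print . parseContentRange) ["bytes */10000", "bytes 0-499/1234"]
```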

array 0.5.0.0

The upcoming GHC 7.8 uses array 0.5.0.0, so http-types cannot be compiled with GHC HEAD at the moment. Please relax the upper bound. A new release would be appreciated.

Add the Generic typeclass to `StdMethod`

Is there any reason why the StdMethod type does not have a Generic instance?
I need the Generic instance to derive another instance (Hashable) and was wondering if it would be possible to add it or if there's a specific reason why it's not present.

Rendering the http headers

I am not seeing an obvious way to render the various types into a ByteString that I can send through a socket.

How would I render the http version, status, and headers into a usable byte string?
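
One way to approach this is to assemble the status line and header lines by hand. The sketch below only shows the wire layout: HttpVersion, Status and Header are simplified local stand-ins built on String, and renderResponseHead is a hypothetical helper, not something http-types provides. With the real types the same layout would be built from ByteStrings.

```haskell
-- Simplified local stand-ins for the library types (assumptions).
data HttpVersion = HttpVersion Int Int
data Status = Status Int String
type Header = (String, String)

-- Status line, one "Name: value" line per header, then an empty line.
-- Every line ends in CRLF.
renderResponseHead :: HttpVersion -> Status -> [Header] -> String
renderResponseHead (HttpVersion major minor) (Status code msg) headers =
  concat
    [ "HTTP/", show major, ".", show minor, " ", show code, " ", msg, "\r\n"
    , concatMap (\(name, value) -> name ++ ": " ++ value ++ "\r\n") headers
    , "\r\n"
    ]

main :: IO ()
main = putStr (renderResponseHead (HttpVersion 1 1) (Status 200 "OK")
                 [("Content-Type", "text/plain"), ("Content-Length", "0")])
```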

encodePathSegments incorrectly prefixes path with /

The encodePathSegments function incorrectly prefixes output with a /. This makes it nearly useless for constructing relative URI references, and this behavior also directly contradicts the documentation.

Test program:

{-# LANGUAGE OverloadedStrings #-}
import Blaze.ByteString.Builder
import Network.HTTP.Types
import Data.Text
main = print . toByteString $ encodePathSegments ["foo", "bar", "baz"]

Expected output (taken directly from the haddock documentation):

"foo/bar/baz"

Actual output:

"/foo/bar/baz"

Submitting URL encoded array of structured arguments.

This might be slightly related to #62.

Let's say that I have these data types:

data Something = Something
    { field01 :: Text
    , field02 :: Text
    }

data StepInstruction = StepInstruction
    { id :: Int
    , order :: Int
    , description :: Text
    }

data Recipe = Recipe
    { id :: Int
    , name :: Text
    , something :: Something
    , instructions :: [StepInstruction]
    }

In PHP I could easily submit a form with such a data type and get it back in the form of a nested array. The convention for submitting a Recipe is as follows:

?id=0
&name=Recipe01

&instructions[0][id]=0
&instructions[0][order]=1
&instructions[0][description]=somedescription

&instructions[1][id]=1
&instructions[1][order]=2
&instructions[1][description]=someotherdescription

Currently Query is flat, and I'd have to do additional parsing on the argument names to get an array of structured arguments.

I'm not sure if this is a standard, but I've also seen the same convention in C#/MVC, so I think it would be a good idea to support such a feature. I might want to add repeating "sub forms" to my main form that should be contained in an array, and currently there is no good way to do that from what I can see.

Here is also an example of a form that might submit such query:

<form>
    <input type="hidden" name="id">
    <input type="text" name="name">

    <input type="text" name="something[field01]">
    <input type="text" name="something[field02]">

    <div>
        <input type="hidden" name="instructions[0][id]">
        <input type="number" name="instructions[0][order]">
        <input type="text" name="instructions[0][description]">
    </div>

    <div>
        <input type="hidden" name="instructions[1][id]">
        <input type="number" name="instructions[1][order]">
        <input type="text" name="instructions[1][description]">
    </div>

    <div>
        <input type="hidden" name="instructions[2][id]">
        <input type="number" name="instructions[2][order]">
        <input type="text" name="instructions[2][description]">
    </div>
</form>

Maybe QueryItem should look something like this:

data QueryItem
    = PlainValue ByteString
    | QueryItemList [QueryItem]
    | QueryItem (ByteString, Maybe QueryItem)
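
The "additional parsing on the argument names" mentioned above could start from something like the following sketch, which splits a bracketed key into its path segments. It is a standalone illustration on String, and parseKeyPath is a hypothetical name. Anything after an unbalanced bracket is dropped.

```haskell
-- Split a PHP-style key such as "instructions[0][id]" into
-- ["instructions", "0", "id"].
parseKeyPath :: String -> [String]
parseKeyPath key =
  let (root, rest) = break (== '[') key
  in root : brackets rest
  where
    brackets ('[' : xs) =
      case break (== ']') xs of
        (seg, ']' : more) -> seg : brackets more
        _                 -> [] -- unbalanced bracket: ignore the remainder
    brackets _ = []

main :: IO ()
main = mapM_ (print . parseKeyPath)
  ["name", "something[field01]", "instructions[0][id]"]
```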

Documentation of urlEncode is misleading

Version 0.12.3

The documentation of urlEncode for the boolean parameter is:

Whether to decode '+' to ' '

This doesn't appear to be the whole story though. It also controls other characters like @

ghci> import Prelude
ghci> URI.urlEncode False "test@example.com"
"test@example.com"
ghci> URI.urlEncode True "test@example.com"
"test%40example.com"

This documentation appears to be accurate for urlDecode (replacePlus is only used in one place).

-- | Percent-decoding.
urlDecode :: Bool -- ^ Whether to decode @\'+\'@ to @\' \'@
          -> B.ByteString -> B.ByteString
urlDecode replacePlus z = fst $ B.unfoldrN (B.length z) go z
  where
    go bs =
        case B.uncons bs of
            Nothing -> Nothing
            Just (43, ws) | replacePlus -> Just (32, ws) -- plus to space
            Just (37, ws) -> Just $ fromMaybe (37, ws) $ do -- percent
                (x, xs) <- B.uncons ws
                x' <- hexVal x
                (y, ys) <- B.uncons xs
                y' <- hexVal y
                Just (combine x' y', ys)
            Just (w, ws) -> Just (w, ws)
    hexVal w
        | 48 <= w && w <= 57  = Just $ w - 48 -- 0 - 9
        | 65 <= w && w <= 70  = Just $ w - 55 -- A - F
        | 97 <= w && w <= 102 = Just $ w - 87 -- a - f
        | otherwise = Nothing
    combine :: Word8 -> Word8 -> Word8
    combine a b = shiftL a 4 .|. b

But in URL encoding, the true/false flag isn't restricted to just spaces:

unreservedQS, unreservedPI :: [Word8]
unreservedQS = map ord8 "-_.~"
unreservedPI = map ord8 "-_.~:@&=+$,"

-- | Percent-encoding for URLs.
urlEncodeBuilder' :: [Word8] -> B.ByteString -> B.Builder
urlEncodeBuilder' extraUnreserved = mconcat . map encodeChar . B.unpack
    where
      encodeChar ch | unreserved ch = B.word8 ch
                    | otherwise     = h2 ch

      unreserved ch | ch >= 65 && ch <= 90  = True -- A-Z
                    | ch >= 97 && ch <= 122 = True -- a-z
                    | ch >= 48 && ch <= 57  = True -- 0-9
      unreserved c = c `elem` extraUnreserved

      -- must be upper-case
      h2 v = B.word8 37 `mappend` B.word8 (h a) `mappend` B.word8 (h b) -- 37 = %
          where (a, b) = v `divMod` 16
      h i | i < 10    = 48 + i -- zero (0)
          | otherwise = 65 + i - 10 -- 65: A

-- | Percent-encoding for URLs (using 'B.Builder').
urlEncodeBuilder
    :: Bool -- ^ Whether input is in query string. True: Query string, False: Path element
    -> B.ByteString
    -> B.Builder
urlEncodeBuilder True  = urlEncodeBuilder' unreservedQS
urlEncodeBuilder False = urlEncodeBuilder' unreservedPI

-- | Percent-encoding for URLs.
urlEncode :: Bool -- ^ Whether to decode @\'+\'@ to @\' \'@
          -> B.ByteString -- ^ The ByteString to encode as URL
          -> B.ByteString -- ^ The encoded URL
urlEncode q = BL.toStrict . B.toLazyByteString . urlEncodeBuilder q

I'm not super familiar with this domain. Should URL decoding have an option to work with those other characters? Should just the documentation for URL encoding change?
