stevenwinfo / haskell-soda


Haskell bindings for the Socrata Open Data API

License: MIT License

Haskell 100.00%

haskell soda

haskell-soda's Issues

Small things to do

There are quite a few small things that should be done before opening this up to the public.

Clean up dirtier code and make names more consistent and understandable.
Write a few basic covering tests for all the different areas.
Clean up the unnecessary tests.
Figure out how to generate and host the Haddock generated documentation.
Reorganize some of the code, files, and directory structure.
Make use of String, Text, ByteString, and/or Lazy ByteString more consistent throughout the code.
Go over all the other tasks and see if there's anything else that should be done before opening this up to the public.
Consider adding an examples directory with examples of code using the library.

Of course, there are the other issues with the "opening up to the public" milestone as well.

Use maps instead of lists of tuples for returned data

SodaFunc parameters are both not restrictive enough, and too restrictive at the same time.

Many of the SodaFunc constructors currently have a type like a -> a -> b, but the underlying SODA functions (+, for example) actually accept different types for the first two parameters. Those parameters can only be drawn from a subset of the SodaType types though, such as the numeric ones.

A related problem is that SodaFunc constructors like within_circle currently allow any SodaType for the first parameter, whereas they should really only accept geometric SodaTypes.

These two problems can be solved by creating a bunch of typeclasses that just specify different subsets of SodaType that need to be used. They don't need to have any methods.

There might be another, more terse way of going about this, but I think this is the simplest way that doesn't add any more boilerplate for the user.
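A minimal sketch of the marker-typeclass idea, using a cut-down GADT. The names SodaNumeric, SodaGeo, Expr, Add, and WithinCircle, as well as the simplified value types, are assumptions made for illustration and are not the library's actual definitions.

```haskell
{-# LANGUAGE GADTs #-}

-- Simplified stand-ins for the library's SODA value types (assumed shapes).
newtype SodaNum = SodaNum Double
newtype Money = Money Double
data Point = Point Double Double

-- Marker classes: no methods, each one only names a subset of SODA types.
class SodaNumeric a
instance SodaNumeric SodaNum
instance SodaNumeric Money

class SodaGeo a
instance SodaGeo Point

-- Cut-down expression type whose constructors carry the constraints.
data Expr a where
  Val :: a -> Expr a
  -- Operands may have different types, but both must be numeric.
  Add :: (SodaNumeric a, SodaNumeric b) => Expr a -> Expr b -> Expr b
  -- The first argument must be geometric; the rest are latitude, longitude, radius.
  WithinCircle :: SodaGeo g
               => Expr g -> Expr SodaNum -> Expr SodaNum -> Expr SodaNum -> Expr Bool

-- Count the nodes in an expression (only here so there is something to run).
size :: Expr a -> Int
size (Val _) = 1
size (Add l r) = 1 + size l + size r
size (WithinCircle g lat lon rad) = 1 + size g + size lat + size lon + size rad

-- Accepted: SodaNum and Money are both numeric.
mixedSum :: Expr Money
mixedSum = Add (Val (SodaNum 1)) (Val (Money 2.5))

-- Rejected at compile time, because Point has no SodaNumeric instance:
-- bad = Add (Val (Point 0 0)) (Val (SodaNum 1))
```

The user writes nothing extra here: the constraints are discharged by instances that ship with the library, so the restriction costs no additional boilerplate at the call site.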

Create a better complex example

The complex example is pretty contrived, doesn't display the things I wanted it to show very well, and isn't clear about what it's doing. The second dataset wasn't a great choice either, because it isn't clear what data it contains. It also isn't closely related to the first dataset, so it isn't clear how to compare them.

Make paging data simpler with helper functions or something.
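One possible shape for such a helper, sketched under the assumption that a page is requested with SODA's $limit and $offset parameters. FetchPage, fetchAll, and demoFetch are hypothetical names; demoFetch stands in for a real request against a dataset.

```haskell
-- Hypothetical page fetcher: given a limit and an offset, return one page of rows.
type FetchPage m a = Int -> Int -> m [a]

-- Keep requesting pages until one comes back shorter than the page size.
-- Assumes pageSize is positive; a non-positive size would never terminate.
fetchAll :: Monad m => Int -> FetchPage m a -> m [a]
fetchAll pageSize fetch = go 0
  where
    go offset = do
      rows <- fetch pageSize offset
      if length rows < pageSize
        then pure rows
        else (rows ++) <$> go (offset + pageSize)

-- Stand-in for a real query: a "dataset" of 25 rows served from memory.
demoFetch :: FetchPage IO Int
demoFetch limit offset = pure (take limit (drop offset [1 .. 25]))
```

With a page size of 10, fetchAll 10 demoFetch makes three requests (offsets 0, 10, and 20) and stops when the third page comes back with only five rows.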

Improve the README

There's a list of things to add or improve about the current README.

Improve the return types of some of the binary operators

Some of the binary operators can have different types on the left and right of the operator. This makes it difficult to decide what the resulting type of the operator should be. Right now it just arbitrarily picks the one on the right.

I think that maybe the constructors should be exported with type a -> b -> c, so that users can add a type annotation to say explicitly which type they want, and the infix operator would then have type a -> b -> b. Possibly also give a hint in the operator that the result type is the one on the right, like changing $+ to $+> or something.
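A rough sketch of that split, with a toy Expr type standing in for the real SodaFunc GADT. The constructor and helper names are assumptions; only the $+> spelling comes from the proposal above.

```haskell
{-# LANGUAGE GADTs #-}

data Expr a where
  Lit :: a -> Expr a
  -- Fully polymorphic constructor: the caller picks the result type c.
  Add :: Expr a -> Expr b -> Expr c

-- Infix form: the arrow in the name hints that the result takes the right-hand type.
infixl 6 $+>
($+>) :: Expr a -> Expr b -> Expr b
($+>) = Add

-- Count literals (only here so there is something to run).
countLits :: Expr a -> Int
countLits (Lit _) = 1
countLits (Add l r) = countLits l + countLits r

leftInt :: Expr Int
leftInt = Lit 1

rightDouble :: Expr Double
rightDouble = Lit 2.5

-- Constructor with an explicit annotation: the user asks for the left-hand type.
explicit :: Expr Int
explicit = Add leftInt rightDouble

-- Operator: the result type is fixed to the right-hand type.
hinted :: Expr Double
hinted = leftInt $+> rightDouble
```

The trade-off is that the bare constructor will accept any annotation at all, so any check that the requested result type is one SODA would actually produce has to come from elsewhere.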

Create functionality to specify response with record type

I mentioned this when closing #6, but I think it should be possible to specify the $select parameters with a user-created record type, and we could provide a function that fills out that record type and returns it. This would make some things a little more convenient and consistent than the current method.

Users can do this themselves by parsing the string response, however, it would be nice to incorporate it through the whole process.

Write more tests

There are a handful of tests currently, but they are not very extensive and don't provide much coverage. Write more tests so that, when something is changed, you can be more confident that it still works.

Upload the package to Hackage

Once this becomes a little more stable, tested, and many of the problems are hammered out, we can make the package available the way that most Haskell packages are provided, through Hackage. I'm not sure when this will be though. After we do this we could consider adding it to Stackage as well.

Create constants for the metadata fields

The metadata/system fields are always the same for all datasets, so I might as well make named constants that people can add to the query to get that information.

(As a side note, I should see what other kinds of metadata that I can get from datasets and that come with all queries).
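As a sketch of what those constants could look like: the field names :id, :created_at, and :updated_at are Socrata's system fields, while the Column type, the placeholder type tags, and the constant names are assumptions made for illustration.

```haskell
-- Hypothetical column type: a name plus a phantom parameter for the SODA type.
newtype Column a = Column String
  deriving (Show, Eq)

-- Placeholder type tags; the real library would use its own SODA types here.
data RowId
data Timestamp

-- System fields that are present on every dataset.
sysId :: Column RowId
sysId = Column ":id"

sysCreatedAt :: Column Timestamp
sysCreatedAt = Column ":created_at"

sysUpdatedAt :: Column Timestamp
sysUpdatedAt = Column ":updated_at"

-- The name as it would appear in a $select clause.
columnName :: Column a -> String
columnName (Column name) = name
```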

Account for the fact that all datatypes have a null value

Currently, only Checkbox-typed columns are considered able to have null values, but a column of any type can actually have null values. This means our model is a bit off. Functionally, it also means we can't currently filter a text field on whether it is (or isn't) null. We need to change all of the types to add that value, possibly by making them Maybe types or by making them sum types with another value. I suppose both would be similar, although the latter might have one less constructor depending on the type.
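A small sketch of the Maybe-based option, assuming a hypothetical Column type and a filter that renders to SoQL's IS NULL and IS NOT NULL. None of these names are taken from the library.

```haskell
-- Hypothetical column type with a phantom parameter for the SODA type.
newtype Column a = Column String

-- Any column may be null, so every field of a response row is wrapped in Maybe.
data Row = Row
  { rowName  :: Maybe String
  , rowCount :: Maybe Int
  } deriving (Show, Eq)

-- Null filters are available for columns of any type.
data NullCheck = IsNull | IsNotNull

renderNullCheck :: Column a -> NullCheck -> String
renderNullCheck (Column name) IsNull    = name ++ " IS NULL"
renderNullCheck (Column name) IsNotNull = name ++ " IS NOT NULL"

-- A text column, which the current model cannot filter on null.
nameCol :: Column String
nameCol = Column "name"
```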

Incorporate runtime error checking and handling

I already knew of a few things that needed to be checked at runtime, but with the addition of the aggregate stuff, the basics of this should really be implemented before being made public.

Determine what actually needs to be exported and how to do it

Right now, the library is probably exporting a lot more things than it needs to so that should be tightened up to only the things that need to be exported. I also know that a lot of other libraries have a particular file dedicated to specifying what is publicly exported, and other exporting schemes, so possibly consider something like that.

Separate and distinguish the aggregate SODA functions

The aggregate SODA functions have some unique things about them that are currently not reflected in these bindings. For example, you can't use them in $where clauses. This means I'll have to separate out those functions from the other SODA functions and use them slightly differently throughout the code. It will probably involve separating them out into another type, as well as probably making some typeclasses.

There are also some more involved things with aggregates, such as requiring columns to be in the $group clause, however, those will be more involved and I'll probably create another issue for those later.

Double isn't interpreted in the response

Because SODA returns SODA Doubles as JSON strings, the FromJSON parsing of those fields into Haskell's Doubles breaks and returns nothing. You also can't override an already declared instance for a specific type, which means we can't write a new FromJSON instance for Double that behaves like SodaNum's instance. This means that right now, the interpreted responses just don't contain Double values (it's been a while since I tested, so I can't remember if that's exactly what happens).

I actually haven't seen too many datasets with the Double type, but this is still a pretty big issue. The only solution that I can think of right now would be to change the Double type into a newtype like SodaNum and Money. Having to deal with the newtype for those types is already pretty annoying though, and it seems like you should be able to use a basic datatype in an easier way. If there's no other way though, then I guess we'll have to do it to make responses work correctly.
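For reference, the newtype route might look roughly like the following with aeson. SodaDouble and decodeSodaDouble are hypothetical names, and the sketch assumes, as described above, that the value arrives as a JSON string.

```haskell
import Data.Aeson (FromJSON (..), decode, withText)
import qualified Data.ByteString.Lazy.Char8 as BL
import qualified Data.Text as T
import Text.Read (readMaybe)

-- Hypothetical wrapper, in the same spirit as SodaNum and Money.
newtype SodaDouble = SodaDouble Double
  deriving (Show, Eq)

-- Accept a JSON string such as "1.5" and read the number out of it.
instance FromJSON SodaDouble where
  parseJSON = withText "SodaDouble" $ \t ->
    case readMaybe (T.unpack t) of
      Just d  -> pure (SodaDouble d)
      Nothing -> fail ("not a number: " ++ T.unpack t)

-- Convenience wrapper so the instance can be tried on a plain String.
decodeSodaDouble :: String -> Maybe SodaDouble
decodeSodaDouble = decode . BL.pack
```

Because the instance belongs to the newtype rather than to Double, it doesn't clash with aeson's existing FromJSON Double instance. The cost is the wrapping and unwrapping that the issue already complains about.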

Correct the URL representation of some SODA function input parameters

Some of the literal SODA values inside of functions appear differently when displayed as parameters. Mostly, these are the geometric types in things like within_circle().

Limit the input types of certain SodaFuncs with typeclasses

Some SodaFuncs have more limited input types than the GADT currently expresses. Typeclasses and instances will need to be made to put additional constraints on the constructors. I think they'll be typeclasses without methods.

Make the type for the "Case" SodaFunc constructor less restrictive

I think that my definition of the Case constructor might be a little restrictive because I think the soda level function can return differently typed values. We could make an existential type just to hide those types, but that would be yet another thing to keep track of.

List of boilerplate to get rid of or simplify

I don't know if getting rid of some or any of the following is possible, but I'd like to try.

Having to put SodaVal on all of the SODA values.
Having to put type declarations on columns
- Have to specify type somehow, but a function that takes a Proxy a value or something like that could make it simpler.
Putting Just on all of the query record field values.
Having to put Expr on a lot of things
Putting SodaNum and Money constructors around the respective types.
- I could make money just be a type synonym for integers, which represent cents.
- Could making Double and SodaNum be the same at the Haskell level be alright? There's not really anything I can think of at the SODA level that would fail if given the other datatype.

Make it clearer what the return type of most binary operators will be.

Either make the parametric binary operator types simpler, have the operator hint which type it is using, or make it very clear in the documentation which it is. Ideally, a combination of all three.

Create a URL Parameter representation of subqueries

Subqueries actually have a different representation from regular queries, one that looks a little more like traditional SQL queries. Supporting it would require rewriting the representations of many of the types, so I just haven't gotten around to it. It should be pretty simple though.

Make a supertypeclass for SodaType

If people made their own instances of SodaType, the library could behave in unintended ways. SODA also only recognizes the already defined types, so from a domain perspective, adding external types wouldn't make sense. If we make a supertypeclass for SodaType and don't export it, we can export toUrlPart, but people can't make instances of the class.
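A sketch of the hidden-superclass trick in a single file. The class name Sealed and the instances shown are assumptions; the sealing itself comes from the export list, which is only described in comments here.

```haskell
-- In the real module the export list would leave Sealed out, for example:
--   module Soda.Types (SodaType (..), SodaNum (..)) where
-- Outside code then cannot name Sealed, so it cannot satisfy the superclass
-- constraint and cannot declare new SodaType instances.

-- Hidden marker class with no methods.
class Sealed a

-- Public class: toUrlPart is usable by everyone, instances are not writable.
class Sealed a => SodaType a where
  toUrlPart :: a -> String

-- Simplified stand-in for one of the library's value types.
newtype SodaNum = SodaNum Double

instance Sealed SodaNum
instance SodaType SodaNum where
  toUrlPart (SodaNum n) = show n

instance Sealed Bool
instance SodaType Bool where
  toUrlPart True  = "true"
  toUrlPart False = "false"
```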

Add ability to include API token.

Add functionality to include an Application Token

Currently, the bindings can only make requests without SODA authentication tokens which means that those requests will be limited. If an application using these bindings wanted to make a lot of queries, it would want to add the authentication token, so it will need to be able to add it.

I'm not sure if this is necessary for the public release, but it should definitely be done at some point.
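A minimal sketch of how an optional token could be threaded into a request, assuming the token is sent in Socrata's X-App-Token request header. The Header type and both function names are hypothetical, and the HTTP call itself is left out.

```haskell
-- Hypothetical header representation: a name and a value.
type Header = (String, String)

-- Build the header that carries the application token.
appTokenHeader :: String -> Header
appTokenHeader token = ("X-App-Token", token)

-- Headers for a request; the token is only added when one is supplied,
-- so unauthenticated requests keep working as they do today.
requestHeaders :: Maybe String -> [Header]
requestHeaders mToken =
  ("Accept", "application/json") : maybe [] (\t -> [appTokenHeader t]) mToken
```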

Create Basic Haddock Documentation

Most of the code doesn't have the Haddock documentation comments. Go over the code and add them.

Customize Req response exceptions

Right now, I'm pretty sure that req throws IO exceptions. I think I'd rather it return an Either type with a custom exception/error type or something cleaner. The Req documentation mentions that you're able to do that, but I need to look into how to do it.
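The general shape of the Either-returning wrapper can be sketched with base alone. Here IOException stands in for whatever exception type the HTTP layer actually throws, and SodaError and safeRequest are hypothetical names. This is not the Req-specific mechanism mentioned above, only the plain try-based pattern.

```haskell
import Control.Exception (IOException, try)

-- Hypothetical error type exposed by the bindings.
newtype SodaError = ConnectionError String
  deriving (Show, Eq)

-- Run an action and turn a thrown exception into a Left value.
-- IOException is a stand-in for the HTTP library's own exception type.
safeRequest :: IO a -> IO (Either SodaError a)
safeRequest action = do
  result <- try action
  pure $ case result of
    Left e  -> Left (ConnectionError (show (e :: IOException)))
    Right v -> Right v
```

Callers can then pattern match on the result instead of installing exception handlers around every query.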

Figure out how to handle the millisecond precision representation with Timestamp.

Create functionality for subquery parameters

Because the subquery parameter represents many parts of a query differently than a basic query does, it would require rewriting how a query is represented in a lot of places. Because I've been working on other parts and wanted to get the basic functionality working quickly, I've skipped over this part.

It shouldn't be too difficult, but it will probably require a lot of busywork of going through all of the URL parameter representations and making new representations for those that differ.

Interpret responses as Haskell values

Right now, a query just returns a big string as its response. This means that unless the user parses/deserializes that string on their own, these API bindings aren't very useful.

I'm not very familiar with the deserializing libraries like aeson so I'll have to read up before implementing.

I'm not sure if this can, or even should, be implemented without the user specifying the types of what is returned, so I might have to mention that you have to specify the types when querying. I would also like to see if I can have the definition of column variables somehow interact with the specification of response types.
