
haskell-soda's People

Contributors

stevenwinfo

haskell-soda's Issues

Small things to do

There are quite a few small things that should be done before opening this up to the public.

  • Clean up dirtier code and make names more consistent and understandable.
  • Write a few basic covering tests for all the different areas.
  • Clean up the unnecessary tests.
  • Figure out how to generate and host the Haddock generated documentation.
  • Reorganize some of the code, files, and directory structure.
  • Make use of String, Text, ByteString, and/or Lazy ByteString more consistent throughout the code.
  • Go over all the other tasks and see if there's anything else that should be done before opening this up to the public.
  • Consider adding an examples directory with examples of code using the library.

Of course, there are the other issues with the "opening up to the public" milestone as well.

SodaFunc parameters are both not restrictive enough, and too restrictive at the same time.

Many of the SodaFunc constructors currently have a type like a -> a -> b, but some of the underlying SODA functions, such as +, accept different types for their first two parameters. Those parameters should still be limited to a subset of the SodaType types, such as the numeric ones.

Another related problem is that SodaFunc constructors like within_circle currently allow any SodaType for the first parameter, when they should only accept geometric SodaTypes.

These two problems can be solved by creating a bunch of typeclasses that just specify different subsets of SodaType that need to be used. They don't need to have any methods.

There might be another, more terse way of going about this, but I think this is the simplest way that doesn't add any more boilerplate for the user.
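A rough sketch of the marker-typeclass idea, using only base. The types SodaNum, Money, and Point, the classes SodaNumeric and SodaGeo, and the Expr constructors are hypothetical stand-ins rather than the library's actual definitions, and the POINT rendering is an assumption.

```haskell
{-# LANGUAGE GADTs #-}

-- Hypothetical stand-ins for the library's SODA value types.
newtype SodaNum = SodaNum Double
newtype Money = Money Double
data Point = Point Double Double

class SodaType a where
  toUrlPart :: a -> String

instance SodaType SodaNum where
  toUrlPart (SodaNum n) = show n

instance SodaType Money where
  toUrlPart (Money m) = show m

instance SodaType Point where
  -- Assumed rendering of a point literal.
  toUrlPart (Point x y) = "'POINT (" ++ show x ++ " " ++ show y ++ ")'"

-- Method-free marker classes: each one just names a subset of SodaType.
class SodaType a => SodaNumeric a
instance SodaNumeric SodaNum
instance SodaNumeric Money

class SodaType a => SodaGeo a
instance SodaGeo Point

data Expr a where
  Val :: SodaType a => a -> Expr a
  -- The operands may differ in type, but both must be numeric.
  Add :: (SodaNumeric a, SodaNumeric b) => Expr a -> Expr b -> Expr b
  -- Only geometric values are accepted as the first argument.
  WithinCircle :: SodaGeo g => Expr g -> SodaNum -> SodaNum -> SodaNum -> Expr Bool

render :: Expr a -> String
render (Val v) = toUrlPart v
render (Add l r) = render l ++ " + " ++ render r
render (WithinCircle g lat lon radius) =
  "within_circle(" ++ render g ++ ", " ++ toUrlPart lat ++ ", "
    ++ toUrlPart lon ++ ", " ++ toUrlPart radius ++ ")"
```

With this shape, Add (Val (SodaNum 1.5)) (Val (Money 2.0)) type checks, while passing a Point to Add or a SodaNum as the first argument of WithinCircle is rejected at compile time.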

Create a better complex example

The complex example is pretty contrived, doesn't display the things I wanted it to show very well, and isn't clear about what it's doing. The second dataset also wasn't a great choice because it isn't clear what data it contains. It also isn't closely related to the first dataset, so it isn't clear how to compare the two.

Improve the README

Here is a list of things to add or improve in the current README.

  • Clean up what is currently there.
  • Give some examples of how to use it. Even small programs might be appropriate.
  • Make a basic introduction of how to use it.
  • Give some basic reference info like all of the SodaTypes and maybe SODA functions/operators.
  • Maybe explain some of the design decisions made.
  • Give some of the concessions that the library makes, how it could possibly be improved, and mention issues and pull requests.
  • Maybe installation and usage instructions?
  • Point toward other documentation like generated Haddock documentation.
  • Mention that it doesn't include any SODA publishing functionality and that it will hopefully be added in the future.
  • Other important things about using the library that people might need to know: Inconsistencies, tips, neat features, etc.

Improve the return types of some of the binary operators

Some of the binary operators can have different types on the left and right of the operator. This makes it difficult to decide what the resulting type of the operator should be. Right now it just arbitrarily picks the one on the right.

I think that maybe the constructors should be exported with type a -> b -> c, so that users can add a type annotation to say explicitly what type they want, while the infix operator has type a -> b -> b. The operator could also hint that it takes the type of the right operand, for example by changing $+ to $+> or something similar.
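A minimal sketch of that split, assuming a GADT-style Expr. The phantom tags, the literal constructors, and render are hypothetical and only there to make the example compile on its own.

```haskell
{-# LANGUAGE GADTs #-}

-- Phantom tags standing in for SODA types (hypothetical).
data SodaNum
data Money

data Expr a where
  NumLit   :: Double -> Expr SodaNum
  MoneyLit :: Double -> Expr Money
  -- The result type is left free so the caller can pin it with an annotation.
  Add      :: Expr a -> Expr b -> Expr c

-- The operator fixes the result to the right operand's type; the trailing
-- '>' is the hint that the right side wins.
infixl 6 $+>
($+>) :: Expr a -> Expr b -> Expr b
($+>) = Add

render :: Expr a -> String
render (NumLit n) = show n
render (MoneyLit m) = show m
render (Add l r) = render l ++ " + " ++ render r

-- Explicitly choosing the left operand's type through the constructor.
asNumber :: Expr SodaNum
asNumber = Add (NumLit 1) (MoneyLit 2.5)

-- The operator picks the right operand's type.
asMoney :: Expr Money
asMoney = NumLit 1 $+> MoneyLit 2.5
```

Both values render to the same query text; only the Haskell-level result type differs.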

Create functionality to specify response with record type

I mentioned this when closing #6, but I think it should be possible to specify the $select parameters with a user-created record type. We could then provide a function that fills out that record type from the response and returns it. This would make some things a little more convenient and consistent than the current method.

Users can do this themselves by parsing the string response. However, it would be nice to incorporate it through the whole process.

Write more tests

There are a handful of tests currently, but they are not very extensive and don't provide a lot of coverage. Write more tests so that when something is changed, you can be more confident that it still works.

Upload the package to Hackage

Once this becomes a little more stable, tested, and many of the problems are hammered out, we can make the package available the way that most Haskell packages are provided, through Hackage. I'm not sure when this will be though. After we do this we could consider adding it to Stackage as well.

Create constants for the metadata fields

The metadata/system fields are always the same for all datasets, so I might as well make named constants that people can add to the query to get that information.
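A small sketch of what such constants could look like. The Column and Timestamp types and the names sysId, sysCreatedAt, and sysUpdatedAt are hypothetical; the field names :id, :created_at, and :updated_at are SODA's system fields, and the Haskell types given to them here are assumptions.

```haskell
import Data.List (intercalate)

-- Hypothetical column type: a column name with a phantom value type.
newtype Column a = Column String

-- Placeholder for whatever timestamp type the library uses.
data Timestamp

columnName :: Column a -> String
columnName (Column n) = n

-- Named constants for the system fields shared by all datasets.
sysId :: Column String
sysId = Column ":id"

sysCreatedAt :: Column Timestamp
sysCreatedAt = Column ":created_at"

sysUpdatedAt :: Column Timestamp
sysUpdatedAt = Column ":updated_at"

-- Renders a $select parameter from a list of column names.
selectParam :: [String] -> String
selectParam names = "$select=" ++ intercalate ", " names
```

A query could then ask for selectParam [columnName sysId, columnName sysUpdatedAt, "name"] without spelling out the system field names by hand.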

(As a side note, I should see what other kinds of metadata I can get from datasets and what comes with all queries.)

Account for the fact that all datatypes have a null value

Currently, only Checkbox-typed columns are considered able to have null values, but a column of any type can actually have null values. This means our model is a bit off. Functionally, it also means we can't currently filter a text field on whether it is or isn't null. All of the types need to change to add that value, possibly by wrapping them in Maybe, or by making them sum types with an extra null constructor. I suppose both would be similar, although the latter might have one less constructor depending on the type.
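A sketch of the Maybe option, assuming a SodaType class with a toUrlPart method. SodaText, Column, Filter, and the helper names are hypothetical; IS NULL and IS NOT NULL are the SoQL comparisons being targeted.

```haskell
-- Hypothetical text type; embedded quotes are not escaped in this sketch.
newtype SodaText = SodaText String

class SodaType a where
  toUrlPart :: a -> String

instance SodaType SodaText where
  toUrlPart (SodaText s) = "'" ++ s ++ "'"

-- Any SODA type becomes nullable by wrapping it in Maybe.
instance SodaType a => SodaType (Maybe a) where
  toUrlPart Nothing = "NULL"
  toUrlPart (Just x) = toUrlPart x

-- Hypothetical column type with a phantom value type.
newtype Column a = Column String

data Filter
  = IsNull String
  | IsNotNull String

-- Null checks are only offered on columns whose type admits null.
isNull :: Column (Maybe a) -> Filter
isNull (Column n) = IsNull n

isNotNull :: Column (Maybe a) -> Filter
isNotNull (Column n) = IsNotNull n

renderFilter :: Filter -> String
renderFilter (IsNull n) = n ++ " IS NULL"
renderFilter (IsNotNull n) = n ++ " IS NOT NULL"
```

If every column can be null, every column would end up typed as Column (Maybe a), which is the cost of this approach compared with adding a null constructor to each type.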

Incorporate runtime error checking and handling

There were a few things that I knew needed to be checked at runtime, but with the addition of the aggregate stuff, the basics of this should really be implemented before being made public.

Determine what actually needs to be exported and how to do it

Right now, the library is probably exporting a lot more than it needs to, so the exports should be tightened up to only what is necessary. I also know that a lot of other libraries have a particular file dedicated to specifying what is publicly exported, as well as other exporting schemes, so possibly consider something like that.

Separate and distinguish the aggregate SODA functions

The aggregate SODA functions have some unique things about them that are currently not reflected in these bindings. For example, you can't use them in $where clauses. This means I'll have to separate out those functions from the other SODA functions and use them slightly differently throughout the code. It will probably involve separating them out into another type, as well as probably making some typeclasses.

There are also some more complicated requirements for aggregates, such as requiring columns to be in the $group clause. Those will take more work, so I'll probably create another issue for them later.
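One possible shape for that separation, as a sketch: aggregates live in their own type, a typeclass covers the places where both kinds are allowed, and the $where builder only accepts ordinary expressions. All names here (Expr, Agg, Selectable, selectParam, whereParam) are hypothetical.

```haskell
-- Ordinary expressions, usable anywhere.
data Expr
  = Col String
  | NumLit Double
  | Gt Expr Expr

-- Aggregates are kept in a separate type.
data Agg
  = Sum Expr
  | Avg Expr
  | Count Expr

renderExpr :: Expr -> String
renderExpr (Col n) = n
renderExpr (NumLit d) = show d
renderExpr (Gt l r) = renderExpr l ++ " > " ++ renderExpr r

renderAgg :: Agg -> String
renderAgg (Sum e) = "sum(" ++ renderExpr e ++ ")"
renderAgg (Avg e) = "avg(" ++ renderExpr e ++ ")"
renderAgg (Count e) = "count(" ++ renderExpr e ++ ")"

-- Both kinds may appear in $select.
class Selectable a where
  renderSelect :: a -> String

instance Selectable Expr where
  renderSelect = renderExpr

instance Selectable Agg where
  renderSelect = renderAgg

selectParam :: Selectable a => a -> String
selectParam x = "$select=" ++ renderSelect x

-- $where only accepts ordinary expressions, so an Agg is a type error here.
whereParam :: Expr -> String
whereParam e = "$where=" ++ renderExpr e
```

The $group requirement is not modeled here; it would still need a runtime check or a more involved type-level encoding.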

Double isn't interpreted in the response

Because SODA returns SODA Doubles as JSON strings, the FromJSON parsing of those fields into Haskell's Double breaks and returns nothing. You also can't override an already declared instance for a specific type, which means we can't write a new FromJSON instance for Double that behaves like SodaNum's instance. This means that right now, the interpreted responses just don't contain Double values (it's been a while since I tested, so I can't remember if that's exactly what happens).

I actually haven't seen too many datasets with the Double type, but this is still a pretty big issue. The only solution that I can think of right now would be to change the Double type into a newtype like SodaNum and Money. Having to deal with the newtype for those types is already pretty annoying though, and it seems like you should be able to use a basic datatype in an easier way. If there's no other way though, then I guess we'll have to do it to make responses work correctly.
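A base-only sketch of the newtype route. SodaDouble and parseSodaDouble are hypothetical names; the idea is that a FromJSON instance for the newtype would apply this step to the contents of the JSON string instead of expecting a JSON number.

```haskell
import Text.Read (readMaybe)

-- Hypothetical wrapper so the type can carry its own parsing instance.
newtype SodaDouble = SodaDouble Double
  deriving (Eq, Show)

-- Interprets the text of a string-encoded field such as "1.5".
-- Returns Nothing when the text is not a valid number.
parseSodaDouble :: String -> Maybe SodaDouble
parseSodaDouble = fmap SodaDouble . readMaybe
```

For example, parseSodaDouble "1.5" gives Just (SodaDouble 1.5), while non-numeric text gives Nothing rather than silently dropping the field.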

Make the type for the "Case" SodaFunc constructor less restrictive

I think that my definition of the Case constructor might be a little too restrictive, because I think the SODA-level function can return differently typed values. We could make an existential type just to hide those types, but that would be yet another thing to keep track of.
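A sketch of the existential-type option. AnyVal, renderCase, and the value types are hypothetical, and conditions are passed as already-rendered strings to keep the example short.

```haskell
{-# LANGUAGE ExistentialQuantification #-}

import Data.List (intercalate)

class SodaType a where
  toUrlPart :: a -> String

-- Hypothetical value types.
newtype SodaNum = SodaNum Double
newtype SodaText = SodaText String

instance SodaType SodaNum where
  toUrlPart (SodaNum n) = show n

instance SodaType SodaText where
  toUrlPart (SodaText s) = "'" ++ s ++ "'"

-- Hides the concrete type of a branch result; only SodaType is remembered.
data AnyVal = forall a. SodaType a => AnyVal a

renderAny :: AnyVal -> String
renderAny (AnyVal v) = toUrlPart v

-- Each branch pairs a pre-rendered condition with a result of any SODA type.
renderCase :: [(String, AnyVal)] -> String
renderCase branches =
  "case(" ++ intercalate ", " (concatMap branch branches) ++ ")"
  where
    branch (cond, val) = [cond, renderAny val]
```

The cost mentioned above shows up in use: every branch result has to be wrapped in AnyVal, and the result type of the whole case expression is no longer tracked.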

List of boilerplate to get rid of or simplify

I don't know if getting rid of some or any of the following is possible, but I'd like to try.

  • Having to put SodaVal on all of the SODA values.
  • Having to put type declarations on columns
    • Have to specify type somehow, but a function that takes a Proxy a value or something like that could make it simpler.
  • Putting Just on all of the query record field values.
  • Having to put Expr on a lot of things
  • Putting SodaNum and Money constructors around the respective types.
    • I could make money just be a type synonym for integers, which represent cents.
    • Could making Double and SodaNum be the same at the Haskell level be alright? There's not really anything I can think of at the SODA level that would fail if given the other datatype.
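Two of the ideas in the list above, sketched with base only: a Proxy-taking column constructor and money stored as integer cents. Column, column, and renderCents are hypothetical names, and the two-decimal rendering is an assumption about what the query text should look like.

```haskell
import Data.Proxy (Proxy (..))

-- Hypothetical column type with a phantom value type.
newtype Column a = Column String

columnName :: Column a -> String
columnName (Column n) = n

-- The Proxy argument carries the column's type, so no separate type
-- declaration is needed at the definition site.
column :: Proxy a -> String -> Column a
column _ = Column

-- Money as plain Integer cents, rendered with two decimal places.
renderCents :: Integer -> String
renderCents c = sign ++ show dollars ++ "." ++ pad (show cents)
  where
    sign = if c < 0 then "-" else ""
    (dollars, cents) = abs c `quotRem` 100
    pad s = replicate (2 - length s) '0' ++ s
```

A column could then be written inline as column (Proxy :: Proxy Double) "salary", and renderCents 1205 gives "12.05".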

Create a URL Parameter representation of subqueries

Subqueries actually have a different representation from regular queries, one that looks a little more like traditional SQL. Supporting them would require rewriting the representations of many of the types, so I just haven't gotten around to it. It should be pretty simple though.
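To illustrate the two representations side by side, here is a sketch with a hypothetical Query record whose clauses are already rendered as strings. The regular form produces one URL parameter per clause, and the subquery form produces a single SQL-like string for $query.

```haskell
import Data.List (intercalate)
import Data.Maybe (catMaybes)

-- Hypothetical, simplified query record.
data Query = Query
  { qSelect :: [String]
  , qWhere  :: Maybe String
  , qLimit  :: Maybe Int
  }

-- Regular form: one URL parameter per clause.
toUrlParams :: Query -> [(String, String)]
toUrlParams q = catMaybes
  [ if null (qSelect q)
      then Nothing
      else Just ("$select", intercalate ", " (qSelect q))
  , fmap (\w -> ("$where", w)) (qWhere q)
  , fmap (\n -> ("$limit", show n)) (qLimit q)
  ]

-- Subquery form: all clauses joined into one SQL-like string.
toQueryParam :: Query -> (String, String)
toQueryParam q = ("$query", unwords (catMaybes clauses))
  where
    columns = if null (qSelect q) then "*" else intercalate ", " (qSelect q)
    clauses =
      [ Just ("SELECT " ++ columns)
      , fmap ("WHERE " ++) (qWhere q)
      , fmap (("LIMIT " ++) . show) (qLimit q)
      ]
```

The busywork is that every type with a URL-parameter rendering would need a second rendering like this one wherever the two forms differ.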

Make a supertypeclass for SodaType

If people made their own instances of SodaType, the library could operate in unintended ways. SODA also only recognizes the already defined types, so from a domain perspective, adding external types wouldn't make sense. If we make a supertypeclass for SodaType and don't export it, we can export toUrlPart, but people can't make instances of the class.
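A sketch of this "sealed class" pattern. Sealed is a hypothetical name for the hidden superclass, and the module name in the comment is made up; the export list is shown as a comment so the snippet stays loadable on its own.

```haskell
-- In the real module, the export list would include the class and its method
-- but leave out Sealed, for example:
--
--   module Soda.Types (SodaType (toUrlPart), SodaNum (..)) where

-- Hidden superclass: because it is not exported, code outside the module
-- cannot write the Sealed instance that a SodaType instance requires.
class Sealed a

class Sealed a => SodaType a where
  toUrlPart :: a -> String

-- Hypothetical value type with both instances defined inside the module.
newtype SodaNum = SodaNum Double

instance Sealed SodaNum

instance SodaType SodaNum where
  toUrlPart (SodaNum n) = show n
```

Users can still call toUrlPart and write functions constrained by SodaType; they just can't add new instances.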

Add functionality to include an Application Token

Currently, the bindings can only make requests without SODA authentication tokens which means that those requests will be limited. If an application using these bindings wanted to make a lot of queries, it would want to add the authentication token, so it will need to be able to add it.
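A minimal sketch of one way to thread an optional token through, assuming requests are built from a list of URL parameters. Param and withAppToken are hypothetical; $$app_token is the query parameter SODA accepts for application tokens, with the X-App-Token header as the alternative.

```haskell
-- Hypothetical representation of a URL parameter.
type Param = (String, String)

-- Adds the application token when one is configured; otherwise the
-- parameters are left unchanged and the request stays unauthenticated.
withAppToken :: Maybe String -> [Param] -> [Param]
withAppToken Nothing params = params
withAppToken (Just token) params = ("$$app_token", token) : params
```

Keeping the token as a Maybe means existing unauthenticated usage would keep working unchanged.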

I'm not sure if this is necessary for the public release, but it should definitely be done at some point.

Customize Req response exceptions

Right now, I'm pretty sure that req throws IO exceptions. I think I'd rather return an Either type with a custom exception/error type, or something cleaner. The Req documentation mentions that you're able to do that, but I still need to look into how.
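As a base-only illustration of the Either idea, the sketch below wraps an IO action with try and converts a thrown exception into a custom error value. SodaError and safely are hypothetical names, and IOException is used only so the example runs without extra packages; with req, the exception type caught here would be req's own HttpException instead.

```haskell
import Control.Exception (IOException, try)

-- Hypothetical error type for the bindings.
data SodaError = RequestFailed String
  deriving (Eq, Show)

-- Runs a request action and reports failure as a value instead of letting
-- the exception propagate.
safely :: IO a -> IO (Either SodaError a)
safely action = do
  result <- try action
  pure $ case result of
    Left e -> Left (RequestFailed (show (e :: IOException)))
    Right x -> Right x
```

Callers can then pattern match on Left (RequestFailed msg) rather than installing their own exception handlers.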

Create functionality for subquery parameters

Because the subquery parameter represents many parts of a query differently than a basic query does, supporting it would require rewriting how a lot of those parts are rendered. I've been working on other parts and wanted to get the basic functionality working quickly, so I've skipped over this.

It shouldn't be too difficult, but it will probably require a lot of busywork of going through all of the URL parameter representations and making new representations for those that differ.

Interpret responses as Haskell values

Right now, a query just returns one big string as its response. This means that unless the user parses/deserializes that string on their own, these API bindings aren't very useful.

I'm not very familiar with the deserializing libraries like aeson so I'll have to read up before implementing.

I'm not sure if this can, or even should, be implemented without the user specifying the types of what is returned, so I might have to mention that you have to specify the types when querying. I would also like to see if I can have the definition of column variables somehow interact with the specification of response types.
