bdp2ckan's Issues

Having to use patternProperties in schema is messy

Because dimension and measure keys are not known in advance, the schema has to rely on patternProperties, which is messy. The mix of measures and dimensions makes it worse, because telling them apart requires a negative lookahead. It does not block anything, but it is still messy.
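To illustrate the negative lookahead mentioned above, here is a minimal sketch using Python's re module. The reserved key "amount" and the function name are hypothetical; the real schema's patterns may look different.

```python
import re

# Hypothetical reserved key; the real schema's reserved names may differ.
MEASURE_KEY = "amount"

# Matches any key except the reserved one, using a negative lookahead.
DIMENSION_PATTERN = re.compile(r"^(?!amount$).+$")


def classify_keys(mapping):
    """Split mapping keys into (measures, dimensions) by pattern."""
    measures = [key for key in mapping if key == MEASURE_KEY]
    dimensions = [key for key in mapping if DIMENSION_PATTERN.match(key)]
    return measures, dimensions
```

A key such as "amounts" still counts as a dimension here, because the lookahead only rejects the exact reserved name.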

Currencies not handled very well

The currency is a property of each measure. At the moment the script loops over all measures, extracts their currencies, and submits them as metadata at the dataset level. Instead, the script should work out which resources use each currency and attach the currency as metadata to the resource, not the dataset. That might not be straightforward either, because a single resource might have multiple currencies. We need to figure out whether this is something to worry about.
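A rough sketch of the per-resource grouping, assuming each measure is a dict carrying a "currency" and a "resource" key. Both key names and the function name are assumptions for illustration, not taken from the script.

```python
from collections import defaultdict


def currencies_by_resource(measures):
    """Group measure currencies by the resource each measure refers to.

    Assumes each measure is a dict with optional "currency" and
    "resource" keys (hypothetical names).
    """
    grouped = defaultdict(set)
    for measure in measures.values():
        currency = measure.get("currency")
        if currency is not None:
            grouped[measure.get("resource")].add(currency)
    # Sorted lists make the multi-currency case explicit and stable.
    return {resource: sorted(codes) for resource, codes in grouped.items()}
```

A resource that ends up with more than one currency in the result is exactly the case that still needs a decision.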

Schema always assumes source has an object value

The examples in the Budget Data Package specification make it clear that the source can be defined with a string, not only with an object.

The schema can only validate:

{
    "year": {
        "source": "source field name"
    }
}

but not

{
    "year": "source field name"
}
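One possible way for the script to accept both forms is to normalise them before further processing. This is a sketch only; the function name and the "amount" entry are made up for illustration.

```python
def source_field(value):
    """Return the source column name from a string or a {"source": ...} object."""
    if isinstance(value, str):
        return value
    if isinstance(value, dict) and isinstance(value.get("source"), str):
        return value["source"]
    raise ValueError(f"cannot determine source from {value!r}")


# Both shapes end up in the object form the schema already understands.
mapping = {"year": "source field name", "amount": {"source": "other field"}}
normalised = {key: {"source": source_field(val)} for key, val in mapping.items()}
```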

Budget Data Package schema relies on common pattern for source

The Budget Data Package schema assumes that the specification uses source in the mapping to say where the source column is, but that is only a common pattern in budget data packages. It is not clear what the official way would be (although it is very likely to be source).

License identifiers

Licenses in CKAN are defined by a license.json file, which may or may not be compatible with the licenses supported in Budget Data Package (the OD list). We need to figure out how to handle license identifiers properly.
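As a starting point, the script could check whether an identifier from the package exists in the CKAN license list at all. The sketch below assumes the list has already been loaded into a list of dicts, each with an "id" key. The case-insensitive comparison is a choice made here, not something either specification requires.

```python
def match_ckan_license(identifier, ckan_licenses):
    """Return the CKAN license id matching identifier, or None if absent.

    Assumes ckan_licenses is a list of dicts that each carry an "id" key.
    """
    wanted = identifier.strip().lower()
    for entry in ckan_licenses:
        if entry.get("id", "").lower() == wanted:
            return entry["id"]
    return None
```

A None result marks the identifiers that need a manual mapping or a fallback.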

Multiple licenses

CKAN does not support multiple licenses, whereas Budget Data Package does. We need to figure out how to properly support the licenses property.
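One possible approach, sketched below, is to use the first license as the dataset's license_id and keep the rest in an extras entry. The extras key "additional_licenses" and the function name are hypothetical, and the sketch assumes each license is either an id string or a dict with an "id" key.

```python
def to_ckan_license_fields(descriptor):
    """Map license/licenses from a descriptor onto a single CKAN license_id.

    Assumes each license is an id string or a dict with an "id" key.
    """
    raw = descriptor.get("licenses")
    if raw is None:
        raw = descriptor.get("license")
    if raw is None:
        return {}
    if not isinstance(raw, list):
        raw = [raw]
    ids = [item["id"] if isinstance(item, dict) else item for item in raw]
    if not ids:
        return {}
    fields = {"license_id": ids[0]}
    if len(ids) > 1:
        # Hypothetical extras key; keeps the remaining licenses visible.
        fields["extras"] = [
            {"key": "additional_licenses", "value": ", ".join(ids[1:])}
        ]
    return fields
```

Picking the first license is arbitrary, so whether the order of licenses carries any meaning is still an open question.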

The schema can't do "license" xor "licenses"

The Data Package specification (and Budget Data Package as well) says that the descriptor can use either license or licenses, but not both. This can't be captured in the schema, and it is an odd rule for a data package anyway because data packages allow additional properties. In effect the schema would have to capture:

You can have any additional properties EXCEPT license IF you already have licenses, OR licenses IF you already have license, but you don't have to have either of them because both are optional fields.
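The rule itself is easy to state outside the schema. A minimal sketch of a check the script could run alongside schema validation (the function name is hypothetical):

```python
def has_valid_license_keys(descriptor):
    """Return False only when both license and licenses are present.

    Neither key is required, so an absent pair is fine.
    """
    return not ("license" in descriptor and "licenses" in descriptor)
```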

DimensionType is not properly defined

The Budget Data Package specification only mentions DimensionType in an example rather than in the body of the spec, and the comment in that example names "entity", "classification", "program" and "etc." as possible values. The spec schema obviously ignores "etc." and restricts itself to the other three values, which are clearly not the definitive list.
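Until the list is settled, one option is to treat the three named values as known but not exhaustive, and warn instead of failing on anything else. A sketch under that assumption, with hypothetical names:

```python
import warnings

# The three values named in the spec example; not a definitive list.
KNOWN_DIMENSION_TYPES = {"entity", "classification", "program"}


def check_dimension_type(value):
    """Return True for a known type; warn and return False otherwise."""
    if value in KNOWN_DIMENSION_TYPES:
        return True
    warnings.warn(f"unrecognised dimension type: {value!r}")
    return False
```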

CKAN API keys

This is not an immediate issue for this command-line script, but it may become one depending on how the script is used. The OpenSpending CKAN instance does not make it easy to find one's API key, and may end up not using API keys at all but an external service. A proof of concept can work around this, but it needs to be addressed soon.
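For a proof of concept, the key could be passed in explicitly or read from the environment. The sketch below only builds the request and does not send it. The environment variable name CKAN_API_KEY and the function name are assumptions, and an external authentication service would need a different mechanism.

```python
import json
import os
import urllib.request


def build_package_create_request(ckan_url, dataset, api_key=None):
    """Build (but do not send) a CKAN package_create request."""
    # Hypothetical environment variable, used when no key is passed in.
    api_key = api_key or os.environ.get("CKAN_API_KEY")
    if not api_key:
        raise RuntimeError("no CKAN API key provided")
    return urllib.request.Request(
        ckan_url.rstrip("/") + "/api/3/action/package_create",
        data=json.dumps(dataset).encode("utf-8"),
        headers={"Authorization": api_key, "Content-Type": "application/json"},
        method="POST",
    )
```

Passing the result to urllib.request.urlopen would perform the actual upload.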
