trickvi / bdp2ckan
Turn a Budget Data Package into a CKAN instance
License: GNU General Public License v3.0
Because dimension and measure keys are not known in advance, the schema has to rely on patternProperties, which is very messy, especially since measures and dimensions are mixed together and have to be told apart with a negative lookahead. It's a non-issue, but still very messy.
The currency is a property of the measures. At the moment it is only grabbed by looping over all measures, extracting the currencies, and submitting them as metadata at the dataset level. The script should instead look at which resources use each currency and put the currency as metadata on the resource, not on the dataset. That might not be straightforward either, because a single resource might have multiple currencies. We need to figure out whether this is something we need to worry about.
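A minimal sketch of the per-resource grouping described above. The input shape is an assumption, not something the specification or the script defines: each measure is taken to be a dict with a "currency" key and a hypothetical "resource" key naming the resource it belongs to. The real descriptor may link measures and resources differently.

```python
from collections import defaultdict


def currencies_by_resource(measures):
    """Group the currencies used by measures under their resource.

    Assumption: `measures` maps a measure name to a dict carrying a
    "currency" key and a hypothetical "resource" key.
    """
    grouped = defaultdict(set)
    for measure in measures.values():
        currency = measure.get("currency")
        if currency is None:
            continue  # measures without a currency are skipped
        grouped[measure.get("resource")].add(currency)
    # Sort for a deterministic result; a resource may end up with several currencies.
    return {resource: sorted(found) for resource, found in grouped.items()}


example = {
    "amount": {"currency": "USD", "resource": "budget-2014"},
    "adjusted": {"currency": "EUR", "resource": "budget-2014"},
    "spent": {"currency": "USD", "resource": "spending-2014"},
}
print(currencies_by_resource(example))
```

A resource that maps to more than one currency, like the first one in the example, is exactly the case that still needs a decision.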
The examples in the budget data package specification make it clear that the source can be defined with a string, not only with an object.
The schema can only validate:
{
  "year": {
    "source": "source field name"
  }
}
but not
{
  "year": "source field name"
}
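One possible way to handle both forms before validation is to normalise the string shorthand into the object form. This is only a sketch: normalize_mapping is a hypothetical helper, not part of bdp2ckan, and it assumes that a bare string always means the source field name.

```python
def normalize_mapping(mapping):
    """Return a copy of the mapping where string values become {"source": value}.

    Assumption: a bare string is shorthand for the source field name.
    """
    normalized = {}
    for key, value in mapping.items():
        if isinstance(value, str):
            normalized[key] = {"source": value}
        else:
            normalized[key] = dict(value)  # shallow copy of the object form
    return normalized


short_form = {"year": "source field name"}
long_form = {"year": {"source": "source field name"}}
print(normalize_mapping(short_form) == normalize_mapping(long_form))
```

After normalisation the schema only has to describe the object form.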
The budget data package schema assumes that the specification will use source in the mapping to say where the source column is, but that is only a common pattern used for budget data packages. It's not clear what the official way would be (although it's very likely to be source).
Licenses in CKAN are defined by a license.json file, which may or may not be compatible with the licenses supported in Budget Data Package (the OD list). We need to figure out how to handle license identifiers properly.
It is not possible in JSON schema to indicate that one of the items in an array should include a specific property key with a specific value (i.e. one of the fields should have a primaryKey set to true).
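One way to cover this constraint outside the schema is a small check in the script itself. The sketch below assumes the fields are a list of dicts; has_primary_key is a hypothetical helper name.

```python
def has_primary_key(fields):
    """Return True if at least one field has primaryKey set to true.

    Assumption: `fields` is a list of dicts as found in a resource schema.
    """
    return any(field.get("primaryKey") is True for field in fields)


fields = [
    {"name": "id", "primaryKey": True},
    {"name": "amount"},
]
print(has_primary_key(fields))
```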
CKAN does not support multiple licenses, which Budget Data Package does support. We need to figure out how to properly support the licenses property.
The data package specification (and budget data package as well) says that the descriptor can use either license OR licenses, but not both. This can't be captured in the schema and is really weird for the data package as well, because data packages allow additional properties. So this is an example of trying to capture:
You can have any additional properties EXCEPT license IF you already have licenses, OR licenses IF you already have license, but you don't have to have either of those because these are optional fields.
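The rule can be stated in a few lines of code as a check on the descriptor. This is only a sketch: check_license_keys is a hypothetical helper, and raising ValueError is an illustrative choice rather than how bdp2ckan reports errors.

```python
def check_license_keys(descriptor):
    """Reject descriptors that define both license and licenses.

    Either key alone, or neither, is accepted because both are optional.
    """
    if "license" in descriptor and "licenses" in descriptor:
        raise ValueError("descriptor may use license or licenses, not both")


check_license_keys({"name": "budget"})                      # neither: fine
check_license_keys({"name": "budget", "license": "PDDL"})   # one: fine
try:
    check_license_keys({"license": "PDDL", "licenses": [{"id": "PDDL"}]})
except ValueError as error:
    print(error)
```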
The Budget Data Package specification only mentions DimensionType in an example rather than in the normative text, and the comment in that example names "entity", "classification", "program" and "etc." as possible values. The spec schema obviously ignores "etc." but restricts itself to the other three values, which are clearly not the definitive list.
This is not an immediate issue for this command-line script, but it may become an issue depending on how the script is used. The OpenSpending CKAN instance does not make it easy to find one's API key, and authentication may end up not using the API key at all but an external service instead. This can be made to work for a proof of concept but needs to be addressed soon.