
[CHI '23] Governor: Turning Open Government Data Portals into Interactive Databases

Home Page: https://doi.org/10.1145/3544548.3580868

License: MIT License



Governor App

Governor is an open-source web application that makes open government data portals (OGDPs) more accessible to end users. It lets users search the actual records in tables, preview tables directly without downloading them, and receive suggestions of joinable and unionable tables based on the tables they worked with most recently. Governor also manages the provenance of integrated tables, so users and their collaborators can easily trace them back to the original tables in the OGDP.

Dependencies

Getting Started

  • Install all dependencies mentioned above. Note that the default configuration file assumes that MongoDB and Elasticsearch are installed on the same machine and are not secured. If your setup differs, modify app.config.json accordingly.
  • Project setup:
npm install
npm run install-python
  • Crawl data from the OGDP:
bash crawl.sh
  • Pre-process the data and create the required indices:
bash preprocess.sh
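
Because the steps above assume that MongoDB and Elasticsearch are reachable at the addresses given in app.config.json, a quick pre-flight check can save a failed crawl. The following is a minimal sketch, not part of the Governor repository: the helper names (endpoint_from_uri, is_reachable, check_services) are hypothetical, only the mongodb.uri and elasticsearch.uri keys come from the configuration described in this README, and it assumes each URI names a single host.

```python
import json
import socket
from urllib.parse import urlparse

# Scheme defaults used when a URI omits the port.
DEFAULT_PORTS = {"mongodb": 27017, "http": 80, "https": 443}


def endpoint_from_uri(uri):
    """Extract (host, port) from a mongodb://, http:// or https:// URI.

    Assumes a single-host URI; multi-host MongoDB URIs are not handled.
    """
    parsed = urlparse(uri)
    port = parsed.port or DEFAULT_PORTS[parsed.scheme]
    return parsed.hostname, port


def is_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_services(config_path="app.config.json"):
    """Report whether the configured MongoDB and Elasticsearch hosts accept connections."""
    with open(config_path, encoding="utf-8") as f:
        config = json.load(f)
    report = {}
    for name in ("mongodb", "elasticsearch"):
        host, port = endpoint_from_uri(config[name]["uri"])
        report[name] = is_reachable(host, port)
    return report
```

Calling check_services() from the repository root returns a dictionary such as {"mongodb": True, "elasticsearch": False}. It only tests that a TCP connection opens; it does not verify credentials or index contents.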

Run server for development (with hot reload)

npm run serve

Run server for production (with compiled and minified JavaScript files)

npm run serve-deployed

Lint and fix files

npm run lint

Configuring Governor

Governor is configured with a single JSON file (app.config.json), in which a system administrator specifies the metadata fields to index for search and to display on the front end, as well as the URL of the CKAN endpoint to crawl. The app.config.json file provided in this repository is an example for deployment with Data.gov.sg. To deploy Governor with another open data portal, modify this configuration file. Each field of the configuration file is described below:

  • portal:
    • siteName: The name of the open data portal. For example, Open Canada.
    • siteUrl: The URL prefix of the datasets' metadata pages. It is concatenated with each dataset's UUID to generate a link to the original dataset. For example, https://open.canada.ca/data/en/dataset/.
    • packageApiUrl: The CKAN API endpoint for harvesting the metadata information. For example, https://open.canada.ca/data/api/action/package_search.
    • fileDownloaderConcurrency: The maximum number of concurrent threads for crawling the files. Set it according to the bandwidth and rate limit of the portal.
  • mongodb:
    • uri: The URI of the MongoDB server starting with mongodb://.
    • db: The name of the MongoDB database.
    • metadataIndexFields: Fields in the metadata that should be indexed for search. Nested fields are supported in the format of fieldA.subFieldB.
  • elasticsearch:
    • uri: The URI of the Elasticsearch REST API server starting with http:// or https://.
    • index: The name of the Elasticsearch index.
    • token: The bearer token for authenticating with the Elasticsearch REST API server. Can be left empty if the server is not secured.
  • frontend:
    • search:
      • fields: An array of the fields to display on the search result page for the dataset (top-level entry).
        • fieldName: The name/path of the field. Nested fields are supported in the format of fieldA.subFieldB.
        • displayName: Human-readable name of the field.
        • type: The type of the field. Three types are currently supported: 1) text, which is rendered directly; 2) date, which is formatted as Month Day Year; 3) list, which is rendered as multiple labels.
      • resourcesFields: An array of the fields to display on the search result page for the resources (tables under a dataset). Each entry is defined in the same way as in frontend.search.fields.
    • preview:
      • fields: An array of the fields to display on the table preview page for the dataset (under "Dataset Details"). Each entry is defined in the same way as in frontend.search.fields.
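
To make the field definitions above concrete, here is a small sketch of how a dotted path such as fieldA.subFieldB could be resolved against a metadata record and how the three types could be rendered. It illustrates the configuration semantics only and is not Governor's front-end code: the function names, the sample field entries, and the sample record are all made up, and the exact date output ("April 23 2023") is an assumption about what "Month Day Year" looks like.

```python
from datetime import datetime

# Illustrative entries in the shape of frontend.search.fields; the field
# names and display names are made-up examples, not the shipped config.
SEARCH_FIELDS = [
    {"fieldName": "title", "displayName": "Title", "type": "text"},
    {"fieldName": "organization.title", "displayName": "Publisher", "type": "text"},
    {"fieldName": "last_updated", "displayName": "Last Updated", "type": "date"},
    {"fieldName": "keywords", "displayName": "Keywords", "type": "list"},
]


def get_nested(record, path):
    """Follow a dotted path such as 'fieldA.subFieldB'; return None if absent."""
    current = record
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def render_field(record, field):
    """Return (displayName, rendered value) for one field entry."""
    value = get_nested(record, field["fieldName"])
    if value is None:
        return field["displayName"], None
    if field["type"] == "date":
        # Assumes an ISO 8601 input string; the real front end may format
        # "Month Day Year" differently.
        value = datetime.fromisoformat(value).strftime("%B %d %Y")
    elif field["type"] == "list":
        value = [str(item) for item in value]  # one label per item
    else:
        value = str(value)  # "text" is shown as-is
    return field["displayName"], value


# Hypothetical metadata record used only to exercise the helpers.
sample = {
    "title": "Bus Ridership",
    "organization": {"title": "Transport Agency"},
    "last_updated": "2023-04-23",
    "keywords": ["transport", "bus"],
}
rendered = [render_field(sample, field) for field in SEARCH_FIELDS]
```

Returning None for a missing path lets a caller skip fields that a particular dataset does not provide, which matters because metadata completeness varies between portals.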

Citing Governor

If you are a researcher and use Governor in your work, we encourage you to cite our work. You can use the following BibTeX citation:

@inproceedings{governor:chi,
  author =  {Chang Liu and
             Arif Usta and
             Jian Zhao and
             Semih Saliho\u{g}lu},
  title={Governor: Turning Open Government Data Portals into Interactive Databases},
  booktitle={Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems (CHI '23)},
  year={2023}
}
