openAFRICA aims to be the largest independent repository of open data on the African continent. This repo contains the primary deployment scripts and files. Accessible at https://openafrica.net/

Home Page: https://openafrica.net/

License: GNU General Public License v3.0

Makefile 13.30% Dockerfile 56.63% Shell 28.09% Procfile 1.97%
ckan openafrica open-africa dokku docker deployment data-portal ckanext-openafrica docker-image open-data

openafrica's Introduction

openAFRICA

The continent's largest volunteer-driven open data portal.

This repo seeks to streamline deployment of the openAFRICA platform by pulling together the different components used for openAFRICA and deploying them using dokku.

CKAN

CKAN is an open-source DMS (data management system) for powering data hubs and data portals. CKAN makes it easy to publish, share and use data. It powers datahub.io, catalog.data.gov and data.gov.uk among many other sites.

We use CKAN's own vanilla releases, but because they haven't properly adopted Docker and Docker Hub (yet) for deployment, we keep a stable version (codeforafrica/ckan:latest) that we can be sure plays nicely with our extensions.

The ckan extensions we are using include:


Development

To set up your development environment:

$ git clone https://github.com/CodeForAfricaLabs/openAFRICA.git

$ cd openAFRICA

Build and start the services defined in docker-compose.yml:

docker-compose build && docker-compose up

Updating CKAN Docker Image

To update the openafrica/ckan:latest Docker image, edit Makefile and then run:

make ckan

Tests

?


Deployment

We use dokku for deployment, so you need to install and set it up first:

 # for debian systems, installs dokku via apt-get
 $ wget https://raw.githubusercontent.com/dokku/dokku/v0.11.3/bootstrap.sh
 $ sudo DOKKU_TAG=v0.11.3 bash bootstrap.sh
 # go to your server's IP and follow the web installer

Install + Create Dependencies

Once installed, we can do the following:

  1. Create the Dokku app and add a domain to it
dokku apps:create ckan
dokku domains:add ckan openafrica.net
  1. Add letsencrypt for free https certificate

Install the dokku-letsencrypt plugin and set the config variables

sudo dokku plugin:install https://github.com/dokku/dokku-letsencrypt.git
dokku config:set --no-restart ckan [email protected]
  1. Create CKAN Solr Instance

CKAN uses a special schema for Solr so you should deploy openafrica/solr

dokku apps:create ckan-solr

sudo docker volume create --name ckan-solr
dokku docker-options:add ckan-solr run,deploy --volume ckan-solr:/opt/solr/server/solr/ckan

sudo docker pull codeforafrica/ckan-solr:2.7.6
sudo docker tag codeforafrica/ckan-solr:2.7.6 dokku/ckan-solr:latest

dokku git:from-image ckan-solr dokku/ckan-solr:latest

  1. Create Redis Instance

Install the redis plugin.

sudo dokku plugin:install https://github.com/dokku/dokku-redis.git redis
dokku redis:create ckan-redis

  1. Create CKAN DataPusher Instance

DataPusher is a standalone web service that automatically downloads any CSV or XLS (Excel) data files from a CKAN site's resources when they are added to the CKAN site, parses them to pull out the actual data, then uses the DataStore API to push the data into the CKAN site's DataStore.

dokku apps:create ckan-datapusher

sudo docker pull openafrica/ckan-datapusher:latest
sudo docker tag openafrica/ckan-datapusher:latest dokku/ckan-datapusher:latest

dokku git:from-image ckan-datapusher dokku/ckan-datapusher:latest

  1. Install Postgres (Optional)

This step is only needed if you'd like to have Postgres installed locally:

sudo dokku plugin:install https://github.com/dokku/dokku-postgres.git postgres
dokku postgres:create ckan-postgres

  1. Install RabbitMQ

Install the RabbitMQ plugin (the harvest extension uses RabbitMQ as its backend):

sudo dokku plugin:install https://github.com/dokku/dokku-rabbitmq.git rabbitmq
dokku rabbitmq:create ckan-rabbitmq
  1. Set up S3

Create a bucket and a programmatic access user, and grant the user full access to the bucket with the following policy:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "s3:*"
            ],
            "Resource": [
                "arn:aws:s3:::openafrica/*",
                "arn:aws:s3:::openafrica"
            ]
        }
    ]
}
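
If the bucket is not called openafrica, the policy above can be generated for another bucket name rather than edited by hand. The following is only a minimal sketch: the function name s3_bucket_policy and the file name policy.json are hypothetical and not part of this repo.

```shell
# Hypothetical helper: print the full-access policy shown above for a given bucket.
# The body runs in a subshell so the variable does not leak into the caller.
s3_bucket_policy() (
  bucket="$1"
  cat <<EOF
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "s3:*"
            ],
            "Resource": [
                "arn:aws:s3:::${bucket}/*",
                "arn:aws:s3:::${bucket}"
            ]
        }
    ]
}
EOF
)
```

For example, `s3_bucket_policy openafrica > policy.json` writes the same policy as above to a file, which can then be attached to the access user with whatever AWS tooling you normally use.
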
  1. Create CKAN filestore volume

Create a named Docker volume and configure CKAN to use it, purely so that an upload path can be configured. The S3 plugin should keep the volume empty.

sudo docker volume create --name ckan-filestore
dokku docker-options:add ckan run,deploy --volume ckan-filestore:/var/lib/ckan/default

Configuration

Now we configure CKAN to tie the dependencies together:

Get the Redis DSN (connection details). In the next step it is set in the CKAN environment with /0 appended.

dokku redis:info ckan-redis
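
Appending /0 selects Redis database 0. A small sketch of that step follows; redis_url_with_db is a hypothetical helper name, and the DSN in the example is made up rather than taken from a real server.

```shell
# Hypothetical helper: append database index 0 to a Redis DSN.
# A single trailing slash on the DSN is tolerated.
redis_url_with_db() {
  printf '%s/0\n' "${1%/}"
}

# Example with a made-up DSN; copy the real one from the redis:info output.
redis_url_with_db "redis://ckan-redis:PASSWORD@dokku-redis-ckan-redis:6379"
# prints redis://ckan-redis:PASSWORD@dokku-redis-ckan-redis:6379/0
```

The printed value is what goes into CKAN_REDIS_URL in the next step.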

Get the RabbitMQ DSN (connection details) and extract the username, password, hostname, virtual host and port. You need these details separately because the harvester extension in its current form does not support configuration using the RabbitMQ URI scheme. The URI has the form:

amqp://username:password@hostname:port/virtualhost
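
Splitting the URI by hand is error-prone, so here is a rough sketch of doing it with POSIX parameter expansion. The function name parse_amqp_uri is hypothetical, and the sketch assumes the username and password contain no ':', '@' or '/' characters. It prints the parts under the CKAN__HARVEST__MQ__* names used in the configuration step.

```shell
# Hypothetical helper: split amqp://username:password@hostname:port/virtualhost
# into the separate settings the harvester extension expects.
# The body runs in a subshell so the temporary variables do not leak.
parse_amqp_uri() (
  uri="${1#amqp://}"        # drop the scheme
  creds="${uri%%@*}"        # username:password
  rest="${uri#*@}"          # hostname:port/virtualhost
  hostport="${rest%%/*}"    # hostname:port
  printf 'CKAN__HARVEST__MQ__USER_ID=%s\n' "${creds%%:*}"
  printf 'CKAN__HARVEST__MQ__PASSWORD=%s\n' "${creds#*:}"
  printf 'CKAN__HARVEST__MQ__HOSTNAME=%s\n' "${hostport%%:*}"
  printf 'CKAN__HARVEST__MQ__PORT=%s\n' "${hostport#*:}"
  printf 'CKAN__HARVEST__MQ__VIRTUAL_HOST=%s\n' "${rest#*/}"
)
```

Calling it with the DSN reported for ckan-rabbitmq prints five KEY=value lines that can be pasted into the dokku config:set command.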

Set the CKAN environment variables, replacing these examples with actual production values:

  • CKAN_REDIS_URL: use the Redis DSN
  • CKAN_SOLR_URL: use the alias given for the docker link below
  • CKAN___BEAKER__SESSION__SECRET: this must be a long, random, secret string. Changing it invalidates any active sessions.
  • CKAN___CKANEXT__S3FILESTORE__SIGNATURE_VERSION: use as-is (the plugin requires it, for reasons we have not determined).
dokku config:set ckan CKAN_SQLALCHEMY_URL=postgres://ckan_default:password@host/ckan_default \
                      CKAN_DATASTORE_READ_URL=postgresql://datastore_default:pass@localhost/datastore_default \
                      CKAN_DATASTORE_WRITE_URL=postgresql://ckan_default:pass@localhost/datastore_default \
                      CKAN_REDIS_URL=.../0 \
                      CKAN_INI=/ckan.ini \
                      CKAN_SOLR_URL=http://solr:8983/solr/ckan \
                      CKAN_SITE_URL=https://openafrica.net/ \
                      CKAN___BEAKER__SESSION__SECRET= \
                      CKAN_SMTP_SERVER= \
                      CKAN_SMTP_USER= \
                      CKAN_SMTP_PASSWORD= \
                      [email protected] \
                      CKAN___CKANEXT__S3FILESTORE__AWS_BUCKET_NAME=openafrica \
                      CKAN___CKANEXT__S3FILESTORE__AWS_ACCESS_KEY_ID= \
                      CKAN___CKANEXT__S3FILESTORE__AWS_SECRET_ACCESS_KEY= \
                      CKAN___CKANEXT__S3FILESTORE__HOST_NAME=http://s3-eu-west-1.amazonaws.com \
                      CKAN___CKANEXT__S3FILESTORE__REGION_NAME=eu-west-1 \
                      CKAN___CKANEXT__S3FILESTORE__SIGNATURE_VERSION=s3v4 \
                      CKAN__HARVEST__MQ__VIRTUAL_HOST=ckan-rabbitmq \
                      CKAN__HARVEST__MQ__PORT=5672 \
                      CKAN__HARVEST__MQ__HOSTNAME=dokku-rabbitmq-ckan-rabbitmq \
                      CKAN__HARVEST__MQ__PASSWORD=912abee9882be7ca8718d3cab7263cfd \
                      CKAN__HARVEST__MQ__USER_ID=ckan-rabbitmq
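
The session secret is left empty in the command above and has to be a long random string. One possible way to produce one is sketched below; generate_secret is a hypothetical helper name, and the sketch assumes /dev/urandom, head -c and od are available on the server.

```shell
# Hypothetical helper: print 32 random bytes as 64 hexadecimal characters.
generate_secret() {
  # od renders the bytes as space-separated hex pairs; tr strips spaces and newlines
  head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n'
  echo
}

# Possible usage (not run here):
#   dokku config:set ckan CKAN___BEAKER__SESSION__SECRET="$(generate_secret)"
```

Generate the value once and keep it, since changing it invalidates active sessions.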

Link CKAN with Redis, Solr, and CKAN DataPusher:

dokku redis:link ckan-redis ckan  #noqa
dokku docker-options:add ckan run,deploy --link ckan-solr.web.1:solr
dokku docker-options:add ckan run,deploy --link ckan-datapusher.web.1:ckan-datapusher

Scheduled Jobs

For openAFRICA to work correctly, some jobs have to run on a schedule, e.g. updating tracking statistics and rebuilding the search index for newly uploaded datasets. To create a scheduled job that is executed by a Dokku application, follow these steps:

sudo su dokku
crontab -e

Add the following entries:

0 * * * * echo '{}' | dokku --rm run ckan paster --plugin=ckan post -c /ckan.ini /api/action/send_email_notifications > /dev/null

0 * * * * dokku --rm run ckan paster --plugin=ckan tracking update -c /ckan.ini

*/15 * * * * dokku --rm run ckan paster --plugin=ckanext-harvest harvester run --config=/ckan.ini
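
If these entries are added by a script rather than by hand, it helps to make the step idempotent so that re-running it does not duplicate jobs. The following is only a sketch of that idea; ensure_cron_line and the scratch file path are hypothetical and not part of this repo.

```shell
# Hypothetical helper: append LINE to FILE unless an identical line is already present.
# The body runs in a subshell so the variables do not leak into the caller.
ensure_cron_line() (
  file="$1"
  line="$2"
  touch "$file"
  # -x matches whole lines only; -F treats the entry (which contains '*') as a fixed string
  grep -qxF -e "$line" "$file" || printf '%s\n' "$line" >> "$file"
)

# Possible usage as the dokku user (the scratch path is illustrative):
#   crontab -l > /tmp/dokku.cron 2>/dev/null
#   ensure_cron_line /tmp/dokku.cron '0 * * * * dokku --rm run ckan paster --plugin=ckan tracking update -c /ckan.ini'
#   crontab /tmp/dokku.cron
```

Running the helper twice with the same entry leaves a single copy in the file.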

Deploy CKAN

Once done with installing and configuring, you can push this repository to dokku:

git remote add dokku [email protected]:ckan
git push dokku

Initialize Database

Before you can run CKAN for the first time, you need to run db init to initialize your database:

dokku enter ckan
cd src/ckan
paster db init -c /ckan.ini

Lastly, let's make sure we encrypt traffic:

dokku letsencrypt ckan

NOTE: Make sure to have the appropriate permissions to push to dokku.


Contributing

Thank you for considering contributing to this project. You are awesome. :)

To get you started, here are a few pointers:

Check out the development docs to get started on this repo locally.

Security Vulnerabilities

Please report security vulnerabilities to [email protected]. These will be acted on promptly.


License

GNU General Public License

openAFRICA aims to be the largest independent repository of open data on the African continent. Copyright (C) 2017 Code for Africa

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program. If not, see https://www.gnu.org/licenses/.

openafrica's People

Contributors

davidlemayian, esirk, phillipahereza, thepsalmist, wnjihia

openafrica's Issues

Log server errors to Slack

Whenever the openAFRICA server encounters errors, we should log them on Slack for quick reaction time.

Add feature to enable users to favourite datasets so they can easily find them

Users might find datasets that they believe could be useful to them someday, but not being able to add those datasets to a list of Favourites makes them difficult to remember. A feature that enables users to add datasets to a Favourites list will help them keep track of every dataset they are interested in.

Document s3 folders

So that we know what each folder in the s3 bucket holds, we should add some info here.

Add feature to remind registered organizations via email to upload new datasets

Organizations within the platform upload the most reliable datasets. There needs to be a way to remind and motivate them to add more datasets. One way to motivate them is through visible engagement (comments and likes) by users on datasets. Also, reminding the organization about adding new datasets will help them keep it in mind.

I suggest that one of the available comments extensions be installed on OpenAfrica and that a Favorite/Like feature be built. Also, a feature that sends an email to the creators of organizations three months after the last dataset was added (and quarterly from then until a new dataset is added) would help OpenAfrica achieve its aim of becoming the largest independent repository of open data on the African continent.

Integrate with Travis CI

As this is a repo specifically for deployment, we should align our tests in #3 for inclusion in our code integration.

To do this, let's integrate with Travis-CI.

Add Data Request button on openafrica home page.

Currently:

When you install ckanext-datarequest on your CKAN instance, it shows the DATAREQUEST button. But once ckanext-openafrica is installed, the button is overridden.
Navigating to /datarequest shows the page for requesting a dataset, but the button is missing on the home page. See https://africaopendata.org/datarequest

Desired Behaviour:

A user should be able to see the datarequest button on the home page, which when clicked, should navigate to /datarequest.

Why This Issue is Important:

Without the button, users may have no clue that datasets can be requested. Moreover, a user who is aware of datarequest will not have to type /datarequest but can simply navigate to the page by clicking the button.

Notes:

  • This issue can easily be handled by replicating the Suggest a dataset button shown at the bottom when no dataset is found for a user's search.
  • Since ckanext-datarequest adds the button, it looks like a bug when the button does not appear after installing ckanext-openafrica.

Add cron jobs to deployment process

When we deploy a container, we should ensure the cron jobs necessary are live. We should therefore work with how dokku approaches cron jobs.

Add Github templates

We should add our standard Github templates to streamline our development process.

[DATA REQUESTS] Unable to delete spam data

Description

We are having trouble deleting spam data requests: on deletion, the error "404 this resource could not be found" is shown.

Tasks

  1. Troubleshoot the 404 error message.
  2. Make sure spam data requests are deleted.

Publish releases aligned with Docker image versions

Description

We currently create versioned Docker images as part of the development process for openAFRICA. We should also align these with the releases created here.

Tasks

  • Improve Docker image creation documentation
  • Add release process in documentation
  • Add a changelog
  • Create first release
