
Swiss Hiking Data Updater

Backend service for the Swiss Hiking project. It fetches hiking routes from online providers and aggregates the results into Cloud Datastore.

The frontend code is at https://github.com/lyx-x/SwissHiking; both share the same GCP project ID.

Installation

To install the package so that bin/runner.py can be run:

cd SwissHikingData
pip3 install -e .

How to set up GCP Cloud Functions

This project currently runs on GCP; this section records the steps needed to make that happen (and to abide by GCP's requirements). We cover Cloud Datastore, Cloud Source Repositories, Cloud Functions, Cloud Build, Cloud Pub/Sub and Cloud Scheduler.

Cloud Datastore

Datastore keeps track of route coordinates as well as the update status for each content provider. Depending on the source, the updater can choose a different update strategy in order to minimize the load on both sides.

To run the project locally with bin/runner.py, we need to add credentials to the job. First create a service account for your project and download a JSON key file containing your credentials. Then set the GOOGLE_APPLICATION_CREDENTIALS environment variable to the path of this JSON file. Details are at https://cloud.google.com/docs/authentication/production.

Accessing Datastore from other APIs within the same project doesn't need extra configuration.
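For instance, in a shell it can look like this (the key path below is a placeholder for your own service-account key file):

```shell
# Point the Google client libraries at a service-account key
# (the path is a placeholder; use wherever you saved your key).
export GOOGLE_APPLICATION_CREDENTIALS="$HOME/keys/swiss-hiking-sa.json"

# The updater then picks up the credentials automatically:
# python3 bin/runner.py
```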

Cloud Source Repositories

This project is hosted on GitHub and mirrored by Cloud Source Repositories at https://source.cloud.google.com/swiss-hiking/github_lyx-x_SwissHikingData. Cloud Build currently only works with Cloud Source Repositories, hence this choice.

Cloud Functions

A Cloud Function is a request handler. The request can come over HTTP, Pub/Sub, etc., and each trigger type is handled by a different piece of code living in main.py. The function can live anywhere as long as all its dependencies are listed in a requirements.txt alongside this file. In addition, you can't install any dependencies yourself, so it's better to use relative imports in the code. This project has a setup.py, but that is not required.

The entry point main.py is pretty simple: it parses the input data (JSON) and acts accordingly. An HTTP trigger also supports query arguments, but it's recommended to pass JSON data, as this project does.
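As a sketch, such an entry point could look like the following (the function names, the "provider" attribute and run_update are illustrative, not necessarily what this repository uses):

```python
import base64
import json


def run_update(params):
    # Placeholder for the real update logic: pick a provider and a strategy.
    provider = params.get("provider", "all")
    return {"provider": provider, "status": "ok"}


def update_data_http(request):
    """HTTP trigger: expects a JSON body describing what to update."""
    # `request` is the Flask request object that Cloud Functions passes
    # to HTTP handlers; get_json() parses the JSON body.
    params = request.get_json(silent=True) or {}
    return json.dumps(run_update(params))


def update_data_pubsub(event, context=None):
    """Pub/Sub trigger: behavior is controlled by the message attributes."""
    params = dict(event.get("attributes") or {})
    # The message body, if any, arrives base64-encoded.
    if event.get("data"):
        body = json.loads(base64.b64decode(event["data"]).decode("utf-8"))
        params = {**body, **params}
    return run_update(params)
```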

Once you have this file, you can create a Cloud Function using either the console or the gcloud command-line tool.
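From the command line, a deployment might look like this (the function names, runtime and region below are placeholders, not necessarily this project's values):

```shell
# HTTP-triggered function; the entry point is a function defined in main.py.
gcloud functions deploy update_data_http \
  --runtime=python39 --trigger-http --region=europe-west1

# Pub/Sub-triggered function, bound to the data-updater topic.
gcloud functions deploy update_data_pubsub \
  --runtime=python39 --trigger-topic=data-updater --region=europe-west1
```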

Cloud Build

Once you have the project, you may want to use continuous integration. Cloud Build allows you to specify a list of actions to be taken in order to deploy your function (so the last action should be the deployment itself). To enable Cloud Build, follow these steps:

  1. Create a cloudbuild.yaml file under the root directory
  2. Add a build trigger (on the Cloud Build console, for example)
  3. Add the Cloud Functions Developer role to the Cloud Build service account ([PROJECT_NUMBER]@cloudbuild.gserviceaccount.com). Go to Console > IAM & Admin > IAM.
  4. Let the Cloud Functions Service Agent be a Service Account User of the current project, i.e. add the Cloud Functions Service Agent as a member of the App Engine default service account with the Service Account User role. This can be done on Console > IAM & Admin > Service Accounts or by running the following command:
gcloud iam service-accounts add-iam-policy-binding [PROJECT_ID]@appspot.gserviceaccount.com \
  --member=serviceAccount:service-[SOME_NUMBER]@gcf-admin-robot.iam.gserviceaccount.com \
  --role=roles/iam.serviceAccountUser
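A minimal cloudbuild.yaml for step 1 could look like this, with the deployment as the final (here, only) step; the function name, runtime and region are placeholders:

```yaml
steps:
  # Last step: deploy the function itself.
  - name: gcr.io/cloud-builders/gcloud
    args:
      - functions
      - deploy
      - update_data_pubsub
      - --runtime=python39
      - --trigger-topic=data-updater
      - --region=europe-west1
```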

Cloud Pub/Sub

Pub/Sub lets you publish messages to a topic and subscribe to that topic. It can be used as a trigger for Cloud Functions. For this project, we created a topic data-updater under the same project. We use the attributes field of the message to control the behavior of our function; this is almost the same as passing JSON data in an HTTP request. However, the testing UI of Cloud Functions doesn't seem to work for this, so we need to publish a real message on Pub/Sub and look at the log of the execution.
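Such a test message can be published from the command line, for example (the "provider" attribute name is illustrative):

```shell
# Publish a message whose attributes drive the function's behavior.
gcloud pubsub topics publish data-updater \
  --attribute=provider=schweizmobil
```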

Cloud Scheduler

One last piece of the puzzle is to schedule the Cloud Function with a cron job. This cron job publishes to the Pub/Sub topic on a regular schedule, which in turn invokes the function.

Cloud Scheduler is currently not available in every region; a workaround is to create a cron job in the frontend App Engine app and let the frontend publish to the Pub/Sub topic.
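Where Cloud Scheduler is available, the cron job can be created directly, for example (job name, schedule and attributes are placeholders):

```shell
# Publish to the data-updater topic every day at 03:00.
gcloud scheduler jobs create pubsub daily-update \
  --schedule="0 3 * * *" \
  --topic=data-updater \
  --message-body="{}" \
  --attributes=provider=all
```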
