Git Product home page Git Product logo

flowmap's Introduction

Kalin Project

This scala project aims to ETL (extract, transform and load) spatiotemporal data of human mobility for further analysis.

Beyond the preprocessing jobs, Kalin also introduces a stlab package to facilitate spatotemporal data analysis with Spark RDD.

Build

$ sbt package

or with dependencies

$ sbt assembly

Run testing

This project relies on specs2 to perform unit test in Scala. Run test command to activate shipped testing cases.

$ sbt test

Available Library

In this project, package cn.edu.sjtu.omnilab.stlab undertakes more general works on processing spatiotemporal data points ("RDD" in parentheses means that method is feed data in Spark RDD container).

  • STUtils: useful small toolkit for transformation of spatiotemporal data.
  • GeoPoint: basic representation of geographic points in LON/LAT or Cartesian coordinate.
  • GeoMidPoint (RDD): calculate the geographic midpoint given a series of points. Two algorithms (average lon/lat and Cartesian) are implemented with nearly the same result in most scenarios.
  • RadiusGyration (RDD): calculate the radius of gyration (RG) from individual movement history.
  • TidyMovement (RDD): remove redundant information in users' movement history.

Available Spark Jobs

Package cn.edu.sjtu.omnilab.flowmap.hz contains jobs for HZ mobile data:

  • CountLogsJob: count the number of logs of feed input;
  • PrepareDataJob: separate input logs into isolated sets by day;
  • RadiusGyrationJob: calculate the radius of gyration from human movement data;
  • GeoRangeJob: give basic dimensions of data, including geo-range, unique logs/users/cells numbers;
  • FilterDataGeo: filter data to leave out logs outside given geo-range, specifically out the range of HZ administrative area;
  • TidyMovementJob: filter out redundant movement history to keep data brief;
  • SampleUsersJob: sample users of high data quality out of the total population;

Package cn.edu.sjtu.omnilab.flowmap.d4d contains jobs for D4D data:

  • TidyMovementJob: filter out redundant movement history to keep data brief;

flowmap's People

Contributors

caesar0301 avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.