Git Product home page Git Product logo

py-st-dbscan's Introduction

py-st-dbscan

An implementation of ST-DBScan algorithm using Python language. For more information, see the paper:

Birant, D. and Kut, A. (2007). St-dbscan: An algorithm for clustering 
spatial–temporal data. Data & Knowledge Engineering, 60(1):208 – 221. 
Intelligent Data Mining.

Related work

DBSCAN is a spatial density-based clustering algorithm for applications with noise. This algorithm does not require the number of clusters, this value is identified based on the quantity of highly density connected components. The required parameters are the radius and the minimum number of neighbors.

From these parameters, clusters with different formats and the same density, are found [Sander et al., 1998]. This algorithm can be applied in several contexts in which the identification of densely connected components is desired (e.g., delimitation of deforested regions and identification of areas of organs affected by tumors). In all these contexts, clusters are identified considering spatial characteristics of the elements.

We can find many variations of DBSCAN in the literature, one of these is ST-DBSCAN, which takes into account both spatial and non-spatial aspects (e.g., temperature, color, or time) of the elements [Birant and Kut, 2007]. However, in this work ST-DBSCAN is used for identifying traffic congestions from geospatial data provided by an application of taxi calling. In this context, many efforts have been made to identify traffic congestion using clustering algorithms. In [Kianfar and Edara, 2013], the efficiency of Kmeans, Hierarchical Clustering, and Gaussian Mixture Models (GMM) were compared identifying congested and free roads. In the other way, [Liu et al., 2010] developed an algorithm called Mobility-Based Clustering for find agglomerations of mobile objects (e.g., taxis) in cities. They defend that vehicle speed can be predicted through its accumulation. Finally, a deep learning method, called Restricted Boltzmann Machine, was proposed for predicting traffic conditions from data generated by taxi GPS in order to recommend roads for taxi drivers [Niu et al., 2015].

Sample application Traffic Congestion Detection using Taxi Position

Traffic congestion is a frequent situation in urban centers nowadays. It occurs because the urban infrastructure can not keep up with the growth of the number of vehicles. Thus, it causes many drawbacks, such as stress, delays, and excessive fuel consumption. This application aims to identify traffic congestions using the geographic position of taxis provided by GPS.

We assume that a vehicle speed can be estimated by its position in different times. Thus, we applied a density clustering algorithm, which takes into account both spatial and non-spatial aspects [R3], to identify traffic congestions from taxi positions.

Sample result for Curitiba data

Reference

[Birant and Kut, 2007] Birant, D. and Kut, A. (2007). St-dbscan: An algorithm for clustering spatial–temporal data. Data & Knowledge Engineering, 60(1):208 – 221. Intelligent Data Mining.

[Sander et al., 1998] Sander, J., Ester, M., Kriegel, H.-P., and Xu, X. (1998). Density-based clustering in spatial databases: The algorithm gdbscan and its applications. Data Min. Knowl. Discov., 2(2):169–194

[Liu et al., 2010] Liu, S., Liu, Y., Ni, L. M., Fan, J., and Li, M. (2010). Towards mobility-based clustering. In Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’10, pages 919–928, New York, NY, USA. ACM.

py-st-dbscan's People

Contributors

fernando-carvalho avatar lucasmsp avatar waltersf avatar

Watchers

 avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.