Electric Energy Meters Clustering

Andres Fernando Garcia 2021-06-05

Project details

This project separates a dataset of electric meters into groups. Such a grouping can be useful for several purposes, such as:

  • Detecting failures
  • Detecting fraudulent usage
  • Distinguishing user types for tariff design
  • Demand-side management

The dataset consists of measurements from 1507 smart electric meters (after removing meters with missing data) taken over 1 year. Measurements are expressed in kilowatts [kW]. The sampling period is 15 minutes, which gives 96 samples per day and 35040 samples per meter over 365 days. This results in a dataset with a size of 1507x35040.
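The dataset size can be checked with a few lines of arithmetic. The sketch below is illustrative only: the constant names and the `timestamp_of` helper are hypothetical, and the start date is the one assumed later in the data exploration (January 1, 2015, at 00:00).

```python
from datetime import datetime, timedelta

SAMPLING_MINUTES = 15   # sampling period stated above
DAYS = 365              # one non-leap year
N_METERS = 1507         # meters kept after removing incomplete ones

samples_per_day = 24 * 60 // SAMPLING_MINUTES
samples_per_meter = samples_per_day * DAYS


def timestamp_of(index, start=datetime(2015, 1, 1, 0, 0)):
    """Return the timestamp of the sample at a 0-based column index.

    The start date is an assumption taken from the exploration section.
    """
    return start + timedelta(minutes=SAMPLING_MINUTES * index)


print((N_METERS, samples_per_meter))        # dataset shape
print(timestamp_of(samples_per_meter - 1))  # timestamp of the last sample
```

With these assumptions, the last column of the matrix corresponds to December 31, 2015, at 23:45.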

Data exploration

One of the difficulties is that common exploratory charts such as boxplots or histograms are impractical because of the huge size of the dataset. Instead of these kinds of charts, a scatter plot of the maximum and minimum measurements of each meter is shown. The graph also includes the dispersion of each electric meter.
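The per-meter summary behind such a scatter plot could be computed as in the following sketch. It assumes the data is available as a NumPy array with one row per meter. The function name `summarize_meters` and the synthetic demo values are hypothetical, and the standard deviation is used here as the dispersion measure, which the text does not specify.

```python
import numpy as np


def summarize_meters(power_kw):
    """Return per-meter minimum, maximum and standard deviation.

    power_kw: array of shape (n_meters, n_samples), values in kW.
    """
    power_kw = np.asarray(power_kw, dtype=float)
    return {
        "min": power_kw.min(axis=1),
        "max": power_kw.max(axis=1),
        "std": power_kw.std(axis=1),  # assumed dispersion measure
    }


# Synthetic stand-in for the real measurements (hypothetical values).
rng = np.random.default_rng(0)
demo = rng.gamma(shape=2.0, scale=50.0, size=(20, 96 * 7))
summary = summarize_meters(demo)
print(summary["max"].round(1))
```

Each meter is then reduced to three numbers, so the whole dataset fits in one chart, for example with the minimum on one axis, the maximum on the other, and the dispersion mapped to color.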

From the previous plot, it is possible to conclude 2 things:

  • Most of the measurements are less than 500 kW
  • Several electric meters have non-zero measures all the time

An overview of all data with measurements below 500 kW is shown in the next figure:

With this image, it is possible to see that most of the measurements are below 200 kW. In addition, the variation of the electric power detected by each meter was almost constant throughout the year.

The following plot shows the measurements taken by an electric meter with values less than 250 kW, assuming that the first measurement was taken on January 1, 2015, at 00:00.

The following plot shows the measurements taken by an electric meter with values greater than 500 kW, assuming that the first measurement was taken on January 1, 2015, at 00:00.

Results

The first clustering algorithm tested was hierarchical clustering. First, Principal Component Analysis (PCA) was run to reduce the dimensionality of the data. The final dendrograms were cut to obtain from 2 to 10 clusters, and the number of clusters was selected using the elbow method.
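A minimal sketch of this procedure, assuming scikit-learn and SciPy, is shown next. The number of principal components, the Ward linkage, and the within-cluster sum of squares used as the elbow criterion are assumptions, since the text does not state them. The function names and the synthetic data are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from sklearn.decomposition import PCA


def within_cluster_ss(points, labels):
    """Total squared distance from each point to its cluster centroid."""
    total = 0.0
    for label in np.unique(labels):
        members = points[labels == label]
        total += ((members - members.mean(axis=0)) ** 2).sum()
    return float(total)


def hierarchical_elbow(power_kw, n_components=2, k_values=range(2, 11)):
    """Reduce with PCA, build one dendrogram, and cut it at each k."""
    scores = PCA(n_components=n_components).fit_transform(power_kw)
    tree = linkage(scores, method="ward")  # assumed linkage method
    wss = {}
    for k in k_values:
        labels = fcluster(tree, t=k, criterion="maxclust")
        wss[k] = within_cluster_ss(scores, labels)
    return scores, tree, wss


# Synthetic stand-in: three groups of meters at different power levels.
rng = np.random.default_rng(0)
demo = np.vstack(
    [rng.normal(level, 5.0, size=(30, 200)) for level in (50.0, 150.0, 400.0)]
)
scores, tree, wss = hierarchical_elbow(demo)
for k, value in wss.items():
    print(k, round(value, 1))
```

The elbow is the value of k after which the within-cluster sum of squares stops dropping sharply. Building the tree once and cutting it several times avoids refitting for every k.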

The second clustering algorithm tested was density-based spatial clustering of applications with noise (DBSCAN). First, Principal Component Analysis (PCA) was run to reduce the dimensionality of the data. Candidate values for the neighborhood radius were taken from the deciles of all data. The final radius was selected using the elbow method on the corresponding number of clusters.
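The radius search could look like the sketch below, assuming scikit-learn and SciPy. The text does not say which quantity the deciles are computed from. This example assumes the deciles of the pairwise distances between PCA scores, which is only one possible reading. The `min_samples` value, the function name, and the synthetic data are also hypothetical.

```python
import numpy as np
from scipy.spatial.distance import pdist
from sklearn.cluster import DBSCAN
from sklearn.decomposition import PCA


def dbscan_by_deciles(power_kw, n_components=2, min_samples=5):
    """Run DBSCAN once per candidate radius and count the clusters."""
    scores = PCA(n_components=n_components).fit_transform(power_kw)
    # Assumption: candidate radii are the 10th..90th percentiles of the
    # pairwise distances between meters in PCA space.
    radii = np.percentile(pdist(scores), np.arange(10, 100, 10))
    results = []
    for eps in radii:
        labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(scores)
        n_clusters = len(set(labels) - {-1})  # label -1 marks noise points
        n_noise = int((labels == -1).sum())
        results.append((float(eps), n_clusters, n_noise))
    return results


# Synthetic stand-in: three groups of meters at different power levels.
rng = np.random.default_rng(0)
demo = np.vstack(
    [rng.normal(level, 5.0, size=(30, 200)) for level in (50.0, 150.0, 400.0)]
)
results = dbscan_by_deciles(demo)
for eps, n_clusters, n_noise in results:
    print(round(eps, 1), n_clusters, n_noise)
```

Plotting the number of clusters against the radius gives the curve on which an elbow can be looked for.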

The third clustering algorithm tested was the classical k-means clustering. First, Principal Component Analysis (PCA) was run to reduce the dimensionality of the data. Since the number of clusters (k) was not known in advance, this parameter was selected using the elbow method, varying k from 2 to 10.
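The elbow search for k-means can be sketched as follows, assuming scikit-learn and PCA scores that have already been computed. The elbow is normally read off a plot. The `pick_elbow` helper is a hypothetical heuristic (largest second difference of the inertia curve) added only for illustration, and it can only return interior values of the tested range.

```python
import numpy as np
from sklearn.cluster import KMeans


def kmeans_elbow(scores, k_values=range(2, 11), seed=0):
    """Fit k-means for each k and record the within-cluster sum of squares."""
    inertia = {}
    for k in k_values:
        model = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(scores)
        inertia[k] = float(model.inertia_)
    return inertia


def pick_elbow(inertia):
    """Hypothetical heuristic: the k where the curve bends the most."""
    ks = sorted(inertia)
    best_k, best_bend = None, float("-inf")
    for prev_k, k, next_k in zip(ks, ks[1:], ks[2:]):
        bend = inertia[prev_k] - 2 * inertia[k] + inertia[next_k]
        if bend > best_bend:
            best_k, best_bend = k, bend
    return best_k


# Synthetic stand-in for PCA scores: three well-separated groups.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
demo_scores = np.vstack([rng.normal(c, 0.5, size=(50, 2)) for c in centers])

inertia = kmeans_elbow(demo_scores)
print(pick_elbow(inertia))
```

On real data the curve is usually smoother than in this toy example, so inspecting the plot remains the safer way to choose k.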

The final clustering algorithm tested was spectral clustering. In this algorithm, the dimensionality of the dataset is reduced using Locality-Preserving Projection. Since the number of clusters (k) was not known in advance, this parameter was selected using the elbow method, varying k from 2 to 10. To plot the resulting clusters, the 1st and 2nd PCA scores were used.
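A rough sketch of this step with scikit-learn follows. Note the swap: scikit-learn's `SpectralClustering` embeds the data with the eigenvectors of a graph Laplacian, which is related to but not the same as the Locality-Preserving Projection named above. The nearest-neighbors affinity, the `n_neighbors` value, the function name, and the synthetic data are assumptions.

```python
import numpy as np
from sklearn.cluster import SpectralClustering
from sklearn.decomposition import PCA


def spectral_labels(features, k_values=range(2, 11), seed=0):
    """Run spectral clustering for each k and return the labels per k."""
    labels_by_k = {}
    for k in k_values:
        model = SpectralClustering(
            n_clusters=k,
            affinity="nearest_neighbors",  # assumed affinity
            n_neighbors=10,                # assumed neighborhood size
            random_state=seed,
        )
        labels_by_k[k] = model.fit_predict(features)
    return labels_by_k


# Synthetic stand-in: three groups of meters at different power levels.
rng = np.random.default_rng(0)
demo = np.vstack(
    [rng.normal(level, 5.0, size=(30, 40)) for level in (50.0, 60.0, 70.0)]
)
labels_by_k = spectral_labels(demo)

# PCA is used only to get two coordinates for plotting the clusters.
plot_xy = PCA(n_components=2).fit_transform(demo)
print(plot_xy.shape, np.bincount(labels_by_k[3]))
```

The clustering itself runs on the full feature matrix. The two PCA scores only serve as plot coordinates, as described above.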

Analysis of results

The silhouette plot is used to select the best clustering algorithm for this dataset. With hierarchical clustering and DBSCAN, a single “main” group gathers the majority of the electric meters. According to their silhouette plots, the members of this “main” group are very similar to each other. Since the remaining groups in these results have only a few members, these algorithms are not a good option for dividing the dataset into useful clusters. The k-means and spectral clustering algorithms give groups with many more members than the previous algorithms. When both silhouette plots are compared, the k-means plot shows better-defined groups. Thus, the conclusion is that the best clustering algorithm for separating the electric meters dataset of this project is k-means.
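The numbers behind a silhouette plot can be summarized as in the sketch below, assuming scikit-learn. The `silhouette_report` function and the synthetic data are hypothetical. It reports both the silhouette values and the cluster sizes, since the comparison above depends on both.

```python
import numpy as np
from sklearn.metrics import silhouette_samples, silhouette_score


def silhouette_report(points, labels):
    """Summarize silhouette values and sizes for one clustering result."""
    labels = np.asarray(labels)
    per_point = silhouette_samples(points, labels)
    clusters = np.unique(labels)
    return {
        "mean": float(silhouette_score(points, labels)),
        "sizes": {int(c): int((labels == c).sum()) for c in clusters},
        "per_cluster_mean": {
            int(c): float(per_point[labels == c].mean()) for c in clusters
        },
    }


# Synthetic stand-in: three well-separated groups with known labels.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
points = np.vstack([rng.normal(c, 0.5, size=(50, 2)) for c in centers])
labels = np.repeat([0, 1, 2], 50)

report = silhouette_report(points, labels)
print(round(report["mean"], 3), report["sizes"])
```

Running the same report on the labels from each algorithm allows a side-by-side comparison. For DBSCAN, the noise label -1 would be treated as one more cluster unless those points are filtered out first.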
