human-activity-recognition's Introduction

The 7 day Project - Human Activity Recognition

The Story

First of all, this project didn't take 7 days. Actually, I planned for 14 hours, because I only had 14 hours across those 7 days. This was "only" a side project next to my work.

I am writing this description after 14 days. I guess there were 4 days when I didn't work on the project, so I did it in 10 days 👌.

What is this project all about? I saw a YouTube video from Daniel Bourke. He did a 42-day project, and it inspired me. I had just finished my first Udacity course (Intro to Machine Learning), and I thought it would be great practice to do a machine learning project. So I decided to do something like this 42-day project, but in my own way.

First, I searched for exciting datasets and I found three really cool ones.

  1. Human Activity Recognition Using Smartphones Data Set
  2. Bar Crawl: Detecting Heavy Drinking Data Set
  3. Daily Demand Forecasting Orders Data Set

In the first dataset, the task is to recognize different activities from a smartphone's accelerometer and gyroscope data. I thought I could solve this and build a model that does the job. I expected it to be challenging to achieve high accuracy, but I had already read some of the papers on this dataset, and the accuracy of their models was above 90%, which is great 👍.

The second dataset, hmm. My personal favourite. Here the task is to find heavy drinking 'episodes' using accelerometer data from smartphones. I took a look at the paper here as well. The accuracy was ~77.5%. That is high, but not as high as for the first dataset, so I expected a lower success rate. Still, detecting drunk events with the help of a smartphone is cool. 🍻

The third one. I looked for a third dataset and found this one: predicting the total number of orders per day. It sounds a bit boring in comparison with the first and second ones, but I was pretty sure that I could do it within 7 days. 🚚

I decided to do the first one. Why? Because it was the most logical choice for me 🧐. It has a lot of resources, I had already worked with sensor data, and I saw in other papers that it is doable. Since this would be my first bigger project and I was not sure about the outcome of the second one, I chose the first dataset.

The Work

Data

The dataset from UCI contains both a ready-made train/test split and the raw data.

I wanted to use the raw data, because if I do all the feature engineering and data wrangling myself, I understand the data better and learn more.

I used these files:

  • RawData/exp_*.txt ➡ all measurement files (txt)
  • RawData/labels.txt ➡ information about all measurements and activities
  • activity_labels.txt ➡ meaning of the activity labels

Notebooks

The biggest part of the whole project was the data preparation. It was really challenging. You can walk through all the steps in notebooks/human-activity-recognition-data-preprocessing.ipynb.

After the data preprocessing, I tried out some of the classifier models from the scikit-learn library. I found the random forest classifier to be the best for this task. The initial accuracy was 97.86%. I was really surprised, because this was higher than the accuracy in some papers. You can find the machine learning part of the project in notebooks/human-activity-recognition-ml.ipynb. In the future I would like to fine-tune this, because I believe it is possible to achieve better accuracy.
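
As a rough illustration of this step (not the notebook's actual code), training and scoring a random forest with scikit-learn can look like the sketch below. Everything specific here is an assumption made for the example: the synthetic, well-separated features stand in for the real preprocessed data, and the helper names `make_synthetic_features` and `train_and_score`, the 75/25 split, and `n_estimators=100` are hypothetical choices.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split


def make_synthetic_features(n_per_class=100, n_features=6, n_classes=3, seed=0):
    """Build a toy feature matrix (assumption: stands in for the real features).

    Each class is a Gaussian cloud centred at 10 * class_index, so the
    classes are easy to separate.
    """
    rng = np.random.default_rng(seed)
    X = np.vstack([
        rng.normal(loc=10.0 * c, scale=1.0, size=(n_per_class, n_features))
        for c in range(n_classes)
    ])
    y = np.repeat(np.arange(n_classes), n_per_class)
    return X, y


def train_and_score(X, y, seed=0):
    """Fit a random forest on a stratified 75/25 split and return (model, accuracy)."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=seed, stratify=y
    )
    model = RandomForestClassifier(n_estimators=100, random_state=seed)
    model.fit(X_train, y_train)
    accuracy = accuracy_score(y_test, model.predict(X_test))
    return model, accuracy
```

Calling `train_and_score(*make_synthetic_features())` returns the fitted model and its accuracy on the held-out part of the toy data. The score on this toy data says nothing about the real dataset.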

After I had prepared the data and trained a model to classify human activity, I wanted to see the results of the project not only as scores, but visually as well. Therefore I created another notebook that can be used to test the original data and any correctly recorded data: notebooks/human-activity-recognition-classifier.ipynb.

The first plot of the data is a comparison between the real labels of the experiment (upper part) and the labels from my classifier (lower part). I plotted one channel (acc_x) of the smartphone's accelerometer, with the labels as colored areas for better interpretation.

There are two main differences between the classified data and the original data:

  1. The resolution of the labels in the classified plot. The explanation is simple: I used 128-sample blocks to train my model, and this is the resolution of my classifier. This means the model can only classify 128-sample blocks in a measurement. This corresponds to 2.56 seconds (the sample rate of the measurements is 50 Hz, so 1 s has 50 samples and 2.56 s has 128 samples).
  2. The white parts in the original data. These parts have no labels, which means we have no idea what activities they are. When I created the training dataset, I dropped these parts of the measurements. I could have used them as a "no activity" label, but that would not be entirely true, because there is an activity there; we just don't know which one. For this use case, these parts of the measurement are noise. Without this noise we have only the 12 labels for the 12 activities. Therefore, when we look at the classified data we cannot see white spaces: the model classifies every data point in the measurement. Which is cool, because we can see the originally unlabeled parts of the experiment as labeled.
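
The two points above can be sketched in a few lines of plain Python. This is only a minimal illustration under stated assumptions, not the code from the preprocessing notebook: the helper names `window_duration_s` and `make_windows` are hypothetical, the windows are assumed to be non-overlapping, unlabeled samples are assumed to be marked with `None`, and a block is dropped whenever it contains an unlabeled sample or more than one label.

```python
SAMPLE_RATE_HZ = 50   # sample rate stated in the text
WINDOW_SIZE = 128     # block length stated in the text


def window_duration_s(window_size=WINDOW_SIZE, sample_rate_hz=SAMPLE_RATE_HZ):
    """Duration of one block in seconds: 128 samples / 50 Hz = 2.56 s."""
    return window_size / sample_rate_hz


def make_windows(samples, labels, window_size=WINDOW_SIZE):
    """Cut a recording into non-overlapping blocks of `window_size` samples.

    `labels[i]` is the activity label of `samples[i]`, or None if the sample
    is unlabeled (assumption). A block is kept only if all of its samples
    share one non-None label; every other block is dropped.
    Returns a list of (block_samples, label) tuples.
    """
    windows = []
    last_start = len(samples) - window_size
    for start in range(0, last_start + 1, window_size):
        block_labels = labels[start:start + window_size]
        first = block_labels[0]
        # Drop blocks that are unlabeled or that mix several labels.
        if first is None or any(label != first for label in block_labels):
            continue
        windows.append((samples[start:start + window_size], first))
    return windows
```

For a recording with 128 samples of activity 5, then 128 unlabeled samples, then 128 samples of activity 7, `make_windows` keeps two blocks and drops the unlabeled one. At prediction time no blocks are dropped, which is why the classified plot has no white areas.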

exp_1

Contributors

meszdav