Git Product home page Git Product logo

constructionstageclassification's Introduction

Construction Stage Classification

Construction stages classification

Dataset

The dataset consists of 4 cameras. Each camera captures 10 sequences (structures) such as each sequence has 87 timestamps However, not all of these images are labelled. The ones without labels will be removed from dataset

Splitting dataset

Generally, we can split them in ratio of 6-2-2 or 8-1-1. In this assigment, I want to use the latter approach since I believe that we'd have too few images for training if applying 6-2-2. For each camera I randomly choose 8 sequence for training, 1 for testing and 1 for validation. Overall, I use 36 sequences for training, 4 for validation and 4 for testing

Task 1

I used RestNet50 as the backbone and only add a linear layer at the end to adjust to number of classes in our case

Run the training

python train.py --train --num_epoch 200 --batch_size 4 --lr 0,005

Run the evaluation to see how well the model performs on test set.

The accuracy = number of accurate predictions / number of images in test set. I trained for 300 epochs and get the accuray of 99.62%

python train.py --test

Test a specific image from the test set

python test.py --idx 1

Task 2

Unfortunately I am not able to finish the task. My idea is to utilize the Vision transformer architecture to solve it. Specifically, we take e.g 25 images from the sequence then consider that them as patches of a bigger image. It is similar to ViT dividing image into small patches, only different that here we are given the patches already. In terms of the positional encoding in ViT, we can also do the same for our task here. Thanks to the given timestamps we can have the position of each image in the sequence and therefore obtain the embedded position. Nevertheless, I haven't been able to put the idea into work since the adjustment for the shape in each layer is more complex than I expected

References

https://medium.com/@brianpulfer/vision-transformers-from-scratch-pytorch-a-step-by-step-guide-96c3313c2e0c

constructionstageclassification's People

Contributors

kudo510 avatar

Stargazers

 avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.