INM 705 Deep Learning for Image Analysis by Arda Yigithan Orhan

This project implements and evaluates two object detection models, Single Shot MultiBox Detector (SSD) and Faster R-CNN, using the COCO dataset. The goal is to identify everyday objects in real-time within a retail environment, a capability essential for autonomous retail systems like cashier-less stores.

Project Overview

The system targets real-time identification of everyday objects in a retail environment. It uses two widely used detectors, SSD and Faster R-CNN, both pre-trained on the COCO dataset, to detect and classify objects in images. SSD is a single-stage detector that favors speed, while Faster R-CNN is a two-stage detector that favors accuracy, so together they cover the speed/accuracy trade-off that matters for real-world retail applications.

Dataset

The project uses the COCO (Common Objects in Context) dataset, which contains over 200,000 labeled images. The original COCO paper defines 91 object categories, of which 80 are annotated in the detection releases. The dataset is accessed through the Deep Lake (`deeplake`) library, which handles data management and loading.

Installation

To run this project, you need to install the required dependencies. Follow the steps below:

  1. Clone this repository:

    git clone https://github.com/yourusername/object-detection-retail.git
    cd object-detection-retail

  2. Set up a virtual environment:

    python -m venv venv
    source venv/bin/activate  # On Windows, use `venv\Scripts\activate`

  3. Install the required packages:

    pip install -r requirements.txt

Project Structure

The project is organized as follows:

project_root/
│
├── models/
│   ├── faster_rcnn.py       # Contains the Faster R-CNN model definition
│   └── ssd.py               # Contains the SSD model definition
│
├── utils/
│   └── dataset.py           # Dataset management and preprocessing functions
│
├── train.py                 # Script to train the models
├── inference.py             # Script to run inference using a trained model
├── requirements.txt         # Python package dependencies
├── setup.sh                 # Script to set up the environment and dependencies
├── README.md                # Project documentation (this file)
└── model.pth                # Saved model weights (generated after training)

Training the Models

To train the models, run the train.py script. This script initializes the chosen model (either SSD or Faster R-CNN), loads the COCO dataset, and begins training.

python train.py

Key Parameters

  • MODEL_TYPE: Choose between 'ssd' and 'faster_rcnn'.
  • NUM_EPOCHS: Set the number of training epochs.
  • BATCH_SIZE: Set the batch size for training.
  • LEARNING_RATE: Set the learning rate for the optimizer.
  • SAMPLE_SIZE: Set the number of samples to use for training, useful for quick tests under limited computational resources.
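As a rough illustration of what SAMPLE_SIZE could do, the sketch below picks a reproducible random subset of dataset indices. It is not taken from train.py: the helper name `sample_indices`, the `seed` argument, and the placeholder value of SAMPLE_SIZE are all assumptions made for this example.

```python
import random

SAMPLE_SIZE = 500  # placeholder value, not the project's actual setting


def sample_indices(dataset_len, sample_size, seed=0):
    """Return a sorted list of at most `sample_size` distinct indices.

    Hypothetical helper: if the requested sample covers the whole
    dataset (or is None), every index is returned unchanged.
    """
    if sample_size is None or sample_size >= dataset_len:
        return list(range(dataset_len))
    rng = random.Random(seed)  # fixed seed keeps quick tests repeatable
    return sorted(rng.sample(range(dataset_len), sample_size))


# Example: choose SAMPLE_SIZE indices out of 10,000 items.
subset = sample_indices(10_000, SAMPLE_SIZE)
print(len(subset))
```

If the dataset is a PyTorch dataset, a list like this could be passed to `torch.utils.data.Subset` to restrict training to the sampled items.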

Running Inference

To run inference on a single image using a trained model, use the inference.py script. Specify the path to the image and the model type.

python inference.py --image path_to_image.jpg --model_type faster_rcnn
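The command above implies a small command-line interface. A minimal sketch of how such flags might be parsed with `argparse` follows; only the flag names `--image` and `--model_type` come from the command shown, while `build_parser`, the help strings, and making both flags required are assumptions rather than the contents of inference.py.

```python
import argparse


def build_parser():
    """Build a parser for the two flags used in the command above."""
    parser = argparse.ArgumentParser(
        description="Run object detection on a single image."
    )
    parser.add_argument("--image", required=True,
                        help="Path to the input image.")
    parser.add_argument("--model_type", required=True,
                        choices=["ssd", "faster_rcnn"],
                        help="Which trained model to use.")
    return parser


# Parse an explicit argument list so the sketch runs without a real CLI call.
args = build_parser().parse_args(
    ["--image", "path_to_image.jpg", "--model_type", "faster_rcnn"]
)
print(args.image, args.model_type)
```

Restricting `--model_type` with `choices` makes a typo fail at parse time instead of later, when the model is loaded.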

Visualizing Results

The inference script will output the image with bounding boxes drawn around detected objects, displayed using Matplotlib.

Evaluation

During training, predictions are evaluated with Intersection over Union (IoU), which measures the overlap between a predicted bounding box and its ground-truth box as the area of their intersection divided by the area of their union. Results are logged to Weights & Biases (wandb) for detailed analysis.
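For reference, IoU for a single pair of boxes can be sketched in a few lines. This assumes boxes are given as (x1, y1, x2, y2) corner coordinates; the function name `box_iou` is chosen for this example and is not necessarily what the project's evaluation code uses.

```python
def box_iou(box_a, box_b):
    """Return the IoU of two boxes in (x1, y1, x2, y2) corner format."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Corners of the intersection rectangle.
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)

    # Clamp at zero so disjoint boxes yield an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = max(0.0, ax2 - ax1) * max(0.0, ay2 - ay1)
    area_b = max(0.0, bx2 - bx1) * max(0.0, by2 - by1)
    union = area_a + area_b - inter

    # Degenerate boxes with zero total area are treated as no overlap.
    return inter / union if union > 0 else 0.0


# Two 2x2 boxes overlapping in a 1x1 square: IoU = 1 / (4 + 4 - 1) = 1/7.
print(box_iou((0, 0, 2, 2), (1, 1, 3, 3)))
```

Note that a box passed in (x, y, width, height) format instead of corner format would silently produce wrong, often zero, IoU values with this function.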

Known Issues

  • No Valid IoU Scores: During evaluation, the models fail to produce valid bounding boxes, so every IoU score is zero. Possible causes are incorrect data preprocessing or a model initialization issue.
  • Limited Sample Size: Due to computational constraints, a reduced sample size was used for training, which may limit the models' ability to generalize.

Future Improvements

  • Data Preprocessing: Review and validate the data preprocessing steps to ensure correct bounding box transformations and label alignment.
  • Hyperparameter Tuning: Experiment with different learning rates and batch sizes to improve model performance.
  • Sample Size: Use a larger sample size for training to ensure the model encounters a diverse range of object instances and variations.
  • Model Fine-tuning: Fine-tune the models more effectively for the COCO dataset to enhance detection accuracy.
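One concrete check for the data preprocessing item concerns box format. COCO annotations store each box as [x, y, width, height] with (x, y) the top-left corner, whereas many detection implementations, including torchvision's, expect [x1, y1, x2, y2] corners. The sketch below shows the conversion plus a simple sanity check. Whether the Deep Lake copy of COCO delivers boxes in the [x, y, width, height] layout is an assumption to verify, and both helper names are hypothetical.

```python
def xywh_to_xyxy(box):
    """Convert (x, y, width, height) to (x1, y1, x2, y2) corners."""
    x, y, w, h = box
    return (x, y, x + w, y + h)


def is_valid_xyxy(box, img_w, img_h):
    """Check that a corner-format box has positive area and fits the image."""
    x1, y1, x2, y2 = box
    return 0 <= x1 < x2 <= img_w and 0 <= y1 < y2 <= img_h


# Example: a 30x40 box whose top-left corner is at (10, 20).
converted = xywh_to_xyxy((10, 20, 30, 40))
print(converted, is_valid_xyxy(converted, img_w=100, img_h=100))
```

Running a check like `is_valid_xyxy` over a handful of training targets is a cheap way to catch boxes that were resized, flipped, or converted inconsistently with their images.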
