Caper.ai Project Write-Up

Hi Caper.ai Team Member! I built this mini-project to show my interest in joining Caper.ai. It's a tiny OpenCV program that tracks when an item is inserted into a cart. Building it gave me a feel for the technical challenges that making the Caper Cart might involve.

My code is in the cart.py file. I used Python 3 and OpenCV 4.5.3. You'll also need the following files placed in a directory /yolo-coco:

https://github.com/pjreddie/darknet/blob/master/cfg/yolov3.cfg

https://github.com/pjreddie/darknet/blob/master/data/coco.names

https://pjreddie.com/media/files/yolov3.weights

Please let me know if you hit any issues running it.
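
If the program fails at startup, a missing model file is a likely first thing to rule out. The snippet below is a minimal sketch of such a check and is not part of cart.py. It assumes the directory is named yolo-coco and sits next to the script, and that the files keep the names used in the links above; the helper name missing_yolo_files is made up for illustration.

```python
from pathlib import Path

# File names taken from the three download links in this write-up.
REQUIRED = ("yolov3.cfg", "coco.names", "yolov3.weights")


def missing_yolo_files(directory="yolo-coco"):
    """Return the required file names that are not present in `directory`.

    Assumption: the directory path is relative to where cart.py is run.
    """
    base = Path(directory)
    return [name for name in REQUIRED if not (base / name).is_file()]
```

Calling missing_yolo_files() before running cart.py returns an empty list when all three files are in place.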

A general overview of how my code works is given below.

I used a pretrained YOLO object detector that was trained on the COCO dataset. It detects a plastic water bottle in webcam images. I chose a bottle as a stand-in for a generic grocery item, since it's easy for you to test on your own computer. If an object is detected but detection is then lost for 4 frames, the code tests whether the object was inserted into the cart: if the object's last known vertical location is lower than the median of its vertical locations, the object is assumed to have been inserted.
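
The lost-detection rule and the median test can be sketched roughly as follows. This is an illustration under stated assumptions rather than the code in cart.py: the class name InsertionTracker and the helper run are hypothetical, the input is one vertical centre per frame (or None when nothing is detected), and "lower" is taken to mean a larger y value, as in image coordinates where y grows downward.

```python
from statistics import median

LOST_FRAMES = 4  # frames without a detection before the test runs (from the write-up)


class InsertionTracker:
    """Tracks one object's vertical position and applies the median test."""

    def __init__(self, lost_frames=LOST_FRAMES):
        self.lost_frames = lost_frames
        self.ys = []     # vertical centre of each detection, in pixels
        self.missed = 0  # consecutive frames with no detection

    def update(self, y):
        """Feed one frame: y is the detection's vertical centre, or None.

        Returns True or False once the object has been lost for
        `lost_frames` consecutive frames, and None while no decision is due.
        """
        if y is not None:
            self.ys.append(y)
            self.missed = 0
            return None
        if not self.ys:
            return None  # nothing has been tracked yet
        self.missed += 1
        if self.missed < self.lost_frames:
            return None
        # Assumption: image coordinates, so a larger y is lower in the frame.
        inserted = self.ys[-1] > median(self.ys)
        self.ys, self.missed = [], 0  # reset for the next object
        return inserted


def run(frames):
    """Feed a sequence of per-frame values and collect every decision made."""
    tracker = InsertionTracker()
    decisions = (tracker.update(y) for y in frames)
    return [d for d in decisions if d is not None]
```

For example, run([100, 150, 220, 300, None, None, None, None]) yields one decision, because the last position (300) is below the median (185) in the frame. A brief dropout of fewer than 4 frames resets the miss counter and does not trigger the test.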

I've also added barcode detection to the code. My thinking was that a barcode will rarely be detected, but when it is, the detection is highly reliable. If a barcode is detected, that's a very good identifier for the product being added to the cart. My code currently detects the barcode, then applies a perspective transform so that the barcode appears clearly. It would then be straightforward to apply a barcode-reading library, but I was unable to install one on my MacBook.
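
A perspective transform needs the four corners of the detected barcode in a consistent order, plus the matching corners of the output rectangle. The write-up doesn't say how cart.py obtains or orders them, so the sketch below is an assumption: the function names order_corners and warp_target are hypothetical, and the sum/difference ordering trick assumes the barcode is not rotated by much more than about 45 degrees.

```python
import math


def order_corners(pts):
    """Return four (x, y) points as top-left, top-right, bottom-right, bottom-left."""
    # Top-left has the smallest x + y and bottom-right the largest;
    # top-right has the smallest y - x and bottom-left the largest.
    tl = min(pts, key=lambda p: p[0] + p[1])
    br = max(pts, key=lambda p: p[0] + p[1])
    tr = min(pts, key=lambda p: p[1] - p[0])
    bl = max(pts, key=lambda p: p[1] - p[0])
    return [tl, tr, br, bl]


def warp_target(pts):
    """Return (ordered source corners, destination corners, (width, height))."""
    tl, tr, br, bl = order_corners(pts)
    # Size the output from the longer of each pair of opposite edges.
    width = round(max(math.dist(tl, tr), math.dist(bl, br)))
    height = round(max(math.dist(tl, bl), math.dist(tr, br)))
    dst = [(0, 0), (width - 1, 0), (width - 1, height - 1), (0, height - 1)]
    return [tl, tr, br, bl], dst, (width, height)
```

With OpenCV, the source and destination corners would be converted to float32 arrays and passed to cv2.getPerspectiveTransform, and the resulting matrix plus the (width, height) size to cv2.warpPerspective to produce the flattened barcode image.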

I noticed a few potential roadblocks for the Caper Cart. I'd be very interested to learn how the company has worked to resolve these:

  • Bulk items like multiple paper towel rolls. I noticed the Caper Cart doesn't have the rack underneath that a regular shopping cart usually has.

  • Localization in the store. I assume you are using Bluetooth beacons since I'm not sure if there's any good alternative. SLAM is too computationally expensive and GPS isn't accurate enough.

  • Produce bags. How would the cameras detect a fruit or vegetable when a plastic bag obscures it? I'm wondering if the current solution is manually keying in the item.

  • Tracking whether an object has entered or left the cart. My first instinct was to run a linear regression of the object's vertical position against time, but this isn't robust against indecisive item insertion. I chose a simpler method instead: checking whether the object's last known vertical position is below the median of its vertical positions seems to give better results.

  • I didn't add a weight sensor, but I'm curious if you've resolved the "unexpected item in bagging area" error. While it's probably correctly flagged as an error from a technical standpoint, the UX of encountering that is terrible.
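
To illustrate the tracking point in the list above, here is a small comparison of the two approaches on a made-up trajectory. The numbers are hypothetical and are not output from cart.py. As before, a larger y is assumed to be lower in the frame, and the function names are invented for this sketch.

```python
from statistics import median


def regression_slope(ys):
    """Least-squares slope of y against frame index 0, 1, 2, ... (needs 2+ points)."""
    n = len(ys)
    x_mean = (n - 1) / 2
    y_mean = sum(ys) / n
    num = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(ys))
    den = sum((x - x_mean) ** 2 for x in range(n))
    return num / den


def inserted_by_regression(ys):
    # A positive slope means the object trended downward in the frame.
    return regression_slope(ys) > 0


def inserted_by_median(ys):
    # The last known position is below the median position in the frame.
    return ys[-1] > median(ys)


# Hypothetical "indecisive" insertion: the item starts low, is pulled back
# up for several frames, then is dropped in just before detection is lost.
hesitant = [300, 320, 200, 150, 140, 150, 310]
```

On this trajectory the fitted slope is negative, so inserted_by_regression(hesitant) reports no insertion, while inserted_by_median(hesitant) reports one because the final position (310) is below the median (200). On a steady downward path both tests agree.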

TESTING INSTRUCTIONS

$ python3 cart.py

In case it doesn't run, I have also included the output from a test run of the code. output.avi shows the bounding box and the notification that an item was inserted into the cart, while warped.jpg is the warped barcode image. Please note that my computer is old and ran this very slowly, which is why the sample output is choppy.

Thank you for reading!

Contributor: fredkozlowski
