
openfoodfacts-ai's Introduction

openfoodfacts-ai


❗ Before you read on

🔬 Projects

Here are different experiments.

Nutrition table

Category prediction

Weekly meetings

  • We e-meet Mondays at 17:00 Paris Time (16:00 London Time, 21:30 IST, 08:00 AM PT)
  • Google Meet Video call link: https://meet.google.com/qvv-grzm-gzb
  • Join by phone: https://tel.meet/qvv-grzm-gzb?pin=9965177492770
  • Add the event to your calendar by subscribing to the Open Food Facts community calendar
  • Weekly Agenda: please add the Agenda items as early as you can. Make sure to check the Agenda items in advance of the meeting, so that we have the most informed discussions possible.
  • The meeting will handle Agenda items first, and if time permits, collaborative bug triage.
  • We strive to timebox the core of the meeting (decision making) to 30 minutes, with an optional free discussion/live debugging afterwards.
  • We take comprehensive notes in the Weekly Agenda of agenda item discussions and of decisions taken.

Logos

Spellcheck

To be documented

  • ocr-cleaning (please add a description)
  • object-detection (related to logos and labels)

👷 Contributing

You can fork this repository and start your own experiments, or use a distinct repository. Please use an AGPL license, or a more permissive but compatible one.

Do not hesitate to join us on the #robotoff channel (or #computervision for work relating to images). We will be happy to help you get data, insights and other useful tips.

📚 More documentation

Contributors

List of contributors to this repository

openfoodfacts-ai's People

Contributors

ahirtz, alexgarel, antoinebonelli, antoineqian, dependabot[bot], gabrielbefr, raphael0202, rbournhonesque, sadokguermazi, sgrpanchal31, stephanegigandet, teolemon, wauplin, yichenzhu94


openfoodfacts-ai's Issues

Extract QR codes from photos

What

  • Similar to #11
  • A small number of products feature a QR code. Sometimes these are part of a "competition"; almost always they encode URLs.
  • By recognising QR codes, a simple regex could detect brands from the URLs, or we could automatically link to the product on the manufacturer's site.
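As a rough illustration of the brand-detection idea above, the sketch below assumes the QR code has already been decoded to a URL string. The `KNOWN_BRANDS` mapping and the `brand_from_qr_url` helper are hypothetical names used only for this example; a real list would presumably come from the Open Food Facts brand data.

```python
import re
from urllib.parse import urlparse

# Hypothetical lookup table (assumption): host token -> display brand name.
KNOWN_BRANDS = {"barilla": "Barilla", "danone": "Danone"}


def brand_from_qr_url(url):
    """Return a brand if a known brand token appears in the URL host, else None."""
    host = urlparse(url).hostname or ""
    # Split the host on dots and hyphens: "www.barilla.com" -> www, barilla, com
    for token in re.split(r"[.\-]", host):
        if token in KNOWN_BRANDS:
            return KNOWN_BRANDS[token]
    return None


print(brand_from_qr_url("https://www.barilla.com/it-it/prodotti"))
```

Matching on whole host tokens rather than on substrings avoids false positives from brand names that happen to appear inside longer words.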

Part of

Infer packaging from related products in the category

What

  • Perhaps a robotoff task is best, to allow human confirmation; but there are a number of categories where a product has certain physical properties that require a common style of packaging.

Examples

  • Milk comes in a bottle or a carton, made of plastic, glass or cardboard.
  • Eggs, frequently in a cardboard or plastic carton.
  • Pasta sauce is typically in a glass jar or bottle.

A lot of this can probably be guessed from looking at the most common packaging for a given (specific) category.

We'd also want a way to exclude categories that are really broad, like "Fruits".
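A minimal sketch of the "most common packaging per category" guess described above. The product dictionaries, the `BROAD_CATEGORIES` exclusion set and the `min_share` threshold are all assumptions made for illustration, not an existing Robotoff interface.

```python
from collections import Counter

# Hypothetical exclusion list (assumption): categories too broad to be useful.
BROAD_CATEGORIES = {"Fruits"}


def suggest_packaging(products, category, min_share=0.5):
    """Return the dominant packaging in `category`, or None if there is no clear winner."""
    if category in BROAD_CATEGORIES:
        return None
    counts = Counter(
        p["packaging"]
        for p in products
        if p.get("category") == category and p.get("packaging")
    )
    if not counts:
        return None
    packaging, n = counts.most_common(1)[0]
    # Only suggest when one packaging clearly dominates the category.
    return packaging if n / sum(counts.values()) >= min_share else None


sample = [
    {"category": "Pasta sauces", "packaging": "glass jar"},
    {"category": "Pasta sauces", "packaging": "glass jar"},
    {"category": "Pasta sauces", "packaging": "glass jar"},
    {"category": "Pasta sauces", "packaging": "bottle"},
]
print(suggest_packaging(sample, "Pasta sauces"))
```

The suggestion would still go through human confirmation; the threshold only limits how often an uncertain guess is shown.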

Part of

Remove backgrounds

What

  • Remove backgrounds. We currently have a basic technique, but it can be made much better.

Part of

typo in string

You must enter the characters of the barcode or send a product image when the barcode is visible.

should read:

You must enter the characters of the barcode or send a product image where the barcode is visible.

Recognize labels and logos

Recognize labels and logos on packaging.

NB: we have already used Google Cloud Vision to do that on part of the database. Some interesting results, but we could benefit from a more specialized solution (we have the clean images of the labels on the Wiki, but no brand dataset)

Extract data from receipts/bills and enrich OFF with prices notion

I read somewhere that table recognition is on the roadmap.
Once it is ready, we could scan receipts ("bills") or invoices to extract product prices by brand, store and date.

With shared price information, price comparators and other apps would become possible:

  • See the price of a product over time
  • Compare store margins
  • Detect price fluctuations
  • Compare average prices across categories
  • Compare average category prices between countries
  • An app that works out the best pair of stores for the grocery list you need for your next dinner, built from the recurring products in your past receipt scans
    ... and so on

Privacy would need to be discussed, though!
Maybe a mix of anonymous price data and a way to keep the pictures/OCR/data local on the user's device (i.e. let app owners run OCR locally).

My 2 cents

API search results differ from Open Food Facts website search

Hello,

Firstly, this really is a great service and the API is working fantastically well, so thank you.

When searching using the API, for example for "orange juice", the results come back ordered by unique_scans_n, which is great:

https://world.openfoodfacts.org/cgi/search.pl?search_terms=Orange%20juice&search_simple=1&action=process&json=1&page_size=100

Searching for "Orange juice" in the Open Food Facts website search box brings back different (possibly better?) search results. This is despite the fact that the URL looks identical to mine, for example:

https://world.openfoodfacts.org/cgi/search.pl?search_terms=orange+juice&search_simple=1&action=process

Is there any way to get the same results back from the API? &sort_by=unique_scans_n doesn't make a difference as it appears they are already sorted by this value by default.
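To make the two requests easier to compare, here is a small sketch that builds the API URL from the query parameters quoted above. `build_search_url` is a hypothetical helper, not part of any Open Food Facts client, and it only reproduces the parameters already shown in this issue; it does not explain why the website ranks results differently.

```python
from urllib.parse import urlencode

BASE_URL = "https://world.openfoodfacts.org/cgi/search.pl"


def build_search_url(terms, page_size=100, sort_by=None):
    """Build a search URL using only the parameters quoted in this issue."""
    params = {
        "search_terms": terms,
        "search_simple": 1,
        "action": "process",
        "json": 1,
        "page_size": page_size,
    }
    if sort_by:
        params["sort_by"] = sort_by
    # urlencode encodes spaces as "+", like the website URL above.
    return f"{BASE_URL}?{urlencode(params)}"


print(build_search_url("Orange juice"))
```

Diffing the output against the URL in the browser address bar shows that the visible difference is limited to `json` and `page_size`, which suggests the ranking difference comes from something not visible in the query string.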

Thanks in advance

Detect country of origin automatically from label or logo

What

Part of

A more specific version of #7

There are two main variants of "Made in (Country)" labelling which seem common

The plain text wording is likely the easiest to handle: it would simply be a pipeline looking for a country name preceded by hint text, which together provide a strong indicator of the country of origin.
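The plain text variant could be sketched roughly as follows, assuming OCR text is already available. The country list and the hint phrases are illustrative assumptions; a real pipeline would need a full, multilingual country list.

```python
import re

# Illustrative assumptions: a tiny country list and a few English hint phrases.
COUNTRIES = ["France", "Italy", "Spain", "Germany", "United Kingdom"]
country_alt = "|".join(re.escape(c) for c in COUNTRIES)
PATTERN = re.compile(
    r"\b(?:made in|product of|produced in)\s+(" + country_alt + r")\b",
    re.IGNORECASE,
)


def detect_origin(ocr_text):
    """Return the canonical country name following a hint phrase, or None."""
    match = PATTERN.search(ocr_text)
    if match is None:
        return None
    found = match.group(1).lower()
    # Map the matched text back to its canonical spelling.
    return next(c for c in COUNTRIES if c.lower() == found)


print(detect_origin("Ingredients: durum wheat. MADE IN ITALY."))
```

Requiring the hint text next to the country name keeps the detector from firing on country names that appear in addresses or ingredient origins.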

Score contributors

What

  • Score whether a contributor makes good or bad contributions (will enable edit scoring)

Part of

Extract product quantity, units automatically

Almost all products feature a numeric quantity and unit. Extracting them would mostly be a matter of text extraction, followed by filtering the results with a regexp of common units.

Importantly, this could be used as a broad indicator for #29 - fluids (ml, L, etc) vs solids (g, kg) require different container mechanisms
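A minimal sketch of the regexp filtering step, assuming the label text has already been extracted by OCR. The unit list is deliberately short and the fluid/solid flag is only the broad indicator mentioned above; `extract_quantity` is a hypothetical helper name.

```python
import re

# Assumed short unit list; longer alternatives come first so "kg" is not read as "g".
QUANTITY_RE = re.compile(r"(\d+(?:[.,]\d+)?)\s*(kg|g|ml|cl|l)\b", re.IGNORECASE)
FLUID_UNITS = {"ml", "cl", "l"}


def extract_quantity(text):
    """Return (value, unit, is_fluid) for the first quantity found, or None."""
    match = QUANTITY_RE.search(text)
    if match is None:
        return None
    # Accept both "1.5" and the comma decimal "1,5".
    value = float(match.group(1).replace(",", "."))
    unit = match.group(2).lower()
    return value, unit, unit in FLUID_UNITS


print(extract_quantity("Net weight: 500 g"))
print(extract_quantity("1,5 L"))
```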

Flatten images

What

  • Not AI per se, but we need to flatten images (on bottles, for instance) to enable better recognition in other projects.

Part of

Surface interesting charts

What

  • Currently people have to experiment, or use scientific knowledge to create charts on Open Food Facts.
  • See if we can surface charts with interesting shapes, outliers… automatically.
  • Humans could potentially then review those charts, how interesting they are…
  • We could use past chart usage as training data, or just generate images of results, and compare them to classic scientific distributions (bimodal…)

Potential tasklist

  • Generate a dataset of all the chart queries ever made on Open Food Facts
  • Extract all parameters
  • See if some charts can already be suggested easily because they have been shared/recreated several times
  • See if suggesting some charts could be problematic
  • See if more suggestions can be synthesized
  • Create a suggestion engine in Robotoff
  • Create thoughtful product opener / mobile integrations

Part of

Enable whitelisting specific faces

Many cloud computer vision services offer face detection algorithms, but they might find a face printed on packaging while we're looking for faces of users.
Allow whitelisting "known faces" and flagging actual faces.

Score edits

What

  • Predict if an edit is likely good or bad

Part of

Extract all barcodes (even when blurry and non-readable)

What

  • They have a very specific shape, and very predictable data (if they can be recognized using the zxing library, or using OCR).
  • Sometimes both zxing and OCR will fail. Being able to recognize that something looks like a barcode, even though it is not legible, is useful in many ways.
  • Flagging products for human review, nudging the contributor of the photo, quality checks…
  • It would be nice to have a bounded crop of the barcode image, as well as the extracted data.

Part of

Try to detect the name based on the size of the text

A simplistic assumption: the name of the product should be the largest text on the front of the product.
Based on this, we could compute the ratio between the area of each bounding box (width times height) and the number of letters inside it.
Based on this ratio, we could have candidates for the product name
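A rough sketch of this heuristic, assuming the OCR step returns one entry per text block with its text and bounding-box size in pixels. The dictionary keys and the `name_candidates` helper are hypothetical, chosen only for this example.

```python
def name_candidates(boxes, top_n=3):
    """Rank OCR text blocks by bounding-box area per letter, largest first."""
    scored = []
    for box in boxes:
        letters = sum(ch.isalpha() for ch in box["text"])
        if letters == 0:
            continue  # skip blocks with no letters, such as pure numbers
        area_per_letter = box["width"] * box["height"] / letters
        scored.append((area_per_letter, box["text"]))
    scored.sort(reverse=True)
    return [text for _, text in scored[:top_n]]


boxes = [
    {"text": "Choco Spread", "width": 700, "height": 200},
    {"text": "Hazelnut spread with cocoa", "width": 600, "height": 40},
    {"text": "400", "width": 50, "height": 20},
]
print(name_candidates(boxes))
```

Dividing by the letter count approximates the font size, so a long line of small text does not outrank a short line of large text.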

Autocrop ingredient lists

All products have ingredient lists.
Many products have ingredient lists in more than one language.

We'd need a system to bound ingredients lists, and ingredient lists in each language. That way we could autoselect an ingredient list image for many products.

Extract Contributing profiles

Analyse which kinds of contributions each contributor likes most, and suggest that they contribute more in that direction or try something new.
