fk-visual-search

This code allows you to train the Visnet model. Visnet, trained on Flipkart's proprietary internal dataset, powers Visual Recommendations at Flipkart. On the publicly available Street2Shop dataset, Visnet achieves state-of-the-art results. See the arXiv tech report for details.

In this Repo, we have open-sourced the following:

  • Training prototxts of Visnet
  • Triplet sampling code, to generate the training files
  • A CUDA based fast K-Nearest Neighbor Search library
  • Other auxiliary scripts, such as code to process the Street2Shop dataset and to sample triplets

We soon plan to add other useful scripts, such as:

  • Our modifications to Caffe: the image augmentation layer and the triplet accuracy layer, which aid the training of Visnet

Visnet Architecture

Visnet is a Convolutional Neural Network (CNN) trained using the triplet-based deep ranking paradigm. It contains a deep CNN modelled after the VGG-16 network, coupled with parallel shallow convolution layers, in order to capture both high-level and low-level image details simultaneously.
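To make the triplet-based ranking idea concrete, here is a minimal sketch of a hinge-style triplet loss on embedding vectors. It is an illustration only: the squared Euclidean distance, the `margin` value, and the function names are assumptions, not taken from the Visnet prototxts.

```python
def euclidean_sq(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_hinge_loss(q, p, n, margin=1.0):
    """Hinge loss over one triplet of embeddings.

    The loss is zero once the negative n is farther from the query q
    than the positive p by at least `margin` (an assumed value).
    """
    return max(0.0, margin + euclidean_sq(q, p) - euclidean_sq(q, n))
```

A triplet whose negative is already far from the query contributes no loss, so training effort goes to triplets the network currently ranks incorrectly or too narrowly.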

Training

To train, you need a set of triplets <q,p,n>: a query image q, a positive image p that should rank close to q, and a negative image n that should rank farther away. For compatibility with Caffe's ImageData layer, you need 3 triplet files (one each for q, p and n). The lines in those files must stay aligned, i.e. line i in each file belongs to the i'th triplet.
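The aligned-file layout can be sketched as follows. This is not the repo's sampler, only an illustration: the function name, the output file names (`q.txt`, `p.txt`, `n.txt`) and the `dummy_label` are assumptions. Caffe's ImageData layer reads one `path label` pair per line, so a placeholder label is written next to each image path.

```python
import os


def write_triplet_files(triplets, out_dir, dummy_label=0):
    """Write q.txt, p.txt and n.txt so that line i of each file is triplet i.

    `triplets` is an iterable of (q_path, p_path, n_path) tuples.
    File names and the dummy label are illustrative assumptions.
    """
    os.makedirs(out_dir, exist_ok=True)
    paths = {name: os.path.join(out_dir, name + ".txt") for name in ("q", "p", "n")}
    handles = {name: open(path, "w") for name, path in paths.items()}
    try:
        for q, p, n in triplets:
            # One line per file per triplet keeps the three files aligned.
            for name, image in (("q", q), ("p", p), ("n", n)):
                handles[name].write(f"{image} {dummy_label}\n")
    finally:
        for handle in handles.values():
            handle.close()
    return paths
```

The alignment only holds if the three files are read in the same order, so none of the three data layers should shuffle its file independently.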

If you wish to train Visnet on the Street2Shop dataset, you need to:

  1. Download the Street2Shop dataset (this contains only the image URLs)

  2. Download the Street2Shop images (have a look at scripts/image_downloader.py)

  3. Format the data using scripts/create_structured_images.py and scripts/create_wtbi_crops.py

  4. Use scripts/sampler.py to sample the triplet files

  5. Change visnet/train.prototxt to point to the location of your triplet files

  6. Run training using Caffe
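For intuition about step 4, here is a deliberately naive triplet sampler. It does not reproduce the logic of scripts/sampler.py and ignores any dataset-specific structure; the `images_by_product` mapping (product id to list of image paths), the function name, and the uniform random sampling are all assumptions made for illustration.

```python
import random


def sample_triplets(images_by_product, num_triplets, seed=0):
    """Return (q, p, n) tuples: q and p share a product, n is from another one.

    `images_by_product` maps a product id to a list of image paths
    (a hypothetical structure, not the repo's data format).
    """
    rng = random.Random(seed)
    # Products with at least two images can supply a (q, p) pair.
    anchors = [pid for pid, imgs in images_by_product.items() if len(imgs) >= 2]
    # Any product with at least one image can supply a negative.
    nonempty = [pid for pid, imgs in images_by_product.items() if imgs]
    if not anchors or len(nonempty) < 2:
        raise ValueError("need a product with 2+ images and at least 2 non-empty products")
    triplets = []
    for _ in range(num_triplets):
        pos_id = rng.choice(anchors)
        q, p = rng.sample(images_by_product[pos_id], 2)
        neg_id = rng.choice([pid for pid in nonempty if pid != pos_id])
        n = rng.choice(images_by_product[neg_id])
        triplets.append((q, p, n))
    return triplets
```

Uniformly random negatives are usually easy to rank, so a practical sampler would likely favour harder negatives; this sketch only shows the shape of the output.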

Feature extraction and NN Search

We provide PyCaffe code for feature extraction (scripts/feature_extractor.py), and a CUDA-based fast nearest-neighbor (NN) search (scripts/cuda_knn.py).
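As a reference for what a k-nearest-neighbor search over extracted features computes, here is a small brute-force CPU sketch. It is not the interface of scripts/cuda_knn.py; the function name, the squared Euclidean metric, and the list-of-lists feature format are assumptions.

```python
import heapq


def knn(query, database, k):
    """Return the k nearest database vectors as (squared_distance, index) pairs.

    Brute force: every database vector is compared against the query.
    Results are sorted from nearest to farthest.
    """
    def dist_sq(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    scored = ((dist_sq(query, vec), i) for i, vec in enumerate(database))
    return heapq.nsmallest(k, scored)
```

For example, `knn([0, 0], [[5, 5], [1, 0], [0, 2]], 2)` returns `[(1, 1), (4, 2)]`. A GPU implementation computes the same distances in parallel, which matters once the catalogue holds millions of feature vectors.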
