fk-visual-search

This code allows you to train the Visnet model. Visnet, trained on Flipkart's proprietary internal dataset, powers Visual Recommendations at Flipkart. On the publicly available Street2Shop dataset, Visnet achieves state-of-the-art results. Details are in the arXiv tech report.

In this repo, we have open-sourced the following:

Training prototxts of Visnet
Triplet sampling code, to generate the training files
A CUDA-based fast K-Nearest Neighbor Search library
Other auxiliary scripts, such as code to process the Street2Shop dataset, sample triplets, etc.

We soon plan to add other useful scripts, such as:

Our modifications to Caffe: the image augmentation layer and the triplet accuracy layer, which aid the training of Visnet

Visnet Architecture

Visnet is a Convolutional Neural Network (CNN) trained using the triplet-based deep ranking paradigm. It contains a deep CNN modelled after the VGG-16 network, coupled with parallel shallow convolution layers, in order to capture both high-level and low-level image details simultaneously.
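To make the triplet-based ranking idea concrete, here is a minimal sketch of a hinge-style triplet loss on embedding vectors. This README does not spell out the loss, so the Euclidean distance, the default margin of 1.0, and the function names are assumptions for illustration, not the definition used in the training prototxts.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_hinge_loss(q, p, n, margin=1.0):
    """Hinge loss for one triplet of embeddings.

    The loss is zero once the query q is closer to the positive p than
    to the negative n by at least `margin` (an assumed value).
    """
    return max(0.0, margin + euclidean(q, p) - euclidean(q, n))


if __name__ == "__main__":
    # Positive is near, negative is far: the triplet is already ranked correctly.
    print(triplet_hinge_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0]))  # 0.0
    # Positive and negative swapped: the triplet is penalised.
    print(triplet_hinge_loss([0.0, 0.0], [3.0, 4.0], [0.0, 1.0]))  # 5.0
```

In training, a loss of this kind is averaged over a batch of triplets, and its gradient pulls q toward p and pushes it away from n.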

Training

To train, you need a set of triplets <q,p,n> (query, positive and negative images). For compatibility with Caffe's ImageData layer, you need 3 triplet files (one each for q, p and n). The lines in those files must be aligned, i.e. line i in each file should correspond to the i'th triplet.
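The aligned-files layout described above can be sketched as follows. Caffe's ImageData layer reads one "path label" pair per line; the output file names (q.txt, p.txt, n.txt) and the dummy label 0 are assumptions for illustration and may differ from what scripts/sampler.py produces.

```python
import os


def write_triplet_files(triplets, out_dir, dummy_label=0):
    """Write three aligned list files, one each for q, p and n.

    Line i of every file belongs to the i'th triplet. File names and the
    dummy label are hypothetical choices, not taken from the repo.
    """
    paths = [os.path.join(out_dir, name) for name in ("q.txt", "p.txt", "n.txt")]
    handles = [open(path, "w") for path in paths]
    try:
        for triplet in triplets:
            if len(triplet) != 3:
                raise ValueError("each triplet must be (q, p, n)")
            # Write one line per file so the three files stay aligned.
            for handle, image_path in zip(handles, triplet):
                handle.write("%s %d\n" % (image_path, dummy_label))
    finally:
        for handle in handles:
            handle.close()
    return paths
```

Writing all three lines inside the same loop iteration is what keeps the files aligned; generating them in separate passes makes it easy to introduce an off-by-one shift that silently corrupts every triplet after it.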

If you wish to train Visnet on the Street2Shop dataset, you need to:

Download the Street2Shop dataset (this contains only the image URLs)
Download the Street2Shop images (have a look at scripts/image_downloader.py)
Format the data using scripts/create_structured_images.py and scripts/create_wtbi_crops.py
Use scripts/sampler.py to sample the triplet files
Change visnet/train.prototxt to point to the location of your triplet files
Run training using Caffe
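For readers who want a feel for the sampling step, below is a minimal sketch of uniform random triplet sampling. It is not the logic of scripts/sampler.py, which may use a different strategy; the two input dictionaries (street photo to product id, product id to shop images) and all names are hypothetical.

```python
import random


def sample_triplets(street_to_product, product_to_shop_images, num_triplets, seed=0):
    """Sample (q, p, n) triplets uniformly at random.

    q: a street photo; p: a shop image of the same product;
    n: a shop image of a different product.
    Both input mappings are assumed structures, not the repo's format.
    """
    rng = random.Random(seed)
    queries = sorted(street_to_product)
    products = sorted(product_to_shop_images)
    if len(products) < 2:
        raise ValueError("need at least two products to draw a negative")
    triplets = []
    for _ in range(num_triplets):
        q = rng.choice(queries)
        positive_product = street_to_product[q]
        p = rng.choice(product_to_shop_images[positive_product])
        # Draw the negative from any product other than the query's own.
        negative_product = rng.choice([x for x in products if x != positive_product])
        n = rng.choice(product_to_shop_images[negative_product])
        triplets.append((q, p, n))
    return triplets
```

Uniform sampling is the simplest baseline; it tends to produce many easy negatives, which is one reason samplers often bias toward visually similar negatives instead.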

Feature extraction and NN Search

We provide PyCaffe code for feature extraction (scripts/feature_extractor.py), and a CUDA-based fast nearest-neighbor (NN) search (scripts/cuda_knn.py).
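As a point of reference for what the NN search computes, here is a minimal brute-force CPU sketch over extracted feature vectors. It does not reflect the interface or implementation of scripts/cuda_knn.py; the function name, the Euclidean metric, and the return format are assumptions.

```python
import heapq
import math


def knn_search(query, database, k):
    """Return the k nearest database vectors as (distance, index) pairs.

    Brute force: computes the Euclidean distance from `query` to every
    vector in `database` and keeps the k smallest, nearest first.
    """
    def distance(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    scored = ((distance(query, vector), index) for index, vector in enumerate(database))
    return heapq.nsmallest(k, scored)
```

A GPU implementation computes the same distances, but in parallel across the whole database, which is what makes search over a large catalogue practical.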
