Git Product home page Git Product logo

img2vec's Introduction

Image 2 Vec with PyTorch

Medium post on building the first version from scratch: https://becominghuman.ai/extract-a-feature-vector-for-any-image-with-pytorch-9717561d1d4c

Applications of image embeddings:

  • Ranking for recommender systems
  • Clustering images to different categories
  • Classification tasks
  • Image compression

Available models

  • Resnet-18 (CPU, GPU)
    • Returns vector length 512
  • Alexnet (CPU, GPU)
    • Returns vector length 4096

Installation

Tested on Python 3.6

Requires Pytorch: http://pytorch.org/

pip install img2vec_pytorch

Using img2vec as a library

from img2vec_pytorch import Img2Vec
from PIL import Image

# Initialize Img2Vec with GPU
img2vec = Img2Vec(cuda=True)

# Read in an image
img = Image.open('test.jpg')
# Get a vector from img2vec, returned as a torch FloatTensor
vec = img2vec.get_vec(img, tensor=True)
# Or submit a list
vectors = img2vec.get_vec(list_of_PIL_images)
For running the example, you will additionally need:
  • Pillow: pip install Pillow
  • Sklearn pip install scikit-learn

Running the example

git clone https://github.com/christiansafka/img2vec.git

cd img2vec/example

python test_img_to_vec.py

Expected output

Which filename would you like similarities for?
cat.jpg
0.72832 cat2.jpg
0.641478 catdog.jpg
0.575845 face.jpg
0.516689 face2.jpg

Which filename would you like similarities for?
face2.jpg
0.668525 face.jpg
0.516689 cat.jpg
0.50084 cat2.jpg
0.484863 catdog.jpg

Try adding your own photos!

Img2Vec Params

cuda = (True, False)   # Run on GPU?     default: False
model = ('resnet-18', 'alexnet')   # Which model to use?     default: 'resnet-18'

Advanced users


Additional Parameters

layer = 'layer_name' or int   # For advanced users, which layer of the model to extract the output from.   default: 'avgpool'
layer_output_size = int   # Size of the output of your selected layer

Defaults: (layer = 'avgpool', layer_output_size = 512)
Layer parameter must be an string representing the name of a layer below

conv1 = nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3, bias=False)
bn1 = nn.BatchNorm2d(64)
relu = nn.ReLU(inplace=True)
maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
layer1 = self._make_layer(block, 64, layers[0])
layer2 = self._make_layer(block, 128, layers[1], stride=2)
layer3 = self._make_layer(block, 256, layers[2], stride=2)
layer4 = self._make_layer(block, 512, layers[3], stride=2)
avgpool = nn.AvgPool2d(7)
fc = nn.Linear(512 * block.expansion, num_classes)

Defaults: (layer = 2, layer_output_size = 4096)
Layer parameter must be an integer representing one of the layers below

alexnet.classifier = nn.Sequential(
            7. nn.Dropout(),                  < - output_size = 9216
            6. nn.Linear(256 * 6 * 6, 4096),  < - output_size = 4096
            5. nn.ReLU(inplace=True),         < - output_size = 4096
            4. nn.Dropout(),		      < - output_size = 4096
            3. nn.Linear(4096, 4096),	      < - output_size = 4096
            2. nn.ReLU(inplace=True),         < - output_size = 4096
            1. nn.Linear(4096, num_classes),  < - output_size = 4096
        )

To-do

  • Benchmark speed and accuracy
  • Add ability to fine-tune on input data
  • Export documentation to a normal place

img2vec's People

Contributors

christiansafka avatar kapily avatar mdkozlowski avatar keherri avatar

Watchers

James Cloos avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.