This repository contains the code for the Chrome extension "Image Caption Generator" and its backend.
The model used in the backend is described in the CVPR 2015 paper "Show and Tell: A Neural Image Caption Generator".
The application is a Google Chrome extension that generates a caption describing the contents of an image. Captioning is done by a deep learning network that runs on a cloud server (Heroku). The generated caption is converted into speech by JavaScript's Web Speech API (the SpeechSynthesisUtterance interface).
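The text-to-speech step can be illustrated with a minimal TypeScript sketch. It is not taken from the extension's source: the helper name `speakCaption` and the `en-US` language setting are assumptions made for illustration. Only `speechSynthesis` and `SpeechSynthesisUtterance` are part of the Web Speech API itself.

```typescript
// Hypothetical helper (not from the extension's code): reads a caption aloud
// using the browser's Web Speech API. Returns false when the API is missing,
// for example outside a browser.
function speakCaption(caption: string): boolean {
  const g = globalThis as any;
  if (!g.speechSynthesis || !g.SpeechSynthesisUtterance) {
    return false;
  }
  const utterance = new g.SpeechSynthesisUtterance(caption);
  utterance.lang = "en-US"; // assumed language; the extension may differ
  g.speechSynthesis.cancel(); // stop any caption that is still being read
  g.speechSynthesis.speak(utterance);
  return true;
}
```

In a page or content script, `speakCaption("a dog sitting on a couch")` would queue the sentence for playback through the browser's default voice.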
- Open Google Chrome and go to Extensions (under More Tools).
- Select "Load unpacked extension".
- Select the folder "image-caption Extension" which is inside the directory "Code Of Google Chrome Extension".
The extension is now installed.
If you want to run the backend on a local system, a trained model is required. A trained model can be downloaded from this link. Once it is downloaded, extract the files into the directory "Code Running In Heroku Cloud Server/image-caption/saved_models".
- Right-click the image whose description you want to view.
- Select "Get Image Description" from the menu.
- The description is displayed on an overlay. The text is also converted to audio.
- Press Escape to exit the overlay.
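The right-click entry used in the steps above could be registered along the following lines. This is a sketch under stated assumptions, not the extension's actual code: `MENU_ID`, `ContextMenusLike`, `ClickInfoLike` and `registerCaptionMenu` are hypothetical names, and the interfaces only mirror the small part of Chrome's `chrome.contextMenus` API (`create`, `onClicked.addListener`, and the `srcUrl` field of the click info) that the sketch needs.

```typescript
// Hypothetical identifier for the menu entry.
const MENU_ID = "get-image-description";

// Minimal shapes mirroring the parts of chrome.contextMenus used here.
interface ClickInfoLike {
  menuItemId: string | number;
  srcUrl?: string; // URL of the right-clicked image, when there is one
}
interface ContextMenusLike {
  create(props: { id: string; title: string; contexts: string[] }): void;
  onClicked: { addListener(cb: (info: ClickInfoLike) => void): void };
}

// Adds a "Get Image Description" entry that appears only on images and
// forwards the image URL to the supplied callback when it is selected.
function registerCaptionMenu(
  menus: ContextMenusLike,
  onImage: (imageUrl: string) => void
): void {
  menus.create({
    id: MENU_ID,
    title: "Get Image Description",
    contexts: ["image"],
  });
  menus.onClicked.addListener((info) => {
    if (info.menuItemId === MENU_ID && info.srcUrl) {
      onImage(info.srcUrl);
    }
  });
}

// In the extension's background script, one would presumably pass the real
// API object, e.g. registerCaptionMenu(chrome.contextMenus, requestCaption),
// where requestCaption is a hypothetical function that contacts the server.
```

Passing the menu API in as a parameter keeps the sketch independent of the browser, so the click handling can be exercised with a stub object.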
- Options shown when an image is right-clicked
- Overlay displayed after selecting the "Get Image Description" option
- Response from the cloud server when there is no error
- Response from the cloud server when there is an error
- The extension can't be used on protected images. For example, it doesn't work on images on Facebook, because they are protected.
- The description for some pictures may not be accurate.