Automatic-Image-Captioning

A computer program that takes an image as input and produces a relevant caption as output.
We must first understand how important this problem is in real-world scenarios. Here are a few applications where a solution to this problem can be very useful:

* Self-driving cars: Automated driving is one of the biggest open challenges, and properly captioning the scene around the car can give a boost to the self-driving system.
* Aid to the blind: We can create a product for blind people that guides them while travelling on the roads without anyone else's support. We can do this by first converting the scene into text and then converting the text to voice. Both steps are now well-known applications of deep learning. Nvidia Research, for example, is trying to create such a product.
* CCTV monitoring: CCTV cameras are everywhere today. If, along with recording the scene, we can also generate relevant captions, then we can raise an alarm as soon as malicious activity is detected somewhere. This could help reduce crime and accidents.
* Image search: Automatic captioning can help make Google Image Search as good as Google Search, because every image could first be converted into a caption and the search could then be performed on the caption.
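The caption-based search idea from the last application can be sketched in a few lines of Python. This is only a minimal illustration and not part of this project: the captions are hard-coded stand-ins for what a captioning model might produce, the file names other than `surfing.png` are made up, and the helper names `build_index` and `search` are hypothetical.

```python
from collections import defaultdict

# Hypothetical captions, standing in for the output of a captioning model.
CAPTIONS = {
    "surfing.png": "a surfer riding on a wave",
    "dog.png": "a dog running on the beach",
    "street.png": "a car parked on a city street",
}


def build_index(captions):
    """Map each lowercase caption word to the set of images whose caption contains it."""
    index = defaultdict(set)
    for image, caption in captions.items():
        for word in caption.lower().split():
            index[word].add(image)
    return index


def search(index, query):
    """Return the sorted images whose captions contain every word of the query."""
    words = query.lower().split()
    if not words:
        return []
    # Intersect the image sets of all query words; unknown words match nothing.
    matches = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(matches)


index = build_index(CAPTIONS)
print(search(index, "surfer wave"))
print(search(index, "beach dog"))
```

Exact word matching is the simplest possible choice here. A real system would replace `CAPTIONS` with generated captions and would probably need a more forgiving matching scheme than exact word overlap.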
Given an image like the example below, our goal is to generate a caption such as "a surfer riding on a wave".
See the image in surfing.png