Image captioning using pretrained inception V3 convolution networks and LSTM recurrent networks.
The model is trained on a dataset of 8000 images and for few epochs. But if you wish to get better results increase the number of data points in dataset and number of epochs.