ibm / max-image-caption-generator-web-app

Create a web app to interact with machine learning generated image captions

Home Page: https://developer.ibm.com/patterns/create-a-web-app-to-interact-with-machine-learning-generated-image-captions/

License: Apache License 2.0

Python 29.98% CSS 6.50% JavaScript 50.13% HTML 11.94% Dockerfile 1.46%
ibmcode ai deep-learning python tornado web-app codait model-asset-exchange

max-image-caption-generator-web-app's Introduction

Create a web app to interact with machine learning generated image captions

According to an IBM study, 2.5 quintillion bytes of data are created every day. Much of that data is unstructured: large texts, audio recordings, and images. To do something useful with it, we must first convert it to structured data.

In this Code Pattern we will use one of the models from the Model Asset Exchange (MAX), an exchange where developers can find and experiment with open source deep learning models. Specifically, we will use the Image Caption Generator to create a web application that captions images and lets the user filter images based on their content. The web application provides an interactive user interface backed by a lightweight Python server using Tornado. The server takes in images via the UI, sends them to the model's REST endpoint, and displays the generated captions on the UI. The model's REST endpoint is set up using the Docker image provided on MAX. The Web UI displays the generated captions for each image as well as an interactive word cloud for filtering images based on their captions.
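
To make the word cloud filtering idea concrete, here is a minimal sketch in Python. It is not the web app's actual implementation (the app does this in the browser with D3.js and d3-cloud); the function names, the stop-word list, and the sample file names are assumptions made for illustration only.

```python
import re
from collections import Counter

# Hypothetical stop-word list; the real app may filter words differently.
STOP_WORDS = {"a", "an", "the", "of", "on", "in", "at", "with", "and", "to"}


def caption_words(caption):
    """Return the set of meaningful lowercase words in one caption."""
    return {w for w in re.findall(r"[a-z]+", caption.lower()) if w not in STOP_WORDS}


def word_counts(captions_by_image):
    """Count in how many image captions each word appears (word cloud weights)."""
    counts = Counter()
    for caption in captions_by_image.values():
        counts.update(caption_words(caption))
    return counts


def filter_images(captions_by_image, word):
    """Return the sorted names of images whose caption contains the word."""
    return sorted(
        name
        for name, caption in captions_by_image.items()
        if word.lower() in caption_words(caption)
    )


captions = {
    "surf.jpg": "a man riding a wave on top of a surfboard .",
    "beach.jpg": "a person riding a surf board on a wave",
}
print(word_counts(captions).most_common(3))
print(filter_images(captions, "surfboard"))
```

Clicking a word in the cloud would then correspond to calling something like `filter_images` with that word and showing only the returned images.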

When the reader has completed this Code Pattern, they will understand how to:

  • Build a Docker image of the Image Caption Generator MAX Model
  • Deploy a deep learning model with a REST endpoint
  • Generate captions for an image using the MAX Model's REST API
  • Run a web application that uses the model's REST API

Architecture

Flow

  1. Server sends default images to Model API and receives caption data.
  2. User interacts with Web UI containing default content and uploads image(s).
  3. Web UI requests caption data for image(s) from Server and updates content when data is returned.
  4. Server sends image(s) to Model API and receives caption data to return to Web UI.

Included Components

  • IBM Model Asset Exchange: A place for developers to find and use free and open source deep learning models.
  • Docker: Docker is a tool designed to make it easier to create, deploy, and run applications by using containers.

Featured Technologies

  • Python: Python is a programming language that lets you work more quickly and integrate your systems more effectively.
  • jQuery: jQuery is a cross-platform JavaScript library designed to simplify the client-side scripting of HTML.
  • Bootstrap 3: Bootstrap is a free and open-source front-end library for designing websites and web applications.
  • Pexels: Pexels provides high quality and completely free stock photos licensed under the Creative Commons Zero (CC0) license.

Watch the Video

The following is a talk at Spark+AI Summit 2018 about MAX that includes a short demo of the web app.

Steps

Ways to run the code pattern:

Deploy to IBM Cloud

Deploy the Model

Follow the Deploy the Model Doc to deploy the Image Caption Generator model to IBM Cloud. If you already have a model API endpoint available you can skip this process.

Note: Deploying the model can take time. To get going faster, you can try running locally.

Deploy the Web App

  1. Press the Deploy to IBM Cloud button. If you do not have an IBM Cloud account yet, you will need to create one.

    Deploy to IBM Cloud

  2. Click Delivery Pipeline and click the Create + button in the form to generate an IBM Cloud API Key for the web app.

    Create API Key

  3. Once the API key is generated, the Region, Organization, and Space form sections will populate. Fill in the Image Caption Generator Model API Endpoint section with the endpoint deployed above, then click on Create.

    The format for this entry should be http://170.0.0.1:5000

    Create App

  4. In Toolchains, click on Delivery Pipeline to watch while the app is deployed. Once deployed, the app can be viewed by clicking View app.

    Delivery Pipeline

Deploy on Kubernetes

You can also deploy the model and web app on Kubernetes using the latest docker images on Quay.

On your Kubernetes cluster, run the following commands:

kubectl apply -f https://raw.githubusercontent.com/IBM/MAX-Image-Caption-Generator/master/max-image-caption-generator.yaml
kubectl apply -f https://raw.githubusercontent.com/IBM/MAX-Image-Caption-Generator-Web-App/master/max-image-caption-generator-web-app.yaml

The web app will be available at port 8088 of your cluster. The model will only be available internally, but can be accessed externally through the NodePort.

Note: For deploying the web app on IBM Cloud it is recommended to follow the Deploy to IBM Cloud instructions above rather than deploying with IBM Cloud Kubernetes Service.

Run Locally

NOTE: These steps are only needed when running locally instead of using the Deploy to IBM Cloud button.

Setting up the MAX Model

  1. Deploy the Model
  2. Experimenting with the API (Optional)

Starting the Web App

  1. Check out the code
  2. Installing dependencies
  3. Running the server
  4. Configuring ports (Optional)
  5. Instructions for Docker (Optional)

Setting up the MAX Model

NOTE: The set of instructions in this section is a modified version of the one found on the Image Caption Generator Project Page.

1. Deploy the Model

To run the docker image, which automatically starts the model serving API, run:

docker run -it -p 5000:5000 quay.io/codait/max-image-caption-generator

This will pull a pre-built image from Quay (or use an existing image if already cached locally) and run it. If you'd rather build the model locally you can follow the steps in the model README.

Note that currently this docker image is CPU only (we will add support for GPU images later).

2. Experimenting with the API (Optional)

The API server automatically generates an interactive Swagger documentation page. Go to http://localhost:5000 to load it. From there you can explore the API and also create test requests.

Use the model/predict endpoint to load a test file and get captions for the image from the API.

The model samples folder contains a few images you can use to test out the API, or you can use your own.

You can also test it on the command line, for example:

curl -F "image=@path/to/image.jpg" -X POST http://localhost:5000/model/predict
{
  "status": "ok",
  "predictions": [
    {
      "index": "0",
      "caption": "a man riding a wave on top of a surfboard .",
      "probability": 0.038827644239537
    },
    {
      "index": "1",
      "caption": "a person riding a surf board on a wave",
      "probability": 0.017933410519265
    },
    {
      "index": "2",
      "caption": "a man riding a wave on a surfboard in the ocean .",
      "probability": 0.0056628732021868
    }
  ]
}
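
The same request can be made from Python. The sketch below uses only the standard library and sets the file part's MIME type explicitly; the helper names (`build_multipart`, `best_caption`, `caption_image`) are hypothetical and are not part of the model or the web app. It assumes the model is listening at http://localhost:5000 and returns JSON in the shape shown above.

```python
import json
import mimetypes
import os
import urllib.request
import uuid


def build_multipart(field, filename, data):
    """Encode one file as multipart/form-data; returns (body, content_type)."""
    boundary = uuid.uuid4().hex
    # Guess the MIME type from the file name, with a generic fallback.
    mime = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {mime}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def best_caption(response):
    """Return the highest-probability caption from a parsed API response."""
    top = max(response["predictions"], key=lambda p: p["probability"])
    return top["caption"]


def caption_image(path, endpoint="http://localhost:5000/model/predict"):
    """POST an image file to the model API and return the parsed JSON reply."""
    with open(path, "rb") as f:
        body, content_type = build_multipart("image", os.path.basename(path), f.read())
    request = urllib.request.Request(
        endpoint, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(request) as reply:
        return json.load(reply)
```

With the model container running, `best_caption(caption_image("path/to/image.jpg"))` would return the top caption for that image.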

Starting the Web App

1. Check out the code

Clone the Image Caption Generator Web App repository locally by running the following command:

git clone https://github.com/IBM/MAX-Image-Caption-Generator-Web-App

Note: You may need to cd .. out of the MAX-Image-Caption-Generator directory first

Then change directory into the local repository

cd MAX-Image-Caption-Generator-Web-App

2. Installing dependencies

Before running this web app you must install its dependencies:

pip install -r requirements.txt

3. Running the server

You then start the web app by running:

python app.py

Once it's finished processing the default images (< 1 minute) you can then access the web app at: http://localhost:8088

The Image Caption Generator endpoint must be available at http://localhost:5000 for the web app to successfully start.

4. Configuring ports (Optional)

If you want to use a different port or are running the ML endpoint at a different location you can change them with command-line options:

python app.py --port=[new port] --ml-endpoint=[endpoint url including protocol and port]
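
As an illustration of how these two options fit together, here is a minimal sketch using `argparse`. This is an assumption for demonstration purposes: app.py may parse its options with a different library, and `parse_options` and `predict_url` are hypothetical names. The defaults (port 8088, endpoint http://localhost:5000) and the `/model/predict` path are taken from this README.

```python
import argparse


def parse_options(argv=None):
    """Parse --port and --ml-endpoint, then derive the prediction URL."""
    parser = argparse.ArgumentParser(description="Image caption web app (sketch)")
    parser.add_argument("--port", type=int, default=8088,
                        help="port the web server listens on")
    parser.add_argument("--ml-endpoint", default="http://localhost:5000",
                        help="model endpoint URL including protocol and port")
    options = parser.parse_args(argv)
    # Tolerate a trailing slash so the joined URL has exactly one separator.
    options.predict_url = options.ml_endpoint.rstrip("/") + "/model/predict"
    return options


options = parse_options(["--port=9000", "--ml-endpoint=http://example.com:5000/"])
print(options.port, options.predict_url)
```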

5. Instructions for Docker (Optional)

To run the web app with Docker the containers running the web server and the REST endpoint need to share the same network stack. This is done in the following steps:

Modify the command that runs the Image Caption Generator REST endpoint to map an additional port in the container to a port on the host machine. In the example below it is mapped to port 8088 on the host but other ports can also be used.

docker run -it -p 5000:5000 -p 8088:8088 --name max-image-caption-generator quay.io/codait/max-image-caption-generator

Build the web app image by running:

docker build -t max-image-caption-generator-web-app .

Run the web app container using:

docker run --net='container:max-image-caption-generator' -it max-image-caption-generator-web-app

Using the Quay Image

You can also deploy the web app with the latest docker image available on Quay.io by running:

docker run --net='container:max-image-caption-generator' -it quay.io/codait/max-image-caption-generator-web-app

This will use the model docker container run above and can be run without cloning the web app repo locally.

Sample Output

Web UI Screenshot

Troubleshooting

There are a large number of user-uploaded images in a long-running web app

When running the web app at http://localhost:8088 an admin page is available at http://localhost:8088/cleanup that allows the user to delete all user uploaded files from the server.

[Note: This deletes all user uploaded images]
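
For readers curious what such a cleanup amounts to, here is a minimal sketch. It is not the web app's actual handler: `delete_uploads` is a hypothetical helper, and the upload directory path is an assumption that you would need to replace with wherever your deployment stores uploaded files.

```python
import os


def delete_uploads(upload_dir):
    """Delete every regular file directly inside upload_dir; return their names."""
    removed = []
    for name in sorted(os.listdir(upload_dir)):
        path = os.path.join(upload_dir, name)
        # Leave subdirectories alone; only remove uploaded files.
        if os.path.isfile(path):
            os.remove(path)
            removed.append(name)
    return removed
```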

Admin UI Screenshot

Links

Libraries used in this Code Pattern

  • D3.js: D3.js is a JavaScript library for manipulating documents based on data.
  • d3-cloud: A Wordle-inspired word cloud layout written in JavaScript.
  • Featherlight: Featherlight is a very lightweight jQuery lightbox plugin.
  • Glyphicons: GLYPHICONS is a library of precisely prepared monochromatic icons and symbols, created with an emphasis on simplicity and easy orientation.
  • Image Picker: Image Picker is a simple jQuery plugin that transforms a select element into a more user friendly graphical interface.
  • Cookie Consent: Cookie Consent is a JavaScript plugin for alerting users about the use of cookies on a website.

Learn More

  • Artificial Intelligence Code Patterns: Enjoyed this Code Pattern? Check out our other Artificial Intelligence Code Patterns
  • AI and Data Code Pattern Playlist: Bookmark our playlist with all of our Code Pattern videos
  • Watson Studio: Master the art of data science with IBM's Watson Studio
  • Deep Learning with Watson Studio: Design and deploy deep learning models using neural networks, and easily scale to hundreds of training runs. Learn more at Deep Learning with Watson Studio.

License

This code pattern is licensed under the Apache Software License, Version 2. Separate third party code objects invoked within this code pattern are licensed by their respective providers pursuant to their own separate licenses. Contributions are subject to the Developer Certificate of Origin, Version 1.1 (DCO) and the Apache Software License, Version 2.

Apache Software License (ASL) FAQ

max-image-caption-generator-web-app's People

Contributors

ajbozarth, bdwyer2, djalova, dolph, kant, ljbennett62, lresende, mmcelaney, ptitzler, rhagarty

max-image-caption-generator-web-app's Issues

Image Caption Generator Model API Endpoint

I was following the tutorial and got lost on this point. "Fill in the Image Caption Generator Model API Endpoint section with the endpoint deployed above, then click on Create." Could you elaborate more?

Threading causes error on old python versions

Rich found the following error running the latest master:

INFO: Connecting to ML endpoint at http://localhost:5000/model/predict
INFO: Starting web server
INFO: Preparing ML metadata
Traceback (most recent call last):
  File "app.py", line 190, in <module>
    main()
  File "app.py", line 181, in main
    prepare_metadata()
  File "app.py", line 115, in prepare_metadata
    threads.clear()
AttributeError: 'list' object has no attribute 'clear'

This happens because list.clear() was only added in Python 3.3, so it does not exist on Python 2. We should either change this line for version compatibility or add a Python 3 requirement to the project.
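
One version-compatible option is slice deletion, which empties a list in place on both Python 2 and Python 3. The sketch below is illustrative only; `clear_list` is a hypothetical helper, and the `threads` list here merely stands in for the one in app.py.

```python
def clear_list(items):
    """Empty a list in place; works on Python 2 and Python 3."""
    del items[:]


threads = ["thread-1", "thread-2"]
alias = threads  # a second reference to the same list object
clear_list(threads)
print(threads, alias)  # both names see the emptied list
```

Unlike rebinding with `threads = []`, this keeps the same list object, so any other references to it are emptied as well.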

Review Comments

I received some great review feedback from a teammate, copied below:

va barbosa [6:34 AM]
hello Alex, i recently joined CODAIT (under Bradley Holt’s team). i was looking at your MAX-Image-Caption-Generator-Web-App.
i do have a couple minor comments.
1- the upload fails in Firefox (works fine in Chrome). FF throws the following JS error: ReferenceError: event is not defined (webapp.js:211)
2- for the Starting the Web App section of the README. i think step 1 should be
git clone https://github.com/IBM/MAX-Image-Caption-Generator-Web-App.git
followed by
cd MAX-Image-Caption-Generator-Web-App
then you continue with Installing the dependencies. Otherwise some people may get confused and try to run/install the dependencies
from the same directory/repo they were in for the previous section (MAX-Image-Caption-Generator) (edited)
let me know if you have any questions for me or if you prefer i submit GitHub issues for these. thanks.

va barbosa [8:37 AM]
also, will there be more supplemental (how-to) content or more info added to README ?
maybe more detail discussion around deploying the model and using the REST API … that way it could possibly give it more value to developers wanting to produce/consume the REST API (i.e., in a different UI or from different client interface)
in any case, nice work so far. looking forward to the finished Code Pattern (edited)

Alex Bozarth [1:41 PM]
Thanks for the feedback, I’m looking into the issue and will open a PR with my updates. nice catch on the FF issue. there was a typo in the code and safari and chrome make js assumptions that masked it.
As for #2, I’m not sure how we forgot to include that step 😕
As for more content, we are open to adding more if asked, but we’re done adding the content we had planned to.
And for the REST API section, that part is actually a snippet of the README for the other repo and we include a note and link if readers want to learn more about it beyond what we included. Do you think we should add another reference to check out the other repo somewhere?
And as for feedback, in general I’d say opening issues is probably better, but you can ping me for the small stuff and I’d just fix it quickly.

I'll be submitting a PR to address this shortly

Web App broken with latest model code

Due to an update in IBM/MAX-Image-Caption-Generator#9 all image uploads to the model from the web app fail. From an initial survey this seems to be because the Python requests library doesn't set a MIME type by default when posting a file. I'm unsure whether a fix belongs here or in the model, but I'm opening this as a known issue until I manage to fix it.

v2.0 Release

@djalova @stevemart @rhagarty

I've drafted a v2.0 release but want to double check with everyone and make sure there aren't any more changes needed (don't want to cut a 2.1 for some missing text). Once I get the ok from each of you I'll cut the release on the current master. The idea is that this v2.0 cut would be the "final" change to the repo before moving on to our next code pattern. (not to say we wouldn't stay open to new ideas proposed once the pattern is published)

https://github.com/IBM/MAX-Image-Caption-Generator-Web-App/releases/tag/untagged-8e0ce32d39f6bb3a52f7

Fix Deploy to IBM Cloud

The Deploy to IBM Cloud section of the README had to be commented out due to an issue with the (assumed) production instance of the model endpoint.

A production quality model endpoint needs to be determined and utilized before uncommenting the code. This production endpoint will be determined by the MAX dev team.

Note: This is blocking the "release" of v2 but not the publishing of the code pattern.

Allow the user to submit a URL of an image

Allow a user to upload an image by inputting a URL to the image. This would require extending the file input process in some way, since the HTML form file input type does not support this.
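
One possible first step on the server side would be to validate the submitted URL before downloading anything. The sketch below is only an assumption about how this could look, not an agreed design; `looks_like_image_url` is a hypothetical helper that checks the scheme and guesses the type from the path's file extension.

```python
import mimetypes
from urllib.parse import urlparse


def looks_like_image_url(url):
    """Return True for http(s) URLs whose path has an image file extension."""
    parts = urlparse(url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        return False
    # Guess the MIME type from the extension; unknown extensions give None.
    mime = mimetypes.guess_type(parts.path)[0]
    return mime is not None and mime.startswith("image/")
```

A URL that passes this check could then be downloaded by the server and sent to the model in the same way as an uploaded file. An extension check is only a heuristic, so the downloaded content would still need to be verified.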

Create a thorough README.md

Internal Requirements Doc:
https://pages.github.ibm.com/IBMCode/IBMCodeContent/patterns/dev/#1-create-a-thorough-readmemd

Sections (Related Issue):

  • Title
  • Intro
  • Architecture Image (#5)
  • Flow (#5)
  • Included components
  • Featured technologies
  • Watch the Video (#8)
  • Steps
  • Sample output
  • Troubleshooting
  • Links
  • Learn More
  • License

Edit: Remaining TODO comments in README:

  • Line 38: Make sure Components, Technologies, Links, and Learn More bullets are in the correct sections
  • Line 67: Embed link to YouTube video [Will be done in #8]
  • Line 218: Add Common Troubleshooting Issues
  • Line 224: Add Links
  • Line 228: Add Learn More
