cape-webservices's Introduction

cape-webservices

Entrypoint for all backend cape webservices.

Frontend demo is here (only works if you already launched a Backend).

Overview of Cape

Cape is a suite of open-source libraries for managing a question-answering model that answers questions by "reading" documents automatically. It is based on state-of-the-art machine reading models trained on massive datasets, and includes several mechanisms that make it easy to use and to improve based on user feedback. It is designed to be portable: it runs on a single laptop or on a cluster of parallel machines to speed up computation. It is open source and intended to be usable at all levels of expertise.

It enables users to

  • upload documents and get answers to questions extracted from them,
  • update models by adding a "saved reply", i.e. a pre-defined answer,
  • manage users, documents and saved replies.

There are several ways to use Cape:

  • As a Python library:
 from cape_responder.responder_core import Responder
 Responder.get_answers_from_documents('my-token', 'How easy is Cape to use?', text="Cape is an open source large-scale question answering system and is super easy to use!")
  • As a Python service: python3 -m cape_webservices.run
  • As a standalone Docker container: docker run -p 5050:5050 bloomsburyai/cape
  • As an app with UI
  • As a distributed cluster
  • As a Slack bot (video here, more info here)
  • As an AI-in-the-middle email answering system (video here, more info here)
  • As a Facebook bot (more info here)
  • As a Hangouts bot (more info here)

Quick start

Minimum Requirements

We recommend at least 3GB of RAM and at least 2 modern CPU cores (4 if virtual). If you're using Docker, ensure you increase the memory resource limits in the Docker preferences.
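
As a rough illustration, the recommendation above can be expressed as a small check. This is only a sketch: the function names are hypothetical, "3GB" is assumed to mean 3 GiB, and the detection helper relies on `os.sysconf` names that are available on Linux but may be missing elsewhere.

```python
import os

# Recommended minimums from the paragraph above (3GB read as 3 GiB: an assumption).
MIN_RAM_BYTES = 3 * 1024**3
MIN_CORES_PHYSICAL = 2
MIN_CORES_VIRTUAL = 4


def meets_minimum(ram_bytes, cores, virtual=False):
    """Return True if the given resources satisfy the recommended minimums."""
    required_cores = MIN_CORES_VIRTUAL if virtual else MIN_CORES_PHYSICAL
    return ram_bytes >= MIN_RAM_BYTES and cores >= required_cores


def detect_resources():
    """Best-effort detection of (total RAM in bytes, CPU count).

    RAM is reported as None when the platform does not expose the
    sysconf values used here.
    """
    try:
        ram = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    except (AttributeError, ValueError, OSError):
        ram = None
    return ram, os.cpu_count() or 0


if __name__ == "__main__":
    ram, cores = detect_resources()
    if ram is None:
        print(f"{cores} cores detected; total RAM could not be determined")
    else:
        print("meets recommended minimum:", meets_minimum(ram, cores))
```

When run inside a container, the detected values reflect what the container (or the Docker virtual machine) can see, which may be less than the physical host.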

Standalone webapp with Docker

You can run a standalone version of the webapp that includes a management dashboard. After installing Docker, pull the latest Cape image and run it:

docker pull bloomsburyai/cape && docker run -ti -p 5050:5050 -p 5051:5051 bloomsburyai/cape

This launches both the backend and the frontend webservices. By default it also creates tunnels for both and prints their public URLs:

  • To use the frontend, browse to the given URL, which will look similar to: https://RANDOM_STRING_HERE.ngrok.io?configuration={"api":{"backendURL":"https://RANDOM_STRING_HERE.ngrok.io:5050","timeout":"15000"}}
  • To use the backend, you can use our client (documentation here) or make your own by integrating our HTTP API (documentation here).
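
If you need to assemble a frontend URL of that shape yourself, the sketch below shows one way to do it. The helper name `build_frontend_url` is hypothetical, and it assumes the frontend accepts a percent-encoded `configuration` value, which is how browsers normally transmit such a query string.

```python
import json
from urllib.parse import quote


def build_frontend_url(frontend_base, backend_url, timeout_ms="15000"):
    """Build a frontend URL carrying the backend location in `configuration`.

    The JSON layout mirrors the example URL above; the helper itself is
    illustrative and not part of Cape.
    """
    config = {"api": {"backendURL": backend_url, "timeout": timeout_ms}}
    # Compact separators reproduce the JSON formatting of the example URL.
    raw = json.dumps(config, separators=(",", ":"))
    # Percent-encode everything so the JSON survives as a single query value.
    return f"{frontend_base}?configuration={quote(raw, safe='')}"


if __name__ == "__main__":
    print(build_frontend_url(
        "https://RANDOM_STRING_HERE.ngrok.io",
        "https://RANDOM_STRING_HERE.ngrok.io:5050",
    ))
```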

Quick Start Guide with Docker

  1. Pull the latest version of the Docker image (it will take a few moments to download all dependencies and a machine reading model): docker pull bloomsburyai/cape

  2. Run the Docker container and launch an IPython console within it using the following command: docker run -ti -p 5050:5050 -p 5051:5051 bloomsburyai/cape ipython3

  3. Import Responder: from cape_responder.responder_core import Responder

  4. Ask a question, store the response (which is a list of answers), and display the first answer using: response = Responder.get_answers_from_documents('my-token','How easy is Cape to use?', text="Cape is an open source large-scale question answering system and is super easy to use!"); print(response[0]['answerText'])

  5. If you are interested in understanding a bit more about what the response looks like, display the full response using: print(response)
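
The steps above treat the response as a list of answers, each with an 'answerText' entry. A minimal sketch of handling such a list follows. The helper names and the sample data are hypothetical stand-ins so the snippet runs without Cape installed; real responses may contain additional fields not shown here.

```python
def answer_texts(response):
    """Return the text of every answer, in the order they were returned."""
    return [answer["answerText"] for answer in response]


def first_answer_text(response, default=None):
    """Return the first answer's text, or `default` if no answer came back."""
    return response[0]["answerText"] if response else default


# Hand-written stand-in for a response; only the 'answerText' key is
# taken from the steps above, the values are made up for illustration.
sample_response = [
    {"answerText": "super easy to use"},
    {"answerText": "open source"},
]

if __name__ == "__main__":
    print(first_answer_text(sample_response))
    print(answer_texts(sample_response))
```

Guarding against an empty list avoids an IndexError when no answer is found.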

Installing without Docker

To install Cape natively on a Linux system, take a look at deployment/Dockerfile.

Structure

Dependencies Diagram

In summary this is how Cape is organized:

  • cape-webservices: backend server providing the full HTTP API. Depends on:
    • cape-responder: single high-level interface for creating and distributing machine reading tasks. Depends on:
      • cape-machine-reader: module to integrate machine reading models
      • cape-document-qa: integration of a state-of-the-art machine reading model, with training and evaluation scripts
    • cape-document-manager: interface to manage documents and annotations, using SQLite as an example storage backend. Depends on:
      • cape-splitter: package to split documents into chunks without breaking sentences
    • cape-userdb: package to manage and store users and configurations
    • cape-api-helpers: HTTP API utility functions
    • Optional plugins:
  • cape-frontend: frontend server (not in the diagram) that uses the backend server API to provide a management dashboard to users
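
The dependency tree above can also be written down as data, for example to work out an order in which the packages could be set up. The sketch below is illustrative only: the mapping is transcribed from the list, optional plugins and cape-frontend (which talks to the backend over HTTP rather than importing it) are left out, and `build_order` is a hypothetical helper.

```python
# Direct dependencies, transcribed from the list above.
DEPENDENCIES = {
    "cape-webservices": [
        "cape-responder",
        "cape-document-manager",
        "cape-userdb",
        "cape-api-helpers",
    ],
    "cape-responder": ["cape-machine-reader", "cape-document-qa"],
    "cape-document-manager": ["cape-splitter"],
}


def build_order(package, dependencies=DEPENDENCIES):
    """Return packages in an order where dependencies come before dependents."""
    order, seen = [], set()

    def visit(name):
        if name in seen:
            return
        seen.add(name)
        # Visit dependencies first (depth-first, post-order).
        for dep in dependencies.get(name, []):
            visit(dep)
        order.append(name)

    visit(package)
    return order


if __name__ == "__main__":
    for name in build_order("cape-webservices"):
        print(name)
```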

cape-webservices's People

Contributors

elleo, gbouchar, luisulloa, maxbartolo, patrick-bloomsbury

cape-webservices's Issues

Running as a distributed cluster

The README.md file mentions several ways to run the webapp, including "As a distributed cluster". However, there are no details on how to do this or what benefits could be expected from this approach. Is there any more information available? If not, could some be added?

ngrok URL expires after a while

After running the Docker image, the ngrok URL expires after a while. Is there any way to log in to ngrok so that it doesn't expire?

Also, could there be an option to run without ngrok? Plain localhost would be fine.

cape-webservices not launching on mac, but fine on an AWS box

Hi guys,

Thank you so much for having open sourced this :)

On my Mac I cannot run any of the examples in your readme with Docker, but it's fine on an AWS t2.medium instance. That sounds extremely weird to me, so even if I don't mind experimenting on a Linux box, I'm reporting this here to save somebody else a bit of time.

Mac config:

~ system_profiler SPHardwareDataType
Hardware:

    Hardware Overview:

      Model Name: MacBook Pro
      Model Identifier: MacBookPro14,3
      Processor Name: Intel Core i7
      Processor Speed: 2.8 GHz
      Number of Processors: 1
      Total Number of Cores: 4
      L2 Cache (per Core): 256 KB
      L3 Cache: 6 MB
      Memory: 16 GB
      Boot ROM Version: MBP143.0167.B00
      SMC Version (system): 2.45f0
      Serial Number (system):*********
      Hardware UUID: *********

Example output:

~ docker run -ti bloomsburyai/cape ipython3                                
Python 3.6.5 (default, Apr  1 2018, 05:46:30) 
Type 'copyright', 'credits' or 'license' for more information
IPython 6.5.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: from cape_responder.responder_core import Responder
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
/root/.pyxbld/temp.linux-x86_64-3.6/pyrex/grouper_core.cpp: In function ‘PyObject* __pyx_pw_12grouper_core_1get_n_words_before_position(PyObject*, PyObject*, PyObject*)’:
/root/.pyxbld/temp.linux-x86_64-3.6/pyrex/grouper_core.cpp:1584:15: warning: ‘__pyx_v_remaining_words’ may be used uninitialized in this function [-Wmaybe-uninitialized]
     __pyx_t_4 = (-__pyx_v_remaining_words);
     ~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~
/root/.pyxbld/temp.linux-x86_64-3.6/pyrex/grouper_core.cpp:1381:7: note: ‘__pyx_v_remaining_words’ was declared here
   int __pyx_v_remaining_words;
       ^~~~~~~~~~~~~~~~~~~~~~~
/root/.pyxbld/temp.linux-x86_64-3.6/pyrex/grouper_core.cpp: In function ‘PyObject* __pyx_pw_12grouper_core_3get_n_words_after_position(PyObject*, PyObject*, PyObject*)’:
/root/.pyxbld/temp.linux-x86_64-3.6/pyrex/grouper_core.cpp:1930:42: warning: ‘__pyx_v_remaining_words’ may be used uninitialized in this function [-Wmaybe-uninitialized]
     __pyx_t_6 = (__pyx_v_remaining_words - 1);
                 ~~~~~~~~~~~~~~~~~~~~~~~~~^~~~
/root/.pyxbld/temp.linux-x86_64-3.6/pyrex/grouper_core.cpp:1742:7: note: ‘__pyx_v_remaining_words’ was declared here
   int __pyx_v_remaining_words;
       ^~~~~~~~~~~~~~~~~~~~~~~
/usr/local/lib/python3.6/dist-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
/usr/lib/python3.6/importlib/_bootstrap.py:219: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected 96, got 88
  return f(*args, **kwds)
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/contrib/learn/python/learn/datasets/base.py:198: retry (from tensorflow.contrib.learn.python.learn.datasets.base) is deprecated and will be removed in a future version.
Instructions for updating:
Use the retry module or similar alternatives.

In [2]: Responder.get_answers_from_documents('my-token','How easy is Cape to use', text ="Cape is an open source large-scale question answering system and is super easy to use!")
2018-08-18 12:39:58.475848: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
Had pre-trained word embeddings for 84902 of 181076 words
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/docqa/nn/attention.py:85: calling reduce_sum (from tensorflow.python.ops.math_ops) with keep_dims is deprecated and will be removed in a future version.
Instructions for updating:
keep_dims is deprecated, use keepdims instead
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/docqa/nn/attention.py:85: calling reduce_sum (from tensorflow.python.ops.math_ops) with keep_dims is deprecated and will be removed in a future version.
Instructions for updating:
keep_dims is deprecated, use keepdims instead
INFO:tensorflow:Restoring parameters from /usr/local/lib/python3.6/dist-packages/cape_document_qa/storage/models/production_ready_model/save/checkpoint-123456789
INFO:tensorflow:Restoring parameters from /usr/local/lib/python3.6/dist-packages/cape_document_qa/storage/models/production_ready_model/save/checkpoint-123456789

And then the container just dies. If I do docker inspect, I get the following state:

"State": {
            "Status": "exited",
            "Running": false,
            "Paused": false,
            "Restarting": false,
            "OOMKilled": false,
            "Dead": false,
            "Pid": 0,
            "ExitCode": 137,
            "Error": "",
            "StartedAt": "2018-08-18T12:39:24.1657046Z",
            "FinishedAt": "2018-08-18T12:40:21.0737114Z"
}

So it looks like there is an OOM error somewhere, but this is definitely weird given my Mac's configuration.
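
For reference, an exit code above 128 conventionally means the process was terminated by signal number (code - 128), so 137 corresponds to signal 9 (SIGKILL). That is consistent with the process being killed for using too much memory, although "OOMKilled" is false in the state above, so this is only one possible explanation. The sketch below decodes such codes; `describe_exit_code` is a hypothetical helper, and the signal names come from Python's `signal` module on the platform where it runs.

```python
import signal


def describe_exit_code(code):
    """Describe a container or shell exit code in plain words."""
    if code == 0:
        return "exited successfully"
    if code > 128:
        # By convention, 128 + N means "terminated by signal N".
        signum = code - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            name = "unknown signal"
        return f"terminated by signal {signum} ({name})"
    return f"exited with status {code}"


if __name__ == "__main__":
    print(describe_exit_code(137))
```

On Docker for Mac, containers run inside a virtual machine with its own memory limit, which may be lower than the 16 GB of the host; the README's Minimum Requirements section recommends raising that limit in the Docker preferences.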

Default user & passwords

Hi,
I'm trying to test the solution, but either there is no admin user and password or I can't find them.
I can install the docker image, but once it is running I can't get any valid response.
Thanks
