This is a demo of a chatbot running on Docker on YARN. The user interacts with the bot through Telegram Messenger. The message flow is:

Telegram Messenger <-> NiFi <-> bot

The bot does the following:

Step 1 - vectorise the incoming message with a TF-IDF vectoriser.

Step 2 - feed the vectorised message to a random forest classifier to predict the intent of the conversation: general chit-chat or a technical conversation.

Step 3 - for technical conversations, use a OneVsRestClassifier wrapped around a logistic regression classifier to identify the tag (tags are obtained from a dump of Stack Overflow discussions).

Step 4 - compute the cosine similarity between the message and the Stack Overflow dump to identify the most relevant thread.

Step 5 - return the thread.
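The five steps above can be sketched with scikit-learn. This is a minimal illustration on toy data only: the corpora, labels, and hyperparameters below are placeholders, not the demo's actual models or the real Stack Overflow dump.

```python
# Minimal sketch of the bot's NLP pipeline (toy data, illustrative only).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.metrics.pairwise import cosine_similarity

# Toy stand-ins for the dialogue corpus and the Stack Overflow dump.
dialogue = ["hey how are you", "nice weather today"]
so_titles = ["how to merge two dicts in python",
             "segmentation fault in c pointer arithmetic"]
intent_labels = [0, 0, 1, 1]            # 0 = chit-chat, 1 = technical
tag_labels = ["python", "c"]            # tags for the technical rows only

# Step 1: TF-IDF vectoriser fitted on all training text.
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(dialogue + so_titles)

# Step 2: random forest distinguishes chit-chat from technical messages.
intent_clf = RandomForestClassifier(n_estimators=10, random_state=0)
intent_clf.fit(X, intent_labels)

# Step 3: one-vs-rest logistic regression predicts the tag.
tag_clf = OneVsRestClassifier(LogisticRegression()).fit(X[2:], tag_labels)

def answer(message):
    v = vectorizer.transform([message])
    if intent_clf.predict(v)[0] == 0:
        return "chit-chat"
    tag = tag_clf.predict(v)[0]  # would narrow the search space in the demo
    # Steps 4-5: cosine similarity against the dump, return the best thread.
    sims = cosine_similarity(v, vectorizer.transform(so_titles))[0]
    return so_titles[sims.argmax()]
```

In the real demo the similarity search runs over precomputed embeddings per tag (the `thread_embeddings_by_tags` files unpacked below) rather than re-vectorising the dump on every request.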
To set up the demo, follow the instructions below.

- Enable Docker on YARN:
  a) YARN > Config > Settings > Docker Runtime > Enabled
  b) yarn.nodemanager.container-executor.class=org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor
  c) yarn.nodemanager.runtime.linux.docker.default-container-network=bridge
  d) yarn.nodemanager.runtime.linux.docker.allowed-container-networks=host,none,bridge
  e) yarn.nodemanager.linux-container-executor.nonsecure-mode.local-user=dockeruser
  f) yarn.nodemanager.runtime.linux.docker.privileged-containers.acl=root,dockeruser,ubuntu,yarn
  g) enable yarn.nodemanager.runtime.linux.docker.privileged-containers.allowed
  h) enable Launching Privileged Containers in YARN
  i) restart YARN
- Install a local Docker registry:
  a) docker run -d -p 6666:5000 --restart=always --name registry -v /mnt/registry:/var/lib/registry registry:2
  b) add {"insecure-registries": ["localhost:6666"]} to /etc/docker/daemon.json
  c) restart Docker (service docker restart)
  d) in YARN, set Docker Trusted Registries to localhost:6666 and restart YARN
- Unzip the StarSpace embeddings: cd chatbotnifiyarn && unzip starspace_embedding.tsv.zip
- Reassemble and extract the thread embeddings:
  cd chatbotnifiyarn/thread_embeddings_by_tags
  cat embed.* >> emb.tar
  tar -xvf emb.tar
- Build the Docker image: cd chatbotnifiyarn && docker build -t nlp_image .
- Tag the image: docker tag nlp_image:latest localhost:6666/nlp_image:latest
- Push the image to the local registry: docker push localhost:6666/nlp_image:latest
- Edit the Yarnfile and change /Downloads/chatbotnifiyarn to the location where you cloned the project.
- Submit the service to YARN (the URL is quoted so the shell does not treat ? as a glob):
  curl -X POST -H "Content-Type: application/json" "http://localhost:8088/app/v1/services?user.name=ubuntu" -d @Yarnfile
- Check the service status:
  curl "http://localhost:8088/app/v1/services/nlp-service?user.name=ubuntu" | python -m json.tool
- Get the container's IP address from the status output and run telnet <ip> 5000 to verify that the container is up and listening on port 5000.
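If telnet is not installed, a small Python probe does the same job. This is a hypothetical helper, not part of the demo; "CONTAINER_IP" is a placeholder for the address reported by the status call above.

```python
# Check whether a TCP port is accepting connections (telnet substitute).
import socket

def is_listening(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Example (placeholder address -- substitute the container's real IP):
# print(is_listening("CONTAINER_IP", 5000))
```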
- Install NiFi (nifi.apache.org). Edit /root/nifi-1.8.0/conf/nifi.properties and set nifi.web.http.port=9090, then start NiFi:
  cd /root/nifi-1.8.0/bin
  export JAVA_HOME=/usr/jdk64/jdk1.8.0_112
  ./nifi.sh start
- Connect to http://<nifi-host>:9090 and upload the template (nifibot.xml). Create a new dataflow from the template.
- Talk to @BotFather in Telegram. The command "/newbot" will create a bot for you. You will be prompted to enter a name and a username for your bot. After that, you will be given a token.
- Open the NiFi flow and open the InvokeHTTP processor whose Remote URL looks like https://api.telegram.org/bot642016628:AAEI7IHimEhyBiNG30G_RqWKZxsE5cjJRkg/getUpdates?offset=${update_id}. Replace 642016628:AAEI7IHimEhyBiNG30G_RqWKZxsE5cjJRkg with the token for your bot.
- Start the NiFi flow, open Telegram Messenger, and interact with your bot.