Twitter dashboard using Apache NiFi, Solr and Banana
Creating an interactive dashboard for your data
This part uses Solr, Banana and NiFi to create an interactive dashboard of tweets.
From the AWS Marketplace, launch an instance using the following AMI: https://aws.amazon.com/marketplace/fulfillment?productId=7bc7936e-10a8-42c7-ab44-ce435bd949a9&ref=cns_srchrow
This is the quickest way to get NiFi up and running.
sudo apt-get update && sudo apt-get upgrade -y
First install Java.
sudo apt-get update
sudo apt-get install -y default-jre || sudo yum install -y java-headless
Verify that Java is installed.
java -version
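Solr 7 needs Java 8 or later, so it can help to check the major version number rather than just reading the output by eye. The following is a minimal sketch of one way to do that. `java_major` is a hypothetical helper name, not part of any tool used here, and it assumes the usual `version "..."` line that `java -version` prints.

```shell
# Extract the major Java version from the first line of `java -version` output.
# Handles the legacy "1.8.0_x" scheme and the newer "11.0.x" scheme.
# java_major is a hypothetical helper, defined only for this example.
java_major() {
  local ver
  ver=$(printf '%s\n' "$1" | sed -n 's/.*version "\([^"]*\)".*/\1/p')
  case "$ver" in
    1.*) ver=${ver#1.}; printf '%s\n' "${ver%%.*}" ;;
    *)   printf '%s\n' "${ver%%[.-]*}" ;;
  esac
}

# Example (java -version writes to stderr, hence the redirect):
# java_major "$(java -version 2>&1 | head -n 1)"
```

If the printed number is below 8, install a newer JRE before continuing.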
Next, download and install Solr.
sudo wget http://apache.mirror.anlx.net/lucene/solr/7.0.0/solr-7.0.0.tgz
sudo tar xzf solr-7.0.0.tgz
sudo ./solr-7.0.0/bin/install_solr_service.sh solr-7.0.0.tgz
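The service can take a few seconds to start listening after the install script finishes. A small retry loop is one way to wait for it before moving on. This is only a sketch: `retry` is a hypothetical helper, and the probe in the example assumes Solr is on the default port 8983 on the local machine.

```shell
# Run a command up to N times, sleeping between attempts.
# Usage: retry <attempts> <delay-seconds> <command> [args...]
# retry is a hypothetical helper, defined only for this example.
retry() {
  local attempts="$1" delay="$2" i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    "$@" && return 0
    i=$((i + 1))
    [ "$i" -le "$attempts" ] && sleep "$delay"
  done
  return 1
}

# Example (needs curl and a running Solr, so it is left commented out):
# retry 30 2 curl -sf -o /dev/null http://localhost:8983/solr/admin/info/system
```

The helper returns 0 as soon as the command succeeds and 1 if every attempt fails.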
Create the Solr Collection for Tweets
sudo -i
cd /var/solr/data
mkdir tweets
cp -r /opt/solr/server/solr/configsets/_default/ /var/solr/data
mv _default/ tweets
chown -R solr:solr *
exit
sudo -u solr ./solr-7.0.0/bin/solr create_core -c tweets -d _default -s 1 -rf 1
Check the core has been created
http://hostname:8983/solr/#/~cores/tweets
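The same check can be made from the command line through Solr's CoreAdmin STATUS action. The helpers below are a sketch with hypothetical names. They assume a loaded core's status entry contains an `instanceDir` key, while an unknown core comes back as an empty object.

```shell
# Build the CoreAdmin STATUS URL for a Solr base URL and a core name.
# Both helper names are hypothetical, defined only for this example.
core_status_url() {
  printf '%s/admin/cores?action=STATUS&core=%s&wt=json\n' "$1" "$2"
}

# Read a STATUS response on stdin; succeed if it describes a loaded core.
# Assumption: loaded cores report an "instanceDir" key, unknown cores do not.
core_is_loaded() {
  grep -q '"instanceDir"'
}

# Example (needs curl and a running Solr, so it is left commented out):
# curl -s "$(core_status_url http://hostname:8983/solr tweets)" | core_is_loaded \
#   && echo "tweets core is loaded"
```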
Edit solrconfig.xml by adding EEE MMM d HH:mm:ss Z yyyy under ParseDateFieldUpdateProcessorFactory so it looks like below. This is done to allow Solr to recognise the timestamp format of tweets.
sudo nano /opt/solr/server/solr/configsets/_default/conf/solrconfig.xml
<processor>
  <arr name="format">
    <str>EEE MMM d HH:mm:ss Z yyyy</str>
    <!-- the existing format entries stay below this line -->
  </arr>
</processor>
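After saving the file, a quick grep can confirm that the format string is in place. A core generally keeps its own copy of the configuration under its data directory, so it may be worth checking that copy as well. The sketch below uses a hypothetical helper name, and the path in the example is an assumption based on the data directory used earlier.

```shell
# Succeed if the given solrconfig.xml contains the Twitter timestamp format.
# has_tweet_date_format is a hypothetical helper, defined only for this example.
has_tweet_date_format() {
  grep -qF '<str>EEE MMM d HH:mm:ss Z yyyy</str>' "$1"
}

# Example: check the core's own copy if it is readable.
# Assumed path; adjust it to wherever the tweets core actually lives.
conf=/var/solr/data/tweets/conf/solrconfig.xml
if [ -r "$conf" ]; then
  if has_tweet_date_format "$conf"; then
    echo "format present in $conf"
  else
    echo "format missing in $conf"
  fi
fi
```

If the format is missing from the core's copy, adding it there and restarting Solr (`sudo service solr restart`) should make the core pick it up.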
Banana is a tool for creating dashboards that visualize data stored in Solr. It is commonly used with Logstash for log data, and it is a fork of Kibana.
Run Solr at least once to create the webapp directory
Download Banana and install it.
cd /opt/solr/server/solr-webapp/webapp
sudo git clone https://github.com/lucidworks/banana
Replace the default dashboard with the Twitter dashboard
cd /opt/solr/server/solr-webapp/webapp/banana/src/app/dashboards
sudo mv default.json default.json.orig
sudo wget https://raw.githubusercontent.com/abajwa-hw/ambari-nifi-service/master/demofiles/default.json
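Before opening the dashboard, it may be worth confirming which Solr core the downloaded file points at. The sketch below assumes the dashboard JSON records its target core in a `core_name` key. That key name and the helper name are assumptions to check against the actual file.

```shell
# Print the value of the first "core_name" key found in a dashboard JSON file.
# dashboard_core_name is a hypothetical helper; the "core_name" key is an
# assumption about how the dashboard file names its Solr core.
dashboard_core_name() {
  sed -n 's/.*"core_name"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$1" | head -n 1
}

# Example, run from the dashboards directory:
# dashboard_core_name default.json
```

If the printed name is not `tweets`, the dashboard will query a different core from the one created earlier.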
Browse to http://hostname:8983/solr/banana/src/index.html
Now create a data flow in NiFi to send the Twitter data to Solr.
Configure the PutSolrContentStream processor as follows. This will push the tweets to Solr.
Enable the flow and then check Solr.
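One way to check Solr is to ask the core how many documents it holds and watch the number grow while the flow runs. The sketch below pulls `numFound` out of a JSON select response. `num_found` is a hypothetical helper, and the example URL assumes the core is named `tweets` as above.

```shell
# Read a Solr JSON select response on stdin and print its numFound value.
# num_found is a hypothetical helper, defined only for this example.
num_found() {
  sed -n 's/.*"numFound":[[:space:]]*\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Example (needs curl and a running Solr, so it is left commented out):
# curl -s 'http://hostname:8983/solr/tweets/select?q=*:*&rows=0&wt=json' | num_found
```

A count that stays at 0 after the flow has been running for a while suggests the PutSolrContentStream processor is not reaching the core.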
Use the following URL to access the dashboard: http://hostname:8983/solr/banana/src/index.html#/dashboard
You will have a timeline of tweets in the dashboard. The dashboard is highly configurable, so have a play.
Please read CONTRIBUTING.md for details on our code of conduct, and the process for submitting pull requests to us.
- Mark Craddock - Initial work
- Neville de Mendonca - Initial work
See also the list of contributors who participated in this project.
TBD
- https://community.hortonworks.com/articles/1282/sample-hdfnifi-flow-to-push-tweets-into-solrbanana.html
- https://blogs.apache.org/nifi/entry/indexing_tweets_with_nifi_and