This repo is for the Udacity Full-stack Nanodegree project Logs Analysis. It is a reporting tool for a fictional newspaper site that uses information from the database to discover what kind of articles the site's readers like. The database contains newspaper articles, data about the authors of the articles, and server logs for the site.
- Install VirtualBox https://www.virtualbox.org/wiki/Downloads
- Install Vagrant https://www.vagrantup.com/
- Clone repo https://github.com/udacity/fullstack-nanodegree-vm
- cd into vagrant folder and clone this repo
- Unzip newsdata.zip, which contains newsdata.sql.
- Run
vagrant up
and then
vagrant ssh
cd logs-analysis
- Type command
psql -d news -f newsdata.sql
into terminal to load the data.
- Run
python newsdata.py
The database consists of three tables: authors, articles, and log. Here is the approach I used for each of the project's questions:
Count the views of each article in the log table, then look up the titles of the most viewed articles in the articles table.
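A minimal sketch of this approach, not the actual code in newsdata.py. It runs against an in-memory SQLite database so it can be tried without the VM. The column names (`title`, `slug`, `path`), the `/article/<slug>` path format, the sample rows, and the function names are all assumptions made for illustration; the real news database may differ.

```python
import sqlite3


def build_articles_demo():
    """Create a toy database. The schema and rows here are hypothetical."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE articles (title TEXT, slug TEXT);
        CREATE TABLE log (path TEXT);
        INSERT INTO articles VALUES
            ('Bears love berries', 'bears'),
            ('Rats spread plague', 'rats');
        INSERT INTO log VALUES
            ('/article/bears'), ('/article/bears'), ('/article/rats'), ('/');
    """)
    return conn


def most_viewed_articles(conn, limit=3):
    """Return (title, views) rows, most viewed first."""
    query = """
        SELECT articles.title, COUNT(*) AS views
        FROM log
        JOIN articles ON log.path = '/article/' || articles.slug
        GROUP BY articles.title
        ORDER BY views DESC
        LIMIT ?
    """
    return conn.execute(query, (limit,)).fetchall()


if __name__ == "__main__":
    for title, views in most_viewed_articles(build_articles_demo()):
        print(f"{title}: {views} views")
```

Requests that do not match an article (such as `/` in the sample rows) drop out of the join, so they are not counted as views.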
Query the authors whose articles have the highest total number of views.
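One way this could look, again as a sketch on an in-memory SQLite database rather than the project's own code. It assumes that `articles.author` references `authors.id` and that log paths have the form `/article/<slug>`; those column names, the sample data, and the function names are hypothetical.

```python
import sqlite3


def build_authors_demo():
    """Create a toy database. The schema and rows here are hypothetical."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE authors (id INTEGER, name TEXT);
        CREATE TABLE articles (author INTEGER, title TEXT, slug TEXT);
        CREATE TABLE log (path TEXT);
        INSERT INTO authors VALUES (1, 'Ursula'), (2, 'Rudolf');
        INSERT INTO articles VALUES
            (1, 'Bears love berries', 'bears'),
            (1, 'Goats eat everything', 'goats'),
            (2, 'Rats spread plague', 'rats');
        INSERT INTO log VALUES
            ('/article/bears'),
            ('/article/goats'), ('/article/goats'),
            ('/article/rats'), ('/article/rats');
    """)
    return conn


def top_authors(conn):
    """Return (author name, total views across their articles), highest first."""
    query = """
        SELECT authors.name, COUNT(*) AS views
        FROM authors
        JOIN articles ON articles.author = authors.id
        JOIN log ON log.path = '/article/' || articles.slug
        GROUP BY authors.name
        ORDER BY views DESC
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for name, views in top_authors(build_authors_demo()):
        print(f"{name}: {views} views")
```

Grouping by author instead of by article is what sums the views over everything each author has written.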
Query the log table to find days where the ratio of requests with a status of 400 or above to total requests is greater than 1%.
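A rough sketch of the error-rate calculation, using an in-memory SQLite database so it runs on its own. It assumes the log table has an integer `status` column and a `time` column holding a timestamp; in the real database the status may be stored differently (for example as text), so treat the schema, the sample rows, and the function names as hypothetical.

```python
import sqlite3


def build_log_demo():
    """Create a toy log table. The schema and rows here are hypothetical."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE log (status INTEGER, time TEXT)")
    rows = (
        [(200, "2016-07-17 10:00:00")] * 98    # day 1: 98 successful requests
        + [(404, "2016-07-17 11:00:00")] * 2   # day 1: 2 errors (2%)
        + [(200, "2016-07-18 09:00:00")] * 199 # day 2: 199 successful requests
        + [(404, "2016-07-18 09:30:00")] * 1   # day 2: 1 error (0.5%)
    )
    conn.executemany("INSERT INTO log VALUES (?, ?)", rows)
    return conn


def high_error_days(conn, threshold_pct=1.0):
    """Return (day, error percentage) for days above the threshold."""
    query = """
        SELECT day, error_pct
        FROM (
            SELECT date(time) AS day,
                   100.0 * SUM(CASE WHEN status >= 400 THEN 1 ELSE 0 END)
                         / COUNT(*) AS error_pct
            FROM log
            GROUP BY date(time)
        ) AS daily
        WHERE error_pct > ?
        ORDER BY day
    """
    return conn.execute(query, (threshold_pct,)).fetchall()


if __name__ == "__main__":
    for day, pct in high_error_days(build_log_demo()):
        print(f"{day}: {pct:.1f}% errors")
```

Multiplying by `100.0` forces floating-point division, which avoids the integer division that would otherwise round every daily ratio down to zero.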