Git Product home page Git Product logo

logs_analysis's Introduction

Logs Analysis

An internal reporting tool that will use information from the database to discover what kind of articles the site's readers like.

This tool will answer three question.

  • What are the most popular three articles of all time?
  • Who are the most popular article authors of all time?
  • On which days did more than 1% of requests lead to errors?

Quick Start

You should Install python2 to run this script

Install

  • Set up a VirtualBox/Vagrant environment

    1. Install VirtualBox
    2. Install Vagrant
    3. Start the virtual machine

    From your terminal, inside the vagrant subdirectory, run the command vagrant up. This will cause Vagrant to download the Linux operating system and install it.

    When vagrant up is finished running, you will get your shell prompt back. At this point, you can run vagrant ssh to log in to your newly installed Linux VM!

  • [optional step] Install python module psycopg2 (IF haven't installed)

If your pip version supports wheel packages it should be possible to install a binary version of Psycopg including all the dependencies from PyPI. Just run:

$ pip install -U pip      # make sure your pip is up-to-date
$ pip install psycopg2
  • Download database Next, download the data here. You will need to unzip this file after downloading it. The file inside is called newsdata.sql.

To load the data, use the command psql -d news -f newsdata.sql.

  • Create View in database You shold create three view in database, log_articles, log_daily, log_error.

  • Run the script

python logs_analysis.py

Database View

log_articles

Outputing a table of successful log record for every article and its author

CREATE VIEW  log_articles AS
  SELECT title, author, count(*) AS views FROM articles
  LEFT JOIN log
  ON '/article/' || slug = path
  WHERE status = '200 OK'
  GROUP BY title, author;

Example:

title author views
Trouble for troubled troublemakers 2 84810
There are a lot of bears 1 84504
Candidate is jerk, alleges rival 2 338647
Bad things gone, say good people 3 170098
Bears love berries, alleges bear 1 253801
Media obsessed with bears 1 84383
Balloon goons doomed 4 84557
Goats eat Google's lawn 1 84906

log_daily

Outputing a table of all requests for each day.

CREATE VIEW log_daily AS
  SELECT DATE(time), COUNT(*) AS count_total
  FROM log
  GROUP BY DATE(time);

Example:

date count_total
2016-07-01 38705
2016-07-02 55200
2016-07-03 54866
2016-07-04 54903

log_error

Outputing a table of error requests for each day

CREATE VIEW log_error AS
  SELECT DATE(time), COUNT(*) AS count_error
  FROM log
  WHERE status LIKE '404%'
  GROUP BY DATE(time);

Example:

date count_error
2016-07-31 329
2016-07-06 420
2016-07-17 1265
2016-07-19 433

logs_analysis's People

Contributors

naughtyflame avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.