Git Product home page Git Product logo

sales-data-analysis-etl's Introduction

Sales-Data-Analysis

Overview

In this project, I used an SQL Dump found on GitHub, which contains data about an India-based company’s customers, markets, products, and transactions. This project’s objective was to create a dashboard that would allow managers in the company to gain insights into sales performance and make decisions based on that, where the end goal is increasing sales.
Before beginning, defining the metrics and KPIs is crucial to getting the most out of the data in hand, so the questions we need to answer are:

  1. What’s the company’s revenue growth over time?
  2. What’s the revenue from each specific client?
  3. How many sales were made in each year/month?
  4. How many sales were made by a specific client?
  5. Who are the top customers?
  6. Which products are bestsellers?
  7. What markets bring in the most revenue?
  8. What markets have the biggest sales quantity?
  9. What percentage of sales were made online?

The dashboard will help answer the questions above, shedding light on areas of improvement to increase sales.

Data Analysis Using SQL

  1. Show all customer records

    SELECT * FROM customers;

  2. Show total number of customers

    SELECT count(*) FROM customers;

  3. Show transactions for Chennai market (market code for chennai is Mark001

    SELECT * FROM transactions where market_code='Mark001';

  4. Show distrinct product codes that were sold in chennai

    SELECT distinct product_code FROM transactions where market_code='Mark001';

  5. Show transactions where currency is US dollars

    SELECT * from transactions where currency="USD"

  6. Show transactions in 2020 join by date table

    SELECT transactions.*, date.* FROM transactions INNER JOIN date ON transactions.order_date=date.date where date.year=2020;

  7. Show total revenue in year 2020 in Chennai

    SELECT SUM(transactions.sales_amount) FROM transactions INNER JOIN date ON transactions.order_date=date.date where date.year=2020 and transactions.market_code="Mark001";

Star Schema

image

ETL Procedure

  • Extracted the data from the SQL Dump onto MySQL, creating a sales Database containing the following tables: Customers, Date, Markets, Products, Transactions.

  • Transformed the data to prepare it for loading onto Power BI, the transformation process included cleansing the data and establishing consistency:
    1. Formula to remove -1 and 0 values from sales_amount column - in sales_transactions table

    = Table.SelectRows(sales_transactions, each ([sales_amount] <> -1 and [sales_amount] <> 0))

    1. Formula to create norm_sales_amount column (currency inversion from INR to USD) - in sales_transactions table

    = Table.AddColumn(#"Cleanup currency", "Norm_sales_amount", each if [currency] = "INR" then Number.Round([sales_amount]*0.012309894,2) else [sales_amount])

    1. Formula to remove duplicates - in sales_transactions table

    = Table.SelectRows(#"remove -1/0 from sales amount", each ([currency] = "INR#(cr)" or [currency] = "USD#(cr)"))

    1. Formula to remove NULL zones - in sales_markets table

    = Table.SelectRows(sales_markets, each ([zone] <> ""))

  • Loaded the clean data onto Power BI to build the dashboard.

    Building Dashboard Using Power BI

    image

    Mobile Version

    image

    Insights

  • The company has 12.12M$ revenue and 2M$ total sales.
  • The company’s revenue has been declining since February 2020, most likely due to covid.
  • Most of the company’s sales (84.44%) are made through Brick & Mortar as opposed to online sales.
  • The Delhi Market is the one with the highest revenue and sales quantity, and has been over the years, followed by Mumbai.
  • Electricalsara Stores are the customer with the highest revenue.
  • 100% of Electricalsara’s sales are done face-to-face and not online.
  • January of 2018 was the month in which the company had the highest revenue with 523K$.
  • As shown by the Top 5 Products bar chart, most product codes were left blank, showing there is a serious mistake that’s being done when inputting the transactions, this mistake needs to be addressed to be able to get better insights in which product bring in the highest revenue.
  • sales-data-analysis-etl's People

    Contributors

    kawtharmu avatar

    Stargazers

     avatar

    Watchers

     avatar

    Recommend Projects

    • React photo React

      A declarative, efficient, and flexible JavaScript library for building user interfaces.

    • Vue.js photo Vue.js

      🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

    • Typescript photo Typescript

      TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

    • TensorFlow photo TensorFlow

      An Open Source Machine Learning Framework for Everyone

    • Django photo Django

      The Web framework for perfectionists with deadlines.

    • D3 photo D3

      Bring data to life with SVG, Canvas and HTML. 📊📈🎉

    Recommend Topics

    • javascript

      JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

    • web

      Some thing interesting about web. New door for the world.

    • server

      A server is a program made to process requests and deliver data to clients.

    • Machine learning

      Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

    • Game

      Some thing interesting about game, make everyone happy.

    Recommend Org

    • Facebook photo Facebook

      We are working to build community through open source technology. NB: members must have two-factor auth.

    • Microsoft photo Microsoft

      Open source projects and samples from Microsoft.

    • Google photo Google

      Google ❤️ Open Source for everyone.

    • D3 photo D3

      Data-Driven Documents codes.