
aymane-maghouti / mobile-data-hive-insights



Home Page: https://youtu.be/_RyK8cX-14Y?si=tan4FEA7dK26Za7h



Mobile-Data-Hive-Insights

Table of Contents

  1. Project Overview
  2. Technologies Used
  3. Data Pipeline
  4. Repository Structure
  5. How to Run
  6. Dashboard
  7. Acknowledgments
  8. Conclusion
  9. Contacts

Project Overview

This project demonstrates how to extract data from a MySQL database, transfer it with Apache Sqoop, store it in a Hive data warehouse (the data itself is stored in the Hadoop Distributed File System, HDFS), and analyze it with Hive Query Language (HiveQL), a language close to SQL. The data is then visualized in Power BI after connecting the Hive data warehouse to Power BI.

Technologies Used

MySQL Database: Used as the source database from which data was extracted.

Apache Sqoop: Used to transfer data between MySQL and the Hadoop ecosystem. Data was imported from MySQL into the Hive data warehouse in batch mode.

Apache Hive: Used as the data warehouse in this project, and for data processing and analysis with Hive Query Language (HiveQL).

Apache Derby: Used as the embedded database that stores the Hive Metastore.

Hadoop Ecosystem (HDFS): The data in the Hive data warehouse is physically stored in the Hadoop Distributed File System (HDFS).

Power BI: Used for creating interactive visualizations and dashboards.

Data Pipeline

Here is the data pipeline:

[Image: mobile_Data_pipeline]

Repository Structure

Mobile-Data-Hive-Insights:.
│   README.md
│
├───dashboard
│       mobile_dash.pbix
│       mobile_dash.pdf
│
├───dataset
│       Mobile_Data.csv
│
├───hive_sqoop
│       hive_statements.sh
│       import_data_with_sqoop.sh
│
└───images
        dashboard_mobile.png
        data_pipeline.png

How to Run

Prerequisites

  • MySQL Database: Ensure you have the necessary access rights to the database that will hold the source data, and import the data from the CSV file (dataset/Mobile_Data.csv) into a MySQL table.
  • Apache Sqoop: Install and configure Apache Sqoop for data transfer between MySQL and the Hive data warehouse.
  • Hadoop: Set up a Hadoop cluster and configure HDFS for data storage.
  • Apache Hive: Install Hive for data analysis.
  • Power BI: Install Power BI for creating the dashboard.
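Before running anything, it can help to confirm that the command-line tools from the list above are reachable. The following is a minimal sketch, not part of the repository: the function name `missing_tools` is hypothetical, and the tool list (`mysql`, `hadoop`, `hive`, `sqoop`) is an assumption about the executable names on your PATH. Power BI is a desktop application and is not checked.

```shell
#!/bin/sh
# Preflight sketch: report which required command-line tools are missing from PATH.
# The tool names below are assumptions based on the prerequisites list.

# Print the name of every argument that is not an executable on PATH.
missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
        fi
    done
}

missing=$(missing_tools mysql hadoop hive sqoop)
if [ -n "$missing" ]; then
    echo "Missing from PATH:"
    echo "$missing"
else
    echo "All required tools were found on PATH."
fi
```

If a tool is reported as missing, check that its `bin` directory has been added to the PATH environment variable.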

Running

  • Start the Hadoop cluster using the following commands:

    start-dfs
    start-yarn
    
  • Start the Apache Derby server using the command:

    startNetworkServer -h 0.0.0.0
    
  • Launch Hive using the command:

    hive
    
  • Open a new command line window as an Administrator and run the following Sqoop command to import data from MySQL to the Hive table:

    sqoop import --connect jdbc:mysql://localhost:3306/<mysql_database_name> --username <your_username> --password <your_password> --table <mysql_table_name> --hive-import --hive-table <hive_table> --create-hive-table -m 1

Note: Replace <mysql_database_name>, <your_username>, <your_password>, <mysql_table_name>, and <hive_table> with appropriate values.
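The one-line command above can also be wrapped in a small script so that the placeholders become variables. This is a sketch under stated assumptions, not the repository's `import_data_with_sqoop.sh`: the variable names, their default values, and the function `sqoop_import` are hypothetical. It also swaps `--password <your_password>` for Sqoop's `-P` flag, which prompts for the password on the console, and uses `--num-mappers 1`, the long form of `-m 1`.

```shell
#!/bin/sh
# Sketch: assemble the Sqoop import command from variables.
# All names and default values here are hypothetical placeholders.

MYSQL_DB="${MYSQL_DB:-mobile_db}"
MYSQL_USER="${MYSQL_USER:-root}"
MYSQL_TABLE="${MYSQL_TABLE:-mobile_data}"
HIVE_TABLE="${HIVE_TABLE:-mobile_data}"

# Usage: sqoop_import <print|run> <mysql_db> <mysql_user> <mysql_table> <hive_table>
# "print" echoes the command line; "run" executes it.
sqoop_import() {
    mode=$1; db=$2; user=$3; src_table=$4; hive_table=$5
    set -- sqoop import \
        --connect "jdbc:mysql://localhost:3306/${db}" \
        --username "${user}" -P \
        --table "${src_table}" \
        --hive-import --hive-table "${hive_table}" --create-hive-table \
        --num-mappers 1
    if [ "$mode" = "run" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# Dry run by default: show the command without contacting MySQL or Hive.
sqoop_import print "$MYSQL_DB" "$MYSQL_USER" "$MYSQL_TABLE" "$HIVE_TABLE"
```

Printing the command first makes it easy to review the substituted values; switching the first argument to `run` performs the actual import.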

Check for any errors during the data import process. If there are no errors, congratulations! The data has been successfully imported into the Hive data warehouse.
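If the import does fail, one thing worth checking is whether the Hadoop daemons launched by start-dfs and start-yarn are actually running. The sketch below assumes a single-node setup in which NameNode, DataNode, ResourceManager, and NodeManager all run on the local machine, and that the JDK's `jps` tool is on the PATH. The function name `missing_daemons` is hypothetical.

```shell
#!/bin/sh
# Troubleshooting sketch: list the Hadoop daemons that are absent from `jps` output.
# Assumes a single-node setup where all four daemons run locally.

# Read `jps` output ("<pid> <ClassName>" per line) on stdin and print each
# expected daemon that does not appear in it.
missing_daemons() {
    running=$(awk '{print $2}')
    for daemon in NameNode DataNode ResourceManager NodeManager; do
        if ! printf '%s\n' "$running" | grep -qx "$daemon"; then
            printf '%s\n' "$daemon"
        fi
    done
}

if command -v jps >/dev/null 2>&1; then
    jps | missing_daemons
else
    echo "jps not found; it ships with the JDK" >&2
fi
```

An empty result means all four daemons were found. Any name that is printed points to a daemon that should be restarted before retrying the import.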

Building the Dashboard:

  • Launch Power BI and connect to the Hive data source to access the imported data.

  • Build interactive visualizations, charts, and graphs based on the imported data, or open the provided dashboard/mobile_dash.pbix file directly.

With these steps completed, you have successfully set up the project, imported data from MySQL to the Hive data warehouse, and created interactive dashboards using Power BI. Enjoy exploring your data and gaining valuable insights!

Dashboard

Here is the dashboard created in Power BI:

[Image: mobile Dashboard]

Acknowledgments

  • Special thanks to the open-source communities behind the Hadoop ecosystem, and to Power BI.

Conclusion

This project integrated MySQL data into the Hadoop ecosystem using Apache Sqoop and Hive, enabling analysis with HiveQL and visualization through Power BI. Together, these technologies turn raw data into actionable insights and show the value of a unified data pipeline in modern analytics.

Contacts

For any questions or further information, feel free to contact me :)
