demo_microservice's Introduction

Project1_YuanjingZhu_QueryCustomerSpendingScore

Python application test with Github Actions

This is the repository for project 1 of IDS706 Data Engineering Systems.

Overview

In this project, I downloaded the Mall Customer Segmentation dataset from Kaggle and uploaded it to Databricks. I then connected GitHub Codespaces to my Databricks cluster, wrote a function to execute a SQL query, built a command-line tool, and finally built a simple web app. By default, the query returns the spending scores of customers with a $50k annual income; users can supply any other income value, and the spending scores of all customers with that income are displayed.

Dataset

The dataset comes from Kaggle. It contains customer information from a supermarket: customer ID, age, gender, annual income, and spending score. The spending score is assigned to each customer based on their spending and purchasing behavior. By analyzing the dataset, the supermarket owner can segment customers and execute an effective strategy for each segment.

Connecting Codespaces and Databricks

I created four secrets in the GitHub settings: DATABRICKS_HOST, DATABRICKS_HTTP_PATH, DATABRICKS_SERVER_HOSTNAME and DATABRICKS_TOKEN.

Then run the following commands in Codespaces to check the connection.

databricks clusters list --output JSON | jq
databricks fs ls dbfs:/
databricks jobs list --output JSON | jq
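Before running the commands above, it can help to confirm that the four secrets actually reached the Codespaces environment. The following is a minimal sketch, assuming the secrets are exposed as environment variables with the same names; the helper name missing_secrets is hypothetical and not part of the repository.

```python
import os

# The four secret names listed above; assumed to be exported as
# environment variables inside the Codespace.
REQUIRED_SECRETS = (
    "DATABRICKS_HOST",
    "DATABRICKS_HTTP_PATH",
    "DATABRICKS_SERVER_HOSTNAME",
    "DATABRICKS_TOKEN",
)


def missing_secrets(env=None):
    """Return the names of required secrets that are unset or empty."""
    if env is None:
        env = os.environ
    return [name for name in REQUIRED_SECRETS if not env.get(name)]
```

An empty list from missing_secrets() means all four values are present; it says nothing about whether the values themselves are valid, which is what the databricks commands check.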

SQL query

The default SQL query returns the spending scores of customers whose annual income is $50,000. The querydb function takes an income value and returns the spending scores of all customers whose annual income equals that value.
For example:
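A function of this kind could look roughly like the sketch below. It is not the repository's code: the table name mall_customers and the column names customer_id, annual_income and spending_score are assumptions, as is the optional connection parameter, which accepts any DB-API connection so the function can be tried without a cluster. When no connection is passed, the sketch opens one with the Databricks SQL connector (the databricks-sql-connector package) using the secrets described earlier.

```python
import os


def build_query(income=50, table="mall_customers"):
    """Return the SQL text for one annual income value.

    The table and column names are assumptions for illustration.
    """
    income = int(income)  # reject non-numeric input before it reaches the SQL text
    return (
        f"SELECT customer_id, spending_score FROM {table} "
        f"WHERE annual_income = {income}"
    )


def querydb(income=50, connection=None):
    """Run the query and return all matching rows."""
    owns_connection = connection is None
    if owns_connection:
        # Imported lazily so the function also works with other DB-API connections.
        from databricks import sql

        connection = sql.connect(
            server_hostname=os.environ["DATABRICKS_SERVER_HOSTNAME"],
            http_path=os.environ["DATABRICKS_HTTP_PATH"],
            access_token=os.environ["DATABRICKS_TOKEN"],
        )
    cursor = connection.cursor()
    try:
        cursor.execute(build_query(income))
        return cursor.fetchall()
    finally:
        cursor.close()
        if owns_connection:
            connection.close()
```

Converting the income to int before formatting keeps arbitrary text out of the SQL string; a parameterized query would be the more general choice if the filter ever accepted free-form input.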

Command line tool

chmod +x query_mall_sql_db.py 
./query_mall_sql_db.py cli-query  --help
./query_mall_sql_db.py cli-query  --income "60"

The chmod command manages file system access permissions on Unix and Unix-like systems; +x adds execute permission to the script. cli-query is the function name. Pass --help to see the instructions, or pass --income with an integer to execute the SQL query.
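The command shape used above (a cli-query subcommand with an --income option) can be sketched with the standard library's argparse. This is an illustration only: the actual script may be built with a different CLI library, and run_query here is a hypothetical stand-in for the real query function.

```python
import argparse


def build_parser():
    """Build a parser with one subcommand, cli-query, that takes --income."""
    parser = argparse.ArgumentParser(
        description="Query customer spending scores by annual income"
    )
    subcommands = parser.add_subparsers(dest="command", required=True)
    cli_query = subcommands.add_parser(
        "cli-query", help="return spending scores for one income value"
    )
    cli_query.add_argument(
        "--income", type=int, default=50,
        help="annual income to filter on (default: 50)",
    )
    return parser


def main(argv=None, run_query=print):
    """Parse the arguments and hand the income to run_query.

    run_query is a placeholder; a real tool would pass its query function.
    """
    args = build_parser().parse_args(argv)
    return run_query(args.income)
```

Calling main() from an `if __name__ == "__main__":` guard at the bottom of the file, together with a `#!/usr/bin/env python` first line, is what lets the file run as ./script.py after chmod +x.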

Web app

python fastapi-app.py 

After running this command in the terminal, a new web page opens and the home page says "Welcome to my Databricks sql query!". Append "/query" to the URL, and the page returns the customers' spending scores.
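The two routes described above could be wired up roughly as follows. This is a sketch under assumptions, not the contents of fastapi-app.py: the handler names, the income query parameter, and the SAMPLE_ROWS placeholder data are all hypothetical, and the placeholder lookup stands in for the real Databricks query.

```python
# Placeholder rows (customer_id, annual_income, spending_score); made up for
# illustration and used instead of a real Databricks query.
SAMPLE_ROWS = [(1, 50, 40), (2, 60, 55), (3, 50, 72)]


def lookup_scores(income):
    """Stand-in for the real query function: filter the placeholder rows."""
    return [
        {"customer_id": cid, "spending_score": score}
        for cid, row_income, score in SAMPLE_ROWS
        if row_income == income
    ]


def root():
    """Handler for the home page."""
    return {"message": "Welcome to my Databricks sql query!"}


def query(income: int = 50):
    """Handler for /query; the income defaults to 50 when omitted."""
    return {"income": income, "result": lookup_scores(income)}


def create_app():
    """Register the handlers on a FastAPI app (requires the fastapi package)."""
    from fastapi import FastAPI

    app = FastAPI()
    app.get("/")(root)
    app.get("/query")(query)
    return app
```

To serve it, something like `uvicorn.run(create_app(), host="0.0.0.0", port=8080)` would work with the uvicorn package installed; the port number is an arbitrary choice here. With this layout, /query uses the default income and /query?income=60 overrides it.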
