Git Product home page Git Product logo

rkd314 / interviewchallenge1 Goto Github PK

View Code? Open in Web Editor NEW
0.0 2.0 1.0 2.59 MB

I interviewed for a job at a company that required an analysis of an artificial dataset as part of the process. The company name is left out of the description to make this less searchable, however, the dataset is not real, they change it frequently, and I wasn't asked to keep my analysis private.

Python 100.00%

interviewchallenge1's Introduction

This is a set of python scripts that can be used to analyze the sqlite database also included in this repository. The goal of the analysis was to study performance metrics for the product over time (see more info in the pdf report).

========== Some general notes: This code is not optimal: 1) Lack of error handling 2) No thought given to scalability; basically written just for this or equivalently small exercises 3) Some best practices ignored. Example: I've used python string substitutions instead of the DB-API's replace function. 4) Several of the scripts could have been combined, with functionality further specified by user input, to create more general tools

========== Filename: read_database.py

Function: Opens connection to sqlite database, basic reusable functions for general reading of info

Execution: import read_database

========== Filename: add_requests_new.py

Function: adds requests_new table referenced in writeup

Execution: python add_requests_new.py <DB_FILENAME> 1

==========

Filename: users_basic_info.py

Function: Show basic info about the users in the dataset, using plots and output to the console

Execution:

        To get information about the users' gender:
        
            python users_basic_info.py <DB_FILENAME> 1
        
        To get information about the number of consumers vs providers:
        
            python users_basic_info.py <DB_FILENAME> 2
            
        To get information (and a plot) about the number of consumer users in a given location:
        
            python users_basic_info.py <DB_FILENAME> 3
            
        To get information (and plots) about the number of consumer users in a given category and meta-category:
            
            python users_basic_info.py <DB_FILENAME> 4

========== Filename: requests_basic_info.py

Function: Show basic info about the requests in the dataset, using plots and output to the console

Execution:

        For instructions, type: python requests_basic_info.py help

========== Filename: invites_basic_info.py

Function: Show basic info about the invites in the dataset, using plots and output to the console

Execution:

        For instructions, type: python invites_basic_info.py help

========== Filename: quotes_basic_info.py

Function: Show basic info about the quotes in the dataset, using plots and output to the console

Execution:

        To generate two plots of the number of quotes vs time (daily and weekly binning):
        
            python quotes_basic_info.py <DB_FILENAME> 1
            
        To generate two plots of the number of quotes vs time (daily and weekly binning) for a given meta-category:
            
            python quotes_basic_info.py <DB_FILENAME> 2 <META-CATEGORY ID>

========== Filename: analysis.py

Function: Analysis of quoting metrics and change in number of unique category ids over time

Execution:

        Plots of the linear fit to the trend of the weekly distributions:
            
            python analysis.py trend
            
        Plots of power spectral density for the daily distributions:
        
            Number of quotes vs time:
                python analysis.py period_qvt
                
            Number of unique category ids vs time:
                python analysis.py period_cat
                
            Invite to quote rate vs time:
                python analysis.py period_i2q
                
        Plots of coherence:
        
            Coherence of quotes vs time with categories vs time:
                python analysis.py co_qvt_cat
                
            Coherence of invite-to-quote-rate with categories vs time:
                python analysis.py co_i2q_cat

        To reproduce the dataframe that I reference in the report:
        
            python data_analysis.py <DB_FILENAME> 1
            
        To add an updated requests table ('requests_new') including meta_category_id as an attribute:
        
            python data_analysis.py <DB_FILENAME> 2
            
            NOTE: Don't do this more than once, there is nothing built in to drop the table if it already exists

Example: python data_analysis.py invite_dataset_ff829852.sqlite 1

interviewchallenge1's People

Watchers

James Cloos avatar Rozmin avatar

Forkers

devrj12

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.