data_science_portfolio's Introduction

Data Science Portfolio


A collection of my various data science projects

Table Of Contents

Data Storytelling

  • Web Crawling for data: Capturing data with a Scrapy web spider. Scrapy is an open-source framework for web scraping. The spider crawls a retailer's site to build an inventory list with prices, and the resulting inventory table contains over 135,000 entries. Because the spider can be run through a cloud service such as Scrapy Cloud, the method is extensible and scalable.
    Keywords: Web Spider, Web Crawler, Scrapy, Pandas, Data visualization

  • Split Test Analysis with Bayesian Statistics: A product split test analysis starting from a table of conversion rates.
    Keywords: A/B Test, Bayesian inference, Pandas, Data visualization

  • Geographic sales data: A sample of geographic sales data for California. Geospatial data (latitude and longitude) is read from two CSV files and merged into one table by order ID. The geo data is then used to derive zip code, city, and average income.
    Keywords: Econometrics, Geographic data, Pandas, Google maps, Heatmap, Data analytics, Table merge

  • Online dating stats: An analysis, with posterior distributions, of dating data for a Latino test account compared to similar demographics.
    Keywords: A/B Test, Bayesian inference, Pandas, Data visualization

  • Micro-hydro power generation: Due diligence on the viability of utilizing micro-hydro power generators in California's San Joaquin Valley irrigation canals. This is a work in progress!
    Keywords: Entrepreneur ventures, Business Development, Return on investment, Net present value, Lists of cash flows, Levelized cost of electricity, Returns over time

  • Fitting a sigmoid function to silicon emissivity data: Reworking recorded intrinsic silicon emissivity data by fitting a sigmoid function using PyMC3. This is a work in progress!
    Keywords: PyMC3, Bayesian inference, Pandas
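
The Bayesian split-test entries above can be illustrated with a minimal sketch. This is not the code from the project notebooks: it assumes a uniform Beta(1, 1) prior and binomial conversion data, the visitor and conversion counts are made-up numbers, and `posterior_samples` and `prob_b_beats_a` are hypothetical helper names. It uses only the Python standard library rather than Pandas.

```python
import random


def posterior_samples(conversions, visitors, n_samples, rng):
    """Draw samples from the Beta posterior of a conversion rate.

    Assumes a uniform Beta(1, 1) prior and binomial data, so the
    posterior is Beta(1 + conversions, 1 + non-conversions).
    """
    alpha = 1 + conversions
    beta = 1 + visitors - conversions
    return [rng.betavariate(alpha, beta) for _ in range(n_samples)]


def prob_b_beats_a(conv_a, n_a, conv_b, n_b, n_samples=20000, seed=0):
    """Monte Carlo estimate of P(rate_B > rate_A) under the two posteriors."""
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    a = posterior_samples(conv_a, n_a, n_samples, rng)
    b = posterior_samples(conv_b, n_b, n_samples, rng)
    wins = sum(1 for x, y in zip(a, b) if y > x)
    return wins / n_samples


if __name__ == "__main__":
    # Hypothetical counts: 120/1000 conversions for A, 150/1000 for B.
    print(prob_b_beats_a(120, 1000, 150, 1000))
```

The returned value is the estimated posterior probability that variant B has the higher conversion rate, which is often easier to communicate than a p-value.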

Data Cleaning

  • Data Wrangling: A data munging exercise working with a JSON file of 150,000 entries.
    Keywords: Data munging, JSON, Large data, Pandas, String manipulation
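
As a rough idea of what string-level munging on JSON records can look like, here is a minimal sketch. It is not the project's code: the field names and sample records are invented for illustration, `clean_record` and `load_and_clean` are hypothetical helpers, and it uses the standard-library `json` module instead of Pandas.

```python
import json


def clean_record(record):
    """Normalize one JSON object.

    Keys are stripped and lowercased; string values have runs of
    whitespace collapsed, and empty strings become None.
    """
    cleaned = {}
    for key, value in record.items():
        if isinstance(value, str):
            value = " ".join(value.split())  # trim and collapse whitespace
            value = value or None            # treat empty strings as missing
        cleaned[key.strip().lower()] = value
    return cleaned


def load_and_clean(json_text):
    """Parse a JSON array of objects and clean every record."""
    return [clean_record(r) for r in json.loads(json_text)]


# Made-up sample input with messy keys and values.
sample = (
    '[{" Name ": "  Ada   Lovelace ", "City": ""},'
    ' {"Name": "Alan Turing", "City": "London"}]'
)
rows = load_and_clean(sample)
```

For a file with on the order of 150,000 entries, the same per-record function could be applied while streaming the file or through a Pandas `DataFrame`, depending on memory constraints.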

data_science_portfolio's People

Contributors

caheredia
