Hi there 👋

I'm a data analyst passionate about coding in Python, R, and Go.

Some of my projects include:

Calibrated Uncertainty (NumPy, SciPy, scikit-learn, PyMC3, JAX-based NumPyro)

  • Conducted in-depth analysis of an uncertainty calibration algorithm for Bayesian neural networks.
  • Identified the advantages of the approach and its modes of failure.
  • Received the highest grade and an invitation to continue research at the Harvard Data to Actionable Knowledge lab.

Galaxy Measurements (TensorFlow, pandas, Streamlit, Docker, Heroku)

  • Used TensorFlow to estimate the shape and brightness of simulated galaxies.
  • Responsible for data generation, an exploratory data analysis web app, neural architecture search, and denoising pipelines.
  • Received the top grade and an opportunity to continue research.

NBA Conference Advantage (R, tidyverse, Scrapy, LaTeX)

  • Performed statistical modeling of a potential bias in the NBA that gives teams in one conference an easier path to success because of differences in travel and schedule.
  • Wrote the web scrapers, the feature engineering, and most of the analysis code in R.
  • Built linear regression models, ran diagnostics, and authored around 70% of the report.

Meteorological Observatory (Python, TCP sockets, regex, pytest)

  • Implemented streaming data collection from weather instruments in Python.
  • The code has a suite of unit tests and is deployed at a university meteorological station.
  • Serves as the basis for experimental studies of turbulence by five scientific institutions.

Open Source Projects (Python, R, C/C++)

  • Contributed bug fixes and documentation updates to open source projects such as Apache Arrow, scikit-learn, ThunderSVM (GPU-accelerated SVM), Picasso (sparse regression algorithm), and Gap Statistic (clustering metric).

Dmitry Vukolov's Projects

arrow

Apache Arrow is a cross-language development platform for in-memory data. It specifies a standardized language-independent columnar memory format for flat and hierarchical data, organized for efficient analytic operations on modern hardware. It also provides computational libraries and zero-copy streaming messaging and interprocess communication. Languages currently supported include C, C++, Java, JavaScript, Python, and Ruby.

decimal

A high-performance, arbitrary-precision, floating-point decimal library.

fsnotify

Cross-platform file system notifications for Go.

galsim

The modular galaxy image simulation toolkit.

gap_statistic

Dynamically get the suggested clusters in the data for unsupervised learning.

gotop

A terminal-based graphical activity monitor inspired by gtop and vtop.

kedro

A Python library for building robust production-ready data and analytics pipelines

nba-conference

Statistical modeling of NBA conference bias due to differences in travel and schedule

picasso

Penalized Sparse Learning Solver - Unleash the Power of Nonconvex Penalty

scrapy-mongodb

MongoDB pipeline for Scrapy. This module supports both MongoDB in standalone setups and replica sets. scrapy-mongodb will insert the items to MongoDB as soon as your spider finds data to extract.

streisand

Streisand sets up a new server running your choice of WireGuard, OpenConnect, OpenSSH, OpenVPN, Shadowsocks, sslh, Stunnel, and a Tor bridge. It also generates custom instructions for all of these services. At the end of the run you are given an HTML file with instructions that can be shared with friends, family members, and fellow activists.

thundersvm

ThunderSVM: A Fast SVM Library on GPUs and CPUs

vtop

Wow such top. So stats. More better than regular top.

watchtower

Python CloudWatch Logging: Log Analytics and Application Intelligence
