
Denis Volk, PhD

Full Stack Data Science Engineer @ Toptal, PhD in Mathematics


I am a top-notch full-stack data scientist and ML engineer, highly skilled in modern generative AI, machine learning, MLOps, data analysis, mathematical modeling, data pipelines, big data, and cloud infrastructure. My expertise includes LLMs and AI chatbots, time series analysis and forecasting, geospatial data analysis, natural language processing, probabilistic modeling, data engineering, and team leadership.

I hold a Ph.D. in mathematics and have done research in mathematics, physics, neuroscience, and medicine.


👨‍💻 Industry experience (selected)

🚀 Founder, CTO @ OkGPT

2023 - PRESENT

  • Created OkGPT, an AI personal assistant messenger bot. It allows users to interact with the most advanced AI through voice and text while offering additional productivity features and integrations.
  • Optimized the code to process user queries in parallel, increasing the bot's performance by an order of magnitude.
  • Integrated multiple APIs, including Telegram, OpenAI, Cohere, Anthropic, Google, Redis, Amplitude, Datadog, and more.
  • Supervised three team members and a few external collaborators.

🚀 Senior AI Engineer @ Software Company (via Toptal)

2024

  • Trained a custom BERT deep neural network to identify AI-generated software code.
  • Created a data pipeline to prepare model training data from open source code repositories on GitHub.
  • Architected a test suite for classification models, including validation datasets and testing procedures.
  • Deployed the classification models to production and containerized microservices in the AWS cloud.

🚀 AI Engineer @ Generative Tech Startup

2023

  • Developed an MVP of an app to query enterprise data in natural language. Given access to a database and a question in natural language about the data, the app would output the answer as a plot or a small table.
  • Engineered and fine-tuned the prompts to improve the quality and correctness of SQL code generation.
  • Created an automatic annotator for the database columns and the final table.

🚀 Senior Machine Learning Engineer @ Blockchain Security Company

2023

  • Created a machine learning model to automatically detect malicious smart contracts before they can cause harm.
  • Built a visualization tool for model output to audit its decisions.
  • Set up automatic model deployment to the AWS cloud platform as a serverless Lambda function.

🚀 Senior Data Scientist and Data Engineer @ Israel-based HR Tech Startup (via Toptal)

2021 - 2022

  • Architected and directed the creation of a core Similarity engine to score candidates.
  • Used pre-trained NLP deep neural networks to create semantic text embeddings, which significantly improved the quality of the Similarity engine's results.
  • Created a Big Data pipeline in Databricks and Spark to enrich the input data and prepare the features for ML.
  • Prepared custom deep learning models to build richer embeddings, including various data sources and metadata.

🚀 Senior Data Scientist and Data Engineer @ US-based Ops/Tech Startup (via Toptal)

2020 - 2021

  • Built a foundational end-to-end machine learning solution that predicts fair prices of real-estate properties, eliminating the need for manual assessment and enabling the company to respond to its customers quickly.
  • Designed and implemented an automatically refreshing ETL pipeline that ingests, cleans, joins, and enriches new data from AWS S3 storage daily.
  • Developed an interpretable machine learning model with Scikit-learn, CatBoost, Lifelines, FBProphet, FAISS, and SHAP that consists of several submodels and satisfies business monotonicity constraints.
  • Supervised other data science team members and coordinated with the engineering team.

🚀 Senior Data Scientist @ KPMG

2017 - 2019

  • Created a machine learning model that predicted revenues for a retail store chain based on store location, local demographic data, GIS features, seasonality, and other factors.
  • Developed and deployed an interpretable machine learning model that scored B2B customers for payment default risk and provided explanations for the scores. The model substantially reduced the workload of weekly risk assessment.
  • Built a probabilistic Bayesian machine learning model to predict which apartment buildings still under construction would fail to be commissioned in time. The model helped reduce the funds needed to hedge risks by a factor of two.
  • Developed and deployed NLP models to automatically label a vast body of housing contracts by contract type and extract contractor party names, address entities, and other attributes.

🚀 Researcher and Software Engineer @ Artec Group

2004 - 2007

  • Designed and implemented biometric machine learning face recognition algorithms.
  • Created and implemented statistical test procedures for new recognition algorithms.
  • Developed calibration procedures for 3D laser and flash scanners.

🔬 Academic experience (selected)

🎓 Associate Professor in Mathematics @ Interdisciplinary Scientific Center J.-V. Poncelet (CNRS UMI 2615)

2017 - 2020

🎓 Postdoctoral Researcher @ Centre for Advanced Studies (CAS)

2015 - 2017

  • Invented a novel mathematical method for cross-frequency synchronization analysis in the human brain.
  • Implemented the method as a MATLAB toolbox and ran tests confirming that the results agreed with previously known scientific data.
  • Prepared and published the method and findings in a top-level journal.

🎓 ERC Advanced Grant Postdoctoral Researcher @ University of Rome (Tor Vergata)

2013 - 2015

  • Discovered a new geometric phenomenon accountable for the rigidity of certain mathematical models related to heat conduction in crystals.
  • Discovered a new stability property of attractors of multidimensional piecewise isometry maps related to Markov field models.

🎓 Göran Gustafsson Postdoctoral Researcher @ KTH Royal Institute of Technology

2012 - 2013

  • Established that the rotation numbers of circle maps' semigroups define their generators.
  • Discovered a fractal structure of attractors of piecewise isometry maps related to Markov field models.
  • Lectured a PhD-level course on the structural stability of dynamical systems.

🎓 Postdoctoral Researcher @ SISSA

2010 - 2012

  • Discovered a new class of dynamical systems that have persistent massive attractors.
  • Established a deep relationship between skew product dynamical systems over Markov chains and nonlinear random walks.

📄 My papers

🧠 Neuroscience

πŸͺ Mathematics and Physics

Preprints

🧬 Math Medicine & Biology

  • Mapping placental topology from 3D scans, the graphic display of variation in arborisation across gestation (with C. Salafia, M. Yampolsky, C. Stodgell, P. Katzman, J. Culhane, P. Landrigan, S. Szabo, N. Thieux, J. Swanson, N. Dole, M. Varner, J. Moye, R. Miller) // Placenta, 34(9):A73–A74, 2013 (see also Erratum)

Conference publications

  • Fast-slow partially hyperbolic systems and pathological foliations (with De Simoi, C. Liverani, C. Poquet) // International Conference "Anosov systems and modern dynamics", 2016, pp. 25–27, ISBN 978-5-98419-073-2
  • Dynamics of Piecewise Translations // Proceedings of International Conference "Dynamics, Bifurcations and Strange Attractors", 2013, pp. 112–113
  • Interval Translation Maps of Three Intervals // Proceedings of International Conference on Differential Equations and Dynamical Systems, 2012, ISBN 978-5-98419-046-6
  • Persistent massive attractors of smooth maps // Proceedings of International Mathematical Conference “50 Years of IITP”, 2011, ISBN 978-5-901158-15-9
  • Skew products with interval fiber // Conference on Geometry and Topology of Foliations, 2010, p. 23
  • Thin attractors (with V. Kleptsyn) // Topology, Geometry and Dynamics: Rokhlin Memorial, 2010, pp. 70–72
  • The density of separatrix connections in $\mathbb{C}^2$ // International conference “Differential equations and related topics” dedicated to I. G. Petrovskii, 2004. Book of Abstracts, p. 241 (Russian).

💰 My grants

  • ERC Advanced Investigator Grant MALADY 246953 [European Union]
  • “Young SISSA Scientists” (Principal Investigator) [Italy]
  • CNRS 10-01-93115 [France]
  • PRIN [Italy]
  • CNRS 05-01-02801-CNRS_a [France]
  • CRDF RM1-2358 [USA]

Denis Volk's Projects

bartpy

Bayesian Additive Regression Trees For Python

dust

Design and Deploy Large Language Model Apps

gcfd

Code for Article - Volk D, Dubinin I, Myasnikova A, Gutkin B and Nikulin VV (2018) Generalized Cross-Frequency Decomposition: A Method for the Extraction of Neuronal Components Coupled at Different Frequencies. Front. Neuroinform. 12:72. doi: 10.3389/fninf.2018.00072

git

Coursera Git course: Final project

mlcup

Official baseline solutions to Yandex Cup ML challenge

neural-ode

Jupyter notebook with Pytorch implementation of Neural Ordinary Differential Equations

obsidian-smart-connections

Chat with your notes in Obsidian! Plus, see what's most relevant in real-time! Interact and stay organized. Powered by OpenAI ChatGPT, GPT-4 & Embeddings.

oh-my-zsh-intika

A delightful community-driven (with 1,300+ contributors) framework for managing your zsh configuration. Includes 200+ optional plugins (rails, git, OSX, hub, capistrano, brew, ant, php, python, etc), over 140 themes to spice up your morning, and an auto-update tool so that makes it easy to keep up with the latest updates from the community.

parametric-t-sne

Running parametric t-SNE by Laurens Van Der Maaten with Octave and oct2py.

pm-prophet

Simplified version of the Facebook Prophet model re-implemented in PyMC3

ray

An open source framework that provides a simple, universal API for building distributed applications. Ray is packaged with RLlib, a scalable reinforcement learning library, and Tune, a scalable hyperparameter tuning library.
