Ville Puuska

Experience

  • 2023- Data Engineer, Solita
  • 2017-2023 PhD student/researcher, Tampere University

Data engineering

Interests

  • Designing and building E2E testable data pipelines
  • Event-driven architectures and data pipelines
  • Streaming data pipelines

Tech at work

  • Python, PySpark, Spark SQL, and R when absolutely necessary
  • Azure Data Factory, Databricks

Tech at home

  • Python and a bit of Go
  • Airflow, Docker, DuckDB, Kafka, Polars, Postgres

Mathematics

Research and Publications

My research is focused on the algebraic theory of topological data analysis. I'm interested in utilizing (minimal) resolutions to develop computable and interpretable representations and invariants for multiparameter persistent (co)homology and persistence modules more generally.

Education

  • 2017-2023, PhD, Mathematics, Tampere University
    Advisor: Professor Eero Hyry, Tampere University
    Field: Topological Data Analysis
    Thesis: Flat Covers and Cotorsion in Persistence https://urn.fi/URN:ISBN:978-952-03-3058-3
  • 2013-2017, MSc (and BSc), Mathematics, University of Tampere

Ville Puuska's Projects

aoc

Advent of Code solutions

journeys-pipeline-dlt-duckdb-polars

Simple example of an ELT pipeline using dlt for ingesting from the JourneysAPI, DuckDB for intermediate storage, and DuckDB & Polars for transformations.

local-lakehouse

PoC Python package for using Unity Catalog OSS to manage local structured data and accessing it via a Polars DataFrame API and DuckDB SQL API.
