Git Product home page Git Product logo

fastexcel's Introduction

fastexcel

A fast excel file reader for Python, written in Rust.

Based on calamine and Apache Arrow.

Docs available here.

Dev setup

Prerequisites

Python>=3.10 and a recent Rust toolchain must be installed on your machine. cargo must be available in your PATH.

First setup

On the very first time you setup the project, you'll need to create a virtualenv and install the necessary tools:

python -m venv .venv
source .venv/bin/activate
(.venv) make dev-setup

This will also set up pre-commit.

Installing the project in dev mode

In order to install the project in dev mode (for local tests for example), use make dev-install. This will compile the wheel (in debug mode) and install it. It will then be available in your venv.

Installing the project in prod mode

This is required for profiling, as dev mdoe wheels are much slower. make prod-install will compile the project in release mode and install it in your local venv, overriding previous dev installs.

Linting and formatting

The Makefile provides the lint and format extras to ease this.

Running the tests

make test

Running the benchmarks

Speed benchmark

make benchmarks

Memory benchmark

mprof run -T 0.01 python python/tests/benchmarks/memory.py python/tests/benchmarks/fixtures/plain_data.xls

Building the docs

make doc

Creating a release

  1. Create a PR containing a commit that only updates the version in Cargo.toml.
  2. Once it is approved, squash and merge it into main.
  3. Tag the squashed commit, and push it.
  4. The release GitHub action will take care of the rest.

Dev tips

  • Use cargo check to verify that your rust code compiles, no need to go through maturin every time
  • cargo clippy = ๐Ÿ’–
  • Careful with arrow constructors, they tend to allocate a lot
  • mprof and time go a long way for perf checks, no need to go fancy right from the start

fastexcel's People

Contributors

lukapeschke avatar dependabot[bot] avatar jgundermann avatar prettywood avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.