Panda-monium

Inactively Maintained

Panda-monium lets you serialize and compress Pandas DataFrames. It uses CSV to serialize DataFrames and Goose to compress them. The usual alternatives each have a drawback: pickle files take up a lot of space on your computer, and converting to CSV files can create the annoying Unnamed: 0 column.

Tutorial

Example:

import pandas as pd
import Pandamonium as pm  # Keep the uppercase "P"
data = pd.DataFrame({ ... })  # Add your data
file = "data.pdc"

# Compress
pm.compress(data, file)  # Should return "Success!"

# Decompress
loaded = pm.decompress(file)

How it works

Panda-monium works by converting a DataFrame into CSV text and replacing common substrings (such as a comma next to a number) with a single character. The larger the data, the more such substrings there are to replace, so compression is more likely to pay off. The annoying Unnamed: 0 column is dropped during decompression.
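The idea can be sketched in a few lines. This is only an illustration and not Panda-monium's actual code: the substitution table (a comma followed by one digit, mapped to a private-use Unicode character) and the helper names `compress_text` and `decompress_text` are assumptions made for the example.

```python
import io

import pandas as pd

# Assumed substitution table: ",0" to ",9" each map to one private-use
# character. Panda-monium's real table may differ.
SUBSTITUTIONS = {f",{d}": chr(0xE000 + d) for d in range(10)}


def compress_text(df: pd.DataFrame) -> str:
    """Serialize to CSV (index included), then shrink two-character substrings."""
    text = df.to_csv()
    for pattern, token in SUBSTITUTIONS.items():
        text = text.replace(pattern, token)
    return text


def decompress_text(text: str) -> pd.DataFrame:
    """Undo the substitutions, parse the CSV, and drop the index column."""
    for pattern, token in SUBSTITUTIONS.items():
        text = text.replace(token, pattern)
    df = pd.read_csv(io.StringIO(text))
    # The unnamed index column comes back as "Unnamed: 0"; remove it here.
    return df.drop(columns=["Unnamed: 0"])


frame = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})
packed = compress_text(frame)
restored = decompress_text(packed)
```

The round trip in this sketch only holds for data that survives a plain CSV round trip and that contains none of the replacement characters.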

Collisions

A collision happens when one of the DataFrame's strings contains a Panda-monium "keyword" or one of the replacement characters used during compression. Panda-monium is designed to make collisions unlikely: it replaces a comma next to any number with a character that isn't on the keyboard.

A collision can leave the decompressed DataFrame malformed, with unexpected dimensions, which can cause errors.
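A small sketch shows how a collision changes the shape of a row. It works on a single CSV line with a hypothetical replacement character `TOKEN` standing in for ",1"; the character and the helper names are assumptions, not Panda-monium's real ones.

```python
TOKEN = "\ue001"  # hypothetical replacement character for ",1"


def compress_row(row: str) -> str:
    return row.replace(",1", TOKEN)


def decompress_row(packed: str) -> str:
    return packed.replace(TOKEN, ",1")


clean = "alice,1,ok"
dirty = "bob,2,note" + TOKEN  # the last cell already contains the token

clean_restored = decompress_row(compress_row(clean))
dirty_restored = decompress_row(compress_row(dirty))

# The stray token is expanded into ",1", which adds a field to the row.
fields_before = len(dirty.split(","))
fields_after = len(dirty_restored.split(","))
```

Here the clean row survives the round trip, while the row that already contained the token comes back with one extra field.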

Preventing Collisions

Collisions can largely be avoided by using replacement characters that most systems can't display (such as �).

However, there is still a small chance that one of these symbols (which GitHub can't display) ends up in a DataFrame. The solution? Escape characters. They will be added in future versions.
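One way escape characters could work is sketched below. This is a guess at a possible design, not the planned implementation: `ESC`, `TOKEN`, and all function names are hypothetical.

```python
ESC = "\ue000"    # hypothetical escape character
TOKEN = "\ue001"  # hypothetical replacement character for ",1"


def escape(text: str) -> str:
    # Escape ESC first so the ESC added for TOKEN is not escaped again.
    return text.replace(ESC, ESC + "0").replace(TOKEN, ESC + "1")


def unescape(text: str) -> str:
    # After escaping, every ESC is followed by "0" or "1", so the two
    # replacements cannot interfere with each other.
    return text.replace(ESC + "1", TOKEN).replace(ESC + "0", ESC)


def compress_row(row: str) -> str:
    # Escape reserved characters in the data, then apply the substitution.
    return escape(row).replace(",1", TOKEN)


def decompress_row(packed: str) -> str:
    # Undo the substitution first, then restore the escaped characters.
    return unescape(packed.replace(TOKEN, ",1"))


samples = [
    "alice,1,ok",
    "bob,2,note" + TOKEN,
    ESC + "1," + TOKEN + ",1",
    ESC + ESC + "0",
]
round_trips = [decompress_row(compress_row(s)) == s for s in samples]
```

Because reserved characters in the data are escaped before compression, a token that appears after compression can only have come from a substitution, so decompression is unambiguous in this sketch.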

Contributors

cardinal9999
