Arff

A Python 3.x module that creates ARFF files from pandas DataFrames

A simple library that converts a pandas DataFrame to an ARFF file, either sparse or normal

Dependencies:

  • pandas
  • numpy
  • dateinfer (use the actively maintained pydateinfer)

Features:

  • Supports all ARFF data types:
    • Numeric
    • String
    • Nominal
    • Date
  • Creates both sparse and normal ARFF files
  • Automatically converts bool columns to nominal attributes with the values True/False
  • Easy declaration of nominal values
  • Automatically detects the date/time format (thanks to dateinfer <3)
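
To make the data type list above concrete, here is a minimal sketch of how a column specification could map to an ARFF `@attribute` line. It is not this library's implementation: the helper name `arff_attribute` and the default date pattern are assumptions chosen for illustration.

```python
import numpy as np


def arff_attribute(name, spec, date_format="yyyy-MM-dd HH:mm:ss"):
    """Return one @attribute line for a column (hypothetical helper)."""
    if isinstance(spec, (list, tuple)):
        # An explicit list of allowed values declares a nominal attribute.
        values = ",".join(str(v) for v in spec)
        return f"@attribute {name} {{{values}}}"
    dtype = np.dtype(spec)
    if dtype.kind == "b":
        # bool columns become a nominal attribute with True/False.
        return f"@attribute {name} {{True,False}}"
    if dtype.kind in "iuf":
        # Signed, unsigned and floating point numbers are all "numeric".
        return f"@attribute {name} numeric"
    if dtype.kind == "M":
        # datetime64 columns become a date attribute with a format pattern.
        return f'@attribute {name} date "{date_format}"'
    # Everything else is treated as a string in this sketch.
    return f"@attribute {name} string"
```

For example, `arff_attribute("valid", [True, False])` yields `@attribute valid {True,False}`, and `arff_attribute("ids", int)` yields `@attribute ids numeric`.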

Usage:

    arff = Arff(
        <name of the relation (same as the filename)>,
        <pandas DataFrame>,
        <dict that specifies data types, e.g. {
            "ids": int,
            "scores": numpy.dtype("float64"),
            "date": numpy.dtype("datetime64[ns]"),
            "valid": [True, False]
        }>,
        <values to count as missing info (list, e.g. ["null", 0])>
    )
    arff.write(<save directory>, <save as a sparse file or not (bool: True -> sparse, False -> normal)>)
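
As a concrete illustration of the arguments described above, the sketch below builds a small DataFrame, a matching data type dict and a missing-value list. The column names, the relation name `"example"` and the directory `"output_dir"` are made up for the example, and the `Arff` calls are left commented out because the import path is not stated here.

```python
import numpy as np
import pandas as pd

# Example data; the column names are illustrative only.
df = pd.DataFrame(
    {
        "ids": [1, 2, 3],
        "scores": [0.5, 0.0, 2.25],
        "date": pd.to_datetime(["2020-01-01", "2020-01-02", "2020-01-03"]),
        "valid": [True, False, True],
    }
)

# One entry per column: a type, a numpy dtype, or a list of nominal values.
types = {
    "ids": int,
    "scores": np.dtype("float64"),
    "date": np.dtype("datetime64[ns]"),
    "valid": [True, False],
}

# Values that should be treated as missing information.
missing = ["null"]

# Assumed usage, following the signature shown above:
# arff = Arff("example", df, types, missing)
# arff.write("output_dir", False)  # False -> normal (non-sparse) file
```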

Future Changes:

  • Writing a sparse ARFF file is slow even for small DataFrames (about 2k rows). This should be made faster.
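
For context on what a sparse writer has to produce: in the sparse ARFF format each data row is written as `{index value, ...}` with 0-based attribute indices, zero values are omitted, and missing values are written as `?`. The sketch below is not this library's code, and the cause of the slowdown is not stated here. It only shows one way to format rows in a single pass over plain Python values and join the result once. The names `sparse_row` and `sparse_data` are hypothetical, and quoting of strings that contain spaces is left out.

```python
def sparse_row(values, missing=()):
    """Format one row as a sparse ARFF data line (hypothetical helper)."""
    parts = []
    for index, value in enumerate(values):
        if value in missing:
            # Missing values are written explicitly as "?".
            parts.append(f"{index} ?")
        elif value != 0:
            # Zero values are omitted; everything else is kept.
            parts.append(f"{index} {value}")
    return "{" + ", ".join(parts) + "}"


def sparse_data(rows, missing=()):
    """Join all formatted rows with newlines in one step."""
    return "\n".join(sparse_row(row, missing) for row in rows)
```

With a pandas DataFrame, rows could be fed in as plain tuples, for instance via `df.itertuples(index=False, name=None)`. Note that this sketch only handles numeric and string values: for nominal attributes the sparse format omits the first declared value rather than a literal zero.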
