Git Product home page Git Product logo

joinery's Introduction

joinery

joinery [joi-nuh-ree]
1. In woodworking, the craft of joining together pieces of wood to produce more complex items.
2. In Java, a data analysis library for joining together pieces of data to produce insight.

quick start

Remember FizzBuzz (of course you do!), well imagine you have just solved the puzzle (well done!) and you have written the results to a comma-delimited file for further analysis. Now you want to know how many times are the strings Fizz, Buzz, and FizzBuzz printed out.

You could answer this question any number of ways, for example you could modify the original program, or reach for Python/pandas, or even (for the sadistic among us, you know who you are) type out a one-liner at the command prompt (probably including cut, sort, and uniq).

Well, now you have one more option. This option is especially good if you are 1) using Java already and 2) may need to integrate your solution with other Java applications in the future.

You can answer this question with joinery.

df.groupBy("value")
  .count()
  .sortBy("-number")
  .head(3)

Printing out the resulting data frame gives us the following table.

  	   value 	number
 0	Fizz    	    27
 1	Buzz    	    14
 2	FizzBuzz	     6

See FizzBuzz.java for the complete code.

next steps

Get the executable jar and try it for yourself.

$ java -jar joinery-dataframe-1.4-jar-with-dependencies.jar shell
# Rhino 1.7 release 2 2009 03 22
# Java HotSpot(TM) 64-Bit Server VM, Oracle Corporation, 1.8.0_31
> df = new DataFrame()
[empty data frame]
> df.add("value")
[empty data frame]
> [10, 20, 30].forEach(function(val) {
      df.append([val])
  })
> df
        value
 0	   10
 1	   20
 2	   30

>

maven

A maven repository for joinery is hosted on JCenter. For instructions on setting up your maven profile to use JCenter, visit https://bintray.com/bintray/jcenter.

<dependency>
  <groupId>joinery</groupId>
  <artifactId>joinery-dataframe</artifactId>
  <version>1.4</version>
</dependency>

download

JCenter also allows for direct download using the button below.

Download

utilities

joinery includes some tools to make working with data frames easier. These tools are available by running joinery.DataFrame as an application.

$ java joinery.DataFrame
usage: joinery.DataFrame [compare|plot|show|shell] [csv-file ...]

show

Show displays the tabular data of a data frame in a gui window.

$ java joinery.DataFrame show data.csv

Screenshot of show window

plot

Plot displays the numeric data of a data frame as a chart.

$ java joinery.DataFrame plot data.csv

Screenshot of plot window

shell

Launches an interactive JavaScript shell for working with data frames.

$ java joinery.DataFrame shell
# Rhino 1.7 release 2 2009 03 22
# Java HotSpot(TM) 64-Bit Server VM, Oracle Corporation, 1.8.0_31
> df = DataFrame.readCsv('https://www.quandl.com/api/v1/datasets/GOOG/NASDAQ_AAPL.csv')
             Date             Open            High            Low             Close              Volume
   0    2015-02-20      128.62000000    129.50000000    128.05000000    129.50000000     48948419.00000000
   1    2015-02-19      128.48000000    129.03000000    128.33000000    128.45000000     37362381.00000000
   2    2015-02-18      127.62000000    128.78000000    127.45000000    128.72000000     44891737.00000000
   3    2015-02-17      127.49000000    128.88000000    126.92000000    127.83000000     63152405.00000000
   4    2015-02-13      127.28000000    127.28000000    125.65000000    127.08000000     54272219.00000000
   5    2015-02-12      126.06000000    127.48000000    125.57000000    126.46000000     74474466.00000000
   6    2015-02-11      122.77000000    124.92000000    122.50000000    124.88000000     73561797.00000000
   7    2015-02-10      120.17000000    122.15000000    120.16000000    122.02000000     62008506.00000000
   8    2015-02-09      118.55000000    119.84000000    118.43000000    119.72000000     38889797.00000000

... 8629 rows skipped ...

8638    1980-12-12        0.00000000      4.12000000      4.11000000      4.11000000     14657300.00000000

> df.types()
[class java.util.Date, class java.lang.Double, class java.lang.Double, class java.lang.Double, class java.lang.Double, class java.lang.Double]
> df.sortBy('Date')
             Date             Open            High            Low             Close              Volume
8638    1980-12-12        0.00000000      4.12000000      4.11000000      4.11000000     14657300.00000000
8637    1980-12-15        0.00000000      3.91000000      3.89000000      3.89000000      5496400.00000000
8636    1980-12-16        0.00000000      3.62000000      3.61000000      3.61000000      3304000.00000000
8635    1980-12-17        0.00000000      3.71000000      3.70000000      3.70000000      2701300.00000000
8634    1980-12-18        0.00000000      3.82000000      3.80000000      3.80000000      2295300.00000000
8633    1980-12-19        0.00000000      4.05000000      4.04000000      4.04000000      1519700.00000000
8632    1980-12-22        0.00000000      4.25000000      4.23000000      4.23000000      1167600.00000000
8631    1980-12-23        0.00000000      4.43000000      4.41000000      4.41000000      1467200.00000000
8630    1980-12-24        0.00000000      4.66000000      4.64000000      4.64000000      1500100.00000000

... 8629 rows skipped ...

   0    2015-02-20      128.62000000    129.50000000    128.05000000    129.50000000     48948419.00000000

> .reindex('Date')
                  Open                High            Low             Close              Volume
1980-12-12        0.00000000      4.12000000      4.11000000      4.11000000     14657300.00000000
1980-12-15        0.00000000      3.91000000      3.89000000      3.89000000      5496400.00000000
1980-12-16        0.00000000      3.62000000      3.61000000      3.61000000      3304000.00000000
1980-12-17        0.00000000      3.71000000      3.70000000      3.70000000      2701300.00000000
1980-12-18        0.00000000      3.82000000      3.80000000      3.80000000      2295300.00000000
1980-12-19        0.00000000      4.05000000      4.04000000      4.04000000      1519700.00000000
1980-12-22        0.00000000      4.25000000      4.23000000      4.23000000      1167600.00000000
1980-12-23        0.00000000      4.43000000      4.41000000      4.41000000      1467200.00000000
1980-12-24        0.00000000      4.66000000      4.64000000      4.64000000      1500100.00000000

... 8629 rows skipped ...

2015-02-20      128.62000000    129.50000000    128.05000000    129.50000000     48948419.00000000

> .retain('Close')
                  Close
1980-12-12        4.11000000
1980-12-15        3.89000000
1980-12-16        3.61000000
1980-12-17        3.70000000
1980-12-18        3.80000000
1980-12-19        4.04000000
1980-12-22        4.23000000
1980-12-23        4.41000000
1980-12-24        4.64000000

... 8629 rows skipped ...

2015-02-20      129.50000000

> .plot()

documentation

The complete api documentation for the DataFrame class is available at http://cardillo.github.io/joinery


Build Status

joinery's People

Contributors

cardillo avatar lejon avatar

Watchers

 avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.