bash-reduce

A MapReduce framework written in awk, bash and GNU Parallel. Implement map and reduce functions in pure awk and run them using the framework. Tested with bash 4 and gawk 4.0.
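
The map, shuffle and reduce stages described above can be sketched as a plain shell pipeline. This is a minimal illustration, not the framework's code: the `map`, `reduce` and `word_count` functions are hypothetical stand-ins for a mapper and reducer written in awk, the `sort` in the middle plays the role of the shuffle step, and everything runs in a single process rather than being distributed with GNU Parallel.

```shell
#!/usr/bin/env bash
# Hypothetical stand-ins for a mapper and a reducer written in awk.
# These are NOT the repository's files, only a sketch of the pattern.

# Map: emit "<word> 1" for every whitespace-separated word.
map() {
  awk '{ for (i = 1; i <= NF; i++) print tolower($i), 1 }'
}

# Reduce: sum the values of each key. Relies on the input being sorted
# by key, so all records for one key arrive consecutively.
reduce() {
  awk '
    $1 != key { if (NR > 1) print key, sum; key = $1; sum = 0 }
              { sum += $2 }
    END       { if (NR > 0) print key, sum }'
}

# Map, shuffle (sort by key), reduce, then order by count (descending)
# and by word. The final ordering is an assumption made for readability.
word_count() {
  map | sort | reduce | sort -k2,2nr -k1,1
}

printf 'to be or not to be\n' | word_count
# be 2
# to 2
# not 1
# or 1
```

Because the reducer only compares each key with the previous one, it keeps a single running sum in memory, which is why the intermediate `sort` matters.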

You can find a short writeup here.

Run tests

~/src/bash-reduce/test$ ./run-all 
Running all tests

  * running test balanced-1: passed!
  * running test balanced-2: passed!
  * running test balanced-3: passed!
  * running test co-occurence: passed!
  * running test intersect: passed!
  * running test permutations: passed!
  * running test set: passed!
  * running test size: passed!
  * running test sum: passed!
  * running test word-count: passed!
  * running test word-length: passed!

All tests PASSED!

Example use

Count words
$ ./bash-reduce mappers/word-count.awk reducers/sum.awk data/shakespeare | head
the 27825
and 26791
i 20681
to 19261
of 18289
a 14668
you 13716
my 12481
that 11135
in 11027

Get unique words
$ ./bash-reduce mappers/word-count.awk reducers/key.awk data/shakespeare | tail
zenith
zephyrs
zipped
zir
zo
zodiac
zodiacs
zone
zounds
zwaggerd
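
The example above reuses the word-count mapper and only swaps the reducer. A reducer that keeps just the keys could look roughly like the sketch below. The function name `unique_keys` and both awk programs are assumptions made for illustration, not the contents of `mappers/word-count.awk` or `reducers/key.awk`.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: same mapper idea as a word count, different reducer.
unique_keys() {
  # Map: emit "<word> 1" for every word.
  awk '{ for (i = 1; i <= NF; i++) print tolower($i), 1 }' |
    # Shuffle: bring identical keys next to each other.
    sort |
    # Reduce: print each key once and ignore its values.
    awk '$1 != prev { print $1; prev = $1 }'
}

printf 'zounds zone\nzone zenith\n' | unique_keys
# zenith
# zone
# zounds
```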

Grep for "hamlet" (only mapper needed)
$ awk -f mappers/grep.awk -v word=hamlet < data/shakespeare | head
 and bring these gentlemen where HAMLET is
 and what so poor a man as HAMLET is
 as to give words or talk with the lord HAMLET
 bear HAMLET like a soldier to the stage
 but now my cousin HAMLET and my son
 change rapiers and HAMLET wounds laertes
 dard to the combat in which our valiant HAMLET
 did HAMLET so envenom with his envy
 enter ghost and HAMLET
 enter HAMLET
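
A filtering job like this needs no reduce step, because each input line is kept or dropped on its own. The sketch below shows one way such a mapper might behave, judging only from the output above: it lowercases each line, strips punctuation and prints matching lines with the search word in upper case. The function `grep_map` is hypothetical and is not the repository's `mappers/grep.awk`. As in the command above, the word is passed to awk with `-v`.

```shell
#!/usr/bin/env bash
# Hypothetical map-only filter; an assumption, not the actual grep.awk.
grep_map() {
  awk -v word="$1" '
    BEGIN { word = tolower(word) }
    {
      line = tolower($0)
      gsub(/[^a-z ]/, "", line)            # keep only letters and spaces
      if (index(line, word)) {             # substring match, no word boundaries
        gsub(word, toupper(word), line)    # word is used as a regex here
        print line
      }
    }'
}

printf 'Enter Hamlet.\nExit Ghost\n' | grep_map hamlet
# enter HAMLET
```

Note that this sketch matches substrings, so a search for `ham` would also highlight part of `hamlet`.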
