Git Product home page Git Product logo

ganapati's Introduction

ganapati – Hadoop HDFS Thrift interface for Ruby

Note - this project has been abandoned by LivingSocial at this point in time. Feel free to fork if you wish to continue development. We have switched to using github.com/kzk/webhdfs for our projects.

Ganapati is a Ruby thrift lib for interfacing with Hadoop’s distributed file system, HDFS. It also includes a few command line client utilities.

To install:

gem install ganapati

Starting thrift server

Documentation in Hadoop for the thrift interface to HDFS is crap. It can be found here.

As a much simpler and safer way of auto compiling and then starting the thrift interface, use the provided script:

bin/hdfs_thrift_server <port>

This will start a thrift server on the given port (after compiling the server code provided in the Hadoop distribution).

Basic Usage

require 'rubygems'
require 'ganapati'

# args are host, port, and optional timeout
client = Ganapati::Client.new 'localhost', 1234

# copy a file to hdfs
client.put("/some/file", "/some/hadoop/path")

# get a file from hadoop
client.get("/some/hadoop/path", "/local/path")

# Create a file
f = client.create("/home/someuser/afile.txt")
f.write("this is some text")
# Always, always close the file
f.close 

# Create a file with code block
client.create("/home/someuser/afile.txt") { |f|
  f.write("this is some text")
}

# Open a file for reading and read it
client.open('/home/someuser/afile.txt') { |f| 
  puts f.read 
  # or read for specific length from start
  puts f.read(0, 4)
}

# read a file line by line
client.readlines('/home/someuser/afile.txt') { |line|
  puts line
}

# Open a file for appending and append to it
client.append('/home/someuser/afile.txt') { |f| 
  f.write "new data" 
}	  

## Common file methods are available (chown, chmod, mkdir, stat, etc).  Examples:
# move a file
client.mv "/home/someuser/afile.txt", "/home/someuser/test.txt"

# remove a file
client.rm "/home/someuser/test.txt"

# test for file existance
client.exists? "/home/someuser/test.txt"

# get a list of all files
client.ls "/home"

client.close

# Quick and dirty way to print remote file.  The run class method takes care of closing the client.
puts Ganapati::Client.run('localhost', 1234) { |c| c.open('/home/someuser/afile.txt') { |f| f.read } }

Command Line Utilities

There are a few utility programs included in the bin directory. hls provides a way to see the contents of HDFS (recursively and verbosely with appropriate command line options):

./bin/hls hdfs://host:port/tmp

hcp provides a way to copy to/from/between HDFS servers:

./bin/hcp hdfs://host:port/some/path/to/file ./file
./bin/hcp ./file hdfs://host:port/some/path/to/file
./bin/hcp hdfs://anotherhost:port/some/path/to/file hdfs://host:port/some/path/to/file

ganapati's People

Contributors

alindeman avatar p5k6 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.