Apache Accumulo Examples

Setup instructions

Follow the steps below to run the Accumulo examples:

  1. Clone this repository

     git clone https://github.com/apache/accumulo-examples.git
    
  2. Follow Accumulo's quickstart to install and run an Accumulo instance. Accumulo's conf/ directory contains an accumulo-client.properties file that must be configured, because the examples use this file to connect to your instance.

  3. Review env.sh.example in conf/ to see if you need to customize it. If ACCUMULO_HOME and HADOOP_HOME are set in your shell, you may be able to skip this step. Make sure ACCUMULO_CLIENT_PROPS is set to the location of your accumulo-client.properties.

     cp conf/env.sh.example conf/env.sh
     vim conf/env.sh
    
  4. Build the examples repo and copy the examples jar to Accumulo's lib/ext directory:

     ./bin/build
     cp target/accumulo-examples.jar /path/to/accumulo/lib/ext/
    
  5. Each Accumulo example has its own documentation and instructions for running it, which are linked below.
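Before running an example, it can help to confirm that the variables named in step 3 are set. The following is a minimal sketch, not part of this repository: `check_example_env` is a hypothetical helper, and it assumes that ACCUMULO_HOME and HADOOP_HOME should each name an existing directory and that ACCUMULO_CLIENT_PROPS should name a readable file.

```shell
# Pre-flight sketch: report which settings from the setup steps are missing.
# check_example_env is a hypothetical helper, not a script shipped with this repo.
check_example_env() {
  _missing=0
  # Assumption: each *_HOME variable should point at an existing directory.
  for _name in ACCUMULO_HOME HADOOP_HOME; do
    eval "_value=\${$_name:-}"
    if [ -z "$_value" ] || [ ! -d "$_value" ]; then
      echo "$_name is unset or not a directory" >&2
      _missing=1
    fi
  done
  # Assumption: ACCUMULO_CLIENT_PROPS should point at a readable properties file.
  if [ -z "${ACCUMULO_CLIENT_PROPS:-}" ] || [ ! -r "$ACCUMULO_CLIENT_PROPS" ]; then
    echo "ACCUMULO_CLIENT_PROPS is unset or not readable" >&2
    _missing=1
  fi
  # Returns 0 when everything is in place, 1 otherwise.
  return "$_missing"
}
```

One way to use it is to define the function in your shell and then call `check_example_env`; it prints one line per problem and returns a non-zero status if anything is missing.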

When running the examples, remember the tips below:

  • Examples are run using the runex or runmr commands, which are located in the bin/ directory of this repo. The runex command is a simple script that uses the examples shaded jar to run a class. The runmr command starts a MapReduce job in YARN.
  • Commands intended to be run in bash are prefixed by '$' and should be run from the root of this repository.
  • Several examples use the accumulo and accumulo-util commands which are expected to be on your PATH. These commands are found in the bin/ directory of your Accumulo installation.
  • Commands intended to be run in the Accumulo shell are prefixed by '>'.
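Because several examples expect the accumulo and accumulo-util commands on your PATH, a quick check up front can save a confusing failure later. This is a minimal sketch under that assumption: `require_commands` is a hypothetical helper, not something provided by this repository or by Accumulo.

```shell
# Sketch: verify that each named command can be found before running examples.
# require_commands is a hypothetical helper, not part of this repo or Accumulo.
require_commands() {
  _status=0
  for _cmd in "$@"; do
    # command -v succeeds only if the shell can resolve the command name.
    if ! command -v "$_cmd" >/dev/null 2>&1; then
      echo "not found: $_cmd" >&2
      _status=1
    fi
  done
  # Returns 0 when every command was found, 1 otherwise.
  return "$_status"
}
```

For this repository, a call such as `require_commands accumulo accumulo-util` would cover the commands named in the tips above.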

Available Examples

Each example below highlights a feature of Apache Accumulo.

Example             Description
batch               Using the batch writer and batch scanner.
bloom               Creating a bloom filter enabled table to increase query performance.
bulkIngest          Ingesting bulk data using map/reduce jobs on Hadoop.
classpath           Using per-table classpaths.
client              Using table operations, reading and writing data in Java.
combiner            Using the example StatsCombiner to find min, max, sum, and count.
compactionStrategy  Configuring a compaction strategy.
constraints         Using constraints with tables. Limiting the mutation size to avoid running out of memory.
deleteKeyValuePair  Deleting a key/value pair and verifying the deletion in RFile.
dirlist             Storing filesystem information.
export              Exporting and importing tables.
filedata            Storing file data.
filter              Using the AgeOffFilter to remove records more than 30 seconds old.
helloworld          Inserting records both inside and outside map/reduce jobs, and reading records between two rows.
isolation           Using the isolated scanner to ensure partial changes are not seen.
regex               Using MapReduce and Accumulo to find data using regular expressions.
reservations        Using conditional mutations to implement a simple reservation system.
rgbalancer          Using a balancer to spread groups of tablets within a table evenly.
rowhash             Using MapReduce to read a table and write to a new column in the same table.
sample              Building and using sample data in Accumulo.
shard               Using the intersecting iterator with a term index partitioned by document.
spark               Using Accumulo as input and output for Apache Spark jobs.
tabletofile         Using MapReduce to read a table and write one of its columns to a file in HDFS.
terasort            Generating random data and sorting it using Accumulo.
uniquecols          Using MapReduce to count unique columns in Accumulo.
visibility          Using visibilities (or combinations of authorizations). Also shows user permissions.
wordcount           Using MapReduce and Accumulo to do a word count on text files.

Release Testing

This repository can be used to test Accumulo release candidates. See docs/release-testing.md.
