Git Product home page Git Product logo

hamster's Introduction

Hamster

  • Quality
  • Coverage
  • Build
  • Dependencies
  • Downloads
  • Tags
  • Releases
  • Issues
  • License
  • Version
  • Discuss

Efficient, immutable, and thread-safe collection classes for Ruby.

Hamster provides 6 Persistent Data Structures: Hash, Vector, Set, SortedSet, List, and Deque (which works as an immutable queue or stack).

Hamster collections are immutable. Whenever you modify a Hamster collection, the original is preserved and a modified copy is returned. This makes them inherently thread-safe and shareable. At the same time, they remain CPU and memory-efficient by sharing between copies.

While Hamster collections are immutable, you can still mutate objects stored in them. We recommend that you don't do this, unless you are sure you know what you are doing. Hamster collections are thread-safe and can be freely shared between threads, but you are responsible for making sure that the objects stored in them are used in a thread-safe manner.

Hamster collections are almost always closed under a given operation. That is, whereas Ruby's collection methods always return arrays, Hamster collections will return an instance of the same class wherever possible.

Where possible, Hamster collections offer an interface compatible with Ruby's built-in Hash, Array, and Enumerable, to ease code migration. Also, Hamster methods accept regular Ruby collections as arguments, so code which uses Hamster can easily interoperate with your other Ruby code.

And lastly, Hamster lists are lazy, making it possible to (among other things) process "infinitely large" lists.

Using

To make the collection classes available in your code:

require "hamster"

Or if you prefer to only pull in certain collection types:

require "hamster/hash"
require "hamster/vector"
require "hamster/set"
require "hamster/sorted_set"
require "hamster/list"
require "hamster/deque"

Constructing a Hamster Hash is almost as simple as a regular one:

person = Hamster::Hash[name: "Simon", gender: :male]
# => Hamster::Hash[:name => "Simon", :gender => :male]

Accessing the contents will be familiar to you:

person[:name]                       # => "Simon"
person.get(:gender)                 # => :male

Updating the contents is a little different than you are used to:

friend = person.put(:name, "James") # => Hamster::Hash[:name => "James", :gender => :male]
person                              # => Hamster::Hash[:name => "Simon", :gender => :male]
friend[:name]                       # => "James"
person[:name]                       # => "Simon"

As you can see, updating the hash returned a copy leaving the original intact. Similarly, deleting a key returns yet another copy:

male = person.delete(:name)         # => Hamster::Hash[:gender => :male]
person                              # => Hamster::Hash[:name => "Simon", :gender => :male]
male.key?(:name)                    # => false
person.key?(:name)                  # => true

Since it is immutable, Hamster's Hash doesn't provide an assignment (Hash#[]=) method. However, Hash#put can accept a block which transforms the value associated with a given key:

counters = Hamster::Hash[evens: 0, odds: 0]  # => Hamster::Hash[:evens => 0, :odds => 0]
counters.put(:odds) { |value| value + 1 }    # => Hamster::Hash[:odds => 1, :evens => 0]

Or more succinctly:

counters.put(:odds, &:next)         # => {:odds => 1, :evens => 0}

This is just the beginning; see the API documentation for details on all Hash methods.

A Vector is an integer-indexed collection much like an immutable Array. Examples:

vector = Hamster::Vector[1, 2, 3, 4] # => Hamster::Vector[1, 2, 3, 4]
vector[0]                            # => 1
vector[-1]                           # => 4
vector.put(1, :a)                    # => Hamster::Vector[1, :a, 3, 4]
vector.add(:b)                       # => Hamster::Vector[1, 2, 3, 4, :b]
vector.insert(2, :a, :b)             # => Hamster::Vector[1, 2, :a, :b, 3, 4]
vector.delete_at(0)                  # => Hamster::Vector[2, 3, 4]

Other Array-like methods like #select, #map, #shuffle, #uniq, #reverse, #rotate, #flatten, #sort, #sort_by, #take, #drop, #take_while, #drop_while, #fill, #product, and #transpose are also supported. See the API documentation for details on all Vector methods.

A Set is an unordered collection of values with no duplicates. It is much like the Ruby standard library's Set, but immutable. Examples:

set = Hamster::Set[:red, :blue, :yellow] # => Hamster::Set[:red, :blue, :yellow]
set.include? :red                        # => true
set.add :green                           # => Hamster::Set[:red, :blue, :yellow, :green]
set.delete :blue                         # => Hamster::Set[:red, :yellow]
set.superset? Hamster::Set[:red, :blue]  # => true
set.union([:red, :blue, :pink])          # => Hamster::Set[:red, :blue, :yellow, :pink]
set.intersection([:red, :blue, :pink])   # => Hamster::Set[:red, :blue]

Like most Hamster methods, the set-theoretic methods #union, #intersection, #difference, and #exclusion (aliased as #|, #&, #-, and #^) all work with regular Ruby collections, or indeed any Enumerable object. So just like all the other Hamster collections, Hamster::Set can easily be used in combination with "ordinary" Ruby code.

See the API documentation for details on all Set methods.

SortedSet (API Documentation)

A SortedSet is like a Set, but ordered. You can do everything with it that you can do with a Set. Additionally, you can get the #first and #last item, or retrieve an item using an integral index:

set = Hamster::SortedSet['toast', 'jam', 'bacon'] # => Hamster::SortedSet["bacon", "jam", "toast"]
set.first                                         # => "bacon"
set.last                                          # => "toast"
set[1]                                            # => "jam"

You can also specify the sort order using a block:

Hamster::SortedSet.new(['toast', 'jam', 'bacon']) { |a,b| b <=> a }
Hamster::SortedSet.new(['toast', 'jam', 'bacon']) { |str| str.chars.last }

See the API documentation for details on all SortedSet methods.

Hamster Lists have a head (the value at the front of the list), and a tail (a list of the remaining items):

list = Hamster::List[1, 2, 3]
list.head                    # => 1
list.tail                    # => Hamster::List[2, 3]

Add to a list with List#add:

original = Hamster::List[1, 2, 3]
copy = original.add(0)      # => Hamster::List[0, 1, 2, 3]

Notice how modifying a list actually returns a new list. That's because Hamster Lists are immutable.

Laziness

List is lazy where possible. It tries to defer processing items until absolutely necessary. For example, the following code will only call Prime.prime? as many times as necessary to generate the first 3 prime numbers between 10,000 and 1,000,000:

require 'prime'

Hamster.interval(10_000, 1_000_000).select do |number|
  Prime.prime?(number)
end.take(3)
  # => 0.0009s

Compare that to the conventional equivalent which needs to calculate all possible values in the range before taking the first three:

(10000..1000000).select do |number|
  Prime.prime?(number)
end.take(3)
  # => 10s

Construction

Besides Hamster::List[] there are other ways to construct lists:

  • Hamster.interval(from, to) creates a lazy list equivalent to a list containing all the values between from and to without actually creating a list that big.

  • Hamster.stream { ... } allows you to creates infinite lists. Each time a new value is required, the supplied block is called. To generate a list of integers you could do:

    count = 0
    Hamster.stream { count += 1 }
  • Hamster.repeat(x) creates an infinite list with x the value for every element.

  • Hamster.replicate(n, x) creates a list of size n with x the value for every element.

  • Hamster.iterate(x) { |x| ... } creates an infinite list where the first item is calculated by applying the block on the initial argument, the second item by applying the function on the previous result and so on. For example, a simpler way to generate a list of integers would be:

    Hamster.iterate(1) { |i| i + 1 }

    or even more succinctly:

    Hamster.iterate(1, &:next)
  • Hamster::List.empty returns an empty list, which you can build up using repeated calls to #add or other List methods.

Core Extensions

Enumerable#to_list will convert any existing Enumerable to a list, so you can slowly transition from built-in collection classes to Hamster.

IO#to_list enables lazy processing of huge files. For example, imagine the following code to process a 100MB file:

require 'hamster/core_ext'

File.open("my_100_mb_file.txt") do |file|
  lines = []
  file.each_line do |line|
    break if lines.size == 10
    lines << line.chomp.downcase.reverse
  end
end

Compare to the following more functional version:

File.open("my_100_mb_file.txt") do |file|
  file.map(&:chomp).map(&:downcase).map(&:reverse).take(10)
end

Unfortunately, though the second example reads nicely it takes many seconds to run (compared with milliseconds for the first) even though we're only interested in the first ten lines. Using #to_list we can get the running time back comparable to the imperative version.

File.open("my_100_mb_file.txt") do |file|
  file.to_list.map(&:chomp).map(&:downcase).map(&:reverse).take(10)
end

This is possible because IO#to_list creates a lazy list whereby each line is only ever read and processed as needed, in effect converting it to the first example.

See the API documentation for details on all List methods.

A Deque (or "double-ended queue") is an ordered collection, which allows you to push and pop items from both front and back. This makes it perfect as an immutable stack or queue. Examples:

deque = Hamster::Deque[1, 2, 3] # => Hamster::Deque[1, 2, 3]
deque.first                     # 1
deque.last                      # 3
deque.pop                       # => Hamster::Deque[1, 2]
deque.push(:a)                  # => Hamster::Deque[1, 2, 3, :a]
deque.shift                     # => Hamster::Deque[2, 3]
deque.unshift(:a)               # => Hamster::Deque[:a, 1, 2, 3]

Of course, you can do the same thing with a Vector, but a Deque is more efficient. See the API documentation for details on all Deque methods.

Transformations

Hamster arrays, hashes, and nested structures of arrays and hashes may be transformed with the update_in method.

c = Hamster.from({
  people: [{name: 'Chris', city: 'Lagos'}, {name: 'Pat', city: 'Madrid'}],
  places: [{name: 'Lagos', population: 1}, {name: 'Madrid', population: 1}]})
c2 = c.update_in(:people, 1, :city) { |old_city| 'Lagos' }
c3 = c2.update_in(:places, 1, :population) { |old_population| old_population - 1 }
c4 = c3.update_in(:places, 0, :population) { |old_population| old_population + 1 }
Hamster.to_ruby(c4)
# => {:places=>[{:population=>2, :name=>"Lagos"}, {:population=>0, :name=>"Madrid"}], :people=>[{:name=>"Chris", :city=>"Lagos"}, {:name=>"Pat", :city=>"Lagos"}]}

Naturally, update_in never mutates your collections.

See Hamster::Hash#update_in, Hamster::Vector#update_in, and Hamster::Associable#update_in for details.

Installing

Add this line to your application's Gemfile:

gem "hamster"

And then execute:

$ bundle

Or install it yourself as:

$ gem install hamster

Contributing

  1. Read the Code of Conduct
  2. Fork it
  3. Create your feature branch (git checkout -b my-new-feature)
  4. Commit your changes (git commit -am "Add some feature")
  5. Push to the branch (git push origin my-new-feature)
  6. Create new Pull Request

Other Reading

Licensing

Copyright (c) 2009-2015 Simon Harris

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

hamster's People

Contributors

alexdowad avatar baconpat avatar bill avatar ch4s3 avatar cstorey avatar denisdefreyne avatar dubek avatar dzjuck avatar eiriklied avatar elben avatar gcapizzi avatar gfredericks avatar harukizaemon avatar headius avatar hparker avatar ivoanjo avatar kef avatar kevgathuku avatar misfo avatar plexus avatar rrrene avatar ryantm avatar ryfow avatar stephencelis avatar szajbus avatar tianyicui avatar xaviershay avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

hamster's Issues

Use Adamanitum to freeze stuff

For subclasses of Hamster objects, it would be incredibly useful to have memoization capabilities that didn't freak out about the fact that the objects are frozen. The Adamantium gem gives you that ability. (In my specific case, I want to be able to track the mean/median/mode/etc for vectors without having performance go to shit from recalculating all that from the 10 different places that require these values.)

Add more Hash public APIs

Add Hash #count, #partition, #one?, #sort, #sort_by, #max(imum), #min(imum), #cycle, #clear, #key

Hamster::Tries don't surive Marshalling.

When a Hamster::Trie is marshalled, because we don't do any special processing, the Trie is sent as is, and reconstituted on the other side. However, because ruby has a per-process salt that's used to randomise string hash values, the original Trie::Entries won't be in the expected index of @entries when Trie#get is called. I've got a failing example at https://gist.github.com/cstorey/5380150, too.

I should be able to produce a patch for this in the next few days, I think.

"CRUDify"

First of all, nice project. After learning of Erlang I've sometimes wondered if Ruby too would be better if it were fully immutable.

So while I'm here, I'd like to encourage you to consider "CRUDifying" the design of the classes. I know I am using the term a bit loosely, but to explain I simply mean that the methods should all depend on a core set of methods which are typically the create, read, update and delete methods (whatever the actual names might be). The advantage is that the classes become easier to subclass to create variations. You can get a better idea about it by looking at my work on Hashery project.

gem install hamster doesn't work

C:>gem install hamster
Fetching: hamster-0.4.3.gem (100%)
ERROR: While executing gem ... (Errno::EINVAL)
Invalid argument - C:/Ruby193/lib/ruby/gems/1.9.1/gems/hamster-0.4.3/spec/ha
mster/experimental/mutable_set/add?_spec.rb

More permissive equals semantics

require 'hamster'

def myfn(data)
  case data[:remaining]
  when 0; data.put(:g, "h").put(:i, "j").put(:k, "l")
  else myfn(data.put(:remaining, &:pred))
  end
end

original = Hamster.hash(a: "b", c: "d", e: "f", remaining: 5)
result = myfn(original)
expected = {a: "b", c: "d", e: "f", g: "h", i: "j", k: "l", remaining: 0}
puts result == expected # false, would love to be true
puts expected == result # false, would love to be true
puts result === expected # false, maybe we could just get this to be true?
puts expected === result # false, maybe we could just get this to be true?
puts result == Hamster.hash(expected) # true; but annoying to have to do this

The real example from my project has a chain of filters that call each other sequentially, and as I've been playing with using Hamster hashes I was disappointed at having to specifically call out Hamster repeatedly whenever I wanted to:

a) check equality on the result, in rspec
b) mock expectations on calls, in rspec

I have a lot of experience with persistent maps in Clojure, where things turn out to be a lot more seamless because of the way Java's Map#equals contract is defined ('Returns true if the given object is also a map and the two maps represent the same mappings. More formally, two maps m1 and m2 represent the same mappings if m1.entrySet().equals(m2.entrySet()). This ensures that the equals method works properly across different implementations of the Map interface.'). Because of this contract equality turns out to just work without having to Hamster.hash everything before checking equality.

Zipper

Hello hamster folks,

I spent some time yesterday porting clojure.zip to Ruby. I was wondering if this is something that could find a home in Hamster?

Zippers are a way of traversing and editing hierarchical data structures in a functional way.

Here's the code https://gist.github.com/7735514

And here's the original

https://github.com/clojure/clojure/blob/master/src/clj/clojure/zip.clj

It's a quick and dirty port but it works. I can clean it up, use Hamster data structures inside instead of arrays, add tests, make it perhaps a bit more Ruby-esque. I'll be needing something like this for hexp, so I can add it there, but it seems generic enough to put it somewhere where it's easier to reuse.

What do you think?

Broken Image in README

The Travis CI image is broken in the README. It looks like it's using something called http://img.shields.io, which seems to have broken Travis support. Would y'all accept a PR pointing it at the official Travis CI badge link?

.ruby-version file not compatible with rbenv?

I use rbenv which now prefers to use a .ruby-version file if present. Unfortunately the specified version isn't recognised by rbenv.

I'm assuming someone has encountered something like this before and has a work around?

Properly standardize and fix specs

A lot of the specs do tricky things like:

require "spec_helper"
require "hamster/queue"

describe Hamster::Queue do
  [:head, :first, :peek, :front].each do |method|
    describe "##{method}" do
      [
        [[], nil],
        [["A"], "A"],
        [%w[A B C], "A"],
      ].each do |values, expected|
        describe "on #{values.inspect}" do
          before do
            @queue = Hamster.queue(*values)
          end

          it "returns #{expected.inspect}" do
            @queue.send(method).should == expected
          end
        end
      end
    end
  end
end

And while it seems cleaver at first it really hides the fact that we're showing 4 public methods that do the exact same behavior.

Here's what we want to do:

  1. before blocks should always have :each defined:

    before(:each) do
      # ...
    end
  2. The before blocks that define @instancevars should actually be top level let:

    describe Hamster::Vector do
      let(:vector) { Hamster.vector(*values) } 
    end
  3. No more compacting multiple methods into a single spec. Instead for every one in those types of arrays create a file.

  4. Finally, use context when you're describing a particular situation.

Transfer of ownership

Based on recent renewed interest in, enthusiasm for, and activity on Hamster I no longer believe I have the necessary time to maintain Hamster in a way that does it justice or that is in keeping with the ever evolving practices of the Ruby community.

I therefore ask that someone other than myself assume full control of the project.

Thanks, Simon

test coverage for enumerable.rb ?

Comment all lines in enumerable.rb and run rake spec. 0 failure?

I made a few changes in that file. How do I test my changes are correct?

Hash#delete is O(n)

Trie#delete calls Trie#size which is O(n). This is bad for its own sake and e.g. causes Hash#filter to be O(n^2).

Working on a patch that tracks sizes on the Trie class.

Plans for a new release

The current gem version, 0.4.3, is from October 2012. Quite a bit has happened since then. Has anything been said already about cutting a new release? Is the current master stable enough for a release, or should we mark issues that would need to be resolved before a 0.4.4 / 0.5.0 can be pushed out?

Thanks <3

Get MutableHash out of experimental

Since I couldn't find a better way to do a compare-and-swap type operation in Ruby, I found myself implementing essentially the same wrapper around Hamster.hash to create mutable_hash that you've already written.

Q: What is left to do to get that code out of the experimental directory?

I'd like to see implementation of the []= mutator, for one. And, per the yak-stack.txt, I think it would be good if #map and other such methods returned an object of the same type.

What do you think? If there were a list of tasks for this, I might be able to lend a hand?

Hash yield entries

Hash should yield entries rather than key, value. This will allow us to extract more Enumerable stuff as well.

hamster/Hash#merge should accept Ruby hashes

It would be very useful if merge (e.g., Hamster.hash(a: "b", c: "d").merge(a: "e", f: "g")) were usable as a shorthand for multiple puts. (Right now, this requires a chained set of puts, a reduce call, or an additional Hamster.hash call, all of which clutter the code.)

Right now this yields ArgumentError: wrong number of arguments (1 for 2) because stdlib/Hash#each seems to yield pairs rather than with two arguments. I had similar trouble using #each_pair. I'd love to provide a patch but am not sure how far down the road of input type checking you want to go based on performance concerns. Thought I'd check in first.

Add more List public APIs

  • List#sample
  • List#fill
  • List#insert
  • List#insert_by
  • List#permutation
  • List#subsequences
  • List#transpose

API documentation

Hi, I just discovered Hamster, great project you got!

However, I see there's no API docs at all. For a utility lib like this one it would be a great asset. I wouldn't mind adding some doc blocks, but I would need someone to review them since I'm not familiar with this code at all (yet).

Is this something you'd like to see? And do you have an opinion on what style of comments to use? I've become quite a fan of Yard as opposed to Rdoc. Here's an example of how that looks.

https://github.com/rom-rb/rom-mapper/blob/master/lib/rom/mapper.rb

Hamster::List[] plus more clear inspect

The current inspect for a list simply delegates to to_a, so there's no way to distinguish in the output what you're dealing with.

What I'd like to propose is basically this

module Hamster
  module List
    def self.[](*args)
      Hamster.list(*args)
    end

    def inspect
      "Hamster::List" + to_a.inspect
    end
  end
end

This mimics Array[1,2,3] (yes, that works) and gives an inspect that is similar to array's inspect, but still makes it clear what type you're dealing with.

what do y'all think?

New code is questionably licensed?

I know almost nothing about licenses, but it seems that the license on this project only has a date through 2010, which makes me think there might be a question as to what license new code would be under.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.