Git Product home page Git Product logo

couch_foo's Introduction

CouchFoo

Introduction

CouchDB (couchdb.apache.org/) works slightly differently to relational databases. First, and foremost, it is a document-oriented database. That is, data is stored in documents each of which have a unique id that is used to access and modify it. The contents of the documents are free from structure (or schema free) and bear no relation to one another (unless you encode that within the documents themselves). So in many ways documents are like records within a relational database except there are no tables to keep documents of the same type in.

CouchDB interfaces with the external world via a RESTful interface. This allows document creation, updating, deletion etc. The contents of a document are specified in JSON so its possible to serialise objects within the database record efficiently as well as store all the normal types natively.

As a consequence of its free form structure there is no SQL to query the database. Instead you define (table-oriented) views that emit certain bits of data from the record and apply conditions, sorting etc to those views. For example if you were to emit the colour attribute you could find all documents with a certain colour. This is similar to indexed lookups on a relational table (both in terms of concept and performance).

CouchDB has been designed from the ground up to operate in a distributed way. It provides robust, incremental replication with bi-directional conflict detection and resolution. It’s an excellent choice for unstructed data, large datasets that need sharding efficiently and situations where you wish to run local copies of the database.

Overview

CouchFoo provides an ActiveRecord styled interface to CouchDB. The external API is nearly identical to ActiveRecord so it should be possible to migrate your applications quite easily. That said, there are a few minor differences to the way CouchDB works. In particular:

  • CouchDB is schema free so property definitions for the document are defined in the model (like DataMapper)

  • :select, :joins, :having, :group, :from and :lock are not available on find or associations as they don’t apply (locking is handled as conflict resolution at insertion time)

  • :conditions can only accept a hash and not an array or SQL. For example :conditions => {:user_name => “Georgio_1999”}

  • :offset is less efficient in CouchDB - there’s more on this in the rdoc

  • :order is applied after results are retrieved from the database. Therefore :order cannot be used with :limit without a new option :use_key. This is explained fully in the quick start guide and CouchFoo#find documentation

  • :include isn’t implemented yet but the finders and associations still accept the option so you won’t need to make any code changes

  • By default results are ordered by document key. The key uses a UUID scheme so these don’t auto-increment and are likely to come out in a different order to insertion. default_sort can be used on a model to sort by create date by default and overcome this

  • validates_uniqueness_of has had the :case_sensitive option removed

  • Because there’s no SQL there’s no SQL finder methods

  • Timezones, aggregations and fixtures are not yet implemented

  • The price of index updating is paid when next accessing the index rather than the point of insertion. This can be more efficient or less depending on your application. It may make sense to use an external process to do the updating for you - see CouchFoo#find for more on this

  • On that note, compacting of CouchDB is required to recover space from old versions of documents and keep performance fast. This can be kicked off in several ways (see quick start guide)

It is recommend that you read the quick start and performance sections in the rdoc for a full overview of differences and points to be aware of when developing. The Changelog file shows the differences between gem versions and should be checked when upgrading gem versions.

Getting started

CouchFoo::Base.set_database(:host => "http://localhost:5984", :database => "mydatabase")
CouchFoo::Base.logger = Rails.logger

If using with Rails you will need to create an initializer to do this (until proper integration is added). Also note depending on your version of CouchDB will depend on the version of the CouchREST gem you will require - CouchDB 0.9 requires CouchREST greater than 0.2 and CouchDB 0.8 requires CouchREST between 0.16 and 0.2. You will be warned on CouchFoo initialization if this is wrong.

Examples of usage

Basic operations are the same as ActiveRecord:

class Address < CouchFoo::Base
  property :number, Integer
  property :street, String
  property :postcode # Any generic type is fine as long as .to_json and class.from_json(json) can be called on it
end 

address1 = Address.create(:number => 3, :street => "My Street", :postcode => "secret") # Create address
address2 = Address.create(:number => 27, :street => "Another Street", :postcode => "secret")
Address.all # = [address1, address2] or maybe [address2, address1] depending on key generation
Address.first    # = address1 or address2 depending on keys so probably isn't as expected
Address.find_by_street("My Street") # = address1

As key generation is through a UUID scheme, the order can’t be predicted. However you can order the results by default:

class Address < CouchFoo::Base
  property :number, Integer
  property :street, String
  property :postcode # Any generic type is fine as long as .to_json can be called on it
  property :created_at, DateTime

  default_sort :created_at
end

Address.all # = [address1, address2]
Address.first    # = address1 or address2, sorting is applied after results
Address.first(:use_key => :created_at) # = address1 but at the price of creating a new index

Conditions work slightly differently:

Address.find(:all, :conditions {:street => "My Street"}) # = address1, creates index on :street
Address.find(:all, :conditions {:created_at => "sometime"}) # Uses same index as :use_key => :created_at
Address.find(:all, :use_key => :street, :startkey => 'p') # All streets from p in alphabet, reuses the index created 2 lines up

As well as providing support for people using relational databases, CouchFoo attempts to provide a library for those wanting to use CouchDB as a document-oriented database:

class Document < CouchFoo::Base
  property :number, Integer
  property :street, String

  view :number_ordered, "function(doc) {emit([doc.number , doc.street], doc); }", nil, :descending => true
end

Document.number_ordered(:limit => 75) # Will get the last 75 documents in the database ordered by number, street attributes

Associations work as expected but you must to remember to add the properties required for an association (we’ll make this automatic soon):

class House < CouchFoo::Base
	has_many :windows
end

class Window < CouchFoo::Base
  property :house_id, String
	belongs_to :house
end

Credits

This gem was inspired some excellent work on CouchPotato, CouchREST, ActiveCouch and RelaxDB gems. Each offered its own benefits and own challenges. After hacking with each I couldn’t get a library was happy with. So I started with ActiveRecord and modified it to work with CouchDB. Some areas required more work than others but a lot of features were achieved for free once the base level of functionality had been achieved. Credit to DHH, the rails core guys and the CouchDB gems that inspired this work.

What’s left to do?

Please feel free to fork and hit me with a request to merge back in. At the moment, the following areas need addressing:

  • Proper integration with rails using a couchdb.yml config file

  • Example ruby script for updating indexes as an external process, and the same for database compacting

  • :include for database queries

  • Remove dependency from CouchRest - keep support for both CouchDB 0.8 and 0.9 for now. Add support for attachments and log calls to server so know performance times. General tidy up of database.rb. Add support for etags?

  • inline and HABTM inline associations. Also x_id and x_type properties should be automatically defined

  • compatability with willpaginate and friends

  • cleanup of associations, reflection, validations classes (were modified in a rush)

  • has_many :through

  • Rest of calculations, timezones, aggregations and fixtures

  • Diff with AR to check got everything in

  • DataMapper interface

  • Tool to migrate DB from MySQL to CouchDB

  • some kind of generic admin interface where can just browse around data structure?

  • grep of code and doc for records and columns, should be documents and attributes

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.