cequel / cequel

Ruby ORM for Cassandra with CQL3

License: MIT License


cequel's Introduction

Cequel

Cequel is a Ruby ORM for Cassandra using CQL3.

Cequel::Record is an ActiveRecord-like domain model layer that exposes the robust data modeling capabilities of CQL3, including parent-child relationships via compound primary keys and collection columns.

The lower-level Cequel::Metal layer provides a CQL query builder interface inspired by the excellent Sequel library.

Installation

Add it to your Gemfile:

gem 'cequel'

If you use Rails 5, add this:

gem 'activemodel-serializers-xml'

Rails integration

Cequel does not require Rails, but if you are using Rails, you will need version 3.2+. Cequel::Record will read from the configuration file config/cequel.yml if it is present. You can generate a default configuration file with:

rails g cequel:configuration

Once you've got things configured (or decided to accept the defaults), run this to create your keyspace (database):

rake cequel:keyspace:create

Setting up Models

Unlike in ActiveRecord, models declare their properties inline. We'll start with a simple Blog model:

class Blog
  include Cequel::Record

  key :subdomain, :text
  column :name, :text
  column :description, :text
end

Unlike a relational database, Cassandra does not have auto-incrementing primary keys, so you must explicitly set the primary key when you create a new model. For blogs, we use a natural key, which is the subdomain. Another option is to use a UUID.

Compound keys and parent-child relationships

While Cassandra is not a relational database, compound keys do naturally map to parent-child relationships. Cequel supports this explicitly with the has_many and belongs_to relations. Let's create a model for posts that acts as the child of the blog model:

class Post
  include Cequel::Record
  belongs_to :blog
  key :id, :timeuuid, auto: true
  column :title, :text
  column :body, :text
end

The auto option for the key declaration means Cequel will initialize new records with a UUID already generated. This option is only valid for :uuid and :timeuuid key columns.

The belongs_to association accepts a :foreign_key option which allows you to specify the attribute used as the partition key.

Note that the belongs_to declaration must come before the key declaration. This is because belongs_to defines the partition key; the id column is the clustering column.

Practically speaking, this means that posts are accessed using both the blog_subdomain (automatically defined by the belongs_to association) and the id. The most natural way to represent this type of lookup is using a has_many association. Let's add one to Blog:

class Blog
  include Cequel::Record

  key :subdomain, :text
  column :name, :text
  column :description, :text

  has_many :posts
end

Now we might do something like this:

class PostsController < ActionController::Base
  def show
    Blog.find(current_subdomain).posts.find(params[:id])
  end
end

A parent-child relationship between namespaced models can be defined using the class_name option of belongs_to:

module Blogger
  class Blog
    include Cequel::Record

    key :subdomain, :text
    column :name, :text
    column :description, :text

    has_many :posts
  end
end

module Blogger
  class Post
    include Cequel::Record

    belongs_to :blog, class_name: 'Blogger::Blog'
    key :id, :timeuuid, auto: true
    column :title, :text
    column :body, :text
  end
end

Compound Partition Keys

If you wish to declare a compound partition key in a model, you can do something like:

class Post
  include Cequel::Record

  key :country, :text, partition: true
  key :blog, :text, partition: true
  key :id, :timeuuid, auto: true
  column :title, :text
  column :body, :text
end

Your compound partition key here is (country, blog), and the entire compound primary key is ((country, blog), id). Any key values defined after the last partition key value are clustering columns.
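
To make the key layout concrete, here is a minimal plain-Ruby sketch that renders the PRIMARY KEY clause implied by a list of key declarations. KeyDecl and primary_key_clause are hypothetical names used only for illustration; they are not part of Cequel, and this is not how Cequel builds its schema.

```ruby
# Hypothetical illustration, not part of Cequel: one entry per `key` declaration.
KeyDecl = Struct.new(:name, :partition_key)

# Renders the CQL PRIMARY KEY clause for an ordered list of key declarations.
def primary_key_clause(keys)
  partition, clustering = keys.partition(&:partition_key)
  # With no explicit partition keys, the first key acts as the partition key.
  partition = [clustering.shift] if partition.empty?
  partition_part =
    if partition.size > 1
      "(#{partition.map(&:name).join(', ')})"
    else
      partition.first.name.to_s
    end
  "PRIMARY KEY (#{[partition_part, *clustering.map(&:name)].join(', ')})"
end

post_keys = [
  KeyDecl.new(:country, true),
  KeyDecl.new(:blog, true),
  KeyDecl.new(:id, false)
]
puts primary_key_clause(post_keys) # PRIMARY KEY ((country, blog), id)
```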

Timestamps

If your final primary key column is a timeuuid with the :auto option set, the created_at method will return the time that the UUID key was generated.

To add timestamp columns, simply use the timestamps class macro:

class Blog
  include Cequel::Record

  key :subdomain, :text
  column :name, :text
  timestamps
end

This will automatically define created_at and updated_at columns, and populate them appropriately on save.

If the creation time can be extracted from the primary key as outlined above, this method will be preferred and no created_at column will be defined.
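
The creation time can be recovered from a timeuuid because a version 1 UUID embeds a 60-bit timestamp, counted in 100-nanosecond ticks since 1582-10-15. The following plain-Ruby sketch shows the decoding; time_from_timeuuid is a hypothetical helper written for this example and is not Cequel's own implementation.

```ruby
# 100-nanosecond ticks between the UUID epoch (1582-10-15) and the Unix epoch.
UUID_EPOCH_OFFSET = 0x01B21DD213814000

# Returns the Time embedded in a version 1 (time-based) UUID string.
def time_from_timeuuid(uuid)
  time_low, time_mid, time_hi_and_version =
    uuid.split('-').first(3).map { |part| part.to_i(16) }
  version = time_hi_and_version >> 12
  raise ArgumentError, "not a version 1 UUID: #{uuid}" unless version == 1

  # Reassemble the 60-bit timestamp from its three fields.
  ticks = ((time_hi_and_version & 0x0FFF) << 48) | (time_mid << 32) | time_low
  Time.at(Rational(ticks - UUID_EPOCH_OFFSET, 10_000_000)).utc
end

# The smallest timeuuid for the Unix epoch decodes to 1970-01-01 00:00:00 UTC.
puts time_from_timeuuid('13814000-1dd2-11b2-8080-808080808080')
```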

Enums

If a column should behave like an ActiveRecord::Enum, you can use the column type :enum. The value is stored using the :int data type, and the model exposes some helper methods:

class Blog
  include Cequel::Record

  key :subdomain, :text
  column :name, :text
  column :status, :enum, values: { open: 1, closed: 2 }
end

blog = Blog.new(status: :open)
blog.open? # true
blog.closed? # false
blog.status # :open

Blog.status # { open: 1, closed: 2 }

Schema synchronization

Cequel will automatically synchronize the schema stored in Cassandra to match the schema you have defined in your models. If you're using Rails, you can synchronize your schemas for everything in app/models by invoking:

rake cequel:migrate

Record sets

Record sets are lazy-loaded collections of records that correspond to a particular CQL query. They behave similarly to ActiveRecord scopes:

Post.select(:id, :title).reverse.limit(10)

To scope a record set to a primary key value, use the [] operator. This will define a scoped value for the first unscoped primary key in the record set:

Post['bigdata'] # scopes posts with blog_subdomain="bigdata"

You can pass multiple arguments to the [] operator, which will generate an IN query:

Post['bigdata', 'nosql'] # scopes posts with blog_subdomain IN ("bigdata", "nosql")

To select ranges of data, use before, after, from, upto, and in. Like the [] operator, these methods operate on the first unscoped primary key:

Post['bigdata'].after(last_id) # scopes posts with blog_subdomain="bigdata" and id > last_id

You can also use where to scope to primary key columns, but a primary key column can only be scoped if all the columns that come before it are also scoped:

Post.where(blog_subdomain: 'bigdata') # this is fine
Post.where(blog_subdomain: 'bigdata', permalink: 'cassandra') # also fine
Post.where(blog_subdomain: 'bigdata').where(permalink: 'cassandra') # also fine
Post.where(permalink: 'cassandra') # bad: can't use permalink without blog_subdomain
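
The rule can be stated as: the scoped key columns must form an unbroken prefix of the primary key. A minimal sketch of that check in plain Ruby follows; valid_key_scope? is a hypothetical helper for illustration, not Cequel's validation code.

```ruby
# Hypothetical check, not Cequel's implementation: the scoped key columns
# must be exactly the first N columns of the primary key, in any order.
def valid_key_scope?(key_columns, scoped_columns)
  key_columns.take(scoped_columns.size).sort == scoped_columns.sort
end

post_keys = [:blog_subdomain, :permalink]
puts valid_key_scope?(post_keys, [:blog_subdomain])             # true
puts valid_key_scope?(post_keys, [:blog_subdomain, :permalink]) # true
puts valid_key_scope?(post_keys, [:permalink])                  # false
```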

Note that record sets always load records in batches; Cassandra does not support result sets of unbounded size. This process is transparent to you but you'll see multiple queries in your logs if you're iterating over a huge result set.
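
As a rough picture of what batched loading means, the sketch below pages through an in-memory list, asking for each batch after the last key it has seen. each_in_batches and the fetch lambda are hypothetical stand-ins for illustration; Cequel's actual batching logic may differ.

```ruby
# Hypothetical batch iterator: `fetch_page` stands in for one query that
# returns up to `batch_size` rows with keys greater than `last_key`.
def each_in_batches(batch_size, fetch_page)
  return enum_for(:each_in_batches, batch_size, fetch_page) unless block_given?

  last_key = nil
  loop do
    page = fetch_page.call(last_key, batch_size)
    page.each { |row| yield row }
    break if page.size < batch_size # a short page means no more rows

    last_key = page.last
  end
end

rows = (1..7).to_a # stand-in for rows, identified by their key
queries = 0
fetch = lambda do |after, limit|
  queries += 1
  rows.select { |id| after.nil? || id > after }.first(limit)
end

result = each_in_batches(3, fetch).to_a
puts "loaded #{result.size} rows in #{queries} queries"
```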

Time UUID Queries

CQL has special handling for the timeuuid type, which allows you to return rows whose UUID keys correspond to a range of timestamps.

Cequel automatically constructs timeuuid range queries if you pass a Time value for a range over a timeuuid column. So, if you want to get the posts from the last day, you can run:

Blog['myblog'].posts.from(1.day.ago)

Updating records

When you update an existing record, Cequel will only write statements to the database that correspond to explicit modifications you've made to the record in memory. So, in this situation:

@post = Blog.find(current_subdomain).posts.find(params[:id])
@post.update_attributes!(title: "Announcing Cequel 1.0")

Cequel will only update the title column. Note that this is not full dirty tracking; simply setting the title on the record will signal to Cequel that you want to write that attribute to the database, regardless of its previous value.

Unloaded models

In the above example, we call the familiar find method to load a blog and then one of its posts, but we didn't actually do anything with the data in the Blog model; it was simply a convenient object-oriented way to get a handle to the blog's posts. Cequel supports unloaded models via the [] operator; this will return an unloaded blog instance, which knows the value of its primary key, but does not read the row from the database. So, we can refactor the example to be a bit more efficient:

class PostsController < ActionController::Base
  def show
    @post = Blog[current_subdomain].posts.find(params[:id])
  end
end

If you attempt to access a data attribute on an unloaded instance, it will lazy-load the row from the database and become a normal loaded instance.

You can generate a collection of unloaded instances by passing multiple arguments to []:

class BlogsController < ActionController::Base
  def recommended
    @blogs = Blog['cassandra', 'nosql']
  end
end

The above will not generate a CQL query, but when you access a property on any of the unloaded Blog instances, Cequel will load data for all of them with a single query. Note that CQL does not allow selecting collection columns when loading multiple records by primary key; only scalar columns will be loaded.

There is another use for unloaded instances: you may set attributes on an unloaded instance and call save without ever actually reading the row from Cassandra. Because Cassandra is optimized for writing data, this "write without reading" pattern gives you maximum efficiency, particularly if you are updating a large number of records.
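
The idea behind "write without reading" can be sketched in plain Ruby: an object that knows only its primary key and the attributes explicitly assigned to it has everything it needs to build an UPDATE. UnloadedRecordSketch is a hypothetical class for illustration and does not reflect Cequel's internals.

```ruby
# Hypothetical stand-in for an unloaded record; not Cequel's implementation.
class UnloadedRecordSketch
  def initialize(table, key)
    @table = table
    @key = key      # primary key values, known without reading the row
    @changes = {}   # only the attributes that were explicitly assigned
  end

  def []=(column, value)
    @changes[column] = value
  end

  # Returns the statement and bind values that a save would need to send.
  def update_statement
    assignments = @changes.keys.map { |col| "#{col} = ?" }.join(', ')
    conditions = @key.keys.map { |col| "#{col} = ?" }.join(' AND ')
    ["UPDATE #{@table} SET #{assignments} WHERE #{conditions}",
     @changes.values + @key.values]
  end
end

post = UnloadedRecordSketch.new(:posts, { blog_subdomain: 'bigdata', id: 'post-1' })
post[:title] = 'Announcing Cequel 1.0'
cql, binds = post.update_statement
puts cql
```

No SELECT is needed at any point: the statement touches only the title column of the row identified by the key.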

Collection columns

Cassandra supports three types of collection columns: lists, sets, and maps. Collection columns can be manipulated using atomic collection mutation; e.g., you can add an element to a set without knowing the existing elements. Cequel supports this by exposing collection objects that keep track of their modifications, and which then persist those modifications to Cassandra on save.

Let's add a category set to our post model:

class Post
  include Cequel::Record

  belongs_to :blog
  key :id, :uuid
  column :title, :text
  column :body, :text
  set :categories, :text
end

If we were to then update a post like so:

@post = Blog[current_subdomain].posts[params[:id]]
@post.categories << 'Kittens'
@post.save!

Cequel would send the CQL equivalent of "Add the category 'Kittens' to the post at the given (blog_subdomain, id)", without ever reading the saved value of the categories set.
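
A simplified picture of this, in plain Ruby: the collection object records what was added and renders it as a CQL set addition, which Cassandra applies without the client knowing the existing elements. TrackedSetSketch is a hypothetical class for illustration, not Cequel's collection implementation.

```ruby
require 'set'

# Hypothetical wrapper, not Cequel's implementation: it records additions
# instead of requiring the current contents of the set.
class TrackedSetSketch
  def initialize(column)
    @column = column
    @added = Set.new
  end

  def <<(element)
    @added << element
    self
  end

  # CQL assignment for an atomic set addition, with single quotes escaped.
  def to_cql_assignment
    literals = @added.map { |e| "'#{e.gsub("'", "''")}'" }.join(', ')
    "#{@column} = #{@column} + {#{literals}}"
  end
end

categories = TrackedSetSketch.new(:categories)
categories << 'Kittens'
puts "UPDATE posts SET #{categories.to_cql_assignment} WHERE blog_subdomain = ? AND id = ?"
```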

Secondary indexes

Cassandra supports secondary indexes, although with notable restrictions:

Only scalar data columns can be indexed; key columns and collection columns cannot.
A secondary index consists of exactly one column.
Though you can have more than one secondary index on a table, you can only use one in any given query.

Cequel supports the :index option to add secondary indexes to column definitions:

class Post
  include Cequel::Record

  belongs_to :blog
  key :id, :uuid
  column :title, :text
  column :body, :text
  column :author_id, :uuid, :index => true
  set :categories, :text
end

Defining a column with a secondary index adds several "magic methods" for using the index:

Post.with_author_id(id) # returns a record set scoped to that author_id
Post.find_by_author_id(id) # returns the first post with that author_id
Post.find_all_by_author_id(id) # returns an array of all posts with that author_id

You can also call the where method directly on record sets:

Post.where(author_id: id)

Consistency tuning

Cassandra supports tunable consistency, allowing you to choose the right balance between query speed and consistent reads and writes. Cequel supports consistency tuning for reads and writes:

Post.new(id: 1, title: 'First post!').save!(consistency: :all)

Post.consistency(:one).find_each { |post| puts post.title }

Both read and write consistency default to QUORUM.
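
For reference, QUORUM means a majority of replicas, and reads are guaranteed to see the latest acknowledged write when the replicas consulted by reads and writes overlap. The helpers below are hypothetical, written only to show the arithmetic; they are not part of Cequel.

```ruby
# Number of replicas that must respond for a QUORUM read or write.
def quorum(replication_factor)
  replication_factor / 2 + 1
end

# Reads and writes overlap on at least one replica when R + W > N.
def overlapping?(read_replicas, write_replicas, replication_factor)
  read_replicas + write_replicas > replication_factor
end

rf = 3
puts quorum(rf)                                 # 2
puts overlapping?(quorum(rf), quorum(rf), rf)   # true: QUORUM reads and writes
puts overlapping?(1, 1, rf)                     # false: ONE reads and ONE writes
```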

Compression

Cassandra supports frame compression, which can give you a performance boost if your requests or responses are big. To enable it, specify the client_compression option in cequel.yml:

development:
  host: '127.0.0.1'
  port: 9042
  keyspace: Blog
  client_compression: :lz4

ActiveModel Support

Cequel supports ActiveModel functionality, such as callbacks, validations, dirty attribute tracking, naming, and serialization. If you're using Rails 3, mass-assignment protection works as usual, and in Rails 4, strong parameters are treated correctly. So we can add some extra ActiveModel goodness to our post model:

class Post
  include Cequel::Record

  belongs_to :blog
  key :id, :uuid
  column :title, :text
  column :body, :text

  validates :body, presence: true

  after_save :notify_followers
end

Note that validations or callbacks that need to read data attributes will cause unloaded models to load their row during the course of the save operation, so if you are following a write-without-reading pattern, you will need to be careful.

Dirty attribute tracking is only enabled on loaded models.

Upgrading from Cequel 0.x

Cequel 0.x targeted CQL2, which has a substantially different data representation from CQL3. Accordingly, upgrading from Cequel 0.x to Cequel 1.0 requires some changes to your data models.

Upgrading a Cequel::Model

Upgrading from a Cequel::Model class is fairly straightforward; simply add the compact_storage directive to your class definition:

# Model definition in Cequel 0.x
class Post
  include Cequel::Model

  key :id, :uuid
  column :title, :text
  column :body, :text
end

# Model definition in Cequel 1.0
class Post
  include Cequel::Record

  key :id, :uuid
  column :title, :text
  column :body, :text

  compact_storage
end

Note that the semantics of belongs_to and has_many are completely different between Cequel 0.x and Cequel 1.0; if you have data columns that reference keys in other tables, you will need to hand-roll those associations for now.

Upgrading a Cequel::Model::Dictionary

CQL3 does not have a direct "wide row" representation like CQL2, so the Dictionary class does not have a direct analog in Cequel 1.0. Instead, each row key-map key-value tuple in a Dictionary corresponds to a single row in CQL3. Upgrading a Dictionary to Cequel 1.0 involves defining two primary keys and a single data column, again using the compact_storage directive:

# Dictionary definition in Cequel 0.x
class BlogPosts < Cequel::Model::Dictionary
  key :blog_id, :uuid
  maps :uuid => :text

  private

  def serialize_value(column, value)
    value.to_json
  end

  def deserialize_value(column, value)
    JSON.parse(value)
  end
end

# Equivalent model in Cequel 1.0
class BlogPost
  include Cequel::Record

  key :blog_id, :uuid
  key :id, :uuid
  column :data, :text

  compact_storage

  def data
    JSON.parse(read_attribute(:data))
  end

  def data=(new_data)
    write_attribute(:data, new_data.to_json)
  end
end

Cequel::Model::Dictionary did not infer a pluralized table name, as Cequel::Model did and Cequel::Record does. If your legacy Dictionary table has a singular table name, add self.table_name = :blog_post to the model definition.

Note that you will want to run ::synchronize_schema on your models when upgrading; this will not change the underlying data structure, but will add some CQL3-specific metadata to the table definition which will allow you to query it.

CQL Gotchas

CQL is designed to be immediately familiar to those of us who are used to working with SQL, which is all of us. Cequel advances this spirit by providing an ActiveRecord-like mapping for CQL. However, Cassandra is very much not a relational database, so some behaviors can come as a surprise. Here's an overview.

Upserts

Perhaps the most surprising fact about CQL is that INSERT and UPDATE are essentially the same thing: both simply persist the given column data at the given key(s). So, you may think you are creating a new record, but in fact you're overwriting data at an existing record:

# I'm just creating a blog here.
blog1 = Blog.create!(
  subdomain: 'big-data',
  name: 'Big Data',
  description: 'A blog about all things big data')

# And another new blog.
blog2 = Blog.create!(
  subdomain: 'big-data',
  name: 'The Big Data Blog')

Living in a relational world, we'd expect the second statement to throw an error because the row with key 'big-data' already exists. But not Cassandra: the above code will just overwrite the name in that row. Note that the description will not be touched by the second statement; upserts only work on the columns that are given.
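
The merge behavior can be simulated with a plain Ruby hash. UpsertTableSketch below is a hypothetical in-memory stand-in for a table, meant only to illustrate the semantics described above; it does not talk to Cassandra.

```ruby
# In-memory stand-in for a table: every write merges the given columns
# into whatever is already stored at that key.
class UpsertTableSketch
  def initialize
    @rows = {}
  end

  def write(key, columns)
    (@rows[key] ||= {}).merge!(columns)
  end

  def read(key)
    @rows[key]
  end
end

blogs = UpsertTableSketch.new
blogs.write('big-data', { name: 'Big Data', description: 'A blog about all things big data' })
blogs.write('big-data', { name: 'The Big Data Blog' })

# The name is overwritten; the description from the first write survives.
p blogs.read('big-data')
```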

Counting

Counting is not the same as in an RDB, as it can have a much longer runtime and can put unexpected load on your cluster. As a result, Cequel does not support this feature. If you need counts, you can still execute raw CQL:

MyModel.connection.execute('select count(*) from table_name;').first['count']

Compatibility

Rails

Ruby

Ruby 2.5, 2.4, 2.3, 2.2, 2.1, 2.0

Cassandra

2.1.x
2.2.x
3.0.x

Breaking API changes

3.0

Dropped support for changing the type of cluster keys because the ability has been removed from Cassandra. Calls to #change_column must be removed.
Dropped support for the previously deprecated signature of the #column method of the schema DSL. Uses like column :my_column, :text, true must be rewritten as column :my_column, :text, indexed: true

2.0

Dropped support for JRuby, due to JRuby bugs that are difficult to work around. PRs to restore JRuby compatibility are welcome.

Support & Bugs

If you find a bug, feel free to open an issue on GitHub. Pull requests are most welcome.

For questions or feedback, hit up our mailing list at [email protected] or find outoftime in the #cassandra IRC channel on Freenode.

Contributing

See CONTRIBUTING.md

Credits

Cequel was written by an awesome lot. Thanks to you all.

Special thanks to Brewster, which supported the 0.x releases of Cequel.

Shameless Self-Promotion

If you're new to Cassandra, check out Learning Apache Cassandra, a hands-on guide to Cassandra application development by example, written by the creator of Cequel.

License

Cequel is distributed under the MIT license. See the attached LICENSE for all the sordid details.


cequel's Issues

allow setting cql version

allow setting the cassandra-cql option :cql_version

e.g.

Cequel.connect(
:host => '127.0.0.1:9160',
:keyspace => 'myapp_development',
:cql_version => '3.0.0'
)

see
https://github.com/kreynolds/cassandra-cql/blob/master/lib/cassandra-cql/database.rb

Add to_guid support to UUIDs

To ease the transition from cassandra-cql to cql-rb, add a #to_guid alias to Cql::Uuid. Mark it as deprecated.

Add credentials to Cql::Client.connect

Explicit pagination for scopes

Something to the effect of:

next_page_params = results.next_page_params
# presumably a request/response cycle
results = results.next_page(next_page_params)

Can't retrieve ttl

Is there a way to retrieve the ttl?

Automigrations

Model classes should be able to "automigrate" the schema to match that defined in their class bodies.

Counters

Cequel::Record should support counter columns. This would be done by having a "companion" counter column table. It might be complicated.

User-facing, it should look like this:

class Post
  include Cequel::Record

  belongs_to :blog
  key :id, :timeuuid, auto: true
  column :title, :string
  column :body, :string
  column :view_count, :counter
end

Under the hood this would define both a posts table and a post_counters table; they would have identical primary key columns and then have the respective data/counter columns.

Composite-keyed Table

Hi
Is it possible to build a composite-keyed table with Cequel?

For example this:

create table Bite (
      partkey varchar,
      score bigint,
      id varchar,
      data varchar,
      PRIMARY KEY (partkey, score, id)
) with clustering order by (score desc);

And, additionally, how can I set a clustering order?
Thanks

Wide row support (billions of columns)

Is there any support for wide rows (i.e. billions of columns) in Cequel? I looked into the specs and lib. Ideally, there would be a way to lazy-load a range of n-to-m columns using an accessor method.

Does this already exist?

Is this a feature that might be added?

Is it better to use Cequel w/ many skinny-rows instead of one wide column?

Great library, looking forward to a reply.

Reconnect. Cql::NotConnectedError

How can I reconnect if I lose my current connection?

cequel.yml

development:
  hosts:
    - '192.168.1.1'
    - '192.168.1.2'
  port: 9042
  keyspace: test_cequel

Thanks

omitting :thrift from Cequel.connect issue

Hi,

the following code
c = Cequel.connect(
:host => '127.0.0.1:9160',
:keyspace => 'myks',
)

fails with:
TypeError: can't convert nil into Hash
org/jruby/RubyHash.java:1617:in `merge'
/home/emakris/.rvm/gems/jruby-1.7.0.preview1/gems/cassandra-cql-1.1.0/lib/cassandra-cql/database.rb:30:in `initialize'
/home/emakris/.rvm/gems/jruby-1.7.0.preview1/gems/cequel-0.4.1/lib/cequel/keyspace.rb:24:in `connection'
/home/emakris/.rvm/gems/jruby-1.7.0.preview1/gems/cequel-0.4.1/lib/cequel/keyspace.rb:47:in `execute'
/home/emakris/.rvm/gems/jruby-1.7.0.preview1/gems/cequel-0.4.1/lib/cequel/keyspace.rb:89:in `log'
/home/emakris/.rvm/gems/jruby-1.7.0.preview1/gems/cequel-0.4.1/lib/cequel/keyspace.rb:46:in `execute'
/home/emakris/.rvm/gems/jruby-1.7.0.preview1/gems/cequel-0.4.1/lib/cequel/data_set.rb:192:in `each'

@thrift_options is nil and being passed to cassandra-cql which expects a hash.

Clarify querying over multiple keys

Cequel::Model::Dictionary did some heavy lifting by defining [] and []= in 0.5.6. Now that the preferred method in CQL 3 is to use composite primary keys, it's not readily apparent how to fetch or issue range queries for the two keys.

dependent => destroy for has_many

Support CQL binary protocol

The cql-rb driver provides bindings to the CQL binary protocol.

Support schema creation and modification

Lower-level interface should support the creation and destruction of keyspaces, creation/modification/destruction of column families, etc.

Upgrade from 0.5.6 should mention table name changes

This one may just require a small documentation change, but in 0.5.6 the table names for the models were singular. Now they seem to be plural like ActiveRecord uses. I didn't see this mentioned in the upgrade guide. For now I just set the table name in each of the models.

Timestamp Type is handled as string instead of Time

When using CQL3's timestamp type, cequel returns a string instead of a Time object.

Obviously the return value from cassandra is "Date plus time, encoded as 8 bytes since epoch". CQL3 is able to parse DateTime to timestamp but it seems to return the unencoded bytes.

So first the internal_name needs to be changed to .TimestampType, and second the returned byte value needs to be converted to a Time object.

Metal outside of cequel

Hi there,

I was curious if there was any chance you would consider breaking Cequel::Metal out into its own gem along the lines of its inspiration, Sequel. I don't need a full ORM, but the CQL generation it does is light-years ahead of what I've been doing.

If not, can you think of anything that would break if I pull in the whole gem and just use that? I took a look and it looks like everything is handled with modules which means if I don't include them, they won't interfere with any of my code, but I figure you would have a much better sense of where you might be plugging into Rails.

Thanks!

Cql::QueryError - line 1:36 missing EOF

I'm getting a really weird bug that appears to be due to the "find_by" auto-generated methods. I have a model:

class User
  include Cequel::Record
  key :guid, :ascii
  column :email, :text
end

Later, I try to fetch the user using a header value:

guid = request.env['HTTP_X_TOKEN'].split('-')[0]
user = User.find_by_guid(guid)

It crashes and I get this dump:

Cql::QueryError - line 1:36 missing EOF at 'mht8i':
    /vendor/bundle/ruby/2.1.0/gems/cql-rb-1.2.1/lib/cql/client/synchronous_client.rb:54:in `execute'
    /vendor/bundle/ruby/2.1.0/gems/cequel-1.2.4/lib/cequel/metal/keyspace.rb:174:in `block in execute_with_consistency'
    /vendor/bundle/ruby/2.1.0/gems/cequel-1.2.4/lib/cequel/metal/request_logger.rb:34:in `block in log'
    /vendor/bundle/ruby/2.1.0/gems/activesupport-4.0.2/lib/active_support/core_ext/benchmark.rb:12:in `block in ms'
    /Users/smart/.rbenv/versions/2.1.2/lib/ruby/2.1.0/benchmark.rb:294:in `realtime'
    /vendor/bundle/ruby/2.1.0/gems/activesupport-4.0.2/lib/active_support/core_ext/benchmark.rb:12:in `ms'
    /vendor/bundle/ruby/2.1.0/gems/cequel-1.2.4/lib/cequel/metal/request_logger.rb:34:in `log'
    /vendor/bundle/ruby/2.1.0/gems/cequel-1.2.4/lib/cequel/metal/keyspace.rb:173:in `execute_with_consistency'
    /vendor/bundle/ruby/2.1.0/gems/cequel-1.2.4/lib/cequel/metal/data_set.rb:662:in `execute_cql'
    /vendor/bundle/ruby/2.1.0/gems/cequel-1.2.4/lib/cequel/metal/data_set.rb:584:in `each'
    /vendor/bundle/ruby/2.1.0/gems/cequel-1.2.4/lib/cequel/record/record_set.rb:675:in `entries'
    /vendor/bundle/ruby/2.1.0/gems/cequel-1.2.4/lib/cequel/record/record_set.rb:675:in `find_rows_in_single_batch'
    /vendor/bundle/ruby/2.1.0/gems/cequel-1.2.4/lib/cequel/record/record_set.rb:585:in `find_rows_in_batches'
    /vendor/bundle/ruby/2.1.0/gems/cequel-1.2.4/lib/cequel/record/record_set.rb:569:in `find_each_row'
    /vendor/bundle/ruby/2.1.0/gems/cequel-1.2.4/lib/cequel/record/record_set.rb:533:in `find_each'
    /vendor/bundle/ruby/2.1.0/gems/cequel-1.2.4/lib/cequel/record/record_set.rb:478:in `each'
    /vendor/bundle/ruby/2.1.0/gems/cequel-1.2.4/lib/cequel/record/record_set.rb:478:in `first'
    /vendor/bundle/ruby/2.1.0/gems/cequel-1.2.4/lib/cequel/record/record_set.rb:478:in `first'
    /vendor/bundle/ruby/2.1.0/gems/cequel-1.2.4/lib/cequel/record/finders.rb:87:in `find_by_guid'

If I change the code to this, it works:

user = User.find_by_guid("cmht8i")

Or if I change the code to this, it also works:

user = User.where(:guid, guid)

Cequel::Model and Grape

How can you use ::Model with a non-Rails app? I tried to do Cequel::Model.configure, but keep getting
ThriftClient::NoServersAvailable

Time marshaling confuses UTC and local time

I have a record defined with the timestamps helper.

In Ruby, Time.now returns local time.

When I create! a new record, the timestamps helper sets updated_at to a Time in UTC.

In the DB, Cassandra says updated_at is local time.

When restored from the DB, the record's updated_at is in local time.

Serialized column types

Should support serializing arbitrary objects to JSON and messagepack at a minimum.

Update/rewrite README for 1.0

RecordSet reform

The current RecordSet class plays two roles:

Encode a certain scope, which can be translated into a CQL query. This is distinct from a DataSet in that the RecordSet is schema-aware, and thus provides a more semantic interface to building queries (and avoids exposing an interface to create illegal ones).
Actually query the data store and provide an enumerator over result records, possibly in multiple batches.

Move the functionality from the first role into a RecordScope class, which the RecordSet encapsulates and uses to generate queries which it then performs. Further, the link between either of these and the actual record class is weak; mostly, they just need access to the class's schema. RecordSet should just be initialized with a schema and optionally a block that specifies what to do with the result rows (e.g. hydrate Record instances).

Typecasting everywhere

Places we're not currently enforcing types:

Collection columns
Record set construction

Support for CQL3 set, list, map column types?

Wondering if latest version of cequel supports the set, list, map column types in DSE Cassandra 1.2: http://www.datastax.com/dev/blog/cql3_collections?

Always order by the first clustering column

Currently RecordSets order by the scoped range_key, which is incorrect:

ORDER BY clauses can select a single column only. That column has to be the second column in a compound PRIMARY KEY. This also applies to tables with more than two column components in the primary key.

Ordering should always be by the first clustering column.

Prevent Cequel::Model auto-load cequel.yml

Is there a way to prevent Cequel::Model from auto-loading your cequel.yml (if it exists)?
I would like to control how my application processes that yml file. I would also like to use a different configuration filename.
Any suggestions?

Run tests in in-memory mode

Hi all,
Wondering if cequel gem provides an in_memory option?
This way I can run tests for my service without needing to have a cassandra server running locally. If no option exists any plans to add it?

Single-table inheritance

Rake tasks for Sinatra

Seems like the migration task on record/tasks.rb uses the Rails class. Would be nice to have it framework-agnostic so it can work on Sinatra, Grape etc.

Enforce key correctness

Fail fast on attempt to save a model that doesn't have all keys defined
Don't allow changing of key on non-new model

Incrementing counter in callback fails due to batching

I have an after_save callback which updates some counters, but it always fails with:

Counter mutations are only allowed in COUNTER batches

Perhaps there's a way to create a new batch for just the counters? I tried wrapping in connection.batch do but it still got applied in the same batch as before.

Can't filter by both primary key and secondary index

This works and gets me the result

2.1.0 :019 > Cequel::Record.connection.execute_with_consistency("SELECT * FROM userlines WHERE username = 'admin' AND msg_id = dcedde7a-dd0a-11e3-be13-3118a5800f92", [], consistency: :one)

But this doesn't

2.1.0 :020 > Userline.where(:username => "admin", :msg_id => Cequel::uuid("dcedde7a-dd0a-11e3-be13-3118a5800f92"))
Cequel::Record::IllegalQuery: Can't filter by both primary key and secondary index

Userline.rb

  key :username, :text
  key :time, :timeuuid, order: :desc, auto: true
  column :msg_id, :uuid, :index => true

Identity Map

Upgrade guide should mention new Cassandra requirements

This one is my fault for not reading all the way to the bottom of the document, but I just realized that the gem no longer works with Cassandra 1.1. I think highlighting this in the upgrade section might be worthwhile.

On the other hand, I think you could actually support 1.1. But I can understand if you don't want to. I had attempted to circumvent this by requiring 'cassandra-cql/1.1' in my Gemfile, but since you explicitly require 'cassandra-cql/1.2', that didn't work.

For what it's worth, the use case here is to try to upgrade to CQL 3.0 first and then upgrade my version of Cassandra so that I'm only tweaking one thing at a time.

cequel.yml configuration ignored on rails console

Configuration settings(config/cequel.yml) are being ignored when running the rails console.

Rails 4.1.1
Ruby 2.1.1
Cequel 1.2.4

Can't filter by both primary key and secondary index

I'm getting a Cequel::Record::IllegalQuery Can't filter by both primary key and secondary index exception, and I wanted to make sure this is by design.

I have two record classes:

class User
  include Cequel::Record
  key :id, :timeuuid, auto: true
  column :name, :text
end

class Document
  include Cequel::Record

  belongs_to :user
  key :id, :timeuuid, auto: true, order: :desc
  column :state, :ascii, :index => true
end

I'm trying to get all documents owned by a user that are undeleted. Example:

Document.where(:user_id => user_id, :state => :ok).each do |doc|
  # ... do stuff
end

This raises Cequel::Record::IllegalQuery Can't filter by both primary key and secondary index, but it seems like a pretty common scenario for 2i indexes. Even without a 2i index, I still get the same error.
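While this combination is rejected, one possible stopgap is to query by user_id alone and filter on state in Ruby. The sketch below is only an illustration of that filtering step: Doc is a plain Struct standing in for the Document model, filter_by_state is a hypothetical helper, and neither is part of Cequel. It assumes each user has few enough documents that loading them all is acceptable.

```ruby
# Stand-in for the Document record (hypothetical, not a Cequel model).
Doc = Struct.new(:user_id, :id, :state)

# Hypothetical helper: takes rows already fetched by user_id and keeps
# only the ones in the wanted state.
def filter_by_state(rows, state)
  rows.select { |doc| doc.state == state.to_s }
end

rows = [
  Doc.new(1, 'a', 'ok'),
  Doc.new(1, 'b', 'deleted'),
  Doc.new(1, 'c', 'ok')
]

ok_docs = filter_by_state(rows, :ok)
puts ok_docs.map(&:id).inspect   # ["a", "c"]
```

The cost is that every row for the user crosses the wire, so this trades server-side index use for client-side work.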

Add DatabaseCleaner support

https://github.com/bmabey/database_cleaner/tree/master/lib/database_cleaner
DatabaseCleaner is a handy gem that cleans your database between every test.
Are there any plans to have DatabaseCleaner support cequel ORM?

Best way to set default values

I'm struggling a bit to set default values with a new record.
I've got this model:

class AccessToken
  include Cequel::Record
  key :id, :text
  column :name, :text
end

I'd like to have an initializer where the :id gets generated automatically and I pass in the name. The Rails documentation suggests using after_initialize, but that callback doesn't exist here.
Overriding initialize doesn't give the desired effect either.
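One way to get a generated key without touching initialize is a class-level factory method. The sketch below uses a plain Ruby class as a stand-in for the AccessToken model, assuming only that the constructor accepts an attribute hash; the generate method and the choice of SecureRandom.hex for the id are illustrative assumptions, not something Cequel provides.

```ruby
require 'securerandom'

# Plain-Ruby stand-in for the AccessToken model (not a Cequel record).
class AccessToken
  attr_reader :id, :name

  def initialize(attributes = {})
    @id = attributes[:id]
    @name = attributes[:name]
  end

  # Hypothetical factory: builds the key first, then delegates to new,
  # so initialize itself is left untouched.
  def self.generate(name)
    new(id: SecureRandom.hex(16), name: name)
  end
end

token = AccessToken.generate('api-client')
puts token.id.length   # 32
puts token.name        # api-client
```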

Cql::Io::ConnectionError: Not enough bytes available to decode a bytes:

I don't know the reason behind this. At first it was working well, but all of a sudden it threw this error:

Cql::Io::ConnectionError: Not enough bytes available to decode a bytes: 1416128877 bytes required but only 15239 available

My cequel gem version is 1.2.4.

Here is the full error log from running just User.all:

Loading development environment (Rails 4.1.0)
:001 > User.all
Cql::Io::ConnectionError: Not enough bytes available to decode a bytes: 1416128877 bytes required but only 15239 available
from /home/ckgagan/.rvm/gems/ruby-2.1.1/gems/cql-rb-1.2.1/lib/cql/client/synchronous_client.rb:54:in `execute'
from /home/ckgagan/.rvm/gems/ruby-2.1.1/gems/cequel-1.2.4/lib/cequel/metal/keyspace.rb:174:in `block in execute_with_consistency'
from /home/ckgagan/.rvm/gems/ruby-2.1.1/gems/cequel-1.2.4/lib/cequel/metal/request_logger.rb:34:in `block in log'
from /home/ckgagan/.rvm/gems/ruby-2.1.1/gems/activesupport-4.1.0/lib/active_support/core_ext/benchmark.rb:12:in `block in ms'
from /home/ckgagan/.rvm/rubies/ruby-2.1.1/lib/ruby/2.1.0/benchmark.rb:294:in `realtime'
from /home/ckgagan/.rvm/gems/ruby-2.1.1/gems/activesupport-4.1.0/lib/active_support/core_ext/benchmark.rb:12:in `ms'
from /home/ckgagan/.rvm/gems/ruby-2.1.1/gems/cequel-1.2.4/lib/cequel/metal/request_logger.rb:34:in `log'
from /home/ckgagan/.rvm/gems/ruby-2.1.1/gems/cequel-1.2.4/lib/cequel/metal/keyspace.rb:173:in `execute_with_consistency'
from /home/ckgagan/.rvm/gems/ruby-2.1.1/gems/cequel-1.2.4/lib/cequel/metal/data_set.rb:662:in `execute_cql'
from /home/ckgagan/.rvm/gems/ruby-2.1.1/gems/cequel-1.2.4/lib/cequel/metal/data_set.rb:584:in `each'
from /home/ckgagan/.rvm/gems/ruby-2.1.1/gems/cequel-1.2.4/lib/cequel/record/record_set.rb:675:in `entries'
from /home/ckgagan/.rvm/gems/ruby-2.1.1/gems/cequel-1.2.4/lib/cequel/record/record_set.rb:675:in `find_rows_in_single_batch'
from /home/ckgagan/.rvm/gems/ruby-2.1.1/gems/cequel-1.2.4/lib/cequel/record/record_set.rb:592:in `find_rows_in_batches'
from /home/ckgagan/.rvm/gems/ruby-2.1.1/gems/cequel-1.2.4/lib/cequel/record/record_set.rb:569:in `find_each_row'
from /home/ckgagan/.rvm/gems/ruby-2.1.1/gems/cequel-1.2.4/lib/cequel/record/record_set.rb:533:in `find_each'
from /home/ckgagan/.rvm/gems/ruby-2.1.1/gems/cequel-1.2.4/lib/cequel/record/record_set.rb:517:in `each'
from /home/ckgagan/.rvm/gems/ruby-2.1.1/gems/railties-4.1.0/lib/rails/commands/console.rb:90:in `start'
from /home/ckgagan/.rvm/gems/ruby-2.1.1/gems/railties-4.1.0/lib/rails/commands/console.rb:9:in `start'
from /home/ckgagan/.rvm/gems/ruby-2.1.1/gems/railties-4.1.0/lib/rails/commands/commands_tasks.rb:69:in `console'
from /home/ckgagan/.rvm/gems/ruby-2.1.1/gems/railties-4.1.0/lib/rails/commands/commands_tasks.rb:40:in `run_command!'
from /home/ckgagan/.rvm/gems/ruby-2.1.1/gems/railties-4.1.0/lib/rails/commands.rb:17:in `<top (required)>'

Set default consistency

Is there a way to specify the default consistency? It looks like it could be read easily from the configuration, but that doesn't happen.

Time zone support for timestamp columns

In the Cequel::Metal layer, all timestamp column values in result rows should be converted to UTC
In Cequel::Record, timestamp attributes should use the configured ActiveSupport time zone (i.e. value.in_time_zone)

There's no need to worry about time zones on write because Cassandra stores timestamps as milliseconds since epoch, so there's no attached time zone information at all.
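The point about writes can be illustrated in plain Ruby: the same instant yields the same epoch-millisecond value whatever zone the Time object is displayed in. This is a minimal sketch, not Cequel code; to_epoch_ms and from_epoch_ms are hypothetical helper names, and the three-argument form of Time.at requires Ruby 2.5 or later.

```ruby
# A timestamp is stored as milliseconds since the Unix epoch,
# with no zone attached.
def to_epoch_ms(time)
  (time.to_r * 1000).floor
end

# Rebuild a UTC Time from epoch milliseconds.
def from_epoch_ms(ms)
  Time.at(ms / 1000, ms % 1000, :millisecond).utc
end

utc = from_epoch_ms(1405633600925)
tokyo = utc.getlocal('+09:00')   # same instant, different display zone

puts utc.utc_offset                           # 0
puts to_epoch_ms(utc) == to_epoch_ms(tokyo)   # true
```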

Spec helpers

I have implemented a database cleaner (it truncates tables) that runs in the before-each hooks of one of my projects. Now I need similar functionality in another project, so I am going to extract and generalize it. Would such functionality be welcome in Cequel, or do you think a separate project would be more appropriate?

cequel doesn't support bigint

I've tried using bigint as a column type, but rake cequel:migrate told me "Unrecognized CQL type :bigint". However, as described at http://www.datastax.com/docs/1.1/references/cql/cql_data_types, bigint is supported.

Save incorrectly returns true if a previous save raised an exception at the CQL level

I ran into some weird behavior today, probably related to some sort of dirty checking.

As I was building out a new model (full model below), I added a new column to it, and accidentally only synced my test database, and not my development db.

In Rails Console (development) I was manually creating a few records. When I set the :sk attribute which had not been synced, I (correctly) got an error on save:

Cql::QueryError: Unknown identifier sk
        from /opt/aa-organizations/shared/bundle/ruby/2.1.0/gems/cql-rb-1.2.1/lib/cql/client/synchronous_client.rb:54:in `execute'
        from /opt/aa-organizations/shared/bundle/ruby/2.1.0/bundler/gems/cequel-030f580992cd/lib/cequel/metal/keyspace.rb:174:in `block in execute_with_consistency'

The really weird part though (the bug itself!) is that immediately calling save on the record again returns true.

I assume that the item had a dirty flag the first time, which got unset on the (failed) first save, and then it never attempted to actually save the second time.

class Membership
  include Cequel::Record

  key :user_sk, :uuid
  key :organization_sk, :uuid
  set :roles, :text
  column :sk, :uuid, index: true

  before_save :make_uuid_if_needed

  def user
    User[user_sk]
  end

  def organization
    Organization[organization_sk]
  end

  private

  def make_uuid_if_needed
    self.sk = Cequel.uuid if self.sk.nil?
  end
end
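The dirty-flag hypothesis above can be reproduced with a toy model. The sketch below is not Cequel's implementation; ToyRecord and both save methods are hypothetical and only contrast two orderings: clearing the change set before the write, and clearing it after the write has succeeded.

```ruby
# Toy record with a changes hash, standing in for dirty tracking.
class ToyRecord
  attr_reader :changes

  def initialize(writer)
    @writer = writer   # callable that performs (or fails) the write
    @changes = {}
  end

  def set(name, value)
    @changes[name] = value
  end

  # Buggy ordering: the change set is cleared before the write runs,
  # so a failed write leaves nothing to retry.
  def save_clearing_first
    pending = @changes
    @changes = {}
    return true if pending.empty?
    @writer.call(pending)
    true
  end

  # Safer ordering: clear only after the write has succeeded.
  def save_clearing_after
    return true if @changes.empty?
    @writer.call(@changes)
    @changes = {}
    true
  end
end

failing_writer = ->(_changes) { raise 'Unknown identifier sk' }

buggy = ToyRecord.new(failing_writer)
buggy.set(:sk, 'abc')
begin
  buggy.save_clearing_first
rescue RuntimeError
  # the first save fails, as in the report
end
puts buggy.save_clearing_first   # true, although nothing was written
```

With save_clearing_after, the second call would attempt the write again and raise the same error instead of returning true.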

Chaining confuses mutation

I just wasted a few hours realizing the chaining in Cequel returns a copy of RecordSet rather than mutating the RecordSet.

This works:

set = MyRecord[key].after(3.minutes.ago)
set.each ...

But this doesn't, because the set returned by after is discarded:

set = MyRecord[key]
set.after(3.minutes.ago)
set.each ...
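The difference comes down to whether the return value of the chained call is kept. The toy class below mimics the copy-on-chain behaviour described in this issue; ToyScope is a hypothetical stand-in and says nothing about how Cequel's RecordSet is implemented internally.

```ruby
# Toy scope that returns a new object from each chained call.
class ToyScope
  attr_reader :lower_bound

  def initialize(lower_bound = nil)
    @lower_bound = lower_bound
    freeze   # instances never change after construction
  end

  # Returns a new scope; the receiver is left unchanged.
  def after(bound)
    self.class.new(bound)
  end
end

set = ToyScope.new
set.after(3)                   # result discarded, set is unchanged
puts set.lower_bound.inspect   # nil

set = set.after(3)             # keep the returned scope
puts set.lower_bound           # 3
```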

ArgumentError when setting wrong type value to a column

Hi,

I have the following setup:

class Rating
  include Cequel::Record
....
  key :id, :timeuuid, :auto => true
  column :value, :int

  validates_numericality_of :value
end

When I try:

r = Rating.new(:value => "abc")

I get the following error:

ArgumentError: invalid value for Integer(): "abc"
    from /opt/boxen/rbenv/versions/2.1.2/gemsets/kuende/gems/cequel-1.3.2/lib/cequel/type.rb:328:in `Integer'
    from /opt/boxen/rbenv/versions/2.1.2/gemsets/kuende/gems/cequel-1.3.2/lib/cequel/type.rb:328:in `cast'
    from /opt/boxen/rbenv/versions/2.1.2/gemsets/kuende/gems/cequel-1.3.2/lib/cequel/schema/column.rb:89:in `cast'
    from /opt/boxen/rbenv/versions/2.1.2/gemsets/kuende/gems/cequel-1.3.2/lib/cequel/record/persistence.rb:302:in `write_attribute'
    from /opt/boxen/rbenv/versions/2.1.2/gemsets/kuende/gems/cequel-1.3.2/lib/cequel/record/dirty.rb:62:in `write_attribute'
    from /opt/boxen/rbenv/versions/2.1.2/gemsets/kuende/gems/cequel-1.3.2/lib/cequel/record/properties.rb:210:in `value='
    from /opt/boxen/rbenv/versions/2.1.2/gemsets/kuende/gems/cequel-1.3.2/lib/cequel/record/properties.rb:277:in `block in attributes='
    from /opt/boxen/rbenv/versions/2.1.2/gemsets/kuende/gems/cequel-1.3.2/lib/cequel/record/properties.rb:276:in `each_pair'
    from /opt/boxen/rbenv/versions/2.1.2/gemsets/kuende/gems/cequel-1.3.2/lib/cequel/record/properties.rb:276:in `attributes='
    from /opt/boxen/rbenv/versions/2.1.2/gemsets/kuende/gems/cequel-1.3.2/lib/cequel/record/mass_assignment.rb:39:in `attributes='
    from /opt/boxen/rbenv/versions/2.1.2/gemsets/kuende/gems/cequel-1.3.2/lib/cequel/record/properties.rb:380:in `initialize_new_record'
    from /opt/boxen/rbenv/versions/2.1.2/gemsets/kuende/gems/cequel-1.3.2/lib/cequel/record/scoped.rb:62:in `initialize_new_record'
    from /opt/boxen/rbenv/versions/2.1.2/gemsets/kuende/gems/cequel-1.3.2/lib/cequel/record/properties.rb:55:in `block in new'
    from /opt/boxen/rbenv/versions/2.1.2/gemsets/kuende/gems/cequel-1.3.2/lib/cequel/record/properties.rb:54:in `tap'
    from /opt/boxen/rbenv/versions/2.1.2/gemsets/kuende/gems/cequel-1.3.2/lib/cequel/record/properties.rb:54:in `new'
    from (irb):17

I'm using ActiveModel validations to make sure the value column is always an integer, so that I can handle a validation error in the response instead of having to rescue an exception.

What approach do you suggest? I do not want to check that value is an integer inside the controller.

Later edit: Rating.new(:value => "123") does work; it sets value to the integer 123 on the new instance.
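The stack trace shows the failure coming from Ruby's Kernel#Integer, which is strict about its input, so the exception is raised on assignment before any validation runs. The sketch below shows that behaviour next to a lenient variant; strict_cast and lenient_cast are hypothetical names, and the exception: false option requires Ruby 2.6 or later, so it is not available on the Ruby 2.1.2 used in this report.

```ruby
# Kernel#Integer is strict: it raises on non-numeric strings.
def strict_cast(value)
  Integer(value)
end

# Lenient variant: returns nil instead of raising, which would leave
# the rejection to a validation such as validates_numericality_of.
def lenient_cast(value)
  Integer(value, exception: false)
end

puts strict_cast('123')            # 123
puts lenient_cast('abc').inspect   # nil

begin
  strict_cast('abc')
rescue ArgumentError => e
  puts e.message                   # invalid value for Integer(): "abc"
end
```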

CQL-RB 2.0.0 RELEASE

Do you plan to switch to the new stable release of cql-rb, v2.0.0?
https://github.com/iconara/cql-rb/releases/tag/v2.0.0
Thanks.

Retrieved timestamps lose precision

I'm creating a custom index based on when a record last changed. It appears that the updated_at time is losing precision in the query serialization process. Here is the test case:

class IndexRecord
  include Cequel::Record
  key :user_guid, :ascii
  key :updated_at, :timestamp, order: :desc
  column :id, :uuid # points to a DataRecord
end

class DataRecord
  include Cequel::Record
  key :id, :uuid
  column :user_guid, :ascii
  column :data, :text
  timestamps
  after_create do
    puts 'after_create'
    puts (self.updated_at.to_f * 1000).floor
    IndexRecord.create!({
      :user_guid => self.user_guid, 
      :updated_at => self.updated_at,
      :id => self.id
    })
  end
  after_update do
    puts 'after_update'
    old_updated_at = self.changes[:updated_at].first
    puts (old_updated_at.to_f * 1000).floor
    indexes = IndexRecord.where(:user_guid => self.user_guid, :updated_at => old_updated_at).to_a
    puts indexes.length
    puts indexes
  end
end

IndexRecord.synchronize_schema
DataRecord.synchronize_schema
$cequel.schema.truncate_table(IndexRecord.table_name)
$cequel.schema.truncate_table(DataRecord.table_name)

data_record = DataRecord.create!({
  :id => Cequel.uuid,
  :user_guid => (rand*100000).floor.to_s,
  :data => (rand*100000).floor.to_s
})
sleep(2)
data_record = DataRecord.where(:id => data_record.id).to_a[0]
data_record.data = (rand*100000).floor.to_s
data_record.save!
puts 'DONE.'

Here is the output:

after_create
1405633600925
after_update
1405633600925
0
DONE.

I expected the indexes to be printed and the length to be 1. Here's the cequel log that shows the error:

CQL (22ms) TRUNCATE index_records
CQL (20ms) TRUNCATE data_records
CQL (2ms) BEGIN BATCH
INSERT INTO data_records (id, user_guid, data, updated_at, created_at) VALUES (d62db7b4-0dfb-11e4-bd0e-6f224471fc5f, '23806', '35562', 1405633600925, 1405633600925)
INSERT INTO index_records (user_guid, updated_at, id) VALUES ('23806', 1405633600925, d62db7b4-0dfb-11e4-bd0e-6f224471fc5f)
APPLY BATCH
CQL (2ms) SELECT * FROM data_records WHERE id = d62db7b4-0dfb-11e4-bd0e-6f224471fc5f LIMIT 1000
CQL (1ms) SELECT * FROM index_records WHERE user_guid = '23806' AND updated_at = 1405633600924 LIMIT 1000

Notice that updated_at in the final SELECT is 1405633600924 instead of 1405633600925!
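One plausible mechanism for an off-by-one millisecond, not confirmed anywhere in this report, is a Time that was rebuilt from a Float. The Float nearest to 1405633600.925 lies slightly below it, so a conversion that truncates the microsecond field yields 924, while multiplying the Float by 1000 happens to round back up to 925. The helpers below are hypothetical and are not Cequel's serialization code; the three-argument Time.at needs Ruby 2.5 or later.

```ruby
MS = 1405633600925   # the updated_at value from the log above

# A Time built from a Float keeps the Float's exact binary value,
# which here sits slightly below .925 seconds.
from_float = Time.at(MS / 1000.0)

# A Time built from integer seconds plus milliseconds is exact.
from_parts = Time.at(MS / 1000, MS % 1000, :millisecond)

# Conversion used by the puts calls in the test case.
def ms_via_float(time)
  (time.to_f * 1000).floor
end

# Conversion that truncates the microsecond field instead.
def ms_via_usec(time)
  time.to_i * 1000 + time.usec / 1000
end

puts ms_via_float(from_float)   # 1405633600925
puts ms_via_usec(from_float)    # 1405633600924
puts ms_via_usec(from_parts)    # 1405633600925
```

If something like this is happening, building the Time from integer parts, or rounding rather than truncating, would keep the value stable across a round trip.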