
cassandra-driver-mapping's Introduction

cassandra-driver-mapping

Entity Mapper Add-on for the DataStax Java Driver (Driver) for Cassandra (C*).
This Add-on allows you to synchronize schema automatically and persist JPA annotated entities in Cassandra.

No mapping files, no scripts, no configuration files.
No need to create tables and indexes for your entities manually.
Entity definitions are automatically synchronized with Cassandra.

The Add-on is not a replacement for the Driver but a lightweight Object Mapper on top of it.
You can still use the full power of the Driver API and the DataStax documentation.
The Mapping Add-on relies on JPA 2.1 and Driver 3+.

For Complete Documentation go here
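A minimal end-to-end usage sketch, assembled from the calls that appear in the issues below. The package names, the SchemaSync.sync signature, and the MappingSession constructor shown here are assumptions, so treat this as an illustration rather than the documented API; MyEntity stands for any JPA-annotated entity class.

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;
import com.datastax.driver.mapping.MappingSession;
import com.datastax.driver.mapping.schemasync.SchemaSync;

// Connect with the plain DataStax driver first.
Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
Session session = cluster.connect();

// Create/alter the table from the annotated entity, then map through a MappingSession.
SchemaSync.sync("my_keyspace", session, MyEntity.class);
MappingSession mappingSession = new MappingSession("my_keyspace", session);

MyEntity e = new MyEntity();
mappingSession.save(e);
MyEntity loaded = mappingSession.get(MyEntity.class, e.getId());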


cassandra-driver-mapping's Issues

fluent setters not supported

Would it be possible to support fluent setters?

example:

public CurrentClass setId(Integer id) {
    this.id = id;
    return this;
}

Currently the metadata scanner skips such methods when scanning for setters, and therefore tries to drop the corresponding columns when it builds the sync CQL query.
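For contrast, the conventional void-returning JavaBean setter shape, which (based on the report above) the scanner does recognize:

public void setId(Integer id) {
    this.id = id;
}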

Can't use 2 columnDefinition annotations in compound key

I have the following two classes that are to be used as compound primary keys:

public class UserMessageKey {
    private String jid;

    @Column(columnDefinition="timeuuid")
    private UUID msgId;
}

public class ConversationMessageKey {
    @Column(columnDefinition="timeuuid")
    private UUID convId;

    @Column(columnDefinition="timeuuid")
    private UUID msgId;
}

I then use them in the following two classes:

public class MessagesByUser {
    @EmbeddedId
    private UserMessageKey key;
}

public class MessagesByConversation {
    @EmbeddedId
    private ConversationMessageKey key;
}

In the first case I get what is expected, a key with jid and msgId; however, in the second case I only get msgId. Is it not possible to have two column overrides?

Collections and UnmodifiableCollection

Is there a way to update collections using the collection.add() method?

currently in my code I keep getting this exception:

java.lang.UnsupportedOperationException
at java.util.Collections$UnmodifiableCollection.add(Collections.java:1075)
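A minimal workaround sketch, not taken from the library: copy the unmodifiable collection into a mutable one, mutate the copy, and set it back before saving. The getTags/setTags accessors are hypothetical placeholders for whatever collection property the entity exposes.

import java.util.HashSet;
import java.util.Set;

Set<String> tags = new HashSet<>(entity.getTags()); // mutable copy of the unmodifiable collection
tags.add("new-tag");                                // add to the copy
entity.setTags(tags);                               // set it back, then save/update the entity as usual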

How to do updates

Hi Eugene, would you be able to put some light into this subject?

I'm trying to only add elements to a set. If I run an insert statement (session.save()), it ends up wiping the existing entries in the set and replacing them with just the newly inserted one(s).

So what I do instead is run update operations. That way I only add elements that are not already in the set, and this approach does not require read-before-write (see the sketch after this issue).

So my questions are:

  1. How do I achieve that with your wrapper?
  2. How do I selectively choose which columns to retrieve? (partial entity loading)

Great work on this project. Clean documentation and coding. Keep it up.
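A sketch of the append-style update described in this issue, using the raw Driver API rather than the mapper; the CQL set-append adds elements without reading first. Assumes an open Session named session, and the keyspace, table, and column names are made up for illustration.

import java.util.Collections;
import java.util.UUID;
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.Session;

// Appends "new-tag" to the tags set of one row; existing elements are untouched.
PreparedStatement ps = session.prepare(
        "UPDATE my_keyspace.users SET tags = tags + ? WHERE id = ?");
UUID userId = UUID.randomUUID(); // whichever key you are updating
session.execute(ps.bind(Collections.singleton("new-tag"), userId));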

Log warnings - "Re-preparing already prepared query SELECT"

Similarly to #15, I'm getting log warnings for prepared SELECT CQL queries:

2014-06-12 23:06:59.252  WARN   --- [Driver worker-0] com.datastax.driver.core.Cluster         : Re-preparing already prepared query SELECT * FROM test.test WHERE hash=?;. Please note that preparing the same query more than once is generally an anti-pattern and will likely affect performance. Consider preparing the statement only once.

Sometimes I also get the same warning as in #15 (for UPDATE queries), but now it happens very rarely and irregularly. Could the reason be the use of separate Sessions (and MappingSessions) for each thread? I'm building a multi-threaded app with concurrent reads and writes (currently running with 4 threads at once).

Indexes disappear on second sync

(reported on behalf of a user)

I'm using your library and it looks good.
But when my app reconnects to Cassandra, the column indexes for my entities are dropped during SchemaSync.sync(), which runs when the MappingSession is created.

For the first connection session all tables were created normally, with all the needed indexes. But on the second connection all the indexes disappeared. Could you have a look at it?

Cassandra driver mapping 2.0.4 is breaking at save

Hi,

cassandra-driver-mapping 2.0.0 works with the MappingSession save() method, but as soon as I upgrade the version to 2.0.2, MappingSession save() breaks with the following exception:

java.lang.NoSuchMethodError: com.datastax.driver.mapping.MappingSession.save(Ljava/lang/Object;)Ljava/lang/Object;

This is how I am calling the save method:

private MappingSession mappingSession;

public void save(T t) {
    getLogger().debug("saving: {}", t);
    mappingSession.getMappingSession().save(t);
}

Please advise.

Regards,
Pankaj

StatementCache issue

There is a small problem with MappingBuilder.prepareSelect. The value put into the cache depends on both the keyspace and the table name, but the cache key consists of the table name only. So the second of these calls returns the wrong statement:

prepareSelect(AnEntity.class, id, options, "keyspace1", session);
prepareSelect(AnEntity.class, id, options, "keyspace2", session);

because the statementCache returns something like

select * from keyspace1.anentity

instead of

select * from keyspace2.anentity
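A possible fix sketch, not the library's actual code: include the keyspace in the cache key so that the two calls above resolve to different prepared statements. The prepareCallable name stands in for the existing Callable that builds the statement.

// keyspace-qualified key instead of the bare table name
String cacheKey = keyspace + "." + table;
PreparedStatement ps = statementCache.get(cacheKey, prepareCallable);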

Is there an issue with the driver's time-based UUIDs?

Just a question: I see the docs suggest bringing in a third-party lib to generate time-based UUIDs, but since the DataStax core libs are already included as part of this driver, why not just use the utils.UUIDs class? Is there a known issue with that class that we should really be worrying about?
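For reference, the driver helper the question refers to: UUIDs.timeBased() ships with the core driver and generates a version-1 (time-based) UUID without any third-party library.

import java.util.UUID;
import com.datastax.driver.core.utils.UUIDs;

UUID id = UUIDs.timeBased(); // time-based (version 1) UUID from the core driver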

How to turn off auto UUID generation

Just saw the change for auto-generation of UUIDs and I can't see a way to disable it. There are cases where the UUID value is used in other data structures within our app, and we have relied on null UUIDs being an error indicating we missed something. I would rather continue to have the option of seeing errors than have them hidden by this feature.

Secondary index recreated even if it already exists

We are having trouble with the sync process in that we get a number of "schema not synchronized" messages on application startup. We are using the auto sync feature of this library and no tables have been changed nor have any secondary indexes been altered. What we see in the Cassandra log is that the tables don't change but each index is recreated and it is this change that appears to be causing our out of sync problem. Is this expected, and if it is, is there a way to turn it off so that it only happens if the index has been changed?

The following is an example of a class that is using annotations to describe the indexes:

@Table(name = "messages_by_conversation", //
    indexes = { @Index(columnList = "msgid"), @Index(columnList = "resource"), @Index(columnList = "senderid") })
@TableProperties(values = { @TableProperty("CLUSTERING ORDER BY (deletedOn ASC, timeSent DESC)") })
public class MessagesByConversation {

Unsuccessful EntityTypeParser.overrideDataTypeMapping for timeuuid

Hi

I'm trying to use the timeuuid type instead of the default uuid for the id.

I overrode the UUID type with EntityTypeParser.overrideDataTypeMapping(java.util.UUID.class, DataType.timeuuid().getName());

and I get nulls for UUID fields because the
MappingSession.getValueFromRow() method doesn't use EntityTypeParser.javaTypeToDataType and knows nothing about the timeuuid type.

thanks in advance

Cannot get proper value for timeuuid field

I've changed my entity definition from:
@Id
private UUID id = UUIDs.random();

to:

@Id
@Column(columnDefinition="timeuuid")
private UUID id = UUIDs.timeBased();

then MappingSession.getByQuery stopped working. It now returns a new UUID (not the one originally saved to the database) for each getByQuery invocation.

Nested Entities

Hi,

Does the library support persisting and retrieving nested entities in C*?
I am evaluating C* to persist our 100-150 defined POJOs, which are related in various ways. They are not yet annotated with javax.persistence annotations, but I am looking for a solution that can help with that.

Would you consider adding examples for nested entities to your README?

Thank you.
Adil.

Re-preparing warnings when using MappingSession

When using the session to get an entity, i.e.

Entity entity = mappingSession.get(Entity.class, id);

I often get warnings from com.datastax.driver.core.Cluster.addPrepared() telling me the statement has already been prepared. Originally I thought it was a caching issue, but I often see this for the same entity within short periods of time, sometimes less than 30 seconds apart. Is there an extra level of synchronization that I need to do, or trigger, so that the statement will not undergo the seemingly redundant prepare step?

Does valchkou support UDTs?

Hi,

Is there any way to use a user-defined type (UDT) in Cassandra with valchkou?

Thanks,
Ankit

EntityTypeParser doesn't parse a field properly if the field type and the setter parameter type don't match

Our entity classes are generated by Avro, and when Avro generates a class, primitives defined in the Avro schema are kept as primitive fields, but the getters and setters are boxed. Something like this:

class Test {
    private int val;

    public Integer getVal() {
        return val;
    }

    public void setVal(Integer val) {
        this.val = val;
    }
}

When we upgraded to the latest version of your library (2.1.4), we started seeing that the EntityMetadata objects were missing these kinds of autoboxed fields. Digging a bit deeper, it seems there was a recent change where you added a type check in the EntityTypeParser.isSetterFor method:

if (method.getParameterTypes()[0] != field.getType()) return false;

So is it possible for you to handle cases like these where primitive fields have boxed getters and setters?
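One possible shape for such a check, a sketch rather than the library's code: treat a primitive field type and its boxed wrapper as a match when comparing against the setter's parameter type.

import java.util.HashMap;
import java.util.Map;

final class BoxingAwareTypes {

    // Maps each primitive type to its wrapper so int/Integer, long/Long, etc. compare as equal.
    private static final Map<Class<?>, Class<?>> PRIMITIVE_TO_WRAPPER = new HashMap<>();
    static {
        PRIMITIVE_TO_WRAPPER.put(int.class, Integer.class);
        PRIMITIVE_TO_WRAPPER.put(long.class, Long.class);
        PRIMITIVE_TO_WRAPPER.put(double.class, Double.class);
        PRIMITIVE_TO_WRAPPER.put(float.class, Float.class);
        PRIMITIVE_TO_WRAPPER.put(boolean.class, Boolean.class);
        PRIMITIVE_TO_WRAPPER.put(short.class, Short.class);
        PRIMITIVE_TO_WRAPPER.put(byte.class, Byte.class);
        PRIMITIVE_TO_WRAPPER.put(char.class, Character.class);
    }

    // true when the types are identical or differ only by autoboxing
    static boolean typesMatch(Class<?> setterParam, Class<?> fieldType) {
        return setterParam == fieldType
            || setterParam == PRIMITIVE_TO_WRAPPER.get(fieldType)
            || fieldType == PRIMITIVE_TO_WRAPPER.get(setterParam);
    }
}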

@Ttl not being passed on table creation

When using the @Ttl annotation on a table, the table does get created, but when I check the default_time_to_live value on the table it's always 0; the value I provided does not get set. Has this been implemented yet?

any-to-any mapping

If I start with a class (ABC) containing the fields A, B, C and then read an entity (BCD) from the DB returning fields B, C, D, when I use this feature to assign the results to the ABC class, I get an IllegalArgumentException stating that A is not a column in the metadata. The description of the feature says that extra fields in BCD that are not in ABC will be ignored; is there a reason why extra fields in the receiving class should throw an exception instead of also being ignored? Is there some configuration I can use to alter this behavior? I know I can just turn off INFO logging and the message goes away, but then I might miss other potentially important information.

Option to turn off schema syncing in a production environment

Hi Eugene,

I'm hoping to use your Java Driver for Cassandra in a production environment, and I wanted to see if there is an option to turn off Schema Sync: I'm fearful that with this feature enabled, if a developer makes a schema mistake in our codebase, there's a chance that it could/would delete important user data by inadvertently dropping columns.

Thank you,

-Mike

Support timeuuid & querying time-series data

Cassandra supports uuid and timeuuid.
Both map to the same Java class, UUID.

  1. Need to be able to "override", per entity column, which Cassandra data type to use.
  2. Create the table taking the "datatype override" into account.
    This can be useful not only for uuid but for other types too.

How to use Any-to-Any mapping without an @Id

Just started using this extension and liked the Any-to-Any functionality, but at first it didn't work for me. It turned out that I had to designate at least one field as an @Id, but that seems kind of strange (and is not indicated in the docs). All I want is to get a representation of the data, which may have duplicates based on whatever fields I might designate as @Id, so I was wondering if there is something I'm missing or some config I have to do to get around this.

Could not find a better place to ask this type of question but if this is the wrong place please let me know and I won't repost it here.

How can I use C*-side generation of UUID and TimeUUID?

I want to make a handy save method on top of your API.
The main idea is that when the entity does not have an ID and the ID is of type UUID (the 90% case, I think), let C* generate it on save, e.g. execute a query like this:

insert into stuff (uid, name) values (now(), 'my name');

I didn't find how to do this in the project wiki. Can I do this in some way?
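A sketch using the raw Driver API rather than the mapper: the insert above can be executed as-is so that C* generates the timeuuid server-side with now(). Assumes an open Session named session and that the uid column is of type timeuuid.

session.execute("INSERT INTO stuff (uid, name) VALUES (now(), ?)", "my name");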

Cannot map "long" type in POJO to "timestamp" type in Cassandra

I am creating the schema in Avro, which does not support the Date type; hence I have to mark my field type as long. In Cassandra, however, the field is a timestamp. How can I override the field type in the POJO, or provide an override method for the mapping of fields? I tried using the columnDefinition annotation attribute but it does not work. Also, overriding the mapping dictionary works for writes, but reads are broken.
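For reference, the override hook from the timeuuid issue above would look roughly like this for the long/timestamp case; whether the boxed Long or the primitive long has to be registered is an assumption here, and, as the report notes, this appears to help writes only.

// register Java long/Long fields as Cassandra timestamp columns (sketch)
EntityTypeParser.overrideDataTypeMapping(Long.class, DataType.timestamp().getName());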

Entity parser fails at parsing primary keys when the key is a generic type

Example:

@Embeddable
public class ItemPrimaryKey<T extends HasPage> {

    public static final String COL_TIME = "time";
    public static final String COL_TYPE = "type";
    public static final String COL_HASH = "hash";

    @EmbeddedId
    private T partitionKey;

    // Compound key
    @Column(name = COL_TIME)
    private Date time;

    @Column(name = COL_TYPE)
    private String type;

    @Column(name = COL_HASH)
    private String hash;
}

In this case partitionKey is parsed as the interface HasPage instead of the runtime type (probably due to Java type erasure).

Do we have any option or workaround?

We want to make it generic because this logic is shared across different tables and DAOs.

Thank you.
Patricio

Indexing: columnList doesn't support multiple columns

@Table(name="test_entity_index",
    indexes = {
        @Index(name="test_entity_email_idx", columnList="email"),
        @Index(name="test_entity_timestamp_idx", columnList="timestamp")
    })

is OK, but a multi-column index such as:

@Table(name="test_entity_index",
    indexes = {
        @Index(name="test_entity_email_timestamp_idx", columnList="email, timestamp")
    })

fails at run-time with:
mismatched input ',' expecting ')'

The JPA 2.1 spec says:

11.1.23 Index Annotation
The Index annotation is used in schema generation. Note that it is not necessary to specify an index for a primary key, as the primary key index will be created automatically, however, the Index annotation may be used to specify the ordering of the columns in the index for the primary key.
@Target({}) @Retention(RUNTIME)
public @interface Index {
String name() default "";
String columnList();
boolean unique() default false;
}
The syntax of the columnList element is a column_list, as follows:
column_list ::= index_column [, index_column]*
index_column ::= column_name [ASC | DESC]
The persistence provider must observe the specified ordering of the columns.
If ASC or DESC is not specified, ASC (ascending order) is assumed.

Statement cache causes sequential gets on different keyspaces to return the wrong info

I am using a simple table that exists in multiple keyspaces. I have an object (TestTable) annotated with the @Table annotation. If I do sequential get(TestTable.class, id) calls using the same or different MappingSession objects, it only works for the first keyspace, because the static statement cache uses the table name as the key (and is not keyspace-aware).

MappingBuilder.prepareSelect needs to take into account the keyspace when pulling from the cache (or the cache needs to be keyspace aware).

// ps = statementCache.get(table, new Callable()

Apply mapping session to Iterator

Hi Eugene,

I have an application where I need to retrieve many columns at once (~10,000), and I'd like to start processing the first returned results before the entire batch arrives. Right now, though, if I want to apply your mapping session, I have to call List<T> getFromResultSet(Class<T> clazz, ResultSet rs), which has to loop through all the results before returning the List.

I was wondering if you could augment the mapper by adding a call like:
Iterator<T> getIteratorFromResultSet(Class<T> clazz, ResultSet rs)
which would just apply a transformation to rs.iterator() that converts each Row into a T upon iteration.

Thanks!

-Mike
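A sketch of the requested transformation; the method name is the one proposed above, and getFromRow stands in for a hypothetical per-row mapping helper (not an existing library call). Uses java.util.Iterator and com.datastax.driver.core.ResultSet/Row.

public <T> Iterator<T> getIteratorFromResultSet(final Class<T> clazz, ResultSet rs) {
    final Iterator<Row> rows = rs.iterator();
    return new Iterator<T>() {
        public boolean hasNext() { return rows.hasNext(); }
        public T next()          { return getFromRow(clazz, rows.next()); } // map one Row lazily
        public void remove()     { throw new UnsupportedOperationException(); }
    };
}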

STATIC columns

Is there any way to annotate a column so it is static?

Log warnings - "Re-preparing already prepared query UPDATE"

I'm using Spring, and with every call to mappingSession.append(), the following warnings are written to the log:

2014-06-10 23:53:53.994  WARN   --- [Driver worker-7] com.datastax.driver.core.Cluster         : Re-preparing already prepared query UPDATE test_db.test SET days=days+{16098} WHERE id=?;. Please note that preparing the same query more than once is generally an anti-pattern and will likely affect performance. Consider preparing the statement only once.

I think fixing this issue would bring at least a small performance gain, but the main thing is to get rid of the warning in the log.

Thanks for fixing this.

Memory leaks?

This is not a proper bug report, just an unconfirmed observation.

But still, might it be possible that this library doesn't free some memory during garbage collection?

To be more specific: I'm working on a multi-threaded application that generates data at high speed and in parallel, and the data is saved into Cassandra using this library. The code works (at least better than Kundera did from the beginning), but it seems to me that the library doesn't free some memory. It looks as if there is a query cache, statement cache, or some query log that collects a small amount of data after each query call, and since the app is making millions of queries, the amount of used memory slowly grows until the data no longer fits into RAM. When I profiled the app, I saw that memory usage grows until the garbage collection phase, after which it shrinks, but not quite back to the level after the previous collection; it settles a little above that value. When this repeats over a couple of hours, the app has no memory left to use and gets stuck in a garbage collection loop.

Is my observation correct? Might there be a bug?

The other option would be that there is a bug in my app, but I don't think that's the case, because the same app worked correctly using Hibernate (when I was storing the data in PostgreSQL). That means I didn't change any algorithm, just the persistence layer.

Any help appreciated. Thank you.

Support for mixed-case table and column names?

Hello Eugene

Thank you for making your mapping layer available. I'm just getting started considering the next generation of a product I built over the last several years using the Hector client library. The product is still using Cassandra 1.1, and I'm looking to modernize things.

So, Cassandra 2.0 (DataStax Enterprise 4.x in my case) and the latest DataStax Java driver release came up on the radar screen. That and consideration for Astyanax, which did not exist at the time I started the original effort. I also use the Hector Object Mapper and in my search to find equivalent functionality for the Java driver, I found your project. :)

Something I noticed immediately when working with my existing schema is that mixed-case table and column names are forced to lowercase by CQL3. I quickly found out I can use quotes to preserve the case.

Your project readme file, under "Various Mappings", stipulates: "All names are converted to lowercase." Would it be possible for that not to be the case? Camel casing is a common Java variable/field naming technique. I know that need not directly translate to the Cassandra table and column names, but mine currently do.

Of course, it may make sense to go and adjust my naming within Cassandra. Not having to put quotes in my CQL files or typing them into cqlsh has its benefits as well. I just want to check to see what my options are.

Thanks,

Jeff

Support Table options

Table options such as
compression, compaction, read_repair_chance, comments, etc.
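For context, the @TableProperties/@TableProperty annotation that appears elsewhere in these issues for CLUSTERING ORDER could conceivably carry such options; whether each WITH option is actually honored at table creation is exactly what this request asks for, and the property strings below are illustrative assumptions, not verified behavior.

@Table(name = "events")
@TableProperties(values = {
    @TableProperty("compaction = {'class': 'LeveledCompactionStrategy'}"),
    @TableProperty("read_repair_chance = 0.1"),
    @TableProperty("comment = 'event log'")
})
public class Event {
    @Id
    private UUID id;
}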
