netflix / astyanax
Cassandra Java Client
License: Apache License 2.0
I'm looking at the implementation of the getPartition method in TokenPartitionedTopology.
This method takes a BigInteger "token" as its argument and is supposed to return the HostConnectionPoolPartition that "owns" the token.
It seems to me that the method, as implemented, is inconsistent with the description on the Cassandra wiki page:
Each Cassandra server [node] is assigned a unique Token that determines what keys it is the first replica for. If you sort all nodes' Tokens, the Range of keys each is responsible for is (PreviousToken, MyToken], that is, from the previous token (exclusive) to the node's token (inclusive). The machine with the lowest Token gets both all keys less than that token, and all keys greater than the largest Token; this is called a "wrapping Range."
The implementation of TokenPartitionedTopology.getPartition returns the partition whose id is the maximum value, over all partition ids, less than or equal to the argument token.
For example, if the partitions have the ids [0, 14, 25]:
getPartition(9) returns the partition whose id = 0 ( 0 is the maximum value in {0, 14, 25} less than 9)
getPartition(19) returns the partition whose id = 14 (14 is the maximum value in {0, 14, 25} less than 19)
getPartition(29) returns the partition whose id = 25 (25 is the maximum value in {0, 14, 25} less than 29)
This doesn't seem consistent with the description above, where, as I read it, the assignments in the examples should be:
getPartition(9) should return the partition whose id = 14 ( 0 < 9 <= 14)
getPartition(19) should return the partition whose id = 25 (14 < 19 <= 25)
getPartition(29) should return the partition whose id = 0 (because of "wrapping range")
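As a sanity check on my reading, the wiki's (PreviousToken, MyToken] rule with wrapping can be sketched in a few lines. This is a toy stand-in over sorted token values, not the Astyanax code:

```java
import java.math.BigInteger;
import java.util.Arrays;
import java.util.List;

public class WikiTokenOwner {
    // Toy version of the wiki rule: each node owns (PreviousToken, MyToken],
    // and tokens above the largest token wrap to the node with the smallest one.
    static BigInteger owner(List<BigInteger> sortedTokens, BigInteger token) {
        for (BigInteger t : sortedTokens) {
            if (token.compareTo(t) <= 0) {
                return t; // first node token >= the key's token
            }
        }
        return sortedTokens.get(0); // "wrapping range"
    }

    public static void main(String[] args) {
        List<BigInteger> ring = Arrays.asList(
                BigInteger.valueOf(0), BigInteger.valueOf(14), BigInteger.valueOf(25));
        System.out.println(owner(ring, BigInteger.valueOf(9)));  // 14
        System.out.println(owner(ring, BigInteger.valueOf(19))); // 25
        System.out.println(owner(ring, BigInteger.valueOf(29))); // 0 (wraps)
    }
}
```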
(If I have something confused, let me know, the rest of this message just builds on the confusion.)
The Cassandra wiki page suggests, but does not seem to explicitly require, that the smallest token assigned to a node have the value 0.
(In particular, the passage I quoted above suggests that the lowest token might be greater than zero.)
If the smallest partition id is greater than 0, the implementation of TokenPartitionedTopology.getPartition assigns lesser tokens to the partition with the highest id:
For partitions with ids: [ 3, 14, 25]:
getPartition(1) returns the partition whose id = 25 (?? should be the partition whose id = 3, per the "wrapping range")
getPartition(9) returns the partition whose id = 3 (?? should be the partition whose id = 14 (3 < 9 <= 14))
getPartition(19) returns the partition whose id = 14 (?? should be the partition whose id = 25 (14 < 19 <= 25))
getPartition(29) returns the partition whose id = 25 (?? should be the partition whose id = 3, per the "wrapping range")
(Actually the first example in this last set will throw an exception as I describe in my next issue.)
John Batali
Knewton
The current signature of this method is getKeySlice(Collection<K> keys). I think it would be more useful if it were getKeySlice(Iterable<? extends K> keys).
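A small, self-contained illustration of the difference (hypothetical method names, not the Astyanax API): the wildcarded Iterable version accepts both lists of a subtype of K and lazily produced keys, which Collection<K> either rejects or forces you to materialize first.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

public class SignatureDemo {
    // Hypothetical signatures, just to illustrate the difference.
    static <K> int takesCollection(Collection<K> keys) {
        return keys.size();
    }

    static <K> int takesIterable(Iterable<? extends K> keys) {
        int n = 0;
        for (K k : keys) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        List<Integer> ints = Arrays.asList(1, 2, 3);
        // With K = Number, takesCollection(ints) would not compile:
        // List<Integer> is not a Collection<Number>.
        System.out.println(SignatureDemo.<Number>takesIterable(ints)); // 3
        // Any Iterable qualifies, not just materialized collections:
        Iterable<Integer> lazy = () -> ints.iterator();
        System.out.println(takesIterable(lazy)); // 3
    }
}
```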
If no limit is specified with .limit(int) in CompositeRangeBuilder, the default limit is 100.
Should it be this way, or should there be no limit when none is specified?
Even though my Hector code works well on my local machine, for some reason the round-robin connection pooler doesn't. Any reason why this might be?
https://gist.github.com/1719092
The method .setRowLimit(int rowLimit) is not working when the parameter rowLimit is larger than 1. For example:
keyspace.prepareQuery(cf).getAllRows().setRowLimit(1).execute(); — YES, it works: it returns only one row, as I wish.
keyspace.prepareQuery(cf).getAllRows().setRowLimit(2).execute(); — NO, it does not work: it returns all rows.
Looks like the RangeBuilder can specify limit, but can't specify start. It can only give the column NAME of the start column, but I'd like to use the INDEX number of the start column. Is there a way to do something like that?
Thanks
Cassandra 1.1 has been out for some time, but Astyanax cannot be used with the cassandra-all artifact of version 1.1. For me, the main reason to update is that Cassandra 1.1 uses libthrift 0.7.0, which has some bugfixes.
When trying to create a column family with (approximately):
AstyanaxContext<Cluster> clusterContext = new AstyanaxContext.Builder()
        .forCluster(CLUSTER_NAME)
        .forKeyspace(KEYSPACE_NAME)
        .withAstyanaxConfiguration(aci)
        .withConnectionPoolConfiguration(connPool)
        .buildCluster(ThriftFamilyFactory.getInstance());
clusterContext.start();
Cluster cluster = clusterContext.getEntity();
cluster.addColumnFamily(cluster.makeColumnFamilyDefinition()
        .setName(COLUMNFAMILY_NAME)
        .setKeyspace(KEYSPACE_NAME));
I get:
com.netflix.astyanax.connectionpool.exceptions.BadRequestException: BadRequestException: [host=127.0.0.1(127.0.0.1):9160, latency=10(10), attempts=1] InvalidRequestException(why:You have not set a keyspace for this session)
at com.netflix.astyanax.thrift.ThriftConverter.ToConnectionPoolException(ThriftConverter.java:155)
at com.netflix.astyanax.thrift.AbstractOperationImpl.execute(AbstractOperationImpl.java:60)
at com.netflix.astyanax.thrift.AbstractOperationImpl.execute(AbstractOperationImpl.java:27)
at com.netflix.astyanax.thrift.ThriftSyncConnectionFactoryImpl$1.execute(ThriftSyncConnectionFactoryImpl.java:113)
at com.netflix.astyanax.connectionpool.impl.AbstractExecuteWithFailoverImpl.tryOperation(AbstractExecuteWithFailoverImpl.java:49)
at com.netflix.astyanax.connectionpool.impl.AbstractHostPartitionConnectionPool.executeWithFailover(AbstractHostPartitionConnectionPool.java:222)
at com.netflix.astyanax.thrift.ThriftClusterImpl.addColumnFamily(ThriftClusterImpl.java:200)
at com.acunu.analytics.storage.AstyanaxStorage.<init>(AstyanaxStorage.java:135)
at com.acunu.analytics.storage.AstyanaxStorage.main(AstyanaxStorage.java:44)
Caused by: InvalidRequestException(why:You have not set a keyspace for this session)
at org.apache.cassandra.thrift.Cassandra$system_add_column_family_result.read(Cassandra.java:26897)
at org.apache.cassandra.thrift.Cassandra$Client.recv_system_add_column_family(Cassandra.java:1455)
at org.apache.cassandra.thrift.Cassandra$Client.system_add_column_family(Cassandra.java:1430)
at com.netflix.astyanax.thrift.ThriftClusterImpl$9.internalExecute(ThriftClusterImpl.java:205)
at com.netflix.astyanax.thrift.ThriftClusterImpl$9.internalExecute(ThriftClusterImpl.java:202)
at com.netflix.astyanax.thrift.AbstractOperationImpl.execute(AbstractOperationImpl.java:55)
... 7 more
This patch seems to solve the problem, if I understand the code correctly:
diff --git a/src/main/java/com/netflix/astyanax/thrift/ThriftClusterImpl.java b/src/main/java/com/netflix/astyanax/thrift/ThriftClusterImpl.java
index 79b40ca..2d5fd28 100644
--- a/src/main/java/com/netflix/astyanax/thrift/ThriftClusterImpl.java
+++ b/src/main/java/com/netflix/astyanax/thrift/ThriftClusterImpl.java
@@ -198,8 +198,8 @@ public class ThriftClusterImpl implements Cluster {
@Override
public String addColumnFamily(final ColumnFamilyDefinition def) throws ConnectionException {
return connectionPool.executeWithFailover(
- new AbstractOperationImpl<String>(
- tracerFactory.newTracer(CassandraOperationType.ADD_COLUMN_FAMILY)) {
+ new AbstractKeyspaceOperationImpl<String>(
+ tracerFactory.newTracer(CassandraOperationType.ADD_COLUMN_FAMILY), def.getKeyspace()) {
@Override
public String internalExecute(Client client) throws Exception {
return client.system_add_column_family(((ThriftColumnFamilyDefinitionImpl)def).getThriftColumnFamilyDefinition());
@@ -210,8 +210,8 @@ public class ThriftClusterImpl implements Cluster {
@Override
public String updateColumnFamily(final ColumnFamilyDefinition def) throws ConnectionException {
return connectionPool.executeWithFailover(
- new AbstractOperationImpl<String>(
- tracerFactory.newTracer(CassandraOperationType.UPDATE_COLUMN_FAMILY)) {
+ new AbstractKeyspaceOperationImpl<String>(
+ tracerFactory.newTracer(CassandraOperationType.UPDATE_COLUMN_FAMILY), def.getKeyspace()) {
@Override
public String internalExecute(Client client) throws Exception {
return client.system_update_column_family(((ThriftColumnFamilyDefinitionImpl)def).getThriftColumnFamilyDefinition());
I only see TimeUUID returned as java.util.UUID in the util class, but in the wiki example TimeUUID is used. Where should I import TimeUUID from?
It would be nice if you can put the example code in the github code version control.
Thanks
The TimeUUID getNext() method increments the least significant bits of its value in order to get the "next" TimeUUID for the purposes of pagination.
However, Cassandra's TimeUUID comparator doesn't simply compare the whole thing lexically, the way its UUID comparator does. It compares it with respect to the TimeUUID RFC: http://www.ietf.org/rfc/rfc4122.txt
Note that the least significant part of the TimeUUID for comparison purposes is actually its 4th byte.
This problem obviously won't bite you often, since the space of UUIDs is so large, but when it bites you, it'll bite hard...
Does this make sense, or am I missing something?
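If I understand the RFC 4122 layout correctly, the "next" value under Cassandra's TimeUUID ordering would come from incrementing the 60-bit timestamp field rather than the least significant bits. Here is a sketch of that idea using only java.util.UUID; this is my assumption of the intended semantics, not the Astyanax implementation:

```java
import java.util.UUID;

public class NextTimeUuid {
    // Sketch: build the "next" version-1 UUID by incrementing the 60-bit
    // timestamp field (which a time-based comparator orders by first),
    // instead of the least significant bits.
    static UUID nextByTimestamp(UUID u) {
        if (u.version() != 1) {
            throw new IllegalArgumentException("not a time-based UUID");
        }
        long ts = u.timestamp() + 1; // 100-ns intervals since the UUID epoch
        // Re-pack the timestamp into the version-1 MSB layout:
        // time_low (32 bits) | time_mid (16) | version (4) | time_hi (12)
        long msb = (ts << 32)                          // time_low
                | (((ts >>> 32) & 0xFFFFL) << 16)      // time_mid
                | 0x1000L                              // version 1
                | ((ts >>> 48) & 0x0FFFL);             // time_hi
        return new UUID(msb, u.getLeastSignificantBits());
    }

    public static void main(String[] args) {
        // Sample version-1 UUID (the example UUID from RFC 4122).
        UUID u = UUID.fromString("3d813cbb-47fb-11d1-b4b1-00c04fd430c8");
        UUID next = nextByTimestamp(u);
        System.out.println(next.timestamp() == u.timestamp() + 1); // true
    }
}
```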
Hello,
I have CQL requests I would like to execute through Astyanax:
select col1,col2,col3 from mycf where KEY IN ('01','02')
select col1,col2,col3 from mycf where KEY='01'
I see strange behavior here: the request returns the right number of rows (2 for the first request and 1 for the second), but these rows contain no columns and the id/key isn't correct.
Here is the way I execute the requests:
Keyspace ks = (...);
OperationResult<CqlResult<String, String>> operationResult
= ks.prepareQuery(columnFamilyInfo)
.withCql(expression.toString()).execute();
SelectResult result = new SelectResult();
CqlResult<String, String> cqlResult = operationResult.getResult();
(...)
Here is the way I created the column family with cassandra cli:
drop column family mycf;
create column family mycf with comparator=UTF8Type
and column_metadata=[
{column_name: col1, validation_class: UTF8Type, index_type: KEYS},
{column_name: col2, validation_class: UTF8Type, index_type: KEYS}];
assume mycf keys as utf8;
assume mycf comparator as utf8;
assume mycf validator as utf8;
I use Cassandra 1.0.8 and Astyanax 0.8.9.
Thanks very much for your help!
Thierry
It seems all support was dropped in favor of composite columns.
Latencies for errors are counted towards the overall latency for the node. Errors generally happen very quickly, so you can end up sending more and more traffic to nodes that may be in a bad state.
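What I'd expect instead is something like the following sketch (not the Astyanax code): only successful operations contribute to the latency average, so fast failures can't make a sick host look attractive to a latency-based load balancer.

```java
import java.util.concurrent.atomic.AtomicLong;

public class HostScore {
    // Sketch: track mean latency from successful operations only,
    // counting errors separately instead of folding their (usually tiny)
    // latencies into the average.
    private final AtomicLong totalLatencyNanos = new AtomicLong();
    private final AtomicLong successCount = new AtomicLong();
    private final AtomicLong errorCount = new AtomicLong();

    public void recordSuccess(long latencyNanos) {
        totalLatencyNanos.addAndGet(latencyNanos);
        successCount.incrementAndGet();
    }

    public void recordError() {
        // Deliberately excluded from the latency average.
        errorCount.incrementAndGet();
    }

    public long errors() {
        return errorCount.get();
    }

    public double meanLatencyNanos() {
        long n = successCount.get();
        // No successes yet: report the worst possible score.
        return n == 0 ? Double.MAX_VALUE : (double) totalLatencyNanos.get() / n;
    }

    public static void main(String[] args) {
        HostScore score = new HostScore();
        score.recordSuccess(100_000);
        score.recordError(); // not counted in the mean
        score.recordSuccess(300_000);
        System.out.println(score.meanLatencyNanos()); // 200000.0
    }
}
```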
I'm writing an integration test to check the Astyanax adapter functionality. Part of that is to create the keyspace dynamically, using code, and delete it later. I wrote the following code, but it seems like it's not working. Can someone please help me find the issue or provide alternate code to achieve the same functionality?
Here is what I tried to do.
AstyanaxContext clusterContext = new AstyanaxContext.Builder()
        .forCluster(clusterName)
        .withAstyanaxConfiguration(new AstyanaxConfigurationImpl().setDiscoveryType(NodeDiscoveryType.NONE))
        .withConnectionPoolConfiguration(
                new ConnectionPoolConfigurationImpl("MyConnectionPool")
                        .setPort(Integer.parseInt(ports))
                        .setMaxConnsPerHost(1)
                        .setSeeds("127.0.0.1:9160"))
        .withConnectionPoolMonitor(new CountingConnectionPoolMonitor())
        .buildCluster(ThriftFamilyFactory.getInstance());
ThriftKeyspaceDefinitionImpl def = new ThriftKeyspaceDefinitionImpl();
def.setName(keyspace);
clusterContext.getEntity().addKeyspace(def);
And this is the exception I'm getting.
com.netflix.astyanax.connectionpool.exceptions.NoAvailableHostsException: NoAvailableHostsException: [host=None(0.0.0.0):0, latency=0(0), attempts=0] No hosts to borrow from
at com.netflix.astyanax.connectionpool.impl.RoundRobinExecuteWithFailover.<init>(RoundRobinExecuteWithFailover.java:30)
at com.netflix.astyanax.connectionpool.impl.RoundRobinConnectionPoolImpl.newExecuteWithFailover(RoundRobinConnectionPoolImpl.java:49)
at com.netflix.astyanax.connectionpool.impl.AbstractHostPartitionConnectionPool.executeWithFailover(AbstractHostPartitionConnectionPool.java:222)
at com.netflix.astyanax.thrift.ThriftClusterImpl.addKeyspace(ThriftClusterImpl.java:229)
Can this please be added to the javadocs? In particular, are Keyspace and ColumnFamily thread-safe?
We're using the TokenAwareConnectionPool. During a node outage we discovered that the token ownership of our ring changed.
# nodetool -h localhost ring
Address DC Rack Status State Load Owns Token
127605887595351923798765477786913079296
192.168.220.55 datacenter1 rack1 Down Normal ? 36.47% 19521588064215444032974458419416990725
192.168.220.56 datacenter1 rack1 Up Normal 120.93 GB 13.53% 42535295865117307932921825928971026432
192.168.220.57 datacenter1 rack1 Up Normal 120.34 GB 25.00% 85070591730234615865843651857942052864
192.168.220.58 datacenter1 rack1 Up Normal 106.25 GB 25.00% 127605887595351923798765477786913079296
We're not sure why an outage resulted in the token for 220.55 changing from 0 to 19521588064215444032974458419416990725 but after the token changed any request for a token less than 19521588064215444032974458419416990725 resulted in the following
Caused by: java.lang.ArrayIndexOutOfBoundsException: -1
at java.util.ArrayList.get(ArrayList.java:324)
at com.netflix.astyanax.connectionpool.impl.TokenPartitionedTopology.getPartition(TokenPartitionedTopology.java:152)
at com.netflix.astyanax.connectionpool.impl.TokenAwareConnectionPoolImpl.newExecuteWithFailover(TokenAwareConnectionPoolImpl.java:67)
at com.netflix.astyanax.connectionpool.impl.AbstractHostPartitionConnectionPool.executeWithFailover(AbstractHostPartitionConnectionPool.java:222)
at com.netflix.astyanax.thrift.ThriftColumnFamilyQueryImpl$3.execute(ThriftColumnFamilyQueryImpl.java:383)
at com.tendril.platform.services.mds.dao.MeterDataDaoCassandra.executeWithColumnRange(MeterDataDaoCassandra.java:54)
at com.tendril.platform.services.mds.dao.MeterDataDaoCassandra.getFirstOnOrBefore(MeterDataDaoCassandra.java:122)
at com.tendril.platform.services.mds.dao.MeterDataDaoCassandra.getFirstOnOrBefore(MeterDataDaoCassandra.java:1)
at com.tendril.platform.services.mds.service.MeterDataServiceImpl.getFirst(MeterDataServiceImpl.java:87)
at com.tendril.platform.services.mds.service.MeterDataServiceImpl.access$0(MeterDataServiceImpl.java:86)
at com.tendril.platform.services.mds.service.MeterDataServiceImpl$1.call(MeterDataServiceImpl.java:59)
at com.tendril.platform.services.mds.service.MeterDataServiceImpl$1.call(MeterDataServiceImpl.java:1)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
This appears to be related to the following code in TokenPartitionedTopology:
@Override
public HostConnectionPoolPartition<CL> getPartition(BigInteger token) {
    // First, get a copy of the partitions.
    List<HostConnectionPoolPartition<CL>> partitions = this.sortedRing.get();
    // Must have a token otherwise we default to the base class
    // implementation
    if (token == null || partitions == null || partitions.isEmpty()) {
        return getAllPools();
    }

    // Do a binary search to find the token partition which owns the
    // token. We can get two responses here.
    // 1. The token is in the list in which case the response is the
    //    index of the partition
    // 2. The token is not in the list in which case the response is the
    //    index where the token would have been inserted into the list.
    //    We convert this index (which is negative) to the index of the
    //    previous position in the list.
    @SuppressWarnings("unchecked")
    int j = Collections.binarySearch(partitions, token, tokenSearchComparator);
    if (j < 0) {
        j = -j - 2;
    }
    return partitions.get(j % partitions.size());
}
In a ring whose smallest token is non-zero, the value of j from Collections.binarySearch(partitions, token, tokenSearchComparator) will be -1 for any token that is less than the smallest token in the partitions list. This means j = -j - 2 returns -1, and eventually the ArrayIndexOutOfBoundsException, since j % partitions.size() will also be -1.
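The failure is easy to reproduce with plain JDK calls, simplified here to a List<Integer> of partition ids, with java.lang.Math.floorMod (JDK 8+) shown as one sign-safe alternative to Java's % operator:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class NegativeModDemo {
    public static void main(String[] args) {
        // Simplified stand-in: the sorted partition ids from the example ring.
        List<Integer> partitions = Arrays.asList(3, 14, 25);

        int j = Collections.binarySearch(partitions, 1); // token below the smallest id
        System.out.println(j);                           // -1: -(insertion point 0) - 1
        j = -j - 2;                                      // the topology's conversion step
        System.out.println(j);                           // -1
        System.out.println(j % partitions.size());       // -1: Java's % keeps the sign
        System.out.println(Math.floorMod(j, partitions.size())); // 2: always non-negative
        // partitions.get(j % partitions.size()) would throw here,
        // matching the ArrayIndexOutOfBoundsException: -1 in the trace.
    }
}
```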
If there's not a hard reason for Guice 2.0, can the project please update to 3.0?
Hector has this nice feature: with a tenant id, an application can isolate data in the same column family.
I'm looking at the implementation of the getPartition method in TokenPartitionedTopology.
This method takes a BigInteger "token" as its argument and is supposed to return the HostConnectionPoolPartition that "owns" the token.
If the smallest partition id is greater than zero, the method will throw an ArrayIndexOutOfBoundsException if its token argument is less than the smallest partition id.
This occurs because the index of the partition with the next greater id is found (using Collections.binarySearch), decremented by 1, and taken % partitions.size().
If the argument token is smaller than the smallest partition id, the value returned by Collections.binarySearch will correspond to an index of 0, which is decremented to -1. And Java's % operator returns a negative value when its left-hand argument is negative.
Replacing the % operator with a call to Guava's IntMath.mod method (which always returns a non-negative value) gets rid of the exception.
(But the method still maps tokens less than the smallest partition id to the partition with the greatest id. As I describe in my previous issue, this seems inconsistent with the description on Cassandra's wiki page.)
John Batali
Knewton
An example demonstrating declaring a Cassandra column family would be useful.
I posted this on Stack Overflow, but it looks like there are no Astyanax questions there yet.
I saw the ChunkedExample that takes an InputStream, but I am receiving http requests chunk after chunk, and as I receive them I want to stream them into Cassandra without waiting for a reply from Cassandra until the very last byte, and then check all responses to make sure the whole file succeeded in being written.
Also, I then need an example of getting it back out. It looks like you may have this streaming capability, right?
Also, is there a forum for questions? I am trying to build an ORM/indexing library (albeit Lucene indexing with Solr/Solandra) in one, which actually has NamedQuery but with Lucene/Solr syntax, on GitHub right now, and I am trying to find a way to add this file streaming capability, which is a little outside the ORM. Solr then avoids the whole reverse-indexing hack, which is kind of very custom.
thanks,
Dean
Need:
ColumnListMutation<C> putColumn(C columnName, float value, Integer ttl);
Without this, sending across float or double values will intermittently cause AbstractType.validate to puke for float and/or double.
Super easy to fix; I'm still figuring out how to use this GitHub thing, else I would submit a patch.
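A minimal illustration of why the missing overload bites (the serialize method here is hypothetical, not the Astyanax API): with only a double-taking method available, Java silently widens a float argument to double, producing 8 serialized bytes where Cassandra's FloatType validation expects exactly 4.

```java
import java.nio.ByteBuffer;

public class FloatWideningDemo {
    // Hypothetical mutation method with no float overload: a float
    // argument is widened to double before it ever reaches us.
    static ByteBuffer serialize(double value) {
        return ByteBuffer.allocate(8).putDouble(0, value);
    }

    public static void main(String[] args) {
        ByteBuffer stored = serialize(1.5f);   // float silently widened
        System.out.println(stored.capacity()); // 8 bytes -- FloatType wants 4
        ByteBuffer wanted = ByteBuffer.allocate(4).putFloat(0, 1.5f);
        System.out.println(wanted.capacity()); // 4
    }
}
```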
Anyone know what's going on here?
testCqlCount(com.netflix.astyanax.thrift.ThrifeKeyspaceImplTest) Time elapsed: 0.008 sec <<< FAILURE!
junit.framework.AssertionFailedError: null
...
at com.netflix.astyanax.thrift.ThrifeKeyspaceImplTest.testCqlCount(ThrifeKeyspaceImplTest.java:1356)
When calling Cluster.addColumnFamily, I get the below exception. I understand the root cause (keyspace not set for this session) but I am not sure how to use this API correctly.
On the surface, it seems like ThriftClusterImpl#addColumnFamily is missing a call to set_keyspace, but I am not entirely sure.
Am I using this API incorrectly?
Thanks
com.netflix.astyanax.connectionpool.exceptions.BadRequestException: BadRequestException: [host=localhost(127.0.0.1):9160, latency=566(566), attempts=1] InvalidRequestException(why:You have not set a keyspace for this session)
at com.netflix.astyanax.thrift.ThriftConverter.ToConnectionPoolException(ThriftConverter.java:155)
at com.netflix.astyanax.thrift.AbstractOperationImpl.execute(AbstractOperationImpl.java:60)
at com.netflix.astyanax.thrift.AbstractOperationImpl.execute(AbstractOperationImpl.java:27)
at com.netflix.astyanax.thrift.ThriftSyncConnectionFactoryImpl$1.execute(ThriftSyncConnectionFactoryImpl.java:115)
at com.netflix.astyanax.connectionpool.impl.AbstractExecuteWithFailoverImpl.tryOperation(AbstractExecuteWithFailoverImpl.java:49)
at com.netflix.astyanax.connectionpool.impl.AbstractHostPartitionConnectionPool.executeWithFailover(AbstractHostPartitionConnectionPool.java:222)
at com.netflix.astyanax.thrift.ThriftClusterImpl.addColumnFamily(ThriftClusterImpl.java:200)
at com.emc.storageos.db.server.impl.SchemaUtil.checkCf(SchemaUtil.java:167)
at com.emc.storageos.db.server.impl.SchemaUtil.scanAndSetupDb(SchemaUtil.java:142)
at com.emc.storageos.DbsvcTestBase.startDb(DbsvcTestBase.java:84)
at com.emc.storageos.DbsvcTestBase.setup(DbsvcTestBase.java:55)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:27)
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:30)
at org.junit.runners.ParentRunner.run(ParentRunner.java:300)
at org.junit.runner.JUnitCore.run(JUnitCore.java:157)
at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:76)
at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:182)
at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:62)
Caused by: InvalidRequestException(why:You have not set a keyspace for this session)
at org.apache.cassandra.thrift.Cassandra$system_add_column_family_result.read(Cassandra.java:26899)
at org.apache.cassandra.thrift.Cassandra$Client.recv_system_add_column_family(Cassandra.java:1455)
at org.apache.cassandra.thrift.Cassandra$Client.system_add_column_family(Cassandra.java:1430)
at com.netflix.astyanax.thrift.ThriftClusterImpl$9.internalExecute(ThriftClusterImpl.java:205)
at com.netflix.astyanax.thrift.ThriftClusterImpl$9.internalExecute(ThriftClusterImpl.java:202)
at com.netflix.astyanax.thrift.AbstractOperationImpl.execute(AbstractOperationImpl.java:55)
... 23 more
Hi,
I potentially have a use case where I need to fetch a significant fraction of the rows of a column family in the background, and I would like to distribute the load as evenly as possible amongst the Cassandra nodes. (BTW, for various reasons I cannot use Hadoop.)

I was thinking I could use the host descriptions to find the token ranges for each partition, and then issue a series of queries to each node. I believed the TokenAwareConnectionPool, in conjunction with operations specifying a pinned host, would allow me to do this. But when I looked at the implementation, it appeared that pinning a host was mutually exclusive with failover.

So I'd like to propose a change to TokenAwareConnectionPoolImpl such that it starts with the pinned host but can still fail over to other replicas if that host fails. I coded up something (untested) just to show what I have in mind:
@Override
public <R> ExecuteWithFailover<CL, R> newExecuteWithFailover(Operation<CL, R> op) throws ConnectionException {
    HostConnectionPoolPartition<CL> partition = topology.getPartition(op.getToken());
    List<HostConnectionPool<CL>> pools = partition.getPools();
    int firstHostIndex = -1;
    Host pinnedHost = op.getPinnedHost();
    if (pinnedHost == null) {
        firstHostIndex = partition.isSorted() ? 0 : -1;
    } else {
        for (int p = 0; p < pools.size(); ++p) {
            if (pinnedHost.equals(pools.get(p).getHost())) {
                firstHostIndex = p;
                break;
            }
        }
    }
    return new RoundRobinExecuteWithFailover<CL, R>(config, monitor, pools,
            firstHostIndex >= 0 ? firstHostIndex : roundRobinCounter.incrementAndGet());
}
In com.netflix.astyanax.connectionpool.impl.ConnectionPoolMBeanManager (and maybe other classes as well), the Logger and LoggerFactory come from the org.apache.log4j package instead of the more generic org.slf4j package. This means we have to use log4j instead of something more current like Logback.
https://gist.github.com/1718436
So I am sure the keyspace exists and my Cassandra was up. Can someone tell me what's going on here?
Using Astyanax 1.0.0...
[May 07 09:25:07] INFO ConnectionPoolMBeanManager:45 - Registering mbean: com.netflix.MonitoredResources:type=ASTYANAX,name=FliteConnectionPool,ServiceType=connectionpool
java.lang.NoClassDefFoundError: com/google/common/cache/CacheBuilder
at com.netflix.astyanax.thrift.ThriftKeyspaceImpl.<init>(ThriftKeyspaceImpl.java:81)
at com.netflix.astyanax.thrift.ThriftClusterImpl.getKeyspace(ThriftClusterImpl.java:239)
...etc...
Seems like the pom.xml is outdated: it includes Guice, which isn't really used, and cassandra-all at 0.8.9, which is pretty old.
I am having trouble persisting a Long correctly when defining my ColumnFamily at runtime and using the Mapping functionality.
Here is how I am creating the Column Family:
ColumnFamilyDefinition cfDef = cluster.makeColumnFamilyDefinition()
        .setName(CF_DATA.getName())
        .setComparatorType("UTF8Type")
        .setKeyValidationClass("UTF8Type");
cfDef.addColumnDefinition(cluster.makeColumnDefinition()
        .setName("status")
        .setValidationClass("UTF8Type")
        .setIndex("test_status_index", "KEYS"));
cfDef.addColumnDefinition(cluster.makeColumnDefinition()
        .setName("dateCreated")
        .setValidationClass("LongType")
        .setIndex("test_dateCreated_index", "KEYS"));
Then From there I am inserting:
TestEntity testBean = new TestEntity();
testBean.setDateCreated(System.currentTimeMillis());
testBean.setStatus("PENDING");
// writing a bean
MutationBatch mutationBatch = testKeySpace.prepareMutationBatch();
ColumnListMutation<String> columnListMutation = mutationBatch.withRow(CF_DATA, 1L);
mapper.fillMutation(testBean, columnListMutation);
mutationBatch.execute();
And my Entity looks like this:
public class TestEntity {
    public Long getId() {
        return id;
    }

    public void setId(Long id) {
        this.id = id;
    }

    public String getStatus() {
        return status;
    }

    public void setStatus(String status) {
        this.status = status;
    }

    public Long getDateCreated() {
        return dateCreated;
    }

    public void setDateCreated(Long dateCreated) {
        this.dateCreated = dateCreated;
    }

    @Id("id")
    private Long id;

    @Column("status")
    private String status;

    @Column("dateCreated")
    private Long dateCreated;
}
The error I get on the CLI console when trying to view the persisted entity is "invalid UTF8 bytes 0000013604971679". I know it has to do with the Long dateCreated field, because when I remove it everything works fine. Is this a bug or am I doing something wrong?
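For what it's worth, the "invalid UTF8 bytes" hex string from the error decodes cleanly as a big-endian 8-byte long (Cassandra's LongType encoding), and the result is a plausible System.currentTimeMillis() value. That may suggest the write side stored a proper long and the CLI is merely trying to render those bytes as UTF-8:

```java
import java.nio.ByteBuffer;
import java.util.Date;

public class DecodeLongBytes {
    public static void main(String[] args) {
        // The hex string "0000013604971679" from the CLI error, read as a
        // big-endian 8-byte long (Cassandra's LongType encoding).
        byte[] raw = {0x00, 0x00, 0x01, 0x36, 0x04, (byte) 0x97, 0x16, 0x79};
        long millis = ByteBuffer.wrap(raw).getLong();
        System.out.println(millis);           // 1331516872313
        System.out.println(new Date(millis)); // a date in March 2012
    }
}
```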
I think:
// Querying cassandra for all columns of row key "SomeSessionId" starting with
should be:
// Querying cassandra for all columns of row key "SomeUserName" starting with
The comment is kind of confusing, but I am just learning about Cassandra, so I didn't want to change it myself in case I'm just missing something.
I'm equally new to Cassandra and Astyanax and am having difficulty leveraging the improved manner of approaching Composites with the Component annotation.
public class MyCompositeColumn {
@Component(ordinal = 0)
private UUID timeUUID;
@Component(ordinal = 1)
private String someId;
}
Most of the time I'm accessing columns by their TimeUUID range, but occasionally I'd like to filter by the other value in the composite column. From what I see in Hector, there exists the ability to specify this level of filtering in a get.
[Hector]
composite.addComponent(1,"someId",Composite.ComponentEquality.EQUAL);
It seems that AnnotatedCompositeSerializer.makeEndpoint will always set the composite component to position 0, if I'm reading the source correctly. Perhaps this method could be overloaded with a position argument?
The other way to do this is AnnotatedCompositeSerializer.buildRange() with withPrefix. But since we want the value in this case to be ignored (a wildcard), there is no way around this to get to component 1:
AnnotatedCompositeSerializer.buildRange()
.withPrefix(null) // ???
.greaterThanEquals(someId)
.lessThanEquals(someId))
Maybe my feeble mind is missing another way to do this in Astyanax? Any suggestions would be appreciated. I'm new to the open-source-ery and would like to help if I can ;)
My Example:
[CLI] create column family COMPOSITE_TESTER with comparator = 'CompositeType(TimeUUIDType, UTF8Type)' and key_validation_class = 'UTF8Type' and default_validation_class = 'UTF8Type';
AnnotatedCompositeSerializer myCompositeColumnSerializer = new AnnotatedCompositeSerializer(MyCompositeColumn.class);
ColumnFamily<String, MyCompositeColumn> COMPOSITE_TESTER = new ColumnFamily<String, MyCompositeColumn>(
"COMPOSITE_TESTER", // Column Family
StringSerializer.get(), // Key Serializer
myCompositeColumnSerializer); // Column Serializer
MyCompositeColumn myCompositeColumn = new MyCompositeColumn(TimeUUIDUtils.getTimeUUID(System.currentTimeMillis()), String.valueOf(randomId.nextInt(1000)));
keyspace.prepareColumnMutation(COMPOSITE_TESTER, keyName, myCompositeColumn)
.putValue("{ JSON " + System.currentTimeMillis() + "}", 300)
.execute();
OperationResult<ColumnList> result = keyspace.prepareQuery(COMPOSITE_TESTER)
.getKey(keyName)
// I'm able to do a columnRange slice on Component position 0 with...
.withColumnRange(
//makeEndpoint always operates on Component 0 of a Composite Colum ... position=0; components.get(position);
myCompositeColumnSerializer.makeEndpoint(null, Equality.GREATER_THAN).toBytes(),
myCompositeColumnSerializer.makeEndpoint(nowTimeUUID, Equality.LESS_THAN_EQUALS).toBytes(),
false, 100)
// this too works for the Component at position 0...
// .withColumnRange(
// new MyCompositeColumn(thenTimeUUID, null),
// new MyCompositeColumn(nowTimeUUID, null),
// false, 100)
// this doesn't work for the Component at position 1. tweaking Comparable / equals is hacky ?
// .withColumnRange(
// new MyCompositeColumn(null, someId),
// new MyCompositeColumn(null, someId),
// false, 100)
.execute();
// this does not....
// myCompositeColumnSerializer.buildRange()
// .withPrefix(null) // ???
// .greaterThanEquals(someId)
// .lessThanEquals(someId))
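For what it's worth, the limitation seems inherent to how composites sort: columns order component-by-component, and a slice is a contiguous run in that order, so a slice can only constrain a component after all earlier components are fixed. A plain-Java sketch of that ordering:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class CompositeOrderDemo {
    // Composite columns sort component-by-component, first component first;
    // modeled here as String[] pairs of (timeComponent, someId).
    static List<String[]> sorted(List<String[]> cols) {
        List<String[]> copy = new ArrayList<>(cols);
        copy.sort(Comparator.<String[], String>comparing(c -> c[0])
                .thenComparing(Comparator.comparing((String[] c) -> c[1])));
        return copy;
    }

    public static void main(String[] args) {
        List<String[]> cols = Arrays.asList(
                new String[] {"t1", "idB"},
                new String[] {"t2", "idA"},
                new String[] {"t1", "idA"},
                new String[] {"t3", "idA"});
        for (String[] c : sorted(cols)) {
            System.out.println(c[0] + "/" + c[1]);
        }
        // Prints t1/idA, t1/idB, t2/idA, t3/idA: the columns with
        // someId == "idA" are not contiguous, so no single slice
        // (a contiguous range in this order) can select them all.
    }
}
```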
I have a composite type class with
@Component(ordinal = 0)
private long startTime;
@Component(ordinal = 1)
private URI uri;
but I can't use it, because SerializerTypeInferer knows nothing about my UriSerializer class (which extends AbstractSerializer<URI> and serializes java.net.URI).
public AnnotatedCompositeSerializer(Class<T> clazz) {
    this.clazz = clazz;
    this.components = new ArrayList<ComponentSerializer<?>>();
    for (Field field : clazz.getDeclaredFields()) {
        Component annotation = field.getAnnotation(Component.class);
        if (annotation != null) {
            components.add(makeComponent(field,
                    SerializerTypeInferer.getSerializer(field.getType()),
                    annotation.ordinal()));
        }
    }
    Collections.sort(this.components);
}
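One possible direction, sketched below as a standalone idea rather than a proposed Astyanax API: let callers register serializers for extra types, and fall back to inference only for unknown ones. A minimal registry to show the shape (all names here are hypothetical):

```java
import java.net.URI;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

public class SerializerRegistry {
    // Hypothetical sketch: a lookup table that callers can extend, unlike a
    // fixed inference table that only knows a closed set of types.
    private final Map<Class<?>, Function<Object, ByteBuffer>> table = new HashMap<>();

    public <T> void register(Class<T> type, Function<T, ByteBuffer> serializer) {
        @SuppressWarnings("unchecked")
        Function<Object, ByteBuffer> f = (Function<Object, ByteBuffer>) serializer;
        table.put(type, f);
    }

    public ByteBuffer serialize(Object value) {
        Function<Object, ByteBuffer> f = table.get(value.getClass());
        if (f == null) {
            throw new IllegalArgumentException("no serializer for " + value.getClass());
        }
        return f.apply(value);
    }

    public static void main(String[] args) {
        SerializerRegistry registry = new SerializerRegistry();
        // A caller-supplied serializer for java.net.URI.
        registry.register(URI.class,
                u -> ByteBuffer.wrap(u.toString().getBytes(StandardCharsets.UTF_8)));
        ByteBuffer bytes = registry.serialize(URI.create("http://example.com"));
        System.out.println(bytes.remaining()); // 18
    }
}
```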
Exclude all transitive dependencies of cassandra-all that are not really needed by Astyanax, like avro, slf4j-log4j, log4j, junit, etc.
$ mvn dependency:tree
[INFO] Scanning for projects...
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] Building astyanax 1.0.2-SNAPSHOT
[INFO] ------------------------------------------------------------------------
[INFO]
[INFO] --- maven-dependency-plugin:2.1:tree (default-cli) @ astyanax ---
[INFO] com.netflix.astyanax:astyanax:jar:1.0.2-SNAPSHOT
[INFO] +- joda-time:joda-time:jar:2.0:compile
[INFO] +- org.apache.servicemix.bundles:org.apache.servicemix.bundles.commons-csv:jar:1.0-r706900_3:compile
[INFO] +- commons-lang:commons-lang:jar:2.4:compile
[INFO] +- com.github.stephenc.high-scale-lib:high-scale-lib:jar:1.1.1:compile
[INFO] +- com.google.guava:guava:jar:11.0.2:compile
[INFO] | \- com.google.code.findbugs:jsr305:jar:1.3.9:compile
[INFO] +- org.apache.cassandra:cassandra-all:jar:1.0.8:compile
[INFO] | +- org.xerial.snappy:snappy-java:jar:1.0.4.1:compile
[INFO] | +- com.ning:compress-lzf:jar:0.8.4:compile
[INFO] | +- commons-cli:commons-cli:jar:1.1:compile
[INFO] | +- commons-codec:commons-codec:jar:1.2:compile
[INFO] | +- com.googlecode.concurrentlinkedhashmap:concurrentlinkedhashmap-lru:jar:1.2:compile
[INFO] | +- org.antlr:antlr:jar:3.2:compile
[INFO] | | \- org.antlr:antlr-runtime:jar:3.2:compile
[INFO] | | \- org.antlr:stringtemplate:jar:3.2:compile
[INFO] | | \- antlr:antlr:jar:2.7.7:compile
[INFO] | +- org.apache.cassandra.deps:avro:jar:1.4.0-cassandra-1:compile
[INFO] | | \- org.mortbay.jetty:jetty:jar:6.1.22:compile
[INFO] | | +- org.mortbay.jetty:jetty-util:jar:6.1.22:compile
[INFO] | | \- org.mortbay.jetty:servlet-api:jar:2.5-20081211:compile
[INFO] | +- org.codehaus.jackson:jackson-core-asl:jar:1.4.0:compile
[INFO] | +- org.codehaus.jackson:jackson-mapper-asl:jar:1.4.0:compile
[INFO] | +- jline:jline:jar:0.9.94:compile
[INFO] | +- com.googlecode.json-simple:json-simple:jar:1.1:compile
[INFO] | +- org.yaml:snakeyaml:jar:1.6:compile
[INFO] | +- log4j:log4j:jar:1.2.16:compile
[INFO] | +- org.slf4j:slf4j-log4j12:jar:1.6.1:runtime
[INFO] | +- org.apache.thrift:libthrift:jar:0.6.1:compile
[INFO] | | +- junit:junit:jar:4.4:compile
[INFO] | | +- javax.servlet:servlet-api:jar:2.5:compile
[INFO] | | \- org.apache.httpcomponents:httpclient:jar:4.0.1:compile
[INFO] | | +- org.apache.httpcomponents:httpcore:jar:4.0.1:compile
[INFO] | | \- commons-logging:commons-logging:jar:1.1.1:compile
[INFO] | +- org.apache.cassandra:cassandra-thrift:jar:1.0.8:compile
[INFO] | \- com.github.stephenc:jamm:jar:0.2.5:compile
[INFO] +- com.github.stephenc.eaio-uuid:uuid:jar:3.2.0:compile
[INFO] +- org.slf4j:slf4j-api:jar:1.6.4:compile
[INFO] \- org.codehaus.jettison:jettison:jar:1.3.1:compile
[INFO] \- stax:stax-api:jar:1.0.1:compile
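Based on the tree above, the exclusions could be declared in the POM roughly like this (a sketch: only the artifacts called out above are listed, and whether each one is truly unused by Astyanax would need verification):

```xml
<dependency>
    <groupId>org.apache.cassandra</groupId>
    <artifactId>cassandra-all</artifactId>
    <version>1.0.8</version>
    <exclusions>
        <exclusion>
            <groupId>org.apache.cassandra.deps</groupId>
            <artifactId>avro</artifactId>
        </exclusion>
        <exclusion>
            <groupId>org.slf4j</groupId>
            <artifactId>slf4j-log4j12</artifactId>
        </exclusion>
        <exclusion>
            <groupId>log4j</groupId>
            <artifactId>log4j</artifactId>
        </exclusion>
        <exclusion>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
        </exclusion>
    </exclusions>
</dependency>
```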
It seems that the async code path is implemented by simply executing the synchronous version on a thread pool.
Is it possible to make this truly asynchronous (i.e., using a non-blocking socket stack, like what Netty does, for example)?
I'm not familiar enough with Thrift; it may not even be fundamentally possible.
Thanks,
Lech
Hi,
when I run the first three snippets of code from Getting Started together, i.e. 'Initializing Astyanax', 'Define your column family structure' and 'Inserting data', I get a connection exception while executing the mutation batch: com.netflix.astyanax.connectionpool.exceptions.BadRequestException: BadRequestException: [host=127.0.0.1(127.0.0.1):9160, latency=8(40), attempts=1] InvalidRequestException(why:Keyspace KeySpaceName does not exist).
Only when I manually create the keyspace and column family in Cassandra do the inserts work. Am I missing some obvious step here?
Thanks
Jude
I just started to evaluate astyanax by testing out these listed code snippets. But I keep getting the same exception "com.netflix.astyanax.connectionpool.exceptions.PoolTimeoutException: PoolTimeoutException". I'm pretty sure that I have all the connection parameters correct. Could anyone help? Thanks.
A query with getKeyRange, passing startKey and endKey and null for startToken and endToken, where the number of rows in the key slice is smaller than count, fails with the exception below.
The doc states that count is the
Max number of keys to return
so this behavior is unexpected.
Exception in thread "main" com.netflix.astyanax.connectionpool.exceptions.BadRequestException: BadRequestException: [host=127.0.0.1(127.0.0.1):9160, latency=20(64), attempts=1] InvalidRequestException(why:start key's md5 sorts after end key's md5. this is not allowed; you probably should not specify end key at all, under RandomPartitioner)
at com.netflix.astyanax.thrift.ThriftConverter.ToConnectionPoolException(ThriftConverter.java:155)
at com.netflix.astyanax.thrift.AbstractOperationImpl.execute(AbstractOperationImpl.java:60)
at com.netflix.astyanax.thrift.AbstractOperationImpl.execute(AbstractOperationImpl.java:27)
at com.netflix.astyanax.thrift.ThriftSyncConnectionFactoryImpl$1.execute(ThriftSyncConnectionFactoryImpl.java:113)
at com.netflix.astyanax.connectionpool.impl.AbstractExecuteWithFailoverImpl.tryOperation(AbstractExecuteWithFailoverImpl.java:49)
at com.netflix.astyanax.connectionpool.impl.AbstractHostPartitionConnectionPool.executeWithFailover(AbstractHostPartitionConnectionPool.java:222)
at com.netflix.astyanax.thrift.ThriftColumnFamilyQueryImpl$2.execute(ThriftColumnFamilyQueryImpl.java:320)
at testapp.Main$.main(ExampleApp.scala:40)
at testapp.Main.main(ExampleApp.scala)
Caused by: InvalidRequestException(why:start key's md5 sorts after end key's md5. this is not allowed; you probably should not specify end key at all, under RandomPartitioner)
at org.apache.cassandra.thrift.Cassandra$get_range_slices_result.read(Cassandra.java:12814)
at org.apache.cassandra.thrift.Cassandra$Client.recv_get_range_slices(Cassandra.java:762)
at org.apache.cassandra.thrift.Cassandra$Client.get_range_slices(Cassandra.java:734)
at com.netflix.astyanax.thrift.ThriftColumnFamilyQueryImpl$2$1.internalExecute(ThriftColumnFamilyQueryImpl.java:338)
at com.netflix.astyanax.thrift.ThriftColumnFamilyQueryImpl$2$1.internalExecute(ThriftColumnFamilyQueryImpl.java:324)
at com.netflix.astyanax.thrift.AbstractOperationImpl.execute(AbstractOperationImpl.java:55)
... 7 more
I generated another pull request, but since there is one already pending, I doubt it went through.
I can send another PR as soon as my previous one is accepted.
I noticed a TODO in the javadoc for putColumn() for composites. How soon might this be added? I'm wondering if I should take a crack at it, as we'd like to use composites asap, given the deprecation of supercolumns in astyanax.
@Override
public long getCurrentTime() {
    // The following simulates microsecond resolution by advancing a static counter
    // every time a client calls the createClock method, simulating a tick.
    long us = System.currentTimeMillis() * ONE_THOUSAND;
    return us + counter.getAndIncrement() % ONE_THOUSAND;
}
In the last line, if this clock is invoked several times within the same millisecond while the atomic counter is near a multiple of 1000, the modulo wraps around, so later calls within that millisecond can return smaller values than earlier ones.
I recommend adopting the approach used in Hector to resolve this issue.
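The wrap-around is easy to reproduce in isolation. This self-contained sketch (names are mine) repeats the arithmetic of getCurrentTime() with the millisecond and counter value passed in explicitly, showing the "microsecond" timestamp jumping backwards when the counter crosses a multiple of 1000:

```java
public class ClockWrapDemo {
    private static final long ONE_THOUSAND = 1000L;

    // Same arithmetic as getCurrentTime(), but deterministic: the
    // millisecond is fixed and the counter value is supplied directly.
    static long timeAt(long millis, long counterValue) {
        long us = millis * ONE_THOUSAND;
        return us + counterValue % ONE_THOUSAND;
    }

    public static void main(String[] args) {
        long millis = 5000L;
        System.out.println(timeAt(millis, 998));  // prints 5000998
        System.out.println(timeAt(millis, 999));  // prints 5000999
        // Counter reaches 1000 within the same millisecond: the modulo
        // wraps to 0 and the timestamp goes backwards.
        System.out.println(timeAt(millis, 1000)); // prints 5000000
    }
}
```

A clock that goes backwards within a millisecond can reorder mutations that were issued in order, which is why this matters for column timestamps.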
After enabling libthrift 0.8.0, I got the following error at runtime:
com.netflix.astyanax.connectionpool.exceptions.PoolTimeoutException: PoolTimeoutException: [host=127.0.0.1(127.0.0.1):9160, latency=2007(2007), attempts=1] Timed out waiting for connection
Got RuntimeException -
at com.netflix.astyanax.connectionpool.impl.SimpleHostConnectionPool.waitForConnection(SimpleHostConnectionPool.java:194)
at com.netflix.astyanax.connectionpool.impl.SimpleHostConnectionPool.borrowConnection(SimpleHostConnectionPool.java:154)
at com.netflix.astyanax.connectionpool.impl.RoundRobinExecuteWithFailover.borrowConnection(RoundRobinExecuteWithFailover.java:59)
at com.netflix.astyanax.connectionpool.impl.AbstractExecuteWithFailoverImpl.tryOperation(AbstractExecuteWithFailoverImpl.java:47)
at com.netflix.astyanax.connectionpool.impl.AbstractHostPartitionConnectionPool.executeWithFailover(AbstractHostPartitionConnectionPool.java:222)
at com.netflix.astyanax.thrift.ThriftKeyspaceImpl.executeOperation(ThriftKeyspaceImpl.java:407)
at com.netflix.astyanax.thrift.ThriftKeyspaceImpl.access$000(ThriftKeyspaceImpl.java:59)
at com.netflix.astyanax.thrift.ThriftKeyspaceImpl$2.execute(ThriftKeyspaceImpl.java:121)
at com.mycompany.app.TryCassandraByAs.createEvent(TryCassandraByAs.java:67)
at com.mycompany.app.TryCassandraByAs.main(TryCassandraByAs.java:94)
ColumnList<String> columns = ...
Long value = columns.getLongValue("do-not-exist", null);
will produce
Caused by: java.lang.NullPointerException
at com.netflix.astyanax.model.AbstractColumnList.getLongValue(AbstractColumnList.java:29)
This is because the ternary expression's result type is the primitive long (the type of column.getLongValue()), so the defaultValue branch is unboxed (see this question), and unboxing a null Long throws a NullPointerException:
@Override
public Long getLongValue(C columnName, Long defaultValue) {
    Column<C> column = getColumnByName(columnName);
    return column != null ? column.getLongValue() : defaultValue;
}
The code should be (the same problem exists in the other get...Value() methods of that class):
@Override
public Long getLongValue(C columnName, Long defaultValue) {
    Column<C> column = getColumnByName(columnName);
    if (column != null) {
        return column.getLongValue();
    }
    return defaultValue;
}
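The unboxing behavior can be demonstrated without Astyanax at all. This self-contained sketch (class and method names are mine) contrasts the ternary and if/else forms:

```java
public class TernaryUnboxNpe {
    // Stand-in for Column.getLongValue(), which returns a primitive long.
    static long primitiveValue() {
        return 42L;
    }

    // Mirrors the current code: the ternary's result type is the
    // primitive long, so defaultValue is unboxed when the column is
    // missing -- and unboxing null throws a NullPointerException.
    static Long viaTernary(boolean haveColumn, Long defaultValue) {
        return haveColumn ? primitiveValue() : defaultValue;
    }

    // The proposed fix: an explicit if/else never unboxes defaultValue,
    // so a null default passes through safely.
    static Long viaIf(boolean haveColumn, Long defaultValue) {
        if (haveColumn) {
            return primitiveValue();
        }
        return defaultValue;
    }

    public static void main(String[] args) {
        System.out.println(viaIf(false, null)); // prints null
        try {
            viaTernary(false, null);
        } catch (NullPointerException e) {
            System.out.println("NPE from ternary unboxing");
        }
    }
}
```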
The spacing in some files is really bad, making them much more difficult to read. E.g. DynamicCompositeSerializer has a good mix of tabs and spacing, which makes it hard to view the source in github.
https://github.com/Netflix/astyanax/blob/master/src/main/java/com/netflix/astyanax/serializers/DynamicCompositeSerializer.java
I'd suggest replacing all tabs with spaces:
grep -Plr '\t' src | xargs sed -i 's/\t/    /g'
Alternatively, you could import the project into Eclipse and reformat the entire source tree.
Add a hook for using a transaction manager when doing a MutationBatch. This will serve two use cases
The usage will be
MutationBatch m = new MutationBatch();
m.withRow("abc")
    .putColumn(...)
    ...
m.usingTransactionManager(someTransactionManager);
m.execute();
The interface of the TransactionManager would be,
interface TransactionManager {
    Transaction beginTransaction(MutationBatch m);
}

interface Transaction {
    void commit();
    void rollback();
}
There is a gap in the consistency levels: we'd like to use TWO, but it isn't available.
Java's Future is not very useful in its vanilla form: one eventually has to call future.get() and potentially block.
It would be good to use something like ListenableFuture (of Google Guava fame) instead, so the caller can be notified when the Future is actually satisfied.
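For illustration, here is what the callback style looks like using the JDK's CompletableFuture, which serves the same purpose as Guava's ListenableFuture (a self-contained sketch; class, method, and placeholder names are mine):

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class CallbackFutureDemo {
    // Stand-in for an async query; "row-data" is a placeholder result.
    static CompletableFuture<String> fetchAsync(ExecutorService pool) {
        return CompletableFuture.supplyAsync(() -> "row-data", pool);
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        // Register a completion callback instead of blocking on get().
        fetchAsync(pool).thenAccept(result ->
                System.out.println("got: " + result));
        pool.shutdown();
    }
}
```

With a plain Future, the caller has nowhere to hang that callback; the only options are blocking get() or polling isDone().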
I have a simple two string composite column:
static public class LicenseXPathCompositeColumn
{
    @Component(ordinal=0) String hash;
    @Component(ordinal=1) String id;
}
(I've hidden two constructors and a toString for simplicity)
The code below correctly compiles and runs:
// get the columns
LicenseXPathCompositeColumn lXPathColumn = new LicenseXPathCompositeColumn(lHashcode, "xpath");
LicenseXPathCompositeColumn lResultColumn = new LicenseXPathCompositeColumn(lHashcode, "result");
OperationResult<ColumnList<LicenseXPathCompositeColumn>> result = keyspace.prepareQuery(cLicenseXPathColumnFamily)
    .getKey(id)
    .withColumnSlice(lXPathColumn, lResultColumn)
    .execute();
ColumnList<LicenseXPathCompositeColumn> lColumns = result.getResult();
// process the columns
for (com.netflix.astyanax.model.Column<LicenseXPathCompositeColumn> c : lColumns) { System.out.println(c.getName() + "=" + c.getStringValue()); }
System.out.println(lColumns.getColumnByName(lXPathColumn));
System.out.println(lColumns.getColumnByName(lResultColumn));
Two columns are found and this is the output of the printlns:
89eb5af1fbecabc7874de697d815d4d5:result=true
89eb5af1fbecabc7874de697d815d4d5:xpath=count(/license/configuration/services/notification) > 0
null
null
What confuses me is that, according to the API, I have to provide the composite column objects as the parameter to getColumnByName, yet it does not find the columns.
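One possible explanation (an assumption on my part; I have not confirmed how getColumnByName is implemented): if the lookup is backed by a hash map keyed on the column name object, then a composite class that does not override equals() and hashCode() will never match a freshly constructed key. This self-contained sketch (names are mine) shows the effect:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class CompositeKeyDemo {
    // Two string components, like hash and id in the composite above.
    // No equals()/hashCode(): two instances with identical fields are
    // distinct map keys, so lookups by a fresh instance return null.
    static class NoEquals {
        final String hash, id;
        NoEquals(String hash, String id) { this.hash = hash; this.id = id; }
    }

    // With equals()/hashCode() based on the fields, lookups succeed.
    static class WithEquals {
        final String hash, id;
        WithEquals(String hash, String id) { this.hash = hash; this.id = id; }
        @Override public boolean equals(Object o) {
            if (!(o instanceof WithEquals)) return false;
            WithEquals w = (WithEquals) o;
            return hash.equals(w.hash) && id.equals(w.id);
        }
        @Override public int hashCode() { return Objects.hash(hash, id); }
    }

    public static void main(String[] args) {
        Map<NoEquals, String> m1 = new HashMap<>();
        m1.put(new NoEquals("89eb", "xpath"), "value");
        System.out.println(m1.get(new NoEquals("89eb", "xpath"))); // prints null

        Map<WithEquals, String> m2 = new HashMap<>();
        m2.put(new WithEquals("89eb", "xpath"), "value");
        System.out.println(m2.get(new WithEquals("89eb", "xpath"))); // prints value
    }
}
```

If that is the cause, adding equals() and hashCode() to LicenseXPathCompositeColumn would be worth trying.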
It appears that 0.8.9 is the only version available in the maven repo. Will someone push 0.8.10 out there so we don't have to rely on local builds? Or, is it available somewhere and I missed it?
See http://mvnrepository.com/artifact/com.netflix.astyanax/astyanax
Thanks in advance!
I have a composite column with two fields, a Date and a String defined as:
@Component(ordinal=0)
private Date datePart;
@Component(ordinal=1)
private String subcolName;
I insert 3 such columns using dates as today, tomorrow and the day after:
Calendar calendar = Calendar.getInstance();
SimpleDateFormat sdf = new SimpleDateFormat("yyyy.MM.dd 'at' HH:mm:ss.SSS");
Date date = calendar.getTime(); // today
calendar.add(Calendar.DAY_OF_MONTH, 1);
Date date1 = calendar.getTime(); // tomorrow
calendar.add(Calendar.DAY_OF_MONTH, 1);
Date date2 = calendar.getTime(); // the day after
The inserts work fine and I can see the columns in Cassandra with cassandra-cli. However, if I do a column range query and use exactly 'date2' as the end of the range, that column gets skipped. If I add 1 second to 'date2', the column is returned properly. FWIW, I have implemented Comparable and overridden equals() and hashCode().
This is the query:
OperationResult<ColumnList<MockCompositeType>> compositeResult = keyspace.prepareQuery(CF_COMPOSITE)
    .getKey(key)
    .withColumnRange(new MockCompositeType(dateStart, null),
                     new MockCompositeType(dateEnd, null),
                     false, 100)
    .execute();
System.out.println("Composite column range query for '" + key + "' from: '" + dateStart + "' to '" + dateEnd + "'");
for (Column<MockCompositeType> c : compositeResult.getResult()) {
    System.out.println("'" + c.getName() + "'= " + c.getStringValue());
}
The output for the range [date, date2] is:
Composite column range query for 'key-1' from: 'Sun Apr 01 14:28:21 PDT 2012' to 'Tue Apr 03 22:28:21 PDT 2012'
'MockCompositeType[Mon Apr 02 14:28:21 PDT 2012,Volume]'= today 11
'MockCompositeType[Tue Apr 03 14:28:21 PDT 2012,Volume]'= tomorrow 11
Now I add 1 second to date2 and get:
Composite column range query for 'key-1' from: 'Mon Apr 02 14:28:21 PDT 2012' to 'Wed Apr 04 14:28:22 PDT 2012'
'MockCompositeType[Mon Apr 02 14:28:21 PDT 2012,Volume]'= today 11
'MockCompositeType[Tue Apr 03 14:28:21 PDT 2012,Volume]'= tomorrow 11
'MockCompositeType[Wed Apr 04 14:28:21 PDT 2012,Volume]'= the day after 11
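One possible explanation (my assumption; I have not confirmed this against the composite serializer internals): composite names compare component by component, and an end bound of (date2, null) may serialize to a composite that sorts before (date2, "Volume"), so even an inclusive range up to that bound excludes the stored column. A self-contained sketch of that ordering argument (names are mine):

```java
public class CompositeRangeDemo {
    // Compare two composite names component by component; a composite
    // that is a strict prefix of another sorts before it.
    static int compare(String[] a, String[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int c = a[i].compareTo(b[i]);
            if (c != 0) {
                return c;
            }
        }
        return Integer.compare(a.length, b.length);
    }

    public static void main(String[] args) {
        String[] end = {"2012-04-04"};           // end bound: (date2, <absent>)
        String[] col = {"2012-04-04", "Volume"}; // stored column: (date2, "Volume")
        // The stored column sorts AFTER the end bound, so an inclusive
        // range ending at (date2, <absent>) still excludes it -- which
        // would match the "skipped column" behavior above.
        System.out.println(compare(col, end) > 0); // prints true
    }
}
```

Adding 1 second to date2 moves the end bound past (date2, "Volume") entirely, which would explain why that works.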