
dcos-cassandra-service's People

Contributors

aaronjwood, dylanwilder, eastlondoner, fcuny, gabrielhartmann, ianpward, jay-zhuang, joel-hamill, jsrodman, keithchambers, kensipe, kow3ns, loren, mgummelt, mohitsoni, mpereira, mrbrowning, nickbp, nimavaziri, sascala, ssk2, susanxhuynh, szhou1234, triclambert, unclebarney, varungup90, verma7, xuteng2000, zhitaoli, zhiyanshao


dcos-cassandra-service's Issues

Add support for graphite metrics collector

Support for statsd metrics collection was recently added in #43 (bf9a4f4).

However, at Uber we are planning to use Graphite. Hence, I would like to generalize the metrics collection to support either statsd or Graphite, selected via environment variables.
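
For illustration, a minimal sketch of what environment-variable-based selection could look like. The variable names (METRICS_BACKEND, METRICS_HOST, METRICS_PORT) and the enum are hypothetical, not existing configuration:

import java.util.Map;

public final class MetricsEnv {
    public enum Backend { NONE, STATSD, GRAPHITE }

    // Hypothetical: read the desired backend from the environment; default to no metrics.
    public static Backend backend() {
        Map<String, String> env = System.getenv();
        switch (env.getOrDefault("METRICS_BACKEND", "none").toLowerCase()) {
            case "statsd":   return Backend.STATSD;
            case "graphite": return Backend.GRAPHITE;
            default:         return Backend.NONE;
        }
    }

    public static String host() {
        return System.getenv().getOrDefault("METRICS_HOST", "localhost");
    }

    public static int port() {
        return Integer.parseInt(System.getenv().getOrDefault("METRICS_PORT", "8125"));
    }
}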

replacement task was lost during rebooting machine

We have an in-house 5-node Cassandra cluster under Mesos management. After I rebooted one of the nodes, the node came back, but the Cassandra task did not. The problem is also reproducible on a 2-node cluster.

Looking into the code, it seems the problem is in CassandraRepairScheduler.java. See the code below: we change the task state before evaluating it against the offer, so if the evaluation fails the task is no longer in the 'terminal' state.

resourceOffers(....)
{
    ...
    if (terminatedOption.isPresent()) {
        try {
            CassandraTask terminated = terminatedOption.get();
            terminated = cassandraTasks.replaceTask(terminated); // <<<< this will remove the 'terminal' state
            OfferRequirement offerReq =
                offerRequirementProvider.getReplacementOfferRequirement(
                    terminated.toProto());
            OfferEvaluator offerEvaluator = new OfferEvaluator(offerReq);
            List recommendations = offerEvaluator.evaluate(offers);
            LOGGER.debug("Got recommendations: {} for terminated task: {}",
                recommendations,
                terminated.getId());
            acceptedOffers = offerAccepter.accept(driver, recommendations);
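
For illustration, a minimal sketch of the reordering suggested above, reusing the objects from the excerpt: evaluate the offers first, and only replace the task (dropping its 'terminal' state) once there is something to accept. This is a sketch of the idea, not the project's actual fix:

if (terminatedOption.isPresent()) {
    CassandraTask terminated = terminatedOption.get();
    OfferRequirement offerReq =
        offerRequirementProvider.getReplacementOfferRequirement(
            terminated.toProto());
    OfferEvaluator offerEvaluator = new OfferEvaluator(offerReq);
    List recommendations = offerEvaluator.evaluate(offers);
    if (!recommendations.isEmpty()) {
        // Only now clear the 'terminal' state; a failed evaluation leaves the
        // task eligible for replacement on the next offer cycle.
        terminated = cassandraTasks.replaceTask(terminated);
        acceptedOffers = offerAccepter.accept(driver, recommendations);
    }
}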

IdentityResource tests

Implement tests for the IdentityResource. Tests should use a live application and http client.
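
A rough sketch of the kind of test meant here, using Dropwizard's JUnit support. The configuration class name (CassandraSchedulerConfiguration), the scheduler.yml test resource, and the /v1/framework path are assumptions for illustration only:

import io.dropwizard.testing.ResourceHelpers;
import io.dropwizard.testing.junit.DropwizardAppRule;
import org.junit.ClassRule;
import org.junit.Test;
import javax.ws.rs.client.Client;
import javax.ws.rs.client.ClientBuilder;
import javax.ws.rs.core.Response;
import static org.junit.Assert.assertEquals;

public class IdentityResourceIT {
    // Boots the real scheduler application once for the whole test class.
    @ClassRule
    public static final DropwizardAppRule<CassandraSchedulerConfiguration> APP =
        new DropwizardAppRule<>(Main.class, ResourceHelpers.resourceFilePath("scheduler.yml"));

    @Test
    public void identityIsServedOverHttp() {
        Client client = ClientBuilder.newClient();
        Response response = client
            .target("http://localhost:" + APP.getLocalPort() + "/v1/framework") // assumed path
            .request()
            .get();
        assertEquals(200, response.getStatus());
    }
}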

add periodic task reconciliation

The current Cassandra scheduler only kicks off the reconciler at start time. We should also have periodic task reconciliation, since this improves the availability of the system.
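
For illustration, a minimal sketch of periodic reconciliation using the standard SchedulerDriver.reconcileTasks call; the 10-minute interval and the wrapper class are arbitrary choices, not the project's design:

import java.util.Collections;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import org.apache.mesos.Protos;
import org.apache.mesos.SchedulerDriver;

public final class PeriodicReconciler {
    private final ScheduledExecutorService executor =
        Executors.newSingleThreadScheduledExecutor();

    // Ask the master for the state of all known tasks every 10 minutes.
    // Passing an empty status list requests implicit reconciliation.
    public void start(final SchedulerDriver driver) {
        executor.scheduleAtFixedRate(
            () -> driver.reconcileTasks(Collections.<Protos.TaskStatus>emptyList()),
            10, 10, TimeUnit.MINUTES);
    }
}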

Mesosphere maven repos throw a 403

http://downloads.mesosphere.com/maven-snapshot
http://downloads.mesosphere.com/maven

These URLs both give a 403. I'm unable to grab all of the dependencies needed with Gradle for the project because of this. Compilation fails when running ./gradlew clean build:

:cassandra-commons:compileJava
/Users/user/Code/src/dcos-cassandra-service/cassandra-commons/src/generated/main/java/com/mesosphere/dcos/cassandra/common/CassandraProtos.java:2291: error: cannot find symbol
      com.google.protobuf.GeneratedMessageV3 implements
                         ^
  symbol:   class GeneratedMessageV3
  location: package com.google.protobuf
/Users/user/Code/src/dcos-cassandra-service/cassandra-commons/src/generated/main/java/com/mesosphere/dcos/cassandra/common/CassandraProtos.java:848: error: cannot find symbol
      com.google.protobuf.GeneratedMessageV3 implements
                         ^
  symbol:   class GeneratedMessageV3
  location: package com.google.protobuf
/Users/user/Code/src/dcos-cassandra-service/cassandra-commons/src/generated/main/java/com/mesosphere/dcos/cassandra/common/CassandraProtos.java:1489: error: cannot find symbol
      com.google.protobuf.GeneratedMessageV3 implements
                         ^
  symbol:   class GeneratedMessageV3
  location: package com.google.protobuf
/Users/user/Code/src/dcos-cassandra-service/cassandra-commons/src/generated/main/java/com/mesosphere/dcos/cassandra/common/CassandraProtos.java:6904: error: package com.google.protobuf.GeneratedMessageV3 does not exist
    com.google.protobuf.GeneratedMessageV3.FieldAccessorTable
                                          ^
/Users/user/Code/src/dcos-cassandra-service/cassandra-commons/src/generated/main/java/com/mesosphere/dcos/cassandra/common/CassandraProtos.java:6909: error: package com.google.protobuf.GeneratedMessageV3 does not exist
    com.google.protobuf.GeneratedMessageV3.FieldAccessorTable
                                          ^
/Users/user/Code/src/dcos-cassandra-service/cassandra-commons/src/generated/main/java/com/mesosphere/dcos/cassandra/common/CassandraProtos.java:6914: error: package com.google.protobuf.GeneratedMessageV3 does not exist
    com.google.protobuf.GeneratedMessageV3.FieldAccessorTable
                                          ^
/Users/user/Code/src/dcos-cassandra-service/cassandra-commons/src/generated/main/java/com/mesosphere/dcos/cassandra/common/CassandraProtos.java:6919: error: package com.google.protobuf.GeneratedMessageV3 does not exist
    com.google.protobuf.GeneratedMessageV3.FieldAccessorTable
                                          ^
/Users/user/Code/src/dcos-cassandra-service/cassandra-commons/src/generated/main/java/com/mesosphere/dcos/cassandra/common/CassandraProtos.java:6924: error: package com.google.protobuf.GeneratedMessageV3 does not exist
    com.google.protobuf.GeneratedMessageV3.FieldAccessorTable
                                          ^
/Users/user/Code/src/dcos-cassandra-service/cassandra-commons/src/generated/main/java/com/mesosphere/dcos/cassandra/common/CassandraProtos.java:2295: error: package com.google.protobuf.GeneratedMessageV3 does not exist
    private CassandraConfig(com.google.protobuf.GeneratedMessageV3.Builder<?> builder) {
                                                                  ^
/Users/user/Code/src/dcos-cassandra-service/cassandra-commons/src/generated/main/java/com/mesosphere/dcos/cassandra/common/CassandraProtos.java:2435: error: package com.google.protobuf.GeneratedMessageV3 does not exist
    protected com.google.protobuf.GeneratedMessageV3.FieldAccessorTable

etc. etc. etc.

It looks like this has happened before as well: mesosphere/chaos#23

Task failure leads to spinning up more tasks than required due to a bug in ID generation/persistence

INFO  [2016-02-16 14:39:26,384] com.mesosphere.dcos.cassandra.scheduler.CassandraScheduler: Received status update for taskId=server-1_e8c8c0db-edf5-4ec5-8ef7-c26fe8d58486 state=TASK_KILLED source=SOURCE_EXECUTOR reason=REASON_COMMAND_EXECUTOR_FAILED message='Cassandra Daemon was killed by signal 15'
INFO  [2016-02-16 14:39:26,394] com.mesosphere.dcos.cassandra.scheduler.plan.CassandraDaemonBlock: Reallocating task server-3_a457dca3-cf64-45ae-8ec3-1d7e7670aa21 for block 1
INFO  [2016-02-16 14:39:27,047] com.mesosphere.dcos.cassandra.scheduler.CassandraScheduler: Received 1 offers
WARN  [2016-02-16 14:39:27,047] org.apache.mesos.scheduler.plan.DefaultPlanScheduler: No block to process.
INFO  [2016-02-16 14:39:27,048] com.mesosphere.dcos.cassandra.scheduler.CassandraRepairScheduler: Terminated tasks size: 1
INFO  [2016-02-16 14:39:27,048] com.mesosphere.dcos.cassandra.scheduler.offer.PersistentOfferRequirementProvider: Getting replacement requirement for task: server-1_e8c8c0db-edf5-4ec5-8ef7-c26fe8d58486
INFO  [2016-02-16 14:39:27,048] com.mesosphere.dcos.cassandra.scheduler.offer.PersistentOfferRequirementProvider: Task has a volume, taskId: server-1_e8c8c0db-edf5-4ec5-8ef7-c26fe8d58486, reusing existing requirement
INFO  [2016-02-16 14:39:27,048] com.mesosphere.dcos.cassandra.scheduler.offer.PersistentOfferRequirementProvider: Getting existing OfferRequirement for task: name: "server-1"

Repair job gets assigned to a node other than the one that needs to be repaired

The consequence is that the executor is unable to connect to the Cassandra daemon process because it uses 127.0.0.1. Looks like there are two bugs in the scheduler:

  1. Repair job should use NodePlacementStrategy
  2. NodePlacementStrategy doesn't return the expected nodes to avoid.

I'm going to submit a pull request to fix this but let me know if I misunderstand.

Replacing a failed task should generate a new UUID suffix

When replacing a failed task, the taskId should reuse the 'server-N' prefix but generate a new UUID suffix. Reusing the old UUID suffix works, but as a user one cannot browse the logs from the failed task, since Mesos cannot distinguish between the old and new tasks (the taskId is the same).
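
For illustration, a minimal sketch of the suggested scheme (keep the 'server-N' prefix, mint a fresh UUID suffix for the replacement); the helper is hypothetical:

import java.util.UUID;

// Hypothetical helper: "server-1_<old-uuid>" becomes "server-1_<new-uuid>".
static String replacementTaskId(String failedTaskId) {
    String prefix = failedTaskId.substring(0, failedTaskId.indexOf('_')); // e.g. "server-1"
    return prefix + "_" + UUID.randomUUID();
}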

Missing configuration variables

Hi,

We are using DC/OS and have noticed that some config variables are not editable: several of them are hardcoded in scheduler.yml.

For example, I want to configure a Cassandra node to enable UDFs and I haven't found a way to do it.

Is there any reason not to allow this?
Cheers

Error running cqlsh

I followed these instructions from the docs:

Log in to an agent inside the DC/OS cluster that is running a Cassandra node, then run the following to launch a Docker container:
$ docker run --net=host -it mohitsoni/alpine-cqlsh:2.2.5 /bin/sh

And get this error:

Connection error: ('Unable to connect to any servers', {'127.0.0.1': error(111, "Tried connecting to [('127.0.0.1', 9042)]. Last error: Connection refused")})

Unable to Install Package Cassandra

While trying to install Cassandra on DC/OS 1.8 in EC2, we get the following error.

fetch 'https://downloads.mesosphere.com/cassandra/assets/apache-cassandra-3.0.8-bin-dcos.tar.gz': Error downloading resource, received HTTP return code 403

We tried installing the DSE 5 package, which installed successfully, and the CLI package installed as well. However, when we try to query the Cassandra connection via
dcos cassandra connection, it throws the following exception:

/service/cassandra/v1/nodes/list failed: 500 Internal Server Error
2016/09/29 19:07:08 - Did you provide the correct service name? Currently using 'cassandra', specify a different name with '--name='.
2016/09/29 19:07:08 - Was the service recently installed? It may still be initializing, Wait a bit and try again.

Expose all cassandra parameters

Currently, a few Cassandra parameters in cassandra.yaml are not exposed through CASSANDRA_* variables. It would be helpful to expose all of them.

uuid is printed in some encoding in the log

The uuid below is printed in some unreadable encoding. Is that the task uuid?

INFO [2016-03-24 21:37:49,689] org.apache.mesos.scheduler.plan.StageManager: Updated current block with status: block = org.apache.mesos.scheduler.plan.ReconciliationBlock@5fcd35d8,status = task_id {
  value: "node-1_c6e02992-6eb9-4d23-bc8a-18b346530096"
}
state: TASK_RUNNING
data: "\b\001\020\000"
slave_id {
  value: "0ceb1be3-4b36-4b6a-9b13-ebf3803e0118-S1"
}
timestamp: 1.458855281518391E9
executor_id {
  value: "node-1_c6e02992-6eb9-4d23-bc8a-18b346530096_executor"
}
source: SOURCE_EXECUTOR
uuid: "\353%\232\313f\002Ao\257\330Q\310R\246\261<"
container_status {
  network_infos {
    ip_address: "192.168.3.22"
    ip_addresses {
      ip_address: "192.168.3.22"
    }
  }
}
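
The uuid field of a Mesos TaskStatus is declared as bytes; it is the status update's 16-byte identifier (used for acknowledgements), not the task ID, which is why the text dump shows escaped bytes. A small sketch of decoding it for readable logging:

import java.nio.ByteBuffer;
import java.util.UUID;
import org.apache.mesos.Protos.TaskStatus;

// Turn the 16 raw bytes of status.getUuid() into a readable java.util.UUID.
static UUID statusUuid(TaskStatus status) {
    ByteBuffer buf = status.getUuid().asReadOnlyByteBuffer();
    return new UUID(buf.getLong(), buf.getLong());
}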

Allow Java agent override

We need a way to inject a Java agent (a JMX exporter) and a custom jar file so that we can extract metrics from the Cassandra cluster. Currently, I believe overriding JAVA_OPTS only affects the scheduler, not each executor instance.

ConfigurationResource tests

Implement tests for ConfigurationResource. These tests should use a live server and http client and not mock the network connection.

Node replacement failed

I attempted to replace a node and got the following:

cat: /etc/ld.so.conf.d/*.conf: No such file or directory
Java HotSpot(TM) 64-Bit Server VM warning: Cannot open file apache-cassandra-2.2.5/bin/../logs/gc.log due to No such file or directory

java.lang.RuntimeException: A node required to move the data consistently is down (/10.0.1.122). If you wish to move the data from a potentially inconsistent replica, restart the node with -Dcassandra.consistent.rangemovement=false
    at org.apache.cassandra.dht.RangeStreamer.getAllRangesWithStrictSourcesFor(RangeStreamer.java:264)
    at org.apache.cassandra.dht.RangeStreamer.addRanges(RangeStreamer.java:147)
    at org.apache.cassandra.dht.BootStrapper.bootstrap(BootStrapper.java:82)
    at org.apache.cassandra.service.StorageService.bootstrap(StorageService.java:1230)
    at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:924)
    at org.apache.cassandra.service.StorageService.initServer(StorageService.java:709)
    at org.apache.cassandra.service.StorageService.initServer(StorageService.java:585)
    at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:300)
    at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:516)
    at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:625)

and:

CompilerOracle: inline org/apache/cassandra/db/AbstractNativeCell.compareTo (Lorg/apache/cassandra/db/composites/Composite;)I
CompilerOracle: inline org/apache/cassandra/db/composites/AbstractSimpleCellNameType.compareUnsigned (Lorg/apache/cassandra/db/composites/Composite;Lorg/apache/cassandra/db/composites/Composite;)I
CompilerOracle: inline org/apache/cassandra/io/util/Memory.checkBounds (JJ)V
CompilerOracle: inline org/apache/cassandra/io/util/SafeMemory.checkBounds (JJ)V
CompilerOracle: inline org/apache/cassandra/utils/AsymmetricOrdering.selectBoundary (Lorg/apache/cassandra/utils/AsymmetricOrdering/Op;II)I
CompilerOracle: inline org/apache/cassandra/utils/AsymmetricOrdering.strictnessOfLessThan (Lorg/apache/cassandra/utils/AsymmetricOrdering/Op;)I
CompilerOracle: inline org/apache/cassandra/utils/ByteBufferUtil.compare (Ljava/nio/ByteBuffer;[B)I
CompilerOracle: inline org/apache/cassandra/utils/ByteBufferUtil.compare ([BLjava/nio/ByteBuffer;)I
CompilerOracle: inline org/apache/cassandra/utils/ByteBufferUtil.compareUnsigned (Ljava/nio/ByteBuffer;Ljava/nio/ByteBuffer;)I
CompilerOracle: inline org/apache/cassandra/utils/FastByteOperations$UnsafeOperations.compareTo (Ljava/lang/Object;JILjava/lang/Object;JI)I
CompilerOracle: inline org/apache/cassandra/utils/FastByteOperations$UnsafeOperations.compareTo (Ljava/lang/Object;JILjava/nio/ByteBuffer;)I
CompilerOracle: inline org/apache/cassandra/utils/FastByteOperations$UnsafeOperations.compareTo (Ljava/nio/ByteBuffer;Ljava/nio/ByteBuffer;)I
INFO  18:59:37 Loading settings from file:/var/lib/mesos/slave/slaves/20aac1e4-622d-461c-abb5-d99d9ba55ec7-S6/frameworks/20aac1e4-622d-461c-abb5-d99d9ba55ec7-0001/executors/node-2_9e70ce50-098d-4b87-85ed-c2f99ee0386a_executor/runs/64ea900c-1bfa-4393-b18c-447a7b2666b1/apache-cassandra-2.2.5/conf/cassandra.yaml
INFO  18:59:37 Node configuration:[authenticator=AllowAllAuthenticator; authorizer=AllowAllAuthorizer; auto_snapshot=false; batch_size_fail_threshold_in_kb=50; batch_size_warn_threshold_in_kb=5; batchlog_replay_throttle_in_kb=1024; cas_contention_timeout_in_ms=1000; client_encryption_options=<REDACTED>; cluster_name=cassandra; column_index_size_in_kb=64; commit_failure_policy=stop; commitlog_directory=/var/lib/mesos/slave/slaves/20aac1e4-622d-461c-abb5-d99d9ba55ec7-S6/frameworks/20aac1e4-622d-461c-abb5-d99d9ba55ec7-0001/executors/node-2_9e70ce50-098d-4b87-85ed-c2f99ee0386a_executor/runs/64ea900c-1bfa-4393-b18c-447a7b2666b1/volume/commitlog; commitlog_segment_size_in_mb=32; commitlog_sync=periodic; commitlog_sync_period_in_ms=10000; compaction_large_partition_warning_threshold_mb=100; compaction_throughput_mb_per_sec=16; concurrent_counter_writes=16; concurrent_reads=16; concurrent_writes=32; counter_cache_save_period=7200; counter_cache_size_in_mb=null; counter_write_request_timeout_in_ms=5000; cross_node_timeout=false; data_file_directories=[/var/lib/mesos/slave/slaves/20aac1e4-622d-461c-abb5-d99d9ba55ec7-S6/frameworks/20aac1e4-622d-461c-abb5-d99d9ba55ec7-0001/executors/node-2_9e70ce50-098d-4b87-85ed-c2f99ee0386a_executor/runs/64ea900c-1bfa-4393-b18c-447a7b2666b1/volume/data]; disk_failure_policy=stop; dynamic_snitch_badness_threshold=0.1; dynamic_snitch_reset_interval_in_ms=600000; dynamic_snitch_update_interval_in_ms=100; enable_user_defined_functions=false; endpoint_snitch=GossipingPropertyFileSnitch; hinted_handoff_enabled=true; hinted_handoff_throttle_in_kb=1024; incremental_backups=false; index_summary_capacity_in_mb=null; index_summary_resize_interval_in_minutes=60; inter_dc_tcp_nodelay=false; internode_compression=all; key_cache_save_period=14400; key_cache_size_in_mb=null; listen_address=10.0.2.66; max_hint_window_in_ms=10800000; max_hints_delivery_threads=2; memtable_allocation_type=heap_buffers; native_transport_port=9042; num_tokens=256; partitioner=org.apache.cassandra.dht.Murmur3Partitioner; permissions_validity_in_ms=2000; range_request_timeout_in_ms=10000; read_request_timeout_in_ms=5000; request_scheduler=org.apache.cassandra.scheduler.NoScheduler; request_timeout_in_ms=10000; role_manager=CassandraRoleManager; roles_validity_in_ms=2000; row_cache_save_period=0; row_cache_size_in_mb=0; rpc_address=10.0.2.66; rpc_keepalive=true; rpc_port=9160; rpc_server_type=sync; saved_caches_directory=/var/lib/mesos/slave/slaves/20aac1e4-622d-461c-abb5-d99d9ba55ec7-S6/frameworks/20aac1e4-622d-461c-abb5-d99d9ba55ec7-0001/executors/node-2_9e70ce50-098d-4b87-85ed-c2f99ee0386a_executor/runs/64ea900c-1bfa-4393-b18c-447a7b2666b1/volume/saved_caches; seed_provider=[{class_name=com.mesosphere.dcos.cassandra.DcosSeedProvider, parameters=[{seeds_url=http://cassandra.marathon.mesos:9000/v1/seeds}]}]; server_encryption_options=<REDACTED>; snapshot_before_compaction=false; ssl_storage_port=7001; sstable_preemptive_open_interval_in_mb=50; start_native_transport=true; start_rpc=false; storage_port=7000; thrift_framed_transport_size_in_mb=15; tombstone_failure_threshold=100000; tombstone_warn_threshold=1000; tracetype_query_ttl=86400; tracetype_repair_ttl=604800; trickle_fsync=false; trickle_fsync_interval_in_kb=10240; truncate_request_timeout_in_ms=60000; windows_timer_interval=1; write_request_timeout_in_ms=2000]
INFO  18:59:37 DiskAccessMode 'auto' determined to be mmap, indexAccessMode is mmap
INFO  18:59:37 Global memtable on-heap threshold is enabled at 509MB
INFO  18:59:37 Global memtable off-heap threshold is enabled at 509MB
INFO  18:59:37 Unable to load cassandra-topology.properties; compatibility mode disabled
WARN  18:59:37 Only 28482 MB free across all data volumes. Consider adding more capacity to your cluster or removing obsolete snapshots
INFO  18:59:37 Retrieved response {"isSeed":false,"seeds":["10.0.1.121","10.0.1.123"]} from URL http://cassandra.marathon.mesos:9000/v1/seeds
INFO  18:59:37 Retrieved remote seeds [/10.0.1.121, /10.0.1.123]
INFO  18:59:37 Hostname: ip-10-0-2-66.us-west-2.compute.internal
INFO  18:59:37 JVM vendor/version: Java HotSpot(TM) 64-Bit Server VM/1.8.0_74
INFO  18:59:37 Heap size: 2136997888/2136997888
INFO  18:59:37 Code Cache Non-heap memory: init = 2555904(2496K) used = 3151552(3077K) committed = 3211264(3136K) max = 251658240(245760K)
INFO  18:59:37 Metaspace Non-heap memory: init = 0(0K) used = 13007080(12702K) committed = 13238272(12928K) max = -1(-1K)
INFO  18:59:37 Compressed Class Space Non-heap memory: init = 0(0K) used = 1637264(1598K) committed = 1703936(1664K) max = 1073741824(1048576K)
INFO  18:59:37 Par Eden Space Heap memory: init = 83886080(81920K) used = 62144072(60687K) committed = 83886080(81920K) max = 83886080(81920K)
INFO  18:59:37 Par Survivor Space Heap memory: init = 10485760(10240K) used = 0(0K) committed = 10485760(10240K) max = 10485760(10240K)
INFO  18:59:37 CMS Old Gen Heap memory: init = 2042626048(1994752K) used = 0(0K) committed = 2042626048(1994752K) max = 2042626048(1994752K)
INFO  18:59:37 Classpath: apache-cassandra-2.2.5/bin/../conf:apache-cassandra-2.2.5/bin/../build/classes/main:apache-cassandra-2.2.5/bin/../build/classes/thrift:apache-cassandra-2.2.5/bin/../lib/ST4-4.0.8.jar:apache-cassandra-2.2.5/bin/../lib/airline-0.6.jar:apache-cassandra-2.2.5/bin/../lib/antlr-runtime-3.5.2.jar:apache-cassandra-2.2.5/bin/../lib/apache-cassandra-2.2.5.jar:apache-cassandra-2.2.5/bin/../lib/apache-cassandra-clientutil-2.2.5.jar:apache-cassandra-2.2.5/bin/../lib/apache-cassandra-thrift-2.2.5.jar:apache-cassandra-2.2.5/bin/../lib/cassandra-driver-core-2.2.0-rc2-SNAPSHOT-20150617-shaded.jar:apache-cassandra-2.2.5/bin/../lib/commons-cli-1.1.jar:apache-cassandra-2.2.5/bin/../lib/commons-codec-1.2.jar:apache-cassandra-2.2.5/bin/../lib/commons-lang3-3.1.jar:apache-cassandra-2.2.5/bin/../lib/commons-math3-3.2.jar:apache-cassandra-2.2.5/bin/../lib/compress-lzf-0.8.4.jar:apache-cassandra-2.2.5/bin/../lib/concurrentlinkedhashmap-lru-1.4.jar:apache-cassandra-2.2.5/bin/../lib/crc32ex-0.1.1.jar:apache-cassandra-2.2.5/bin/../lib/disruptor-3.0.1.jar:apache-cassandra-2.2.5/bin/../lib/ecj-4.4.2.jar:apache-cassandra-2.2.5/bin/../lib/guava-16.0.jar:apache-cassandra-2.2.5/bin/../lib/high-scale-lib-1.0.6.jar:apache-cassandra-2.2.5/bin/../lib/jackson-core-asl-1.9.2.jar:apache-cassandra-2.2.5/bin/../lib/jackson-mapper-asl-1.9.2.jar:apache-cassandra-2.2.5/bin/../lib/jamm-0.3.0.jar:apache-cassandra-2.2.5/bin/../lib/javax.inject.jar:apache-cassandra-2.2.5/bin/../lib/jbcrypt-0.3m.jar:apache-cassandra-2.2.5/bin/../lib/jcl-over-slf4j-1.7.7.jar:apache-cassandra-2.2.5/bin/../lib/jna-4.0.0.jar:apache-cassandra-2.2.5/bin/../lib/joda-time-2.4.jar:apache-cassandra-2.2.5/bin/../lib/json-simple-1.1.jar:apache-cassandra-2.2.5/bin/../lib/libthrift-0.9.2.jar:apache-cassandra-2.2.5/bin/../lib/log4j-over-slf4j-1.7.7.jar:apache-cassandra-2.2.5/bin/../lib/logback-classic-1.1.3.jar:apache-cassandra-2.2.5/bin/../lib/logback-core-1.1.3.jar:apache-cassandra-2.2.5/bin/../lib/lz4-1.3.0.jar:apache-cassandra-2.2.5/bin/../lib/metrics-core-3.1.0.jar:apache-cassandra-2.2.5/bin/../lib/metrics-logback-3.1.0.jar:apache-cassandra-2.2.5/bin/../lib/metrics-statsd-common-4.1.2.jar:apache-cassandra-2.2.5/bin/../lib/metrics2-statsd-4.1.2.jar:apache-cassandra-2.2.5/bin/../lib/netty-all-4.0.23.Final.jar:apache-cassandra-2.2.5/bin/../lib/ohc-core-0.3.4.jar:apache-cassandra-2.2.5/bin/../lib/ohc-core-j8-0.3.4.jar:apache-cassandra-2.2.5/bin/../lib/reporter-config-base-3.0.0.jar:apache-cassandra-2.2.5/bin/../lib/reporter-config3-3.0.0.jar:apache-cassandra-2.2.5/bin/../lib/seedprovider-0.1.0.jar:apache-cassandra-2.2.5/bin/../lib/sigar-1.6.4.jar:apache-cassandra-2.2.5/bin/../lib/slf4j-api-1.7.7.jar:apache-cassandra-2.2.5/bin/../lib/snakeyaml-1.11.jar:apache-cassandra-2.2.5/bin/../lib/snappy-java-1.1.1.7.jar:apache-cassandra-2.2.5/bin/../lib/stream-2.5.2.jar:apache-cassandra-2.2.5/bin/../lib/super-csv-2.1.0.jar:apache-cassandra-2.2.5/bin/../lib/thrift-server-0.3.7.jar:apache-cassandra-2.2.5/bin/../lib/jsr223/*/*.jar:apache-cassandra-2.2.5/bin/../lib/jamm-0.3.0.jar
INFO  18:59:37 JVM Arguments: [-ea, -javaagent:apache-cassandra-2.2.5/bin/../lib/jamm-0.3.0.jar, -XX:+CMSClassUnloadingEnabled, -XX:+UseThreadPriorities, -XX:ThreadPriorityPolicy=42, -Xms2048M, -Xmx2048M, -Xmn100M, -XX:+HeapDumpOnOutOfMemoryError, -Xss256k, -XX:StringTableSize=1000003, -XX:+UseParNewGC, -XX:+UseConcMarkSweepGC, -XX:+CMSParallelRemarkEnabled, -XX:SurvivorRatio=8, -XX:MaxTenuringThreshold=1, -XX:CMSInitiatingOccupancyFraction=75, -XX:+UseCMSInitiatingOccupancyOnly, -XX:+UseTLAB, -XX:+PerfDisableSharedMem, -XX:CompileCommandFile=apache-cassandra-2.2.5/bin/../conf/hotspot_compiler, -XX:CMSWaitDuration=10000, -XX:+CMSParallelInitialMarkEnabled, -XX:+CMSEdenChunksRecordAlways, -XX:CMSWaitDuration=10000, -XX:+UseCondCardMark, -XX:+PrintGCDetails, -XX:+PrintGCDateStamps, -XX:+PrintHeapAtGC, -XX:+PrintTenuringDistribution, -XX:+PrintGCApplicationStoppedTime, -XX:+PrintPromotionFailure, -Xloggc:apache-cassandra-2.2.5/bin/../logs/gc.log, -XX:+UseGCLogFileRotation, -XX:NumberOfGCLogFiles=10, -XX:GCLogFileSize=10M, -Djava.net.preferIPv4Stack=true, -Dcassandra.jmx.local.port=7199, -XX:+DisableExplicitGC, -Djava.library.path=apache-cassandra-2.2.5/bin/../lib/sigar-bin, -Dcassandra.metricsReporterConfigFile=metrics-reporter-config.yaml, -Dlogback.configurationFile=logback.xml, -Dcassandra.logdir=apache-cassandra-2.2.5/bin/../logs, -Dcassandra.storagedir=apache-cassandra-2.2.5/bin/../data, -Dcassandra-foreground=yes]
INFO  18:59:38 JNA mlockall successful
WARN  18:59:38 jemalloc shared library could not be preloaded to speed up memory allocations
WARN  18:59:38 JMX is not enabled to receive remote connections. Please see cassandra-env.sh for more info.
INFO  18:59:38 Initializing SIGAR library
INFO  18:59:38 Checked OS settings and found them configured for optimal performance.
INFO  18:59:39 Initializing system.sstable_activity
INFO  18:59:41 Initializing key cache with capacity of 100 MBs.
INFO  18:59:41 Initializing row cache with capacity of 0 MBs
INFO  18:59:41 Initializing counter cache with capacity of 50 MBs
INFO  18:59:41 Scheduling counter cache save to every 7200 seconds (going to save all keys).
INFO  18:59:41 Initializing system.hints
INFO  18:59:41 Initializing system.compaction_history
INFO  18:59:41 Initializing system.peers
INFO  18:59:41 Initializing system.schema_columnfamilies
INFO  18:59:41 Initializing system.schema_functions
INFO  18:59:41 Initializing system.IndexInfo
INFO  18:59:41 Initializing system.schema_columns
INFO  18:59:41 Initializing system.schema_triggers
INFO  18:59:41 Initializing system.local
INFO  18:59:41 Initializing system.schema_usertypes
INFO  18:59:41 Initializing system.batchlog
INFO  18:59:41 Initializing system.available_ranges
INFO  18:59:42 Initializing system.schema_aggregates
INFO  18:59:42 Initializing system.paxos
INFO  18:59:42 Initializing system.peer_events
INFO  18:59:42 Initializing system.size_estimates
INFO  18:59:42 Initializing system.compactions_in_progress
INFO  18:59:42 Initializing system.schema_keyspaces
INFO  18:59:42 Initializing system.range_xfers
INFO  18:59:43 Initializing system_distributed.parent_repair_history
INFO  18:59:43 Initializing system_distributed.repair_history
INFO  18:59:43 Initializing system_auth.role_permissions
INFO  18:59:43 Initializing system_auth.resource_role_permissons_index
INFO  18:59:43 Initializing system_auth.roles
INFO  18:59:43 Initializing system_auth.role_members
INFO  18:59:43 Initializing system_traces.sessions
INFO  18:59:43 Initializing system_traces.events
INFO  18:59:43 Completed loading (104 ms; 7 keys) KeyCache cache
INFO  18:59:43 Replaying /var/lib/mesos/slave/slaves/20aac1e4-622d-461c-abb5-d99d9ba55ec7-S6/frameworks/20aac1e4-622d-461c-abb5-d99d9ba55ec7-0001/executors/node-2_9e70ce50-098d-4b87-85ed-c2f99ee0386a_executor/runs/64ea900c-1bfa-4393-b18c-447a7b2666b1/volume/commitlog/CommitLog-5-1462561114318.log, /var/lib/mesos/slave/slaves/20aac1e4-622d-461c-abb5-d99d9ba55ec7-S6/frameworks/20aac1e4-622d-461c-abb5-d99d9ba55ec7-0001/executors/node-2_9e70ce50-098d-4b87-85ed-c2f99ee0386a_executor/runs/64ea900c-1bfa-4393-b18c-447a7b2666b1/volume/commitlog/CommitLog-5-1462561114319.log
INFO  18:59:46 Log replay complete, 15 replayed mutations
INFO  18:59:46 Cassandra version: 2.2.5
INFO  18:59:46 Thrift API version: 20.1.0
INFO  18:59:46 CQL supported versions: 3.3.1 (default: 3.3.1)
INFO  18:59:47 Initializing index summary manager with a memory pool size of 101 MB and a resize interval of 60 minutes
INFO  18:59:47 Loading persisted ring state
INFO  18:59:47 Retrieved response {"isSeed":false,"seeds":["10.0.1.121","10.0.1.123"]} from URL http://cassandra.marathon.mesos:9000/v1/seeds
INFO  18:59:47 Retrieved remote seeds [/10.0.1.121, /10.0.1.123]
INFO  18:59:47 Starting Messaging Service on /10.0.2.66:7000 (eth0)
INFO  18:59:47 Retrieved response {"isSeed":false,"seeds":["10.0.1.121","10.0.1.123"]} from URL http://cassandra.marathon.mesos:9000/v1/seeds
INFO  18:59:47 Retrieved remote seeds [/10.0.1.121, /10.0.1.123]
INFO  18:59:47 Handshaking version with /10.0.1.123
INFO  18:59:47 Handshaking version with /10.0.1.121
INFO  18:59:52 Node /10.0.1.121 has restarted, now UP
INFO  18:59:52 Node /10.0.1.122 has restarted, now UP
INFO  18:59:52 InetAddress /10.0.1.122 is now DOWN
INFO  18:59:52 Node /10.0.1.123 has restarted, now UP
INFO  18:59:53 Starting up server gossip
INFO  18:59:54 Retrieved response {"isSeed":false,"seeds":["10.0.1.121","10.0.1.123"]} from URL http://cassandra.marathon.mesos:9000/v1/seeds
INFO  18:59:54 Retrieved remote seeds [/10.0.1.121, /10.0.1.123]
INFO  18:59:54 Retrieved response {"isSeed":false,"seeds":["10.0.1.121","10.0.1.123"]} from URL http://cassandra.marathon.mesos:9000/v1/seeds
INFO  18:59:54 Retrieved remote seeds [/10.0.1.121, /10.0.1.123]
INFO  18:59:54 Retrieved response {"isSeed":false,"seeds":["10.0.1.121","10.0.1.123"]} from URL http://cassandra.marathon.mesos:9000/v1/seeds
INFO  18:59:54 Retrieved remote seeds [/10.0.1.121, /10.0.1.123]
INFO  18:59:54 Retrieved response {"isSeed":false,"seeds":["10.0.1.121","10.0.1.123"]} from URL http://cassandra.marathon.mesos:9000/v1/seeds
INFO  18:59:54 Retrieved remote seeds [/10.0.1.121, /10.0.1.123]
WARN  18:59:54 Detected previous bootstrap failure; retrying
INFO  18:59:54 JOINING: waiting for ring information
INFO  18:59:54 JOINING: schema complete, ready to bootstrap
INFO  18:59:54 JOINING: waiting for pending range calculation
INFO  18:59:54 JOINING: calculation complete, ready to bootstrap
INFO  18:59:54 JOINING: getting bootstrap token
INFO  18:59:55 Handshaking version with /10.0.1.121
INFO  18:59:55 Node /10.0.1.121 is now part of the cluster
INFO  18:59:55 Node /10.0.1.121 state jump to NORMAL
INFO  18:59:56 JOINING: sleeping 30000 ms for pending range setup
WARN  18:59:56 Not marking nodes down due to local pause of 16496601209 > 5000000000
INFO  18:59:56 Updating topology for /10.0.1.121
INFO  18:59:56 Updating topology for /10.0.1.121
INFO  18:59:56 Node /10.0.1.122 is now part of the cluster
INFO  18:59:56 Node /10.0.1.122 state jump to shutdown
INFO  18:59:56 Updating topology for /10.0.1.122
INFO  18:59:56 Updating topology for /10.0.1.122
INFO  18:59:56 InetAddress /10.0.1.122 is now DOWN
INFO  18:59:56 Node /10.0.1.123 is now part of the cluster
INFO  18:59:56 Node /10.0.1.123 state jump to NORMAL
INFO  18:59:56 Updating topology for /10.0.1.123
INFO  18:59:56 Updating topology for /10.0.1.123
INFO  18:59:56 InetAddress /10.0.1.121 is now UP
INFO  18:59:56 InetAddress /10.0.1.121 is now UP
INFO  18:59:56 Handshaking version with /10.0.1.123
INFO  18:59:56 InetAddress /10.0.1.123 is now UP
INFO  19:00:26 JOINING: Starting to bootstrap...
Exception (java.lang.RuntimeException) encountered during startup: A node required to move the data consistently is down (/10.0.1.122). If you wish to move the data from a potentially inconsistent replica, restart the node with -Dcassandra.consistent.rangemovement=false
ERROR 19:00:26 Exception encountered during startup
java.lang.RuntimeException: A node required to move the data consistently is down (/10.0.1.122). If you wish to move the data from a potentially inconsistent replica, restart the node with -Dcassandra.consistent.rangemovement=false
    at org.apache.cassandra.dht.RangeStreamer.getAllRangesWithStrictSourcesFor(RangeStreamer.java:264) ~[apache-cassandra-2.2.5.jar:2.2.5]
    at org.apache.cassandra.dht.RangeStreamer.addRanges(RangeStreamer.java:147) ~[apache-cassandra-2.2.5.jar:2.2.5]
    at org.apache.cassandra.dht.BootStrapper.bootstrap(BootStrapper.java:82) ~[apache-cassandra-2.2.5.jar:2.2.5]
    at org.apache.cassandra.service.StorageService.bootstrap(StorageService.java:1230) ~[apache-cassandra-2.2.5.jar:2.2.5]
    at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:924) ~[apache-cassandra-2.2.5.jar:2.2.5]
    at org.apache.cassandra.service.StorageService.initServer(StorageService.java:709) ~[apache-cassandra-2.2.5.jar:2.2.5]
    at org.apache.cassandra.service.StorageService.initServer(StorageService.java:585) ~[apache-cassandra-2.2.5.jar:2.2.5]
    at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:300) [apache-cassandra-2.2.5.jar:2.2.5]
    at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:516) [apache-cassandra-2.2.5.jar:2.2.5]
    at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:625) [apache-cassandra-2.2.5.jar:2.2.5]
WARN  19:00:26 No local state or state is in silent shutdown, not announcing shutdown

ConfigurationManager tests

Improve testing for the ConfigurationManager class. Tests may mock external interfaces where necessary, but should use a Curator TestingServer for the ZooKeeper interface rather than mocking the interactions with ZK.
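
A rough sketch of the setup described above, wiring a Curator TestingServer into a JUnit test; constructing the actual ConfigurationManager against the client is left as a placeholder:

import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.retry.RetryOneTime;
import org.apache.curator.test.TestingServer;
import org.junit.After;
import org.junit.Before;

public class ConfigurationManagerTest {
    private TestingServer zk;
    private CuratorFramework client;

    @Before
    public void before() throws Exception {
        zk = new TestingServer(true); // in-process ZooKeeper, real wire protocol
        client = CuratorFrameworkFactory.newClient(
            zk.getConnectString(), new RetryOneTime(100));
        client.start();
        // Build the ConfigurationManager under test against 'client' here (placeholder).
    }

    @After
    public void after() throws Exception {
        client.close();
        zk.close();
    }
}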

Repair endpoint starts cluster tasks on nodes not running Cassandra

We are running Aurora and Cassandra frameworks in the same Mesos cluster.

I tried to start a repair by using:

curl -H 'Content-Type: application/json' \
     -X PUT -d '{ "nodes" : [ "*" ] }' \
     http://appdocker968-sjc1.prod.uber.internal:5287/v1/repair/start

It starts accepting offers from the non-Cassandra machines, launches the task, and then fails. This leaves dynamically reserved resources on all of the Aurora machines on which it has attempted to launch (the role of the framework is cassandra-cstar-token).
Excerpt from the http://mesos-master:5050/state endpoint:

{
  id: "ec1f56cf-a66c-48cd-880c-020b430b0d1a-S4",
  pid: "slave(1)@10.28.20.29:5051",
  hostname: "compute131-sjc1.prod.uber.internal",
  registered_time: 1466724399.95873,
  reregistered_time: 1466724400.11227,
  resources: {
    cpus: 24,
    disk: 1665158,
    mem: 120635,
    ports: "[31000-32000]"
  },
  used_resources: {
    cpus: 5.75,
    disk: 6144,
    mem: 12672,
    ports: "[31439-31439, 31583-31583, 31755-31755, 31787-31787]"
  },
  offered_resources: {
    cpus: 16.25,
    disk: 1659014,
    mem: 107451,
    ports: "[31000-31438, 31440-31582, 31584-31754, 31756-31786, 31788-32000]"
  },
  reserved_resources: {
    cassandra-cstar-token: {
      cpus: 1,
      disk: 0,
      mem: 256
    }
  },
  unreserved_resources: {
    cpus: 22,
    disk: 1665158,
    mem: 120123,
    ports: "[31000-32000]"
  },
  attributes: {
    host: "compute131-sjc1",
    kernel: 3.18,
    pipeline: "us1",
    pod: "c",
    rack: 337
  },
  active: true,
  version: "0.28.2-22"
}

Is this a bug caused by passing null as the agentsToColocate here: https://github.com/mesosphere/dcos-cassandra-service/blob/master/cassandra-scheduler/src/main/java/com/mesosphere/dcos/cassandra/scheduler/offer/ClusterTaskOfferRequirementProvider.java#L108?

I think it should colocate the cluster repair task on the nodes running the Cassandra tasks.

Framework assumes persistent volume is created even if creation failed and keeps on looking for it

This bug is similar to issue #118 but now I have a deterministic way of reproducing it.

  1. Upgrade Mesos to 1.0
  2. Launch fresh instance of the scheduler
  3. Scheduler sends RESERVE, CREATE and LAUNCH operations.
  4. Creation of persistent volume fails because the principal is not set in DiskInfo.Persistence (https://issues.apache.org/jira/browse/MESOS-5005)
  5. Scheduler assumes that the volume was created, stores the persistent volume ID in zookeeper and keeps trying to find it in subsequent offers.

The correct behavior is that it should keep retrying to create the volume if it does not see the persistent volume in subsequent offers.
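
For illustration, a minimal sketch of the detection step implied by that behavior: scan incoming offers for a disk resource carrying the expected persistence ID, and treat its absence as "the volume was never created" so the RESERVE/CREATE can be retried (the retry wiring itself is omitted):

import org.apache.mesos.Protos.Offer;
import org.apache.mesos.Protos.Resource;

// Does this offer contain the persistent volume we believe we created?
static boolean hasVolume(Offer offer, String persistenceId) {
    for (Resource resource : offer.getResourcesList()) {
        if (resource.hasDisk()
                && resource.getDisk().hasPersistence()
                && persistenceId.equals(resource.getDisk().getPersistence().getId())) {
            return true;
        }
    }
    return false;
}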

IdentityManager tests

Improve tests for the IdentityManager class. Tests should use a Curator TestingServer rather than an in-memory persistence provider.

TasksResource tests

Implement tests for the TasksResource class. These tests should use a live http client and application.

Bug: "Unable to find a volume with persistence id" while creating a new cluster

I am getting this error almost every time when I try to start a new cluster:

INFO  [2016-05-04 03:55:56,117] org.apache.mesos.offer.OfferEvaluator: Found Offer meeting placement constraints: id {
  value: "448075ee-ee69-4fba-9d9d-9157746b3841-O644"
}
framework_id {
  value: "7a570f10-67bd-464b-be11-2696d6c2a5c6-0000"
}
slave_id {
  value: "3db86480-98e4-41c7-a341-195f8cc10cef-S23"
}
hostname: "compute23-sjc1.prod.uber.internal"
resources {
  name: "cpus"
  type: SCALAR
  scalar {
    value: 24.0
  }
  role: "*"
}
resources {
  name: "mem"
  type: SCALAR
  scalar {
    value: 120635.0
  }
  role: "*"
}
resources {
  name: "disk"
  type: SCALAR
  scalar {
    value: 995927.0
  }
  role: "*"
}
resources {
  name: "ports"
  type: RANGES
  ranges {
    range {
      begin: 31000
      end: 32000
    }
  }
  role: "*"
}
attributes {
  name: "host"
  type: TEXT
  text {
    value: "compute23-sjc1"
  }
}
attributes {
  name: "kernel"
  type: SCALAR
  scalar {
    value: 3.18
  }
}
attributes {
  name: "pod"
  type: TEXT
  text {
    value: "d"
  }
}
attributes {
  name: "rack"
  type: SCALAR
  scalar {
    value: 1010.0
  }
}
url {
  scheme: "http"
  address {
    hostname: "compute23-sjc1.prod.uber.internal"
    ip: "10.163.31.21"
    port: 5051
  }
  path: "/slave(1)"
}

INFO  [2016-05-04 03:55:56,117] org.apache.mesos.offer.ResourceUtils: Selected disk: type = ROOTresource = name: "disk"
type: SCALAR
scalar {
  value: 995927.0
}
role: "*"

INFO  [2016-05-04 03:55:56,117] org.apache.mesos.offer.ResourceUtils: Selected disk: type = ROOTresource = name: "disk"
type: SCALAR
scalar {
  value: 995927.0
}
role: "*"

ERROR [2016-05-04 03:55:56,117] org.apache.mesos.offer.OfferEvaluator: Unable to find a volume with persistence id: c9bd2df4-c7ab-4bce-8540-b578c6bb2885
INFO  [2016-05-04 03:55:56,117] org.apache.mesos.offer.OfferEvaluator: VolumeMode is EXISTING and VolumeType is ROOT hasExpectedVolumes is false
INFO  [2016-05-04 03:55:56,117] org.apache.mesos.offer.OfferEvaluator: EnoughCPU: true EnoughMem: true EnoughDisk: true EnoughPorts: true HasExpectedVolumes: false
WARN  [2016-05-04 03:55:56,117] org.apache.mesos.offer.OfferEvaluator: No Offers found meeting Resource constraints.
WARN  [2016-05-04 03:55:56,117] org.apache.mesos.offer.OfferEvaluator: No acceptable offers due to insufficient resources.

Unable to Set Up Cassandra Cluster

Followed the documentation and installation steps.
After installing the package, when I check the Cassandra connection, this is what I get:

dcos cassandra connection
{"address":[],"dns":[]}

stdout from my Marathon task:
INFO [2016-09-29 21:53:53,562] org.eclipse.jetty.setuid.SetUIDListener: Opened DC/OS Cassandra Service@1da4b6b3{HTTP/1.1}{0.0.0.0:9000}
INFO [2016-09-29 21:53:53,565] org.eclipse.jetty.server.Server: jetty-9.2.14.v20151106
INFO [2016-09-29 21:53:54,372] org.apache.mesos.scheduler.SchedulerDriverFactory: Creating unauthenticated MesosSchedulerDriver for scheduler[com.mesosphere.dcos.cassandra.scheduler.CassandraScheduler@4393593c],
frameworkInfo[user: "root"
name: "cassandra"
id {
value: "dae7eb1e-00d3-4417-ad5a-9a1135abc00a-0001"
}
failover_timeout: 604800.0
checkpoint: true
role: "cassandra-role"
principal: "cassandra-principal"
], masterUrl[zk://master.mesos:2181/mesos]
INFO [2016-09-29 21:53:56,368] com.mesosphere.dcos.cassandra.scheduler.CassandraScheduler: Starting driver...
INFO [2016-09-29 21:53:56,369] com.mesosphere.dcos.cassandra.scheduler.CassandraScheduler: Driver started with status: DRIVER_RUNNING
ERROR [2016-09-29 21:53:56,662] com.mesosphere.dcos.cassandra.scheduler.CassandraScheduler: Scheduler driver error: Framework has been removed
INFO [2016-09-29 21:54:07,365] io.dropwizard.jersey.DropwizardResourceConfig: The following paths were found for the configured resources:
{"name":"cassandra","id":"","version":"1.0.0","user":"root","cluster":"cassandra-cluster","role":"cassandra-role","principal":"cassandra-principal","failover_timeout_s":604800,"secret":"","checkpoint":true}

connection: {"address":[],"dns":[]}

plan: {"phases":[],"errors":[],"status":"COMPLETE"}

Monitoring with nodetool

My Cassandra deployments (and, I suspect, a lot of other people's) are heavily monitored by examining the output of regularly run nodetool commands (compactionstats, gossipinfo, tpstats, status, etc.) and other JMX-based tooling. It's unclear how to reproduce that in this framework, given that JMX is not exposed outside the node containers.

Advice on this? Am I missing something? Anyone working on an approach?

Unable to deploy cassandra on dc/os 1.8.2

The same error is also given during a DSE deploy.

Exception in logs is:

Error injecting constructor, org.apache.mesos.config.ConfigStoreException: Failed to retrieve current target configuration from path '/dcos-service-datastax/ConfigTarget'
at com.mesosphere.dcos.cassandra.scheduler.seeds.SeedsManager.(SeedsManager.java:105)
at com.mesosphere.dcos.cassandra.scheduler.SchedulerModule.configure(SchedulerModule.java:204)
while locating com.mesosphere.dcos.cassandra.scheduler.seeds.SeedsManager
1 error
at com.google.inject.internal.Errors.throwCreationExceptionIfErrorsExist(Errors.java:466)
at com.google.inject.internal.InternalInjectorCreator.injectDynamically(InternalInjectorCreator.java:184)
at com.google.inject.internal.InternalInjectorCreator.build(InternalInjectorCreator.java:110)
at com.google.inject.Guice.createInjector(Guice.java:96)
at com.google.inject.Guice.createInjector(Guice.java:73)
at com.google.inject.Guice.createInjector(Guice.java:62)
at com.mesosphere.dcos.cassandra.scheduler.Main.run(Main.java:67)
at com.mesosphere.dcos.cassandra.scheduler.Main.run(Main.java:23)
at io.dropwizard.cli.EnvironmentCommand.run(EnvironmentCommand.java:40)
at io.dropwizard.cli.ConfiguredCommand.run(ConfiguredCommand.java:77)
at io.dropwizard.cli.Cli.run(Cli.java:70)
at io.dropwizard.Application.run(Application.java:80)
at com.mesosphere.dcos.cassandra.scheduler.Main.main(Main.java:28)
Caused by: org.apache.mesos.config.ConfigStoreException: Failed to retrieve current target configuration from path '/dcos-service-datastax/ConfigTarget'
at org.apache.mesos.curator.CuratorConfigStore.getTargetConfig(CuratorConfigStore.java:180)
at com.mesosphere.dcos.cassandra.scheduler.config.DefaultConfigurationManager.getTargetName(DefaultConfigurationManager.java:204)
at com.mesosphere.dcos.cassandra.scheduler.config.DefaultConfigurationManager.getTargetConfig(DefaultConfigurationManager.java:213)
at com.mesosphere.dcos.cassandra.scheduler.seeds.SeedsManager.(SeedsManager.java:139)
at com.mesosphere.dcos.cassandra.scheduler.seeds.SeedsManager$$FastClassByGuice$$91cd171b.newInstance()
at com.google.inject.internal.cglib.reflect.$FastConstructor.newInstance(FastConstructor.java:40)
at com.google.inject.internal.DefaultConstructionProxyFactory$1.newInstance(DefaultConstructionProxyFactory.java:61)
at com.google.inject.internal.ConstructorInjector.provision(ConstructorInjector.java:105)
at com.google.inject.internal.ConstructorInjector.construct(ConstructorInjector.java:85)
at com.google.inject.internal.ConstructorBindingImpl$Factory.get(ConstructorBindingImpl.java:267)
at com.google.inject.internal.ProviderToInternalFactoryAdapter$1.call(ProviderToInternalFactoryAdapter.java:46)
at com.google.inject.internal.InjectorImpl.callInContext(InjectorImpl.java:1103)
at com.google.inject.internal.ProviderToInternalFactoryAdapter.get(ProviderToInternalFactoryAdapter.java:40)
at com.google.inject.internal.SingletonScope$1.get(SingletonScope.java:145)
at com.google.inject.internal.InternalFactoryToProviderAdapter.get(InternalFactoryToProviderAdapter.java:41)
at com.google.inject.internal.InternalInjectorCreator$1.call(InternalInjectorCreator.java:205)
at com.google.inject.internal.InternalInjectorCreator$1.call(InternalInjectorCreator.java:199)
at com.google.inject.internal.InjectorImpl.callInContext(InjectorImpl.java:1092)
at com.google.inject.internal.InternalInjectorCreator.loadEagerSingletons(InternalInjectorCreator.java:199)
at com.google.inject.internal.InternalInjectorCreator.injectDynamically(InternalInjectorCreator.java:180)
... 11 more

Cancel in-progress backup / restore

Backups / restores execute serially, which means they could take hours or days. It would be useful to allow operators to cancel an in-progress backup / restore.

Incorrect order in which CassandraDaemon nodes are started when cluster size is > 10

I started a 20-node Cassandra cluster and the nodes start in the order:
node-0, node-1, node-10, node-11, ..., node-19, node-2, node-3, ..., node-9

This is because we internally sort the nodes by name. We can either:
a) change the sorting code to sort numerically (see the sketch below), or
b) change the node names to node-000, node-001, node-002, etc. (assuming no one is going to start a cluster with more than 1000 nodes).

@kow3ns @mohitsoni What do you guys think?
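
For illustration, a minimal sketch of option (a): compare node names by their numeric suffix instead of lexicographically (illustrative only, not the project's code):

import java.util.Comparator;

// Sorts "node-2" before "node-10" by comparing the number after the last '-'.
// Usage: nodeNames.sort(BY_NODE_INDEX);
static final Comparator<String> BY_NODE_INDEX =
    Comparator.comparingInt((String name) ->
        Integer.parseInt(name.substring(name.lastIndexOf('-') + 1)));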
