
dcos-cassandra-service's People

Contributors

aaronjwood, dylanwilder, eastlondoner, fcuny, gabrielhartmann, ianpward, jay-zhuang, joel-hamill, jsrodman, keithchambers, kensipe, kow3ns, loren, mgummelt, mohitsoni, mpereira, mrbrowning, nickbp, nimavaziri, sascala, ssk2, susanxhuynh, szhou1234, triclambert, unclebarney, varungup90, verma7, xuteng2000, zhitaoli, zhiyanshao


dcos-cassandra-service's Issues

Add support for graphite metrics collector

Support for statsd metrics collection was recently added in #43 (bf9a4f4).

However, at Uber we are planning to use Graphite. Hence, I would like to generalize the metrics collection to support either statsd or Graphite, selected via environment variables.
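
For illustration, a minimal sketch of what environment-variable-based selection could look like. The variable names (METRICS_BACKEND, METRICS_HOST, METRICS_PORT) and the enum are hypothetical, not existing configuration:

import java.util.Map;

public final class MetricsEnv {
    public enum Backend { NONE, STATSD, GRAPHITE }

    // Hypothetical: read the desired backend from the environment; default to no metrics.
    public static Backend backend() {
        Map<String, String> env = System.getenv();
        switch (env.getOrDefault("METRICS_BACKEND", "none").toLowerCase()) {
            case "statsd":   return Backend.STATSD;
            case "graphite": return Backend.GRAPHITE;
            default:         return Backend.NONE;
        }
    }

    public static String host() {
        return System.getenv().getOrDefault("METRICS_HOST", "localhost");
    }

    public static int port() {
        return Integer.parseInt(System.getenv().getOrDefault("METRICS_PORT", "8125"));
    }
}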

replacement task was lost during rebooting machine

We have an in-house 5-node Cassandra cluster under Mesos management. After I rebooted one of the nodes, the node came back, but the Cassandra task did not. The problem is also reproducible on a 2-node cluster.

Looking into the code, it seems the problem is in CassandraRepairScheduler.java. See the code below: we change the task state before evaluating it against the offer, so if the evaluation fails the task is no longer in the 'terminal' state.

resourceOffers(....)
{
    ...
    if (terminatedOption.isPresent()) {
        try {
            CassandraTask terminated = terminatedOption.get();
            terminated = cassandraTasks.replaceTask(terminated); // <<<< this will remove the 'terminal' state
            OfferRequirement offerReq =
                offerRequirementProvider.getReplacementOfferRequirement(
                    terminated.toProto());
            OfferEvaluator offerEvaluator = new OfferEvaluator(offerReq);
            List recommendations = offerEvaluator.evaluate(offers);
            LOGGER.debug("Got recommendations: {} for terminated task: {}",
                recommendations,
                terminated.getId());
            acceptedOffers = offerAccepter.accept(driver, recommendations);
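
For illustration, a minimal sketch of the reordering suggested above, reusing the objects from the excerpt: evaluate the offers first, and only replace the task (dropping its 'terminal' state) once there is something to accept. This is a sketch of the idea, not the project's actual fix:

if (terminatedOption.isPresent()) {
    CassandraTask terminated = terminatedOption.get();
    OfferRequirement offerReq =
        offerRequirementProvider.getReplacementOfferRequirement(
            terminated.toProto());
    OfferEvaluator offerEvaluator = new OfferEvaluator(offerReq);
    List recommendations = offerEvaluator.evaluate(offers);
    if (!recommendations.isEmpty()) {
        // Only now clear the 'terminal' state; a failed evaluation leaves the
        // task eligible for replacement on the next offer cycle.
        terminated = cassandraTasks.replaceTask(terminated);
        acceptedOffers = offerAccepter.accept(driver, recommendations);
    }
}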

IdentityResource tests

Implement tests for the IdentityResource. Tests should use a live application and http client.
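
A rough sketch of the kind of test meant here, using Dropwizard's JUnit support. The configuration class name (CassandraSchedulerConfiguration), the scheduler.yml test resource, and the /v1/framework path are assumptions for illustration only:

import io.dropwizard.testing.ResourceHelpers;
import io.dropwizard.testing.junit.DropwizardAppRule;
import org.junit.ClassRule;
import org.junit.Test;
import javax.ws.rs.client.Client;
import javax.ws.rs.client.ClientBuilder;
import javax.ws.rs.core.Response;
import static org.junit.Assert.assertEquals;

public class IdentityResourceIT {
    // Boots the real scheduler application once for the whole test class.
    @ClassRule
    public static final DropwizardAppRule<CassandraSchedulerConfiguration> APP =
        new DropwizardAppRule<>(Main.class, ResourceHelpers.resourceFilePath("scheduler.yml"));

    @Test
    public void identityIsServedOverHttp() {
        Client client = ClientBuilder.newClient();
        Response response = client
            .target("http://localhost:" + APP.getLocalPort() + "/v1/framework") // assumed path
            .request()
            .get();
        assertEquals(200, response.getStatus());
    }
}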

add periodic task reconciliation

The current Cassandra scheduler only kicks off the reconciler at start time. We should also have periodic task reconciliation, since this improves the availability of the system.
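
For illustration, a minimal sketch of periodic reconciliation using the standard SchedulerDriver.reconcileTasks call; the 10-minute interval and the wrapper class are arbitrary choices, not the project's design:

import java.util.Collections;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import org.apache.mesos.Protos;
import org.apache.mesos.SchedulerDriver;

public final class PeriodicReconciler {
    private final ScheduledExecutorService executor =
        Executors.newSingleThreadScheduledExecutor();

    // Ask the master for the state of all known tasks every 10 minutes.
    // Passing an empty status list requests implicit reconciliation.
    public void start(final SchedulerDriver driver) {
        executor.scheduleAtFixedRate(
            () -> driver.reconcileTasks(Collections.<Protos.TaskStatus>emptyList()),
            10, 10, TimeUnit.MINUTES);
    }
}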

Mesosphere maven repos throw a 403

http://downloads.mesosphere.com/maven-snapshot
http://downloads.mesosphere.com/maven

These URLs both give a 403. I'm unable to grab all of the dependencies needed with Gradle for the project because of this. Compilation fails when running ./gradlew clean build:

:cassandra-commons:compileJava
/Users/user/Code/src/dcos-cassandra-service/cassandra-commons/src/generated/main/java/com/mesosphere/dcos/cassandra/common/CassandraProtos.java:2291: error: cannot find symbol
      com.google.protobuf.GeneratedMessageV3 implements
                         ^
  symbol:   class GeneratedMessageV3
  location: package com.google.protobuf
/Users/user/Code/src/dcos-cassandra-service/cassandra-commons/src/generated/main/java/com/mesosphere/dcos/cassandra/common/CassandraProtos.java:848: error: cannot find symbol
      com.google.protobuf.GeneratedMessageV3 implements
                         ^
  symbol:   class GeneratedMessageV3
  location: package com.google.protobuf
/Users/user/Code/src/dcos-cassandra-service/cassandra-commons/src/generated/main/java/com/mesosphere/dcos/cassandra/common/CassandraProtos.java:1489: error: cannot find symbol
      com.google.protobuf.GeneratedMessageV3 implements
                         ^
  symbol:   class GeneratedMessageV3
  location: package com.google.protobuf
/Users/user/Code/src/dcos-cassandra-service/cassandra-commons/src/generated/main/java/com/mesosphere/dcos/cassandra/common/CassandraProtos.java:6904: error: package com.google.protobuf.GeneratedMessageV3 does not exist
    com.google.protobuf.GeneratedMessageV3.FieldAccessorTable
                                          ^
/Users/user/Code/src/dcos-cassandra-service/cassandra-commons/src/generated/main/java/com/mesosphere/dcos/cassandra/common/CassandraProtos.java:6909: error: package com.google.protobuf.GeneratedMessageV3 does not exist
    com.google.protobuf.GeneratedMessageV3.FieldAccessorTable
                                          ^
/Users/user/Code/src/dcos-cassandra-service/cassandra-commons/src/generated/main/java/com/mesosphere/dcos/cassandra/common/CassandraProtos.java:6914: error: package com.google.protobuf.GeneratedMessageV3 does not exist
    com.google.protobuf.GeneratedMessageV3.FieldAccessorTable
                                          ^
/Users/user/Code/src/dcos-cassandra-service/cassandra-commons/src/generated/main/java/com/mesosphere/dcos/cassandra/common/CassandraProtos.java:6919: error: package com.google.protobuf.GeneratedMessageV3 does not exist
    com.google.protobuf.GeneratedMessageV3.FieldAccessorTable
                                          ^
/Users/user/Code/src/dcos-cassandra-service/cassandra-commons/src/generated/main/java/com/mesosphere/dcos/cassandra/common/CassandraProtos.java:6924: error: package com.google.protobuf.GeneratedMessageV3 does not exist
    com.google.protobuf.GeneratedMessageV3.FieldAccessorTable
                                          ^
/Users/user/Code/src/dcos-cassandra-service/cassandra-commons/src/generated/main/java/com/mesosphere/dcos/cassandra/common/CassandraProtos.java:2295: error: package com.google.protobuf.GeneratedMessageV3 does not exist
    private CassandraConfig(com.google.protobuf.GeneratedMessageV3.Builder<?> builder) {
                                                                  ^
/Users/user/Code/src/dcos-cassandra-service/cassandra-commons/src/generated/main/java/com/mesosphere/dcos/cassandra/common/CassandraProtos.java:2435: error: package com.google.protobuf.GeneratedMessageV3 does not exist
    protected com.google.protobuf.GeneratedMessageV3.FieldAccessorTable

etc. etc. etc.

It looks like this has happened before as well: mesosphere/chaos#23

Task failure leads to spinning up more tasks than required due to a bug in ID generation/persistence

INFO  [2016-02-16 14:39:26,384] com.mesosphere.dcos.cassandra.scheduler.CassandraScheduler: Received status update for taskId=server-1_e8c8c0db-edf5-4ec5-8ef7-c26fe8d58486 state=TASK_KILLED source=SOURCE_EXECUTOR reason=REASON_COMMAND_EXECUTOR_FAILED message='Cassandra Daemon was killed by signal 15'
INFO  [2016-02-16 14:39:26,394] com.mesosphere.dcos.cassandra.scheduler.plan.CassandraDaemonBlock: Reallocating task server-3_a457dca3-cf64-45ae-8ec3-1d7e7670aa21 for block 1
INFO  [2016-02-16 14:39:27,047] com.mesosphere.dcos.cassandra.scheduler.CassandraScheduler: Received 1 offers
WARN  [2016-02-16 14:39:27,047] org.apache.mesos.scheduler.plan.DefaultPlanScheduler: No block to process.
INFO  [2016-02-16 14:39:27,048] com.mesosphere.dcos.cassandra.scheduler.CassandraRepairScheduler: Terminated tasks size: 1
INFO  [2016-02-16 14:39:27,048] com.mesosphere.dcos.cassandra.scheduler.offer.PersistentOfferRequirementProvider: Getting replacement requirement for task: server-1_e8c8c0db-edf5-4ec5-8ef7-c26fe8d58486
INFO  [2016-02-16 14:39:27,048] com.mesosphere.dcos.cassandra.scheduler.offer.PersistentOfferRequirementProvider: Task has a volume, taskId: server-1_e8c8c0db-edf5-4ec5-8ef7-c26fe8d58486, reusing existing requirement
INFO  [2016-02-16 14:39:27,048] com.mesosphere.dcos.cassandra.scheduler.offer.PersistentOfferRequirementProvider: Getting existing OfferRequirement for task: name: "server-1"

Repair job gets assigned to a node other than the one that needs to be repaired

The consequence is that the executor is unable to connect to the Cassandra daemon process because it uses 127.0.0.1. Looks like there are two bugs in the scheduler:

  1. Repair job should use NodePlacementStrategy
  2. NodePlacementStrategy doesn't return the expected nodes to avoid.

I'm going to submit a pull request to fix this but let me know if I misunderstand.

Replacing a failed task should generate a new UUID suffix

When replacing a failed task, the taskId should reuse the 'server-N' prefix but generate a new UUID suffix. Reusing the old UUID suffix works, but as a user one cannot browse the logs from the failed task, since Mesos cannot distinguish between the old and new tasks (the taskId is the same).
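
For illustration, a minimal sketch of the suggested scheme (keep the 'server-N' prefix, mint a fresh UUID suffix for the replacement); the helper is hypothetical:

import java.util.UUID;

// Hypothetical helper: "server-1_<old-uuid>" becomes "server-1_<new-uuid>".
static String replacementTaskId(String failedTaskId) {
    String prefix = failedTaskId.substring(0, failedTaskId.indexOf('_')); // e.g. "server-1"
    return prefix + "_" + UUID.randomUUID();
}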

Missing configuration variables

Hi,

We are using DC/OS and have noticed that some config variables are not editable: several of them are hardcoded in scheduler.yml.

For example, I want to configure a Cassandra node to enable UDFs and I haven't found a way to do it.

Is there any reason not to allow this?
Cheers

Error running cqlsh

I followed these instructions from the docs:

Log in to an agent inside the DC/OS cluster that is running a Cassandra node, then run the following to launch a Docker container:
$ docker run --net=host -it mohitsoni/alpine-cqlsh:2.2.5 /bin/sh

And get this error:

Connection error: ('Unable to connect to any servers', {'127.0.0.1': error(111, "Tried connecting to [('127.0.0.1', 9042)]. Last error: Connection refused")})

Unable to Install Package Cassandra

While trying to install Cassandra on DC/OS 1.8 in EC2, we get the following error.

fetch 'https://downloads.mesosphere.com/cassandra/assets/apache-cassandra-3.0.8-bin-dcos.tar.gz': Error downloading resource, received HTTP return code 403

We tried installing the DSE 5 package, which installed successfully, and the CLI package installed as well. However, when we try to query the Cassandra connection via
dcos cassandra connection, it throws the following exception:

/service/cassandra/v1/nodes/list failed: 500 Internal Server Error
2016/09/29 19:07:08 - Did you provide the correct service name? Currently using 'cassandra', specify a different name with '--name='.
2016/09/29 19:07:08 - Was the service recently installed? It may still be initializing, Wait a bit and try again.

Expose all cassandra parameters

Currently, a few Cassandra parameters in cassandra.yaml are not exposed through CASSANDRA_* variables. It would be helpful to expose all of them.

uuid is printed in some encoding in the log

The uuid below is printed in some unreadable encoding. Is that the task uuid?

INFO [2016-03-24 21:37:49,689] org.apache.mesos.scheduler.plan.StageManager: Updated current block with status: block = org.apache.mesos.scheduler.plan.ReconciliationBlock@5fcd35d8,status = task_id {
  value: "node-1_c6e02992-6eb9-4d23-bc8a-18b346530096"
}
state: TASK_RUNNING
data: "\b\001\020\000"
slave_id {
  value: "0ceb1be3-4b36-4b6a-9b13-ebf3803e0118-S1"
}
timestamp: 1.458855281518391E9
executor_id {
  value: "node-1_c6e02992-6eb9-4d23-bc8a-18b346530096_executor"
}
source: SOURCE_EXECUTOR
uuid: "\353%\232\313f\002Ao\257\330Q\310R\246\261<"
container_status {
  network_infos {
    ip_address: "192.168.3.22"
    ip_addresses {
      ip_address: "192.168.3.22"
    }
  }
}
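
The uuid field of a Mesos TaskStatus is declared as bytes; it is the status update's 16-byte identifier (used for acknowledgements), not the task ID, which is why the text dump shows escaped bytes. A small sketch of decoding it for readable logging:

import java.nio.ByteBuffer;
import java.util.UUID;
import org.apache.mesos.Protos.TaskStatus;

// Turn the 16 raw bytes of status.getUuid() into a readable java.util.UUID.
static UUID statusUuid(TaskStatus status) {
    ByteBuffer buf = status.getUuid().asReadOnlyByteBuffer();
    return new UUID(buf.getLong(), buf.getLong());
}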

Allow Java agent override

We need a way to inject a Java agent (a JMX exporter) and a custom jar file so that we can extract metrics from the Cassandra cluster. Currently, I believe overriding JAVA_OPTS only affects the scheduler, not each executor instance.

ConfigurationResource tests

Implement tests for ConfigurationResource. These tests should use a live server and http client and not mock the network connection.

Node replacement failed

I attempted to replace a node and got the following:

cat: /etc/ld.so.conf.d/*.conf: No such file or directory
Java HotSpot(TM) 64-Bit Server VM warning: Cannot open file apache-cassandra-2.2.5/bin/../logs/gc.log due to No such file or directory

java.lang.RuntimeException: A node required to move the data consistently is down (/10.0.1.122). If you wish to move the data from a potentially inconsistent replica, restart the node with -Dcassandra.consistent.rangemovement=false
    at org.apache.cassandra.dht.RangeStreamer.getAllRangesWithStrictSourcesFor(RangeStreamer.java:264)
    at org.apache.cassandra.dht.RangeStreamer.addRanges(RangeStreamer.java:147)
    at org.apache.cassandra.dht.BootStrapper.bootstrap(BootStrapper.java:82)
    at org.apache.cassandra.service.StorageService.bootstrap(StorageService.java:1230)
    at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:924)
    at org.apache.cassandra.service.StorageService.initServer(StorageService.java:709)
    at org.apache.cassandra.service.StorageService.initServer(StorageService.java:585)
    at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:300)
    at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:516)
    at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:625)

and:

CompilerOracle: inline org/apache/cassandra/db/AbstractNativeCell.compareTo (Lorg/apache/cassandra/db/composites/Composite;)I
CompilerOracle: inline org/apache/cassandra/db/composites/AbstractSimpleCellNameType.compareUnsigned (Lorg/apache/cassandra/db/composites/Composite;Lorg/apache/cassandra/db/composites/Composite;)I
CompilerOracle: inline org/apache/cassandra/io/util/Memory.checkBounds (JJ)V
CompilerOracle: inline org/apache/cassandra/io/util/SafeMemory.checkBounds (JJ)V
CompilerOracle: inline org/apache/cassandra/utils/AsymmetricOrdering.selectBoundary (Lorg/apache/cassandra/utils/AsymmetricOrdering/Op;II)I
CompilerOracle: inline org/apache/cassandra/utils/AsymmetricOrdering.strictnessOfLessThan (Lorg/apache/cassandra/utils/AsymmetricOrdering/Op;)I
CompilerOracle: inline org/apache/cassandra/utils/ByteBufferUtil.compare (Ljava/nio/ByteBuffer;[B)I
CompilerOracle: inline org/apache/cassandra/utils/ByteBufferUtil.compare ([BLjava/nio/ByteBuffer;)I
CompilerOracle: inline org/apache/cassandra/utils/ByteBufferUtil.compareUnsigned (Ljava/nio/ByteBuffer;Ljava/nio/ByteBuffer;)I
CompilerOracle: inline org/apache/cassandra/utils/FastByteOperations$UnsafeOperations.compareTo (Ljava/lang/Object;JILjava/lang/Object;JI)I
CompilerOracle: inline org/apache/cassandra/utils/FastByteOperations$UnsafeOperations.compareTo (Ljava/lang/Object;JILjava/nio/ByteBuffer;)I
CompilerOracle: inline org/apache/cassandra/utils/FastByteOperations$UnsafeOperations.compareTo (Ljava/nio/ByteBuffer;Ljava/nio/ByteBuffer;)I
INFO  18:59:37 Loading settings from file:/var/lib/mesos/slave/slaves/20aac1e4-622d-461c-abb5-d99d9ba55ec7-S6/frameworks/20aac1e4-622d-461c-abb5-d99d9ba55ec7-0001/executors/node-2_9e70ce50-098d-4b87-85ed-c2f99ee0386a_executor/runs/64ea900c-1bfa-4393-b18c-447a7b2666b1/apache-cassandra-2.2.5/conf/cassandra.yaml
INFO  18:59:37 Node configuration:[authenticator=AllowAllAuthenticator; authorizer=AllowAllAuthorizer; auto_snapshot=false; batch_size_fail_threshold_in_kb=50; batch_size_warn_threshold_in_kb=5; batchlog_replay_throttle_in_kb=1024; cas_contention_timeout_in_ms=1000; client_encryption_options=<REDACTED>; cluster_name=cassandra; column_index_size_in_kb=64; commit_failure_policy=stop; commitlog_directory=/var/lib/mesos/slave/slaves/20aac1e4-622d-461c-abb5-d99d9ba55ec7-S6/frameworks/20aac1e4-622d-461c-abb5-d99d9ba55ec7-0001/executors/node-2_9e70ce50-098d-4b87-85ed-c2f99ee0386a_executor/runs/64ea900c-1bfa-4393-b18c-447a7b2666b1/volume/commitlog; commitlog_segment_size_in_mb=32; commitlog_sync=periodic; commitlog_sync_period_in_ms=10000; compaction_large_partition_warning_threshold_mb=100; compaction_throughput_mb_per_sec=16; concurrent_counter_writes=16; concurrent_reads=16; concurrent_writes=32; counter_cache_save_period=7200; counter_cache_size_in_mb=null; counter_write_request_timeout_in_ms=5000; cross_node_timeout=false; data_file_directories=[/var/lib/mesos/slave/slaves/20aac1e4-622d-461c-abb5-d99d9ba55ec7-S6/frameworks/20aac1e4-622d-461c-abb5-d99d9ba55ec7-0001/executors/node-2_9e70ce50-098d-4b87-85ed-c2f99ee0386a_executor/runs/64ea900c-1bfa-4393-b18c-447a7b2666b1/volume/data]; disk_failure_policy=stop; dynamic_snitch_badness_threshold=0.1; dynamic_snitch_reset_interval_in_ms=600000; dynamic_snitch_update_interval_in_ms=100; enable_user_defined_functions=false; endpoint_snitch=GossipingPropertyFileSnitch; hinted_handoff_enabled=true; hinted_handoff_throttle_in_kb=1024; incremental_backups=false; index_summary_capacity_in_mb=null; index_summary_resize_interval_in_minutes=60; inter_dc_tcp_nodelay=false; internode_compression=all; key_cache_save_period=14400; key_cache_size_in_mb=null; listen_address=10.0.2.66; max_hint_window_in_ms=10800000; max_hints_delivery_threads=2; memtable_allocation_type=heap_buffers; native_transport_port=9042; num_tokens=256; partitioner=org.apache.cassandra.dht.Murmur3Partitioner; permissions_validity_in_ms=2000; range_request_timeout_in_ms=10000; read_request_timeout_in_ms=5000; request_scheduler=org.apache.cassandra.scheduler.NoScheduler; request_timeout_in_ms=10000; role_manager=CassandraRoleManager; roles_validity_in_ms=2000; row_cache_save_period=0; row_cache_size_in_mb=0; rpc_address=10.0.2.66; rpc_keepalive=true; rpc_port=9160; rpc_server_type=sync; saved_caches_directory=/var/lib/mesos/slave/slaves/20aac1e4-622d-461c-abb5-d99d9ba55ec7-S6/frameworks/20aac1e4-622d-461c-abb5-d99d9ba55ec7-0001/executors/node-2_9e70ce50-098d-4b87-85ed-c2f99ee0386a_executor/runs/64ea900c-1bfa-4393-b18c-447a7b2666b1/volume/saved_caches; seed_provider=[{class_name=com.mesosphere.dcos.cassandra.DcosSeedProvider, parameters=[{seeds_url=http://cassandra.marathon.mesos:9000/v1/seeds}]}]; server_encryption_options=<REDACTED>; snapshot_before_compaction=false; ssl_storage_port=7001; sstable_preemptive_open_interval_in_mb=50; start_native_transport=true; start_rpc=false; storage_port=7000; thrift_framed_transport_size_in_mb=15; tombstone_failure_threshold=100000; tombstone_warn_threshold=1000; tracetype_query_ttl=86400; tracetype_repair_ttl=604800; trickle_fsync=false; trickle_fsync_interval_in_kb=10240; truncate_request_timeout_in_ms=60000; windows_timer_interval=1; write_request_timeout_in_ms=2000]
INFO  18:59:37 DiskAccessMode 'auto' determined to be mmap, indexAccessMode is mmap
INFO  18:59:37 Global memtable on-heap threshold is enabled at 509MB
INFO  18:59:37 Global memtable off-heap threshold is enabled at 509MB
INFO  18:59:37 Unable to load cassandra-topology.properties; compatibility mode disabled
WARN  18:59:37 Only 28482 MB free across all data volumes. Consider adding more capacity to your cluster or removing obsolete snapshots
INFO  18:59:37 Retrieved response {"isSeed":false,"seeds":["10.0.1.121","10.0.1.123"]} from URL http://cassandra.marathon.mesos:9000/v1/seeds
INFO  18:59:37 Retrieved remote seeds [/10.0.1.121, /10.0.1.123]
INFO  18:59:37 Hostname: ip-10-0-2-66.us-west-2.compute.internal
INFO  18:59:37 JVM vendor/version: Java HotSpot(TM) 64-Bit Server VM/1.8.0_74
INFO  18:59:37 Heap size: 2136997888/2136997888
INFO  18:59:37 Code Cache Non-heap memory: init = 2555904(2496K) used = 3151552(3077K) committed = 3211264(3136K) max = 251658240(245760K)
INFO  18:59:37 Metaspace Non-heap memory: init = 0(0K) used = 13007080(12702K) committed = 13238272(12928K) max = -1(-1K)
INFO  18:59:37 Compressed Class Space Non-heap memory: init = 0(0K) used = 1637264(1598K) committed = 1703936(1664K) max = 1073741824(1048576K)
INFO  18:59:37 Par Eden Space Heap memory: init = 83886080(81920K) used = 62144072(60687K) committed = 83886080(81920K) max = 83886080(81920K)
INFO  18:59:37 Par Survivor Space Heap memory: init = 10485760(10240K) used = 0(0K) committed = 10485760(10240K) max = 10485760(10240K)
INFO  18:59:37 CMS Old Gen Heap memory: init = 2042626048(1994752K) used = 0(0K) committed = 2042626048(1994752K) max = 2042626048(1994752K)
INFO  18:59:37 Classpath: apache-cassandra-2.2.5/bin/../conf:apache-cassandra-2.2.5/bin/../build/classes/main:apache-cassandra-2.2.5/bin/../build/classes/thrift:apache-cassandra-2.2.5/bin/../lib/ST4-4.0.8.jar:apache-cassandra-2.2.5/bin/../lib/airline-0.6.jar:apache-cassandra-2.2.5/bin/../lib/antlr-runtime-3.5.2.jar:apache-cassandra-2.2.5/bin/../lib/apache-cassandra-2.2.5.jar:apache-cassandra-2.2.5/bin/../lib/apache-cassandra-clientutil-2.2.5.jar:apache-cassandra-2.2.5/bin/../lib/apache-cassandra-thrift-2.2.5.jar:apache-cassandra-2.2.5/bin/../lib/cassandra-driver-core-2.2.0-rc2-SNAPSHOT-20150617-shaded.jar:apache-cassandra-2.2.5/bin/../lib/commons-cli-1.1.jar:apache-cassandra-2.2.5/bin/../lib/commons-codec-1.2.jar:apache-cassandra-2.2.5/bin/../lib/commons-lang3-3.1.jar:apache-cassandra-2.2.5/bin/../lib/commons-math3-3.2.jar:apache-cassandra-2.2.5/bin/../lib/compress-lzf-0.8.4.jar:apache-cassandra-2.2.5/bin/../lib/concurrentlinkedhashmap-lru-1.4.jar:apache-cassandra-2.2.5/bin/../lib/crc32ex-0.1.1.jar:apache-cassandra-2.2.5/bin/../lib/disruptor-3.0.1.jar:apache-cassandra-2.2.5/bin/../lib/ecj-4.4.2.jar:apache-cassandra-2.2.5/bin/../lib/guava-16.0.jar:apache-cassandra-2.2.5/bin/../lib/high-scale-lib-1.0.6.jar:apache-cassandra-2.2.5/bin/../lib/jackson-core-asl-1.9.2.jar:apache-cassandra-2.2.5/bin/../lib/jackson-mapper-asl-1.9.2.jar:apache-cassandra-2.2.5/bin/../lib/jamm-0.3.0.jar:apache-cassandra-2.2.5/bin/../lib/javax.inject.jar:apache-cassandra-2.2.5/bin/../lib/jbcrypt-0.3m.jar:apache-cassandra-2.2.5/bin/../lib/jcl-over-slf4j-1.7.7.jar:apache-cassandra-2.2.5/bin/../lib/jna-4.0.0.jar:apache-cassandra-2.2.5/bin/../lib/joda-time-2.4.jar:apache-cassandra-2.2.5/bin/../lib/json-simple-1.1.jar:apache-cassandra-2.2.5/bin/../lib/libthrift-0.9.2.jar:apache-cassandra-2.2.5/bin/../lib/log4j-over-slf4j-1.7.7.jar:apache-cassandra-2.2.5/bin/../lib/logback-classic-1.1.3.jar:apache-cassandra-2.2.5/bin/../lib/logback-core-1.1.3.jar:apache-cassandra-2.2.5/bin/../lib/lz4-1.3.0.jar:apache-cassandra-2.2.5/bin/../lib/metrics-core-3.1.0.jar:apache-cassandra-2.2.5/bin/../lib/metrics-logback-3.1.0.jar:apache-cassandra-2.2.5/bin/../lib/metrics-statsd-common-4.1.2.jar:apache-cassandra-2.2.5/bin/../lib/metrics2-statsd-4.1.2.jar:apache-cassandra-2.2.5/bin/../lib/netty-all-4.0.23.Final.jar:apache-cassandra-2.2.5/bin/../lib/ohc-core-0.3.4.jar:apache-cassandra-2.2.5/bin/../lib/ohc-core-j8-0.3.4.jar:apache-cassandra-2.2.5/bin/../lib/reporter-config-base-3.0.0.jar:apache-cassandra-2.2.5/bin/../lib/reporter-config3-3.0.0.jar:apache-cassandra-2.2.5/bin/../lib/seedprovider-0.1.0.jar:apache-cassandra-2.2.5/bin/../lib/sigar-1.6.4.jar:apache-cassandra-2.2.5/bin/../lib/slf4j-api-1.7.7.jar:apache-cassandra-2.2.5/bin/../lib/snakeyaml-1.11.jar:apache-cassandra-2.2.5/bin/../lib/snappy-java-1.1.1.7.jar:apache-cassandra-2.2.5/bin/../lib/stream-2.5.2.jar:apache-cassandra-2.2.5/bin/../lib/super-csv-2.1.0.jar:apache-cassandra-2.2.5/bin/../lib/thrift-server-0.3.7.jar:apache-cassandra-2.2.5/bin/../lib/jsr223/*/*.jar:apache-cassandra-2.2.5/bin/../lib/jamm-0.3.0.jar
INFO  18:59:37 JVM Arguments: [-ea, -javaagent:apache-cassandra-2.2.5/bin/../lib/jamm-0.3.0.jar, -XX:+CMSClassUnloadingEnabled, -XX:+UseThreadPriorities, -XX:ThreadPriorityPolicy=42, -Xms2048M, -Xmx2048M, -Xmn100M, -XX:+HeapDumpOnOutOfMemoryError, -Xss256k, -XX:StringTableSize=1000003, -XX:+UseParNewGC, -XX:+UseConcMarkSweepGC, -XX:+CMSParallelRemarkEnabled, -XX:SurvivorRatio=8, -XX:MaxTenuringThreshold=1, -XX:CMSInitiatingOccupancyFraction=75, -XX:+UseCMSInitiatingOccupancyOnly, -XX:+UseTLAB, -XX:+PerfDisableSharedMem, -XX:CompileCommandFile=apache-cassandra-2.2.5/bin/../conf/hotspot_compiler, -XX:CMSWaitDuration=10000, -XX:+CMSParallelInitialMarkEnabled, -XX:+CMSEdenChunksRecordAlways, -XX:CMSWaitDuration=10000, -XX:+UseCondCardMark, -XX:+PrintGCDetails, -XX:+PrintGCDateStamps, -XX:+PrintHeapAtGC, -XX:+PrintTenuringDistribution, -XX:+PrintGCApplicationStoppedTime, -XX:+PrintPromotionFailure, -Xloggc:apache-cassandra-2.2.5/bin/../logs/gc.log, -XX:+UseGCLogFileRotation, -XX:NumberOfGCLogFiles=10, -XX:GCLogFileSize=10M, -Djava.net.preferIPv4Stack=true, -Dcassandra.jmx.local.port=7199, -XX:+DisableExplicitGC, -Djava.library.path=apache-cassandra-2.2.5/bin/../lib/sigar-bin, -Dcassandra.metricsReporterConfigFile=metrics-reporter-config.yaml, -Dlogback.configurationFile=logback.xml, -Dcassandra.logdir=apache-cassandra-2.2.5/bin/../logs, -Dcassandra.storagedir=apache-cassandra-2.2.5/bin/../data, -Dcassandra-foreground=yes]
INFO  18:59:38 JNA mlockall successful
WARN  18:59:38 jemalloc shared library could not be preloaded to speed up memory allocations
WARN  18:59:38 JMX is not enabled to receive remote connections. Please see cassandra-env.sh for more info.
INFO  18:59:38 Initializing SIGAR library
INFO  18:59:38 Checked OS settings and found them configured for optimal performance.
INFO  18:59:39 Initializing system.sstable_activity
INFO  18:59:41 Initializing key cache with capacity of 100 MBs.
INFO  18:59:41 Initializing row cache with capacity of 0 MBs
INFO  18:59:41 Initializing counter cache with capacity of 50 MBs
INFO  18:59:41 Scheduling counter cache save to every 7200 seconds (going to save all keys).
INFO  18:59:41 Initializing system.hints
INFO  18:59:41 Initializing system.compaction_history
INFO  18:59:41 Initializing system.peers
INFO  18:59:41 Initializing system.schema_columnfamilies
INFO  18:59:41 Initializing system.schema_functions
INFO  18:59:41 Initializing system.IndexInfo
INFO  18:59:41 Initializing system.schema_columns
INFO  18:59:41 Initializing system.schema_triggers
INFO  18:59:41 Initializing system.local
INFO  18:59:41 Initializing system.schema_usertypes
INFO  18:59:41 Initializing system.batchlog
INFO  18:59:41 Initializing system.available_ranges
INFO  18:59:42 Initializing system.schema_aggregates
INFO  18:59:42 Initializing system.paxos
INFO  18:59:42 Initializing system.peer_events
INFO  18:59:42 Initializing system.size_estimates
INFO  18:59:42 Initializing system.compactions_in_progress
INFO  18:59:42 Initializing system.schema_keyspaces
INFO  18:59:42 Initializing system.range_xfers
INFO  18:59:43 Initializing system_distributed.parent_repair_history
INFO  18:59:43 Initializing system_distributed.repair_history
INFO  18:59:43 Initializing system_auth.role_permissions
INFO  18:59:43 Initializing system_auth.resource_role_permissons_index
INFO  18:59:43 Initializing system_auth.roles
INFO  18:59:43 Initializing system_auth.role_members
INFO  18:59:43 Initializing system_traces.sessions
INFO  18:59:43 Initializing system_traces.events
INFO  18:59:43 Completed loading (104 ms; 7 keys) KeyCache cache
INFO  18:59:43 Replaying /var/lib/mesos/slave/slaves/20aac1e4-622d-461c-abb5-d99d9ba55ec7-S6/frameworks/20aac1e4-622d-461c-abb5-d99d9ba55ec7-0001/executors/node-2_9e70ce50-098d-4b87-85ed-c2f99ee0386a_executor/runs/64ea900c-1bfa-4393-b18c-447a7b2666b1/volume/commitlog/CommitLog-5-1462561114318.log, /var/lib/mesos/slave/slaves/20aac1e4-622d-461c-abb5-d99d9ba55ec7-S6/frameworks/20aac1e4-622d-461c-abb5-d99d9ba55ec7-0001/executors/node-2_9e70ce50-098d-4b87-85ed-c2f99ee0386a_executor/runs/64ea900c-1bfa-4393-b18c-447a7b2666b1/volume/commitlog/CommitLog-5-1462561114319.log
INFO  18:59:46 Log replay complete, 15 replayed mutations
INFO  18:59:46 Cassandra version: 2.2.5
INFO  18:59:46 Thrift API version: 20.1.0
INFO  18:59:46 CQL supported versions: 3.3.1 (default: 3.3.1)
INFO  18:59:47 Initializing index summary manager with a memory pool size of 101 MB and a resize interval of 60 minutes
INFO  18:59:47 Loading persisted ring state
INFO  18:59:47 Retrieved response {"isSeed":false,"seeds":["10.0.1.121","10.0.1.123"]} from URL http://cassandra.marathon.mesos:9000/v1/seeds
INFO  18:59:47 Retrieved remote seeds [/10.0.1.121, /10.0.1.123]
INFO  18:59:47 Starting Messaging Service on /10.0.2.66:7000 (eth0)
INFO  18:59:47 Retrieved response {"isSeed":false,"seeds":["10.0.1.121","10.0.1.123"]} from URL http://cassandra.marathon.mesos:9000/v1/seeds
INFO  18:59:47 Retrieved remote seeds [/10.0.1.121, /10.0.1.123]
INFO  18:59:47 Handshaking version with /10.0.1.123
INFO  18:59:47 Handshaking version with /10.0.1.121
INFO  18:59:52 Node /10.0.1.121 has restarted, now UP
INFO  18:59:52 Node /10.0.1.122 has restarted, now UP
INFO  18:59:52 InetAddress /10.0.1.122 is now DOWN
INFO  18:59:52 Node /10.0.1.123 has restarted, now UP
INFO  18:59:53 Starting up server gossip
INFO  18:59:54 Retrieved response {"isSeed":false,"seeds":["10.0.1.121","10.0.1.123"]} from URL http://cassandra.marathon.mesos:9000/v1/seeds
INFO  18:59:54 Retrieved remote seeds [/10.0.1.121, /10.0.1.123]
INFO  18:59:54 Retrieved response {"isSeed":false,"seeds":["10.0.1.121","10.0.1.123"]} from URL http://cassandra.marathon.mesos:9000/v1/seeds
INFO  18:59:54 Retrieved remote seeds [/10.0.1.121, /10.0.1.123]
INFO  18:59:54 Retrieved response {"isSeed":false,"seeds":["10.0.1.121","10.0.1.123"]} from URL http://cassandra.marathon.mesos:9000/v1/seeds
INFO  18:59:54 Retrieved remote seeds [/10.0.1.121, /10.0.1.123]
INFO  18:59:54 Retrieved response {"isSeed":false,"seeds":["10.0.1.121","10.0.1.123"]} from URL http://cassandra.marathon.mesos:9000/v1/seeds
INFO  18:59:54 Retrieved remote seeds [/10.0.1.121, /10.0.1.123]
WARN  18:59:54 Detected previous bootstrap failure; retrying
INFO  18:59:54 JOINING: waiting for ring information
INFO  18:59:54 JOINING: schema complete, ready to bootstrap
INFO  18:59:54 JOINING: waiting for pending range calculation
INFO  18:59:54 JOINING: calculation complete, ready to bootstrap
INFO  18:59:54 JOINING: getting bootstrap token
INFO  18:59:55 Handshaking version with /10.0.1.121
INFO  18:59:55 Node /10.0.1.121 is now part of the cluster
INFO  18:59:55 Node /10.0.1.121 state jump to NORMAL
INFO  18:59:56 JOINING: sleeping 30000 ms for pending range setup
WARN  18:59:56 Not marking nodes down due to local pause of 16496601209 > 5000000000
INFO  18:59:56 Updating topology for /10.0.1.121
INFO  18:59:56 Updating topology for /10.0.1.121
INFO  18:59:56 Node /10.0.1.122 is now part of the cluster
INFO  18:59:56 Node /10.0.1.122 state jump to shutdown
INFO  18:59:56 Updating topology for /10.0.1.122
INFO  18:59:56 Updating topology for /10.0.1.122
INFO  18:59:56 InetAddress /10.0.1.122 is now DOWN
INFO  18:59:56 Node /10.0.1.123 is now part of the cluster
INFO  18:59:56 Node /10.0.1.123 state jump to NORMAL
INFO  18:59:56 Updating topology for /10.0.1.123
INFO  18:59:56 Updating topology for /10.0.1.123
INFO  18:59:56 InetAddress /10.0.1.121 is now UP
INFO  18:59:56 InetAddress /10.0.1.121 is now UP
INFO  18:59:56 Handshaking version with /10.0.1.123
INFO  18:59:56 InetAddress /10.0.1.123 is now UP
INFO  19:00:26 JOINING: Starting to bootstrap...
Exception (java.lang.RuntimeException) encountered during startup: A node required to move the data consistently is down (/10.0.1.122). If you wish to move the data from a potentially inconsistent replica, restart the node with -Dcassandra.consistent.rangemovement=false
ERROR 19:00:26 Exception encountered during startup
java.lang.RuntimeException: A node required to move the data consistently is down (/10.0.1.122). If you wish to move the data from a potentially inconsistent replica, restart the node with -Dcassandra.consistent.rangemovement=false
    at org.apache.cassandra.dht.RangeStreamer.getAllRangesWithStrictSourcesFor(RangeStreamer.java:264) ~[apache-cassandra-2.2.5.jar:2.2.5]
    at org.apache.cassandra.dht.RangeStreamer.addRanges(RangeStreamer.java:147) ~[apache-cassandra-2.2.5.jar:2.2.5]
    at org.apache.cassandra.dht.BootStrapper.bootstrap(BootStrapper.java:82) ~[apache-cassandra-2.2.5.jar:2.2.5]
    at org.apache.cassandra.service.StorageService.bootstrap(StorageService.java:1230) ~[apache-cassandra-2.2.5.jar:2.2.5]
    at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:924) ~[apache-cassandra-2.2.5.jar:2.2.5]
    at org.apache.cassandra.service.StorageService.initServer(StorageService.java:709) ~[apache-cassandra-2.2.5.jar:2.2.5]
    at org.apache.cassandra.service.StorageService.initServer(StorageService.java:585) ~[apache-cassandra-2.2.5.jar:2.2.5]
    at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:300) [apache-cassandra-2.2.5.jar:2.2.5]
    at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:516) [apache-cassandra-2.2.5.jar:2.2.5]
    at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:625) [apache-cassandra-2.2.5.jar:2.2.5]
WARN  19:00:26 No local state or state is in silent shutdown, not announcing shutdown

ConfigurationManager tests

Improve testing for the ConfigurationManager class. Tests may mock external interfaces where necessary, but should use a Curator TestingServer for the ZooKeeper interface rather than mocking the interactions with ZK.
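
A rough sketch of the setup described above, wiring a Curator TestingServer into a JUnit test; constructing the actual ConfigurationManager against the client is left as a placeholder:

import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.retry.RetryOneTime;
import org.apache.curator.test.TestingServer;
import org.junit.After;
import org.junit.Before;

public class ConfigurationManagerTest {
    private TestingServer zk;
    private CuratorFramework client;

    @Before
    public void before() throws Exception {
        zk = new TestingServer(true); // in-process ZooKeeper, real wire protocol
        client = CuratorFrameworkFactory.newClient(
            zk.getConnectString(), new RetryOneTime(100));
        client.start();
        // Build the ConfigurationManager under test against 'client' here (placeholder).
    }

    @After
    public void after() throws Exception {
        client.close();
        zk.close();
    }
}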

Repair endpoint starts cluster tasks on nodes not running Cassandra

We are running Aurora and Cassandra frameworks in the same Mesos cluster.

I tried to start a repair by using:

curl -H 'Content-Type: application/json' \
     -X PUT -d '{ "nodes" : [ "*" ] }' \
     http://appdocker968-sjc1.prod.uber.internal:5287/v1/repair/start

It starts accepting offers from the non-Cassandra machines, launches the task, and then fails. This leaves dynamically reserved resources on all of the Aurora machines on which it has attempted to launch (the role of the framework is cassandra-cstar-token).
Excerpt from the http://mesos-master:5050/state endpoint:

{
  id: "ec1f56cf-a66c-48cd-880c-020b430b0d1a-S4",
  pid: "slave(1)@10.28.20.29:5051",
  hostname: "compute131-sjc1.prod.uber.internal",
  registered_time: 1466724399.95873,
  reregistered_time: 1466724400.11227,
  resources: {
    cpus: 24,
    disk: 1665158,
    mem: 120635,
    ports: "[31000-32000]"
  },
  used_resources: {
    cpus: 5.75,
    disk: 6144,
    mem: 12672,
    ports: "[31439-31439, 31583-31583, 31755-31755, 31787-31787]"
  },
  offered_resources: {
    cpus: 16.25,
    disk: 1659014,
    mem: 107451,
    ports: "[31000-31438, 31440-31582, 31584-31754, 31756-31786, 31788-32000]"
  },
  reserved_resources: {
    cassandra-cstar-token: {
      cpus: 1,
      disk: 0,
      mem: 256
    }
  },
  unreserved_resources: {
    cpus: 22,
    disk: 1665158,
    mem: 120123,
    ports: "[31000-32000]"
  },
  attributes: {
    host: "compute131-sjc1",
    kernel: 3.18,
    pipeline: "us1",
    pod: "c",
    rack: 337
  },
  active: true,
  version: "0.28.2-22"
}

Is this a bug caused by passing null as the agentsToColocate here: https://github.com/mesosphere/dcos-cassandra-service/blob/master/cassandra-scheduler/src/main/java/com/mesosphere/dcos/cassandra/scheduler/offer/ClusterTaskOfferRequirementProvider.java#L108?

I think it should colocate the cluster repair task on the nodes running the Cassandra tasks.

Framework assumes persistent volume is created even if creation failed and keeps on looking for it

This bug is similar to issue #118 but now I have a deterministic way of reproducing it.

  1. Upgrade Mesos to 1.0
  2. Launch fresh instance of the scheduler
  3. Scheduler sends RESERVE, CREATE and LAUNCH operations.
  4. Creation of persistent volume fails because the principal is not set in DiskInfo.Persistence (https://issues.apache.org/jira/browse/MESOS-5005)
  5. Scheduler assumes that the volume was created, stores the persistent volume ID in zookeeper and keeps trying to find it in subsequent offers.

The correct behavior is that it should keep retrying to create the volume if it does not see the persistent volume in subsequent offers.
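
For illustration, a minimal sketch of the detection step implied by that behavior: scan incoming offers for a disk resource carrying the expected persistence ID, and treat its absence as "the volume was never created" so the RESERVE/CREATE can be retried (the retry wiring itself is omitted):

import org.apache.mesos.Protos.Offer;
import org.apache.mesos.Protos.Resource;

// Does this offer contain the persistent volume we believe we created?
static boolean hasVolume(Offer offer, String persistenceId) {
    for (Resource resource : offer.getResourcesList()) {
        if (resource.hasDisk()
                && resource.getDisk().hasPersistence()
                && persistenceId.equals(resource.getDisk().getPersistence().getId())) {
            return true;
        }
    }
    return false;
}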

IdentityManager tests

Improve tests for the IdentityManager class. Tests should use a Curator TestingServer rather than an in-memory persistence provider.

TasksResource tests

Implement tests for the TasksResource class. These tests should use a live http client and application.

Bug: "Unable to find a volume with persistence id" while creating a new cluster

I am getting this error almost every time when I try to start a new cluster:

INFO  [2016-05-04 03:55:56,117] org.apache.mesos.offer.OfferEvaluator: Found Offer meeting placement constraints: id {
  value: "448075ee-ee69-4fba-9d9d-9157746b3841-O644"
}
framework_id {
  value: "7a570f10-67bd-464b-be11-2696d6c2a5c6-0000"
}
slave_id {
  value: "3db86480-98e4-41c7-a341-195f8cc10cef-S23"
}
hostname: "compute23-sjc1.prod.uber.internal"
resources {
  name: "cpus"
  type: SCALAR
  scalar {
    value: 24.0
  }
  role: "*"
}
resources {
  name: "mem"
  type: SCALAR
  scalar {
    value: 120635.0
  }
  role: "*"
}
resources {
  name: "disk"
  type: SCALAR
  scalar {
    value: 995927.0
  }
  role: "*"
}
resources {
  name: "ports"
  type: RANGES
  ranges {
    range {
      begin: 31000
      end: 32000
    }
  }
  role: "*"
}
attributes {
  name: "host"
  type: TEXT
  text {
    value: "compute23-sjc1"
  }
}
attributes {
  name: "kernel"
  type: SCALAR
  scalar {
    value: 3.18
  }
}
attributes {
  name: "pod"
  type: TEXT
  text {
    value: "d"
  }
}
attributes {
  name: "rack"
  type: SCALAR
  scalar {
    value: 1010.0
  }
}
url {
  scheme: "http"
  address {
    hostname: "compute23-sjc1.prod.uber.internal"
    ip: "10.163.31.21"
    port: 5051
  }
  path: "/slave(1)"
}

INFO  [2016-05-04 03:55:56,117] org.apache.mesos.offer.ResourceUtils: Selected disk: type = ROOTresource = name: "disk"
type: SCALAR
scalar {
  value: 995927.0
}
role: "*"

INFO  [2016-05-04 03:55:56,117] org.apache.mesos.offer.ResourceUtils: Selected disk: type = ROOTresource = name: "disk"
type: SCALAR
scalar {
  value: 995927.0
}
role: "*"

ERROR [2016-05-04 03:55:56,117] org.apache.mesos.offer.OfferEvaluator: Unable to find a volume with persistence id: c9bd2df4-c7ab-4bce-8540-b578c6bb2885
INFO  [2016-05-04 03:55:56,117] org.apache.mesos.offer.OfferEvaluator: VolumeMode is EXISTING and VolumeType is ROOT hasExpectedVolumes is false
INFO  [2016-05-04 03:55:56,117] org.apache.mesos.offer.OfferEvaluator: EnoughCPU: true EnoughMem: true EnoughDisk: true EnoughPorts: true HasExpectedVolumes: false
WARN  [2016-05-04 03:55:56,117] org.apache.mesos.offer.OfferEvaluator: No Offers found meeting Resource constraints.
WARN  [2016-05-04 03:55:56,117] org.apache.mesos.offer.OfferEvaluator: No acceptable offers due to insufficient resources.

Unable to Set Up Cassandra Cluster

Followed the documentation and installation steps.
After installing the package, when I check the Cassandra connection, this is what I get:

dcos cassandra connection
{"address":[],"dns":[]}

stdout from my Marathon task:
INFO [2016-09-29 21:53:53,562] org.eclipse.jetty.setuid.SetUIDListener: Opened DC/OS Cassandra Service@1da4b6b3{HTTP/1.1}{0.0.0.0:9000}
INFO [2016-09-29 21:53:53,565] org.eclipse.jetty.server.Server: jetty-9.2.14.v20151106
INFO [2016-09-29 21:53:54,372] org.apache.mesos.scheduler.SchedulerDriverFactory: Creating unauthenticated MesosSchedulerDriver for scheduler[com.mesosphere.dcos.cassandra.scheduler.CassandraScheduler@4393593c],
frameworkInfo[user: "root"
name: "cassandra"
id {
value: "dae7eb1e-00d3-4417-ad5a-9a1135abc00a-0001"
}
failover_timeout: 604800.0
checkpoint: true
role: "cassandra-role"
principal: "cassandra-principal"
], masterUrl[zk://master.mesos:2181/mesos]
INFO [2016-09-29 21:53:56,368] com.mesosphere.dcos.cassandra.scheduler.CassandraScheduler: Starting driver...
INFO [2016-09-29 21:53:56,369] com.mesosphere.dcos.cassandra.scheduler.CassandraScheduler: Driver started with status: DRIVER_RUNNING
ERROR [2016-09-29 21:53:56,662] com.mesosphere.dcos.cassandra.scheduler.CassandraScheduler: Scheduler driver error: Framework has been removed
INFO [2016-09-29 21:54:07,365] io.dropwizard.jersey.DropwizardResourceConfig: The following paths were found for the configured resources:
{"name":"cassandra","id":"","version":"1.0.0","user":"root","cluster":"cassandra-cluster","role":"cassandra-role","principal":"cassandra-principal","failover_timeout_s":604800,"secret":"","checkpoint":true}

connection: {"address":[],"dns":[]}

plan: {"phases":[],"errors":[],"status":"COMPLETE"}

Monitoring with nodetool

My Cassandra deployments (and, I suspect, a lot of other people's) are heavily monitored by examining the output of regularly run nodetool commands (compactionstats, gossipinfo, tpstats, status, etc.) and other JMX-based tooling. It's unclear how to reproduce that in this framework, given that JMX is not exposed outside the node containers.

Advice on this? Am I missing something? Anyone working on an approach?

Unable to deploy cassandra on dc/os 1.8.2

The same error is also given during a DSE deploy.

Exception in logs is:

Error injecting constructor, org.apache.mesos.config.ConfigStoreException: Failed to retrieve current target configuration from path '/dcos-service-datastax/ConfigTarget'
at com.mesosphere.dcos.cassandra.scheduler.seeds.SeedsManager.(SeedsManager.java:105)
at com.mesosphere.dcos.cassandra.scheduler.SchedulerModule.configure(SchedulerModule.java:204)
while locating com.mesosphere.dcos.cassandra.scheduler.seeds.SeedsManager
1 error
at com.google.inject.internal.Errors.throwCreationExceptionIfErrorsExist(Errors.java:466)
at com.google.inject.internal.InternalInjectorCreator.injectDynamically(InternalInjectorCreator.java:184)
at com.google.inject.internal.InternalInjectorCreator.build(InternalInjectorCreator.java:110)
at com.google.inject.Guice.createInjector(Guice.java:96)
at com.google.inject.Guice.createInjector(Guice.java:73)
at com.google.inject.Guice.createInjector(Guice.java:62)
at com.mesosphere.dcos.cassandra.scheduler.Main.run(Main.java:67)
at com.mesosphere.dcos.cassandra.scheduler.Main.run(Main.java:23)
at io.dropwizard.cli.EnvironmentCommand.run(EnvironmentCommand.java:40)
at io.dropwizard.cli.ConfiguredCommand.run(ConfiguredCommand.java:77)
at io.dropwizard.cli.Cli.run(Cli.java:70)
at io.dropwizard.Application.run(Application.java:80)
at com.mesosphere.dcos.cassandra.scheduler.Main.main(Main.java:28)
Caused by: org.apache.mesos.config.ConfigStoreException: Failed to retrieve current target configuration from path '/dcos-service-datastax/ConfigTarget'
at org.apache.mesos.curator.CuratorConfigStore.getTargetConfig(CuratorConfigStore.java:180)
at com.mesosphere.dcos.cassandra.scheduler.config.DefaultConfigurationManager.getTargetName(DefaultConfigurationManager.java:204)
at com.mesosphere.dcos.cassandra.scheduler.config.DefaultConfigurationManager.getTargetConfig(DefaultConfigurationManager.java:213)
at com.mesosphere.dcos.cassandra.scheduler.seeds.SeedsManager.(SeedsManager.java:139)
at com.mesosphere.dcos.cassandra.scheduler.seeds.SeedsManager$$FastClassByGuice$$91cd171b.newInstance()
at com.google.inject.internal.cglib.reflect.$FastConstructor.newInstance(FastConstructor.java:40)
at com.google.inject.internal.DefaultConstructionProxyFactory$1.newInstance(DefaultConstructionProxyFactory.java:61)
at com.google.inject.internal.ConstructorInjector.provision(ConstructorInjector.java:105)
at com.google.inject.internal.ConstructorInjector.construct(ConstructorInjector.java:85)
at com.google.inject.internal.ConstructorBindingImpl$Factory.get(ConstructorBindingImpl.java:267)
at com.google.inject.internal.ProviderToInternalFactoryAdapter$1.call(ProviderToInternalFactoryAdapter.java:46)
at com.google.inject.internal.InjectorImpl.callInContext(InjectorImpl.java:1103)
at com.google.inject.internal.ProviderToInternalFactoryAdapter.get(ProviderToInternalFactoryAdapter.java:40)
at com.google.inject.internal.SingletonScope$1.get(SingletonScope.java:145)
at com.google.inject.internal.InternalFactoryToProviderAdapter.get(InternalFactoryToProviderAdapter.java:41)
at com.google.inject.internal.InternalInjectorCreator$1.call(InternalInjectorCreator.java:205)
at com.google.inject.internal.InternalInjectorCreator$1.call(InternalInjectorCreator.java:199)
at com.google.inject.internal.InjectorImpl.callInContext(InjectorImpl.java:1092)
at com.google.inject.internal.InternalInjectorCreator.loadEagerSingletons(InternalInjectorCreator.java:199)
at com.google.inject.internal.InternalInjectorCreator.injectDynamically(InternalInjectorCreator.java:180)
... 11 more

Cancel in-progress backup / restore

Backups / restores execute serially, which means they could take hours or days. It would be useful to allow operators to cancel an in-progress backup / restore.

Incorrect order in which CassandraDaemon nodes are started when cluster size is > 10

I started a 20-node Cassandra cluster and the nodes start in the order:
node-0, node-1, node-10, node-11, ..., node-19, node-2, node-3, ..., node-9

This is because we internally sort the nodes by name. We can either:
a) change the sorting code to sort numerically (see the sketch below), or
b) change the node names to node-000, node-001, node-002, etc. (assuming no one is going to start a cluster with more than 1000 nodes).

@kow3ns @mohitsoni What do you guys think?
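
For illustration, a minimal sketch of option (a): compare node names by their numeric suffix instead of lexicographically (illustrative only, not the project's code):

import java.util.Comparator;

// Sorts "node-2" before "node-10" by comparing the number after the last '-'.
// Usage: nodeNames.sort(BY_NODE_INDEX);
static final Comparator<String> BY_NODE_INDEX =
    Comparator.comparingInt((String name) ->
        Integer.parseInt(name.substring(name.lastIndexOf('-') + 1)));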
