
cassandra-medusa's Introduction


Medusa for Apache Cassandra™

Medusa is an Apache Cassandra backup system.

Features

Medusa is a command line tool that offers the following features:

  • Single node backup
  • Single node restore
  • Cluster-wide in-place restore (restoring to the same cluster that was used for the backup)
  • Cluster-wide remote restore (restoring to a different cluster than the one used for the backup)
  • Backup purge
  • Support for local storage, Google Cloud Storage (GCS), Azure Blob Storage and AWS S3 (and its compatibles)
  • Support for clusters using single tokens or vnodes
  • Full or differential backups

Medusa currently does not support (but we would gladly accept help with changing that):

  • Cassandra deployments with multiple data directories.

Documentation

For user questions and general/dev discussions, please join the #cassandra-medusa channel on the ASF slack at http://s.apache.org/slack-invite.

Docker images

You can find the Docker images for Cassandra Medusa at https://hub.docker.com/r/k8ssandra/medusa.

Dependencies

Medusa requires Python 3.8 or newer.

For information on the packaged dependencies of Medusa for Apache Cassandra® and their licenses, check out our open source report.

cassandra-medusa's People

Contributors

a-nldisr, adejanovski, ajmaidak, alvaropalmeirao, andre-prata, andyfoston, arodrime, atallahade, burmanm, dependabot[bot], dutchy-, ecsv, emerkle826, gitter-badger, ilhanadiyaman, ivanmp91, jboclara, jdonenine, jeffbanks, jsanda, kiddom-kq, maxbrunet, michaelsembwever, nicholasamorim, rachankaur, rhardouin, rtib, rzvoncek, skunnyk, venkatanaladala94


cassandra-medusa's Issues

Large files downloads eat up all memory for S3

As reported in #46, restoring backups with large files can end up failing because of memory usage.
It seems that the simple blob.download() method of libcloud tries to read the data in memory before writing it to disk.

As was done for uploads, we need to rely on the awscli for now in order to use multi-part downloads for large files.

restore_mapping.txt has a different source/target order than the README

It seems the README documents a different field order for restore_mapping.txt (remote restore).
Here is the code from restore_cluster.py:

    def _populate_hostmap(self):
        with open(self.host_list, 'r') as f:
            for line in f.readlines():
                seed, target, source = line.replace('\n', '').split(self.config.storage.host_file_separator)
                # in python, bool('False') evaluates to True. Need to test the membership as below
                self.host_map[target.strip()] = {'source': [source.strip()], 'seed': seed in ['True']}

In contrast, the README documents the order as seed, source, target:

<Is it a seed node?>,<source node>,<target node>

Please check this. Thanks
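A minimal illustration of the order the code above actually expects, with hypothetical hostnames:

```python
# Each line of restore_mapping.txt is split as <seed>,<target>,<source>
# by _populate_hostmap, not <seed>,<source>,<target> as the README says.
host_map = {}
line = "True,new-node-1.example.com,old-node-1.example.com"  # hypothetical hosts
seed, target, source = line.strip().split(",")
host_map[target.strip()] = {"source": [source.strip()], "seed": seed in ["True"]}
```

So a mapping file written in the README's order would silently swap sources and targets.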

Local backups not connected

Environment:
3-node Cassandra cluster in Docker
Debian GNU/Linux 9 (stretch)
Python 3.7.5

I have an ops container and 3 Cassandra containers. All else appears to be working without issue (this is a standard cluster I use for testing, now with medusa added), but when doing medusa backups to local storage, the backups on each node aren't connected.

From the ops container, I run:
cstar run --command="medusa backup --backup-name=testbackup3" --seed-host=172.16.238.2 --strategy=all

All ends well:

+++
3 done, 0 failed, 0 executing

But restore does not work, reporting that backups are incomplete. On each Cassandra node, the backups have run but aren't connected. Each node has its own backup and reports missing for the other 2 nodes. Here is an example of the status:

medusa status --backup-name testbackup3
testbackup3 [Incomplete!]
- Started: 2019-11-25 16:09:24, Finished: never
- 1 nodes completed, 0 nodes incomplete, 2 nodes missing
- Missing nodes:
    DC1C1.cassandra-medusa_myring
    DC1C3.cassandra-medusa_myring
- 193 files, 355.51 KB

Here's the only section changed from the default medusa.ini:

[storage]
storage_provider = local
; storage_provider should be either of "local", "google_storage" or the s3_* values from
; https://github.com/apache/libcloud/blob/trunk/libcloud/storage/types.py
bucket_name = testbackup20191124
;key_file = <JSON key file for service account with access to GCS bucket or AWS credentials file (home-dir/.aws/credentials)>
base_path = /home
;prefix = <Any prefix used for multitenancy in the same bucket>
;fqdn = <enforce the name of the local node. Computed automatically if not provided.>
;max_backup_age = <number of days before backups are purged. 0 means backups don't get purged by age (default)>
;max_backup_count = <number of backups to retain. Older backups will get purged beyond that number. 0 means backups don't get purged by count (default)>
; Both thresholds can be defined for backup purge.

Detailed step-by-step installation instructions for cassandra-medusa

Hi there

I am looking for detailed, step-by-step installation instructions for the cassandra-medusa tool.

The setup page is not enough; I ran into many issues with it.

For example, there is no mention of the requirements needed before installation.

Could someone please help me with a detailed, step-by-step installation guide for this tool?

Thanks

Ned

clean_path fails with mounted file systems

The clean_path command fails if the folder to be wiped is a Docker volume mount.

Changing

  • subprocess.check_output(['sudo', '-u', p.owner(), 'rm', '-rf', str(p)])
    to
  • subprocess.check_output(['sudo', '-u', p.owner(), 'rm', '-rf', str(p) + '/*'])
    should allow the folder's contents to be cleared without removing the mount point itself. Note, however, that the /* glob is only expanded when the command is run through a shell.
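One hedged sketch of a fix that avoids the glob issue entirely: iterate the directory's children in Python and delete each entry, leaving the mount point itself in place. This is an illustration only, not Medusa's actual implementation, and it omits the sudo/ownership handling the real code performs:

```python
import shutil
from pathlib import Path

def clean_path_contents(p: Path) -> None:
    # Delete everything under p but keep p itself, so a Docker volume
    # mount survives. Passing str(p) + '/*' to subprocess without
    # shell=True would not work: glob expansion is done by the shell,
    # not by rm, so enumerating children in Python sidesteps that.
    for child in p.iterdir():
        if child.is_dir() and not child.is_symlink():
            shutil.rmtree(child)
        else:
            child.unlink()
```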

Use the strongest constraint for purges

Project board link

When purging old backups, there are 2 options:

  • Remove backups older than X
  • Do not keep more than Y backups (a count).

Here michaelsembwever reported that when setting X to 30 days and Y (the count) to 15, with a daily backup, the second rule is not respected: we go over 15 backups and keep all 30 (one per day for 30 days).

The more natural behavior here would probably be to apply the stricter constraint when both are set.

To express it more clearly: the purge should trigger as soon as either constraint is reached. In my example we should end up with no more than 15 backups and 15 days of history, not 30, as Medusa will only purge after that.
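The "strictest constraint wins" behavior described above can be sketched as follows; this is an illustration, not Medusa's API, and the (name, timestamp) representation of a backup is a hypothetical simplification:

```python
from datetime import datetime, timedelta

def backups_to_purge(backups, max_age_days, max_count, now=None):
    # A backup is purged when it violates EITHER rule (the union of both
    # constraints). `backups` is a list of (name, started_at) tuples
    # sorted oldest first; 0 disables a rule, matching medusa.ini's docs.
    now = now or datetime.now()
    purge = set()
    if max_age_days > 0:
        cutoff = now - timedelta(days=max_age_days)
        purge |= {name for name, started in backups if started < cutoff}
    if max_count > 0 and len(backups) > max_count:
        purge |= {name for name, _ in backups[:len(backups) - max_count]}
    return purge
```

With 30 daily backups, max_backup_age=30 and max_backup_count=15, the age rule keeps everything but the count rule still purges the 15 oldest, which is the behavior the report asks for.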

┆Issue is synchronized with this Jira Task by Unito
┆Issue Number: K8SSAND-326

Set nodetool parameters

We need to set nodetool parameters such as a password or a port. I suggest adding a nodetool section to the medusa.ini config file and using the parameters defined in this section to build nodetool commands.
The nodetool section could have these parameters; only the parameters that are actually set would be used:

  • username
  • password
  • passwordFilePath
  • host
  • port
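A sketch of what building the command from such a section could look like; the [nodetool] section is the hypothetical one proposed above, while the flags (-h, -p, -u, -pw, -pwf) are nodetool's own:

```python
import configparser

# Hypothetical [nodetool] section of medusa.ini, as proposed in this issue.
config = configparser.ConfigParser()
config.read_string("""
[nodetool]
username = cassandra
port = 7199
""")

cmd = ['nodetool']
section = config['nodetool']
for option, flag in [('host', '-h'), ('port', '-p'), ('username', '-u'),
                     ('password', '-pw'), ('passwordFilePath', '-pwf')]:
    if option in section:  # only parameters that are actually set are used
        cmd += [flag, section[option]]
cmd.append('snapshot')  # any nodetool subcommand
```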

Unable to back up over secure port (9142)?

We have a requirement that all data in transit must be encrypted, so we have disabled traffic to the default port (9042) and enabled SSL on port 9142.

medusa fails with this error (IP addresses masked):
ERROR: This error happened during the backup: ('Unable to connect to any servers', {'xx.xxx.xx.xxx:9042': OSError(None, "Tried connecting to [('xx.xxx.xx.xxx', 9042)]. Last error: timed out")})
Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/medusa/backup.py", line 252, in main
schema, tokenmap = get_schema_and_tokenmap(cassandra)
File "/usr/local/lib/python3.6/dist-packages/retrying.py", line 49, in wrapped_f
return Retrying(*dargs, **dkw).call(f, *args, **kw)
File "/usr/local/lib/python3.6/dist-packages/retrying.py", line 212, in call
raise attempt.get()
File "/usr/local/lib/python3.6/dist-packages/retrying.py", line 247, in get
six.reraise(self.value[0], self.value[1], self.value[2])
File "/usr/lib/python3/dist-packages/six.py", line 693, in reraise
raise value
File "/usr/local/lib/python3.6/dist-packages/retrying.py", line 200, in call
attempt = Attempt(fn(*args, **kwargs), attempt_number, False)
File "/usr/local/lib/python3.6/dist-packages/medusa/backup.py", line 296, in get_schema_and_tokenmap
with cassandra.new_session() as cql_session:
File "/usr/local/lib/python3.6/dist-packages/medusa/cassandra_utils.py", line 267, in new_session
return self._cql_session_provider.new_session()
File "/usr/local/lib/python3.6/dist-packages/medusa/cassandra_utils.py", line 86, in new_session
session = cluster.connect()
File "cassandra/cluster.py", line 1429, in cassandra.cluster.Cluster.connect
File "cassandra/cluster.py", line 1465, in cassandra.cluster.Cluster.connect
File "cassandra/cluster.py", line 1452, in cassandra.cluster.Cluster.connect
File "cassandra/cluster.py", line 2996, in cassandra.cluster.ControlConnection.connect
File "cassandra/cluster.py", line 3041, in cassandra.cluster.ControlConnection._reconnect_internal

Is there any way to configure medusa to connect to Cassandra securely?

Thanks,
John
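For reference, client-side TLS with the standard library looks like the sketch below; with the DataStax Python driver the resulting context would then be passed as Cluster(['<host>'], port=9142, ssl_context=ctx). Whether Medusa exposes such settings in its config is not confirmed here:

```python
import ssl

# Build a TLS context for an encrypted CQL connection. The relaxed
# settings below are for illustration only; production setups should use
# CERT_REQUIRED together with load_verify_locations() and a CA file.
ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
ctx.check_hostname = False       # enable if your certificates carry hostnames
ctx.verify_mode = ssl.CERT_NONE  # use CERT_REQUIRED in production
```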

Backup shows completed but medusa status shows incomplete

Hello,
I used pssh to take a backup of my 2-node cluster using:
pssh -H "host1 host2" medusa backup --backup-name=22012020

[2020-01-22 09:35:21,338] INFO: Updating backup index
[2020-01-22 09:35:22,161] INFO: Backup done
[2020-01-22 09:35:22,161] INFO: - Started: 2020-01-22 09:34:50 - Started extracting data: 2020-01-22 09:34:52 - Finished: 2020-01-22 09:35:22
[2020-01-22 09:35:22,161] INFO: - Real duration: 0:00:29.203785 (excludes time waiting for other nodes)
[2020-01-22 09:35:22,162] INFO: - 261 files, 1.23 MB
[2020-01-22 09:35:22,162] INFO: - 261 files copied from host
Upon executing medusa status --backup-name=22012020,
it shows incomplete. There is approximately no data in this test cluster.

22012020 [Incomplete!]

  • Started: 2020-01-22 09:34:53, Finished: never
  • 2 nodes completed, 0 nodes incomplete, 1 nodes missing
  • Missing nodes:
    host2
  • 531 files, 2.45 MB

Configuring medusa

Project board link

I just did a setup of Medusa
and started "medusa backup".
The output just got stuck at
"[root@dse1b medusa]# medusa backup --backup-name test
[2020-02-21 12:16:21,854] INFO: Monitoring provider is noop
[2020-02-21 12:16:21,861] WARNING: is ccm : 0
[2020-02-21 12:16:22,170] INFO: Creating snapshot
[2020-02-21 12:16:22,170] INFO: Saving tokenmap and schema"

until I replaced listen_address in /etc/cassandra/conf/cassandra.yaml
with the server's rpc address.

In the long run this is of course not acceptable.

┆Issue is synchronized with this Jira Task by Unito
┆Issue Number: K8SSAND-332

Configuring for a dynamic data directory

Project board link

We need to pass in a dynamic data directory different than configured in the yaml file. Is there a way to pass the data directory in any flag or the .ini? Alternatively, what is Medusa using the cassandra.yaml file for? (Could we feed it a different yaml than what Cassandra is using for startup?)

┆Issue is synchronized with this Jira Task by Unito
┆Issue Number: K8SSAND-322

Warning "Can't fetch IAM Role" thrown when using AWS credential file

This is a very minor issue, but in release 0.5.0, where support for env-based S3 authentication was added, a warning is written to the logs each time a file is uploaded if you are authenticating via an AWS credentials file.

Maybe the log level should be set to DEBUG for this message.

Differential mode uploading all files from host

I've successfully been able to back up to S3 and restore my node data using medusa 0.3.0, but I am finding that regardless of whether I use mode=full or mode=differential, all files are uploaded from the host.
In a simple test case where I back up only the system keyspaces, run a full backup first, and then a differential backup immediately afterwards, I get the following output:

[2019-12-02 17:08:31] INFO: Backup done
[2019-12-02 17:08:31] INFO: - Started: 2019-12-02 17:07:28
- Started extracting data: 2019-12-02 17:07:32
- Finished: 2019-12-02 17:08:31
[2019-12-02 17:08:31] INFO: - Real duration: 0:00:58.868730 (excludes time waiting for other nodes)
[2019-12-02 17:08:31] INFO: - 376 files, 283.62 KB
[2019-12-02 17:08:31] INFO: - 376 files copied from host
[2019-12-02 17:08:31] INFO: - 0 copied from previous backup (test2)

medusa download fails due to missing parameter

medusa download --backup-name download-broken --download-destination /mnt/temp/

Traceback (most recent call last):
  File "/usr/local/bin/medusa", line 11, in <module>
    sys.exit(cli())
  File "/usr/local/lib/python3.5/dist-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.5/dist-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.5/dist-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.5/dist-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/click/decorators.py", line 64, in new_func
    return ctx.invoke(f, obj, *args, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/medusa/medusacli.py", line 152, in download
    medusa.download.download_cmd(medusaconfig, backup_name, Path(download_destination))
  File "/usr/local/lib/python3.5/dist-packages/medusa/download.py", line 75, in download_cmd
    download_data(config.storage, node_backup, download_destination)
TypeError: download_data() missing 1 required positional argument: 'destination'

Support S3 RGW (CephFS)

The .ini file states that storage_provider should be either of "local", "google_storage" or the s3_* values from

However, I think the s3_* values only apply to the classes that define AWS S3 regions, not to S3_RGW.

Attempting to use IAM_Role and medusa

Ubuntu 16.04
python3.6

$medusa
Traceback (most recent call last):
File "/usr/local/bin/medusa", line 7, in
from medusa.medusacli import cli
File "/usr/local/lib/python3.5/dist-packages/medusa/medusacli.py", line 33, in
import medusa.backup
File "/usr/local/lib/python3.5/dist-packages/medusa/backup.py", line 33, in
from medusa.index import add_backup_start_to_index, add_backup_finish_to_index, set_latest_backup_in_index
File "/usr/local/lib/python3.5/dist-packages/medusa/index.py", line 21, in
import medusa.storage
File "/usr/local/lib/python3.5/dist-packages/medusa/storage/init.py", line 32, in
from medusa.storage.google_storage import GoogleStorage
File "/usr/local/lib/python3.5/dist-packages/medusa/storage/google_storage.py", line 26, in
from medusa.storage.abstract_storage import AbstractStorage
File "/usr/local/lib/python3.5/dist-packages/medusa/storage/abstract_storage.py", line 25, in
import medusa.storage.concurrent
File "/usr/local/lib/python3.5/dist-packages/medusa/storage/concurrent.py", line 161
return f"{size:.{decimal_places}f}{unit}"
^
SyntaxError: invalid syntax
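The failing line is an f-string, which is only valid syntax on Python 3.6 and later; the traceback paths point at python3.5/dist-packages, so the 3.5 interpreter appears to be in use despite Python 3.6 being installed. A 3.5-compatible rewrite of that line, with hypothetical values, would be:

```python
size, decimal_places, unit = 355.51, 2, "KB"  # hypothetical values

# Python 3.6+ only (the line that triggers the SyntaxError on 3.5):
# f"{size:.{decimal_places}f}{unit}"

# Python 3.5-compatible equivalent using str.format with a nested spec:
formatted = "{size:.{dp}f}{unit}".format(size=size, dp=decimal_places, unit=unit)
```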

Cannot use AWS Stockholm eu-north-1 without latest libcloud

This got fixed in libcloud commit [2aad12a](https://github.com/apache/libcloud/commit/2aad12ad42eb67a293a7bf675ce65a8e31c951a5).

Once released, cassandra-medusa can upgrade to it, allowing eu-north-1 (and other new aws regions) to be used.

In the meantime, the workaround is to manually install that libcloud version:

pip3 install git+https://github.com/apache/libcloud.git@2aad12ad42eb67a293a7bf675ce65a8e31c951a5 --force-reinstall

Cluster restore on same hardware is not working

Cluster restore on the same hardware is not working. This is a three-node cluster. I made sure agent forwarding is working, and the medusa and medusa-wrapper paths are fine. Medusa is stopping the cassandra service, but the restore is not working. The OS is CentOS 6.10.

[svarupula@test1 ~]$ ssh -A root@localhost
Last login: Tue Feb 11 09:12:30 2020 from test2.sandbox.com

[root@test1 ~]# env | grep SSH_AUTH_SOCK
SSH_AUTH_SOCK=/tmp/ssh-bSGtkZ5120/agent.5120

[root@test1 ~]# medusa restore-cluster --backup-name=test --seed-target test2.sandbox.com
[2020-02-11 09:15:58,500] INFO: Monitoring provider is noop
[2020-02-11 09:15:58,500] INFO: system_auth keyspace will be overwritten with the backup on target nodes
[2020-02-11 09:15:58,649] INFO: Ensuring the backup is found and is complete
[2020-02-11 09:15:58,675] INFO: Restore will happen "In-Place", no new hardware is involved
[2020-02-11 09:15:59,459] INFO: Starting Restore on all the nodes in this list: None
[2020-02-11 09:15:59,459] INFO: Starting cluster restore...
[2020-02-11 09:15:59,459] INFO: Working directory for this execution: /tmp/medusa-job-bd8a39ca-a5ea-4a3a-820f-0fa6ddc5130a
[2020-02-11 09:15:59,460] INFO: About to restore on test1.sandbox.com using {'source': ['test1.sandbox.com'], 'seed': False} as backup source
[2020-02-11 09:15:59,460] INFO: About to restore on test2.sandbox.com using {'source': ['test2.sandbox.com'], 'seed': False} as backup source
[2020-02-11 09:15:59,460] INFO: About to restore on test3.sandbox.com using {'source': ['test3.sandbox.com'], 'seed': False} as backup source
[2020-02-11 09:15:59,460] INFO: This will delete all data on the target nodes and replace it with backup test.
Are you sure you want to proceed? (Y/n)Y
[2020-02-11 09:16:06,026] INFO: target seeds : []
[2020-02-11 09:16:06,026] INFO: Stopping Cassandra on all nodes currently up
[2020-02-11 09:16:06,026] INFO: Executing "/etc/init.d/cassandra stop" on all nodes.
[2020-02-11 09:16:06,423] INFO: Job executing "/etc/init.d/cassandra stop" ran and finished Successfully on all nodes.
[2020-02-11 09:16:06,423] INFO: Restoring data on test1.sandbox.com...
[2020-02-11 09:16:06,423] INFO: Restoring data on test2.sandbox.com...
[2020-02-11 09:16:06,423] INFO: Restoring data on test3.sandbox.com...
[2020-02-11 09:16:06,423] INFO: Executing "nohup sh -c "mkdir /tmp/medusa-job-bd8a39ca-a5ea-4a3a-820f-0fa6ddc5130a; cd /tmp/medusa-job-bd8a39ca-a5ea-4a3a-820f-0fa6ddc5130a && medusa-wrapper sudo medusa --fqdn=%s -vvv restore-node --in-place %s --no-verify --backup-name test --temp-dir /tmp "" on all nodes.
[2020-02-11 09:16:06,840] INFO: Job executing "nohup sh -c "mkdir /tmp/medusa-job-bd8a39ca-a5ea-4a3a-820f-0fa6ddc5130a; cd /tmp/medusa-job-bd8a39ca-a5ea-4a3a-820f-0fa6ddc5130a && medusa-wrapper sudo medusa --fqdn=%s -vvv restore-node --in-place %s --no-verify --backup-name test --temp-dir /tmp "" ran and finished with errors on following nodes: ['test1.sandbox.com', 'test2.sandbox.com', 'test3.sandbox.com']
[2020-02-11 09:16:06,840] INFO: [test1.sandbox.com] nohup: ignoring input and appending output to nohup.out' [2020-02-11 09:16:06,840] INFO: test1.sandbox.com-stdout: nohup: ignoring input and appending output to nohup.out'
[2020-02-11 09:16:06,840] INFO: [test2.sandbox.com] nohup: ignoring input and appending output to nohup.out' [2020-02-11 09:16:06,840] INFO: test2.sandbox.com-stdout: nohup: ignoring input and appending output to nohup.out'
[2020-02-11 09:16:06,841] INFO: [test3.sandbox.com] nohup: ignoring input and appending output to nohup.out' [2020-02-11 09:16:06,841] INFO: test3.sandbox.com-stdout: nohup: ignoring input and appending output to nohup.out'
[2020-02-11 09:16:06,842] ERROR: Some nodes failed to restore. Exiting
[2020-02-11 09:16:06,842] ERROR: This error happened during the cluster restore: Some nodes failed to restore. Exiting
Traceback (most recent call last):
File "/usr/local/lib/python3.6/site-packages/medusa/restore_cluster.py", line 72, in orchestrate
restore.execute()
File "/usr/local/lib/python3.6/site-packages/medusa/restore_cluster.py", line 147, in execute
self._restore_data()
File "/usr/local/lib/python3.6/site-packages/medusa/restore_cluster.py", line 349, in _restore_data
raise Exception(err_msg)
Exception: Some nodes failed to restore. Exiting

[root@test1 ~]# cat nohup.out
sh: medusa-wrapper: command not found

[root@test1 ~]# which medusa-wrapper
/usr/local/bin/medusa-wrapper

[root@test1 ~]# ls -l /tmp/medusa-job-bd8a39ca-a5ea-4a3a-820f-0fa6ddc5130a
total 0

[root@test2 ~]# ls -l /tmp/medusa-job-bd8a39ca-a5ea-4a3a-820f-0fa6ddc5130a
total 0

[root@test3 ~]# ls -l /tmp/medusa-job-bd8a39ca-a5ea-4a3a-820f-0fa6ddc5130a
total 0

Support for Cassandra in Kubernetes

Greetings,

This looks like a great and much-needed project.

I saw the comment on this TLP blog post regarding compatibility with Cassandra running in Docker containers (e.g. a Kubernetes StatefulSet using persistent volumes for storage), and I want to confirm that you do not recommend testing cassandra-medusa in Kubernetes at this time.

Unable to restore-cluster where topology does not match

I am trying to restore a cluster into a new cluster where the topology doesn't match, but I get a python exception:

# medusa restore-cluster --host-list /etc/medusa/restore-mapping.txt --backup-name=test001 --use-sstableloader --temp-dir /mnt/cassandra/data01/__MEDUSA_TMP_DIR/
[2020-01-07 14:56:09,577] INFO: Monitoring provider is noop
[2020-01-07 14:56:09,577] INFO: system_auth keyspace will be left untouched on the target nodes
[2020-01-07 14:56:10,383] INFO: Ensuring the backup is found and is complete
[2020-01-07 14:56:10,406] INFO: Restore will happen on new hardware
[2020-01-07 14:56:10,406] INFO: Starting Restore on all the nodes in this list: /etc/medusa/restore-mapping.txt
[2020-01-07 14:56:10,406] INFO: Starting cluster restore...
[2020-01-07 14:56:10,406] INFO: Working directory for this execution: /mnt/cassandra/data01/__MEDUSA_TMP_DIR/medusa-job-a27776bf-eb6e-4950-99b3-d619aa8a1a38
[2020-01-07 14:56:10,406] INFO: About to restore on 172.17.41.10 using {'source': ['ip-172-17-44-6.eu-west-1.compute.internal'], 'seed': True} as backup source
[2020-01-07 14:56:10,406] INFO: This will delete all data on the target nodes and replace it with backup test001.
Are you sure you want to proceed? (Y/n)Y
[2020-01-07 14:56:11,891] INFO: target seeds : ['172.17.41.10']
[2020-01-07 14:56:11,891] INFO: Restoring schema on the target cluster
[2020-01-07 14:56:11,934] ERROR: This error happened during the cluster restore: 'NoneType' object has no attribute 'new_session'
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/medusa/restore_cluster.py", line 72, in orchestrate
    restore.execute()
  File "/usr/local/lib/python3.7/site-packages/medusa/restore_cluster.py", line 147, in execute
    self._restore_data()
  File "/usr/local/lib/python3.7/site-packages/medusa/restore_cluster.py", line 328, in _restore_data
    self._restore_schema()
  File "/usr/local/lib/python3.7/site-packages/medusa/restore_cluster.py", line 384, in _restore_schema
    with self.session_provider.new_session() as session:
AttributeError: 'NoneType' object has no attribute 'new_session'

As far as I can see, this code path can't work. The session_provider is only set in the if statement where you have given a seed target:

https://github.com/thelastpickle/cassandra-medusa/blob/master/medusa/restore_cluster.py#L130-L144

If you provide a host list instead, you will hit the exception above; providing both is an error:

# medusa restore-cluster --seed-target 172.17.41.10 --host-list /etc/medusa/restore-mapping.txt --backup-name=test001 --use-sstableloader --temp-dir /mnt/cassandra/data01/__MEDUSA_TMP_DIR/
[2020-01-07 14:57:44,946] INFO: Monitoring provider is noop
[2020-01-07 14:57:44,946] ERROR: You must either provide a seed target or a list of host, not both
[2020-01-07 14:57:44,946] ERROR: This error happened during the cluster restore: You must either provide a seed target or a list of host, not both
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/medusa/restore_cluster.py", line 49, in orchestrate
    raise Exception(err_msg)
Exception: You must either provide a seed target or a list of host, not both

It's perfectly possible I'm doing something wrong, but the docs on how to restore-cluster to an alternate topology are a little thin on the ground (I'm happy to try to help with this if I can figure out the correct incantations).

Support for IAM role?

I'm using S3 as the storage and, according to the setup, an access key and secret access key are required. Is there support for using an IAM role instead? If not, do you have an idea of how you would want it to be supported?
If libcloud supports IAM roles, or you have a preference on which lib to use, I might be able to help implement it.

Thanks,
Yanan

Allow awscli to look for or accept different paths when running in a venv

Hello. Sometimes Medusa isn't able to find the awscli binary because it assumes it's always installed in the primary Python environment. This issue arises for me because I build an RPM which installs everything in a venv.

So either provide a config value for it, or dynamically try to find it as in #77.

Thanks. I realize this request might be a little niche :)

Restore-cluster fails on a ScyllaDB Cluster

When running medusa restore-cluster or medusa restore-node the command fails with the following error:

[2020-02-18 16:05:52,497] DEBUG: Restoring table view_virtual_columns-08843b6345dc3be29798a0418295cfaa with sstableloader...
Unrecognized option: --storage-port
Traceback (most recent call last):
  File "/usr/local/bin/medusa", line 11, in <module>
    load_entry_point('cassandra-medusa==0.6.0.dev0', 'console_scripts', 'medusa')()
  File "/usr/local/lib/python3.5/dist-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.5/dist-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.5/dist-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.5/dist-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/click/decorators.py", line 64, in new_func
    return ctx.invoke(f, obj, *args, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/medusa/medusacli.py", line 212, in restore_node
    verify, set(keyspaces), set(tables), use_sstableloader)
  File "/usr/local/lib/python3.5/dist-packages/medusa/restore_node.py", line 51, in restore_node
    keyspaces, tables)
  File "/usr/local/lib/python3.5/dist-packages/medusa/restore_node.py", line 152, in restore_node_sstableloader
    invoke_sstableloader(config, download_dir, keep_auth, fqtns_to_restore, cassandra.storage_port)
  File "/usr/local/lib/python3.5/dist-packages/medusa/restore_node.py", line 176, in invoke_sstableloader
    os.path.join(ks_path, table)])
  File "/usr/local/lib/python3.5/dist-packages/gevent/subprocess.py", line 348, in check_output
    raise CalledProcessError(retcode, process.args, output=output)
subprocess.CalledProcessError: Command '['sstableloader', '-d', 'scylladb-test-4a53de998b', '--storage-port', '7000', '--username', 'xxx', '--password', 'xxx', '--no-progress', '/tmp/medusa-restore-55d11fa6-7324-499f-b2d3-24faedefa942/system_schema/view_virtual_columns-08843b6345dc3be29798a0418295cfaa']' returned non-zero exit status 1

The problem is that the sstableloader binary on ScyllaDB differs from Cassandra's and does not have the --storage-port option. The workaround I used was to comment out/remove the storage_port arg references in the restore_node.py file. With this change, the restore runs successfully.
So I wonder how we can arrange a more elegant workaround to keep compatibility between Cassandra and ScyllaDB.
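One possible shape for a more elegant fix, sketched here rather than taken from Medusa's code: build the sstableloader argv conditionally so that Cassandra-only flags are skipped when targeting ScyllaDB (the is_scylla detection is a hypothetical placeholder):

```python
def sstableloader_cmd(host, storage_port, username, password, table_path,
                      is_scylla=False):
    # --storage-port exists in Cassandra's sstableloader but not in
    # ScyllaDB's, so only add it when targeting Cassandra.
    cmd = ['sstableloader', '-d', host]
    if not is_scylla:
        cmd += ['--storage-port', str(storage_port)]
    cmd += ['--username', username, '--password', password,
            '--no-progress', table_path]
    return cmd
```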

ERROR: This error happened during the backup: Medusa only supports one data directory

It seems Medusa does not support C* nodes with more than one data directory.
I believe it would be very useful to mention this in the docs.

Are there any plans to support it in the future? ;-)

BTW, it seems to be an excellent tool for backup & restore: lightweight (as opposed to Priam or Cassy) and with a good foundation (no Java!).

But I can't use it right now :(

Best regards
Piotr Rybicki

Purge leaves behind incomplete backups... forever!

Project board link

michaelsembwever reported an issue: he had no way to remove incomplete backups during his tests:

  • No delete function - addressed in #18
  • No purge of incomplete backups. That's what we should fix here.

This leaves the user with no way of cleaning up backups without accessing the storage directly and deleting files, hoping to get it right or leaving the data behind forever.

It seems this is all about making the purge consider any started backup instead of only checking the complete backups.

┆Issue is synchronized with this Jira Task by Unito
┆Issue Number: K8SSAND-325

Log file or log streaming configuration

Is there any simple way to configure file-based logging, or the streaming of log messages from the medusa CLI to a centralised logging system?

It would be useful to be able to set up alerting based on these messages to ensure that the backup processes have been successful.
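A generic sketch of one way to do this with Python's standard logging module, attaching a FileHandler to the root logger so that the CLI's log records also land in a file a log shipper can tail; this is not Medusa's actual configuration mechanism, and the file path is a placeholder:

```python
import logging

# Attach a FileHandler to the root logger; any module-level logger
# (e.g. the ones emitting the "[... ] INFO: Backup done" lines) will
# propagate its records here. The path is a hypothetical example.
handler = logging.FileHandler('medusa.log')
handler.setFormatter(logging.Formatter('[%(asctime)s] %(levelname)s: %(message)s'))
root = logging.getLogger()
root.addHandler(handler)
root.setLevel(logging.INFO)
```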

Medusa not stopping the cassandra service properly

While restoring a single-node backup on the same node, Medusa is not stopping Cassandra properly. I suspect it is removing the "commitlogs" folder while the Cassandra shutdown (/etc/init.d/cassandra stop) is still in progress, so the shutdown is not clean. After the restore, Cassandra does not start up as a service (/etc/init.d/cassandra start), so I have to run "cassandra stop" and "cassandra start" again. I am on CentOS 6.10.

Medusa has to wait until Cassandra has shut down gracefully before removing the commitlogs folder.

[root@localhost ~]# medusa restore-node --backup-name test4 --temp-dir /localhost/data/cassandra/tmp/ --verify --in-place
[2020-02-06 06:40:57,253] WARNING: is ccm : 0
[2020-02-06 06:40:57,283] INFO: Downloading data from backup to /localhost/data/cassandra/tmp/medusa-restore-efc46e67-f299-4e10-bc98-a3397c7fcf97
[2020-02-06 06:41:14,654] INFO: Stopping Cassandra
[2020-02-06 06:41:14,698] INFO: Moving backup data to Cassandra data directory
[2020-02-06 06:41:16,010] INFO: No --seeds specified so we will not wait for any
[2020-02-06 06:41:16,010] INFO: Starting Cassandra
[2020-02-06 06:41:16,024] INFO: Verifying the restore
[2020-02-06 06:41:16,024] INFO: Waiting for Cassandra to come up on localhost.localhost.net
[2020-02-06 06:41:17,833] INFO: Cassandra is up on localhost.localhost.net
[2020-02-06 06:41:17,834] INFO: Executing restore verify query: select * from tutorialspoint.emp;
Exception: Could not establish CQL session after 5

######## Error LOG

INFO [StorageServiceShutdownHook] 2020-02-06 06:41:14,674 HintsService.java:209 - Paused hints dispatch
INFO [StorageServiceShutdownHook] 2020-02-06 06:41:14,677 Server.java:179 - Stop listening for CQL clients
INFO [StorageServiceShutdownHook] 2020-02-06 06:41:14,678 Gossiper.java:1647 - Announcing shutdown
INFO [StorageServiceShutdownHook] 2020-02-06 06:41:14,679 StorageService.java:2442 - Node /127.0.0.1 state jump to shutdown
INFO [StorageServiceShutdownHook] 2020-02-06 06:41:16,681 MessagingService.java:985 - Waiting for messaging service to quiesce
INFO [ACCEPT-/127.0.0.1] 2020-02-06 06:41:16,682 MessagingService.java:1346 - MessagingService has terminated the accept() thread
INFO [StorageServiceShutdownHook] 2020-02-06 06:41:16,988 HintsService.java:209 - Paused hints dispatch
ERROR [COMMIT-LOG-ALLOCATOR] 2020-02-06 06:41:16,993 StorageService.java:465 - Stopping gossiper
WARN [COMMIT-LOG-ALLOCATOR] 2020-02-06 06:41:16,994 StorageService.java:322 - Stopping gossip by operator request
INFO [COMMIT-LOG-ALLOCATOR] 2020-02-06 06:41:16,994 Gossiper.java:1647 - Announcing shutdown
INFO [COMMIT-LOG-ALLOCATOR] 2020-02-06 06:41:16,995 StorageService.java:2442 - Node /127.0.0.1 state jump to shutdown
ERROR [StorageServiceShutdownHook] 2020-02-06 06:41:16,999 AbstractCommitLogSegmentManager.java:313 - Failed waiting for a forced recycle of in-use commit log segments
java.lang.AssertionError: attempted to delete non-existing file CommitLog-6-1580971059608.log
at org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:133) ~[apache-cassandra-3.11.5.jar:3.11.5]
at org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:160) ~[apache-cassandra-3.11.5.jar:3.11.5]
at org.apache.cassandra.db.commitlog.CommitLogSegmentManagerStandard.discard(CommitLogSegmentManagerStandard.java:37) ~[apache-cassandra-3.11.5.jar:3.11.5]
at org.apache.cassandra.db.commitlog.AbstractCommitLogSegmentManager.archiveAndDiscard(AbstractCommitLogSegmentManager.java:329) ~[apache-cassandra-3.11.5.jar:3.11.5]
at org.apache.cassandra.db.commitlog.AbstractCommitLogSegmentManager.forceRecycleAll(AbstractCommitLogSegmentManager.java:303) ~[apache-cassandra-3.11.5.jar:3.11.5]
at org.apache.cassandra.db.commitlog.CommitLog.forceRecycleAllSegments(CommitLog.java:208) [apache-cassandra-3.11.5.jar:3.11.5]
at org.apache.cassandra.service.StorageService.drain(StorageService.java:4693) [apache-cassandra-3.11.5.jar:3.11.5]
at org.apache.cassandra.service.StorageService$1.runMayThrow(StorageService.java:681) [apache-cassandra-3.11.5.jar:3.11.5]
at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) [apache-cassandra-3.11.5.jar:3.11.5]
at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:84) [apache-cassandra-3.11.5.jar:3.11.5]
at java.lang.Thread.run(Unknown Source) ~[na:1.8.0_171]
ERROR [COMMIT-LOG-ALLOCATOR] 2020-02-06 06:41:18,997 CommitLog.java:464 - Failed managing commit log segments. Commit disk failure policy is stop; terminating thread
org.apache.cassandra.io.FSWriteError: java.nio.file.NoSuchFileException: /localhost/data/cassandra/commitlog/CommitLog-6-1580971059610.log
at org.apache.cassandra.db.commitlog.CommitLogSegment.&lt;init&gt;(CommitLogSegment.java:174) ~[apache-cassandra-3.11.5.jar:3.11.5]
at org.apache.cassandra.db.commitlog.MemoryMappedSegment.&lt;init&gt;(MemoryMappedSegment.java:45) ~[apache-cassandra-3.11.5.jar:3.11.5]
at org.apache.cassandra.db.commitlog.CommitLogSegment.createSegment(CommitLogSegment.java:131) ~[apache-cassandra-3.11.5.jar:3.11.5]
at org.apache.cassandra.db.commitlog.CommitLogSegmentManagerStandard.createSegment(CommitLogSegmentManagerStandard.java:78) ~[apache-cassandra-3.11.5.jar:3.11.5]
at org.apache.cassandra.db.commitlog.AbstractCommitLogSegmentManager$1.runMayThrow(AbstractCommitLogSegmentManager.java:110) ~[apache-cassandra-3.11.5.jar:3.11.5]
at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) [apache-cassandra-3.11.5.jar:3.11.5]
at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:84) [apache-cassandra-3.11.5.jar:3.11.5]
at java.lang.Thread.run(Unknown Source) ~[na:1.8.0_171]
Caused by: java.nio.file.NoSuchFileException: /localhost/data/cassandra/commitlog/CommitLog-6-1580971059610.log
at sun.nio.fs.UnixException.translateToIOException(Unknown Source) ~[na:1.8.0_171]
at sun.nio.fs.UnixException.rethrowAsIOException(Unknown Source) ~[na:1.8.0_171]
at sun.nio.fs.UnixException.rethrowAsIOException(Unknown Source) ~[na:1.8.0_171]
at sun.nio.fs.UnixFileSystemProvider.newFileChannel(Unknown Source) ~[na:1.8.0_171]
at java.nio.channels.FileChannel.open(Unknown Source) ~[na:1.8.0_171]
at java.nio.channels.FileChannel.open(Unknown Source) ~[na:1.8.0_171]
at org.apache.cassandra.db.commitlog.CommitLogSegment.&lt;init&gt;(CommitLogSegment.java:169) ~[na:1.8.0_171]
... 7 common frames omitted
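The requested fix amounts to polling until the Cassandra process has actually exited before touching the commitlog folder. A minimal sketch (the `is_running` callable is an assumption; in medusa it could wrap a pid check or a `nodetool` probe):

```python
import time

def wait_for_shutdown(is_running, timeout=120, interval=1.0):
    """Block until is_running() reports False, i.e. Cassandra has fully
    drained and exited; only then is it safe to remove the commitlog dir."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        if not is_running():
            return True
        time.sleep(interval)
    return False
```

Only once this returns True should the restore move the commitlog folder aside.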

Differential mode uploads everything again

Related to #25

Running medusa backup --backup-name=20191211-1 --mode=differential
And then medusa backup --backup-name=20191211-2 --mode=differential
And then medusa backup --backup-name=20191211-3 --mode=differential

Always uploads all files again.

There's a comment in #25:

Do you have a custom (non system) keyspace in there with sstables that didn't get compacted away between the backups?

Does this make a difference? I expected that if I run one backup right after the other, only the differences (almost none) would be uploaded.

I'm using my branch (the one with the changes in PR #32), although I can't see how that would change the tool's behaviour if the libcloud interface is used.

Backup only specific keyspace(s)

It is possible to recover a single keyspace. Is there any particular reason it is not possible to back up only one keyspace?

Would that be a simple addition?
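It might be: `nodetool snapshot` already accepts a list of keyspaces, so the snapshot step mainly needs to pass them through. A sketch of building the command (how medusa would expose the option on its own CLI is left open):

```python
def build_snapshot_command(tag, keyspaces=None):
    """nodetool snapshots all keyspaces when none are listed, otherwise
    only the given ones, so scoping a backup is mostly appending them."""
    cmd = ["nodetool", "snapshot", "-t", tag]
    if keyspaces:
        cmd.extend(keyspaces)
    return cmd
```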

Support secondary indexes

Medusa currently doesn't support backing up and restoring secondary indexes.
We need to add this ability so they do not require to be rebuilt after a restore.

Cannot backup production loads to AWS s3 buckets ("Broken pipe")

Uploading large SSTables to an S3 bucket often fails with BrokenPipeError, so in practice production backups to an S3 bucket will fail.

The Apache Libcloud library used has no retry policy when uploading chunks/multiparts to an S3 bucket. In addition, the Medusa code does not use libcloud's chunked/multipart upload API.

The fix for this has two parts:

  1. the Medusa code needs to use libcloud's api for multipart uploads, and deal with AWS's different approach to hashing multipart-uploaded files, and
  2. the libcloud library needs to have a retry policy added to it.

(1) is available in https://github.com/thelastpickle/cassandra-medusa/compare/alex/fix-s3-broken-pipes

(2) is available in apache/libcloud@trunk...thelastpickle:retry-s3-multipart-uploads
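Until a retry policy lands in libcloud itself, the same effect can be approximated on the caller's side by wrapping each chunk upload in a retry with exponential backoff. A sketch (`upload_part` stands in for whatever callable performs the actual transfer; BrokenPipeError is a subclass of OSError, so one except clause covers both):

```python
import time

def upload_with_retries(upload_part, data, max_attempts=5, base_delay=1.0):
    """Retry transient network failures with exponential backoff,
    re-raising only after the last attempt has failed."""
    for attempt in range(1, max_attempts + 1):
        try:
            return upload_part(data)
        except OSError:
            if attempt == max_attempts:
                raise
            time.sleep(base_delay * 2 ** (attempt - 1))
```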

Add functionality to delete specific backups

Backups may be created by testing, or may fail and remain in an incomplete state. Such backups are useless and waste resources and money.

There should be a way to delete such backups without having to go into the S3 buckets and manually delete objects (which is not trivial because of the layout Medusa creates there).

MD5 hash checksum does not match

I'm trying to back up a 6-node cluster and this keeps happening.

3 nodes out of 6 finish fine but 3 always fail (apparently random machines).

The failed ones contain logs like this:

[2020-01-21 09:04:12,816] INFO: Uploading /cassandradb/data/sm_scotia/activity_v2-e1dbfa20f27b11e9b97a75eb25aa986a/snapshots/medusa-3943bf26-f320-4bd9-8637-013b06ec4ffa/md-118609-big-Statistics.db (17.950KiB)
[2020-01-21 09:04:12,878] DEBUG: Cleaning up snapshot
[2020-01-21 09:04:12,878] DEBUG: Executing: nodetool clearsnapshot -t medusa-3943bf26-f320-4bd9-8637-013b06ec4ffa
[2020-01-21 09:04:14,734] DEBUG: nodetool output: Requested clearing snapshot(s) for [all keyspaces] with snapshot name [medusa-3943bf26-f320-4bd9-8637-013b06ec4ffa]

[2020-01-21 09:04:14,735] ERROR: This error happened during the backup: <ObjectHashMismatchError in <libcloud.storage.drivers.rgw.S3RGWStorageDriver object at 0x7f82608ec190>, value=MD5 hash  checksum does not match 729db185d04fb6a33c6b231c9d09a215, object = pf-ca1-cc3/data/sm_scotia/activity_v2-e1dbfa20f27b11e9b97a75eb25aa986a/md-108464-big-Data.db>

Any tips or reasoning for this? Log level is already debug so there's not more info than that.
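One thing worth checking is whether the digest computed locally actually matches what the object store reports: for multipart uploads, S3-compatible stores return an ETag that is not a plain MD5 of the whole file (it carries a `-<part count>` suffix), which can produce exactly this kind of mismatch. A sketch for computing the local digest to compare, streamed so large SSTables don't need to fit in memory:

```python
import hashlib

def md5_hex(path, chunk_size=1024 * 1024):
    """Stream the file through MD5 in 1 MiB chunks; return the hex digest."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```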

restore-cluster command ignores --prefix argument

Project board link

For two clusters with different "prefix" settings in medusa.ini, Medusa should take the --prefix from the command line and override that property from medusa.ini during the restore-cluster process.
Otherwise a "backup not found" error is thrown.

Command:
medusa restore-cluster
--prefix={the prefix of backup on S3 storage}
--backup-name=.....
.....
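The expected behaviour amounts to overlaying any CLI option that was actually set on top of the values parsed from medusa.ini. A sketch of that merge (the dict-based config shown here is illustrative, not medusa's internal representation):

```python
def apply_cli_overrides(config, cli_options):
    """Return a copy of the parsed config in which every CLI option that
    was actually given (not None) overrides the medusa.ini value."""
    merged = dict(config)
    for key, value in cli_options.items():
        if value is not None:
            merged[key] = value
    return merged
```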

┆Issue is synchronized with this Jira Task by Unito
┆Issue Number: K8SSAND-333

Gather up all logs in one place

Project board link

Medusa writes its standard logs, which are correctly redirected to the log folder in use (/var/log/medusa), but some commands that run on each distinct host actually log to /tmp/<workdir>/stdout and /tmp/<workdir>/stderr.

This is confusing when debugging Medusa. We might want to improve this.

┆Issue is synchronized with this Jira Task by Unito
┆Issue Number: K8SSAND-324

Provide web api for medusa

Project board link

I'm currently in the process of developing a Kubernetes Operator for Cassandra, including the option for backup and restore, and Medusa seems like a good fit for that.

The current approach is to run Medusa in a container alongside Cassandra and trigger the backup in there. This works fine so far, although Medusa isn't really designed to be run this way.

In my current build, I have written a very thin layer around the CLI that starts a web server and lets a user start the Medusa commands with HTTP requests, as this is a more Kubernetes-style approach than SSHing or `kubectl exec`-ing into the container and running the commands there.

Is this something that could be integrated here upstream?
I'd like some feedback on whether it makes sense to put a bit more work into it, write a more complete design proposal, implement it for the rest of the CLI commands and open a PR, or whether this is something that's not going to be integrated anyway.
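For reference, the thin layer described above can be as small as the following sketch, using only the standard library (the routes and the `name=` query parameter are made up for illustration): an HTTP handler maps a path to a medusa CLI invocation and shells out to it.

```python
import subprocess
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical route table: HTTP path -> medusa CLI command.
COMMANDS = {
    "/backup": ["medusa", "backup"],
    "/status": ["medusa", "status"],
}

def route(path):
    """Translate '/backup?name=foo' into ['medusa', 'backup', '--backup-name', 'foo']."""
    base, _, query = path.partition("?")
    cmd = COMMANDS.get(base)
    if cmd is not None and query.startswith("name="):
        cmd = cmd + ["--backup-name", query[len("name="):]]
    return cmd

class MedusaHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        cmd = route(self.path)
        if cmd is None:
            self.send_response(404)
            self.end_headers()
            return
        result = subprocess.run(cmd, capture_output=True, text=True)
        self.send_response(200 if result.returncode == 0 else 500)
        self.end_headers()
        self.wfile.write(result.stdout.encode())

# To serve: HTTPServer(("0.0.0.0", 8080), MedusaHandler).serve_forever()
```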

┆Issue is synchronized with this Jira Task by Unito
┆Issue Number: K8SSAND-334

No such file or directory: 'nodetool'

Hi,
I am using Cassandra 3.11 with cassandra-medusa 0.4.1.
Cassandra is located at /usr/share/cassandra.

It shows:

:~# medusa backup --backup-name=22012020
[2020-01-22 16:38:38] INFO: Monitoring provider is noop
[2020-01-22 16:38:38] WARNING: is ccm : 0
[2020-01-22 16:38:38] INFO: Creating snapshot
[2020-01-22 16:38:38] INFO: Saving tokenmap and schema
[2020-01-22 16:38:38] INFO: Node local does not have latest backup
[2020-01-22 16:38:38] INFO: Starting backup
[2020-01-22 16:38:38] ERROR: This error happened during the backup: [Errno 2] No such file or directory: 'nodetool'
Traceback (most recent call last):
File "/usr/local/lib/python3.7/site-packages/medusa/backup.py", line 274, in main
cassandra, node_backup, storage, differential_mode, config)
File "/usr/local/lib/python3.7/site-packages/medusa/backup.py", line 318, in do_backup
with cassandra.create_snapshot() as snapshot:
File "/usr/local/lib/python3.7/site-packages/medusa/cassandra_utils.py", line 336, in create_snapshot
subprocess.check_call(cmd, stdout=subprocess.DEVNULL, universal_newlines=True)
File "/usr/local/lib/python3.7/site-packages/gevent/subprocess.py", line 270, in check_call
retcode = call(*popenargs, **kwargs)
File "/usr/local/lib/python3.7/site-packages/gevent/subprocess.py", line 249, in call
with Popen(*popenargs, **kwargs) as p:
File "/usr/local/lib/python3.7/site-packages/gevent/subprocess.py", line 627, in __init__
restore_signals, start_new_session)
File "/usr/local/lib/python3.7/site-packages/gevent/subprocess.py", line 1505, in _execute_child
raise child_exception
FileNotFoundError: [Errno 2] No such file or directory: 'nodetool'
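Medusa invokes plain `nodetool`, so the binary must be resolvable via the PATH of whichever user runs medusa (for the setup above, putting /usr/share/cassandra/bin on PATH should fix it). A sketch of the lookup logic, where the fallback directory is an assumption about common install layouts:

```python
import os
import shutil

def find_nodetool(extra_dirs=("/usr/share/cassandra/bin",), which=shutil.which):
    """Resolve nodetool via PATH first, then fall back to common
    Cassandra installation bin directories."""
    found = which("nodetool")
    if found:
        return found
    for directory in extra_dirs:
        candidate = os.path.join(directory, "nodetool")
        if os.access(candidate, os.X_OK):
            return candidate
    return None
```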

3.4+ in setup.py is not true, at least 3.6 is required

setup.py allows medusa to be installed on any Python 3.4+. However, when trying to use medusa, a SyntaxError is raised on the f-string usage.

Traceback (most recent call last):
  File "/usr/local/bin/medusa", line 6, in <module>
    from medusa.medusacli import cli
  File "/usr/local/lib/python3.5/dist-packages/medusa/medusacli.py", line 33, in <module>
    import medusa.backup
  File "/usr/local/lib/python3.5/dist-packages/medusa/backup.py", line 33, in <module>
    from medusa.index import add_backup_start_to_index, add_backup_finish_to_index, set_latest_backup_in_index
  File "/usr/local/lib/python3.5/dist-packages/medusa/index.py", line 21, in <module>
    import medusa.storage
  File "/usr/local/lib/python3.5/dist-packages/medusa/storage/__init__.py", line 32, in <module>
    from medusa.storage.google_storage import GoogleStorage
  File "/usr/local/lib/python3.5/dist-packages/medusa/storage/google_storage.py", line 26, in <module>
    from medusa.storage.abstract_storage import AbstractStorage
  File "/usr/local/lib/python3.5/dist-packages/medusa/storage/abstract_storage.py", line 25, in <module>
    import medusa.storage.concurrent
  File "/usr/local/lib/python3.5/dist-packages/medusa/storage/concurrent.py", line 161
    return f"{size:.{decimal_places}f}{unit}"

The blog post at TLP does state:

The command line tool that uses Python version 3.6 and needs to be installed on all the nodes you want to back up.
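The direct fix is to declare `python_requires=">=3.6"` in setup.py so pip refuses to install on older interpreters. A belt-and-braces sketch (not from the medusa codebase) is an explicit runtime guard, so users get a clear error instead of a SyntaxError deep inside an import chain:

```python
import sys

MINIMUM_PYTHON = (3, 6)  # f-strings (PEP 498) first appeared in 3.6

def check_python_version(version_info=None):
    """Raise a clear error on interpreters too old to parse the f-strings."""
    version_info = version_info or sys.version_info
    if tuple(version_info[:2]) < MINIMUM_PYTHON:
        raise RuntimeError(
            "cassandra-medusa requires Python %d.%d or newer" % MINIMUM_PYTHON)
```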

Error during backup: "nodetool clearsnapshot -t medusa-.... returned non-zero exit status 1

I'm trying to back up my cluster for the 3rd time,
and every attempt has ended with something like ERROR: This error happened during the backup:

Command '['nodetool', 'clearsnapshot', '-t', 'medusa-0d88c264-3149-42ba-b71e-d3d5a14ca1d0']' returned non-zero exit status 1.

This seems to happen on all nodes (6 in total). I'm not even sure where to start debugging this.

get_last_complete_cluster_backup fails if no complete backups are available

medusa get-last-complete-cluster-backup

[...]
Traceback (most recent call last):
  File "/usr/local/bin/medusa", line 11, in <module>
    sys.exit(cli())
  File "/usr/local/lib/python3.5/dist-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.5/dist-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.5/dist-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.5/dist-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/click/decorators.py", line 64, in new_func
    return ctx.invoke(f, obj, *args, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/medusa/medusacli.py", line 252, in get_last_complete_cluster_backup
    print(backup.name)
AttributeError: 'NoneType' object has no attribute 'name'
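The crash is a missing None check in the CLI handler. A sketch of the guard (the function name is illustrative; the actual fix would live in medusacli.py around the `print(backup.name)` call):

```python
def describe_last_complete_backup(backup):
    """The lookup returns None when no complete cluster backup exists yet;
    report that cleanly instead of dereferencing None."""
    if backup is None:
        return "No complete cluster backup found"
    return backup.name
```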

medusa support upload files package and data compress

Project board link

When doing a backup, the data for one generation number (e.g. md-18-big) consists of eight files. If we could tar them into one file (or tar all the files under a table directory into one file), the resulting file could be compressed and the final compressed file uploaded to cloud storage. This way we could save network bandwidth as well as upload and download time. In almost all cases the CPU can handle the compression.
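The bundling step could be sketched like this, assuming the component files of one generation have been gathered first:

```python
import os
import tarfile

def bundle_sstable(component_paths, archive_path):
    """Tar + gzip the component files of one SSTable generation
    (e.g. all md-18-big-* files) into a single compressed archive,
    turning many small uploads into one."""
    with tarfile.open(archive_path, "w:gz") as tar:
        for path in component_paths:
            tar.add(path, arcname=os.path.basename(path))
    return archive_path
```

One trade-off worth noting: bundling per generation would interact with differential backups, since unchanged files could no longer be deduplicated individually.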

┆Issue is synchronized with this Jira Task by Unito
┆Issue Number: K8SSAND-328

Can't find backups for restore or list

Python 3.7.5 (default, Nov 14 2019, 13:29:23)
Docker debian build.

Config file

[cassandra]
#stop_cmd = /etc/init.d/cassandra stop
#start_cmd = /etc/init.d/cassandra start
#config_file = <path to cassandra.yaml. Defaults to /etc/cassandra/cassandra.yaml>
cql_username = cassandra
cql_password = cassandra
#check_running = <Command ran to verify if Cassandra is running on a node. Defaults to "nodetool version">

[storage]
storage_provider = local
base_path = /home
bucket_name = shed

[restore]
;health_check = <Which ports to check when verifying a node restored properly. Options are 'cql' (default), 'thrift', 'all'.>
;query =
;expected_rows =
;expected_result = <Coma separated string representation of values returned by the query. Checks only 1st row returned, and only if specified>

root@uk_staging-cassandra-03:/etc/medusa# medusa backup
[2019-11-19 14:23:52] INFO: Monitoring provider is noop
[2019-11-19 14:23:52] WARNING: is ccm : 0
[2019-11-19 14:23:52] ERROR: This error happened during the backup: Error: Backup 2019111914 already exists
Traceback (most recent call last):
File "/usr/local/lib/python3.7/site-packages/medusa/backup.py", line 191, in main
raise IOError('Error: Backup {} already exists'.format(backup_name))
OSError: Error: Backup 2019111914 already exists

Proof that the backup exists is above ^ — it was created a few minutes before.
I've added some prints to the command so you can see the config.

root@uk_staging-cassandra-03:/etc/medusa# medusa list-backups
StorageConfig(bucket_name='shed', key_file=None, prefix=None, fqdn='uk_staging-cassandra-3', host_file_separator=',', storage_provider='local', api_key_or_username=None, api_secret_or_password=None, base_path='/home', max_backup_age='0', max_backup_count='0', api_profile='default')

Issue with Prefix parameter when using s3 storage

I've started experimenting with Medusa and noticed that if the prefix parameter is set (to allow multi-tenancy within the same bucket), the backup is reported as invalid.

In the example below I have set the prefix to "medusa", and for each sstable the verification output reports two messages as below:

[medusa/cassandra01dev/backup/data/system_schema/views-9786ac1cdd583201a7cdad556410c985/md-45-big-Index.db] exists in storage, but not in manifest

[cassandra01dev/backup/data/system_schema/views-9786ac1cdd583201a7cdad556410c985/md-45-big-Index.db] Doesn't exists

Once I commented out the prefix setting in medusa.ini, the backup verified correctly.

Validating backup ...

  • Completion: OK!
  • Manifest validated: OK!!

Continuous upload of incremental sstables

Project board link

Is there any intention for Medusa to support a continuous/semi-continuous operating mode where, with Cassandra configured to write incremental backups, Medusa uploads (and cleans up) all of them? (And on the restore side, restores the latest full snapshot, then the incremental tables available from after it?)

I'm trying to determine the path of least resistance for our needs: we need to minimize the window of data lost in the event of a total cluster loss. Daily or even hourly snapshots are not good enough.

We have a tool that does this, but it is lacking in a few ways (pruning, cloud portability, general code fixups/test coverage), so I'm trying to get a feeling for whether extending/improving Medusa is a viable (/better) option than finishing our tool.
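For what it's worth, the upload side of such a mode reduces to repeatedly diffing the `backups/` directories (Cassandra hard-links each newly flushed SSTable there when `incremental_backups: true` is set) against the set already shipped. A polling sketch of that core step:

```python
import os

def new_incrementals(backup_dir, already_uploaded):
    """Return files under <table>/backups/ that have appeared since the
    last poll; the caller uploads them, then adds them to already_uploaded."""
    current = set()
    for root, _dirs, files in os.walk(backup_dir):
        for name in files:
            current.add(os.path.join(root, name))
    return sorted(current - already_uploaded)
```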

┆Issue is synchronized with this Jira Task by Unito
┆Issue Number: K8SSAND-274
