GCE Rescue

How to use GCE Rescue

This page shows you how to rescue a virtual machine (VM) instance by using GCE Rescue.

With GCE Rescue, you can boot the VM instance from a temporary boot disk to fix whatever problem is preventing the instance from starting. Specifically, GCE Rescue uses a temporary Linux image as the VM instance’s boot disk so that you can do maintenance on the faulty boot disk while the instance is in rescue mode.

When you run GCE Rescue, it creates a snapshot of the existing boot disk as a backup, unless you pass --skip-snapshot.

After you’ve fixed the faulty disk, you can restore the original configuration by running GCE Rescue again, which reboots the VM instance in normal mode.

The advantage of using GCE Rescue is that it reuses the resources already configured on the VM instance, such as networking, VPC firewall rules, and routes, so you do not need to create a duplicate VM instance to repair the faulty boot disk.

Note: GCE Rescue is not an officially supported Google Cloud product. The Google Cloud Support team maintains this repository, but the product is experimental and, therefore, it can be unstable.

Requirements

To install and use GCE Rescue, you must have:

  1. Python environment >= 3.7
  2. gcloud CLI

Note

The requirement of Python >= 3.7 was inherited from the google-api-python-client package.

Although some installation methods let you install gce-rescue with a Python version earlier than 3.7, and it may even work, this is neither recommended nor supported.
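The two requirements above can be verified before installing. The following is a minimal sketch, not part of GCE Rescue; the function name check_requirements and its parameters are hypothetical and exist only for illustration.

```python
import shutil
import sys


def check_requirements(version_info=sys.version_info, which=shutil.which):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    # Python >= 3.7 is the documented minimum.
    if tuple(version_info[:2]) < (3, 7):
        problems.append("Python 3.7 or later is required")
    # The gcloud CLI must be discoverable on PATH.
    if which("gcloud") is None:
        problems.append("gcloud CLI not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in check_requirements():
        print(problem)
```

The parameters default to the running interpreter and the real PATH lookup; they are injectable only so the check can be exercised without changing the environment.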

Installation

To install GCE Rescue, follow these steps:

  1. Clone the git repository to your local machine:
$ git clone https://github.com/GoogleCloudPlatform/gce-rescue.git
  2. Navigate to the gce-rescue folder:
$ cd gce-rescue/
  3. To install GCE Rescue, select one of the following options:
  • Install GCE Rescue globally.
$ sudo python3 setup.py install
  • Install GCE Rescue locally.
$ python3 setup.py install --user

Note: If you cannot find the gce-rescue executable after you install GCE Rescue, add the bin directory of the Python user base to your PATH:

$ export PATH=$PATH:$(python3 -m site --user-base)/bin
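To see whether that directory is already on your PATH, a small check along these lines may help. It is a sketch that assumes a POSIX layout (scripts under bin in the user base); the helper names user_bin_dir and is_on_path are hypothetical.

```python
import os
import site


def user_bin_dir():
    """Directory where a --user install places console scripts (POSIX layout)."""
    # site.getuserbase() returns the same value as `python3 -m site --user-base`.
    return os.path.join(site.getuserbase(), "bin")


def is_on_path(directory, path_value):
    """Return True if directory is one of the entries in a PATH-style string."""
    target = os.path.normpath(directory)
    entries = [os.path.normpath(p) for p in path_value.split(os.pathsep) if p]
    return target in entries


if __name__ == "__main__":
    bin_dir = user_bin_dir()
    print(bin_dir, "on PATH:", is_on_path(bin_dir, os.environ.get("PATH", "")))
```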

Usage

gce-rescue --help
usage: gce-rescue [-h] [-p PROJECT] -z ZONE -n NAME [-d] [-f] [--skip-snapshot]

GCE Rescue v0.4-beta - Set/Reset GCE instances to boot in rescue mode.

optional arguments:
  -h, --help            show this help message and exit
  -p PROJECT, --project PROJECT
                        The project-id that has the instance.
  -z ZONE, --zone ZONE  Zone where the instance is created.
  -n NAME, --name NAME  Instance name.
  -d, --debug           Print to the log file in debug level
  -f, --force           Don't ask for confirmation.
  --skip-snapshot       Skip backing up the disk using a snapshot.
  • --zone

    • The instance's zone. (REQUIRED)
  • --name

    • The instance name (not instance ID). (REQUIRED)
  • --project

    • The project-id of the faulty instance. (OPTIONAL)
  • --force

    • Do not ask for confirmation. It can be useful when running from a script.
  • --debug

    • If provided, the log output will be set to DEBUG level. (OPTIONAL)

    • The log file is created in the current directory (./), with the VM name and a timestamp in its file name. Use it to troubleshoot failed executions or, if necessary, to manually recover the instance's original configuration.

    • The log files contain important information about the initial state of the VM instance that may be required to manually restore it.

  • --skip-snapshot

    • Skip the snapshot creation. (OPTIONAL)
    • By default, GCE Rescue creates a snapshot of your boot disk before it puts the instance in rescue mode. This can be time consuming and is not always necessary. Use this argument to skip that step.
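When calling gce-rescue from a script, the options described above can be assembled programmatically. This is a minimal sketch that uses only the flags shown in the help output; the function build_gce_rescue_command is hypothetical and is not shipped with GCE Rescue.

```python
import shlex


def build_gce_rescue_command(zone, name, project=None, force=False,
                             debug=False, skip_snapshot=False):
    """Assemble a gce-rescue argument list from the documented options."""
    cmd = ["gce-rescue", "--zone", zone, "--name", name]
    if project:
        cmd += ["--project", project]
    if force:
        cmd.append("--force")
    if debug:
        cmd.append("--debug")
    if skip_snapshot:
        cmd.append("--skip-snapshot")
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it, so nothing changes by accident.
    cmd = build_gce_rescue_command("europe-central2-a", "test", force=True)
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

The resulting list can be passed to subprocess.run once you are sure you want to reboot the instance.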

Examples

$ gce-rescue --zone europe-central2-a --name test

This option will boot the instance test in RESCUE MODE.
If your instance is running it will be rebooted.
Do you want to continue [y/N]: y
Starting...
┌── Configuring...
│   └── Progress 6/6 [█████████████████████████████████████████████████████████████]
├── Configurations finished.
└── Your instance is READY! You can now connect your instance "test" via:
  1. CLI. (add --tunnel-through-iap if necessary)
    $ gcloud compute ssh test --zone=europe-central2-a --project=my-project --ssh-flag="-o StrictHostKeyChecking=no"
  OR
  2. Google Cloud Console:
    https://ssh.cloud.google.com/v2/ssh/projects/my-project/zones/europe-central2-a/instances/test?authuser=0&hl=en_US&useAdminProxy=true&troubleshoot4005Enabled=true

Once your VM instance is in rescue mode, you can connect via SSH as you normally would.

Notice that -rescue was added to your hostname, to highlight that you are currently in rescue mode.

The original boot disk should be automatically mounted on /mnt/sysroot:

user@test-rescue:~$ lsblk
NAME    MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda       8:0    0   10G  0 disk
├─sda1    8:1    0  9.9G  0 part /
├─sda14   8:14   0    3M  0 part
└─sda15   8:15   0  124M  0 part /boot/efi
sdb       8:16   0   30G  0 disk
├─sdb1    8:17   0    2M  0 part
├─sdb2    8:18   0   20M  0 part
└─sdb3    8:19   0   30G  0 part /mnt/sysroot

user@test-rescue:~$ chroot /mnt/sysroot
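Before running chroot, you may want to confirm which device is mounted on /mnt/sysroot. The sketch below parses lsblk output in the column layout shown above; the function name find_device_for_mountpoint is hypothetical, and the parsing assumes the default seven-column output.

```python
def find_device_for_mountpoint(lsblk_output, mountpoint):
    """Return the device name mounted at mountpoint, or None if absent."""
    for line in lsblk_output.splitlines()[1:]:  # skip the header row
        fields = line.split()
        # Rows without a mount point have only six columns.
        if len(fields) >= 7 and fields[6] == mountpoint:
            # Drop the tree-drawing characters lsblk puts before partitions.
            return fields[0].lstrip("├└─│|`-")
    return None


if __name__ == "__main__":
    sample = (
        "NAME    MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT\n"
        "sdb       8:16   0   30G  0 disk\n"
        "└─sdb3    8:19   0   30G  0 part /mnt/sysroot\n"
    )
    print(find_device_for_mountpoint(sample, "/mnt/sysroot"))
```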

At this point you should take the necessary actions to restore your faulty boot disk.

When you have finished, close your SSH connections and restore the VM instance to its original configuration by running the same command again:

$ gce-rescue --zone europe-central2-a --name test

The instance "test" is currently configured to boot as rescue mode since 2022-11-01 12:05:08.
Would you like to restore the original configuration ? [y/N]: y
Restoring VM...
┌── Configuring...
│   └── Progress 4/4 [█████████████████████████████████████████████████████████████]
├── Configurations finished.
└── The instance test was restored! Use the snapshot below if you need to restore the modification made while the instance was in rescue mode.
 Snapshot name: test-1668009968
 More information: https://cloud.google.com/compute/docs/disks/restore-snapshot

A snapshot was taken before the instance was put in rescue mode, and you can use it to recover the earlier state of the disk. The snapshot name appears in the output; in the example above it is test-1668009968.

You are ready!

When you connect again, you will notice that your instance is back in normal mode:

user@test:~> uptime
 12:24:18  up   0:05,  1 user,  load average: 0.00, 0.00, 0.00

user@test:~> lsblk
NAME   MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
sda      8:0    0  30G  0 disk
├─sda1   8:1    0   2M  0 part
├─sda2   8:2    0  20M  0 part /boot/efi
└─sda3   8:3    0  30G  0 part /

user@test:~>

Authentication

GCE Rescue uses Application Default Credentials (ADC). Make sure that the gcloud CLI is installed and that your ADC are up to date.

You can find more information at: https://cloud.google.com/docs/authentication/provide-credentials-adc
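As a quick local check, the sketch below looks for a credential file in the two file-based locations that ADC consults: the GOOGLE_APPLICATION_CREDENTIALS environment variable and the file written by gcloud auth application-default login. It assumes the Linux/macOS location of that file and does not detect credentials supplied by an attached service account. The function name find_adc_file is hypothetical.

```python
import os


def find_adc_file(environ=None, home=None):
    """Return the path of a file-based ADC credential, or None if none is found."""
    environ = os.environ if environ is None else environ
    home = os.path.expanduser("~") if home is None else home
    # 1. An explicitly configured credential file takes precedence.
    explicit = environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if explicit and os.path.isfile(explicit):
        return explicit
    # 2. File written by `gcloud auth application-default login` (Linux/macOS).
    well_known = os.path.join(home, ".config", "gcloud",
                              "application_default_credentials.json")
    if os.path.isfile(well_known):
        return well_known
    return None


if __name__ == "__main__":
    print(find_adc_file() or "No file-based ADC found")
```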


Permissions

The following are the minimum IAM permissions required.

Description              Permissions
Start and stop instance  compute.instances.stop
                         compute.instances.start
Create and remove disk   compute.instances.attachDisk on the instance
                         compute.instances.detachDisk on the instance
                         compute.images.useReadOnly on the image if creating a new root persistent disk
                         compute.disks.use on the disk if attaching an existing disk in read/write mode
                         compute.disks.setLabels on the disk if setting labels
Create snapshot          compute.snapshots.create on the project
                         compute.disks.createSnapshot on the disk
Configure metadata       compute.instances.setMetadata if setting metadata
                         compute.instances.setLabels on the instance if setting labels
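To find out which of the permissions listed above an account lacks, you can compare them against the result of a permission test. The sketch below assumes a response shaped like the one returned by the Resource Manager projects.testIamPermissions method (a dictionary whose optional "permissions" list holds the permissions the caller has). The function name missing_permissions is hypothetical, and the sketch makes no API call itself.

```python
REQUIRED_PERMISSIONS = [
    "compute.instances.stop",
    "compute.instances.start",
    "compute.instances.attachDisk",
    "compute.instances.detachDisk",
    "compute.images.useReadOnly",
    "compute.disks.use",
    "compute.disks.setLabels",
    "compute.snapshots.create",
    "compute.disks.createSnapshot",
    "compute.instances.setMetadata",
    "compute.instances.setLabels",
]


def missing_permissions(test_iam_response, required=REQUIRED_PERMISSIONS):
    """Return the required permissions that are absent from the response."""
    # The "permissions" key is omitted when the caller holds none of them.
    granted = set(test_iam_response.get("permissions", []))
    return [p for p in required if p not in granted]


if __name__ == "__main__":
    example = {"permissions": ["compute.instances.stop", "compute.instances.start"]}
    for permission in missing_permissions(example):
        print("missing:", permission)
```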

Contact

GCE Rescue Team

[email protected]

gce-rescue's Issues

importlib.metadata.PackageNotFoundError: No package metadata was found for gce-rescue

Hi

Can you please advise on the error message below?

importlib.metadata.PackageNotFoundError: No package metadata was found for gce-rescue
cq032_t1@cloudshell:~ (dbg-clearstream-simu-fe9d0e70)$ gce-rescue -z europe-west3-a -
Traceback (most recent call last):
File "/home/cq032_t1/.local/bin/gce-rescue", line 33, in <module>
sys.exit(load_entry_point('gce-rescue==0.4b0', 'console_scripts', 'gce-rescue')()
File "/home/cq032_t1/.local/bin/gce-rescue", line 22, in importlib_load_entry_point
for entry_point in distribution(dist_name).entry_points
File "/usr/lib/python3.10/importlib/metadata/__init__.py", line 969, in distributio
return Distribution.from_name(distribution_name)
File "/usr/lib/python3.10/importlib/metadata/__init__.py", line 548, in from_name
raise PackageNotFoundError(name)
importlib.metadata.PackageNotFoundError: No package metadata was found for gce-rescue

Thanks for your help
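One way to narrow down an error like this is to ask the same interpreter whether it can see the installed distribution's metadata. This is a diagnostic sketch only, not a fix from the maintainers; it needs Python 3.8 or later for importlib.metadata, and the function name installed_version is hypothetical.

```python
from importlib import metadata  # part of the standard library from Python 3.8


def installed_version(dist_name):
    """Return the installed version of dist_name, or None if no metadata is found."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("gce-rescue")
    if version is None:
        print("This interpreter cannot find metadata for gce-rescue")
    else:
        print("gce-rescue", version)
```

If this prints the "cannot find" message when run with the interpreter named in the script's first line, the package was probably installed for a different interpreter or its metadata directory is missing.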

Tool crashes if the VM name is wrong

Gracefully handle erroneous VM names and allow the user to correct the name or, even better, suggest the correct name.

gce-rescue --zone europe-central2-a --name test
Traceback (most recent call last):
  File "/home/username/.local/bin/gce-rescue", line 8, in <module>
    sys.exit(main())
  File "/home/username/.local/lib/python3.9/site-packages/gce_rescue/bin/rescue.py", line 44, in main
    vm = Instance(test_mode=False, **parse_kwargs)
  File "<string>", line 12, in __init__
  File "/home/username/.local/lib/python3.9/site-packages/gce_rescue/gce.py", line 116, in __post_init__
    self.compute = check.compute
  File "/home/username/.local/lib/python3.9/site-packages/gce_rescue/tasks/pre_validations.py", line 52, in compute
    return self._authentication()
  File "/home/username/.local/lib/python3.9/site-packages/gce_rescue/tasks/pre_validations.py", line 39, in _authentication
    return authenticate_check(
  File "/home/username/.local/lib/python3.9/site-packages/gce_rescue/tasks/validations/authentication.py", line 66, in authenticate_check
    request.execute()
  File "/usr/local/lib/python3.9/dist-packages/googleapiclient/_helpers.py", line 130, in positional_wrapper
    return wrapped(*args, **kwargs)
  File "/usr/local/lib/python3.9/dist-packages/googleapiclient/http.py", line 938, in execute
    raise HttpError(resp, content, uri=self.uri)
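The suggestion requested in this issue could be sketched with the standard library's fuzzy matching. This is an illustration, not the project's implementation; suggest_instance_name is a hypothetical helper, and known_names stands for a list of instance names that the caller would have to obtain separately, for example by listing the instances in the zone.

```python
import difflib


def suggest_instance_name(requested, known_names, cutoff=0.6):
    """Return the closest known instance name, or None if nothing is similar."""
    matches = difflib.get_close_matches(requested, known_names, n=1, cutoff=cutoff)
    return matches[0] if matches else None


if __name__ == "__main__":
    suggestion = suggest_instance_name("test-vn", ["web-1", "test-vm", "db"])
    if suggestion:
        print('Instance not found. Did you mean "%s"?' % suggestion)
```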

google.auth.exceptions.RefreshError: ('invalid_scope: Invalid OAuth scope or ID token audience provided.', {'error': 'invalid_scope', 'error_description': 'Invalid OAuth scope or ID token audience provided.'})

Hi

I cannot use ADC authentication via a service account with gce-rescue.

Can you please advise?

[gce-rescue]#
[root@gce-rescue]# export GOOGLE_APPLICATION_CREDENTIALS="/root/gce-rescue/auth-simu.json"
[root@gce-rescue]# /usr/bin/gce-rescue -p dbg-cs-sz-32064e0b -z europe-west3-a -n gcsb84rhel098
Traceback (most recent call last):
File "/usr/bin/gce-rescue", line 33, in <module>
sys.exit(load_entry_point('gce-rescue==0.4b0', 'console_scripts', 'gce-rescue')())
File "/usr/lib/python3.9/site-packages/gce_rescue-0.4b0-py3.9.egg/gce_rescue/bin/rescue.py", line 44, in main
vm = Instance(test_mode=False, **parse_kwargs)
File "<string>", line 12, in __init__
File "/usr/lib/python3.9/site-packages/gce_rescue-0.4b0-py3.9.egg/gce_rescue/gce.py", line 112, in __post_init__
check = Validations(
File "<string>", line 7, in __init__
File "/usr/lib/python3.9/site-packages/gce_rescue-0.4b0-py3.9.egg/gce_rescue/tasks/pre_validations.py", line 48, in __post_init__
authorize_check(project = self.project)
File "/usr/lib/python3.9/site-packages/gce_rescue-0.4b0-py3.9.egg/gce_rescue/tasks/validations/authorization.py", line 43, in authorize_check
result = service.projects().testIamPermissions(
File "/usr/lib/python3.9/site-packages/google_api_python_client-2.125.0-py3.9.egg/googleapiclient/_helpers.py", line 130, in positional_wrapper
return wrapped(*args, **kwargs)
File "/usr/lib/python3.9/site-packages/google_api_python_client-2.125.0-py3.9.egg/googleapiclient/http.py", line 923, in execute
resp, content = _retry_request(
File "/usr/lib/python3.9/site-packages/google_api_python_client-2.125.0-py3.9.egg/googleapiclient/http.py", line 191, in _retry_request
resp, content = http.request(uri, method, *args, **kwargs)
File "/usr/lib/python3.9/site-packages/google_auth_httplib2-0.2.0-py3.9.egg/google_auth_httplib2.py", line 209, in request
self.credentials.before_request(self._request, method, uri, request_headers)
File "/usr/local/lib/python3.9/site-packages/google/auth/credentials.py", line 228, in before_request
self._blocking_refresh(request)
File "/usr/local/lib/python3.9/site-packages/google/auth/credentials.py", line 191, in _blocking_refresh
self.refresh(request)
File "/usr/local/lib/python3.9/site-packages/google/oauth2/service_account.py", line 441, in refresh
access_token, expiry, _ = _client.jwt_grant(
File "/usr/local/lib/python3.9/site-packages/google/oauth2/_client.py", line 308, in jwt_grant
response_data = _token_endpoint_request(
File "/usr/local/lib/python3.9/site-packages/google/oauth2/_client.py", line 279, in _token_endpoint_request
_handle_error_response(response_data, retryable_error)
File "/usr/local/lib/python3.9/site-packages/google/oauth2/_client.py", line 72, in _handle_error_response
raise exceptions.RefreshError(
google.auth.exceptions.RefreshError: ('invalid_scope: Invalid OAuth scope or ID token audience provided.', {'error': 'invalid_scope', 'error_description': 'Invalid OAuth scope or ID token audience provided.'})
[root@ gce-rescue]#

Thanks for your support

`"Would you like to restore the original configuration"` Prompt Fails with Lowercase `y`

When I answer the "Would you like to restore the original configuration" prompt with a lowercase y (not an uppercase Y), the answer is not accepted and the operation is cancelled instead.

Example:

% gce-rescue --zone us-central1-a --name example                                                                  
The instance "example" is currently configured to boot as rescue mode since 2023-02-05 00:00:00.
Would you like to restore the original configuration ? [y/N]: y
Cancelled.
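A case-insensitive check would accept both y and Y while keeping No as the default that the [y/N] prompt implies. The following is a sketch of that behavior, not the code used by gce-rescue; the function name is_affirmative is hypothetical.

```python
def is_affirmative(answer):
    """Treat y or yes in any letter case as confirmation; anything else is a no."""
    return answer.strip().lower() in ("y", "yes")


if __name__ == "__main__":
    reply = input("Would you like to restore the original configuration ? [y/N]: ")
    print("Restoring VM..." if is_affirmative(reply) else "Cancelled.")
```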
