hiliev / py-zfs-rescue

A very minimal user-space implementation of ZFS in Python and a tool for salvaging data from broken pools

Home Page: https://hiliev.eu/post/recovering-datasets-from-broken-zfs-raidz-pools/

License: BSD 3-Clause "New" or "Revised" License

Python 97.98% Makefile 0.57% Shell 1.45%

zfs zfs-pool rescue python dataset salvage data-recovery filesystem filesystem-library zfs-raid

py-zfs-rescue's Introduction

py-zfs-rescue

A very minimal implementation in Python 3 of ZFS in user-space for pool recovery purposes.

Background

This project evolved from a set of Python scripts for reading and displaying on-disk structures that the ZFS debugger zdb would not show. It is the culmination of the effort to salvage the data from a severely broken raidz1 array. More background information is available in this blog post.

What it is?

zfs_rescue is a Python 3 script that is able to read the structure of a ZFS pool provided an initial device that belongs to the pool and to extract various types of information from the pool:

list the accessible datasets with their sizes
recursively list the files in all or some of the datasets found
archive the content of all or some regular files in a given dataset

The code was developed specifically against a broken ZFS raidz1 pool created by an old Solaris 10 x86 system and thus handles:

ZFS version 10 on little-endian systems
pools that consist of a single mirror or raidz1 vdev (the mirror code should be able to handle single devices too)
for raidz1 the parity information is used to recreate the data from the failed device, if any
directories with small to moderately large number of elements
access to remote disks via a simple TCP/IP protocol
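
As an illustration of the raidz1 item above: single-parity reconstruction comes down to XOR, since the parity column is the XOR of the data columns. The following is a minimal sketch, not the code used in zfs_rescue. The helper names `xor_blocks` and `reconstruct_missing` are hypothetical, and the sketch assumes all columns have the same length, whereas a real raidz1 stripe can have columns that differ by one sector.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equally sized byte strings together, byte by byte."""
    length = len(blocks[0])
    if any(len(b) != length for b in blocks):
        raise ValueError("all columns must have the same length")
    return bytes(reduce(lambda x, y: x ^ y, col) for col in zip(*blocks))


def reconstruct_missing(columns, parity):
    """Rebuild the single missing data column (marked as None).

    Assumption: `columns` holds the data columns of one stripe and
    `parity` is the XOR of all of them, as in single-parity raidz1.
    """
    missing = [i for i, c in enumerate(columns) if c is None]
    if len(missing) != 1:
        raise ValueError("exactly one column must be missing")
    survivors = [c for c in columns if c is not None]
    rebuilt = xor_blocks(survivors + [parity])
    idx = missing[0]
    return columns[:idx] + [rebuilt] + columns[idx + 1:]


if __name__ == "__main__":
    data = [b"abcd", b"efgh"]
    parity = xor_blocks(data)
    print(reconstruct_missing([None, data[1]], parity))
```

With more than one missing column the XOR parity alone is not sufficient, which is why the sketch rejects that case.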

Thanks to the work of @eiselekd, additional support was added for:

LZ4 compression
Fletcher checksum validation
modern ZFS attributes
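
For readers unfamiliar with the Fletcher checksum mentioned above, here is a minimal sketch of the fletcher4 variant as it is commonly described: four 64-bit running sums over the data read as 32-bit words. This is an illustration, not the validation code from the project. The function name `fletcher4` and its `little_endian` parameter are assumptions made for this example.

```python
import struct

MASK64 = (1 << 64) - 1


def fletcher4(data, little_endian=True):
    """Return the four 64-bit fletcher4 accumulators (a, b, c, d).

    Assumption: `data` is a whole number of 32-bit words, which holds
    for ZFS blocks since their sizes are multiples of 512 bytes.
    """
    if len(data) % 4:
        raise ValueError("data length must be a multiple of 4 bytes")
    fmt = ("<" if little_endian else ">") + "%dI" % (len(data) // 4)
    a = b = c = d = 0
    for word in struct.unpack(fmt, data):
        a = (a + word) & MASK64
        b = (b + a) & MASK64
        c = (c + b) & MASK64
        d = (d + c) & MASK64
    return a, b, c, d


if __name__ == "__main__":
    print(fletcher4(struct.pack("<2I", 1, 2)))
```

The four resulting words would be compared against the four checksum words stored in the block pointer that references the block.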

What it is not?

This is not a generic rescue tool or a filesystem debugger per se. It provides no command-line interface and all configuration is done by altering the source code. The output is quite technical and requires some understanding of the ZFS internals and on-disk structure.

The ZFS implementation is minimal and incomplete. It is basically in a "works for me" state. Notably the following features are missing:

support for really large directories (it could be implemented relatively easily)
~~validation of the block checksums -- currently the tool relies on all metadata being compressed and the LZJB decompressor failing with garbled input data~~
~~LZ4 and GZIP decompression~~
support for pools created on big-endian systems

There is minimal to no error recovery and encountering an unsupported object will abort the program. This is intentional as it helps easily spot unimplemented features and deviations from the specification.

How to use it?

Documentation is currently a WiP.

py-zfs-rescue's Issues

Child datasets in FreeBSD cannot be retrieved

Child datasets created with zfs create datapool0/datadir seem to be handled differently on FreeBSD. The child dataset is found in the ZAP, but in the child dataset code in zfs_rescue.py, child = mos[v] returns None. As a result, child datasets currently work only on Linux.

Object number question

Hi, thanks so much for sharing this code.

I'm trying to extend it to perform block-level recovery (for when the metadata is too damaged to properly recurse the filesystem) - using a whole disk scan. But one issue I'm running into is that I can't see any way to extract a file/directory's object number from the DMU_OT_PLAIN_FILE_CONTENTS dnode etc. It doesn't seem to be in the SA attributes (you have the parent object number obviously, but that's only partially helpful).

The only way to get the object number seems to be to know the exact relative position of that dnode within the containing array. And if your metadata is corrupt or you only have a partial array, then I can't think of a reliable way to get the object number. Does anybody have some ideas for this?

I find it really odd that they didn't embed the object number (and file name maybe) in the actual file dnode.
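
To make the positional relationship described above concrete: the object number is the index of the dnode's slot within the object set's dnode array. The sketch below assumes classic 512-byte dnodes and a 16 KiB dnode block size, both of which are assumptions for this example and should be checked against the actual pool. The helper `object_number` and its parameters are hypothetical names.

```python
DNODE_SHIFT = 9                # assumption: classic 512-byte dnodes
DNODE_SIZE = 1 << DNODE_SHIFT


def object_number(blkid, slot, block_size=16384):
    """Object number of the dnode in `slot` of level-0 dnode block `blkid`.

    Assumption: every dnode block is `block_size` bytes long and the
    dnode array starts at block id 0.
    """
    dnodes_per_block = block_size // DNODE_SIZE
    if not 0 <= slot < dnodes_per_block:
        raise ValueError("slot outside of the dnode block")
    return blkid * dnodes_per_block + slot


if __name__ == "__main__":
    # Fourth dnode (slot 3) in the third block (blkid 2) of the array.
    print(object_number(2, 3))
```

This also shows why a whole-disk scan has trouble here: a dnode block found in isolation yields the slot, but the block id within the array comes from the indirect blocks that point to it.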

Comment the code

Because it was written as a quick and dirty tool in a time of need, practically none of the code is commented. For the sake of maintainability and future development, it would be nice to have it document itself with proper documentation strings.

different ashift parameter

I am trying to recover a raidz array with 3 disks on ZFS on Linux. It uses an
ashift parameter of 12 (4096-byte blocks), which I think is not handled
by py-zfs-rescue.
Can you point out which parts I have to change to make py-zfs-rescue
work with this ashift parameter?
As a test I'm using a losetup-based pool created with
zpool create datapool -f -o ashift=12 -O -O compression=lz4 -O normalization=formD raidz /dev/loop0 /dev/loop1 /dev/loop2
I think I can handle adding LZ4 compression support; however, the ashift parameter
seems to be hardcoded to 9 in several places, and I'm not sure whether this parameter
depends on the type of data being loaded.
I created a dataset inside the test pool and tried to open it with py-zfs-rescue, but this
is not possible and it fails when loading:

 [+] Loading object set dnode from <[L0 DMU objset] 800L/800P DVA[0]=<0:e0000:2200> DVA[1]=<0:40de000:2200> DVA[2]=<0:809a000:2200> birth=34 fletcher4 off LE contiguous fill=50>
> Failed - unsupported operand type(s) for //: 'NoneType' and 'int'
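
For context on the ashift question: ashift is the base-2 logarithm of the sector size that a vdev uses for allocation, so 9 means 512 bytes and 12 means 4096 bytes. A minimal sketch of passing it around as a parameter instead of hardcoding 9 could look like the following. The helper names are hypothetical and do not correspond to functions in py-zfs-rescue.

```python
def sector_size(ashift):
    """Sector size in bytes for a given ashift value."""
    return 1 << ashift


def round_up_to_sector(nbytes, ashift):
    """Round a byte count up to a whole number of sectors."""
    size = sector_size(ashift)
    return (nbytes + size - 1) & ~(size - 1)


if __name__ == "__main__":
    for ashift in (9, 12):
        # A 0x800-byte block fills four 512-byte sectors,
        # but only part of a single 4096-byte sector.
        print(ashift, sector_size(ashift), round_up_to_sector(0x800, ashift))
```

Any raidz column layout computed in units of sectors would change accordingly when the value differs from 9.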

Support for ZPL attribute tables and embedded data

In order to be able to use py-zfs-rescue on pools created by modern OSes, the following two enhancements are needed:

Support for block pointers with embedded data
Support for ZPL attribute tables as an alternative to znode_phys_t (bonus data type 0x2c)
