
Comments (27)

eiselekd commented on July 28, 2024

I'll try to take a look at this and see whether I can make progress in that direction.

eiselekd commented on July 28, 2024

@hiliev I've implemented the embedded data in the dnode and detected the System Attribute bonus buffer; however, I'm still trying to understand the format of its content. Is it a ZAP-encoded buffer?

Addendum: I think I found it in zdb: dump_znode(objset_t *os, uint64_t object, void *data, size_t size).
I have to first scan the SA master node, then scan the "SA attr layouts" and "SA attr registration" dnodes, and then use that layout to scan the SA? Is it really that complicated?

hiliev commented on July 28, 2024

This seems to be a bit more complicated than expected. ZFS has an attribute registration mechanism called SA (system attributes). There is a set of layout tables that define the attributes and their offsets, stored ZAP-like in several system objects. The order of the attributes may differ from pool to pool, so those system objects have to be parsed and the tables analysed. The objects can be seen in your output from the other issue:

0:[SA master node] ...
1:[ZFS delete queue] ...
2:[ZFS directory] ...
3:[SA attr registration] ...
4:[SA attr layouts] ...

The SA master node (judging from the hex dump, although I haven't really decompressed the embedded data) appears to be a MicroZAP that holds the object IDs of the SA attribute registration and the SA attribute layouts objects.

The attributes in the bonus buffer itself are prefixed with a sa_hdr_phys. The index of the layout used is contained in the sa_layout_info field. The sa_impl.h header is very helpful.
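To illustrate, here is a minimal Python sketch of decoding that header, assuming the bit layout from sa_impl.h (a 32-bit magic, then sa_layout_info with the layout number in the low 10 bits and the header size in 8-byte units in the next 6 bits); the function itself is hypothetical, not py-zfs-rescue code:

    import struct

    SA_MAGIC = 0x2F505A  # "ZFS SA" magic, per sa_impl.h

    def parse_sa_hdr(bonus):
        # sa_hdr_phys starts with a 32-bit magic followed by the
        # 16-bit sa_layout_info bit field.
        sa_magic, layout_info = struct.unpack_from("<IH", bonus, 0)
        if sa_magic != SA_MAGIC:
            raise ValueError("not an SA bonus buffer")
        layout_num = layout_info & 0x3FF             # index into "SA attr layouts"
        hdr_size = ((layout_info >> 10) & 0x3F) * 8  # attribute data starts here
        return layout_num, hdr_size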

eiselekd commented on July 28, 2024

@hiliev I think you are right; the SA master node (object ID 32, actually) contains:

    Object  lvl   iblk   dblk  dsize  dnsize  lsize   %full  type
        32    1   128K    512      0     512    512  100.00  SA master node (K=inherit) (Z=inherit)
	dnode flags: USED_BYTES USERUSED_ACCOUNTED USEROBJUSED_ACCOUNTED 
	dnode maxblkid: 0
	microzap: 512 bytes, 2 entries

		REGISTRY = 35 
		LAYOUTS = 36
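Those two entries sit in a MicroZAP in the SA master node's single data block. As a rough sketch, walking such a block in Python could look like this, assuming the on-disk layout (a 64-byte mzap_phys_t header followed by 64-byte mzap_ent_phys_t entries: a uint64 value, a uint32 cb, two pad bytes, then up to 50 bytes of name); walk_microzap is a hypothetical helper, not existing py-zfs-rescue code:

    import struct

    def walk_microzap(block):
        entries = {}
        # Skip the 64-byte mzap_phys_t header; each following 64-byte
        # chunk is one mzap_ent_phys_t (empty names mean unused slots).
        for off in range(64, len(block), 64):
            value, _cb = struct.unpack_from("<QI", block, off)
            name = block[off + 14:off + 64].split(b"\0", 1)[0]
            if name:
                entries[name.decode()] = value
        return entries

    # Expected here: {"REGISTRY": 35, "LAYOUTS": 36}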

May I ask a question: in my test pool datapool I have created a dataset with zfs create datapool/datadir, where my actual target file, called test.bin, is located. Now I look at zdb -ddddddd datapool and try to see how this pool is referenced starting from the MOS, but there is so much information that I cannot make out a structure.

One thing I noticed is that py-zfs-rescue collects the top-level MOS dnodes with type 16 as the target datasets to archive. The root dataset "datapool" seems to be in this set (it says there is data in it, yet there are no objects inside it), but the child dataset "datapool/datadir" is not. How is the child-dataset traversal done when starting from the MOS?
More confusing is that the search for type 16 returns 3 datasets in the MOS, of which 2 state "0 uncompressed bytes". The Dataset-labelled information in the zdb dump, on the other hand, lists the hierarchical datasets present... Can you recommend some reading to understand how the whole structure is traversed?

hiliev commented on July 28, 2024

I never really looked into how parent-child relationships are implemented. In my case, the MOS was broken and the root dataset was lost. I was happy to just be able to find all accessible datasets and rescue their content.

eiselekd commented on July 28, 2024

@hiliev I have a question:

self._asize = (1 + (qword0 & 0xffffff)) << 9
does a +1 in the asize calculation. Is the +1 a safeguard?

eiselekd commented on July 28, 2024

@hiliev The child dataset relationship seems to be retrieved as follows (sketched in code below):

  • DSLdataset.ds_dir_obj points to the DSLdirectory
  • DSLdirectory.child_dir_zapobj points to a ZAP holding child directory name-ID pairs, which point to the child DSLdirectories
  • DSLdirectory.head_dataset_obj of the child's DSLdirectory points to its dataset
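A minimal recursive sketch of that chain; read_dnode() and read_zap() are hypothetical stand-ins for py-zfs-rescue's MOS object and ZAP readers:

    def walk_datasets(mos, ds_obj, path):
        # Yield (path, dataset object ID) for a dataset and all of its children.
        # read_dnode()/read_zap() are hypothetical helpers.
        ds = read_dnode(mos, ds_obj)           # the DSL dataset
        yield path, ds_obj
        dd = read_dnode(mos, ds.ds_dir_obj)    # its DSL directory
        # child_dir_zapobj maps child names to child DSL directory IDs
        for name, child_dir in read_zap(mos, dd.child_dir_zapobj).items():
            child_dd = read_dnode(mos, child_dir)
            # head_dataset_obj of the child directory is the child's dataset
            yield from walk_datasets(mos, child_dd.head_dataset_obj,
                                     path + "/" + name)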

eiselekd commented on July 28, 2024

@hiliev: pushed a PR for #6. Maybe you can close this issue now...

hiliev commented on July 28, 2024

Let me test it on the pool of my server first. As for the _asize value, ZFS stores certain non-zero values in a biased format, i.e. as an offset from the minimum value, which in this particular case is 1.
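In other words, the +1 is part of the decoding, not a safeguard; a stored field of 0 decodes to the minimum of one 512-byte sector. A small illustration:

    def decode_size(field):
        # Sizes in the block pointer are stored biased: the on-disk value
        # is (size in 512-byte sectors) - 1, so 0 means one sector.
        return (field + 1) << 9

    assert decode_size(0) == 512   # minimum: one 512-byte sector
    assert decode_size(7) == 4096  # eight sectors = 4 KiB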

eiselekd commented on July 28, 2024

@hiliev: Just want to note that I have succeeded in retrieving my files now. I want to thank you for the py-zfs-rescue repo and the hints you gave. The unsorted patches are at https://github.com/eiselekd/dumpbin-py-zfs-rescue; maybe someone will find them useful in the future.

eiselekd commented on July 28, 2024

@hiliev: I also pushed https://github.com/eiselekd/dumpbin-py-zfs-rescue/blob/master/zfs/sa.py#L59 and https://github.com/eiselekd/dumpbin-py-zfs-rescue/blob/master/zfs/dnode.py#L194, which implement a more complete handling of system attributes and bonus type 0x2c. With it, symlinks are also handled. Are you interested in getting a PR?

hiliev commented on July 28, 2024

Sorry, I'm currently moving to a different country, my FreeNAS system is offline in a storage locker, and I'm very slow at testing and accepting PRs. I'll be able to work on it again in about a month.

eiselekd commented on July 28, 2024

@hiliev: OK, I understand. If you have time, let me know and I will supply PRs. There is one error that you might be interested in: https://github.com/eiselekd/dumpbin-py-zfs-rescue/blob/d21f4c28acee0d26ab3ba227fc7d8b03881dffd8/zfs/blocktree.py#L85
In the original repo the level cache is a flat array that is shared between levels; I changed it to be a tree instead.
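A hypothetical sketch of the difference, keying the cache by (level, block index) so a cached indirect block at one level can never alias one at another; this illustrates the idea rather than the actual blocktree.py code:

    class BlockCache:
        # Cache indirect blocks per (level, blkid) rather than in one
        # flat list shared by all levels of the block tree.
        def __init__(self):
            self._blocks = {}

        def get(self, level, blkid, fetch):
            key = (level, blkid)
            if key not in self._blocks:
                self._blocks[key] = fetch(level, blkid)  # fetch() is a hypothetical loader
            return self._blocks[key]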

eiselekd commented on July 28, 2024

Hi again. If you are interested and have time now, I can supply patches. (As py-zfs-rescue enabled me to restore my data, I thought I should contribute back.) Tell me which area you want to address first.

hiliev commented on July 28, 2024

Hi @eiselekd, I'm glad my little project helped you in recovering your data. I had great plans for it and still have a backlog of to-dos geared towards making it more user-friendly, in particular turning it into a visual ZFS debugger and explorer. Unfortunately, working at a startup company in a completely different field leaves me with zero spare time for this project. If you are willing to take over the CLI branch and develop it further, please feel free to do so. The areas that need attention are perhaps adding a proper command-line interface, pool scrub functionality, and support for raidz with higher parity (e.g., raidz2). If you wish, I can also make you a project collaborator, so you don't have to fork a separate version.

eiselekd commented on July 28, 2024

@hiliev You can add me as a collaborator and maybe give me access to a dedicated branch that I can hack around in. I could transfer the improvements from https://github.com/eiselekd/dumpbin-py-zfs-rescue back to your repo:

  • lz4 decompression (already pulled)
  • fletcher4 cksum
  • first level child datasets
  • blkptr with embedded data (already pulled)
  • improved block server protocol
  • bigger than 2TB disk support
  • support SystemAttributes, bonus type 0x2c (partially pulled)
  • variable asize (already pulled)
  • fuse (llfuse) interface for recovery

I could also contribute:

  • linux losetup or similar based testing
  • a command-line interface, as you mentioned, to make the configuration interactive

hiliev commented on July 28, 2024

I sent you an invitation to become a collaborator. It gives you push access and you should be able to create branches on your own. When I find the time, I'll hack on the GUI stuff in a separate branch too.

eiselekd commented on July 28, 2024

Accepted, thanks.

eiselekd commented on July 28, 2024

@hiliev: Added pull request #12, which adds (from the list above):

  • fletcher4 cksum (please pull)
  • first level child datasets (was already pulled)
  • improved block server protocol (please pull)
  • bigger than 2TB disk support (please pull)
  • support SystemAttributes, bonus type 0x2c (please pull)
  • linux losetup or similar based testing (please pull)

hiliev commented on July 28, 2024

Do I have to accept the pull request explicitly, or do your commit rights allow you to do it yourself?

eiselekd commented on July 28, 2024

I didn't try to push it myself. Also, even though I tested the code on Linux (subfolder test/Makefile), I didn't test it with disks from FreeNAS. I have been setting up a home NAS recently (with FreeNAS in a KVM and a SATA controller card passed through), but I find it a bit hard to work with because /usr/ports is disabled, and I cannot work FreeBSD-style with it except within jails, which I'm not familiar with. I didn't find any description of how to re-enable /usr/ports in FreeNAS. I could run it on FreeBSD, but then I'm not sure what the delta to FreeNAS is.

hiliev commented on July 28, 2024

FreeNAS is based on FreeBSD-STABLE kernels, and the ZFS code should be the same as in vanilla FreeBSD. My FreeNAS box is back online, so I'll be able to test the code.

eiselekd commented on July 28, 2024

I can also try it out on a FreeBSD box over the weekend.

eiselekd commented on July 28, 2024

I tested on FreeBSD 11.2 using mdconfig and zpool create datapool0 raidz /dev/${md0} /dev/${md1} /dev/${md2} and was able to read files back. zfs create datapool0/datadir (child datasets), on the other hand, seems to be handled differently on FreeBSD: the child dataset is found in the ZAP, but in the child-dataset code in zfs_rescue.py, child = mos[v] returns None. So child datasets currently only work for pools created on Linux.

eiselekd commented on July 28, 2024

Conclusion from my side: OK to push, but create an issue to implement child datasets on BSD.

hiliev commented on July 28, 2024

That's strange. The ZFS implementation in FreeBSD should be the one closest to the reference implementation in OpenSolaris, as it borrows most of the code directly. Perhaps Linux is the one that handles child datasets differently. That means there are ZFS flavours, and the code should somehow be able to detect the flavour or be told about it, e.g., via a command-line argument.

In any case, I'm fine with merging and creating a separate issue for ZFS on FreeBSD.

eiselekd commented on July 28, 2024

Attr tables and embedded data are handled now.
