Comments (27)
I'll try to take a look at this and see whether I can make progress in that direction.
from py-zfs-rescue.
@hiliev I've implemented the embedded data in the dnode and detected the System Attribute bonus buffer, but I'm trying to understand the format of its content. Is it a ZAP-encoded buffer?
Addendum: I think I found it in zdb: dump_znode(objset_t *os, uint64_t object, void *data, size_t size)
Do I have to first scan the SA master node, then scan the "SA attr layouts" and "SA attr registration" dnodes, and then use that layout to scan the SA? Is it really that complicated?
This seems to be a bit more complicated than expected. ZFS has an attribute registration mechanism, SA (System Attributes). There is a bunch of layout tables that define the attributes and their offsets. Those are stored ZAP-like in several system objects. The order of the attributes may differ from pool to pool, therefore those system objects have to be parsed and the tables analysed. The objects are seen in your output from the other issue:
0:[SA master node] ...
1:[ZFS delete queue] ...
2:[ZFS directory] ...
3:[SA attr registration] ...
4:[SA attr layouts] ...
The SA master node (judging from the hex dump, although I haven't really decompressed the embedded data) appears to be a MicroZAP that holds the object IDs of the SA attribute registration and the SA attribute layouts objects.
The attributes in the bonus buffer itself are prefixed with a sa_hdr_phys. The index of the layout used is contained in the sa_layout_info field. The sa_impl.h header is very helpful.
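A minimal sketch of decoding that header in Python, based on the bit layout documented in sa_impl.h (the helper name is mine; little-endian on-disk data assumed):

```python
import struct

SA_MAGIC = 0x2F505A  # from sa_impl.h

def parse_sa_hdr(bonus):
    """Decode the sa_hdr_phys at the start of an SA bonus buffer.

    Per sa_impl.h: uint32 sa_magic, then uint16 sa_layout_info whose
    low 10 bits give the header size in 8-byte units and whose upper
    6 bits give the layout number.
    """
    magic, layout_info = struct.unpack_from("<IH", bonus, 0)
    if magic != SA_MAGIC:
        raise ValueError("not an SA bonus buffer")
    hdr_size = (layout_info & 0x3FF) << 3    # SA_HDR_SIZE
    layout_num = (layout_info >> 10) & 0x3F  # SA_HDR_LAYOUT_NUM
    return hdr_size, layout_num
```

The attribute values themselves then start hdr_size bytes into the bonus buffer, in the order given by the registered layout.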
@hiliev I think you are right, the SA master node (index 32 actually) contains:
Object lvl iblk dblk dsize dnsize lsize %full type
32 1 128K 512 0 512 512 100.00 SA master node (K=inherit) (Z=inherit)
dnode flags: USED_BYTES USERUSED_ACCOUNTED USEROBJUSED_ACCOUNTED
dnode maxblkid: 0
microzap: 512 bytes, 2 entries
REGISTRY = 35
LAYOUTS = 36
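Those two name/value pairs live in a single 512-byte microZAP block, which is simple to parse by hand. A sketch (my own helper; offsets follow the mzap_phys_t / mzap_ent_phys_t on-disk layout):

```python
import struct

MZAP_ENT_LEN = 64   # size of both the header and each entry
MZAP_NAME_LEN = 50  # MZAP_ENT_LEN - 8 (value) - 4 (cd) - 2 (pad)

def read_microzap(block):
    """Return {name: value} from a microZAP block.

    The block starts with a 64-byte mzap_phys_t header (mz_block_type,
    mz_salt, mz_normflags, padding), followed by 64-byte entries of
    uint64 mze_value, uint32 mze_cd, uint16 pad, 50-byte NUL-padded name.
    """
    entries = {}
    for off in range(MZAP_ENT_LEN, len(block), MZAP_ENT_LEN):
        value, _cd = struct.unpack_from("<QI", block, off)
        name = block[off + 14:off + 14 + MZAP_NAME_LEN].split(b"\0")[0]
        if name:  # all-zero slots are unused
            entries[name.decode()] = value
    return entries
```

Applied to the SA master node above, this should yield {'REGISTRY': 35, 'LAYOUTS': 36}.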
May I ask a question: in my test pool datapool I have created a dataset with zfs create datapool/datadir, where my actual target file, called test.bin, is located.
Now I look at zdb -ddddddd datapool and try to see how this pool is referenced starting from the MOS, but there is so much information that I cannot make out a structure.
One thing I noticed is that py-zfs-rescue collects the top-level MOS dnodes with type 16 as the target datasets to archive. The root dataset "datapool" seems to be in this set (it says there is data in it, however there are no objects inside it), but not the child dataset "datapool/datadir". How is the child-dataset traversal done when starting from the MOS?
More confusing is that the search for type 16 returns 3 datasets in the MOS, of which 2 state "0 uncompressed bytes". The "Dataset"-labeled information in the zdb dump, on the other hand, lists the hierarchical datasets present... Can you recommend some reading to understand how the whole structure is traversed?
I never really looked into how parent-child relationships are implemented. In my case, the MOS was broken and the root dataset was lost. I was happy to just be able to find all accessible datasets and rescue their content.
@hiliev I have a question:
Line 37 in 9b3b8ba
@hiliev The child dataset hierarchy seems to be retrieved as follows:
- DSLdataset.ds_dir_obj points to the DSLdirectory
- DSLdirectory.child_dir_zapobj points to a ZAP holding name/object-ID pairs for the child DSL directories
- head_dataset_obj of each child DSLdirectory points to that child's dataset
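The chain above could be sketched like this (the namedtuples and the zap_entries callback are hypothetical stand-ins for the rescue tool's own decoded structures):

```python
from collections import namedtuple

# Hypothetical stand-ins for decoded MOS objects; the field names follow
# the on-disk dsl_dataset_phys_t / dsl_dir_phys_t attributes listed above.
DslDataset = namedtuple("DslDataset", "ds_dir_obj")
DslDir = namedtuple("DslDir", "child_dir_zapobj head_dataset_obj")

def find_child_datasets(mos, zap_entries, dataset_obj):
    """Resolve the first-level children of a dataset by walking:
    dataset -> DSL directory -> child_dir ZAP -> child DSL directory
    -> head dataset object."""
    dsl_dir = mos[mos[dataset_obj].ds_dir_obj]
    children = {}
    for name, child_dir_obj in zap_entries(dsl_dir.child_dir_zapobj):
        children[name] = mos[child_dir_obj].head_dataset_obj
    return children
```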
@hiliev : pushed a PR for #6. Maybe you can close this issue now...
Let me test it on the pool of my server first. As for the _asize value, ZFS stores certain non-zero values in a biased format, i.e. as an offset from the minimum value, in that particular case equal to 1.
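For illustration, the logical and physical size fields in a block pointer use the same trick: they are stored as a count of 512-byte sectors minus one, so the smallest legal block (one sector) encodes as 0. A quick sketch, not py-zfs-rescue's actual code:

```python
SPA_MINBLOCKSHIFT = 9  # 512-byte sectors

def decode_biased_size(raw):
    """Decode a biased blkptr size field: on disk it holds the sector
    count minus one, so raw 0 means one 512-byte sector."""
    return (raw + 1) << SPA_MINBLOCKSHIFT
```

So decode_biased_size(0) gives 512 bytes, and a 128K block is stored as raw value 255.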
@hiliev : Just want to note that I have succeeded in retrieving my files now. I want to thank you for the py-zfs-rescue repo and the hints you gave. The unsorted patches are on https://github.com/eiselekd/dumpbin-py-zfs-rescue; maybe someone will find them useful in the future.
@hiliev : pushed also https://github.com/eiselekd/dumpbin-py-zfs-rescue/blob/master/zfs/sa.py#L59 and https://github.com/eiselekd/dumpbin-py-zfs-rescue/blob/master/zfs/dnode.py#L194, which implement more complete handling of system attributes and bonus type 0x2c. With it, symlinks are also handled. Are you interested in getting a PR?
Sorry, I'm currently moving to a different country and my FreeNAS system is offline in a locker room and I'm very slow at testing and accepting PRs. I'll be able to work on it again in about a month.
@hiliev : ok, I understand. If you have time, let me know and I will supply PRs. There is one error that you might be interested in: https://github.com/eiselekd/dumpbin-py-zfs-rescue/blob/d21f4c28acee0d26ab3ba227fc7d8b03881dffd8/zfs/blocktree.py#L85 In the original repo the level cache is a flat array that is shared between levels. I changed it to be a tree instead.
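To illustrate the difference: a cache keyed by (level, blkid) keeps entries from different indirection levels apart, whereas a flat array indexed by blkid alone lets them overwrite each other. A toy sketch with names of my own choosing:

```python
class LevelCache:
    """Block cache keyed by (level, blkid), so an L1 indirect block and
    an L0 data block with the same blkid occupy separate slots."""

    def __init__(self):
        self._cache = {}

    def get(self, level, blkid):
        return self._cache.get((level, blkid))

    def put(self, level, blkid, block):
        self._cache[(level, blkid)] = block
```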
Hi again, if you are interested and have time now, I can supply patches. (Since py-zfs-rescue enabled me to restore my data, I thought I should contribute back.) Tell me which area you want to address first.
Hi @eiselekd, I'm glad my little project helped you recover your data. I had great plans for it and still have a backlog of todos geared towards making it more user-friendly, in particular turning it into a visual ZFS debugger and explorer. Unfortunately, working at a startup company in a completely different field leaves me with zero spare time for this project. If you are willing to take over the CLI branch and develop it further, please feel free to do so. The areas that need attention are perhaps a proper command-line interface, pool scrub functionality, and support for raidz with higher parity (e.g., raidz2). If you wish, I can also make you a project collaborator, so you don't have to fork a separate version.
@hiliev You can add me as a collaborator and maybe give me access to a special branch that I can hack around in. I could transfer the improvements from https://github.com/eiselekd/dumpbin-py-zfs-rescue back to your repo:
- lz4 decompression (already pulled)
- fletcher4 cksum
- first level child datasets
- blkptr with embedded data (already pulled)
- improved block server protocol
- bigger than 2TB disk support
- support SystemAttributes, bonus type 0x2c (partially pulled)
- variable asize (already pulled)
- fuse (llfuse) interface for recovery
I could also contribute:
- Linux losetup (or similar) based testing
- a command-line interface, as you mentioned, to make the configuration interactive
I sent you an invitation to become a collaborator. It gives you push access and you should be able to create branches on your own. When I find the time, I'll hack on the GUI stuff in a separate branch too.
Accepted, thanks.
@hiliev : Added pull request #12, which adds (from the list above):
- fletcher4 cksum (please pull)
- first level child datasets (was already pulled)
- improved block server protocol (please pull)
- bigger than 2TB disk support (please pull)
- support SystemAttributes, bonus type 0x2c (please pull)
- linux losetup or similar based testing (please pull)
Do I have to accept the pull request explicitly, or do your commit rights allow you to merge it yourself?
I didn't try to push it myself. Also, even though I tested the code on Linux (subfolder test/Makefile), I didn't test it with disks from FreeNAS. I have been setting up a home NAS recently (with FreeNAS in a KVM and a SATA controller card passed through), however I find it a bit hard to work with because /usr/ports is disabled, so I cannot work FreeBSD-style with it except within jails, which I'm not familiar with. I didn't find any description of how to re-enable /usr/ports in FreeNAS. I could run FreeBSD instead, but then I'm not sure what the delta to FreeNAS is.
FreeNAS is based on FreeBSD-STABLE kernels and the ZFS code should be the same as in the vanilla FreeBSD. My FreeNAS box is back online and I'll be able to test the code.
I can also try it out on a FreeBSD box over the weekend.
I tested on FreeBSD 11.2 with mdconfig-backed devices and zpool create datapool0 raidz /dev/${md0} /dev/${md1} /dev/${md2}, and was able to read files back. Child datasets created with zfs create datapool0/datadir, on the other hand, seem to be handled differently on FreeBSD: the child dataset is found in the ZAP, but in the child-dataset code in zfs_rescue.py, child = mos[v] returns None. So child datasets only work for Linux-created pools.
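For reference, the test setup described above can be reproduced roughly like this on FreeBSD (a sketch; image sizes and file names are my own arbitrary choices, and it needs root):

```shell
# Create three file-backed md devices and a raidz pool on them
truncate -s 1g /tmp/disk0.img /tmp/disk1.img /tmp/disk2.img
md0=$(mdconfig -a -t vnode -f /tmp/disk0.img)
md1=$(mdconfig -a -t vnode -f /tmp/disk1.img)
md2=$(mdconfig -a -t vnode -f /tmp/disk2.img)
zpool create datapool0 raidz /dev/$md0 /dev/$md1 /dev/$md2
zfs create datapool0/datadir   # the child dataset under test
```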
Conclusion from my side: OK to push, but create an issue to implement child-dataset support on FreeBSD.
That's strange. The ZFS implementation in FreeBSD should be the one closest to the reference implementation in OpenSolaris, as it directly borrows most of the code. Perhaps Linux is the one that handles child datasets differently. It means that there are ZFS flavours, and the code should somehow be able to detect the flavour or be told about it, e.g., via a command-line argument.
In any case, I'm fine with merging and creating a separate issue for ZFS on FreeBSD.
Attr tables and embedded data are handled