cxl-micron-reskit / famfs
This is the user space repo for famfs, the fabric-attached memory file system
License: Apache License 2.0
We use gcov to measure code coverage. We currently use (and will probably continue to use) a combination of smoke and unit tests to measure "official" coverage (see https://github.com/cxl-micron-reskit/famfs/blob/master/markdown/getting-started.md).
But I would like to see our unit-test-only coverage increase, for various reasons. One is that smoke tests may not be runnable on GitHub, since they need an actual dax device. Moreover, there are some inherently hard-to-cover branches: mmap failures and file open failures are top of the list here. We probably need to enable mocking for open and mmap in order to properly test failures in those functions.
So please work on unit (or smoke) test coverage improvements, especially unit test coverage improvements, and send PRs.
Better unit test coverage will be good since enabling GitHub coverage reporting probably can't include coverage from smoke tests (but please correct me if this seems wrong).
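One possible way to mock open and mmap is to route the calls through function pointers that a unit test can swap out. The sketch below is only an illustration of that idea: every name in it (sys_open, sys_mmap, fail_open, fail_mmap, map_file_readonly) is hypothetical and does not exist in famfs today.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical indirection layer: none of these names exist in famfs. */
typedef int (*open_fn)(const char *path, int flags);
typedef void *(*mmap_fn)(void *addr, size_t len, int prot, int flags,
                         int fd, off_t off);

int real_open(const char *path, int flags)
{
    return open(path, flags);
}

/* Production code calls through these pointers; tests may swap them. */
open_fn sys_open = real_open;
mmap_fn sys_mmap = mmap;

/* Mock that simulates a file open failure. */
int fail_open(const char *path, int flags)
{
    (void)path;
    (void)flags;
    errno = EACCES;
    return -1;
}

/* Mock that simulates an mmap failure. */
void *fail_mmap(void *addr, size_t len, int prot, int flags,
                int fd, off_t off)
{
    (void)addr; (void)len; (void)prot;
    (void)flags; (void)fd; (void)off;
    errno = ENOMEM;
    return MAP_FAILED;
}

/* Example code under test: map a file read-only; returns 0 or -errno. */
int map_file_readonly(const char *path, size_t len, void **addr_out)
{
    int fd = sys_open(path, O_RDONLY);
    if (fd < 0)
        return -errno;

    void *addr = sys_mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
    int saved_errno = errno;

    close(fd); /* a successful mapping stays valid after the fd is closed */
    if (addr == MAP_FAILED)
        return -saved_errno;

    *addr_out = addr;
    return 0;
}
```

A unit test would set sys_open = fail_open (or sys_mmap = fail_mmap), call the function under test, and check that the error path returns the expected -errno. Wrapping the symbols at link time is another option that avoids touching call sites; which approach fits the existing unit test setup is an open question.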
The log is currently always 8MiB, which holds something over 25000 log entries. Reasons we might support a configurable log size:
Under no circumstances should we support a log size that is not a 2MiB multiple. Honestly I'm not sure this needs to be addressed, but it's a question that has come up several times.
The trickiest part would be test coverage.
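If a configurable log size is ever supported, the 2MiB-multiple rule could be enforced with a check along these lines. This is a minimal sketch: the macro and function names are made up for illustration, and rejecting a zero size is an assumption.

```c
#include <stdbool.h>
#include <stddef.h>

#define LOG_ALIGN        (2UL * 1024 * 1024)  /* 2MiB granularity */
#define LOG_DEFAULT_SIZE (8UL * 1024 * 1024)  /* today's fixed log size */

/* Hypothetical helper: accept only non-zero multiples of 2MiB. */
bool log_size_valid(size_t log_len)
{
    return log_len != 0 && (log_len % LOG_ALIGN) == 0;
}
```

With this check, the current 8MiB default passes, while 3MiB or 0 would be rejected before mkfs touches the device.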
It currently thinks the project is mostly ROFF files :D
This function needs to either:
The first option leaves the cleanup almost exactly the same, whereas the second leaves two things to munmap.
This is a bug, but it's only a latent bug until we support more than one log size.
I'm thinking the test should prove that while the lock is held, other processes can't get the lock. What else? We currently rely on this being bulletproof...
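A minimal sketch of such a test, assuming the lock in question is an flock() on a file: the parent takes the lock, then a forked child opens the same file itself and tries a non-blocking flock(). The function name is hypothetical, and a real test would target the actual famfs log file and locking helpers.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/file.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Returns 1 if a second process is refused the lock while we hold it,
 * 0 if the second process got the lock, and -1 on any setup error.
 */
int lock_excludes_other_process(const char *path)
{
    int fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd < 0)
        return -1;
    if (flock(fd, LOCK_EX | LOCK_NB) != 0) {
        close(fd);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        close(fd);
        return -1;
    }
    if (pid == 0) {
        /*
         * Child: open the file again. An inherited fd refers to the
         * same open file description and would share the parent's lock.
         */
        int cfd = open(path, O_RDWR);
        if (cfd < 0)
            _exit(2);
        if (flock(cfd, LOCK_EX | LOCK_NB) == 0)
            _exit(1); /* got the lock: exclusion failed */
        _exit(errno == EWOULDBLOCK ? 0 : 2);
    }

    int status = 0;
    pid_t waited = waitpid(pid, &status, 0);

    flock(fd, LOCK_UN);
    close(fd);

    if (waited < 0 || !WIFEXITED(status))
        return -1;
    switch (WEXITSTATUS(status)) {
    case 0:
        return 1;
    case 1:
        return 0;
    default:
        return -1;
    }
}
```

A further case worth covering might be that the lock becomes available again once the holder releases it or exits.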
The reference is here:
https://github.com/cxl-micron-reskit/famfs/blob/master/markdown/building-a-kernel.md
Planning to check in a famfs-config file to this repo, and specify a 'wget' command to retrieve it for the kernel build.
Log entries currently have space for 80 characters of relative path. Need to fail gracefully if we overflow.
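Failing gracefully could be as simple as rejecting the path before it is copied into the log entry. The sketch below assumes an 80-byte field that includes the terminating NUL; the macro and function names are hypothetical, not the famfs ones.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Assumed size of the relative-path field, including the NUL. */
#define RELPATH_FIELD_SIZE 80

/*
 * Hypothetical helper: copy relpath into a fixed-size log field.
 * Returns 0 on success, or -ENAMETOOLONG (leaving dst untouched)
 * if the path does not fit.
 */
int set_log_relpath(char dst[RELPATH_FIELD_SIZE], const char *relpath)
{
    size_t len = strlen(relpath);

    if (len >= RELPATH_FIELD_SIZE)
        return -ENAMETOOLONG;

    memcpy(dst, relpath, len + 1); /* +1 copies the NUL */
    return 0;
}
```

The caller can then report the error and skip the file creation, rather than logging a truncated path.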
I added a test that manually puts a symlink where a directory should be, and then runs logplay. The link was to a directory (/tmp) and logplay just thinks the directory already existed (i.e. stat followed the link to its destination and reported on that).
We may need to use fstatat() in logplay (rather than stat()) to avoid this. Or always do an lstat() before the stat(). Also preventing symlinks might be a good idea, if there's a way.
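A minimal sketch of the lstat() variant: because lstat() reports on the link itself rather than its target, a symlink sitting where a directory should be can be told apart from a real directory. The function name and the choice of -ELOOP for the symlink case are assumptions made for illustration.

```c
#include <errno.h>
#include <sys/stat.h>

/*
 * Hypothetical helper for logplay: returns 0 if path is a real
 * directory, -ELOOP if it is a symlink, -ENOTDIR if it is some other
 * kind of file, or -errno if lstat() fails (e.g. -ENOENT).
 */
int check_real_directory(const char *path)
{
    struct stat st;

    if (lstat(path, &st) != 0)
        return -errno;
    if (S_ISLNK(st.st_mode))
        return -ELOOP;
    if (!S_ISDIR(st.st_mode))
        return -ENOTDIR;
    return 0;
}
```

Calling fstatat() with AT_SYMLINK_NOFOLLOW gives the same information, so either call should work here.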
This function is duplicated in famfs_cli.c and pcq.c. Maybe it should move to a shared file such as famfs_cli_lib.c, or something else...
Does famfs provide POSIX-like interfaces for operating files under the famfs file system, such as open, close, write, read, lseek, and other functions?
Something along the lines of what we did in this sub-thread: https://lore.kernel.org/linux-fsdevel/w5cqtmdgqtjvbnrg5okdgmxe45vjg5evaxh6gg3gs6kwfqmn5p@wgakpqcumrbt/.
Test artifacts should be captured in log files. Extra points for scriptage that sanity checks results (though this could get complicated since not all memory has the same performance...). I already have some script setup, but it will need to check core count and memdev size in order to set up properly on any system/memdev.
Role is MASTER if the system created the file system, and CLIENT if a different system created it
https://github.com/pmem/pmdk/tree/bbf9c4fa2a8ad338052a4b4aed26112e809bd5e4/src/libpmem2
No rush on this, but this looks like the right approach.
Log overflow has not been tested yet...
Mind you the default log size has space for >25000 entries, and the minimum allocation unit is 2MiB (but if you're creating files <=2MiB, you're probably missing the point of famfs).
Still: will add such a test
The famfs kernel RFC v1 has fault counters that can be enabled via /sys/fs/famfs/... These were implemented because we had a bug in the kernel module at one point that resulted in 4K mapping faults rather than 2M, and it killed performance for many concurrent computational processes hammering on the same data frame(s) in memory.
Famfs files are constrained to multiples of 2MiB, and mmap addresses are constrained to 2MiB alignment - both for this reason. But we need a good way to catch a regression.
However, the counters proved controversial (see thread [1]) so they will probably be dropped from the next version of the patch set. Dan Williams pointed out a user space test that ndctl uses to verify something similar [2], but I'm hoping we can do better. The first place to look is the rest of the thread at [1] to see if a prescription becomes clear.
[1] https://lore.kernel.org/linux-fsdevel/3jwluwrqj6rwsxdsksfvdeo5uccgmnkh7rgefaeyxf2gu75344@ybhwncywkftx/T/#m69d2b66e54f9657c38e6e0a0da94ab4b3eca7586
[2] https://github.com/pmem/ndctl/blob/main/test/dax.sh#L31
This should be fixed, but the workaround is to put your dax device in a valid state and try again.
It needs to be 0755 rather than 0644
The execute bits control ability to list the directory for non-owners. Hot fix on the way...
When I was trying some simple commands to see the behavior, I noticed that if I use a command like:
echo 1234 >> /mnt/famfs/test
then I am using famfs_dax_write_iter to change the contents.
However, no log entry is added, so famfs verify/fsck does not pass.
Is that an issue? Or is there another way to write to the file?
Exit code is 1, but no message. There should be a message.
Killing the superblock also only works on the host that created it. I can see going either way on this, but it at least seems reasonable. Should there be a double-secret force?
clflush needed:
after mkfs, on superblock and log
after log append (on log)
before logplay (on log)
after 'famfs cp'
after 'famfs creat' if the file was initialized.
The user or app will be responsible for flushing the cache on any data written by apps other than the famfs cli.
Still thinking about how the generalized cases should work. Probably a "famfs flush" command and api call that should flush data as appropriate...
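Whatever form a generalized flush takes, it will probably need to walk a byte range one cache line at a time. The sketch below shows only that range arithmetic. The 64-byte line size is an assumption, all names are hypothetical, and the per-line flush is a callback so that the architecture-specific instruction (and any fence it needs afterwards) stays out of the example.

```c
#include <stddef.h>
#include <stdint.h>

#define CACHELINE_SIZE 64 /* assumed; should be discovered at runtime */

/* Callback that flushes the cache line containing 'line'. */
typedef void (*flush_line_fn)(const void *line);

/* Stand-in callback that does nothing, useful for dry runs and tests. */
void flush_line_noop(const void *line)
{
    (void)line;
}

/*
 * Hypothetical helper: invoke flush_line once for every cache line
 * that overlaps [addr, addr + len). Returns the number of lines visited.
 */
size_t flush_range(const void *addr, size_t len, flush_line_fn flush_line)
{
    if (len == 0)
        return 0;

    /* Round the start down to a cache line boundary. */
    uintptr_t p = (uintptr_t)addr & ~(uintptr_t)(CACHELINE_SIZE - 1);
    uintptr_t end = (uintptr_t)addr + len;
    size_t lines = 0;

    for (; p < end; p += CACHELINE_SIZE) {
        flush_line((const void *)p);
        lines++;
    }
    return lines;
}
```

Note that an unaligned range can touch one more line than len / 64 suggests, which is why the start address is rounded down first.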
The current run_smoke.sh script requires famfs.ko to NOT be loaded at start. This refactor should:
The kernel mount is performed, and then famfs_mkmeta() discovers that there is no valid superblock and bails out on creating the meta files.
Need to think through the right way to handle this gracefully. Probably issue a umount on the way out.
"cp /path/to/* /mnt/famfs/" fails (when /mnt/famfs is the mount point), but succeeds to a subdirectory of the mount point
Thanks to Jacob Jacob for reporting
Compilation failed with the following error:
#make all
make[3]: Entering directory '/root/famfs/debug'
[ 3%] Building C object CMakeFiles/libfamfs.dir/src/famfs_lib.c.o
/root/famfs/src/famfs_lib.c:423:28: warning: ‘enum extent_type’ declared inside parameter list will not be visible outside of this definition or declaration
423 | enum extent_type *type)
| ^~~~~~~~~~~
/root/famfs/src/famfs_lib.c:421:1: error: conflicting types for ‘famfs_get_device_size’; have ‘int(const char *, size_t *, enum extent_type *)’ {aka ‘int(const char *, long unsigned int *, enum extent_type *)’}
421 | famfs_get_device_size(const char *fname,
| ^~~~~~~~~~~~~~~~~~~~~
In file included from /root/famfs/src/famfs_lib.c:35:
/root/famfs/src/famfs_lib.h:21:12: note: previous declaration of ‘famfs_get_device_size’ with type ‘int(const char *, size_t *, enum famfs_extent_type *)’ {aka ‘int(const char *, long unsigned int *, enum famfs_extent_type *)’}
21 | extern int famfs_get_device_size(const char *fname, size_t *size, enum famfs_extent_type *type);
| ^~~~~~~~~~~~~~~~~~~~~
/root/famfs/src/famfs_lib.c: In function ‘famfs_mkfs’:
/root/famfs/src/famfs_lib.c:3812:14: error: variable ‘type’ has initializer but incomplete type
3812 | enum extent_type type = SIMPLE_DAX_EXTENT;
| ^~~~~~~~~~~
/root/famfs/src/famfs_lib.c:3812:26: error: storage size of ‘type’ isn’t known
3812 | enum extent_type type = SIMPLE_DAX_EXTENT;
| ^~~~
/root/famfs/src/famfs_lib.c:3812:26: warning: unused variable ‘type’ [-Wunused-variable]
make[3]: *** [CMakeFiles/libfamfs.dir/build.make:76: CMakeFiles/libfamfs.dir/src/famfs_lib.c.o] Error 1
make[3]: Leaving directory '/root/famfs/debug'
make[2]: *** [CMakeFiles/Makefile2:184: CMakeFiles/libfamfs.dir/all] Error 2
make[2]: Leaving directory '/root/famfs/debug'
make[1]: *** [Makefile:146: all] Error 2
make[1]: Leaving directory '/root/famfs/debug'
make: *** [Makefile:13: debug] Error 2
When famfs creates a file (famfs_mkfile() in the api, normally from 'famfs cp' or 'famfs creat'), it will fail if the file already exists. But if a rogue delete had taken place, and then a cp or creat tried to create the same relative path, it would not see the file in the mounted filesystem - and would proceed to create and log the file instance.
A rogue delete is any 'rm' that did not occur through the famfs api/cli - and since the api/cli currently does not support delete, it's any delete.
This would effectively make a mess of things.
This could be solved by building a hash table of relative paths during logplay (master only), and 1) detecting relative path collisions in the log, and 2) detecting famfs_mkfile() or famfs_mkdir() calls that generate relative path collisions in the log (which are detectable via the mounted namespace if there have been no rogue namespace operations).
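A minimal sketch of what such a relative-path table might look like: a chained hash set where adding a path that is already present returns -EEXIST. Logplay would add each logged path, and file or directory creation would add the new path before appending to the log. The names, bucket count, and hash function are all illustrative choices, not existing famfs code.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define PATH_SET_BUCKETS 1024 /* arbitrary size for illustration */

struct path_node {
    struct path_node *next;
    char *relpath;
};

struct path_set {
    struct path_node *buckets[PATH_SET_BUCKETS];
};

/* Simple multiplicative string hash; any reasonable hash would do. */
static unsigned long hash_relpath(const char *s)
{
    unsigned long h = 5381;

    while (*s)
        h = h * 33 + (unsigned char)*s++;
    return h;
}

/* Returns 0 if added, -EEXIST on a collision, -ENOMEM on failure. */
int path_set_add(struct path_set *set, const char *relpath)
{
    struct path_node **head =
        &set->buckets[hash_relpath(relpath) % PATH_SET_BUCKETS];
    size_t len = strlen(relpath) + 1;

    for (struct path_node *n = *head; n; n = n->next)
        if (strcmp(n->relpath, relpath) == 0)
            return -EEXIST;

    struct path_node *n = malloc(sizeof(*n));
    if (!n)
        return -ENOMEM;
    n->relpath = malloc(len);
    if (!n->relpath) {
        free(n);
        return -ENOMEM;
    }
    memcpy(n->relpath, relpath, len);
    n->next = *head;
    *head = n;
    return 0;
}

/* Free every node; the set itself is owned by the caller. */
void path_set_free(struct path_set *set)
{
    for (size_t i = 0; i < PATH_SET_BUCKETS; i++) {
        struct path_node *n = set->buckets[i];

        while (n) {
            struct path_node *next = n->next;

            free(n->relpath);
            free(n);
            n = next;
        }
        set->buckets[i] = NULL;
    }
}
```

Usage would start from a zero-initialized struct path_set; a -EEXIST return from path_set_add() is the collision signal.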
One downside to this is that it will make the O() order of file creation worse; it's already kinda expensive because space allocation plays the log to get the free/available bitmap - which is not persisted. There is not an "easy" way to persist the hash table either, so that might need to be re-generated on each [batch of] file create. (batches because 'cp *' and 'cp -r' and 'mkdir -p' lock the log and build the bitmap once for a batch of creates).
Hmm. We could persist the bitmap in a new meta file, and only expose that on the master. Even the hash table could be handled that way. Extend the flock(log) to cover those files, and it may be a fully working approach worth considering...eventually.
This has not been observed in the wild; Will put this "on ice" initially, but may need to