zadammac / tapestry
Specialist Batch File Backup Tool
License: GNU General Public License v3.0
If any error causes a worker process to halt, the program will stop at the nearest tasks.join() call.
Proposed solution: better error handling to be implemented in 1.1.
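The usual cause of a join() hang is a worker that dies between get() and task_done(): the queue's unfinished-task counter never reaches zero. Calling task_done() in a finally block keeps the accounting balanced even when a task raises. A minimal sketch of that pattern, with hypothetical names (shown with threads and queue.Queue for brevity; a multiprocessing.JoinableQueue behaves the same way):

```python
import queue
import threading

def worker(tasks, errors):
    """Drain the task queue; task_done() runs even when a task raises,
    so tasks.join() cannot hang on a failed task."""
    while True:
        job = tasks.get()
        if job is None:  # sentinel: shut this worker down
            tasks.task_done()
            return
        try:
            job()
        except Exception as exc:
            errors.put(exc)  # report the failure instead of dying silently
        finally:
            tasks.task_done()  # always balance the get()

tasks, errors = queue.Queue(), queue.Queue()
threads = [threading.Thread(target=worker, args=(tasks, errors)) for _ in range(2)]
for t in threads:
    t.start()
for job in [lambda: 1 / 0, lambda: None, lambda: None]:
    tasks.put(job)
tasks.join()  # returns despite the ZeroDivisionError
for _ in threads:
    tasks.put(None)
for t in threads:
    t.join()
print(errors.qsize())  # 1
```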
Describe the bug
During backup operations, the process completes without any signing taking place. The signing key lives on an attached Yubikey.
...
Signing Enqueued
The processing has completed. Your .tap files are here:
....
To Reproduce
Steps to reproduce the behavior:
Expected behavior
A clear and concise description of what you expected to happen.
Log/Console Output
...
Signing Enqueued
The processing has completed. Your .tap files are here:
....
Desktop (please complete the following information):
Additional context
I believe the issue lies in the signing logic itself, in gpg's handling of the smart card, or possibly in a directory misdirection problem.
Describe the bug
In a situation where ~/.ssh/known_hosts contains multiple keys for the same hostname (for example, if a given host the user has access to has sshd running on multiple ports via containerization), Tapestry will raise the SSHException indicating a mismatched hostkey was found, and SFTP transfer will fail. Files are, in this condition, retained locally as expected, and an alert is raised to the user that this has occurred.
To Reproduce
Reproduction is preconditioned on having the setup described above. This could be reproduced as simply as creating a dummy SFTP host listening on 2222/tcp on another host which already has SSH of its own on 22/tcp as normal.
Steps to reproduce the behavior:
Expected behavior
This should work; checking known_hosts should take all host-keys for a given user into consideration.
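A minimal illustration of the desired lookup behaviour, with hypothetical helper names (Tapestry's actual transfer code goes through paramiko, not this sketch): collect every key recorded for a hostname, including the [host]:port form OpenSSH writes for nonstandard ports, and accept the presented key if it matches any of them.

```python
from collections import defaultdict

def load_known_hosts(lines):
    """Map each hostname pattern to *every* key listed for it, rather
    than just the first -- a host may legitimately present different
    keys on different ports."""
    keys = defaultdict(set)
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        hosts, keytype, blob = line.split()[:3]
        for host in hosts.split(","):
            keys[host].add((keytype, blob))
    return keys

def host_key_ok(keys, host, keytype, blob):
    """True if the presented key matches ANY recorded key for host."""
    return (keytype, blob) in keys.get(host, set())

# dummy entries: same host, two ports, two different keys
sample = [
    "example.com ssh-ed25519 AAAAkey1",
    "[example.com]:2222 ssh-ed25519 AAAAkey2",
    "example.com ssh-rsa AAAAkey3",
]
keys = load_known_hosts(sample)
print(host_key_ok(keys, "example.com", "ssh-rsa", "AAAAkey3"))  # True
```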
Log/Console Output
Where possible, add the output of the Logs or Console when the problem occurs.
Desktop (please complete the following information):
Additional context
This is probably a very unusual arrangement so I wouldn't be surprised if I am the only person to encounter this.
Is your feature request related to a problem? Please describe.
Tapestry purports to create a storage-security agnostic backup, but offers limited (and frankly low-availability) options for cloud or remote storage of Tapestry's .tap backup files. This remote-to-the-user storage is a key part of a true recoverable backup solution and the fact it is missing is a potential risk exposure to users who are storing backup files purely locally.
Describe the solution you'd like
In addition to the existing SFTP functionality, Tapestry should have a new network mode, 's3', which leverages the relatively low-cost Amazon S3 Glacier storage mechanism and allows for both automated backup and retrieval from a specified S3 bucket the user or organization controls. This would be provided alongside user-friendly tutorials on the setup and maintenance of S3, and would allow us to leverage AWS's assurances regarding uptime and availability.
Describe alternatives you've considered
Alternatives looked at were to create a guide on using various cloud platform vendors to allow the use of SFTP, which is already supported by Tapestry as a storage and retrieval mechanism.
Additional context
This would be the first major feature release since SFTP support earlier in 2020.
Describe the bug
A long running backup will crash mid-process if a file vanishes at any point. This bug was originally encountered during the AccessTest function, which nominally should be checking if a file exists, so we have a double bug.
The exception is OSError, so we really just need to add handling for it.
To Reproduce
Steps to reproduce the behavior:
Run a backup that includes ~/.config/ on an in-use system.
Use --inc mode.
Expected behavior
An error should be logged but the actual backup process should continue.
Log/Console Output
-stack trace abbreviated-
OSError: [Errno 6] No such device or address: '/home/patches/.config/discord/SS'
Desktop (please complete the following information):
Additional context
Exposed during the same ordinary usage as #34 and likely a similarly minor fix. Should be fixed in the same minor release.
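A sketch of the proposed handling, with hypothetical names: wrap the per-file stat in a try/except OSError so a file that vanishes between discovery and sizing is logged to the skip list instead of killing the run.

```python
import os

def sized_or_skip(paths, skiplog):
    """Size each path, skipping (and recording) any file that vanishes
    or errors out between discovery and stat -- the backup continues."""
    ops = []
    for path in paths:
        try:
            ops.append((path, os.path.getsize(path)))
        except OSError as exc:  # covers FileNotFoundError, Errno 6, etc.
            skiplog.append("skipped %s: %s" % (path, exc))
    return ops

skiplog = []
ops = sized_or_skip(["/definitely/not/a/real/file"], skiplog)
print(len(ops), len(skiplog))  # 0 1
```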
Describe the bug
[This was thought solved, then reinvented in 2.2.1]
This step takes a very long time because of the need to generate a hash for each individual file. We need to put some thought into either fixing the look and feel or fixing the performance.
To Reproduce
Steps to reproduce the behavior:
Expected behavior
A clear and concise description of what you expected to happen.
Log/Console Output
Where possible, add the output of the Logs or Console when the problem occurs.
Desktop (please complete the following information):
Additional context
Add any other context about the problem here.
Describe the bug
In some conditions, if two child processes both detect, concurrently, that the parent directory for a category does not exist, both will attempt to create the parent and one will error out. This causes a halt, as the errored-out process never completes.
To Reproduce
Steps to reproduce the behavior:
Expected behavior
This error should be handled without being exposed to the user and in such a way as to not interrupt the flow of the program.
Log/Console Output
Where possible, add the output of the Logs or Console when the problem occurs.
Desktop (please complete the following information):
Additional context
Add any other context about the problem here.
Describe the bug
In investigating recent silent failures of an automated implementation of Tapestry, it was discovered that broken symbolic links cause an unhandled exception and crash Tapestry in place. The exception is FileNotFoundError, raised by one of the subordinate calls in os.walk.
To Reproduce
Steps to reproduce the behavior:
Expected behavior
The exception should be captured and the broken file reported in the skiplogger if at all possible.
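One way to keep the ops-list builder from tripping over dangling links before it ever calls os.path.getsize (the helper name is hypothetical): os.path.islink() does not follow the link, while os.path.exists() does, so the two together identify a broken symlink.

```python
import os
import tempfile

def is_broken_symlink(path):
    """A broken link is a link whose target does not exist:
    os.path.exists() follows the link, os.path.islink() does not."""
    return os.path.islink(path) and not os.path.exists(path)

# demo: a symlink pointing at a path that does not exist
d = tempfile.mkdtemp()
link = os.path.join(d, "dangling")
os.symlink(os.path.join(d, "gone"), link)
print(is_broken_symlink(link))  # True
```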
Log/Console Output
Where possible, add the output of the Logs or Console when the problem occurs.
patches@Jupiter:~$ tapcontroller
/usr/lib/python3.6/runpy.py:125: RuntimeWarning: 'tapestry.__main__' found in sys.modules after import of package 'tapestry', but prior to execution of 'tapestry.__main__'; this may result in unpredictable behaviour
warn(RuntimeWarning(msg))
Welcome to Tapestry Backup Tool Version 2.0.2
Automatic updates are not currently a function of this script.
Please refer to the repo for updates: https://www.github.com/ZAdamMac/Tapestry
Gathering a list of files to archive - this could take a few minutes.
Traceback (most recent call last):
File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/home/patches/.local/lib/python3.6/site-packages/tapestry/__main__.py", line 1133, in <module>
do_main(state, gpg_conn)
File "/home/patches/.local/lib/python3.6/site-packages/tapestry/__main__.py", line 295, in do_main
ops_list = build_ops_list(namespace)
File "/home/patches/.local/lib/python3.6/site-packages/tapestry/__main__.py", line 71, in build_ops_list
size = os.path.getsize(absolute_path)
File "/usr/lib/python3.6/genericpath.py", line 50, in getsize
return os.stat(filename).st_size
FileNotFoundError: [Errno 2] No such file or directory: '/home/patches/Documents/Website/config/mods-enabled/dir.conf'
Desktop (please complete the following information):
Additional context
This can probably be made part of the logging improvements project and handled there.
The how: The subprocesses that do sample generation seem to hold the GPG system hostage. Short of instantiating a second install for the testing script (which is absurd, and would break several tests), it makes more sense to split the testing process into a suite of scripts, run successively.
Describe the bug
When generating a new key, tapestry will thereafter exhibit one of two behaviours:
This is because generate_keys is correctly updating the configuration, but it's called after parse_config() and therefore needs to be patched to update the NAMESPACE as well.
To Reproduce
Steps to reproduce the behavior:
Expected behavior
A new key should be generated and written to configuration, and that key should be used throughout the rest of the backup process.
Log/Console Output
Where possible, add the output of the Logs or Console when the problem occurs.
Desktop (please complete the following information):
Additional context
Add any other context about the problem here.
Bug was reported by the redoubtable Kat Adam-MacEwen
Describe the bug
Tapestry does not currently correctly handle instances where it is invoked without a config file available.
To Reproduce
Steps to reproduce the behavior:
tapestry.cfg
python3 -m tapestry
Expected behavior
Tapestry should simply exit after the issue is raised rather than attempting to proceed; the attempt to proceed is what causes the error listed below.
Log/Console Output
Where possible, add the output of the Logs or Console when the problem occurs.
patches@sevastopol:~/Downloads$ python3 -m tapestry
The indicated config file: tapestry.cfg cannot be found.
Generating a template config file in that location.
Please edit this config file appropriately and rerun the program.
Traceback (most recent call last):
File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/home/patches/.local/lib/python3.8/site-packages/tapestry/__main__.py", line 13, in <module>
tapestry.runtime()
File "/home/patches/.local/lib/python3.8/site-packages/tapestry/functions.py", line 1192, in runtime
state = parse_config(state)
File "/home/patches/.local/lib/python3.8/site-packages/tapestry/functions.py", line 858, in parse_config
place_config_template(ns.config_path)
File "/home/patches/.local/lib/python3.8/site-packages/tapestry/functions.py", line 985, in place_config_template
config.set(section, option, value)
File "/usr/lib/python3.8/configparser.py", line 1200, in set
self._validate_value_types(option=option, value=value)
File "/usr/lib/python3.8/configparser.py", line 1185, in _validate_value_types
raise TypeError("option values must be strings")
TypeError: option values must be strings
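The traceback bottoms out in configparser's type check: config.set() accepts only strings, so any non-string default in the template must be coerced with str() before being written. A minimal reproduction and fix (section and option names are invented for illustration):

```python
import configparser

config = configparser.ConfigParser()
config.add_section("Environment Variables")

# reproduction: a non-string value triggers the crash in the traceback
try:
    config.set("Environment Variables", "blocksize", 4096)
except TypeError as exc:
    print(exc)  # option values must be strings

# fix: coerce template defaults to str before config.set()
config.set("Environment Variables", "blocksize", str(4096))
print(config.get("Environment Variables", "blocksize"))  # 4096
```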
Desktop (please complete the following information):
Additional context
Add any other context about the problem here.
Is your feature request related to a problem? Please describe.
When packaging a large number of files (~58,965), the block build phase is unacceptably under-performant if the block size is also sufficiently large; e.g., in the instance above only 6 blocks were created. This is likely due to the delay in acquiring the lock, as the parallelism of the unix block build is limited by the number of block output files being created.
The specific example cited above took roughly 12 hours, if not a bit more, to complete just the building portion. The attendant level-2 compression took ~8 minutes.
Describe the solution you'd like
Time to handle should be reduced significantly if at all possible.
Describe alternatives you've considered
Backgrounding the process would reduce the annoyance to the user, but also decrease the accuracy of the snapshot, since files could be written/modified during the build process if it takes a considerably long time.
Since the limitations here are disk i/o, which we can do nothing about, and the block writes, which we can also do nothing about, we should explore the idea of an optional flag that would allow Tapestry to break a file up into smaller chunks provided those chunks satisfy the following properties:
Alternatively, perhaps many smaller blocks could be briefly created to reduce the packaging load, then composited into right-size blocks?
We also need to investigate what is causing the actual delay here. If it really is locking we can solve it, but if it's just constrained by disk i/o there is nothing to be done.
Additional context
This is likely not a light lift as was the case for #34, #35, and #36; a fix is not required as a bug fix and may be considerably more involved.
Describe the bug
The 2.2.0 version is completely invalid, as a function is called on a name that was never actually instantiated. How the hell did this pass into release?
To Reproduce
Steps to reproduce the behavior:
Expected behavior
The application should run.
Log/Console Output
Welcome to Tapestry Backup Tool Version 2.2.0
Automatic updates are not currently a function of this script.
Please refer to the repo for updates: https://www.github.com/ZAdamMac/Tapestry
Gathering a list of files to archive - this could take a few minutes.
/usr/lib/python3/dist-packages/requests/__init__.py:89: RequestsDependencyWarning: urllib3 (1.26.12) or chardet (3.0.4) doesn't match a supported version!
warnings.warn("urllib3 ({}) or chardet ({}) doesn't match a supported "
Traceback (most recent call last):
File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/home/patches/.local/lib/python3.8/site-packages/tapestry/__main__.py", line 13, in <module>
tapestry.runtime()
File "/home/patches/.local/lib/python3.8/site-packages/tapestry/functions.py", line 1262, in runtime
do_main(state, gpg_conn)
File "/home/patches/.local/lib/python3.8/site-packages/tapestry/functions.py", line 334, in do_main
ops_list = build_ops_list(namespace)
File "/home/patches/.local/lib/python3.8/site-packages/tapestry/functions.py", line 106, in build_ops_list
hasher.update(chunk)
NameError: name 'hasher' is not defined
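The fix is simply to instantiate the digest object before the update loop runs. Sketched below with hashlib.sha256, though which digest Tapestry actually uses is an assumption here:

```python
import hashlib

def hash_file_chunks(chunks):
    """The crash pattern: a loop calls hasher.update(chunk) without a
    preceding constructor call. Instantiate the hasher BEFORE the loop."""
    hasher = hashlib.sha256()  # this line is what 2.2.0 was missing
    for chunk in chunks:
        hasher.update(chunk)
    return hasher.hexdigest()

# chunked hashing must match hashing the whole buffer at once
assert hash_file_chunks([b"hello ", b"world"]) == \
    hashlib.sha256(b"hello world").hexdigest()
```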
Desktop (please complete the following information):
Additional context
Add any other context about the problem here.
Describe the bug
Tapestry will crash on encountering a file it cannot copy. Normally this shouldn't happen, but it can in the event that you're copying out (e.g.) system directories or other tools' files.
To Reproduce
Steps to reproduce the behavior:
Run a backup in normal or --inc modes.
Expected behavior
It would be better to skip and log such files, possibly with a displayed warning indicating that the script encountered files it could not read.
Log/Console Output
patches@Chimera-Coruscant:~$ python3 -m tapestry --inc -c tapestry.cfg
Welcome to Tapestry Backup Tool Version 2.1.1
Automatic updates are not currently a function of this script.
Please refer to the repo for updates: https://www.github.com/ZAdamMac/Tapestry
Gathering a list of files to archive - this could take a few minutes.
Traceback (most recent call last):
File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/home/patches/.local/lib/python3.8/site-packages/tapestry/__main__.py", line 13, in <module>
tapestry.runtime()
File "/home/patches/.local/lib/python3.8/site-packages/tapestry/functions.py", line 1211, in runtime
do_main(state, gpg_conn)
File "/home/patches/.local/lib/python3.8/site-packages/tapestry/functions.py", line 328, in do_main
ops_list = build_ops_list(namespace)
File "/home/patches/.local/lib/python3.8/site-packages/tapestry/functions.py", line 102, in build_ops_list
with open(absolute_path, "rb") as contents:
PermissionError: [Errno 13] Permission denied: '/home/patches/server/piminder/test/db/ibtmp1'
Desktop (please complete the following information):
Additional context
Add any other context about the problem here.
None, issue was previously not considered.
If the expected key is absent (has been deleted from the GPG keyring), Tapestry will proceed as normal through the tar-building, compression, and encryption phases. Because the key is absent, encryption "completes" instantly, and signing then fails. The program nonetheless believes it has done its job correctly; since encryption is the phase which moves the tarball files to the output directory, and the program deletes its temporary working directory at the end, the net effect is to waste the time of the whole process with no actual result.
Release 1.0.0 and later.
Describe the bug
Status print currently contains a bug where it rounds up the completion percentage. While this may initially be desirable for very lengthy operations (reaching 1% as soon as possible), it causes an issue where the displayed percentage can reach 100% well before the round is actually complete. When this happens, the console is spammed repeatedly with DONE! messages, each on its own new line, even though the operation is not done.
To Reproduce
Steps to reproduce the behavior:
Expected behavior
The 100% - Done message should only be shown on the final operation in the queue.
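One hedged sketch of the fix: compute the percentage with floor division (which truncates rather than rounds) and attach the Done! suffix only when the counter equals the total. The function name and bar format are illustrative, not Tapestry's actual code:

```python
def progress_line(done, total, width=15):
    """Floor the percentage so 100% -- and the Done! suffix --
    appear only on the final operation in the queue."""
    pct = (100 * done) // total          # floor, never round up
    bar = "#" * (width * done // total)
    suffix = " - Done!" if done == total else ""
    return "Packing: [%s] %d%%%s" % (bar.ljust(width), pct, suffix)

print(progress_line(995, 1000))   # 99%, no Done! (round() would say 100)
print(progress_line(1000, 1000))  # the only line that says Done!
```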
Log/Console Output
Packing: [###############] 100% - Done!
Packing: [###############] 100% - Done!
Packing: [###############] 100% - Done!
Packing: [###############] 100% - Done!
Packing: [###############] 100% - Done!
Packing: [###############] 100% - Done!
Packing: [###############] 100% - Done!
Packing: [###############] 100% - Done!
Packing: [###############] 100% - Done!
Packing: [###############] 100% - Done!
Desktop (please complete the following information):
and later.
Additional context
This is probably a minor fix along with the other issues described here, which came from working with production data. Its own minor release is not required if all of the open tickets can be closed reasonably close together.
Is your feature request related to a problem? Please describe.
parse_config(), which is itself called inside a runtime function already wrapped in an if __name__ == "__main__" guard, contains such a guard of its own, which prevents most of its logic from executing in any situation where it is not running as main; for example, in the positive_tests.py package currently part of test-enhance.
Describe the solution you'd like
Eliminate the if-name-main check from the function and shift everything behind it one tab to the left.
Describe alternatives you've considered
It might be possible to use some kind of override in the test itself, but this is the more elegant solution.
Additional context
This was tested in an uncommitted dev build of Tapestry and did not immediately appear to have any negative effects. It is required for positive_tests.test_parse_config() to complete successfully.
Is your feature request related to a problem? Please describe.
Tapestry's debug_print() function relies upon a global variable, state, which I believe is the master NS reference. This works as intended during runtime, but causes failures during testing under the test-enhance model.
Describe the solution you'd like
Add exception handling to catch the relevant NameError when debug_print() is called outside of its usual context, which will cause the function to simply pass.
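A minimal sketch of the proposed guard (the state global and its debug flag are assumed here, per the description above):

```python
def debug_print(message):
    """Print only when the module-global state says so; when that
    global is absent (e.g. under the test harness), the NameError is
    swallowed and the call simply does nothing."""
    try:
        if state.debug:  # noqa: F821 -- module global set at runtime
            print(message)
    except NameError:
        pass

# outside the usual runtime context there is no state global: no crash
debug_print("no global state here, and no crash either")
```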
Describe alternatives you've considered
Alternatives were to wrap tests known to trigger debug print as child process tasks but this seems to be the far more complex path and introduces performance issues in testing.
Additional context
A one-off test version of Tapestry was packaged with this change and it did not negatively impact performance during normal operation and allowed the new positive_tests package (currently on test-enhance) to complete. The changes were reverted and are not currently committed.
Describe the bug
Access tests during a regular or --inc run of the backup build process will throw an error if write permission is not present on a target file. To my understanding, write permission is not required for the backup process, and perhaps this should be adjusted.
To Reproduce
Steps to reproduce the behavior:
Expected behavior
The file should be included if at all possible.
Log/Console Output
Error accessing /home/patches/roms/gbc/tetris.gbc: AccessTestResult(exist=True, read=True, write=False, execute=None). Verify it exists and you have permissions
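A sketch of the adjusted test, reusing the AccessTestResult shape shown in the log line above (the field semantics are assumed): qualify a file for backup on exist and read alone, recording but not requiring the write bit.

```python
import os
import tempfile
from collections import namedtuple

AccessTestResult = namedtuple("AccessTestResult", "exist read write execute")

def backup_access_ok(path):
    """Reading is all a backup needs: require exist and read, and only
    record the write bit that currently disqualifies read-only files."""
    result = AccessTestResult(
        exist=os.path.exists(path),
        read=os.access(path, os.R_OK),
        write=os.access(path, os.W_OK),
        execute=None,
    )
    return result.exist and result.read, result

# demo: a read-only file still qualifies for backup
fd, path = tempfile.mkstemp()
os.close(fd)
os.chmod(path, 0o400)
ok, result = backup_access_ok(path)
print(ok)  # True, regardless of the write bit
```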
Desktop (please complete the following information):
Additional context
This is one of several flaws exposed during recent production use of Tapestry, all alike. It's a minor fix, but ideally will be fixed in a release including all such flaws, or at least as many as require minor fixes.
Describe the bug
The first two steps of the backup construction process (the "sorting" step and the actual "packing" step) scale in time-to-complete strongly with the number of items included in the backup, with the overall file size playing a relatively small role in the time spent on these steps. The RIFF file size also scales virtually linearly with file count alone, with file size exerting only the smallest influence on the RIFF output file's size.
To Reproduce
Steps to reproduce the behavior:
Run an --inc or unflagged backup against a data set containing tens of thousands of files.
Expected behavior
As much as possible, the main constraint on Tapestry's run time should be the amount of data being backed up, rather than the individual number of small files. This is especially of concern for users who generate a very large number of relatively small files; such users include many FOSS advocates who doubtless have many locally cloned repos which may or may not be included in their backup paths.
Log/Console Output
Where possible, add the output of the Logs or Console when the problem occurs.
Program completes as normal, console output does not include runtime.
Desktop (please complete the following information):
Additional context
Add any other context about the problem here.
This became especially evident when working with a backup tree that included the qmk source code. The qmk source code contained well over 50,000 individual files, which collectively made up roughly 93% of all files included in the backup run. Simply tarring the directory containing the qmk source code and removing the unpacked version of the repo from the path of the backup reduced backup times from tens of hours to 24 minutes, without changing anything about Tapestry itself or the hardware it was run on. This changed the final output file size of the .tap files by only roughly the size of a compact disc; it did, however, reduce the RIFF file sizes by over 75%.
Is your feature request related to a problem? Please describe.
Current state of affairs is that the signature verification process is naive - if the key in question is included in the keyring and trusted, it will be used. This would allow issues where a key is compromised or improperly trusted and unintended keys can be used to sign Tapestry blocks.
Describe the solution you'd like
Currently envisioning adding a "strict signing" mode, passed in config. I'm looking at doing this in one of two ways:
Describe alternatives you've considered
Additional context
N/A