tapestry's People

Contributors

zadammac



Forkers

ericapomme

tapestry's Issues

Borked Operations Break a Run

If any error causes a worker process to halt, the program will stop at the nearest tasks.join() call.

Proposed solution: better error handling to be implemented in 1.1.
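A minimal sketch of what that 1.1 error handling could look like, assuming a queue-fed worker model; the names here are illustrative, not Tapestry's actual internals:

```python
def safe_worker(task_queue, result_queue):
    """Drain tasks; trap per-task exceptions so one bad task cannot
    kill the worker and leave the parent blocked on a join() call."""
    while True:
        task = task_queue.get()
        if task is None:        # sentinel: clean shutdown
            break
        try:
            result_queue.put(("ok", task()))
        except Exception as exc:
            # Report the failure as data instead of dying
            result_queue.put(("error", repr(exc)))
```

With this shape, the parent can drain the result queue, report any `("error", …)` entries, and still join the workers cleanly.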

Signatures Not Being Created in v1.1.0

Describe the bug
During backup operations, the process completes without any signing taking place. The signing key lived on an attached YubiKey.

...
Signing Enqueued

The processing has completed. Your .tap files are here:
....

To Reproduce
Steps to reproduce the behavior:

  1. Run the program normally.

Expected behavior
Each .tap file should be signed as part of the backup run.

Log/Console Output

...
Signing Enqueued

The processing has completed. Your .tap files are here:
....

Desktop (please complete the following information):

  • OS: Ubuntu 18.04

Additional context
I believe the issue lies in the signing logic itself, in gpg's handling of the smart card, or possibly in a directory-misdirection problem.

Tapestry incorrectly handles SFTP negotiation if the user's known_hosts file contains multiple entries for the same hostname.

Describe the bug
In a situation where ~/.ssh/known_hosts contains multiple keys for the same hostname (for example, if a host the user has access to runs sshd on multiple ports via containerization), Tapestry will raise an SSHException indicating that a mismatched host key was found, and the SFTP transfer will fail. In this condition the files are retained locally as expected, and an alert is raised to the user that this has occurred.

To Reproduce
Reproduction is preconditioned on having the setup described above. It could be reproduced as simply as creating a dummy SFTP host listening on 2222/tcp on a host which already has its own SSH on 22/tcp as normal.

Steps to reproduce the behavior:

  1. Run tapestry in network mode, with the network configuration pointing to the "secondary" port.
  2. The error should be raised after SFTP negotiation.

Expected behavior
This should work; checking known_hosts should take all host keys stored for a given hostname into consideration.
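The expected behavior above can be expressed as a predicate. This is a hedged sketch using plain `(hostname, key)` tuples rather than paramiko key objects, purely to illustrate "match any stored key" semantics:

```python
def host_key_matches(known_entries, hostname, presented_key):
    """Accept the presented key if it matches ANY stored entry for
    the hostname, rather than failing on the first mismatch.
    `known_entries` is a hypothetical list of (hostname, key) pairs
    parsed from ~/.ssh/known_hosts; real code would compare
    paramiko key objects."""
    candidates = [key for host, key in known_entries if host == hostname]
    return presented_key in candidates
```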

Log/Console Output

Desktop (please complete the following information):

  • OS: Ubuntu 20.04
  • Python Interpreter: 3.8
  • Version: 2.1.1 (Current Release)

Additional context
This is probably a very unusual arrangement so I wouldn't be surprised if I am the only person to encounter this.

[RFC] Add Native Support for Amazon S3 Glacier

Is your feature request related to a problem? Please describe.
Tapestry purports to create a storage-security agnostic backup, but offers limited (and frankly low-availability) options for cloud or remote storage of Tapestry's .tap backup files. This remote-to-the-user storage is a key part of a true recoverable backup solution and the fact it is missing is a potential risk exposure to users who are storing backup files purely locally.

Describe the solution you'd like
In addition to the existing SFTP functionality, Tapestry should have a new network mode, 's3', which leverages the relatively low-cost Amazon S3 Glacier storage mechanism and allows for both automated backup and retrieval from a specified S3 bucket the user or organization controls. This would be provided alongside user-friendly tutorials on the setup and maintenance of S3, and would allow us to leverage AWS's assurances regarding uptime and availability.

Describe alternatives you've considered
Alternatives looked at were to create a guide on using various cloud platform vendors to allow the use of SFTP, which is already supported by Tapestry as a storage and retrieval mechanism.

Additional context
This would be the first major feature release since SFTP support earlier in 2020.

Build Process will crash if file stops existing at any point

Describe the bug
A long running backup will crash mid-process if a file vanishes at any point. This bug was originally encountered during the AccessTest function, which nominally should be checking if a file exists, so we have a double bug.

The exception is OSError, so we really just need to add handling for it.
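Handling could be as simple as wrapping the size check; a sketch, with a hypothetical `skiplog` list standing in for Tapestry's skip-logging:

```python
import os

def safe_getsize(path, skiplog):
    """Return the file's size, or None if it vanished mid-run; the
    hypothetical `skiplog` list collects paths to report later."""
    try:
        return os.path.getsize(path)
    except OSError:
        skiplog.append(path)
        return None
```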

To Reproduce
Steps to reproduce the behavior:

  1. Include a directory that receives frequent changes, e.g. ~/.config/ on an in-use system.
  2. Run the script in default or --inc mode.

Expected behavior
An error should be logged but the actual backup process should continue.

Log/Console Output
-stack trace abbreviated-
OSError: [Errno 6] No such device or address: '/home/patches/.config/discord/SS'

Desktop (please complete the following information):

  • OS: Ubuntu 20.04.5 LTS
  • Version: 2.1.1

Additional context
Exposed during the same ordinary usage as #34 and likely a similarly minor fix. Should be fixed in the same minor release.

Sorting the files to be archived - this could take a few minutes (Gross Understatement)

Describe the bug
[This was thought solved, then reinvented in 2.2.1]

This step takes a very long time because of the need to generate a hash for each individual file. We need to put some thought into either fixing the look and feel or fixing the performance.
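One performance-side sketch: stream each file in large chunks through a fast hash. BLAKE2b and the 1 MiB read size are suggestions here, not what Tapestry currently uses:

```python
import hashlib

def digest_file(path, chunk_size=1 << 20):
    """Stream the file through BLAKE2b in 1 MiB chunks. Larger read
    chunks and a faster hash function are two cheap wins for the
    sorting step; the algorithm choice is illustrative only."""
    hasher = hashlib.blake2b()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()
```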


[BUG] Race Condition in TarUnpack when a category path is missing.

Describe the bug
In some conditions, if two child processes concurrently detect that the parent directory for a category does not exist, both will attempt to create it and one will error out. This causes a halt, as the errored-out process never completes.

To Reproduce
Steps to reproduce the behavior:

  1. From a known-good backup, create a tapestry.cfg that is missing a category.
  2. Run --rcv
  3. Error may or may not reproduce.

Expected behavior
This error should be handled without being exposed to the user and in such a way as to not interrupt the flow of the program.
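A sketch of the race-safe fix, using os.makedirs with exist_ok=True so whichever child loses the race simply proceeds:

```python
import os

def ensure_category_dir(path):
    """Create the category's parent directory race-safely: with
    exist_ok=True, a concurrent sibling creating the same directory
    first is not an error, so neither child process halts."""
    os.makedirs(path, exist_ok=True)
    return path
```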


Desktop (please complete the following information):

  • OS: Linux.


Tapestry will crash if encountering a broken symlink during Tarbuild

Describe the bug
In investigating recent silent failures of an automated implementation of Tapestry, it was discovered that broken symbolic links cause an unhandled exception and crash Tapestry in place. The exception is FileNotFoundError and raised by one of the subordinate calls in os.walk.

To Reproduce
Steps to reproduce the behavior:

  1. Create a broken symbolic link in a test directory that will be consumed by tapestry.
  2. Run tapestry.

Expected behavior
The exception should be captured and the broken file reported in the skiplogger if at all possible.
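A sketch of that expected behavior, with a hypothetical `skiplog` list standing in for the skiplogger:

```python
import os

def list_backup_targets(root, skiplog):
    """Walk `root`, recording files whose target no longer resolves
    (e.g. broken symlinks) in `skiplog` instead of letting a later
    os.path.getsize() raise FileNotFoundError."""
    found = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            # A broken symlink exists lexically but does not resolve
            if not os.path.exists(path):
                skiplog.append(path)
                continue
            found.append(path)
    return found
```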

Log/Console Output

patches@Jupiter:~$ tapcontroller
/usr/lib/python3.6/runpy.py:125: RuntimeWarning: 'tapestry.__main__' found in sys.modules after import of package 'tapestry', but prior to execution of 'tapestry.__main__'; this may result in unpredictable behaviour
  warn(RuntimeWarning(msg))
Welcome to Tapestry Backup Tool Version 2.0.2
Automatic updates are not currently a function of this script.
Please refer to the repo for updates: https://www.github.com/ZAdamMac/Tapestry
Gathering a list of files to archive - this could take a few minutes.
Traceback (most recent call last):
  File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/patches/.local/lib/python3.6/site-packages/tapestry/__main__.py", line 1133, in <module>
    do_main(state, gpg_conn)
  File "/home/patches/.local/lib/python3.6/site-packages/tapestry/__main__.py", line 295, in do_main
    ops_list = build_ops_list(namespace)
  File "/home/patches/.local/lib/python3.6/site-packages/tapestry/__main__.py", line 71, in build_ops_list
    size = os.path.getsize(absolute_path)
  File "/usr/lib/python3.6/genericpath.py", line 50, in getsize
    return os.stat(filename).st_size
FileNotFoundError: [Errno 2] No such file or directory: '/home/patches/Documents/Website/config/mods-enabled/dir.conf'

Desktop (please complete the following information):

  • OS: Ubuntu 18.04.01 LTS
  • Tapestry Version: 2.0.2

Additional context
This can probably be made part of the logging improvements project and handled there.

Fix the cryptography problem.

Signing isn't working either - it's a silent failure, but failing HOW?

The how: The subprocesses that do sample generation seem to hold the GPG system hostage. Short of instantiating a second install for the testing script (which is absurd, and would break several tests), it makes more sense to split the testing process into a suite of scripts, run successively.

[BUG] Generate_Keys does not update the namespace, only the config.

Describe the bug
When generating a new key, tapestry will thereafter exhibit one of two behaviours:

  • If no older key exists in the keyring, an error will be raised when the key cannot be found. This error is handled but causes the script to terminate.
  • If an older key exists, that older key is used instead.

This is because generate_keys is correctly updating the configuration, but it's called after parse_config() and therefore needs to be patched to update the NAMESPACE as well.

To Reproduce
Steps to reproduce the behavior:

  1. Run tapestry with the --genKey flag.

Expected behavior
A new key should be generated and written to configuration, and that key should be used throughout the rest of the backup process.
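The shape of the proposed fix, with illustrative names (`activeFP`, the config layout, and the fingerprint value are all stand-ins, not Tapestry's actual identifiers):

```python
def generate_keys(namespace, config):
    """After writing the new fingerprint into the config, also refresh
    the already-parsed namespace, since parse_config() has run before
    this point and will not run again."""
    new_fp = "F00DF00D"  # stand-in for the fingerprint gpg returns
    config["Environment Variables"] = {"Expected FP": new_fp}
    namespace.activeFP = new_fp  # the step the bug report says is missing
    return namespace
```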


Desktop (please complete the following information):

  • OS: Only observed on Windows to date, but probably present on both platforms.

Additional context

The bug was reported by the redoubtable Kat Adam-MacEwen.

Inappropriate handling of absent config file

Describe the bug
Tapestry does not currently correctly handle instances where it is invoked without a config file available.

To Reproduce
Steps to reproduce the behavior:

  1. Navigate to a directory with no local copy of tapestry.cfg
  2. Run tapestry without specifying a config file, i.e. `python3 -m tapestry`
  3. See output below

Expected behavior
Tapestry should simply exit after raising the issue rather than attempting to proceed; the attempt to proceed is what causes the error listed below.
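A sketch of both halves of the fix, with illustrative section, option, and default values; the real template's contents are not shown in this report:

```python
import configparser

def place_config_template(path):
    """Write a template config, coercing every default to str before
    configparser.set() -- avoiding the TypeError in the trace below.
    The caller should then exit so the user can edit the template."""
    defaults = {"Environment Variables": {"blocksize": 4096,
                                          "compression": True}}
    config = configparser.ConfigParser()
    for section, options in defaults.items():
        config.add_section(section)
        for option, value in options.items():
            config.set(section, option, str(value))  # str() is the fix
    with open(path, "w") as handle:
        config.write(handle)
```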

Log/Console Output

patches@sevastopol:~/Downloads$ python3 -m tapestry
The indicated config file: tapestry.cfg cannot be found.
Generating a template config file in that location.
Please edit this config file appropriately and rerun the program.
Traceback (most recent call last):
  File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/patches/.local/lib/python3.8/site-packages/tapestry/__main__.py", line 13, in <module>
    tapestry.runtime()
  File "/home/patches/.local/lib/python3.8/site-packages/tapestry/functions.py", line 1192, in runtime
    state = parse_config(state)
  File "/home/patches/.local/lib/python3.8/site-packages/tapestry/functions.py", line 858, in parse_config
    place_config_template(ns.config_path)
  File "/home/patches/.local/lib/python3.8/site-packages/tapestry/functions.py", line 985, in place_config_template
    config.set(section, option, value)
  File "/usr/lib/python3.8/configparser.py", line 1200, in set
    self._validate_value_types(option=option, value=value)
  File "/usr/lib/python3.8/configparser.py", line 1185, in _validate_value_types
    raise TypeError("option values must be strings")
TypeError: option values must be strings

Desktop (please complete the following information):

  • OS: Ubuntu 20.04 with Python 3.8.2
  • Version: 2.1.0


Enhance Performance of unix_block_build; investigate porting to the Windows version as well.

Is your feature request related to a problem? Please describe.
When packaging a large number of files (~58,965), the block build phase is unacceptably under-performant if the block size is also sufficiently large; e.g., in the instance above only 6 blocks were created. This is likely due to the delay in acquiring the lock, as parallelism in the unix block build is limited by the number of block output files being created.

The specific example cited above took roughly 12 hours, if not a bit more, to complete just the building portion. The attendant level-2 compression took ~8 minutes.

Describe the solution you'd like
Time to handle should be reduced significantly if at all possible.

Describe alternatives you've considered
Backgrounding the process would reduce the annoyance to the user, but also decrease the accuracy of the snapshot, since files could be written/modified during the build process if it takes a considerably long time.

Since the limitations here are disk i/o, which we can do nothing about, and the block writes, which we can also do nothing about, we should explore the idea of an optional flag that would allow Tapestry to break a file up into smaller chunks provided those chunks satisfy the following properties:

  • They are permitted by the end user, and;
  • They are even factors of the block size limit.

Alternatively, perhaps many smaller blocks could be briefly created to reduce the packaging load, then composited into right-size blocks?

We also need to investigate what is causing the actual delay here. If it really is locking we can solve it, but if it's just constrained by disk i/o there is nothing to be done.
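The "even factors of the block size limit" idea above can be sketched as a pure planning function; the factor of 4 is an arbitrary illustration:

```python
def chunk_plan(file_size, block_limit, factor=4):
    """Split one file into chunks that are even factors of the block
    size limit, so the chunks pack cleanly into blocks. Returns the
    planned chunk sizes, with any remainder as a final short chunk."""
    chunk = block_limit // factor
    full, remainder = divmod(file_size, chunk)
    return [chunk] * full + ([remainder] if remainder else [])
```

Smaller chunks would let more workers write concurrently, at the cost of a recombination step on recovery.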

Additional context
Unlike #34, #35, and #36, this is likely not a light lift; a fix is not required as a bug fix and may be considerably more involved.

Critical: 2.2.0 Cannot Start - Bad Naming

Describe the bug
The 2.2.0 version is completely invalid, as a method is called on a name that is never instantiated. How the hell did this pass into release?
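The shape of the fix is trivial once the missing line is named; a sketch, with SHA-256 assumed for illustration:

```python
import hashlib

def hash_chunks(chunks):
    """`hasher` must be instantiated before the loop that calls
    hasher.update(); per the trace below, 2.2.0 shipped without
    that definition."""
    hasher = hashlib.sha256()  # the missing definition
    for chunk in chunks:
        hasher.update(chunk)
    return hasher.hexdigest()
```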

To Reproduce
Steps to reproduce the behavior:

  1. Install 2.2.0.
  2. Run it in any build mode.

Expected behavior
The application should run.

Log/Console Output

Welcome to Tapestry Backup Tool Version 2.2.0
Automatic updates are not currently a function of this script.
Please refer to the repo for updates: https://www.github.com/ZAdamMac/Tapestry
Gathering a list of files to archive - this could take a few minutes.
/usr/lib/python3/dist-packages/requests/__init__.py:89: RequestsDependencyWarning: urllib3 (1.26.12) or chardet (3.0.4) doesn't match a supported version!
  warnings.warn("urllib3 ({}) or chardet ({}) doesn't match a supported "
Traceback (most recent call last):
  File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/patches/.local/lib/python3.8/site-packages/tapestry/__main__.py", line 13, in <module>
    tapestry.runtime()
  File "/home/patches/.local/lib/python3.8/site-packages/tapestry/functions.py", line 1262, in runtime
    do_main(state, gpg_conn)
  File "/home/patches/.local/lib/python3.8/site-packages/tapestry/functions.py", line 334, in do_main
    ops_list = build_ops_list(namespace)
  File "/home/patches/.local/lib/python3.8/site-packages/tapestry/functions.py", line 106, in build_ops_list
    hasher.update(chunk)
NameError: name 'hasher' is not defined

Desktop (please complete the following information):

  • OS: Ubuntu 20.04.5 LTS
  • Version: 2.2.0


Tapestry Will Crash on Encountering a File It Cannot Copy

Describe the bug
Tapestry will crash on encountering a file it cannot read. Normally this shouldn't happen, but it can when the backup sweeps in (e.g.) system directories or other tools' working files.

To Reproduce
Steps to reproduce the behavior:

  1. Add a file to a directory and remove your own read permissions from it.
  2. Run tapestry in the default or --inc modes.

Expected behavior
It would be better to skip and log such files, possibly with a displayed warning indicating that the script encountered files it cannot read.
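A sketch of the skip-and-log behavior, with a hypothetical `skiplog` list:

```python
def read_if_permitted(path, skiplog):
    """Open a file for hashing only if we actually can; otherwise log
    the path and move on instead of crashing the whole run."""
    try:
        with open(path, "rb") as contents:
            return contents.read()
    except OSError:  # PermissionError is a subclass of OSError
        skiplog.append(path)
        return None
```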

Log/Console Output

patches@Chimera-Coruscant:~$ python3 -m tapestry --inc -c tapestry.cfg
Welcome to Tapestry Backup Tool Version 2.1.1
Automatic updates are not currently a function of this script.
Please refer to the repo for updates: https://www.github.com/ZAdamMac/Tapestry
Gathering a list of files to archive - this could take a few minutes.
Traceback (most recent call last):
  File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/patches/.local/lib/python3.8/site-packages/tapestry/__main__.py", line 13, in <module>
    tapestry.runtime()
  File "/home/patches/.local/lib/python3.8/site-packages/tapestry/functions.py", line 1211, in runtime
    do_main(state, gpg_conn)
  File "/home/patches/.local/lib/python3.8/site-packages/tapestry/functions.py", line 328, in do_main
    ops_list = build_ops_list(namespace)
  File "/home/patches/.local/lib/python3.8/site-packages/tapestry/functions.py", line 102, in build_ops_list
    with open(absolute_path, "rb") as contents:
PermissionError: [Errno 13] Permission denied: '/home/patches/server/piminder/test/db/ibtmp1'

Desktop (please complete the following information):

  • OS: Ubuntu 20.04
  • Version: 2.1.1


[Bug] Unexpected Behaviour With Missing Keys

Expected Behaviour:

None, issue was previously not considered.

Actual Behaviour:

If the expected key is absent (has been deleted from the GPG keyring), Tapestry will proceed as normal through the tar-building, compression, and encryption phases. Because the key is absent, encryption will "complete" instantly, and then signing will fail. The program nevertheless believes it has done its job correctly. Since encryption is the phase which moves the tarball files to the output directory, and the program deletes its temporary working directory at the end, the net effect is to waste the time of the whole run with no actual result.

Hash of Commit In Use, or Release Version Number:

Release 1.0.0 and later.

Undesirable Behaviour in `functions.status_print`

Describe the bug
status_print currently contains a bug where it rounds up the completion percentage. While this may initially be desirable for very lengthy operations, to reach 1% as soon as possible, it causes an issue where the percentage can hit 100% well before the round is actually complete.

When this happens the console is spammed repeatedly with DONE! messages even though the operation is not done, each on its own new line.

To Reproduce
Steps to reproduce the behavior:

  1. Perform a backup against a sufficiently large number of files (say, 300 items).
  2. Observe.

Expected behavior
The 100% - Done message should only be shown on the final operation in the queue.
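A sketch of the corrected rendering: floor the percentage instead of rounding it, and gate the Done! suffix on the literal final item. The function shape is illustrative, not `functions.status_print`'s actual signature:

```python
def render_progress(done, total, width=15):
    """Render one progress line. Integer floor division means 299/300
    shows 99%, never an early 100%; Done! appears only at the end."""
    percent = (done * 100) // total
    filled = (done * width) // total
    bar = "#" * filled + " " * (width - filled)
    suffix = " - Done!" if done == total else ""
    return "Packing: [%s] %d%%%s" % (bar, percent, suffix)
```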

Log/Console Output
Packing: [###############] 100% - Done!
Packing: [###############] 100% - Done!
Packing: [###############] 100% - Done!
Packing: [###############] 100% - Done!
Packing: [###############] 100% - Done!
Packing: [###############] 100% - Done!
Packing: [###############] 100% - Done!
Packing: [###############] 100% - Done!
Packing: [###############] 100% - Done!
Packing: [###############] 100% - Done!

Desktop (please complete the following information):

  • OS: Ubuntu 22.04.1 LTS
  • Tapestry Version: 2.1.1 and later.

Additional context
This is probably a minor fix along with the other issues described here, which came from working with production data. Its own minor release is not required if all of the open tickets can be closed reasonably close together.

[Test Enhancement] Remove redundant `if __name__ == '__main__':` statement from parse_config

Is your feature request related to a problem? Please describe.
parse_config() is called inside a runtime function that is already wrapped in an `if __name__ == '__main__':` statement, yet it contains such a statement itself. The check prevents most of the function from executing when the module is not running as main, for example in the positive_tests.py package currently part of test-enhance.

Describe the solution you'd like
Eliminate the if-name-main check from the function and shift everything behind it one tab to the left.

Describe alternatives you've considered
It might be possible to use some kind of override in the test itself, but this is the more elegant solution.

Additional context
This was tested in an uncommitted dev build of Tapestry and did not immediately appear to have any negative effects. It is required for positive_tests.test_parse_config() to complete successfully.

[Test Enhancements] Debug_Print needs a fail-safe.

Is your feature request related to a problem? Please describe.
Tapestry's debug_print() function relies upon a global variable, state, which I believe is the master NS reference. This works as intended during runtime, but causes failures during testing under the test-enhance model.

Describe the solution you'd like
Add exception handling to catch the relevant NameError when debug_print() is called outside of its usual context, which will cause the function to simply pass.
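A sketch of the fail-safe described above:

```python
def debug_print(message):
    """If the global `state` namespace is absent (as it is under the
    test harness), swallow the NameError and simply pass."""
    try:
        if state.debug:  # noqa: F821 -- `state` is set at runtime
            print(message)
    except NameError:
        pass
```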

Describe alternatives you've considered
Alternatives were to wrap tests known to trigger debug print as child process tasks but this seems to be the far more complex path and introduces performance issues in testing.

Additional context
A one-off test version of Tapestry was packaged with this change and it did not negatively impact performance during normal operation and allowed the new positive_tests package (currently on test-enhance) to complete. The changes were reverted and are not currently committed.

Possible Misconfiguration of the Access Tests on Build

Describe the bug
Access tests during a regular or --inc run of the backup build process will throw an error if write permission is not present on a target file. To my understanding, write permission is not required for the backup process, and perhaps this should be adjusted.

To Reproduce
Steps to reproduce the behavior:

  1. Create a file in an acceptable target directory without write permissions to that file.
  2. Run Tapestry, with a config that would attempt to back up the target directory.
  3. Observe error in console output or logs; backup will continue unabated but the unwriteable file will be excluded from the backup.

Expected behavior
The file should be included if at all possible.
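A sketch of the adjusted access test: a backup only reads source files, so require existence and read permission, not write permission:

```python
import os

def access_ok_for_backup(path):
    """Return True if the path exists and is readable; write access is
    deliberately not checked, since backups only read their sources."""
    return os.path.exists(path) and os.access(path, os.R_OK)
```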

Log/Console Output
Error accessing /home/patches/roms/gbc/tetris.gbc: AccessTestResult(exist=True, read=True, write=False, execute=None). Verify it exists and you have permissions

Desktop (please complete the following information):

  • OS: Ubuntu 20.04.5 LTS
  • Version: 2.1.1

Additional context
This is one of several similar flaws exposed during recent production use of Tapestry. It's a minor fix, but ideally it will be shipped in a release covering all such flaws, or at least as many as only require minor fixes.

[PERF] Time-to-pack Increases Proportional to File Count Rather than Backup Size

Describe the bug
The first two steps of the backup construction process (the "sorting" step and the actual "packing" step) scale in time-to-complete strongly with the number of items included in the backup, with the overall file size playing a relatively small role. The RIFF file size also scales virtually linearly with file count alone, file size having only the smallest influence on the RIFF output file's size.

To Reproduce
Steps to reproduce the behavior:

  1. Run an --inc or unflagged backup against a data set containing tens of thousands of files.

Expected behavior
As much as possible, the main constraint on Tapestry's run time should be the amount of data being backed up, rather than the number of individual small files. This is especially of concern for users who generate a very large number of relatively small files, including many FOSS advocates who doubtless have many locally cloned repos which may or may not be included in their backup paths.

Log/Console Output

The program completes as normal; console output does not include the runtime.

Desktop (please complete the following information):

  • OS: Ubuntu 23.04.1
  • Python 3.10.12
  • Version 2.2.2

Additional context

This became especially evident when working with a backup tree that included the qmk source code, which contained well over 50,000 individual files, collectively making up roughly 93% of all files included in the backup run. Simply tarring the directory containing the qmk source and removing the unpacked copy of the repo from the backup path reduced backup times from tens of hours to 24 minutes, without changing anything about Tapestry itself or the hardware it was run on. This changed the final output size of the .tap files by only roughly the size of a compact disc; it did, however, reduce the RIFF file sizes by over 75%.

[RFC] Strengthen Signature Verification By Adding a Strict Mode

Is your feature request related to a problem? Please describe.
The current signature verification process is naive: if the key in question is included in the keyring and trusted, it will be used. This allows issues where a compromised or improperly trusted key, or another unintended key, can be used to sign Tapestry blocks.

Describe the solution you'd like
Currently envisioning adding a "strict signing" mode, passed in config. I'm looking at doing this in one of two ways:

Describe alternatives you've considered

  1. Implement signing certificate pinning - specify a specific certificate which does the signing, and if any other key is used, the program should reject the block. This is useful in single-user contexts but less useful in a distributed sense, such as enterprise.
  2. Dedicated keyring - requires additional persistence on the drive and has security implications I'm uncertain of.
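Option 1 can be sketched as a verification predicate. The `valid`/`fingerprint` attribute shape mirrors python-gnupg's verification object, but treat that as an assumption of this sketch:

```python
def signature_acceptable(verify_result, pinned_fingerprint=None):
    """Strict-mode check: beyond basic validity, require the signing
    key's fingerprint to match the pinned value when a pin is set.
    With no pin configured, fall back to the legacy behaviour."""
    if not verify_result.valid:
        return False
    if pinned_fingerprint is None:  # strict mode off: legacy behaviour
        return True
    return verify_result.fingerprint == pinned_fingerprint
```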

Additional context
N/A
