librsync-rs

Rust bindings to librsync.

API Documentation

Introduction

This library contains bindings to librsync, to support computing and applying network deltas, as used in the rsync and duplicity backup applications. It encapsulates the algorithms of the rsync protocol, which computes differences between files efficiently.

When computing differences, the rsync protocol does not require both files to be present. It needs only the new file and a set of checksums of the old file (the signature). The computed differences can be stored in a delta file. The rsync protocol can then reproduce the new file from the old one and the delta.
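To make the signature, delta and patch roles concrete, here is a toy sketch using only the standard library. It is not librsync's algorithm: it uses fixed 4-byte blocks, a single non-rolling hash from `DefaultHasher`, and no strong checksum, and every name in it (`toy_signature`, `toy_delta`, `toy_patch`, `Op`) is hypothetical. It only illustrates that the delta step sees the new file and the checksums, never the old file itself.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Block size of the toy scheme (an arbitrary choice for illustration).
const BLOCK: usize = 4;

/// One instruction of the toy delta.
#[derive(Debug, PartialEq)]
enum Op {
    /// Copy block number `n` from the old file.
    Copy(usize),
    /// Emit a byte that was not found in the old file.
    Literal(u8),
}

fn block_hash(block: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    block.hash(&mut hasher);
    hasher.finish()
}

/// Signature: checksum of every full block of the old file, mapped to its index.
fn toy_signature(old: &[u8]) -> HashMap<u64, usize> {
    old.chunks(BLOCK)
        .enumerate()
        .filter(|(_, chunk)| chunk.len() == BLOCK)
        .map(|(index, chunk)| (block_hash(chunk), index))
        .collect()
}

/// Delta: built from the new file and the signature only.
fn toy_delta(new: &[u8], sig: &HashMap<u64, usize>) -> Vec<Op> {
    let mut ops = Vec::new();
    let mut pos = 0;
    while pos < new.len() {
        let end = pos + BLOCK;
        if end <= new.len() {
            if let Some(&index) = sig.get(&block_hash(&new[pos..end])) {
                ops.push(Op::Copy(index));
                pos = end;
                continue;
            }
        }
        ops.push(Op::Literal(new[pos]));
        pos += 1;
    }
    ops
}

/// Patch: rebuilds the new file from the old file and the delta.
fn toy_patch(old: &[u8], ops: &[Op]) -> Vec<u8> {
    let mut out = Vec::new();
    for op in ops {
        match op {
            Op::Copy(index) => {
                let start = *index * BLOCK;
                out.extend_from_slice(&old[start..start + BLOCK]);
            }
            Op::Literal(byte) => out.push(*byte),
        }
    }
    out
}
```

Calling `toy_patch(old, &toy_delta(new, &toy_signature(old)))` should give back `new`. A real implementation additionally verifies matches with a strong hash, because a weak checksum alone can collide.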

Installation

Simply add a corresponding entry to your Cargo.toml dependency list:

[dependencies]
librsync = "0.2"

And add this to your crate root:

extern crate librsync;

Overview of types and modules

This crate provides streaming operations to produce signatures, deltas and patches in the top-level module, through the Signature, Delta and Patch structs. These structs take an input stream (implementing Read or Read + Seek) and themselves implement Read, so the output can be read from them.

Higher-level operations are provided in the whole submodule. If the application does not need fine-grained control over IO operations, the signature, delta and patch functions of that submodule can be used. These functions run the algorithms and write the result to an output stream (implementing the Write trait) in a single call.

Example: streams

This example shows how to go through the streaming APIs, starting from an input string and a modified string which act as old and new files. The example simulates a real world scenario, in which the signature of a base file is computed, used as input to compute differences between the base file and the new one, and finally the new file is reconstructed, by using the patch and the base file.

extern crate librsync;

use std::io::prelude::*;
use std::io::Cursor;
use librsync::{Delta, Patch, Signature};

fn main() {
    let base = "base file".as_bytes();
    let new = "modified base file".as_bytes();

    // create signature starting from base file
    let mut sig = Signature::new(base).unwrap();
    // create delta from new file and the base signature
    let delta = Delta::new(new, &mut sig).unwrap();
    // create and store the new file from the base one and the delta
    let mut patch = Patch::new(Cursor::new(base), delta).unwrap();
    let mut computed_new = Vec::new();
    patch.read_to_end(&mut computed_new).unwrap();

    // test whether the computed file is exactly the new file, as expected
    assert_eq!(computed_new, new);
}

Note that intermediate results are not stored in temporary containers. This is possible because the operations implement the Read trait. As a result, the intermediate results do not need to be held fully in memory during computation.
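The same pull-based idea can be sketched with the standard library alone. The `Xor` adapter below is hypothetical and has nothing to do with librsync's types. It only shows how readers that wrap other readers can be chained, so that each `read` call moves one small chunk through the whole pipeline.

```rust
use std::io::{self, Read};

/// Hypothetical adapter: XORs every byte pulled from `inner` with `key`.
struct Xor<R> {
    inner: R,
    key: u8,
}

impl<R: Read> Read for Xor<R> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        // Pull from the wrapped reader, then transform the bytes in place.
        let n = self.inner.read(buf)?;
        for byte in &mut buf[..n] {
            *byte ^= self.key;
        }
        Ok(n)
    }
}

/// Chains two adapters and drains them through a fixed 8-byte buffer.
/// Applying the same key twice restores the input.
fn roundtrip<R: Read>(input: R) -> io::Result<Vec<u8>> {
    let encoded = Xor { inner: input, key: 0x5a };
    let mut decoded = Xor { inner: encoded, key: 0x5a };

    let mut out = Vec::new();
    let mut chunk = [0u8; 8];
    loop {
        let n = decoded.read(&mut chunk)?;
        if n == 0 {
            break;
        }
        out.extend_from_slice(&chunk[..n]);
    }
    Ok(out)
}
```

Only the final `out` vector grows with the input. The encoded intermediate form never exists as a whole, which is the property the note above describes.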

Example: whole file API

This example shows how to go through the whole file APIs, starting from an input string and a modified string which act as old and new files. Unlike the streaming example, here a single function call produces the result of each signature, delta and patch operation. This is convenient when the output of an operation goes to an output stream, such as a network socket or a file.

extern crate librsync;

use std::io::Cursor;
use librsync::whole::*;

fn main() {
    let base = "base file".as_bytes();
    let new = "modified base file".as_bytes();

    // signature
    let mut sig = Vec::new();
    signature(&mut Cursor::new(base), &mut sig).unwrap();

    // delta
    let mut dlt = Vec::new();
    delta(&mut Cursor::new(new), &mut Cursor::new(sig), &mut dlt).unwrap();

    // patch
    let mut out = Vec::new();
    patch(&mut Cursor::new(base), &mut Cursor::new(dlt), &mut out).unwrap();

    assert_eq!(out, new);
}

License

Licensed under either of

at your option.

This library uses librsync, which comes with an LGPL-2.0 license. Please be sure to fulfill the librsync licensing requirements before using this library.

Contribution

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.

Contributors

bheesham, dependabot-preview[bot], genail, goffrie, lu-zero, mbrt, nlopes

librsync-rs's Issues

Undefined behavior in `test::send_patch` due to invalid read

I've been experimenting with a version of Miri that can execute foreign functions by interpreting the LLVM bitcode that is produced during a crate's build process. We're hoping our results can assist with the Krabcake project.

Miri found a violation of Tree Borrows in the following test case:

 test::send_patch ... error: Undefined Behavior: deallocation through <93226> is forbidden
   --> .../rust/library/alloc/src/alloc.rs:117:14
    |
117 |     unsafe { __rust_dealloc(ptr, layout.size(), layout.align()) }
    |              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ deallocation through <93226> is forbidden
    |
    = help: the accessed tag <93226> is a child of the conflicting tag <93157>
    = help: the conflicting tag <93157> has state Frozen which forbids this deallocation (acting as a child write access)
help: the accessed tag <93226> was created here
   --> src/lib.rs:573:31
    |
573 |           let t = thread::spawn(move || {
    |  _______________________________^
574 | |             let mut computed_new = String::new();
575 | |             patch.read_to_string(&mut computed_new).unwrap();
576 | |         });
    | |_________^
help: the conflicting tag <93157> was created here, in the initial state Reserved
   --> .../rust/library/std/src/panic.rs:141:55
    |
141 | pub fn catch_unwind<F: FnOnce() -> R + UnwindSafe, R>(f: F) -> Result<R> {
    |                                                       ^
help: the conflicting tag <93157> later transitioned to Frozen due to a reborrow (acting as a foreign read access) at offsets [0x0..0x10]
   --> src/lib.rs:452:9
    |
452 |         (*h).borrow_mut()
    |         ^^^^
    = help: this transition corresponds to a loss of write permissions
    = note: BACKTRACE (of the first span):
    = note: inside `std::alloc::dealloc` at .../rust/library/alloc/src/alloc.rs:117:14: 117:64
    = note: inside `<std::alloc::Global as std::alloc::Allocator>::deallocate` at .../rust/library/alloc/src/alloc.rs:254:22: 254:51
    = note: inside `<std::boxed::Box<std::rc::Rc<std::cell::RefCell<dyn ReadAndSeek>>> as std::ops::Drop>::drop` at .../rust/library/alloc/src/boxed.rs:1235:17: 1235:66
    = note: inside `std::ptr::drop_in_place::<std::boxed::Box<std::rc::Rc<std::cell::RefCell<dyn ReadAndSeek>>>> - shim(Some(std::boxed::Box<std::rc::Rc<std::cell::RefCell<dyn ReadAndSeek>>>))` at .../rust/library/core/src/ptr/mod.rs:497:1: 497:56
    = note: inside `std::ptr::drop_in_place::<Patch<'_, std::io::Cursor<&str>, std::io::BufReader<std::io::Cursor<std::vec::Vec<u8>>>>> - shim(Some(Patch<'_, std::io::Cursor<&str>, std::io::BufReader<std::io::Cursor<std::vec::Vec<u8>>>>))` at .../rust/library/core/src/ptr/mod.rs:497:1: 497:56
    = note: inside `std::ptr::drop_in_place::<{closure@src/lib.rs:573:31: 573:38}> - shim(Some({closure@src/lib.rs:573:31: 573:38}))` at .../rust/library/core/src/ptr/mod.rs:497:1: 497:56
note: inside closure
   --> src/lib.rs:576:9
    |
576 |         });

This appears to be caused by the following reborrow, which occurs on line 350 within the implementation of Patch::with_buf_read:

    let job = unsafe { raw::rs_patch_begin(patch_copy_cb, mem::transmute(&*cb_data)) };

When the pointer created by mem::transmute(&*cb_data) is read through in the callback patch_copy_cb, it invalidates the pointer associated with the Box within the instance of the Patch struct that is moved into the move closure on line 573. At the end of this closure, when Patch is dropped, the Box is deallocated using a pointer that is no longer valid due to this foreign read.

To fix this issue, you could use Box::into_raw to obtain a raw pointer to the underlying allocation. Then, you'd use this pointer as both the second argument to rs_patch_begin and the field raw of the struct Patch. Since this pointer would be "Reserved", it would allow foreign read accesses, which prevents this error from occurring.

I'd make a pull request with this change, but it would require an implementation of Drop for Patch so that you could call Box::from_raw. However, this also prevents the function Patch::into_inner from borrow checking, since you can't move out of a struct with a Drop implementation.
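One possible way around the `Drop` versus `into_inner` conflict is to wrap `self` in `ManuallyDrop` inside `into_inner` and release the allocation by hand. The sketch below uses a hypothetical `Holder` type with a `Vec<u8>` standing in for the callback data. It is not the crate's `Patch`, and whether this layout satisfies Tree Borrows in the real code would still have to be checked with Miri.

```rust
use std::mem::ManuallyDrop;
use std::ptr;

/// Hypothetical stand-in for a struct that hands a pointer to C code.
struct Holder<R> {
    /// Allocation obtained from `Box::into_raw`; owned by this struct.
    cb_data: *mut Vec<u8>,
    input: R,
}

impl<R> Holder<R> {
    fn new(input: R) -> Self {
        // The same raw pointer would be stored here and passed to the C side.
        let cb_data = Box::into_raw(Box::new(Vec::new()));
        Holder { cb_data, input }
    }

    fn record(&mut self, byte: u8) {
        // SAFETY: `cb_data` is valid until `drop` or `into_inner` frees it.
        unsafe { (*self.cb_data).push(byte) }
    }

    fn recorded(&self) -> usize {
        // SAFETY: same invariant as in `record`.
        unsafe { (*self.cb_data).len() }
    }

    /// Moves `input` out even though `Holder` implements `Drop`.
    fn into_inner(self) -> R {
        // Prevent `Drop::drop` from running, then clean up manually.
        let this = ManuallyDrop::new(self);
        unsafe {
            // SAFETY: the allocation is freed exactly once, here.
            drop(Box::from_raw(this.cb_data));
            // SAFETY: `this` is never used again, so `input` is moved out once.
            ptr::read(&this.input)
        }
    }
}

impl<R> Drop for Holder<R> {
    fn drop(&mut self) {
        // SAFETY: `cb_data` came from `Box::into_raw` and was not freed yet.
        unsafe { drop(Box::from_raw(self.cb_data)) }
    }
}
```

The cost of this approach is that every field has to be handled explicitly in `into_inner`, so adding a field later means revisiting that function.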

Remove logs from librsync

We can either:

  • set the log level to 0 with rs_trace_set_level();
  • or replace logs with no-ops;
  • or forward logs to log crate.

The only problem is the initialization. When can we initialize the logs? How can we perform static initialization in the crate before any API is called?
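One option for the initialization question is lazy, one-time setup with `std::sync::Once`, triggered from every public constructor. The sketch below is an assumption about how that could look: `configure_logging` is a stand-in for the real setup (for example a call to rs_trace_set_level), and `Job` is a hypothetical type, not the crate's.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Once;

static INIT: Once = Once::new();

/// Counts how often the setup ran; only here to make the behaviour observable.
static INIT_CALLS: AtomicUsize = AtomicUsize::new(0);

/// Stand-in for the real logging setup.
fn configure_logging() {
    INIT_CALLS.fetch_add(1, Ordering::SeqCst);
}

/// Runs the setup the first time it is called and does nothing afterwards.
fn ensure_logging_init() {
    INIT.call_once(configure_logging);
}

/// Hypothetical public type whose constructor is an entry point of the crate.
struct Job;

impl Job {
    fn new() -> Self {
        ensure_logging_init();
        Job
    }
}
```

`Once::call_once` is safe to call from several threads, so no constructor can observe a half-initialized logger. The price is one cheap check per constructor call.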

Add documentation

  • Document public types and functions
  • Enable missing_docs error
  • Add examples

Windows support?

Trying to compile the basic example from the docs on Windows fails in the linker. I am not sure if it's specific to my Windows configuration, or whether this crate doesn't have Windows support yet. The original librsync should build normally on Windows.

The crate compiles normally on Mac OS and generates correct delta files. On Linux (Ubuntu server) I noticed that the generated delta patch has a different file size. On Windows I cannot build it yet.

librsync/src/checksum.c(28): error C2491: 'RS_MD4_SUM_LENGTH': definition of dllimport data not allowed
librsync/src/checksum.c(29): error C2491: 'RS_BLAKE2_SUM_LENGTH': definition of dllimport data not allowed

Add whole file APIs

A very simple approach would be to implement those free functions in a module:

  • fn sig(f: Read, sig: Write, block_len: usize, strong_len: usize, sig_type: SigType) -> Result<()>;
  • fn delta(base_sig: Read, new: Read, delta: Write) -> Result<()>;
  • fn patch(base: Read, delta: Read, new: Write) -> Result<()>.
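If the streaming types already implement Read, each of these free functions could plausibly reduce to "build the stream, then drain it into the writer". The helper below sketches only the draining half with `std::io::copy`. The name `drain_into` is hypothetical, and any reader stands in for the streaming struct.

```rust
use std::io::{self, Read, Write};

/// Drains `stream` into `out` and returns the number of bytes written.
/// In a whole-file function, `stream` would be the streaming operation.
fn drain_into<R: Read, W: Write + ?Sized>(mut stream: R, out: &mut W) -> io::Result<u64> {
    io::copy(&mut stream, out)
}
```

Taking the inputs as generic parameters bounded by Read and Write, rather than as trait objects, keeps the functions usable with files, sockets and in-memory buffers alike.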

Generate correct config.h depending on arch

The build script needs to check the TARGET environment variable to generate the correct defines, instead of using a pre-generated config.h.

  • Generate config files in officially supported Rust distributions;
  • Make a rust binary that generates them automatically, by using cmake crate.
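A build script could combine both ideas: use a pre-generated config.h when one exists for the target, and fall back to generating it otherwise. The sketch below is a guess at that decision logic only. The `PREGENERATED` list and the `ConfigSource` type are hypothetical, and the `librsync/static/<target>` layout is an assumption modelled on the include paths visible in the build log quoted in another issue on this page. Cargo does set the TARGET variable for build scripts.

```rust
use std::env;
use std::path::PathBuf;

/// Targets assumed to ship a pre-generated config.h (hypothetical list).
const PREGENERATED: &[&str] = &["x86_64-unknown-linux-gnu", "x86_64-apple-darwin"];

/// Where the build script should take config.h from.
#[derive(Debug, PartialEq)]
enum ConfigSource {
    /// Add this directory to the include path.
    Pregenerated(PathBuf),
    /// No static file is available; generate one (for example via cmake).
    Generate,
}

fn config_source(target: &str) -> ConfigSource {
    if PREGENERATED.iter().any(|known| *known == target) {
        ConfigSource::Pregenerated(PathBuf::from("librsync/static").join(target))
    } else {
        ConfigSource::Generate
    }
}

/// Reads the target triple that Cargo passes to build scripts.
fn current_target() -> Option<String> {
    env::var("TARGET").ok()
}
```

In a build.rs, `current_target()` would feed `config_source`, and the `Generate` branch would be the place to report an unsupported target instead of failing later with a missing header.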

Use BufRead in Job

Instead of implementing buffering from scratch, reuse the std::io::BufReader implementation.
This needs a special constructor, with_bufread, on the streams, to avoid wrapping a BufRead in another BufRead.
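The constructor pair could look like the sketch below. `Stream` is a hypothetical stand-in for the crate's stream types, and the constructor is spelled `with_buf_read` here. It shows `new` wrapping a plain reader in a `BufReader`, while the second constructor accepts something that already implements BufRead and uses it directly.

```rust
use std::io::{self, BufRead, BufReader, Read};

/// Hypothetical stream that relies on its input for buffering.
struct Stream<B> {
    input: B,
}

impl<R: Read> Stream<BufReader<R>> {
    /// Wraps an unbuffered reader in a `BufReader`.
    fn new(input: R) -> Self {
        Stream::with_buf_read(BufReader::new(input))
    }
}

impl<B: BufRead> Stream<B> {
    /// Takes an already buffered reader as is, avoiding double buffering.
    fn with_buf_read(input: B) -> Self {
        Stream { input }
    }
}

impl<B: BufRead> Read for Stream<B> {
    fn read(&mut self, out: &mut [u8]) -> io::Result<usize> {
        // Borrow the reader's internal buffer instead of keeping our own.
        let available = self.input.fill_buf()?;
        let n = available.len().min(out.len());
        out[..n].copy_from_slice(&available[..n]);
        // Tell the reader how many bytes were actually used.
        self.input.consume(n);
        Ok(n)
    }
}
```

A `std::io::Cursor` over a `Vec<u8>` already implements BufRead, so it can go straight into `with_buf_read`, whereas a `File` would normally go through `new`.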

aarch64-apple-darwin build issue when using

Hi

I am having a problem when using the library on Apple silicon. I think it's more of a cargo problem than a library problem.

I can compile the library fine (after I remembered about submodules)

But when I include it in a project it fails to compile due to missing config.h (Static config?):

` cargo:warning=librsync/src/base64.c:22:10: fatal error: 'config.h' file not found
cargo:warning=#include "config.h"
cargo:warning= ^~~~~~~~~~
cargo:warning=1 error generated.

--- stderr

error occurred: Command env -u IPHONEOS_DEPLOYMENT_TARGET "cc" "-O0" "-ffunction-sections" "-fdata-sections" "-fPIC" "-gdwarf-2" "-fno-omit-frame-pointer" "--target=arm64-apple-darwin" "-mmacosx-version-min=14.4" "-I" "librsync/static/aarch64-apple-darwin" "-I" "librsync/static" "-I" "librsync/src" "-I" "librsync/src/blake2" "-Wall" "-Wextra" "-DSTDC_HEADERS=1" "-o" "/Users/dennis/Projects/dsync/target/aarch64-apple-darwin/debug/build/librsync-sys-e8f77933a44e3c30/out/3901d9deb342bdfa-base64.o" "-c" "librsync/src/base64.c" with args cc did not execute successfully (status code exit status: 1).`

I am new to cargo, so I am a bit lost as to where to look.

I can cross-compile to x86_64-apple-darwin on the machine, so the tools seem to be working. I am curious why the target is set to arm64-apple-darwin.
