ocaml-seqbox's Introduction

ocaml-SeqBox

No longer maintained, use rsbx instead

This project has been superseded by rsbx and is no longer actively developed or maintained, due to a lack of active users (I no longer use this program either).

Implementation of SeqBox in OCaml

The original phrasing was "Port of SeqBox to OCaml". However, no direct porting or translation was actually done: the differences between Python 3 (the implementation language of SeqBox) and OCaml forced the architecture and design of this software (ocaml-SeqBox) to be developed independently. This is therefore an implementation of SeqBox according to its technical specification, not a port or translation. The distinction mainly addresses the different licenses being used (SeqBox was using AGPL 3.0 at the time of writing, while this project uses the 3-Clause BSD license).

Official SeqBox Repo - https://github.com/MarcoPon/SeqBox

Acknowledgement

I would like to thank Marco (author of the official SeqBox) for discussing and clarifying several aspects of his project, and providing me with test data.

I would also like to thank Ming for his feedback on the documentation, UX design, and several other general aspects of this project, as well as for his help testing the build and installation of osbx on macOS.

Getting started

Recordings

You can view the recordings here

Installation

Osbx 1.2.4 is available through OPAM.

opam install osbx

osbx currently has four modes/commands: encode, decode, rescue, show

You can consult the man pages of osbx itself and of all four commands via

osbx        --help
osbx encode --help
osbx decode --help
osbx rescue --help
osbx show   --help

Notes

Version 1.2.4 is considered to be feature complete and mature enough for production use, but any bug reports or suggestions are very welcome - just open an issue!

Contributions are welcome as well, but note that by submitting a contribution, you agree that your code will be licensed under the 3-Clause BSD license.

No major active development will occur, but since 1.2.4 was designed to be quite scriptable via the options, you can always write helper scripts for advanced features. You can check out the ones I am writing here.

CRC-CCITT is currently implemented in pure OCaml and is translated from the implementation in libcrc

  • See src/crcccitt.ml, src/crcccitt.mli for the OCaml implementation
    • The translated source code is under the same MIT license used by and stated in libcrc source code
  • See libcrc_crcccitt/crcccitt.c, libcrc_crcccitt/checksum.h for the source code from libcrc used for the translation
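
For readers unfamiliar with the checksum, the following is a minimal bitwise sketch of a CRC-16 using the CCITT polynomial 0x1021, processed most significant bit first. It is not the code in src/crcccitt.ml; the function name `crc16_ccitt` and its `~init` parameter are hypothetical names chosen for illustration. The technical specification below states that the version byte is used as the starting value, which is what `~init` stands for here.

```ocaml
(* Bitwise CRC-16 with the CCITT polynomial 0x1021, MSB first.
   [init] is the starting value of the CRC register.
   Illustrative sketch only, not the project's implementation. *)
let crc16_ccitt ~(init : int) (data : bytes) : int =
  let crc = ref (init land 0xFFFF) in
  Bytes.iter
    (fun c ->
      (* Fold the next byte into the high half of the register. *)
      crc := !crc lxor (Char.code c lsl 8);
      for _bit = 1 to 8 do
        if !crc land 0x8000 <> 0 then
          crc := ((!crc lsl 1) lxor 0x1021) land 0xFFFF
        else
          crc := (!crc lsl 1) land 0xFFFF
      done)
    data;
  !crc
```

A table-driven version computes the same values faster; the bitwise form is shown only because it is short enough to check by hand.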

Exact behaviours in non-standard cases are not specified in the official SeqBox technical specification

  • See the specification of ocaml-SeqBox for details on how ocaml-SeqBox behaves in those cases (if you care about undefined behaviour and that sort of thing)

Hashing libraries

  • Nocrypto for SHA1, SHA256, SHA512
  • Digestif for BLAKE2B_512

Tips

See wiki page

Gotchas

See wiki page

Helpers

See helpers

Links

Wiki

Index of source code

Changelog

Todo/wishlist

Possibly useful additional features of ocaml-SeqBox (possibly not yet in official SeqBox)

  • Allows blocks to appear in any order within an sbx container
    • This also means block corruption will not stop the decoding process
  • Allows duplicate metadata/data blocks to exist within one sbx container
    • This means you can concatenate multiple copies of an sbx container together directly to increase the chance of recovery in case of corruption
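
To illustrate why ordering and duplicates need not matter, here is a minimal sketch of order-independent reassembly. It assumes that data block number n (counting from 1, with block 0 holding the metadata) carries the bytes starting at offset (n - 1) * (block size - 16) of the original file; this follows from the block layout in the technical specification below, but it is not taken from the osbx source. The names `output_offset` and `reassemble` are hypothetical.

```ocaml
(* Size of the common block header, per the technical specification. *)
let header_size = 16

(* Offset in the output file for data block [seq_num] (seq_num >= 1).
   Assumption: every data block carries block_size - 16 payload bytes. *)
let output_offset ~block_size ~seq_num =
  (seq_num - 1) * (block_size - header_size)

(* Place each (sequence number, payload) pair at its computed offset.
   Arrival order is irrelevant and a duplicate simply overwrites the
   same region. [file_size] would come from the FSZ metadata field and
   is used to drop the trailing padding of the last block. *)
let reassemble ~block_size ~file_size (blocks : (int * bytes) list) : bytes =
  let out = Bytes.make file_size '\000' in
  List.iter
    (fun (seq_num, payload) ->
      if seq_num >= 1 then begin
        let off = output_offset ~block_size ~seq_num in
        let n = min (Bytes.length payload) (file_size - off) in
        if n > 0 then Bytes.blit payload 0 out off n
      end)
    blocks;
  out
```

Because each block is self-locating, a corrupted block only leaves a hole at its own offset, and a second copy of the container can fill that hole.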

Technical Specification

The following specification is copied directly from the official specification (with possible slight modifications).

Also see section "Features currently NOT planned to be implemented" for features ocaml-SeqBox is probably not going to have.

Byte order: Big Endian

Common blocks header:

| pos | to pos | size | desc |
|-----|--------|------|------|
| 0   | 2      | 3    | Recoverable Block signature = 'SBx' |
| 3   | 3      | 1    | Version byte |
| 4   | 5      | 2    | CRC-16-CCITT of the rest of the block (Version is used as starting value) |
| 6   | 11     | 6    | file UID |
| 12  | 15     | 4    | Block sequence number |
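
The header layout above can be read with a few lines of OCaml. The following is a minimal sketch rather than the code used by osbx; the record type `header` and the helpers `be_uint` and `parse_header` are hypothetical names introduced for illustration. It only extracts the fields and does not verify the CRC.

```ocaml
(* Fields of the 16-byte common block header. *)
type header = {
  version : int;     (* position 3 *)
  crc : int;         (* positions 4-5, big endian *)
  file_uid : bytes;  (* positions 6-11 *)
  seq_num : int;     (* positions 12-15, big endian *)
}

(* Read [len] bytes starting at [pos] as a big-endian unsigned integer.
   Assumes the result fits in a native int (true for len <= 4 on
   64-bit platforms). *)
let be_uint (b : bytes) (pos : int) (len : int) : int =
  let rec go acc i =
    if i = len then acc
    else go ((acc lsl 8) lor Char.code (Bytes.get b (pos + i))) (i + 1)
  in
  go 0 0

(* Return the parsed header, or None if the block is too short or does
   not start with the 'SBx' signature. *)
let parse_header (block : bytes) : header option =
  if Bytes.length block < 16 then None
  else if Bytes.sub_string block 0 3 <> "SBx" then None
  else
    Some
      {
        version = be_uint block 3 1;
        crc = be_uint block 4 2;
        file_uid = Bytes.sub block 6 6;
        seq_num = be_uint block 12 4;
      }
```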

Block 0

| pos | to pos   | size | desc |
|-----|----------|------|------|
| 16  | n        | var  | encoded metadata |
| n+1 | blockend | var  | padding (0x1a) |

Blocks > 0 & < last:

| pos | to pos   | size | desc |
|-----|----------|------|------|
| 16  | blockend | var  | data |

Blocks == last:

| pos | to pos   | size | desc |
|-----|----------|------|------|
| 16  | n        | var  | data |
| n+1 | blockend | var  | padding (0x1a) |
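
Both block 0 and the last block fill their unused space with 0x1a bytes. A minimal sketch of that padding step follows, assuming the payload area is the block size minus the 16-byte header; `pad_payload` is a hypothetical helper name, not a function from osbx.

```ocaml
(* Size of the common block header, per the layout above. *)
let header_size = 16

(* Extend [data] with 0x1a bytes so that it fills the payload area of
   one block. Raises Invalid_argument if [data] does not fit. *)
let pad_payload ~(block_size : int) (data : bytes) : bytes =
  let capacity = block_size - header_size in
  let len = Bytes.length data in
  if len > capacity then invalid_arg "pad_payload: data too long";
  let out = Bytes.make capacity '\x1a' in
  Bytes.blit data 0 out 0 len;
  out
```

Since 0x1a can also occur in real data, a decoder cannot strip padding by inspection alone and has to rely on the stored file size (the FSZ metadata field) to know where the data ends.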

Versions:

N.B. Current versions differ only by blocksize.

| ver | blocksize | note    |
|-----|-----------|---------|
| 1   | 512       | default |
| 2   | 128       |         |
| 3   | 4096      |         |
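
The version table translates directly into a lookup. The sketch below uses hypothetical function names (`block_size_of_version`, `payload_size_of_version`) and assumes the payload of a block is whatever remains after the 16-byte common header.

```ocaml
(* Block size in bytes for each version listed in the table above. *)
let block_size_of_version (ver : int) : int option =
  match ver with
  | 1 -> Some 512
  | 2 -> Some 128
  | 3 -> Some 4096
  | _ -> None

(* Bytes available for data (or metadata) in one block: the block size
   minus the 16-byte common header. *)
let payload_size_of_version (ver : int) : int option =
  match block_size_of_version ver with
  | Some size -> Some (size - 16)
  | None -> None
```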

Metadata encoding:

| Bytes | Field |
|-------|-------|
| 3     | ID    |
| 1     | Len   |
| n     | Data  |

IDs

| ID  | Desc |
|-----|------|
| FNM | filename (utf-8) |
| SNM | sbx filename (utf-8) |
| FSZ | filesize (8 bytes) |
| FDT | date & time (8 bytes, seconds since epoch) |
| SDT | sbx date & time (8 bytes) |
| HSH | crypto hash (using Multihash protocol) |
| PID | parent UID (not used at the moment) |
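
The ID/Len/Data layout above can be walked with a simple loop. The following is a minimal sketch, not the parser used by osbx: `parse_metadata` is a hypothetical name, and stopping at the first 0x1a byte found where an ID would begin is an assumption made here for illustration (the specification of ocaml-SeqBox is the place to check the actual behaviour).

```ocaml
(* Split the metadata area of block 0 (everything after the 16-byte
   header) into (ID, data) pairs. Parsing stops at the padding, at the
   end of the payload, or at an entry whose declared length would run
   past the end. *)
let parse_metadata (payload : bytes) : (string * bytes) list =
  let total = Bytes.length payload in
  let rec go pos acc =
    if pos + 4 > total || Bytes.get payload pos = '\x1a' then List.rev acc
    else
      let id = Bytes.sub_string payload pos 3 in
      let len = Char.code (Bytes.get payload (pos + 3)) in
      if pos + 4 + len > total then List.rev acc
      else go (pos + 4 + len) ((id, Bytes.sub payload (pos + 4) len) :: acc)
  in
  go 0 []
```

Because Len is a single byte, each field carries at most 255 bytes of data under this layout.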

Supported crypto hashes since 1.1.0 are

  • SHA1
  • SHA256
  • SHA512
  • BLAKE2B_512

Features currently NOT planned to be implemented

  • Data hiding (XOR encoding/decoding in official seqbox)
    • Provides neither sufficiently strong encryption nor sufficient stealth for any serious attempt to hide/secure data
    • You should use the appropriate tools for encryption

License

The following files taken directly from libcrc are under the MIT License (see the license text in each of the code files)

  • libcrc_crcccitt/crcccitt.c
  • libcrc_crcccitt/checksum.h

The following files, translated from libcrc source code, are under the same MIT License as used by libcrc and as stated in the libcrc source code. The license text of crcccitt.c is copied over to src/crcccitt.ml as well

  • src/crcccitt.ml
  • src/crcccitt.mli

The files in the tests folder copied from official SeqBox are under its license, which is MIT as of the time of writing

  • tests/SeqBox/*

All remaining files are distributed under the 3-Clause BSD license as stated in the LICENSE file

ocaml-seqbox's Issues

Add repair command

Use this command for fixing sbx containers, rather than trying to shove everything into the decode module

This module should handle

  • Reordering of sbx blocks
  • Replacing blocks that failed CRC checks with ones derived via Reed-Solomon, if possible

Switch to actor model

The Stream_file interface is no longer capable of managing the complexity

Switch to Lwt

Things to work on

  • Blocking queue, implement using Lwt's mailbox, OCaml's queue, and a mutex lock
----------------------------------------------------
| MailBox | -> Queue (lock protected) -> | MailBox |
----------------------------------------------------
  • Concurrent progress text printing using Lwt
  • Adhere to zero-copy model for more things maybe

Add Reed-Solomon support

Work is currently done in reed-solomon branch.

Users will be able to enable Reed-Solomon using versions (decimal) 11, 12, 13.

Version 11 corresponds to version 1 but with Reed-Solomon enabled, and so on.
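
The proposed numbering can be sketched as a small lookup. This reflects only the scheme described in this issue, whose design was not final; `split_version` is a hypothetical helper name.

```ocaml
(* Map a version number to (base version, Reed-Solomon enabled).
   Versions 1-3 are the plain versions from the specification;
   11-13 are the proposed Reed-Solomon variants of 1-3. *)
let split_version (ver : int) : (int * bool) option =
  match ver with
  | 1 | 2 | 3 -> Some (ver, false)
  | 11 | 12 | 13 -> Some (ver - 10, true)
  | _ -> None
```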

These versions will not be compatible with official SeqBox (as of the time of writing) or with osbx versions <= 1.2.4.

Design details will be updated in this issue and finalised details will be stored in SPECS.md
