stringsext

stringsext - search for multi-byte encoded strings in binary data.

Author: Jens Getreu
Copyright: Apache 2 license

stringsext is a Unicode enhancement of the GNU strings tool with additional functionality: it recognizes Cyrillic, CJKV characters and other scripts in all supported multi-byte encodings, whereas GNU strings fails to find any of these scripts in UTF-16 and many other encodings.

stringsext prints all graphic character sequences in FILE or stdin that are at least MIN bytes long.
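
The basic idea of printing sequences that are at least MIN bytes long can be illustrated for the plain ASCII case. The following is only a minimal sketch, not stringsext's actual implementation: the function name `printable_ascii_runs` is hypothetical, and treating the bytes 0x20 to 0x7E as "graphic" is an assumption made for this example.

```rust
/// Returns every run of printable ASCII bytes (0x20..=0x7E, an assumption
/// for this sketch) that is at least `min_len` bytes long.
fn printable_ascii_runs(data: &[u8], min_len: usize) -> Vec<String> {
    let mut found = Vec::new();
    let mut current = String::new();
    for &byte in data {
        if (0x20..=0x7e).contains(&byte) {
            // Still inside a run of printable bytes.
            current.push(byte as char);
        } else if current.len() >= min_len {
            // The run ended and is long enough: keep it.
            found.push(std::mem::take(&mut current));
        } else {
            // The run ended but is too short: discard it.
            current.clear();
        }
    }
    // A run may extend to the very end of the input.
    if current.len() >= min_len {
        found.push(current);
    }
    found
}
```

With this sketch, `printable_ascii_runs(b"\x00\x01hello\xffab\x00world!", 4)` returns the two strings `hello` and `world!`; the run `ab` is dropped because it is shorter than four bytes.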

Unlike GNU strings, stringsext can be configured to search for valid characters not only in ASCII but also in many other input encodings, e.g. UTF-8, UTF-16BE, UTF-16LE, BIG5-2003, EUC-JP and KOI8-R. The option --list-encodings shows the list of valid encoding names, which are based on the WHATWG Encoding Standard. When more than one encoding is specified, the scans run simultaneously in separate threads.
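
How one scan per encoding can run in its own thread is sketched below. This is an illustration under stated assumptions, not stringsext's code: the `Scanner` type, the two scanner functions, `scan_all_encodings` and the fixed minimum length of 4 are all hypothetical, and the UTF-16LE scanner only tries even byte offsets.

```rust
use std::thread;

/// Hypothetical minimum string length used by both scanners.
const MIN_LEN: usize = 4;

/// A hypothetical scanner: takes the input, returns the strings it found.
type Scanner = fn(&[u8]) -> Vec<String>;

/// Finds runs of printable ASCII bytes.
fn scan_ascii_runs(data: &[u8]) -> Vec<String> {
    data.split(|b| !(0x20..=0x7e).contains(b))
        .filter(|run| run.len() >= MIN_LEN)
        .map(|run| String::from_utf8_lossy(run).into_owned())
        .collect()
}

/// Interprets the input as UTF-16LE (even offsets only) and finds runs of
/// decodable, non-control characters.
fn scan_utf16le_runs(data: &[u8]) -> Vec<String> {
    let units: Vec<u16> = data
        .chunks_exact(2)
        .map(|pair| u16::from_le_bytes([pair[0], pair[1]]))
        .collect();
    String::from_utf16_lossy(&units)
        .split(|c: char| c.is_control() || c == char::REPLACEMENT_CHARACTER)
        .filter(|run| run.chars().count() >= MIN_LEN)
        .map(str::to_owned)
        .collect()
}

/// Runs every scanner over the same input, each in its own scoped thread,
/// and returns the findings labelled with the scanner's name.
fn scan_all_encodings(
    data: &[u8],
    scanners: &[(&'static str, Scanner)],
) -> Vec<(&'static str, Vec<String>)> {
    thread::scope(|s| {
        // Start all threads first so that the scans overlap in time.
        let handles: Vec<_> = scanners
            .iter()
            .map(|&(name, scan)| (name, s.spawn(move || scan(data))))
            .collect();
        // Then collect the results in the order the scanners were given.
        handles
            .into_iter()
            .map(|(name, handle)| (name, handle.join().expect("scanner thread panicked")))
            .collect()
    })
}
```

Scoped threads (`std::thread::scope`) are used here so that all scanners can borrow the same input buffer without copying it. Note that the UTF-16LE scanner also turns pairs of ASCII bytes into seemingly valid characters, which is one source of false positives.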

When searching for UTF-16 encoded strings, 96% of all possible two-byte sequences, interpreted as UTF-16 code units, map directly to a Unicode code point. As a result, the probability of encountering a valid Unicode character in a random byte stream interpreted as UTF-16 is also 96%. To reduce this large number of false positives, stringsext provides a parameterizable Unicode block filter. See the --encodings option in the manual page for details.
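
The figure above can be checked with a little arithmetic: of the 65,536 possible 16-bit code units, only the 2,048 surrogates (0xD800 to 0xDFFF) do not stand for a code point on their own, which leaves about 96.9%, in line with the roughly 96% quoted. The sketch below computes this ratio and shows the idea behind a Unicode block filter. The function names and the way the block is passed as a code point range are assumptions for illustration; they do not reflect the syntax of the --encodings option.

```rust
use std::ops::RangeInclusive;

/// Share of all 65,536 possible 16-bit code units that are not surrogates
/// (0xD800..=0xDFFF) and therefore map directly to a code point.
fn direct_code_unit_ratio() -> f64 {
    let total: u32 = 0x1_0000;
    let surrogates: u32 = 0xDFFF - 0xD800 + 1; // 2048 code units
    f64::from(total - surrogates) / f64::from(total)
}

/// Decodes `data` as UTF-16LE (starting at offset 0 only) and keeps runs of
/// at least `min_chars` characters whose code points all lie in `block`.
fn utf16le_strings_in_block(
    data: &[u8],
    block: RangeInclusive<u32>,
    min_chars: usize,
) -> Vec<String> {
    let units = data
        .chunks_exact(2)
        .map(|pair| u16::from_le_bytes([pair[0], pair[1]]));
    let mut found = Vec::new();
    let mut current = String::new();
    for decoded in std::char::decode_utf16(units) {
        match decoded {
            // Character decodes and belongs to the wanted block: extend the run.
            Ok(c) if block.contains(&u32::from(c)) => current.push(c),
            // Anything else (other block, unpaired surrogate) ends the run.
            _ => {
                if current.chars().count() >= min_chars {
                    found.push(std::mem::take(&mut current));
                } else {
                    current.clear();
                }
            }
        }
    }
    if current.chars().count() >= min_chars {
        found.push(current);
    }
    found
}
```

For example, restricting the scan to the Cyrillic block with `utf16le_strings_in_block(&data, 0x0400..=0x04FF, 3)` discards every character outside U+0400 to U+04FF, so random byte pairs that happen to decode to other scripts no longer produce output.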

stringsext is mainly useful for determining the Unicode content of non-text files.

When invoked with stringsext -e ascii -c i, stringsext can be used as a replacement for GNU strings.

Documentation

User documentation
manual page
Developer documentation

Source code

Repository
stringsext on GitHub

Distribution

Binaries
Download stringsext binaries and verify hashes.
Manual page
stringsext.1.gz

Building and installing

  1. Install Rust with rustup:

    curl https://sh.rustup.rs -sSf | sh
    
  2. Download stringsext:

    git clone git@github.com:getreu/stringsext.git
    
  3. Build

    Enter the stringsext source directory, where the file Cargo.toml resides. Then execute:

    cargo build --release
    ./make-doc
    
  4. Install

    1. Linux:

      # install binary
      sudo cp target/release/stringsext /usr/local/bin/
      
      # install man-page
      sudo cp man/stringsext.1.gz /usr/local/man/man1/
      sudo dpkg-reconfigure man-db   # e.g. Debian, Ubuntu
    2. Windows:

      Copy the binary target/release/stringsext.exe into a directory listed in your PATH environment variable.
