codepage-strings: encode / decode strings for Windows code pages

Bart Massey 2021 (version 1.0.2)

This Rust crate builds on the excellent work of the encoding_rs, codepage, and oem-cp crates in an attempt to provide idiomatic encoding and decoding of strings coded according to Windows code pages.

Because Windows code pages are a legacy rathole, it is difficult to transcode strings using them. Sadly, there are still a lot of files out there that use these encodings. This crate was specifically created for use with RIFF, a file format that has code pages baked in for text internationalization.

No effort has been made to deal with Windows code pages beyond those supported by codepage and oem-cp. If the single-byte codepage you need is missing, I suggest taking a look at adding it to oem-cp, which seems to be the main Rust repository for unusual Windows code page tables. I believe that most of the single-byte code pages supported by iconv are dealt with here, but I haven't checked carefully.
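To illustrate what a single-byte code page table amounts to, here is a minimal sketch using only the standard library. It is not how this crate or oem-cp is implemented. The function names are hypothetical, the table is deliberately partial (only the two code page 869 entries that appear in this README's example, plus an assumed ASCII pass-through for bytes below 128), and every other byte is treated as undefined.

```rust
/// Hypothetical, deliberately partial decode table for code page 869.
/// Only the entries shown in this README's example are filled in.
fn decode_byte_cp869_partial(b: u8) -> Option<char> {
    match b {
        // Assumption: the low half of the code page is plain ASCII.
        0x00..=0x7f => Some(b as char),
        214 => Some('α'),
        215 => Some('β'),
        // Everything else is treated as undefined in this sketch.
        _ => None,
    }
}

/// Strict decoding: fails as soon as one byte has no mapping.
fn decode_strict(bytes: &[u8]) -> Option<String> {
    bytes.iter().map(|&b| decode_byte_cp869_partial(b)).collect()
}

/// Lossy decoding: unmapped bytes become U+FFFD REPLACEMENT CHARACTER.
fn decode_lossy(bytes: &[u8]) -> String {
    bytes
        .iter()
        .map(|&b| decode_byte_cp869_partial(b).unwrap_or('\u{fffd}'))
        .collect()
}
```

A real table covers all 256 byte values and also needs the reverse mapping for encoding, which is why a missing code page is best added to oem-cp rather than handled ad hoc.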

Other than UTF-16LE and UTF-16BE, multibyte Windows code pages are not currently supported; in particular, the code pages used for various Asian languages are missing. Code page 65001 (UTF-8) is supported as an identity transformation. UTF-32LE and UTF-32BE are not supported. EBCDIC code pages and UTF-7 are not supported and are low priority, because seriously?
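For a sense of what the UTF-16 and UTF-8 cases involve, the following standard-library sketch decodes UTF-16 bytes in either byte order (Windows code pages 1200 and 1201) and treats UTF-8 input as an identity transformation that only needs validation. The type and function names are hypothetical and are not part of this crate's API.

```rust
/// Byte order of the UTF-16 input (hypothetical helper type).
#[derive(Clone, Copy)]
enum Utf16Order {
    Little,
    Big,
}

/// Decode UTF-16 bytes. Returns None if the length is odd or the
/// code units contain an unpaired surrogate.
fn decode_utf16(bytes: &[u8], order: Utf16Order) -> Option<String> {
    if bytes.len() % 2 != 0 {
        return None;
    }
    let units: Vec<u16> = bytes
        .chunks_exact(2)
        .map(|pair| match order {
            Utf16Order::Little => u16::from_le_bytes([pair[0], pair[1]]),
            Utf16Order::Big => u16::from_be_bytes([pair[0], pair[1]]),
        })
        .collect();
    String::from_utf16(&units).ok()
}

/// UTF-8 input is already in Rust's string encoding, so "decoding"
/// is just validation and the result borrows from the input.
fn decode_utf8(bytes: &[u8]) -> Option<&str> {
    std::str::from_utf8(bytes).ok()
}
```

Here "αβ" is `[0xb1, 0x03, 0xb2, 0x03]` in UTF-16LE, `[0x03, 0xb1, 0x03, 0xb2]` in UTF-16BE, and `[0xce, 0xb1, 0xce, 0xb2]` in UTF-8.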

No particular effort has been put into performance. The interface allows std::borrow::Cow to some extent, but this is limited by the minor impedance mismatches between encoding_rs and oem-cp.
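As a rough illustration of why `std::borrow::Cow` is useful in a decoding interface, the sketch below borrows the input when no conversion is needed and allocates only otherwise. It uses ISO 8859-1 (Latin-1) as a stand-in because each of its bytes maps directly to the Unicode code point of the same value; the function name is hypothetical and this is not this crate's implementation.

```rust
use std::borrow::Cow;

/// Decode Latin-1 bytes, borrowing when the input is pure ASCII.
fn decode_latin1(bytes: &[u8]) -> Cow<'_, str> {
    if bytes.is_ascii() {
        // ASCII bytes are already valid UTF-8, so no copy is needed.
        Cow::Borrowed(std::str::from_utf8(bytes).expect("ASCII is valid UTF-8"))
    } else {
        // Bytes 0x80..=0xff become two-byte UTF-8 sequences, so a new
        // String has to be allocated.
        Cow::Owned(bytes.iter().map(|&b| b as char).collect())
    }
}
```

Callers that only read the result can use it as a `&str` in either case; the allocation cost is paid only for inputs that actually need transcoding.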

Examples

Do some string conversions on Windows code page 869 (alternate Greek).

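use codepage_strings::{Coding, ConvertError};

// The `?` operator assumes an enclosing function that returns a Result.
let coding = Coding::new(869)?;
// Code page 869 maps α and β to bytes 214 and 215.
assert_eq!(
    coding.encode("αβ")?,
    vec![214, 215],
);
assert_eq!(
    coding.decode(&[214, 215])?,
    "αβ",
);
// Byte 147 has no mapping, so lossy decoding substitutes U+FFFD.
assert_eq!(
    coding.decode_lossy(&[214, 147]),
    "α\u{fffd}",
);
// Strict decoding of the same bytes reports an error instead.
assert_eq!(
    coding.decode(&[214, 147]),
    Err(ConvertError::StringDecoding),
);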

This crate is made available under the "MIT license". Please see the file LICENSE in this distribution for license terms.

Thanks to the cargo-readme crate for generation of this README.

codepage-strings's Issues

Add support for missing 16-bit codepages

I haven't inventoried it, but I suspect some important 16-bit codepages are missing. It would be great to add them either here or to oem-cp so that these formats can be dealt with.

Finish integration with upstream oem-cp

Right now this crate is based on a private fork of the rust-oem-cp crate. PRs to upstream are in progress. Once they have landed, this crate should be updated to depend on upstream.
