
rust-ascii's Introduction

ascii

A library that provides ASCII-only string and character types, equivalent to the char, str and String types in the standard library.

Types and conversion traits are described in the Documentation.

You can include this crate in your cargo project by adding it to the dependencies section in Cargo.toml:

[dependencies]
ascii = "1.1"

Using ascii without libstd

Most of AsciiChar and AsciiStr can be used without std by disabling the default features. The owned string type AsciiString and the conversion trait IntoAsciiString as well as all methods referring to these types can be re-enabled by enabling the alloc feature.

Methods referring to CStr and CString are also unavailable. The Error trait also only exists in std, but description() is made available as an inherent method for ToAsciiCharError and AsAsciiStrError in #![no_std]-mode.

To use the ascii crate in #![no_std] mode in your cargo project, just add the following dependency declaration in Cargo.toml:

[dependencies]
ascii = { version = "1.1", default-features = false, features = ["alloc"] }

Minimum supported Rust version

The minimum Rust version for 1.2.* releases is 1.56.1. Later 1.y.0 releases might require newer Rust versions, but the three most recent stable releases at the time of publishing will always be supported.
For example, this means that if the current stable Rust version is 1.70 when ascii 1.3.0 is released, then ascii 1.3.* will not require a newer Rust version than 1.68.

History

This package included the Ascii types that were removed from the Rust standard library by the 2014-12 reform of the std::ascii module. The API changed significantly since then.

License

Licensed under either of

  • Apache License, Version 2.0
  • MIT license

at your option.

Contribution

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.

rust-ascii's People

Contributors

alexcrichton, aochagavia, armavica, aturon, bcmyers, brendanzab, brson, burtonageo, chris-morgan, chromatic, clubby789, ebiggers, emberian, erickt, huonw, kimundi, lilyball, nham, nrc, patricknorton, pcwalton, pfalabella, reedlepee123, richo, simonsapin, sunshowers, tbu-, thestinger, tormol, zenithsiz

rust-ascii's Issues

`no_std` feature is backwards: a `use_std` feature should be used, which is part of the default features

Cargo features are supposed to be additive, which means the current no_std feature is backwards. This is an issue if you're writing a library, because there is currently no way to pass a negative switch (e.g. if I have a use_std flag, there is no way to say "enable the no_std flag if use_std is not enabled").

The current system should be switched, so that there is a use_std feature, which is enabled in the default features.

Mark AsciiStr `#[repr(transparent)]` ?

I'm unsure if even the non-mut AsciiStr <-> [AsciiChar] slice transmutes like this one here are sound, since you generally can't assume that the outer type is compatible with the inner type without using #[repr(transparent)]:

rust-ascii/src/ascii_str.rs

Lines 350 to 353 in 296c3a8

fn from(slice: &[AsciiChar]) -> &AsciiStr {
    let ptr = slice as *const [AsciiChar] as *const AsciiStr;
    unsafe { &*ptr }
}
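
To illustrate the concern, here is a minimal sketch of the same cast with the layout guarantee spelled out. `Ch` and `ChStr` are hypothetical stand-ins, not the crate's actual definitions; the point is only that `#[repr(transparent)]` on the wrapper is what makes the pointer cast justifiable.

```rust
// Hypothetical stand-ins for the crate's character and string-slice types.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[repr(transparent)]
pub struct Ch(pub u8);

// `#[repr(transparent)]` guarantees that `ChStr` has the same layout as its
// single field `[Ch]`. Without it, the layout of the wrapper is unspecified.
#[repr(transparent)]
pub struct ChStr {
    slice: [Ch],
}

impl ChStr {
    pub fn from_slice(slice: &[Ch]) -> &ChStr {
        let ptr = slice as *const [Ch] as *const ChStr;
        // SAFETY: `ChStr` is a transparent wrapper around `[Ch]`, so the
        // pointee layout and the slice-length metadata are identical.
        unsafe { &*ptr }
    }

    pub fn len(&self) -> usize {
        self.slice.len()
    }
}
```

With these definitions, `ChStr::from_slice(&[Ch(b'a'), Ch(b'b')]).len()` evaluates to 2.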

Convert to and from AsciiString and Box<AsciiStr>

A String is usually preferred to a Box<str>. However, there is one good reason to prefer the Boxed variant to the growable String, which is that String requires more memory. See https://users.rust-lang.org/t/use-case-for-box-str-and-string/8295/4

Similarly, Box<AsciiStr> uses less memory than AsciiString, so it would be helpful if there were ways to convert an AsciiString to a Box<AsciiStr>, just as it is possible to do so for String and Box<str>.

On my machine, Box<AsciiStr> uses 16 bytes compared to 24 bytes for AsciiString.
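
The same size difference can be observed with the standard library types that this request takes as its model. The sketch below uses only `String` and `Box<str>`; the helper names `handle_sizes` and `round_trip` are made up for illustration, and the exact sizes depend on the target's pointer width.

```rust
use std::mem::size_of;

/// Returns (size of `Box<str>`, size of `String`) in bytes.
/// A `Box<str>` stores pointer + length; a `String` additionally stores capacity.
pub fn handle_sizes() -> (usize, usize) {
    (size_of::<Box<str>>(), size_of::<String>())
}

/// Converts a `String` to `Box<str>` and back, the pair of conversions
/// requested here for `AsciiString` and `Box<AsciiStr>`.
pub fn round_trip(s: String) -> String {
    let boxed: Box<str> = s.into_boxed_str(); // drops any spare capacity
    boxed.into_string()
}
```

On a typical 64-bit target `handle_sizes()` returns `(16, 24)`, which matches the numbers reported above.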

Some feature requests

Thanks for the library. It'd be really awesome if we could add:

  • from_byte(u8) -> Ascii <- this one is really necessary as there's no way to implement this without exposed Ascii constructor.
  • trim(&AsciiStr) -> &AsciiStr just like String.trim.

If you like the idea I can add these myself and send a pull request.
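
A rough sketch of what the two requested functions could look like follows. `AsciiByte` is a hypothetical stand-in for the crate's character type, and treating `u8::is_ascii_whitespace` as the definition of whitespace is an assumption; `str::trim` uses a broader Unicode definition.

```rust
// Hypothetical ASCII character wrapper; the field is private so that the
// only way to build one is through the checked constructor.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct AsciiByte(u8);

impl AsciiByte {
    /// Checked constructor: returns `None` for bytes above 0x7F.
    pub fn from_byte(b: u8) -> Option<AsciiByte> {
        if b.is_ascii() { Some(AsciiByte(b)) } else { None }
    }

    pub fn as_byte(self) -> u8 {
        self.0
    }
}

/// Strips leading and trailing ASCII whitespace without copying.
pub fn trim(mut s: &[AsciiByte]) -> &[AsciiByte] {
    // Drop whitespace from the front.
    while let [first, rest @ ..] = s {
        if !first.0.is_ascii_whitespace() { break; }
        s = rest;
    }
    // Drop whitespace from the back.
    while let [rest @ .., last] = s {
        if !last.0.is_ascii_whitespace() { break; }
        s = rest;
    }
    s
}
```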

Most functions could be const

See title. I'm not an expert on what subset of const functions are stabilized yet, but I think a lot could already be const fn (think AsciiChar::as_byte).
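
As a small illustration of the idea, an accessor that only casts an enum to its discriminant can be a `const fn` on stable Rust. The three-variant `Letter` enum below is a hypothetical stand-in for AsciiChar, not the crate's definition.

```rust
// Hypothetical miniature of an ASCII character enum.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[repr(u8)]
pub enum Letter {
    A = b'A',
    B = b'B',
    C = b'C',
}

impl Letter {
    /// Casting a fieldless enum to its integer representation is allowed
    /// in const context, so this can be evaluated at compile time.
    pub const fn as_byte(self) -> u8 {
        self as u8
    }
}

// Evaluated by the compiler, not at run time.
pub const B_BYTE: u8 = Letter::B.as_byte();
```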

What is the intended behaviour of AsciiStr.lines()?

I assume "foo\nbar" producing ["foo"] is a bug.
However, the lines_iter() test asserts that AsciiStr::from("\n").lines() produces nothing, which is not what I expected, and which also differs from str.lines() (which produces a single empty slice).

str.lines() handles "\r\n" and trailing newline, so could the (currently rather complex) implementation be replaced with a simple forwarding implementation?

EDIT on 2018-09-11: fixed typo "foo/nbar" and corrected .split() to .lines().
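
For reference, the behaviour of str.lines() that this issue compares against can be checked with the standard library alone. The helper name `std_lines` is made up for this sketch.

```rust
/// Collects the result of `str::lines` so the cases discussed in this
/// issue can be compared side by side.
pub fn std_lines(s: &str) -> Vec<&str> {
    s.lines().collect()
}
```

With this helper, `std_lines("\n")` yields a single empty slice, `std_lines("foo\nbar")` yields `["foo", "bar"]`, and `std_lines("foo\r\nbar\n")` also yields `["foo", "bar"]`, since "\r\n" and a trailing newline are both handled.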

Implement quickcheck::Arbitrary for AsciiString and AsciiChar

See http://burntsushi.net/rustdoc/quickcheck/trait.Arbitrary.html for a description.

Quickcheck is a randomized testing tool that lets you check general properties. To make your type part of the Quickcheck ecosystem, you need to implement Arbitrary, which involves two methods:

(1) generate a random input
(2) shrink a failing input

This should be relatively easy. See http://burntsushi.net/rustdoc/src/quickcheck/arbitrary.rs.html#391 for how it is done for String and char -- it would probably be straightforward to port that logic to AsciiString and AsciiChar.

This can be behind a feature gate to make sure that if you aren't using quickcheck otherwise, you don't have to pull it in.

(I might do this or get someone else from Facebook to do it -- filing it to keep track :) )

Complete the set of traits that string types implement

Going by the documentation of std::string::String, the following traits could/should be implemented for ASCII strings:

  • FromIterator
  • Extend
  • Pattern (tracked in #53)
  • Default
  • Add
  • Index (both AsciiString and AsciiStr)
  • Write (PR #33)
  • More inherent methods on AsciiString

Benchmark

Dear all,

Is there any evidence that this crate is faster than Rust's std String (for ASCII)?
Has anybody run a benchmark or something similar?

Generally, I think the README should have a section explaining why somebody should use this crate for ASCII strings.

Publish next version to crates.io?

Hi. I want to use AsciiStr::split(), which appears in the documentation. However, I realized that split is not in the latest version (0.9.1) of the ascii crate published to crates.io.

Could you please publish the next version to crates.io if possible?

&str and &[u8] should implement IntoAsciiString

Just like &str implements Into<String>.

This would involve copying the string, but would allow writing APIs as:

fn foo<S: IntoAsciiString>(s: S) {
    let s = s.into_ascii_string();
    ....
}

etc, and have those work seamlessly with string and bytestring literals.
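
A self-contained sketch of the shape such impls could take is shown below. `IntoOwnedAscii`, `OwnedAscii` and `NotAscii` are hypothetical names standing in for IntoAsciiString, AsciiString and the crate's error type; the fallible return type is an assumption, since the issue does not say how non-ASCII input should be reported.

```rust
/// Hypothetical error returned when the input contains a non-ASCII byte.
#[derive(Debug, PartialEq, Eq)]
pub struct NotAscii;

/// Hypothetical stand-in for an owned ASCII string.
#[derive(Debug, PartialEq, Eq)]
pub struct OwnedAscii(Vec<u8>);

/// Hypothetical stand-in for the conversion trait.
pub trait IntoOwnedAscii {
    fn into_owned_ascii(self) -> Result<OwnedAscii, NotAscii>;
}

impl<'a> IntoOwnedAscii for &'a [u8] {
    fn into_owned_ascii(self) -> Result<OwnedAscii, NotAscii> {
        // Validate first, then copy the bytes into the owned buffer.
        if self.is_ascii() {
            Ok(OwnedAscii(self.to_vec()))
        } else {
            Err(NotAscii)
        }
    }
}

impl<'a> IntoOwnedAscii for &'a str {
    fn into_owned_ascii(self) -> Result<OwnedAscii, NotAscii> {
        self.as_bytes().into_owned_ascii()
    }
}

/// Generic API in the style of the `foo` example above; returns the length
/// of the converted string.
pub fn foo<S: IntoOwnedAscii>(s: S) -> Result<usize, NotAscii> {
    let s = s.into_owned_ascii()?;
    Ok(s.0.len())
}
```

Note that a byte-string literal such as `b"hi"` has type `&[u8; 2]`, so in this sketch it has to be passed as `&b"hi"[..]` unless an impl for array references is added as well.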

Add AsciiCStr and AsciiCString types

I'm working with a C API which requires me to use NUL-terminated, no-interior-NUL, ascii strings. Since the standard library contains CStr and CString types it might be nice if this library contained the ascii equivalents of those as well.

AsciiChar::SOX should be AsciiChar::STX

It seems that this crate's "Start of Text" member (ASCII '\x02') is mislabeled as SOX, whereas everything else on the web (including the linked-to Wikipedia page) references it as STX.

Since this would be a breaking change, I'm not sure how that would fit within the versioning, but I figured I'd report it anyway since I haven't seen any comments for this on the Issues/PRs for this repo.

Implement `std::ascii::AsciiExt` and `std::ascii::OwnedAsciiExt` for `Ascii`

Motivation

Preliminary
While str guarantees statically that data of its type is valid UTF-8, the type Ascii guarantees ASCII-conformance. Therefore the types [Ascii] and str and their owned counterparts Vec<Ascii> and String should behave similarly.

Topic of this Issue
Ascii provides functions like to_uppercase() and to_lowercase() which can be applied to single ASCII characters. Currently such operations are not implemented on owned or borrowed strings of ASCII characters. As the types Vec<Ascii> and [Ascii] should be opaque, manually implementing the iteration isn't recommended, because it is an implementation detail of these types.

Example:

error: type `&[Ascii]` does not implement any method in scope named `to_uppercase`
let _ = "abcXYZ".to_ascii().unwrap().to_uppercase();
                                     ^~~~~~~~~~~~~~~~~~~~

Design

The types String and str provide functionality for converting to uppercase and lowercase with their implementations of the traits std::ascii::{AsciiExt, OwnedAsciiExt}. These traits are intended for "[…] ASCII-subset only operations on string slices" and owned strings. Vec<Ascii> and [Ascii] are trivially subsets of ASCII (they are equivalent to it), so it is valid to implement these traits for the ASCII-only string types.

Implement the traits:

impl AsciiExt<Vec<Ascii>> for [Ascii]
impl AsciiExt for Ascii
impl OwnedAsciiExt for Vec<Ascii>

The implementations use functionality present in Ascii if possible.

Design flaws

  • The traits std::ascii::{AsciiExt, OwnedAsciiExt} are marked experimental. This shouldn't be a real issue as I expect this crate to follow the way conversions are done in the standard library.
  • All functions declared by the traits std::ascii::{AsciiExt, OwnedAsciiExt} carry the infix ascii which is redundant in the case the traits are implemented on Vec<Ascii>, [Ascii] and Ascii. This redundancy must be tolerated to achieve the goals described above.

Drawbacks

  • Duplicate implementations arise because the type Ascii implements the same functionality in the functions to_uppercase() / to_ascii_uppercase() and to_lowercase() / to_ascii_lowercase().

Further considerations after discussion

  • Deprecate and remove the functions to_uppercase() and to_lowercase() on Ascii in favour of their equivalents in AsciiExt. That removes the duplication mentioned in the drawbacks.
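
For comparison, in current Rust the standard library exposes these case operations as inherent methods on str, [u8], u8 and char, so no trait import is needed. The sketch below uses only std types, not this crate's, and the helper names `shout` and `lower_in_place` are made up.

```rust
/// Returns an upper-cased copy; only the ASCII letters a-z are changed.
pub fn shout(s: &str) -> String {
    s.to_ascii_uppercase()
}

/// Lower-cases ASCII letters in place, without allocating.
pub fn lower_in_place(bytes: &mut [u8]) {
    bytes.make_ascii_lowercase();
}
```

Applied to the example from this issue, `shout("abcXYZ")` returns `"ABCXYZ"`.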

Does not compile with the latest Rust

/Users/coreyf/.cargo/git/checkouts/rust-ascii-d57360998fa8e9eb/master/src/lib.rs:126:1: 130:2 error: the impl does not reference any types defined in this crate; only traits defined in the current crate can be implemented for arbitrary types [E0117]
/Users/coreyf/.cargo/git/checkouts/rust-ascii-d57360998fa8e9eb/master/src/lib.rs:126 impl fmt::Display for Vec<Ascii> {
/Users/coreyf/.cargo/git/checkouts/rust-ascii-d57360998fa8e9eb/master/src/lib.rs:127     fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
/Users/coreyf/.cargo/git/checkouts/rust-ascii-d57360998fa8e9eb/master/src/lib.rs:128         fmt::Display::fmt(&self[..], f)
/Users/coreyf/.cargo/git/checkouts/rust-ascii-d57360998fa8e9eb/master/src/lib.rs:129     }
/Users/coreyf/.cargo/git/checkouts/rust-ascii-d57360998fa8e9eb/master/src/lib.rs:130 }
/Users/coreyf/.cargo/git/checkouts/rust-ascii-d57360998fa8e9eb/master/src/lib.rs:132:1: 136:2 error: the impl does not reference any types defined in this crate; only traits defined in the current crate can be implemented for arbitrary types [E0117]
/Users/coreyf/.cargo/git/checkouts/rust-ascii-d57360998fa8e9eb/master/src/lib.rs:132 impl fmt::Display for [Ascii] {
/Users/coreyf/.cargo/git/checkouts/rust-ascii-d57360998fa8e9eb/master/src/lib.rs:133     fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
/Users/coreyf/.cargo/git/checkouts/rust-ascii-d57360998fa8e9eb/master/src/lib.rs:134         fmt::Display::fmt(self.as_str(), f)
/Users/coreyf/.cargo/git/checkouts/rust-ascii-d57360998fa8e9eb/master/src/lib.rs:135     }
/Users/coreyf/.cargo/git/checkouts/rust-ascii-d57360998fa8e9eb/master/src/lib.rs:136 }
/Users/coreyf/.cargo/git/checkouts/rust-ascii-d57360998fa8e9eb/master/src/lib.rs:188:1: 224:2 error: the impl does not reference any types defined in this crate; only traits defined in the current crate can be implemented for arbitrary types [E0117]
/Users/coreyf/.cargo/git/checkouts/rust-ascii-d57360998fa8e9eb/master/src/lib.rs:188 impl AsciiExt for [Ascii] {
/Users/coreyf/.cargo/git/checkouts/rust-ascii-d57360998fa8e9eb/master/src/lib.rs:189     type Owned = Vec<Ascii>;
/Users/coreyf/.cargo/git/checkouts/rust-ascii-d57360998fa8e9eb/master/src/lib.rs:190
/Users/coreyf/.cargo/git/checkouts/rust-ascii-d57360998fa8e9eb/master/src/lib.rs:191     #[inline]
/Users/coreyf/.cargo/git/checkouts/rust-ascii-d57360998fa8e9eb/master/src/lib.rs:192     fn is_ascii(&self) -> bool {
/Users/coreyf/.cargo/git/checkouts/rust-ascii-d57360998fa8e9eb/master/src/lib.rs:193         true
                                                                                     ...
/Users/coreyf/.cargo/git/checkouts/rust-ascii-d57360998fa8e9eb/master/src/lib.rs:226:1: 238:2 error: the impl does not reference any types defined in this crate; only traits defined in the current crate can be implemented for arbitrary types [E0117]
/Users/coreyf/.cargo/git/checkouts/rust-ascii-d57360998fa8e9eb/master/src/lib.rs:226 impl OwnedAsciiExt for Vec<Ascii> {
/Users/coreyf/.cargo/git/checkouts/rust-ascii-d57360998fa8e9eb/master/src/lib.rs:227     #[inline]
/Users/coreyf/.cargo/git/checkouts/rust-ascii-d57360998fa8e9eb/master/src/lib.rs:228     fn into_ascii_uppercase(mut self) -> Vec<Ascii> {
/Users/coreyf/.cargo/git/checkouts/rust-ascii-d57360998fa8e9eb/master/src/lib.rs:229         self.make_ascii_uppercase();
/Users/coreyf/.cargo/git/checkouts/rust-ascii-d57360998fa8e9eb/master/src/lib.rs:230         self
/Users/coreyf/.cargo/git/checkouts/rust-ascii-d57360998fa8e9eb/master/src/lib.rs:231     }

License files?

Hi --

Looks like rust-ascii is available under the MIT and Apache 2.0 licenses. Could you include the respective license files in the repo?

Thanks!

Update quickcheck dependency to 0.8

The quickcheck crate has been updated to 0.8 since the last release of ascii which depends on 0.6.
In Debian, we usually package the latest version of crates except if there is a compelling reason to package earlier versions in addition.
See also Debian Bug #927314.

Implementing `From<&mut AsciiStr>` for `&mut str` and `&mut [u8]` is unsound

They allow writing non-ASCII values to an AsciiStr which when read out as an AsciiChar will produce values outside the valid niche.

These impls were added by me in 4fbd050, so 0.9, 0.8 and 0.7 are affected.

Here's an example using these impls to create out-of-bounds array indexing in safe code (when compiled in release mode):

let mut buf = [0u8; 1];
let ascii = buf.as_mut_ascii_str().unwrap();
let byte_view = <&mut[u8] as From<&mut AsciiStr>>::from(ascii);
let arr = [0b11011101u8; 128];
byte_view[0] = 180;
assert_ne!(arr[ascii[0] as u8 as usize], 0b11011101);

I don't see any good way to tell users of the crate to stop using these impls:
Deprecation notices on trait impls are ignored (by both Rust 1.38 and Rust 1.9).
Changing the impls to panic or return an empty slice could break working code (that never writes non-ASCII values) at run-time.

The only fix we could make appears to be to remove the impls, telling users of them to do the unsafe pointer casting explicitly. On one hand this will make any accidental users of it aware of the problem when they update Cargo.lock, but it will also break any use that happened to be OK with a minor release, and any reverse dependencies of these uses.
On the other hand doing nothing and hoping nobody accidentally uses these impls feels irresponsible. What do you think @tomprogrammer?
In any case I don't think we need to fix 0.7 and 0.8, as Rust didn't backport the security fix in 1.29.1 to previous affected versions.

All files in crate are executables

Hello,

While packaging the latest crate for my distribution, I noticed that all files in the published crate have their executable bits set. This causes an issue with one of our packaging scripts and seems to be an error. Could you remedy this?

Thank you,

methods unavailable with no_std

AsciiExt and Error are not in core, but some of their methods might still be useful.
We can add feature-gated inherent impls for those methods, but should we?

  • description() for ToAsciiCharError and AsAsciiStrError
  • eq_ignore_case() for AsciiChar and AsciiStr
  • to_ascii_{upper,lower}case() for AsciiChar
  • make_ascii_{upper,lower}case() for AsciiStr
