null8626 / decancer
A library that removes common Unicode confusables/homoglyphs from strings.
License: MIT License
See #14 for full comments.
Some portions utilize unsafe code to optimize performance. It would be useful for developers (current and future) if the purpose and invariants of the unsafe code were laid out in a comment to prevent violations.
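As a rough illustration of the kind of comment being requested, here is a minimal sketch of the usual Rust convention: a `# Safety` section on the unsafe function and a `// SAFETY:` note at each unsafe block. The functions `byte_at_unchecked` and `last_byte` are hypothetical and are not taken from decancer's source.

```rust
/// Returns the byte at `index` without a bounds check.
///
/// # Safety
///
/// The caller must guarantee that `index < bytes.len()`.
/// Violating this is undefined behavior.
unsafe fn byte_at_unchecked(bytes: &[u8], index: usize) -> u8 {
    // SAFETY: the caller upholds `index < bytes.len()` (see the contract above).
    unsafe { *bytes.get_unchecked(index) }
}

/// Safe wrapper: the invariant is checked here, once, before the unsafe call.
fn last_byte(bytes: &[u8]) -> Option<u8> {
    if bytes.is_empty() {
        return None;
    }
    // SAFETY: `bytes` is non-empty, so `bytes.len() - 1` is a valid index.
    Some(unsafe { byte_at_unchecked(bytes, bytes.len() - 1) })
}
```

With the invariant written down next to each block, a later change that breaks it is much easier to spot in review.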
Not sure why this is the case. Happens on version 1.6.2 in Rust with the error "attempt to subtract with overflow", error location being src/lib.rs:95:20. This also occurs with cure_char().
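For context, this panic message is what Rust emits in debug builds when an unsigned subtraction goes below zero. The sketch below is not the code at src/lib.rs:95:20 (which is not quoted here); `offset_from` is a hypothetical helper that only shows how `checked_sub` turns that case into a `None` instead of a panic.

```rust
/// Hypothetical helper: how far `code` lies above `base`,
/// or `None` when `code` is below `base`.
fn offset_from(code: u32, base: u32) -> Option<u32> {
    // A plain `code - base` panics with "attempt to subtract with overflow"
    // in debug builds whenever `code < base`. `checked_sub` reports that
    // case as `None` so the caller can handle it explicitly.
    code.checked_sub(base)
}
```

A caller would then treat `None` as "this character is outside the range" rather than indexing with a wrapped or invalid offset.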
There is no reason for this crate to use unsafe code.
For example, when fed the Unicode character “ˑ” (U+02D1, Modifier Letter Half Triangular Colon):
fn main() {
    println!("{}", decancer::cure("ˑ").as_str());
    println!("This never prints.");
}
Hi there,
I'm experiencing an issue where decancer automatically lowercases all uppercase letters. As far as I understand, this is unintentional behavior (since it is not documented anywhere I looked). If it is intentional, could we have an option that keeps the case of non-violating characters?
decancer version: 2.0.2
import decancer from "decancer"
console.log(decancer("Test").toString()) // Expected output: "Test", actual output: "test"
console.log(decancer("TeSt").toString()) // Expected output: "TeSt", actual output: "test"
There currently aren't any good libraries for decancering text in Python.
I'd like Python bindings using a library such as PyO3.
I tried reinventing the wheel but failed, and would love to integrate this perfectly working library into my Python projects as well.
Hi @null8626,
Thank you for providing this very fast and bright library!
In VS Code, I'm enforcing some coding rules with TypeScript and ESLint (only in the IDE; it's still a JS project), and I'm getting an error on the decancer function:
This expression is not callable.
Type 'typeof import("c:/Users/Administrator/WebstormProjects/classified-ads/node_modules/decancer/src/typings")' has no call signatures.
Line 233 in 3f4e7df
noundef. This will cause aborts in a couple Rust versions, when this pattern is compiled to ud2 (undefined instruction, abort instantly).
This is already being caught by running clippy locally, which I suggest is added to CI, under the clippy::uninit_assumed_init lint.
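For readers unfamiliar with the lint: `clippy::uninit_assumed_init` flags calling `assume_init()` directly on `MaybeUninit::uninit()`. The sketch below shows the flagged shape as a comment and one sound alternative. It is not the code at line 233 (which is not quoted here); `filled_buffer` and the buffer size are hypothetical.

```rust
use std::mem::MaybeUninit;

// The shape the lint flags (undefined behavior for `u8`, so kept as a comment):
//
//     let buf: [u8; 4] = unsafe { MaybeUninit::uninit().assume_init() };

/// Hypothetical example: build a 4-byte buffer through `MaybeUninit`,
/// initializing it before `assume_init` is called.
fn filled_buffer(value: u8) -> [u8; 4] {
    let mut buf = MaybeUninit::<[u8; 4]>::uninit();
    // `write` stores a fully initialized array without reading the old contents.
    buf.write([value; 4]);
    // SAFETY: every byte of `buf` was initialized by the `write` call above.
    unsafe { buf.assume_init() }
}
```

In a case this small, plain `[value; 4]` is simpler still. `MaybeUninit` only pays off when the initialization is incremental or done through a raw pointer.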
Ever since July 2023, I have been thinking about adding back Arabic and Hebrew support for decancer without causing issues because of their right-to-left madness. Then I found unicode-bidi, which re-renders your mixed RTL/LTR text in memory as it would be rendered by a web browser.
The plan is to somehow implement its algorithm here. An attempt has been in the works since then, but due to school and the complexity of Unicode's bidirectional algorithm, it has been on hiatus for months.
Because of this, development on this library has (publicly) stagnated, as attention has been directed to this enhancement.
I allow users to create simple patterns with asterisks, which boil down to equal/startsWith/endsWith/contains. This is currently the only missing feature keeping me from adopting this library in production.
test -> equal
test* -> startsWith
*test -> endsWith
*test* -> contains
I'd really love to see these functions being added <3. Thanks!
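To make the request concrete, here is a minimal sketch of how the four asterisk forms could map onto the four comparisons. It works on plain `&str` values only. The `Pattern` type and its methods are hypothetical and do not describe anything decancer currently exposes.

```rust
/// Hypothetical pattern type for the four asterisk forms listed above.
#[derive(Debug, PartialEq)]
enum Pattern<'a> {
    Equals(&'a str),
    StartsWith(&'a str),
    EndsWith(&'a str),
    Contains(&'a str),
}

impl<'a> Pattern<'a> {
    /// Classifies `raw` by its leading and trailing asterisk.
    fn parse(raw: &'a str) -> Self {
        let (leading, rest) = match raw.strip_prefix('*') {
            Some(rest) => (true, rest),
            None => (false, raw),
        };
        let (trailing, core) = match rest.strip_suffix('*') {
            Some(core) => (true, core),
            None => (false, rest),
        };
        match (leading, trailing) {
            (false, false) => Pattern::Equals(core),   // test
            (false, true) => Pattern::StartsWith(core), // test*
            (true, false) => Pattern::EndsWith(core),   // *test
            (true, true) => Pattern::Contains(core),    // *test*
        }
    }

    /// Applies the comparison to `text`.
    fn matches(&self, text: &str) -> bool {
        match *self {
            Pattern::Equals(s) => text == s,
            Pattern::StartsWith(s) => text.starts_with(s),
            Pattern::EndsWith(s) => text.ends_with(s),
            Pattern::Contains(s) => text.contains(s),
        }
    }
}
```

In practice `text` would be the cured output rather than the raw input, which is why having these four comparisons available on the cured string matters.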