HS-SC: Homophonic Substitution in Simplified Chinese

简体中文的同音替换

This script replaces each Simplified Chinese character in the user's input with a different common Simplified Chinese character that has the same pronunciation and tone. If no common homophone is available, the original character is kept. By default, each replaced character is preceded and followed by a space. If the input begins with "n", that "n" is removed before processing and no spaces are added around the replaced characters.

Usage:

pip install pypinyin
pip install pyperclip
python HS-SC.py

Script Explanation:

Initialise the pinyin-to-character mapping:

The script uses the pypinyin library to get the pinyin of every character in a long string of commonly used Chinese characters, then builds a mapping from each pinyin to the set of characters that share it. This mapping drives the pinyin-based substitution in the later steps.
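A minimal sketch of this grouping step follows. It is not the script's actual code: the function name build_pinyin_map, the get_pinyin parameter, and the TOY_READINGS table are all assumptions made for illustration, and the toy table stands in for pypinyin so the example runs without it.

```python
from collections import defaultdict

def build_pinyin_map(common_chars, get_pinyin):
    """Group characters by their (tone-marked) pinyin reading.

    get_pinyin is any callable mapping one character to its pinyin.
    With pypinyin installed, one plausible choice (an assumption about
    how the script does it) would be:
        lambda ch: lazy_pinyin(ch, style=Style.TONE)[0]
    """
    mapping = defaultdict(set)
    for ch in common_chars:
        mapping[get_pinyin(ch)].add(ch)
    return dict(mapping)

# Hypothetical stand-in for a real pinyin lookup (illustration only).
TOY_READINGS = {
    "是": "shì", "事": "shì", "市": "shì",
    "马": "mǎ", "码": "mǎ",
    "妈": "mā",
}

# Iterating a dict yields its keys, i.e. the characters themselves.
pinyin_map = build_pinyin_map(TOY_READINGS, TOY_READINGS.get)
```

Because the key includes the tone, 马 (mǎ) and 妈 (mā) end up in different groups, which matches the "same pronunciation and tone" requirement.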

Define the replacement function:

The replace_with_different_common_char function replaces a single character. Given the input character and its pinyin, it looks up the other characters with the same pinyin in the mapping and picks one at random. If the input character is itself among the candidates, it is removed first so that a character is never replaced with itself. A flag controls whether spaces are added around the replacement character.
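The behaviour described here could look roughly like the sketch below. The parameter names (pinyin_map, add_spaces, rng) and the example mapping are assumptions, not taken from the script. The sketch also assumes that the spaces are added even when the original character is kept, which the description does not spell out.

```python
import random

def replace_with_different_common_char(char, char_pinyin, pinyin_map,
                                       add_spaces=True, rng=random):
    """Return a random homophone of char, or char itself if none exists."""
    # Drop the input character so it is never chosen as its own replacement.
    candidates = set(pinyin_map.get(char_pinyin, ())) - {char}
    # Sorting makes the choice reproducible when rng is a seeded Random.
    replacement = rng.choice(sorted(candidates)) if candidates else char
    return f" {replacement} " if add_spaces else replacement

# Hypothetical mapping for illustration only.
example_map = {"shì": {"是", "事"}, "mā": {"妈"}}

only_option = replace_with_different_common_char("是", "shì", example_map,
                                                 add_spaces=False)
no_homophone = replace_with_different_common_char("妈", "mā", example_map)
```

Here 是 has exactly one other candidate, so only_option is 事. 妈 has no homophone in the toy mapping, so it is returned unchanged, wrapped in spaces.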

Define the convert function:

The convert_to_pinyin_with_replacement function iterates over the user's text. For each Simplified Chinese character, it gets the pinyin and then calls the replacement function. All other characters are left unchanged.
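A hedged sketch of that loop is shown below. How the script decides that a character is Simplified Chinese is not stated. This example assumes a simple check against the CJK Unified Ideographs range (U+4E00 to U+9FFF), which also matches Traditional characters. The get_pinyin parameter and the toy tables are likewise assumptions for illustration.

```python
import random

def convert_to_pinyin_with_replacement(text, pinyin_map, get_pinyin,
                                       add_spaces=True, rng=random):
    """Replace every CJK character in text with a random homophone."""
    out = []
    for ch in text:
        if "\u4e00" <= ch <= "\u9fff":  # assumed test for a Chinese character
            candidates = sorted(set(pinyin_map.get(get_pinyin(ch), ())) - {ch})
            new = rng.choice(candidates) if candidates else ch
            out.append(f" {new} " if add_spaces else new)
        else:
            out.append(ch)  # Latin letters, digits, punctuation pass through
    return "".join(out)

# Hypothetical lookup tables for illustration only.
toy_readings = {"是": "shì", "事": "shì", "妈": "mā"}
toy_map = {"shì": {"是", "事"}, "mā": {"妈"}}

compact = convert_to_pinyin_with_replacement("是a妈", toy_map, toy_readings.get,
                                             add_spaces=False)
spaced = convert_to_pinyin_with_replacement("是a妈", toy_map, toy_readings.get)
```

With these toy tables, 是 can only become 事, the letter a passes through untouched, and 妈 stays as it is because it has no homophone in the mapping.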

Get user input and set space flag:

The script reads the user's input and sets the add-space flag according to whether the first character of the input is "n".
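The prefix handling can be sketched as a small helper. The name parse_input is hypothetical and the real script may inline this logic.

```python
def parse_input(raw):
    """Split raw user input into (text_to_convert, add_spaces)."""
    if raw.startswith("n"):
        # A leading "n" disables spacing and is stripped before conversion.
        return raw[1:], False
    return raw, True

text, add_spaces = parse_input("n你好")
```

One consequence of this convention is that input which genuinely starts with the letter "n" loses that letter, so such text would need an extra leading "n" to survive.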

Execute conversion and output the result:

The script runs the conversion and prints the converted text. If spaces are enabled, they appear around each replaced character in the output. Finally, the converted text is copied to the clipboard.
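The output step might be sketched as follows. The emit helper and its copy_to_clipboard parameter are assumptions for illustration. pyperclip.copy is the real clipboard call, but it is imported lazily here so the example still runs on a machine without pyperclip or without a clipboard.

```python
def emit(text, copy_to_clipboard=None):
    """Print text and try to copy it; return True if the copy succeeded."""
    print(text)
    if copy_to_clipboard is None:
        try:
            import pyperclip  # installed via "pip install pyperclip"
        except ImportError:
            return False
        copy_to_clipboard = pyperclip.copy
    try:
        copy_to_clipboard(text)
    except Exception:
        # e.g. no clipboard mechanism available on a headless system
        return False
    return True

# Usage with a stand-in copier, so no real clipboard is needed.
captured = []
copied = emit(" 事 a 妈 ", captured.append)
```

Passing the copier as a parameter is a choice made for this sketch only, because it lets the function be exercised without touching the system clipboard.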

Shortcomings:

  • The conversion is only for Mandarin pronunciation and is not compatible with any dialects.

  • The programme applies no special treatment to polyphonic characters; they are converted as they are.

  • There is no GUI.

Preview:

拣 体 终 闻 的 童 因 惕 唤

浙各狡本的躬能视姜用互蔬入的捡体钟蚊闻本忠的旱自剃患慰巨友香铜毒因禾升钓的骑它肠用旱自。如裹鹊时眉友渴剃贷的尝件自择驶用员自。

美各背剃贷的自钱候君家医各空葛。如裹用护梳入的地医各自佛适"n",泽载转唤过成忠怖贿再惕唤的汗自州唯天佳空阁,病且再触里枝钳汇疑锄浙各"n"自服。
