opencc-js's Introduction

opencc-js

The JavaScript version of Open Chinese Convert (OpenCC)

繁體版 - 简体版

Import

Import opencc-js in HTML page

Load one of the UMD bundles with a script tag:

<script src="https://cdn.jsdelivr.net/npm/opencc-js@<version>/dist/umd/full.js"></script>     <!-- Full version -->
<script src="https://cdn.jsdelivr.net/npm/opencc-js@<version>/dist/umd/cn2t.js"></script>     <!-- Simplified Chinese to Traditional Chinese only -->
<script src="https://cdn.jsdelivr.net/npm/opencc-js@<version>/dist/umd/t2cn.js"></script>     <!-- Traditional Chinese to Simplified Chinese only -->

ES6 import

<script type="module">
  // Import only ONE of the following bundles; importing several under the same name is a syntax error.
  import * as OpenCC from './dist/esm/full.js'; // Full version
  // import * as OpenCC from './dist/esm/cn2t.js'; // Simplified Chinese to Traditional Chinese only
  // import * as OpenCC from './dist/esm/t2cn.js'; // Traditional Chinese to Simplified Chinese only
</script>

Import opencc-js in Node.js script

npm install opencc-js

CommonJS

const OpenCC = require('opencc-js');

ES Modules

import * as OpenCC from 'opencc-js';

Usage

Basic usage

// Convert Traditional Chinese (Hong Kong) to Simplified Chinese (Mainland China)
const converter = OpenCC.Converter({ from: 'hk', to: 'cn' });
console.log(converter('漢語')); // output: 汉语

Custom Converter

const converter = OpenCC.CustomConverter([
  ['香蕉', 'banana'],
  ['蘋果', 'apple'],
  ['梨', 'pear'],
]);
console.log(converter('香蕉 蘋果 梨')); // output: banana apple pear

Alternatively, pass a single string that uses a space between a word and its replacement and a vertical bar between entries.

const converter = OpenCC.CustomConverter('香蕉 banana|蘋果 apple|梨 pear');
console.log(converter('香蕉 蘋果 梨')); // output: banana apple pear
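
The string form packs the same pairs into one string. As a rough illustration of the format only, the hypothetical helper below (`parseDictString` is not part of opencc-js) splits such a string back into the array form. It assumes entries are separated by `|` and that the first space in each entry separates a word from its replacement.

```javascript
// Hypothetical helper, not an opencc-js function:
// turns 'a b|c d' into [['a', 'b'], ['c', 'd']].
function parseDictString(dictString) {
  return dictString
    .split('|')
    .filter((entry) => entry.length > 0) // tolerate a trailing '|'
    .map((entry) => {
      const spaceIndex = entry.indexOf(' ');
      if (spaceIndex === -1) {
        throw new Error(`Missing space delimiter in entry: ${entry}`);
      }
      return [entry.slice(0, spaceIndex), entry.slice(spaceIndex + 1)];
    });
}

const pairs = parseDictString('香蕉 banana|蘋果 apple|梨 pear');
console.log(pairs);
// [ [ '香蕉', 'banana' ], [ '蘋果', 'apple' ], [ '梨', 'pear' ] ]
```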

Add words

  • Use the low-level function ConverterFactory to create the converter.
  • Get the preset dictionaries from the Locale property.
const customDict = [
  ['“', '「'],
  ['”', '」'],
  ['‘', '『'],
  ['’', '』'],
];
const converter = OpenCC.ConverterFactory(
  OpenCC.Locale.from.cn,                   // Simplified Chinese (Mainland China) => OpenCC standard
  OpenCC.Locale.to.tw.concat([customDict]) // OpenCC standard => Traditional Chinese (Taiwan) with custom words
);
console.log(converter('悟空道:“师父又来了。怎么叫做‘水中捞月’?”'));
// output: 悟空道:「師父又來了。怎麼叫做『水中撈月』?」

The following produces the same result, at the cost of an extra conversion pass.

const customDict = [
  ['“', '「'],
  ['”', '」'],
  ['‘', '『'],
  ['’', '』'],
];
const converter = OpenCC.ConverterFactory(
  OpenCC.Locale.from.cn, // Simplified Chinese (Mainland China) => OpenCC standard
  OpenCC.Locale.to.tw,   // OpenCC standard => Traditional Chinese (Taiwan)
  [customDict]           // Traditional Chinese (Taiwan) => custom words
);
console.log(converter('悟空道:“师父又来了。怎么叫做‘水中捞月’?”'));
// output: 悟空道:「師父又來了。怎麼叫做『水中撈月』?」
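
The two variants differ in where the custom dictionary sits: merged into the last dictionary group (one pass) or appended as its own group (an extra pass). The sketch below is a simplified stand-in, not the actual opencc-js implementation. `makeConverter` and `applyPass` are hypothetical names, the toy dictionaries are made up for illustration, and it assumes every dictionary is an array of [from, to] pairs and that each pass does greedy longest-match replacement.

```javascript
// Simplified stand-in for a ConverterFactory-like function (hypothetical, not library code).
// Each argument is a "dictionary group": an array of dictionaries,
// and each dictionary is an array of [from, to] pairs.
function makeConverter(...dictGroups) {
  const passes = dictGroups.map((group) => {
    const table = new Map(group.flat()); // merge all dictionaries of the group
    const maxLen = Math.max(0, ...[...table.keys()].map((key) => key.length));
    return (text) => applyPass(text, table, maxLen);
  });
  // Run the passes in order, feeding each result into the next pass.
  return (text) => passes.reduce((current, pass) => pass(current), text);
}

// One pass: at each position, replace the longest matching key,
// otherwise copy a single character unchanged.
function applyPass(text, table, maxLen) {
  let out = '';
  let i = 0;
  while (i < text.length) {
    let matchedLen = 0;
    for (let len = Math.min(maxLen, text.length - i); len > 0; len--) {
      const piece = text.slice(i, i + len);
      if (table.has(piece)) {
        out += table.get(piece);
        matchedLen = len;
        break;
      }
    }
    if (matchedLen === 0) {
      out += text[i];
      matchedLen = 1;
    }
    i += matchedLen;
  }
  return out;
}

// Toy dictionaries, for illustration only.
const toyToTw = [['汉', '漢'], ['语', '語']];
const toyQuotes = [['“', '「'], ['”', '」']];

const onePass = makeConverter([toyToTw, toyQuotes]);     // custom words merged into the group
const twoPasses = makeConverter([toyToTw], [toyQuotes]); // custom words as an extra pass

console.log(onePass('“汉语”'));   // 「漢語」
console.log(twoPasses('“汉语”')); // 「漢語」
```

In this toy model the merged form walks the text once, while the separate group walks it a second time, which is the extra conversion the README mentions.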

DOM operations

The HTML lang attribute identifies which elements are converted.

<span lang="zh-HK">漢語</span>
// Convert from Traditional Chinese (Hong Kong) to Simplified Chinese (Mainland China)
const converter = OpenCC.Converter({ from: 'hk', to: 'cn' });
// Start the conversion at the root node, i.e. convert the whole page
const rootNode = document.documentElement;
// Convert all elements with lang='zh-HK' and change their attribute value to lang='zh-CN'
const HTMLConvertHandler = OpenCC.HTMLConverter(converter, rootNode, 'zh-HK', 'zh-CN');
HTMLConvertHandler.convert(); // Convert: 漢語 -> 汉语
HTMLConvertHandler.restore(); // Restore: 汉语 -> 漢語

API

  • .Converter({}): declares the conversion direction via locales.
    • default: { from: 'tw', to: 'cn' }
    • syntax : { from: locale1, to: locale2 }
  • locales: letter codes that identify a regional writing tradition and, in some cases, its idiomatic vocabulary.
    • cn: Simplified Chinese (Mainland China)
    • tw: Traditional Chinese (Taiwan)
      • twp: with phrase conversion (e.g. 自行車 -> 腳踏車)
    • hk: Traditional Chinese (Hong Kong)
    • jp: Japanese Shinjitai
    • t: Traditional Chinese (OpenCC standard. Do not use unless you know what you are doing)
  • .CustomConverter([]): defines a converter from a custom dictionary.
    • default: []
    • syntax : [ ['item1','replacement1'], ['item2','replacement2'], … ]
  • .HTMLConverter(converter, rootNode, langAttrInitial, langAttrNew): uses a previously defined converter to convert the text content of the HTML elements marked with lang=langAttrInitial, from the root node down, into the target locale. It also changes those lang attributes from langAttrInitial to langAttrNew.
  • lang attributes: the HTML lang attribute tells the browser the language of the text content, before conversion (langAttrInitial) and after conversion (langAttrNew).
  • ignore-opencc: an HTML class signaling that an element and its sub-nodes will not be converted.

Bundle optimization

  • Tree shaking (ES modules only) may reduce the size of the bundle file.
  • Use ConverterFactory instead of Converter.
import * as OpenCC from 'opencc-js/core'; // primary code
import * as Locale from 'opencc-js/preset'; // dictionary

const converter = OpenCC.ConverterFactory(Locale.from.hk, Locale.to.cn);
console.log(converter('漢語'));

opencc-js's People

Contributors

ayaka14732, dlackty, ehoogerbeets, hugolpz, maple3142, ren1244, sgalal

opencc-js's Issues

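Declare the OpenCCJSData variable on window or globalThis

Hello, could the OpenCCJSData variable be declared on window or globalThis instead?
The current declaration, const OpenCCJSData, cannot be accessed as window.OpenCCJSData when the script is loaded dynamically.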
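
The behaviour described in this issue follows from how top-level declarations work in classic scripts: a top-level const creates a global binding but not a property of the global object. The sketch below reproduces this outside the browser with Node.js's built-in vm module. The context object `fakeWindow` stands in for window, and the variable names are made up for illustration.

```javascript
const vm = require('node:vm');

// A fresh global object standing in for a browser's `window`.
const fakeWindow = vm.createContext({});

// Three ways a classic (non-module) script can define a global.
vm.runInContext(`
  const viaConst = { name: 'const' };                // lexical binding only
  var viaVar = { name: 'var' };                      // becomes a property of the global object
  globalThis.viaGlobalThis = { name: 'globalThis' }; // explicit property
`, fakeWindow);

const visibleAsProperty = {
  viaConst: 'viaConst' in fakeWindow,
  viaVar: 'viaVar' in fakeWindow,
  viaGlobalThis: 'viaGlobalThis' in fakeWindow,
};
console.log(visibleAsProperty);
// { viaConst: false, viaVar: true, viaGlobalThis: true }

// A later script in the same context can still read the const by its bare name.
const bareNameType = vm.runInContext('typeof viaConst', fakeWindow);
console.log(bareNameType); // object
```

Under this model, assigning to globalThis (or window) is what makes the data reachable as a property after dynamic loading.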

Usecase for newbies

Hello, I created a jsfiddle to test a few use cases from your readme.md. I was partially successful.

The script that converts page content didn't work:

((async () => {
  const convert = await OpenCC.Converter('hk', 'cn');
  const startNode = document.documentElement; // Convert the whole page
  const HTMLConvertHandler = OpenCC.HTMLConverter(convert, startNode, 'zh-HK', 'zh-CN'); // Convert all zh-HK to zh-CN
  HTMLConvertHandler.convert(); // Start conversion
  HTMLConvertHandler.restore(); // Restore
})());

I tried assigning it to a variable, but that also failed (this one is likely on me, since I don't understand async well).

It could be interesting for users to have some working use cases on a fiddle as demos.

Review new documentation section API

  • Gather definitions of terms within an API section
  • Initiate definitions, default, syntax presentations.
  • Review recent changes README.md
  • Continue improvements of English version : phrasing, style, etc.
  • Translate into the local Chinese README(s).

>=v1.0.0 takes too much time to translate web pages

Sample files - testpages.zip

During my test, v0.3.6 and v0.3.7 took less than 100ms to finish the translation, while v1.0.0 and v1.0.1 took about 5 seconds.

For longer content (for example, the same content repeated three times), v0.3.6 and v0.3.7 still took less than 100ms to finish the translation, while v1.0.0 and v1.0.1 took about 16 to 18 seconds. Additionally, with v1.0.0 and v1.0.1 the browser showed a pop-up saying that the tab was not responding.

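A way to improve performance

JavaScript objects can only use strings as keys, so every lookup has to slice out a substring, which is a fairly heavy burden for articles with a lot of text.
In my own project I implemented this with a Map object, using Unicode code point values as keys. Lookups do not slice the string; they use codePointAt to get the key instead. In my tests this improved performance by more than 3 times.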
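
A minimal sketch of the idea in this issue, assuming a single-character dictionary only (phrase matching is left out). The function names `buildCodePointMap` and `convertByCodePoint` and the toy dictionary are hypothetical; this is not the code of opencc-js or of the issue author's project, and it makes no claim about the measured speed-up.

```javascript
// Build a lookup table keyed by Unicode code point (a number) instead of a string.
function buildCodePointMap(pairs) {
  const map = new Map();
  for (const [from, to] of pairs) {
    map.set(from.codePointAt(0), to);
  }
  return map;
}

// Walk the text by code point and look each one up without slicing substrings for the key.
function convertByCodePoint(text, map) {
  let out = '';
  let i = 0;
  while (i < text.length) {
    const codePoint = text.codePointAt(i);
    const replacement = map.get(codePoint);
    out += replacement !== undefined ? replacement : String.fromCodePoint(codePoint);
    // Code points above U+FFFF occupy two UTF-16 code units.
    i += codePoint > 0xffff ? 2 : 1;
  }
  return out;
}

// Toy dictionary, for illustration only.
const toyMap = buildCodePointMap([['汉', '漢'], ['语', '語']]);
console.log(convertByCodePoint('汉语 ok 𠮷', toyMap)); // 漢語 ok 𠮷
```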

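Implementation for manually skipping certain elements during HTML conversion

The built-in HTMLConverter cannot handle a "Simplified/Traditional switch menu" like the one below: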

<html lang="zh-hans">
...
<a href="javascript:void(0)" lang="zh-hans" onclick="setConvert(false)">简体</a>
<a href="javascript:void(0)" lang="zh-hant" onclick="setConvert(true)">繁體</a>
...
</html>

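In common UX designs, each item in such a menu should be written in the variant it selects, but the converter (Simplified to Traditional in this example) also converts the "简体" item into Traditional.

My own workaround is to add a class-name check inside the _inner function, next to the existing code that skips inline JS / CSS: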

/* Do not convert these elements */
if (currentNode.nodeType == Node.ELEMENT_NODE && currentNode.classList.contains('ignore-opencc')) return;
...

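With this change, the converter skips any element that has the ignore-opencc class, so adding that class in the page is enough to mark the parts that should not be converted.

However, my skills are limited, so I do not know how efficient this approach is (running a DFS over every element and also checking class names seems costly in time complexity), and I am not comfortable with PRs, so I am filing an issue first.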
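
The skip rule from this issue can be sketched on a plain object tree, so it runs without a browser. This is an illustration only, not the library's _inner function: `convertTree`, `el`, and `text` are hypothetical helpers, and the stand-in nodes mimic just the DOM fields used here (nodeType, classList.contains, childNodes, nodeValue). On the efficiency concern: the class check adds a small amount of work per visited element, and a skipped subtree is not traversed at all.

```javascript
const ELEMENT_NODE = 1; // same value as Node.ELEMENT_NODE
const TEXT_NODE = 3;    // same value as Node.TEXT_NODE

// Depth-first conversion that leaves 'ignore-opencc' elements and their subtrees untouched.
function convertTree(node, convert) {
  if (node.nodeType === TEXT_NODE) {
    node.nodeValue = convert(node.nodeValue);
    return;
  }
  if (node.nodeType !== ELEMENT_NODE) return;
  if (node.classList.contains('ignore-opencc')) return; // skip the whole subtree
  for (const child of node.childNodes) {
    convertTree(child, convert);
  }
}

// Stand-in node builders (hypothetical, for running outside a browser).
const text = (value) => ({ nodeType: TEXT_NODE, nodeValue: value });
const el = (classes, children) => ({
  nodeType: ELEMENT_NODE,
  classList: { contains: (name) => classes.includes(name) },
  childNodes: children,
});

// Toy converter, for illustration only.
const toyConvert = (s) => s.replace(/简体/g, '簡體');

const menuItem = el(['ignore-opencc'], [text('简体')]);
const paragraph = el([], [text('简体')]);
const root = el([], [menuItem, paragraph]);

convertTree(root, toyConvert);
console.log(menuItem.childNodes[0].nodeValue);  // 简体 (skipped)
console.log(paragraph.childNodes[0].nodeValue); // 簡體 (converted)
```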

whether dynamic translation is possible

Hello. After I have converted the page from Simplified to Traditional, newly entered content is not converted to Traditional. Is there any way to solve this problem?
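
One possible approach, sketched under assumptions rather than taken from the library: watch the page for nodes added after the initial conversion and convert their text as they arrive. `convertAddedNodes` and `convertTextNodes` are hypothetical helpers. The browser wiring at the end uses the standard MutationObserver API and assumes a global OpenCC object is loaded, and it only covers nodes inserted into the DOM, not text typed into form fields.

```javascript
const TEXT_NODE_TYPE = 3; // same value as Node.TEXT_NODE

// Convert every text node inside `node` (or `node` itself if it is a text node).
function convertTextNodes(node, convert) {
  if (node.nodeType === TEXT_NODE_TYPE) {
    node.nodeValue = convert(node.nodeValue);
    return;
  }
  for (const child of node.childNodes || []) {
    convertTextNodes(child, convert);
  }
}

// Handle a batch of mutation records: convert only the nodes that were just added.
function convertAddedNodes(mutations, convert) {
  for (const mutation of mutations) {
    for (const node of mutation.addedNodes) {
      convertTextNodes(node, convert);
    }
  }
}

// Browser wiring: runs only where MutationObserver and a global OpenCC exist.
if (typeof MutationObserver !== 'undefined' && typeof OpenCC !== 'undefined') {
  const convert = OpenCC.Converter({ from: 'cn', to: 'tw' });
  const observer = new MutationObserver((mutations) => convertAddedNodes(mutations, convert));
  // Observing childList only: rewriting nodeValue does not add nodes, so it does not re-trigger this callback.
  observer.observe(document.body, { childList: true, subtree: true });
}
```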

es6 imports

Sorry if this is obvious (JS is still somewhat opaque to me...), but how would one use this with es6 imports? I tried various versions of what is described here but none worked. Thanks!
