wordless's Introduction

Wordless: An Integrated Corpus Tool With Multilingual Support for the Study of Language, Literature, and Translation



Wordless is an integrated corpus tool with multilingual support for the study of language, literature, and translation, designed and developed by Ye Lei (叶磊), then an MA student in interpreting studies at Shanghai International Studies University (上海外国语大学).

Download

The latest version (3.5.0) of Wordless supports Windows 7/8/8.1/10/11, macOS 10.11 or later, Ubuntu 18.04 or later, Debian 10 or later, and Arch Linux, all 64-bit only. Both Intel-based and Apple Silicon-based Macs are supported.

For a detailed changelog, please see CHANGELOG.md.

Release: Remarks

Latest Release for Windows:
  1. Extract all files
  2. Double-click Wordless/Wordless.exe to run

Latest Release for macOS:
  1. Extract all files
  2. Double-click Wordless.app to run

Latest Release for Linux:
  1. Extract all files
  2. Double-click Wordless/Wordless to run
  3. [Optional] Double-click Wordless/Wordless - Create Shortcut to create a shortcut in Show Applications

Past Releases: Not recommended

Baidu Netdisk: For Chinese users with unstable connections to Github (PASSWORD: wdls)

Important

Note 1: It is recommended that the path to Wordless not contain any non-ASCII characters, such as Chinese characters and letters with diacritics.
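
A quick way to check a path against Note 1 is sketched below in Python. The function name and the sample paths are illustrative only and are not part of Wordless.

```python
def non_ascii_chars(path):
    """Return the non-ASCII characters found in a path, in order of first appearance."""
    seen = []
    for ch in path:
        if not ch.isascii() and ch not in seen:
            seen.append(ch)
    return seen


# Hypothetical install locations, used only for illustration
for candidate in ["C:/Tools/Wordless", "C:/工具/Wordless", "/home/ren\u00e9/Wordless"]:
    offending = non_ascii_chars(candidate)
    status = "OK" if not offending else "contains " + "".join(offending)
    print(f"{candidate}: {status}")
```

If the check reports any characters, moving Wordless to a folder whose full path is plain ASCII follows the recommendation above.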

Note 2: If your Mac says that “Wordless” is damaged and can’t be opened, please open Terminal (Launchpad → Other) and run:

xattr -rc /Applications/Wordless.app

Remember to replace /Applications/Wordless.app with the actual path of Wordless on your computer (you can drag Wordless.app into the Terminal window). Then run Wordless again; any warning printed in Terminal can be ignored if the program opens successfully.

Note 3: When opening corpora in languages other than English in Wordless, extra model files might need to be downloaded from the internet. If you encounter a Network Error dialog while downloading a model, it is most likely a genuine network error, so check your internet connection following the instructions in the error message and try downloading the model again.

Users in China, where connections to Github and Hugging Face Hub are unstable, are advised to use a proxy and configure it in Menu Bar → Preferences → Settings → General → Proxy Settings. Alternatively, Chinese users can manually download model files from Baidu Netdisk. The steps for installing models are as follows:

  1. Check the error message displayed in the Network Error dialog. If stanza is found in the error message, you need a Stanza model; otherwise, you need a spaCy model.
  2. Download model files for the language of your corpus from the above link and extract all files.
  3. For Windows and Linux users, put spaCy models under Wordless/libs and Stanza models under Wordless/libs/stanza_resources. For macOS users, right-click on Wordless.app, select Show Package Contents, then put spaCy models under Contents/Frameworks and Stanza models under Contents/Frameworks/stanza_resources.
  4. If your corpora are in different languages or both spaCy and Stanza models are required, repeat steps 1 to 3 until the Network Error dialog no longer appears.
  5. Try opening corpora in Wordless again; the model downloading process should now be skipped.
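
Steps 1 and 3 above can be summarized as a small decision sketch in Python. The function names are hypothetical and this is not code from Wordless. The case-insensitive match on "stanza" is an assumption, and the sample error strings are made-up placeholders rather than real Wordless output.

```python
from pathlib import Path


def model_kind(error_message):
    """Step 1: a Stanza model is needed if 'stanza' appears in the message, else a spaCy model."""
    # Assumption: match case-insensitively
    return "stanza" if "stanza" in error_message.lower() else "spacy"


def model_destination(wordless_root, kind, is_macos):
    """Step 3: folder that should receive the extracted model files.

    wordless_root is the Wordless folder on Windows/Linux, or Wordless.app on macOS.
    """
    root = Path(wordless_root)
    base = root / "Contents" / "Frameworks" if is_macos else root / "libs"
    return base / "stanza_resources" if kind == "stanza" else base


# Placeholder messages for illustration only
for message in ["failed to fetch stanza resources", "connection timed out"]:
    kind = model_kind(message)
    print(kind, "->", model_destination("Wordless", kind, is_macos=False).as_posix())
```

On macOS, passing the path of Wordless.app with is_macos=True yields the Contents/Frameworks locations listed in step 3.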

If the problem persists or the model you need is missing from the above link, please contact the author for further support.

Need Help?

If you have any questions, find software bugs, need to provide feedback, or want to submit feature requests, you may seek support from the open-source community or contact me directly via any of the support channels listed below.

Support Channel: Information
Official documentation: Stable Version | Development Version
Tutorial videos: YouTube | bilibili
Bug reports: Github Issues
Usage questions: Github Discussions
Email support: blkserene@gmail.com
WeChat official account

Citing

If you are going to publish a work that uses Wordless, please cite as follows.

APA (7th edition):

Ye, L. (2024). Wordless (Version 3.5.0) [Computer software]. Github. https://github.com/BLKSerene/Wordless

MLA (8th edition):

Ye Lei. Wordless, version 3.5.0, 2024. Github, https://github.com/BLKSerene/Wordless.

Works Using Wordless

For details, please click HERE.

License

Copyright (C) 2018-2024  Ye Lei (叶磊)

This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program.  If not, see <http://www.gnu.org/licenses/>.

Contributing

For details, please click HERE.

Donating

If you would like to support the development of Wordless, you may donate via PayPal, Alipay, or WeChat Pay.

Acknowledgments

For details, please click HERE.

wordless's People

Contributors

blkserene

wordless's Issues

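Searching aligned texts is inconvenient

image
What is causing this problem?
Also, in the Parallel Concordancer, is it not possible to display search results from multiple TMX files at the same time?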

Dependency Parser: `sentence_display` is escaped to HTML

Describe the bug
Some characters, such as ', are displayed as HTML entities in the Sentence column.

To Reproduce
Steps to reproduce the behavior:

  1. echo -e "What's that? The sign & means \"and\"." > sample.txt
  2. Go to the Dependency Parser tab
  3. Open sample.txt and click Generate table
  4. See error

Expected behavior
Sentences are shown as they actually are.

Screenshots
2023-08-29_11-52

Environment information

  • Operating System: Arch Linux
  • Wordless Version: 3.3.0

Additional context
I dug a bit and found that with this line commented, the problem disappears.

2023-08-29_11-51

(The Concordancer and Concordancer Parallel tabs are missing in the screenshots because I disabled them after failing to install some of their dependencies. But the screenshots do belong to Wordless version 3.3.0, though run from source code instead of the compiled release.)
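
The symptom is consistent with the sentence text being HTML-escaped before it is displayed. The sketch below uses Python's standard html module to show the effect on the sample sentence from the reproduction steps. It is not the Wordless code, and whether the line in question calls html.escape is an assumption.

```python
import html

sentence = 'What\'s that? The sign & means "and".'

# html.escape uses quote=True by default, so ' and " are escaped as well as &
escaped = html.escape(sentence)
print(escaped)
# What&#x27;s that? The sign &amp; means &quot;and&quot;.

# Unescaping recovers the original text
restored = html.unescape(escaped)
print(restored == sentence)  # True
```

If escaped text is shown in a widget that treats it as plain text rather than rich text, the entities appear literally, which matches the screenshots described above.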

Frequent crashes for some operations (N-gram, etc)

Hi,

I have a few documents mixed in Tibetan and Chinese. I found Wordless would crash multiple times, especially for n-gram, collocation extractor.

I'm not sure if that's because of the size of the corpus. From the profiler, there are 10909 paragraphs, and 172885 tokens, 581964 characters. I remember I tried with small files, but the app crashed too.

I'm on macOS 12.3.1 with an M1 Pro chip. I tried another MacBook with an Intel chip but had the same experience. Wordless version: 2.2.0.

The path to Wordless doesn't have any non-ASCII characters, though file names are in Tibetan and Chinese.

I do have the crash report that was generated by the system, but I'm not sure if that's helpful. Please let me know what other information is needed for investigation.

Thanks

All statistics are 0

I used Wordless to analyze Obama's speech "A Just and Lasting Peace" (Nobel Peace Prize Lecture, Oslo, Norway, December 10, 2009)
and got the following message:
Data processing has completed successfully, but there are no results to display. You can change your settings and try again.
Is it because the text is too short? How should I configure the settings?

Can't run in Win10

Glad to see this, and thank you for sharing. But it doesn't run on my Win 10 machine.
Please see the screenshot:

Wordless-screenshot

I can't open the exe file

I downloaded Wordless 2.1.0, but after extracting it the exe file won't open, and no message is shown. I'm on Windows 10.

pybo becomes pybo + botok

Not really an issue, but in your next releases, you might want to use botok instead of pybo.
Everything is the same except for the import line.

All the code related to the Tibetan tokenizers has been moved to botok. pybo is now a toolbox that imports and uses botok, amongst others, and provides a convenient command line interface for standard operations.
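
One way to handle the switch without breaking existing installations is to prefer botok and fall back to pybo when only the older package is present. The sketch below only checks which package is importable; the helper name is hypothetical, and it assumes the importable module names are botok and pybo.

```python
import importlib.util


def first_available(candidates=("botok", "pybo")):
    """Return the name of the first importable package in candidates, or None."""
    for name in candidates:
        if importlib.util.find_spec(name) is not None:
            return name
    return None


package = first_available()
if package is None:
    print("Neither botok nor pybo is installed")
else:
    print(f"Tibetan tokenizer package to import: {package}")
```

The returned name can then be passed to importlib.import_module, so the rest of the code only differs in the import line, as described above.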

unexpected breakdown

Each time I finish importing, or have almost finished, the software crashes. Please check!

A bug to my Mac M1

Describe the bug
I just wanted to open a file in order to run the app, but it reported a fatal error and froze immediately.

To Reproduce
Steps to reproduce the behavior:

  1. Go to file
  2. Click on open file
  3. Scroll down to add file
  4. See error

Expected behavior
please fix it as soon as possible

Screenshots
Screen Shot 2022-12-13 at 00 38 36

Environment information

  • Operating System: [e.g. Windows 11 x64, macOS Monterey 12.6, Ubuntu 22.04 x64]
  • Wordless Version: [e.g. 2.3.0]

Additional context
Add any other context about the problem here.

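How can I use this software to calculate the information density of Chinese and English texts?

This software is powerful and spares language researchers the pain of not knowing how to program! Awesome. Have you considered adding a feature that counts the content words and function words in Chinese and English texts and calculates the ratio between the two? I need this to calculate the information density of texts.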
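
The requested statistic can be sketched in Python as follows. This is not a Wordless feature. It assumes the tokens are already tagged with Universal POS tags, and treating NOUN, PROPN, VERB, ADJ, and ADV as content words is a common convention adopted here as an assumption. The tagged sample is hand-written for illustration.

```python
# Assumption: these Universal POS tags count as content words
CONTENT_TAGS = {"NOUN", "PROPN", "VERB", "ADJ", "ADV"}
# Tags that are not counted as words at all
IGNORED_TAGS = {"PUNCT", "SYM", "X"}


def content_function_counts(tagged_tokens):
    """Count content words and function words in (token, tag) pairs."""
    content = function = 0
    for _token, tag in tagged_tokens:
        if tag in IGNORED_TAGS:
            continue
        if tag in CONTENT_TAGS:
            content += 1
        else:
            function += 1
    return content, function


def lexical_density(tagged_tokens):
    """Share of content words among all counted words (0.0 for empty input)."""
    content, function = content_function_counts(tagged_tokens)
    total = content + function
    return content / total if total else 0.0


sample = [("The", "DET"), ("cat", "NOUN"), ("sat", "VERB"), ("on", "ADP"),
          ("the", "DET"), ("mat", "NOUN"), (".", "PUNCT")]

content, function = content_function_counts(sample)
print(content, function)        # 3 3
print(lexical_density(sample))  # 0.5
```

The content-to-function ratio asked for in the issue is content / function, which needs a guard for texts with no function words.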

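Could DPI scaling support be added?

I'm using a MateBook X Pro. I'm not sure whether it's a resolution issue (3K x 2K) combined with 200% scaling, but the text on the page is still very small even when set to Extra Large.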
微信截图_20210417174929

OSX version not loading

hi
thx for this tool, i tried using the OSX version but program does not want to start;
sorry i can't get any diagnostic info from Console for some reason;
my system is 10.11.6
thx
