

CapNLP

My NLP programs on Spark

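This is the home page for Caphael's NLP programs. The project was started out of personal interest and currently provides only the following features: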

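1 New word discovery:
  1.1 New word discovery based on the independence (association) between the Chinese characters within a word (100%)
  1.2 New word discovery based on left and right entropy (100%)
  1.3 Multi-character new word discovery based on the two algorithms above (100%)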
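The two measures listed above can be illustrated with a small single-machine sketch. This is not CapNLP's Spark code. It assumes that "independence (association)" is measured as the minimum pointwise mutual information (PMI) over all two-way splits of a candidate, which is one common choice, and that left and right entropy is the Shannon entropy of the characters adjacent to the candidate. All function names and threshold values here are hypothetical.

```python
import math
from collections import Counter


def ngram_counts(text, max_len):
    """Count every substring of length 1..max_len."""
    counts = Counter()
    for n in range(1, max_len + 1):
        for i in range(len(text) - n + 1):
            counts[text[i:i + n]] += 1
    return counts


def cohesion(word, counts, total):
    """Minimum PMI over all two-way splits of `word` (assumed measure)."""
    p_word = counts[word] / total
    best = float("inf")
    for k in range(1, len(word)):
        p_split = (counts[word[:k]] / total) * (counts[word[k:]] / total)
        best = min(best, math.log(p_word / p_split))
    return best


def entropy(neighbor_counts):
    """Shannon entropy (natural log) of a Counter of neighbor characters."""
    total = sum(neighbor_counts.values())
    return -sum(c / total * math.log(c / total) for c in neighbor_counts.values())


def boundary_entropy(text, word):
    """Return (left_entropy, right_entropy) for the characters around `word`."""
    left, right = Counter(), Counter()
    start = text.find(word)
    while start != -1:
        if start > 0:
            left[text[start - 1]] += 1
        end = start + len(word)
        if end < len(text):
            right[text[end]] += 1
        start = text.find(word, start + 1)
    return entropy(left), entropy(right)


def discover(text, max_len=4, min_count=2, min_cohesion=1.0, min_entropy=0.5):
    """Keep candidates that pass both filters (thresholds are illustrative)."""
    counts = ngram_counts(text, max_len)
    total = len(text)
    words = []
    for cand, count in counts.items():
        if len(cand) < 2 or count < min_count:
            continue
        if cohesion(cand, counts, total) < min_cohesion:
            continue
        if min(boundary_entropy(text, cand)) < min_entropy:
            continue
        words.append(cand)
    return words
```

A candidate with high cohesion appears more often than its parts would by chance, and high entropy on both sides means it occurs in varied contexts, so it is likely a complete word rather than a fragment of a longer one. Rescanning the text for every candidate is quadratic and only suitable for small samples; a distributed version would aggregate neighbor counts per candidate instead.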

Appendix 1: New words of 2 to 6 characters mined from the first volume of A Song of Ice and Fire (《冰与火之歌》), used as the test sample:
嶙峋, 蹒跚, 翡翠, 憔悴, 癞哈蟆, 狰狞, 恍然大悟, 魁梧, 谨慎, 玻璃, 柠檬, 蜿蜒, 踉跄, 峰峦, 兜帽, 摄政, 巡逻, 蜘蛛, 沐浴, 讥讽, 震慑, 玫瑰, 竖琴, 洋葱, 鸡皮疙瘩, 咆哮, 吩咐, 狱卒, 犹豫, 陡峭, 疲惫, 桑铎, 敏捷, 崎区, 姬琪, 庆祝, 赤裸, 祈祷, 奴隶, 旗帜, 吊桥, 呻吟, 蜂蜜, 盾牌, 墓窖, 优雅, 帘幕, 虚伪, 选择, 坟墓, 丘陵, 晕眩, 骄傲, 笨拙, 忠心耿耿, 催促, 挣扎, 愚蠢, 蒸腾, 哀悼, 鞠躬, 困惑, 搀扶, 缰绳, 盛夏群岛, 考虑, 融化, 喉咙, 哭哭啼啼, 啜泣, 捻捻, 弥赛菈, 偏偏, 糊涂, 吹嘘, 熟悉, 星辰, 仪式, 狭窄, 颈泽, 启程, 祝福, 诡计, 侏儒, 习俗, 协助, 遮蔽, 习惯, 咒骂, 泥泞, 旅店, 毒蛇, 诛儒, 严肃, 峡谷, 迅速, 审叛, 麻烦, 讨厌, 包括, 瑟曦, 羞辱, 暴躁, 朋友, 乌鸦, 胸膛, 翅膀, 饶富兴味, 建筑, 绞盘, 燃烧, 痕迹, 精力充沛, 宾客, 猜测, 烈焰, 畸形, 项链, 妓院, 阴霾, 颤抖, **, 秘密, 忠诚, 弓箭, 畏缩, 喃喃, 肋骨, 欺负, 窗棂, 轮宫, 疯狂, 苦涩, 八字胡, 帆船, 锁链, 畏惧, 青铜, 坚毅, 危险, 稻草, 摩擦, 牙齿, 夕阳, 摇曳, 悄悄, 慈悲, 旅馆, 乖乖, 皱眉, 窗户, 恢复, 荣誉, 惊讶, 宫廷, 斗篷, 聚集, 训练, 浪费, 勉强, 焦虑, 诸侯, 杂耍, 市镇, 恐怖, 笼罩, 严峻, 僵硬, 庭院, 卓戈, 呐喊, 托曼, 柔软, 抵达, 仔细, 纷纷, 艰苦, 寝室, 歌谣, 班扬, 垦求, 镰刀,...

