wanganpingup / kashgari_textclassify_identify

This project forked from brikerman/kashgari

Kashgari is a production-level NLP transfer learning framework built on top of tf.keras for text labeling and text classification. It includes Word2Vec, BERT, and GPT2 language embeddings.

Home Page: http://kashgari.readthedocs.io/

License: Apache License 2.0

Languages: Python 99.36%, Shell 0.64%

kashgari_textclassify_identify's Introduction

🎉🎉🎉 We released the 2.0.0 version with TF2 Support. 🎉🎉🎉

If you use this project for your research, please cite:

@misc{Kashgari,
  author = {Eliyar Eziz},
  title = {Kashgari},
  year = {2019},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/BrikerMan/Kashgari}}
}

Overview

Kashgari is a simple and powerful NLP transfer learning framework. With it you can build a state-of-the-art model in 5 minutes for named entity recognition (NER), part-of-speech tagging (PoS), and text classification tasks.

  • Human-friendly. Kashgari's code is straightforward, well documented and tested, which makes it very easy to understand and modify.
  • Powerful and simple. Kashgari allows you to apply state-of-the-art natural language processing (NLP) models to your text, such as named entity recognition (NER), part-of-speech tagging (PoS) and classification.
  • Built-in transfer learning. Kashgari has built-in support for pre-trained BERT and Word2Vec embedding models, which makes it simple to apply transfer learning when training your model.
  • Fully scalable. Kashgari provides a simple, fast, and scalable environment for experimentation: train your models and try new approaches with different embeddings and model structures.
  • Production ready. Kashgari can export models in SavedModel format for TensorFlow Serving, so you can deploy them directly to the cloud.

Our Goal

  • Academic users: easier experimentation to prove their hypotheses without coding from scratch.
  • NLP beginners: learn how to build an NLP project with production-level code quality.
  • NLP developers: build a production-level classification/labeling model within minutes.

Performance

Contributions of additional performance reports are welcome.

| Task | Language | Dataset | Score |
| --- | --- | --- | --- |
| Named Entity Recognition | Chinese | People's Daily Ner Corpus | 95.57 |
| Text Classification | Chinese | SMP2018ECDTCorpus | 94.57 |

Installation

The project is based on Python 3.6+, because it is 2019 and type hinting is cool.

| Backend | Kashgari version | Description |
| --- | --- | --- |
| TensorFlow 2.2+ | pip install 'kashgari>=2.0.2' | TF2.2+ with tf.keras |
| TensorFlow 1.14+ | pip install 'kashgari>=1.0.0,<2.0.0' | TF1.14+ with tf.keras |
| Keras | pip install 'kashgari<1.0.0' | keras version |

You also need to install the tensorflow_addons release that matches your TensorFlow version.

| TensorFlow version | tensorflow_addons version |
| --- | --- |
| TensorFlow 2.1 | pip install tensorflow_addons==0.9.1 |
| TensorFlow 2.2 | pip install tensorflow_addons==0.11.2 |
| TensorFlow 2.3, 2.4, 2.5 | pip install tensorflow_addons==0.13.0 |
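
If you script your environment setup, the tensorflow_addons pairing above can be encoded as a small lookup. The following is only a sketch: `ADDONS_PINS` and `addons_requirement` are hypothetical names introduced here for illustration and are not part of Kashgari. The pins are copied from the table, and any TensorFlow version the table does not list returns `None` rather than a guess.

```python
from typing import Optional

# Hypothetical helper, not part of Kashgari.
# Pins copied from the compatibility table; keys are (major, minor).
ADDONS_PINS = {
    (2, 1): "0.9.1",
    (2, 2): "0.11.2",
    (2, 3): "0.13.0",
    (2, 4): "0.13.0",
    (2, 5): "0.13.0",
}


def addons_requirement(tf_version: str) -> Optional[str]:
    """Return the pip requirement string for tensorflow_addons.

    Returns None when the version string cannot be parsed or when the
    table has no entry for that TensorFlow release.
    """
    parts = tf_version.split(".")
    try:
        key = (int(parts[0]), int(parts[1]))
    except (IndexError, ValueError):
        return None
    pin = ADDONS_PINS.get(key)
    return "tensorflow_addons=={}".format(pin) if pin else None


if __name__ == "__main__":
    # Print the requirement for a TensorFlow 2.3.x installation.
    print(addons_requirement("2.3.1"))
```

The result can be passed straight to `pip install`. Returning `None` for unlisted versions keeps the helper from implying compatibility that the table does not state.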

Tutorials

Here is a set of quick tutorials to get you started with the library:

There are also articles and posts that illustrate how to use Kashgari:

Examples:

Contributors ✨

Thanks go to these wonderful people. There are many ways to get involved: start with the contributor guidelines, then check the open issues for specific tasks.

adline125, alexwwang, allcontributors[bot], bradfora, bratao, brikerman, cyberzhg, dependabot[bot], echan00, eryueniaobp, fossabot, haoyuhu, lemoz, lsgrep, mangopomelo, monkeywithacupcake, nirantk, sharpkoi, sunyancn, zxy199803
