text_mining_resources's Introduction

Text Mining and Natural Language Processing Resources

 ____ ____ ____ ____ _________ ____ ____ ____ ____ ____ ____ 
||t |||e |||x |||t |||       |||m |||i |||n |||i |||n |||g ||
||__|||__|||__|||__|||_______|||__|||__|||__|||__|||__|||__||
|/__\|/__\|/__\|/__\|/_______\|/__\|/__\|/__\|/__\|/__\|/__\|

A curated list of resources for learning about natural language processing, text mining, text analytics, and unstructured data.

Table of Contents

Books

R

Python

General

Blogs

Blog Articles, Papers, Case Studies

General

Biases in NLP

Scraping

Cleaning

Stop Words

Stemming

Dimensionality Reduction

Sarcasm Detection

Document Classification

Entity and Information Extraction

Document Clustering and Document Similarity

Concept Analysis/Topic Modeling

Sentiment Analysis

Text Summarization

Machine Translation

Q&A Systems, Chatbots

Fuzzy Matching, Probabilistic Matching, Record Linkage, Etc.

Word and Document Embeddings

Deep Learning

Knowledge Graphs

Benchmarks

  • SQuAD leaderboard. A list of the strongest-performing NLP models on the Stanford Question Answering Dataset (SQuAD).
    • SQuAD 1.0 paper (last updated October 2016). SQuAD v1.1 includes over 100,000 question-and-answer pairs based on Wikipedia articles.
    • SQuAD 2.0 paper (October 2018). The second generation of SQuAD adds unanswerable questions: the model must recognize when the accompanying passage does not contain an answer and abstain instead of guessing.
  • GLUE leaderboard.
    • GLUE paper (September 2018). A collection of nine NLP tasks including single-sentence tasks (e.g. check if grammar is correct, sentiment analysis), similarity and paraphrase tasks (e.g. determine if two questions are equivalent), and inference tasks (e.g. determine whether a premise contradicts a hypothesis).
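
To make the SQuAD 2.0 description above concrete, here is a minimal sketch of how answerable and unanswerable questions can be told apart in a file that follows the SQuAD 2.0 JSON layout (`data` → `paragraphs` → `qas`, with an `is_impossible` flag on each question). The `sample` record is hand-written for illustration and is not taken from the dataset; the helper names `iter_questions`, `split_by_answerability`, and `answer_spans_are_valid` are hypothetical.

```python
# Hypothetical, hand-written record in the SQuAD 2.0 JSON layout.
# It is NOT real dataset content; it only mirrors the field names.
sample = {
    "version": "v2.0",
    "data": [
        {
            "title": "Example_Article",
            "paragraphs": [
                {
                    "context": "Text mining extracts structured information "
                               "from unstructured text.",
                    "qas": [
                        {
                            "id": "q1",
                            "question": "What does text mining extract?",
                            "is_impossible": False,
                            "answers": [
                                {"text": "structured information",
                                 "answer_start": 21}
                            ],
                        },
                        {
                            "id": "q2",
                            "question": "Who invented text mining?",
                            "is_impossible": True,  # passage has no answer
                            "answers": [],
                        },
                    ],
                }
            ],
        }
    ],
}


def iter_questions(dataset):
    """Yield (context, qa) pairs from a SQuAD-style dictionary."""
    for article in dataset["data"]:
        for paragraph in article["paragraphs"]:
            for qa in paragraph["qas"]:
                yield paragraph["context"], qa


def split_by_answerability(dataset):
    """Return (answerable_ids, unanswerable_ids)."""
    answerable, unanswerable = [], []
    for _context, qa in iter_questions(dataset):
        # SQuAD 1.1 files have no "is_impossible" key, so default to False.
        if qa.get("is_impossible", False):
            unanswerable.append(qa["id"])
        else:
            answerable.append(qa["id"])
    return answerable, unanswerable


def answer_spans_are_valid(dataset):
    """Check that each answer text occurs at its stated character offset."""
    for context, qa in iter_questions(dataset):
        for answer in qa["answers"]:
            start = answer["answer_start"]
            if context[start:start + len(answer["text"])] != answer["text"]:
                return False
    return True


answerable_ids, unanswerable_ids = split_by_answerability(sample)
print("answerable:", answerable_ids)
print("unanswerable:", unanswerable_ids)
print("spans valid:", answer_spans_are_valid(sample))
```

To run this on the real dataset, load the downloaded JSON file with `json.load` and pass the resulting dictionary in place of `sample`.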

Online courses

Udemy

Stanford

Coursera

DataCamp

Others

APIs and Libraries

Products

Getting Data out of PDFs

Online Demos and Tools

Datasets

Lexicons for Sentiment Analysis

Misc

Meta

Other Curated Lists

Contribute

Contributions are more than welcome! Please read the contribution guidelines first.

License

CC0

To the extent possible under law, @stepthom has waived all copyright and related or neighboring rights to this work.

text_mining_resources's People

Contributors

stepthom, aruncbhatia, csehdz, daiyiding, kritika2011, ngilmore, ssh24, singh-k01, talolard, uwmonkey, xddenny, firecharm, canadamike, jennvlasiu, levi-b, malujane, rebeccaguy, tinaytpeng, torisopik, uoftcompeng, yhjyoon
