Natural Language Processing (NLP), Natural Language Understanding (NLU), and Natural Language Generation (NLG) are all closely related subfields within the broader field of Natural Language Technology (NLT).
Natural Language Technology (NLT) is a field that uses various techniques and methods to analyze, understand, and generate human language.
https://oneai.com/learn/natural-language-technology-nlt
- Machine Learning:
Machine learning uses algorithms that learn from data to make predictions or decisions. It is widely used in NLT for tasks such as sentiment analysis, named entity recognition, and text summarization.
- Deep Learning:
Deep learning is a subset of machine learning that uses neural networks with multiple layers to learn from data. It is particularly useful for tasks such as language generation, machine translation, and text classification.
- Rule-Based Systems:
Rule-based systems use a set of pre-defined rules and heuristics to analyze and understand text. They are commonly used in tasks such as part-of-speech tagging, syntactic parsing, and named entity recognition.
- Language Modeling:
Language modeling is a statistical method for predicting the probability distribution of a sequence of words. It is used to train models that can generate text and to improve the performance of other NLP tasks such as speech recognition and machine translation.
- Transfer Learning:
Transfer learning pre-trains a neural network on a large dataset and then fine-tunes it on a smaller dataset for a specific task. This approach can improve the performance of tasks such as sentiment analysis, named entity recognition, and language translation.
- Reinforcement Learning:
Reinforcement learning is a method in which an agent learns by taking actions in an environment to maximize a reward. It is used for tasks such as dialog systems and text generation.
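The language-modeling technique above can be sketched with a toy bigram model estimated by counting. This is a minimal illustration, not a production approach; the corpus is invented for the example.

```python
# A minimal sketch of statistical language modeling: a bigram model
# estimated from counts, used to get the probability distribution
# over the next word. The tiny corpus here is illustrative only.
from collections import Counter, defaultdict

corpus = "the cat sat on the mat the cat ate".split()

# Count bigram occurrences: counts[w1][w2] = times w2 follows w1.
counts = defaultdict(Counter)
for w1, w2 in zip(corpus, corpus[1:]):
    counts[w1][w2] += 1

def next_word_probs(word):
    """P(next | word) as a dict, from relative bigram frequencies."""
    total = sum(counts[word].values())
    return {w: c / total for w, c in counts[word].items()}

probs = next_word_probs("the")
# In this corpus "the" is followed by "cat" twice and "mat" once.
```

Real language models (n-gram with smoothing, or neural) follow the same idea: assign probabilities to the next token given the preceding context.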
Natural Language Processing (NLP) is focused on enabling computers to understand human language in both written and verbal forms. It involves machine learning and deep learning techniques. Three main components help machines understand natural language: (1) tokenization, (2) embedding, and (3) model architectures.
- Language Processing Pipelines
Text => tokenizer => tagger => parser => ner => ... => Doc
- Tokenization and vectorization in NLP
Tokenization is the first step in natural language processing (NLP) projects. It divides a text into individual units, known as tokens; tokens can be words or punctuation marks. These tokens are then transformed into numerical vectors that represent the words. Two main concepts here are vectorization and embedding. Text vectorization turns words into numerical vectors with one dimension per vocabulary term (typically sparse), while a word embedding (word vector) is a vectorization learned through deep learning that represents each word as a dense vector.
- Tokenization in NLP
- Vectorization in NLP
- Text Vectorization
- Traditional approach
- Word Embedding
- Document Embedding
- Doc2Vec
- Distributed Memory (DM)
- Distributed Bag of Words (DBOW)
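The tokenization and count-based text vectorization steps outlined above can be sketched in a few lines. The whitespace tokenizer and toy documents are simplifications for illustration; real projects use trained tokenizers and libraries.

```python
# A minimal sketch of tokenization and traditional (count-based)
# text vectorization: one dimension per vocabulary word.
def tokenize(text):
    """Split text into lowercase word tokens, stripping punctuation."""
    return [tok.strip(".,!?").lower() for tok in text.split()]

docs = ["The cat sat.", "The dog sat!"]
tokenized = [tokenize(d) for d in docs]

# Build a vocabulary mapping each token to a vector dimension.
vocab = {tok: i for i, tok in
         enumerate(sorted({t for d in tokenized for t in d}))}

def vectorize(tokens):
    """Count vector over the vocabulary for one document."""
    vec = [0] * len(vocab)
    for tok in tokens:
        vec[vocab[tok]] += 1
    return vec

vectors = [vectorize(d) for d in tokenized]
```

Embedding methods such as Word2Vec or Doc2Vec replace these sparse count vectors with dense vectors learned by a neural network, so that similar words or documents end up close together in the vector space.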
Natural Language Understanding (NLU) is the subfield that deals with the ability of computers to understand the meaning of text written in natural language.
Natural Language Generation (NLG) is an advanced Artificial Intelligence technique that transforms non-linguistic representations of information into human-like text.
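At its simplest, the transformation NLG performs can be sketched with a template-based realizer that turns a non-linguistic record into text. The record fields and wording below are invented for illustration; modern NLG systems use neural models rather than hand-written templates.

```python
# A minimal sketch of template-based NLG: rendering a non-linguistic
# representation (a dict of weather readings) as human-readable text.
def realize(record):
    """Turn a structured weather record into an English sentence."""
    trend = "rising" if record["delta"] > 0 else "falling"
    return (f"In {record['city']}, the temperature is "
            f"{record['temp_c']}°C and {trend}.")

report = realize({"city": "Oslo", "temp_c": 4, "delta": -1})
```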
--updating--