Natural Language Processing (NLP) Notes

Key steps to follow

1) Import Library

2) Import data

3) Clean and compress (Data Preprocessing)

    1) Sentence Tokenization (Paragraphs to Sentences) (. ! ? ;)
    2) Word Tokenization (Sentences to Words) (space, _ :)
    3) Punctuation and special character removal, and making text lowercase
    4) Stop word removal (is, a, an, the, them, couldn't, ....)
    5) Lemmatization and Stemming (Reduce each word to its root form)
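
The five preprocessing steps above can be sketched with the standard library alone. This is a minimal illustration, not a production pipeline: the stop-word set is a tiny hypothetical sample, and `naive_stem` is a crude suffix stripper standing in for a real stemmer or lemmatizer from a library such as NLTK or spaCy. All function names here are assumptions made for the example.

```python
import re

# Tiny illustrative stop-word list (assumption); real lists are much longer.
STOP_WORDS = {"is", "a", "an", "the", "them", "couldn't"}


def sentence_tokenize(paragraph):
    """Split a paragraph into sentences on the . ! ? ; delimiters."""
    parts = re.split(r"[.!?;]+", paragraph)
    return [p.strip() for p in parts if p.strip()]


def word_tokenize(sentence):
    """Split a sentence into words on whitespace, underscores, and colons."""
    return [w for w in re.split(r"[\s_:]+", sentence) if w]


def clean_word(word):
    """Lowercase a word and drop everything except letters and apostrophes."""
    return re.sub(r"[^a-z']", "", word.lower())


def naive_stem(word):
    """Crude suffix stripping; a stand-in for a real stemmer or lemmatizer."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(paragraph):
    """Run the five steps and return one list of tokens per sentence."""
    result = []
    for sentence in sentence_tokenize(paragraph):
        words = [clean_word(w) for w in word_tokenize(sentence)]
        words = [w for w in words if w and w not in STOP_WORDS]
        result.append([naive_stem(w) for w in words])
    return result
```

For example, `preprocess("The cats are playing! She couldn't stop them.")` returns `[["cat", "are", "play"], ["she", "stop"]]`.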

4) Exploratory Data Analysis (EDA)

    1) Generate a Word Cloud by plotting the data.
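
A word cloud draws each word at a size proportional to how often it appears, so the data behind it is a table of word counts. The sketch below computes only those counts; the drawing itself is usually left to a third-party plotting package. The token list and the `word_frequencies` helper are hypothetical.

```python
from collections import Counter


def word_frequencies(tokens, top_n=5):
    """Return the top_n most common tokens together with their counts."""
    return Counter(tokens).most_common(top_n)


# Hypothetical tokens, as they might look after preprocessing.
tokens = ["good", "movie", "good", "plot", "bad", "good", "plot"]
top_words = word_frequencies(tokens, top_n=2)
```

Here `top_words` is `[("good", 3), ("plot", 2)]`, so "good" would be drawn largest.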

5) Encoding data (Text data to Numerical data)

  • TF-IDF (Term Frequency-Inverse Document Frequency)
  • Score of a word in a particular row = (Number of times the word appears in the row / Total number of words in the row) * log (Number of rows / Number of rows containing the word)
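
The formula above can be applied directly to tokenized rows. This is a minimal sketch assuming a natural logarithm and no smoothing; library implementations such as scikit-learn's `TfidfVectorizer` use a smoothed, normalized variant by default, so their numbers will differ. The `tf_idf` helper and the sample rows are hypothetical.

```python
import math


def tf_idf(rows):
    """Return one {word: score} dict per row, using the unsmoothed formula."""
    n_rows = len(rows)

    # Document frequency: the number of rows that contain each word.
    doc_freq = {}
    for row in rows:
        for word in set(row):
            doc_freq[word] = doc_freq.get(word, 0) + 1

    scores = []
    for row in rows:
        row_scores = {}
        for word in set(row):
            tf = row.count(word) / len(row)            # term frequency
            idf = math.log(n_rows / doc_freq[word])    # inverse document frequency
            row_scores[word] = tf * idf
        scores.append(row_scores)
    return scores


# Hypothetical tokenized rows.
rows = [["good", "movie"], ["bad", "movie"], ["good", "good", "plot"]]
scores = tf_idf(rows)
```

A word that appears in every row gets a score of 0, because log(1) = 0; rare words score higher than common ones.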

6) Apply Machine Learning

1) Split the data

     Features (X) (2D Matrix)
     Targets (y) (1D Array)
     Train/test split with a fixed random state
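
What the split does can be sketched in a few lines: shuffle the row indices with a fixed seed, then cut them into a training part and a test part so that features and targets stay paired. In practice scikit-learn's `train_test_split` is the usual tool; the `split_data` helper below is a hypothetical stand-in, and the data is made up.

```python
import random


def split_data(X, y, test_size=0.25, random_state=42):
    """Shuffle row indices with a fixed seed, then cut into train and test."""
    indices = list(range(len(X)))
    random.Random(random_state).shuffle(indices)
    n_test = max(1, round(len(X) * test_size))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    X_train = [X[i] for i in train_idx]
    X_test = [X[i] for i in test_idx]
    y_train = [y[i] for i in train_idx]
    y_test = [y[i] for i in test_idx]
    return X_train, X_test, y_train, y_test


# Hypothetical data: 8 rows with 2 features each (2D) and 8 labels (1D).
X = [[i, i * 2] for i in range(8)]
y = [0, 1, 0, 1, 0, 1, 0, 1]
X_train, X_test, y_train, y_test = split_data(X, y)
```

Fixing the random state makes the split reproducible: calling `split_data` again with the same seed returns the same rows in the same order.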

2) Scaling the data

       1) Import model
       2) Initialize
       3) Fit (Learning process)
       4) Transform
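
The initialize, fit, transform sequence can be illustrated with a small hand-written scaler. This sketch assumes standardization (zero mean, unit variance per column), which is one common choice of scaling; the notes do not say which scaler is meant. `StandardScalerSketch` is a hypothetical class written for this example, and with a library the "import" step would bring in a ready-made scaler instead.

```python
import math


class StandardScalerSketch:
    """Hypothetical minimal scaler: per-column zero mean and unit variance."""

    def fit(self, X):
        # Learn the mean and standard deviation of every column.
        n = len(X)
        columns = list(zip(*X))
        self.means = [sum(col) / n for col in columns]
        self.stds = [
            # Fall back to 1.0 for constant columns to avoid dividing by zero.
            math.sqrt(sum((v - m) ** 2 for v in col) / n) or 1.0
            for col, m in zip(columns, self.means)
        ]
        return self

    def transform(self, X):
        # Apply the learned statistics; nothing is re-learned here.
        return [
            [(v - m) / s for v, m, s in zip(row, self.means, self.stds)]
            for row in X
        ]


X_train = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
X_test = [[4.0, 40.0]]

scaler = StandardScalerSketch()             # initialize
scaler.fit(X_train)                         # fit on the training data only
X_train_scaled = scaler.transform(X_train)  # transform the training data
X_test_scaled = scaler.transform(X_test)    # reuse the same statistics
```

Fitting on the training data only and then transforming both sets keeps information about the test set out of the learned statistics.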

3) Apply Machine learning algorithm

       1) Import model
       2) Initialize
       3) Fit (learning process)
       4) Predict
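
The same pattern, with predict in place of transform, applies to the model itself. The sketch below uses a toy nearest-centroid classifier purely to show the fit and predict steps; it is a swapped-in illustration, not an algorithm these notes prescribe. With a library, the "import" step would bring in a ready-made estimator. The class name, data, and labels are hypothetical.

```python
class NearestCentroidSketch:
    """Hypothetical toy classifier: predict the class with the closest mean."""

    def fit(self, X, y):
        # Learning step: average the feature rows of each class.
        grouped = {}
        for row, label in zip(X, y):
            grouped.setdefault(label, []).append(row)
        self.centroids = {
            label: [sum(col) / len(rows) for col in zip(*rows)]
            for label, rows in grouped.items()
        }
        return self

    def predict(self, X):
        # For each row, pick the label whose centroid is nearest.
        def sq_dist(a, b):
            return sum((p - q) ** 2 for p, q in zip(a, b))

        return [
            min(self.centroids, key=lambda lab: sq_dist(row, self.centroids[lab]))
            for row in X
        ]


X_train = [[0.0, 0.1], [0.2, 0.0], [1.0, 0.9], [0.8, 1.0]]
y_train = ["negative", "negative", "positive", "positive"]

model = NearestCentroidSketch()   # initialize
model.fit(X_train, y_train)       # fit (learning process)
predictions = model.predict([[0.1, 0.1], [0.9, 0.8]])
```

Here `predictions` is `["negative", "positive"]`, since each new row lies closest to the mean of that class.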

4) Evaluation metric (Check how well the model performs)

       1) Regression - The evaluation metric for regression is R^2, which ranges from minus infinity to 1.
        A higher R^2 means a better model.
        
       2) Classification - The evaluation metrics for classification are
           1) Accuracy score [ Higher accuracy means a better model (the value should be near 1) ]
           2) F1 score [ Ranges from 0 (low) to 1 (high); a higher F1 score means a better model ]
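
The three metrics can be written out by hand to show what each one measures. This is a minimal sketch with hypothetical helper names; it assumes binary labels for F1 and non-constant targets for R^2. Libraries such as scikit-learn ship tested versions of all three.

```python
def r_squared(y_true, y_pred):
    """R^2 = 1 - (residual sum of squares / total sum of squares).

    Assumes y_true is not constant, otherwise the denominator is zero.
    """
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

For example, `r_squared([1, 2, 3], [3, 2, 1])` is -3.0, which shows how a model worse than predicting the mean pushes R^2 below zero, while `accuracy([1, 0, 1, 1], [1, 0, 0, 1])` is 0.75.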

7) Sentiment analysis
