ML100Days

Natural language processing marathon

Hello there, this is my GitHub repository where I push my assignments for the natural language processing online bootcamp hosted by Cupoy. Below are tables showing the topic(s) for each task. The tasks were originally designed to be completed daily, but as a student, I have to say that keeping up with daily homework is beyond my capacity. So I work on them whenever I can (mostly during winter and summer vacations), and I also try to review previously completed notebooks/tasks when I'm not working on new assignments.

Part 1: Machine learning (homework_ML)

| Task # | Task | Note |
| --- | --- | --- |
| Day 1 | Python string operation | Changed filename formatting |
| Day 2 | Python string operation | Changed filename formatting |
| Day 3 | Regular expression (Regex) | Changed filename formatting |
| Day 4 | Regular expression (Regex) in Python | |
| Day 5 | Word segmentation - introduction (Markov Model, HMM, Viterbi algorithm) | Added comparison with solution |
| Day 6 | Word segmentation with jieba | |
| Day 7 | Word segmentation with Ckiptagger | |
| Day 8 | N-gram (bigram counts, bigram probability) | |
| Day 9 | N-gram (basic language model / next word prediction using n-gram) | |
| Day 10 | Part-of-speech tagging - introduction | |
| Day 11 | Part-of-speech tagging using jieba | |
| Day 12 | Bag-of-words - introduction | |
| Day 13 | Stemming and lemmatization - introduction | |
| Day 14 | Text preprocessing (regex, text segmentation, stop words & stemming) | |
| Day 15 | Term Frequency - Inverse Document Frequency (TF-IDF) | |
| Day 16 | Word embedding & SVD (Singular value decomposition) | Added comments |
| Day 17 | Word embedding, SVD, KNN, PPMI, TF-IDF & Co-occurrence matrix | |
| Day 18 | Individual research - LDA/PCA/Supervised & unsupervised learning | Working on an individual research project for a mandatory course |
| Day 19 | K-nearest neighbors algorithm practice with sklearn | |
| Day 20 | K-nearest neighbors algorithm practice with sklearn | |
| Day 21 | Naive Bayes - individual research assignment | |
| Day 22 | Naive Bayes (hand-crafted) | Added comments |
| Day 23 | Naive Bayes (with scikit-learn) | |
| Day 24 | Decision tree (Information gain) | 2021/08/28 Added comments |
| Day 25 | Bias-variance tradeoff | |
| Day 26 | Ensemble learning - Blending vs. Stacking | |
| Day 27 | Implementation of random forest and decision tree | |
| Day 28 | Tree-based models using scikit-learn | |
| Day 29 | Final project 1: n-gram based word recommendation system (Part 1) | |
| Day 30 | Final project 1: n-gram based word recommendation system (Part 2); interpolation/back-off smoothing | |
| Day 31 | Final project 2: News classifier (Part 1); POS, BOW, Cosine similarity | |
| Day 32 | Final project 2: News classifier (Part 2); TF-IDF and PCA | |
| Day 33 | Final project 2: News classifier (Part 3); PPMI and SVD | |
| Day 34 | Final project 3: Spam filter (Part 1); Comparison of different classifiers | |
| Day 35 | Final project 3: Spam filter (Part 2); Implementation of filter | |
| Day 36 | Final project 4: Sentiment analysis | |
| Day 37 | Final project 5: Latent sentiment analysis | |
| Day 38 | Final project 6: Trigram application (Article spinner) | Added non-probabilistic replacement and 5-gram |
| Day 39 | Final project 7: Rule-based chatbot (Single round) | |
| Day 40-42 | Final project 8: Rule-based chatbot (Multiple-round); Google Dialogflow and Line Bot integration | |
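
As a quick illustration of the Day 8 topic (bigram counts and bigram probability), here is a minimal sketch in plain Python. It is not the code from the notebook; the function name `bigram_probabilities` and the toy corpus are made up for this example, and it uses the plain maximum-likelihood estimate with no smoothing.

```python
from collections import Counter


def bigram_probabilities(tokens):
    """Estimate P(w2 | w1) as count(w1, w2) / count(w1 as the first word of a bigram)."""
    bigram_counts = Counter(zip(tokens, tokens[1:]))
    # Count each word only where it starts a bigram, so that the
    # probabilities for a given w1 sum to 1.
    context_counts = Counter(tokens[:-1])
    return {
        (w1, w2): count / context_counts[w1]
        for (w1, w2), count in bigram_counts.items()
    }


# Hypothetical toy corpus, already tokenized by whitespace.
tokens = "i like nlp and i like python".split()
probs = bigram_probabilities(tokens)

for (w1, w2), p in sorted(probs.items()):
    print(f"P({w2} | {w1}) = {p:.2f}")
```

Unseen bigrams get probability zero under this estimate, which is the gap that the interpolation/back-off smoothing listed for Day 30 is meant to address.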

Part 2: Deep learning (homework_DL)

| Task # | Task | Note |
| --- | --- | --- |
| Day 1 | Google Colab setup | |
| Day 2 | Tensor operation / PyTorch | |
| Day 3 | PyTorch autograd / differentiation / backpropagation | Added comments |
| Day 4 | PyTorch data loading | Added comments |
| Day 5 | PyTorch data loading | Added comments |
| Day 6 | PyTorch natural language data loading (using torchtext) | Added comments |
| Day 7 | PyTorch neural network model building | Added comments |
| Day 8 | PyTorch - model modification, register_forward/backward_hook, and weight initialization | |
| Day 9 | PyTorch - model building | Added notes on cross-entropy loss |
| Day 10 | PyTorch - model training | |
| Day 11 | (Individual research/reading assignment) Introduction to word2vec | |
| Day 12 | (Individual research/reading assignment) Introduction to CBOW and skip-gram | |
| Day 13 | Implementing CBOW and skip-gram with Python | |
| Day 14 | (Individual research/reading assignment) Introduction to accelerating word2vec | |
| Day 15 | Implementing accelerated word2vec using subsampling/Training a skip-gram model | |
| Day 16 | Introduction to gensim natural language processing toolkit | |
| Day 17 | Using GloVe model with gensim | |
| Day 18 | Introduction to Recurrent Neural Network (RNN) | |
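
To give a feel for the skip-gram tasks (Days 12-15), here is a minimal sketch of how (center, context) training pairs can be generated from a token list. It is not taken from the notebooks; the function name `skipgram_pairs`, the `window` parameter, and the example sentence are assumptions made for illustration, and it covers only pair generation, not model training or subsampling.

```python
def skipgram_pairs(tokens, window=2):
    """Return (center, context) pairs for every token within `window` positions of the center."""
    pairs = []
    for i, center in enumerate(tokens):
        start = max(0, i - window)
        end = min(len(tokens), i + window + 1)
        for j in range(start, end):
            if j != i:  # a word is not its own context
                pairs.append((center, tokens[j]))
    return pairs


# Hypothetical example sentence, already tokenized.
tokens = ["the", "quick", "brown", "fox"]
pairs = skipgram_pairs(tokens, window=1)
print(pairs)
```

CBOW uses the same windows with the roles reversed: the context words together are used to predict the center word.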

Contributors

ludougan123234
