
Hello (Hej), I'm (jag heter) Anthony Kwok!


📩 Contact Me

     

🤪 Introduction

🔆 I am a passionate Data Analyst / Data Scientist 📈 📉 who has worked in an international luxury retail group and an equine hospital 🏢.

🔆 I enjoy helping companies discover and unlock the potential of their data with data science tools and algorithms.

🔆 I am self-motivated and eager to learn new technologies and concepts. Recently, I have been learning Scala and DBT.

📝 Experience in ML/DL

  • Sales and traffic forecasting
  • Data anomaly detection
  • Stock price prediction with algo-trading
  • Mobile price range classification
  • Recommendation system
  • Social network analysis

📝 Experience in BI Development & Automation

  • Sales performance dashboard
  • Network traffic monitoring dashboard
  • Drugs and vaccination usage dashboard
  • Injuries and rehabilitation dashboard

⌨️ Programming

  • Python: NumPy, Pandas, SciPy, multiprocessing
  • R: ggplot2, tidyr
  • SQL
  • Scala
  • Java
  • C++

🛠️ Tools with Experience

Visualization Tools 👀

  • Software: Power BI, Tableau, Snowflake
  • Library: Plotly, Seaborn, Matplotlib

Machine Learning Tools 🔅

  • Frameworks: statsmodels, scikit-learn
  • Regressions: Linear, Logistic, Lasso, Ridge
  • Boosting and Trees: XGBoost, CatBoost, AdaBoost, Decision Tree, Random Forest
  • Classification, Clustering and Outlier Detection: SVM, K-Means, LOF, DBSCAN
  • Time Series: Prophet, DeepAR, SARIMAX

Deep Learning Tools 🔆

  • Frameworks: PyTorch & Keras
  • Models: CNN, RNN, LSTM, Transformer
  • Large Language Model: Hugging Face

AutoML Tools 💨

  • H2O.ai

Data Pipeline Tools 🚰

  • DBT

Continuous Integration / Continuous Delivery (CI/CD) Tools ♻️

  • Git
  • Docker
  • Airflow

Cloud Computing Tools ☁️

  • Platforms: Google Cloud Platform, Amazon Web Services

Big Data Tools 🆙

  • Frameworks: Spark, Hadoop
  • Python API: PySpark, Dask

🧠 Skills

Statistics 📊

  • Regression Analysis
  • Correlation Analysis
  • Statistical Analysis

Product Development 📤

  • Agile Methodology & Kanban
  • Scrum & Sprint
  • JIRA
  • Confluence
  • Streamlit
  • Flask

Data Wrangling 📝

  • Data Cleaning
  • Exploratory Data Analysis
  • Feature Engineering
  • Feature Extraction

LLM Skills 🔧

  • Prompt Engineering
    • Zero-shot Inference
    • One-shot Inference
    • Few-shot Inference
  • Fine-Tuning
    • Instruction Fine-Tuning
    • LoRA
    • Soft Prompt
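
For readers unfamiliar with the prompting terms above, here is a minimal sketch of how zero-shot, one-shot and few-shot prompts differ: only the number of worked examples placed before the query changes. The `build_prompt` helper, the task wording and the sample reviews are all hypothetical and made up for illustration; they are not taken from any project listed here.

```python
def build_prompt(task, examples, query):
    """Assemble an in-context prompt; len(examples) decides zero/one/few-shot."""
    lines = [task, ""]
    for text, label in examples:
        # Each worked example shows the model the expected input/output format.
        lines += [f"Review: {text}", f"Sentiment: {label}", ""]
    # The query is left unanswered so the model completes the label.
    lines += [f"Review: {query}", "Sentiment:"]
    return "\n".join(lines)


# Hypothetical task and examples, for illustration only.
TASK = "Classify the sentiment of each review as positive or negative."
EXAMPLES = [
    ("The noodles were fantastic.", "positive"),
    ("Cold food and slow service.", "negative"),
]
QUERY = "Great value for money."

zero_shot = build_prompt(TASK, [], QUERY)           # no examples
one_shot = build_prompt(TASK, EXAMPLES[:1], QUERY)  # one example
few_shot = build_prompt(TASK, EXAMPLES, QUERY)      # several examples

print(few_shot)
```

The resulting string can be sent to any text-generation model; which model or client to use is left open here.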

📂 Data Science Projects

10/2023 - 12/2023
  • Utilized ensemble learning and deep learning to predict the price range of a used car in the North American market.
  • Aimed to provide a data-driven price prediction for buyers and sellers to improve market efficiency. Try it out now! Project-Demo
08/2023 - 10/2023
  • Applied Logistic Regression, gradient boosting, random forest, KNN and Naive Bayes to predict mobile phone price ranges.
  • Predicted the price range of mobile phones from their functionality and hardware components.
  • Used a dataset from Kaggle and achieved a weighted F1 score of 93.8% with the baseline model (Logistic Regression).
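
As a reminder of what the weighted F1 score above measures, here is a minimal sketch that computes it by hand: the F1 of each class is averaged using the class's share of the true labels as its weight. The `weighted_f1` helper and the labels are made up for illustration; they are not the project's code or data.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Per-class F1 averaged with class support (true-label counts) as weights."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for label, n_true in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        n_pred = sum(1 for p in y_pred if p == label)
        precision = tp / n_pred if n_pred else 0.0
        recall = tp / n_true
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        score += f1 * n_true / total
    return score


# Made-up labels for four price ranges (0 = low ... 3 = very high).
y_true = [0, 0, 1, 1, 2, 2, 3, 3]
y_pred = [0, 0, 1, 2, 2, 2, 3, 3]
print(round(weighted_f1(y_true, y_pred), 3))
```

In practice, `sklearn.metrics.f1_score(y_true, y_pred, average="weighted")` gives the same quantity.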

🏆 Predictive Analysis of West Nile Virus in Chicago

09/2023 - 10/2023
  • Applied linear and logistic regression to predict the presence of West Nile Virus.
  • Performed exploratory data analysis and data cleaning before modelling.

🏆 Personalised Algo-Trading on US stock market

04/2021 - 06/2022
  • Applied SVC, XGBoost, CatBoost, Prophet and CNN-LSTM to predict stock prices.
  • Made trading decisions based on the model predictions and each user's risk classification.
  • The model outperformed a buy-and-hold strategy.

🏆 Social Network Analysis

03/2021 - 06/2021
  • Performed network analysis to identify the key opinion leaders in the network.
  • Applied a random walk generator to extract information from local and global networks.
  • Applied DeepWalk and Node2Vec for the embedding stage.
  • The AUC-ROC score reached 0.9323.
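
To illustrate the random walk step above, here is a minimal sketch of a uniform random walk generator in the style of DeepWalk: each walk starts at a node and repeatedly hops to a randomly chosen neighbour. The toy graph, the function names and the parameters are assumptions made for illustration, not the project's code. Node2Vec uses biased walks instead, which this sketch does not cover.

```python
import random


def random_walk(graph, start, length, rng):
    """Uniform random walk of up to `length` nodes starting at `start`."""
    walk = [start]
    while len(walk) < length:
        neighbors = graph[walk[-1]]
        if not neighbors:  # dead end: stop the walk early
            break
        walk.append(rng.choice(neighbors))
    return walk


def generate_walks(graph, walks_per_node, length, seed=0):
    """Start `walks_per_node` walks from every node, in shuffled order."""
    rng = random.Random(seed)
    walks = []
    for _ in range(walks_per_node):
        nodes = list(graph)
        rng.shuffle(nodes)
        for node in nodes:
            walks.append(random_walk(graph, node, length, rng))
    return walks


# Toy undirected graph as an adjacency list (hypothetical data).
GRAPH = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"], "d": ["c"]}
walks = generate_walks(GRAPH, walks_per_node=2, length=5)
print(walks[0])
```

The walks are then treated like sentences and passed to a skip-gram model to learn one embedding per node.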

🏆 Recommendation System with Neural Collaborative Filtering

02/2021 - 05/2021
  • Applied the Neural Collaborative Filtering (NCF) model in the recommendation system to predict the user's rating (1-6).
  • Used Wide & Deep Learning model for prediction.
  • RMSE dropped to 0.99.

🏆 Sentiment Analysis on Restaurant Review

01/2021 - 04/2021
  • Applied MLP, Flair, CNN and BERT with the PyTorch framework to predict the score of restaurant reviews.
  • Utilized various techniques such as tokenization, stopword removal, stemming and word embedding with Word2Vec & GloVe.

📂 Data Analysis Projects

🏆 Data Analysis & Visualization on Air Traffic Data

09/2023
  • Performed data analysis on air traffic data with Tableau.
  • Analysed the performance of major US airlines such as Delta Air Lines, American Airlines and Southwest Airlines.

🏆 Data Analysis & Visualization on Kickstarter Campaign Data

08/2023
  • Provided data-driven recommendations based on the past 10 years of Kickstarter campaign data.
  • Created visualizations to support the business insights and recommendations.

💬 Languages

                 

💭 Hobbies & Interests

Sports

  • Volleyball 🏐
  • Badminton 🏸
  • Bowling 🎳

Strategic Games

  • Texas Poker 🃏
  • Mahjong 🀄

Anthony Kwok's Projects

scala-learning

Self-learning Scala with https://www.handsonscala.com/index.html
