Thomas George Thomas's Projects
Clustering Neighborhoods of Paris and London using Machine learning. IBM Data Science Capstone
Predicting the Energy consumed by appliances using Machine Learning algorithms built from scratch
Materials and resources for earning the AWS Certified Solutions Architect Associate 2020 certification
Exploring data on patients readmitted with diabetes and drawing meaningful insights
End-to-end deployment of e-commerce customer segmentation using clustering machine learning algorithms on Google Cloud Platform with MLOps tools
Front end work on Education Data Set
Macro to remove empty rows based on the first column in an Excel document using Visual Basic (making a colleague's life easy)
Analyzing the FBI's uniform reporting of major crimes in every US state, visualized on a Tableau dashboard. A data mining hackathon.
Wanna know which languages and execution engines are the quickest or the slowest at processing files? Well, here's your answer. Data analysis and comparison of the time taken to compute word counts in various languages and execution engines for files of different sizes.
Predicting the cost of treatment and insurance using Machine Learning
HackerRank functional programming solutions in Scala.
Solutions to all my HackerRank Python challenges
Solutions to all SQL HackerRank challenges using the MySQL environment
Complete solutions and related tutorials for the Linux Shell challenges on HackerRank: Bash, text processing, arrays in Bash, and grep, sed, and awk
A highly customizable, mobile-first Hugo template for a personal portfolio and blog.
Learning materials, quizzes, and assignment solutions for the entire IBM Data Science professional certification. Also included are a few resources that I found helpful.
Practicing popular Java Programs
Simple custom producer and consumer group demonstrating message passing in Kafka
Stream real-time tweets about current affairs such as COVID-19 into Elasticsearch using a Kafka 2.0.0 high-throughput producer and consumer with safe, idempotent, and compression configurations. Aggregate the data and use it for further analytics.
Data cleaning, pre-processing, and Analytics on a million movies using Spark and Scala.
Looking at 120 years of Olympic history, discovering interesting trends and patterns, and visualizing the findings using R.
Predicting hit songs on Spotify by classifying 40,000 songs using Machine Learning
Building a content-based recommendation engine API for lovers of retro (1900s) movies using NLP, Flask, and Heroku.
Taking a look at data from 1.6 million Twitter users and drawing useful insights while exploring interesting patterns, visualized with concise plots. The techniques used include text mining, sentiment analysis, probability, time series analysis, and hierarchical clustering on text/words using R.
Generating an analytics dashboard based on YouTube videos
README for my GitHub profile
My very own space on the Internet! Portfolio and Personal Website. https://thomasgeorgethomas.com
Streaming and ingesting tweets into a Hive data lake using Flume.