
VASU KUMAR's Projects

advanced-data-visualizations---immigration-data-and-novel

Dataset: Immigration to Canada from 1980 to 2013 - International migration flows to and from selected countries - The 2015 revision, from the United Nations website. The dataset contains annual data on the flows of international migrants as recorded by the countries of destination. The data presents both inflows and outflows according to the place of birth, citizenship, or place of previous / next residence, both for foreigners and nationals. I have focused on the Canadian immigration data. For the word cloud generation, I have also used a text file of the novel Alice in Wonderland.

autonomous-driving---car-detection

This notebook presents AI-based object detection using the YOLO model. Many of the ideas in this notebook are described in the two YOLO papers: Redmon et al., 2016 (https://arxiv.org/abs/1506.02640) and Redmon and Farhadi, 2016 (https://arxiv.org/abs/1612.08242). You will learn to use object detection on a car detection dataset and to deal with bounding boxes.
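Working with bounding boxes in detectors such as YOLO usually involves comparing two boxes by their intersection over union (IoU). The sketch below is a minimal illustration, not code from the notebook; the function name `iou` and the corner format (x1, y1, x2, y2) are assumptions.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2).

    Assumes x1 <= x2 and y1 <= y2 for both boxes (hypothetical format).
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap; zero when the boxes do not intersect.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter

    return inter / union if union > 0 else 0.0
```

Two 2x2 boxes offset by one unit in each direction overlap in a 1x1 square, so `iou((0, 0, 2, 2), (1, 1, 3, 3))` evaluates to 1/7.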

awesome-nlp

:book: A curated list of resources dedicated to Natural Language Processing (NLP)

cheatsheets-ai

Essential Cheat Sheets for deep learning and machine learning researchers https://medium.com/@kailashahirwar/essential-cheat-sheets-for-machine-learning-and-deep-learning-researchers-efb6a8ebd2e5

customer-churn-analysis-for-a-telecommunications-company

I have used a telecommunications dataset for predicting customer churn. This is a historical customer dataset where each row represents one customer. The data is relatively easy to understand, and you may uncover insights that can be used immediately. It is typically less expensive to keep customers than to acquire new ones, so the focus of this analysis is to predict which customers will stay with the company. The dataset provides information that helps predict which behaviors will help the company retain customers. I have focused on analyzing all relevant customer data and developing focused customer retention programs.

data-science-ipython-notebooks

Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.

data-science-portfolio_vasu-kumar

Repository containing a portfolio of data science projects completed by me for academic, self-learning, and hobby purposes, presented in the form of IPython notebooks and R files. Note: data used in the projects (accessed under the data directory) is for demonstration purposes only.

deep-learning-and-lstm-models-for-delivery-date-and-cost-prediction

The objective of the project is to predict delivery date and freight cost based on historic trends and attributes. The notebook showcases the exploratory data analysis, a deep learning model for delivery date prediction, and an LSTM model for freight cost prediction.

drone-flight-data-anlaysis-and-flight-range-prediciton-using-neural-networks

Understanding flight conditions, performance, and efficiency is a vital part of a company’s strategy. Analyzing and studying flight details can help a company direct the use of the right type of product at the right location and conditions at the right time. This report leverages data analysis techniques to study the flight details and launches, to target the right parts, and to perform EDA of flight stat

ensemble-modeling-rare-event-classification

Rare Event Classification (Ensemble Modelling incorporating Random Under-sampling). The data consists of 10,500 credit applications, each classified as good or bad credit. However, there are only 500 bad credit applications. Since this is less than 5% of the data, classifying applicants as bad credit is referred to as a rare event problem. This is also known as anomaly detection in many applications. Approach: The best ratio is discovered by trying ratios from 50:50 to 85:15. An ensemble model is then built on the optimum ratio selected. This is done by creating an ensemble of trees using the optimum ratio, fitting a model to each, making classification probability predictions for each, and then averaging those to get predicted classification probabilities. From that we can calculate the loss totaled over all the trees. The base model is a decision tree with a minimum leaf size of 5 and a minimum split size of 5. The optimum depth for this model is determined by optimizing the F1-score using 10-fold cross-validation.
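The two mechanical steps of this approach, drawing an under-sampled training set at a given ratio and averaging the per-model probabilities, can be sketched as follows. This is a minimal illustration rather than the repository's code: the helper names `undersample` and `average_probabilities` are hypothetical, the rows are placeholder tuples, and fitting the decision trees themselves is left out.

```python
import random


def undersample(majority, minority, majority_share, rng):
    """Keep every minority row and draw majority rows at random so the
    majority makes up roughly `majority_share` of the result
    (0.5 for a 50:50 ratio, 0.85 for 85:15)."""
    n_major = round(len(minority) * majority_share / (1.0 - majority_share))
    n_major = min(n_major, len(majority))
    return rng.sample(majority, n_major) + list(minority)


def average_probabilities(per_model_probs):
    """Average predicted probabilities across models, row by row.

    `per_model_probs` holds one list of probabilities per fitted model."""
    n_models = len(per_model_probs)
    return [sum(column) / n_models for column in zip(*per_model_probs)]


# Placeholder rows matching the counts in the description:
# 10,000 good and 500 bad credit applications.
rng = random.Random(0)
good = [("good", i) for i in range(10000)]
bad = [("bad", i) for i in range(500)]

balanced = undersample(good, bad, 0.5, rng)   # 500 good + 500 bad
skewed = undersample(good, bad, 0.85, rng)    # 2833 good + 500 bad
```

Each ensemble member would be trained on its own call to `undersample`, so the members see different subsets of the good-credit rows while sharing all 500 bad-credit rows.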

face-recognition-for-a-guest-house

The challenge is to build a face recognition and verification based entry system for a house which hosts some people frequently. The notebook presents a face recognition system. The ideas presented here are from FaceNet and DeepFace. Face recognition problems commonly fall into two categories: Face Verification - "is this the claimed person?". For example, at some airports, you can pass through customs by letting a system scan your passport and then verifying that you (the person carrying the passport) are the correct person. A mobile phone that unlocks using your face is also using face verification. This is a 1:1 matching problem. Face Recognition - "who is this person?". For example, the video lecture showed a face recognition video (https://www.youtube.com/watch?v=wr4rx0Spihs) of Baidu employees entering the office without needing to otherwise identify themselves. This is a 1:K matching problem.
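The difference between 1:1 and 1:K matching can be sketched with face embeddings compared by Euclidean distance. This is an illustration under stated assumptions, not the notebook's code: the two-dimensional embeddings, the names in `database`, and the threshold of 0.7 are all hypothetical, and a real system would obtain embeddings from a trained network.

```python
import math


def distance(a, b):
    """Euclidean distance between two embedding vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def verify(embedding, claimed_name, database, threshold=0.7):
    """1:1 matching: is this embedding close enough to the claimed identity?"""
    return distance(embedding, database[claimed_name]) < threshold


def recognize(embedding, database, threshold=0.7):
    """1:K matching: return the closest known identity, or None if no
    stored embedding is within the threshold."""
    name, best = min(
        ((n, distance(embedding, e)) for n, e in database.items()),
        key=lambda pair: pair[1],
    )
    return name if best < threshold else None


# Toy database with made-up embeddings for two hypothetical guests.
database = {"alice": [0.0, 1.0], "bob": [1.0, 0.0]}
```

Verification looks up a single stored embedding, while recognition has to scan all K entries and also decide whether the best match is good enough to accept.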

immigration-data---visual-analytics

Dataset: Immigration to Canada from 1980 to 2013 - International migration flows to and from selected countries - The 2015 revision, from the United Nations website. The dataset contains annual data on the flows of international migrants as recorded by the countries of destination. The data presents both inflows and outflows according to the place of birth, citizenship, or place of previous / next residence, both for foreigners and nationals. The focus is on Canadian immigration data.

improvise-a-jazz-solo-with-a-lstm-network

You would like to create a jazz music piece specially for a friend's birthday. However, you don't know any instruments or music composition. Fortunately, you know deep learning and will solve this problem using an LSTM network. You will train a network to generate novel jazz solos in a style representative of a body of performed work. Process Flow: Apply an LSTM to music generation. Generate jazz music with deep learning.

loan-default-prediction

This dataset is about past loans. It includes details of 346 customers whose loans are already paid off or defaulted. The goal is to understand the important factors affecting the loan status and to build an optimized classification model. ML models used: kNN, SVM, Decision Tree, and Logistic Regression.

machine-learning-based-car-insurance-claim-prediction

It is important for insurance companies to estimate the risk involved in covering a future customer. Predicting the chances of an insurance claim for a future customer allows the company to avoid adding potentially loss-making customers. It also helps in determining appropriate monthly charges for the insured person or object. Through this project I try to build a prediction model for a car insurance company to predict the chances of a loss-making insurance deal based on the details of the car. A major challenge in this project is the number of claims in the training data: only about 0.7% of the data has cars that have claimed insurance. Although this small claim percentage may seem insignificant, covering these claims could cost the company millions of dollars. To address this challenge, I try to implement sampling techniques to increase the percentage of claim data. Language: R. Repository contains: test and training data files, R code, and a detailed project report conveying challenges in the data and the approach to building a final prediction model.

machine-learning-based-supply-chain-demand-forecast-for-a-global-pharmacy-company

The dataset is from a global pharmacy company. It comprises historical sales, product information, and the products which need forecasting. Demand forecasting is required at a quarterly level. Carry-over products are those which have historical data present; new products do not have any historical data.

natural-language-processing-using-nltk-on-ebooks

There are 8 different text files of ebooks which are freely available on http://www.gutenberg.org/ . Steps performed: importing the text files into Python; text parsing and transformation operations such as lower-case conversion, removal of special characters, handling of contraction words, and tokenizing; tagging parts of speech for each term; stemming terms to get their root word; and stop word removal. The project also shows the difference in the outcome when POS tagging, stop word removal, and stemming are not performed.
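Several of the steps listed above can be sketched in plain Python. This is a simplified stand-in, not the project's NLTK code: the stop-word set and the suffix-stripping `crude_stem` function are hypothetical toys (a real pipeline would use a proper stop-word list and stemmer), and contraction handling and POS tagging are omitted.

```python
import re

# Tiny illustrative stop-word set; a real list is much longer.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "was", "it"}

# Suffixes tried in order by the toy stemmer below.
SUFFIXES = ("ing", "ed", "ly", "es", "s")


def crude_stem(token):
    """Strip the first matching suffix, keeping at least three characters.

    This is a toy rule, far cruder than a real stemming algorithm."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Lower-case, drop special characters, tokenize, remove stop words, stem."""
    text = text.lower()
    text = re.sub(r"[^a-z\s]", " ", text)   # replace non-letters with spaces
    tokens = text.split()                    # whitespace tokenization
    kept = [t for t in tokens if t not in STOP_WORDS]
    return [crude_stem(t) for t in kept]
```

For example, `preprocess("The Rabbit was jumping, quickly!")` returns `["rabbit", "jump", "quick"]`.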

prediction-analysis-of-risky-credit-ann-rf-dt-

The original dataset contains 1000 entries with 20 categorical/symbolic attributes prepared by Prof. Hofmann. In this dataset, each entry represents a person who takes credit from a bank. Each person is classified as a good or bad credit risk according to the set of attributes. The objective is to develop a model that correctly identifies the credit risk of a customer for a bank.

recommender-system-for-groceries-contractor

Groceries are a critical input for restaurants, bakeries, breakfast spots, breweries, cafes, and similar businesses that depend on fresh, high-quality groceries. Their quality and timely delivery play a critical role in the serviceability of a vendor. The most important element of the supply chain is the distributor or supplier: location, transportation method, cost, and quality all play a significant role. Moreover, the warehouse location plays a critical role in the development and progress of a contractor or distributor. The project discusses a method for data mapping, visualization, and applying machine learning techniques to identify the optimum location in a neighborhood in Scarborough, Toronto for a groceries contractor.

text-analytics-on-reviews-of-california-cabernet-sauvignon

Nine distinct topic clusters were formed using 13,136 text reviews of wine. The text reviews were parsed (tokenization and POS tagging) and filtered (stop-word removal and stemming) to build the term/document matrix. The matrix was weighted using TF-IDF. Latent Dirichlet allocation (LDA) and SVD-based latent semantic analysis were applied to classify and analyze topic clusters. The region-wise average price and points of wine were then calculated, followed by each region's contribution to each topic cluster.
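TF-IDF weighting of a term/document matrix can be sketched as below. The description does not say which TF-IDF variant the project uses, so this assumes the plain form, relative term frequency times log(N / document frequency); the function name `tf_idf` and the sample tokens are hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Weight each term in each tokenized document.

    `docs` is a list of token lists. Returns one {term: weight} dict per
    document, where weight = (count / doc length) * log(N / doc frequency).
    """
    n_docs = len(docs)
    # Number of documents in which each term appears at least once.
    doc_freq = Counter(term for doc in docs for term in set(doc))

    weighted = []
    for doc in docs:
        counts = Counter(doc)
        weighted.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weighted


# Two made-up, already tokenized reviews.
reviews = [["oak", "cherry", "oak"], ["cherry", "plum"]]
weights = tf_idf(reviews)
```

A term that appears in every review, such as "cherry" here, gets weight zero, while terms concentrated in a few reviews are weighted up, which is what makes the matrix useful for separating topics.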

trigger-word-detection-for-a-laptop-using-deep-learning

In this notebook, we will construct a speech dataset and implement an algorithm for trigger word detection (sometimes also called keyword detection, or wakeword detection). Trigger word detection is the technology that allows devices like Amazon Alexa, Google Home, Apple Siri, and Baidu DuerOS to wake up upon hearing a certain word. For this exercise, our trigger word will be "Activate." Every time it hears you say "activate," it will make a "chiming" sound. By the end of this assignment, you will be able to record a clip of yourself talking, and have the algorithm trigger a chime when it detects you saying "activate."
