Hi there, I'm Samuel Ntsua

Analyst at UNC-Chapel Hill

I am enthusiastic about data, and I use it to build reliable, rational arguments that turn business rules and concepts, often delivered as ambiguous or incomplete instructions, into working program logic. In my current job, I harvest, move, transform, and store data, and I automate the process. I write scripts in Bash, PowerShell, Python, SQL, and Stata to build multi-panel and hierarchical datasets from administrative data and survey sampling data. I am seeking a mid-career role on a data team as a Data Scientist, Data Engineer, or Machine Learning Engineer, where I can propel the team's efforts and challenge myself in a production environment.

Skills

Deep Learning | R (Programming Language) | A/B Testing | MySQL | PostgreSQL | Python (Programming Language) | Amazon Web Services (AWS) | Amazon DynamoDB | Amazon S3 | Amazon API Gateway | Stata | SAS Programming | Linux/Bash/SSH | Rsync | Globus | Hadoop | Apache Spark | Git | Tableau Desktop | Anaconda (Jupyter, Spyder, Pandas, R-RStudio, dplyr) | Machine Learning | Big Data Analytics | Data Analysis | Linear Regression | Data Collection | Statistical Modeling | Microeconometrics

Connect With Me

github linkedin

Stock Exchange Data Analysis using Big Data tools such as Hadoop, Hive, and Sqoop

Objectives

  • Use Hive and Sqoop features for data engineering and analysis, and share the resulting actionable insights.

Technology/Techniques Used

  • python3 mysql hiveQL hue-api hadoop-hdfs sqoop-import
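
As a rough illustration of the Sqoop import step named above, the sketch below assembles a `sqoop import` command that copies one MySQL table into Hive. The host, database, user, table, and file names are hypothetical placeholders, not values from this project. Only the command line is built here, so running it still requires a working Sqoop and Hive installation.

```python
import shlex


def build_sqoop_import(jdbc_url, username, password_file, table, hive_table, mappers=1):
    """Return the argument list for a Sqoop import of one MySQL table into Hive."""
    return [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--username", username,
        "--password-file", password_file,
        "--table", table,
        "--hive-import",
        "--hive-table", hive_table,
        "--num-mappers", str(mappers),
    ]


# All values below are placeholders (assumptions), not the project's settings.
args = build_sqoop_import(
    jdbc_url="jdbc:mysql://localhost:3306/stocks",
    username="analyst",
    password_file="/user/analyst/.mysql.pw",
    table="daily_prices",
    hive_table="stocks.daily_prices",
    mappers=4,
)

# Print the command in a shell-safe form for review before running it.
print(shlex.join(args))
```

Keeping the command as a list means it can be passed directly to `subprocess.run(args, check=True)` without shell quoting issues.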

DataScience_Capstone_Project

Objectives

  • Predict whether or not a patient has diabetes, based on certain diagnostic measurements included in the dataset.
  • Build a model that accurately predicts whether the patients in the dataset have diabetes.

Technology/Techniques Used

  • Pandas NumPy machine-learning-algorithms scikit-learn xgboost missing-values analysis dimensionality reduction seaborn-plots extratrees GitLab
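
The missing-values analysis listed above could look roughly like the following sketch. It is not the project's code: the project lists Pandas, while this version uses only the Python standard library so that it runs anywhere. The column names (`glucose`, `bmi`) and the records are made up for illustration, and median imputation is one common choice among several.

```python
from statistics import median


def missing_share(rows, columns):
    """Fraction of rows in which each column is missing (None)."""
    total = len(rows)
    return {c: sum(r[c] is None for r in rows) / total for c in columns}


def impute_median(rows, columns):
    """Return copies of the rows with each None replaced by the column median."""
    medians = {
        c: median(r[c] for r in rows if r[c] is not None) for c in columns
    }
    return [
        {c: (medians[c] if r[c] is None else r[c]) for c in columns} for r in rows
    ]


# Made-up records for illustration only; not taken from the project's dataset.
rows = [
    {"glucose": 148, "bmi": 33.6},
    {"glucose": None, "bmi": 26.6},
    {"glucose": 183, "bmi": None},
    {"glucose": 89, "bmi": 28.1},
]
columns = ["glucose", "bmi"]

shares = missing_share(rows, columns)   # how much of each column is missing
filled = impute_median(rows, columns)   # new rows; the originals are untouched
print(shares)
print(filled[1]["glucose"], filled[2]["bmi"])
```

Reporting the missing share before imputing makes it easier to judge whether a column is worth keeping at all.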

Mercedes-Benz Greener Manufacturing

Objectives

  • Use XGBoost to narrow down the features while still getting a good prediction of the vehicle safety standard, thereby reducing the time a Mercedes-Benz vehicle spends on the test bench.

Technology/Techniques Used

  • Pandas NumPy machine-learning-algorithms scikit-learn xgboost label encoder dimensionality reduction seaborn-plots GitLab

Data Science with R Programming

Objectives

  • To record patient statistics, the agency wants to find the age category that visits the hospital most often and has the maximum expenditure.
  • To rank diagnoses and treatments by severity and to identify the expensive treatments, the agency wants to find the diagnosis-related group that has the maximum hospitalization and expenditure.
  • To make sure that there is no malpractice, the agency needs to analyze whether the race of the patient is related to the hospitalization costs.
  • To allocate resources properly, the agency has to analyze the severity of the hospital costs by age and gender. Since the length of stay is the crucial factor for inpatients, the agency also wants to find out whether the length of stay can be predicted from age, gender, and race.
  • To complete the analysis, the agency wants to find the variable that mainly affects the hospital costs.

Technology/Techniques Used

  • r-programming-language/rstudio supervised learning linear regression GitLab

DataScience_with_Python

Objectives

Technology/Techniques Used

  • Pandas NumPy supervised learning linear regression scikit-learn xgboost seaborn-plots GitLab

Tableau_project

Objectives

  • Compute and display a country's economic growth indicator as well as the percentage of its population who purchased life insurance.

Technology/Techniques Used

  • Tableau public growth-kpi linear-trend kpi-dashboard data merge statistical measures computation
