Data scientist with a background in linguistics who combines computational linguistics and data analysis to turn data into clear, well-supported insights. Proficient in Python, SQL, and regex for data handling and analysis. Experienced in machine learning models, data visualization with Tableau and JavaScript, and research coordination.

Highlights of Work on Data Science

  • MADAIN - Mole Analysis with Deep Adam-Optimized Inception Network: We developed a convolutional neural network (CNN) based on the InceptionV3 architecture and trained with the Adam optimizer to classify skin lesions into one of seven categories, prioritizing recall for cancerous classes to minimize false negatives. We benchmarked multiple CNN architectures and optimizers and ran extensive tests that varied the number of epochs, applied custom class-weight schemes, and compared multiclass and binary classifiers. The resulting model is integrated into a web app hosted on GitHub Pages. Our dataset, sourced from Kaggle, contains 10,015 images. To address challenges such as class imbalance, our ongoing work includes fine-tuning through increased neuron density, inverse proportional class weighting, and experimental training on augmented images to improve accuracy and recall, particularly for underrepresented classes.
  • Effects of Climate Variability on Wine Production Metrics: We organized global wine production data and historical temperature records into a single Pandas dataset. Our role involved extensive use of Pandas for data cleaning and integration, which kept the dataset consistent and uniform. We applied statistical tools from the SciPy library to check the data's integrity and accuracy, and we created visualizations with Matplotlib and Seaborn to communicate the findings of our analysis. The project drew on Python and a range of data science tools to examine the effects of climate variability on wine production metrics.
  • Geospatial Visualization of Volcanic Activity: This project involved creating a dynamic, interactive web-based platform to visualize volcanic activity across the globe. A PostgreSQL database was developed to store and manage the cleaned data. The platform uses the interactive mapping capabilities of Folium and Ipyleaflet to present geospatial and seismological data. Interactive elements written in JavaScript let users navigate between the data visualizations, and an MP4 video plays in the background of the page. The visualizations cover the Volcanic Explosivity Index (VEI) as well as the human and economic impacts of volcanic events, making the platform both an educational and an analytical resource for understanding the significance of volcanoes over time.
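
The inverse proportional weighting mentioned for MADAIN can be illustrated with a minimal sketch. It assumes the common "balanced" formula, weight = n_samples / (n_classes * class_count); the source does not state which formula the project actually used. The function name and the toy labels below are hypothetical and do not reflect the real dataset's class distribution.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Return {class: weight}, with weight = n_samples / (n_classes * class_count).

    Rare classes receive larger weights, so their errors count more in the loss.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Toy labels invented for illustration only (assumption, not project data).
toy_labels = ["class_a"] * 6 + ["class_b"] * 3 + ["class_c"] * 1
weights = inverse_frequency_weights(toy_labels)

for cls, weight in sorted(weights.items()):
    print(f"{cls}: {weight:.3f}")
```

With Keras, a dictionary like this could be passed as the `class_weight` argument of `Model.fit` once the labels are mapped to integer class indices.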

Highlights of Work on Linguistics

  • Towards Accounting for L2 Accent - The Case of Turkish Vowel Space: My methodology involved the careful collection, structuring, and analysis of human speech data. I used Praat for formant frequency tracking and analysis, and Audacity for processing and editing the audio files. An Excel database supported data cleaning, tagging, and organizing, giving the complex dataset a consistent structure. Through this process, I aimed to shed light on the nuances of L2 accent and contribute to the broader understanding of language acquisition and phonetic variation. The findings have implications for linguistic theory and for practical applications in language teaching and speech technology.

I am excited about collaborating with fellow data scientists and linguists to tackle challenges and drive innovation. Let's connect on LinkedIn or explore my projects here on GitHub to work together and make an impact in data science and linguistics.

Mustafa Can Ayter's Projects

credit-risk-classification-analysis

A machine learning model using logistic regression to predict loan defaults based on financial data. The purpose of the analysis was to assist in risk assessment and decision-making processes in the lending industry.

crowdfunding-etl

Using Python, Pandas methods and functions, and list comprehensions to extract, transform, and clean data, then using regular expressions to find patterns and extract values from text and string data.
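
The regular-expression extraction described above might look like the following minimal sketch. The record layout, the field names, and the `parse_contact` helper are assumptions made for illustration; the source does not show the project's real data format.

```python
import re

# Hypothetical raw records (assumption): JSON-like strings, one per contact.
raw_contacts = [
    '{"contact_id": 1001, "name": "Ada Example", "email": "ada@example.com"}',
    '{"contact_id": 1002, "name": "Bo Sample", "email": "bo@example.com"}',
    "not a contact record",
]

# Named groups keep the extraction readable and self-documenting.
CONTACT_PATTERN = re.compile(
    r'"contact_id":\s*(?P<contact_id>\d+),\s*'
    r'"name":\s*"(?P<name>[^"]+)",\s*'
    r'"email":\s*"(?P<email>[^"]+)"'
)


def parse_contact(record):
    """Extract one contact as a dict, or return None if the record does not match."""
    match = CONTACT_PATTERN.search(record)
    if match is None:
        return None
    # Split the full name at the first space into first and last name.
    first_name, _, last_name = match["name"].partition(" ")
    return {
        "contact_id": int(match["contact_id"]),
        "first_name": first_name,
        "last_name": last_name,
        "email": match["email"],
    }


# List comprehension: parse every record and drop the ones that failed to match.
rows = [row for row in (parse_contact(r) for r in raw_contacts) if row is not None]
```

A list of dictionaries like `rows` can be passed directly to `pandas.DataFrame` for further cleaning.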

excel-crowdfunding

Organizing and examining a dataset of 1,000 sample crowdfunding projects in Excel to uncover underlying patterns and trends.

food-hygiene-rating-data

Analyzing the UK Food Standards Agency's food hygiene rating data for various establishments across the United Kingdom, using MongoDB and Jupyter Notebook. The goal is to help food magazine editors at Eat Safe, Love decide where to focus future articles by updating the database and performing exploratory analysis.

geospatial-visualization-of-volcanic-activity

This project showcases an interactive web-based visualization of global volcanic activity, highlighting the Volcanic Explosivity Index, casualties, and economic impacts. It features dynamic maps with a dropdown menu and an engaging video background, providing a comprehensive yet immersive educational tool.

javascript-belly-button-biodiversity

Building an interactive dashboard to explore the Belly Button Biodiversity dataset, which catalogs the microbes that colonize human navels. The dataset reveals that a small handful of microbial species (also called operational taxonomic units, or OTUs, in the study) were present in more than 70% of people.

madain

Using TensorFlow and Keras to predict skin cancer from mole images, achieving high accuracy and precision through extensive testing and benchmarking of CNN architectures and optimization techniques.

pandas-school-budget-analysis

Helping the school board and mayor make strategic decisions regarding future school budgets and priorities using Python and Pandas.

python-climate-change-analysis

A comparative analysis of winery production volume and global temperature changes. Is there a correlation between rising temperatures and wine production?
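
One common way to approach the correlation question above is the Pearson correlation coefficient. The sketch below computes it in plain Python; the function name and the numbers are invented for illustration and are not results from the project. The source does not say which statistic the analysis used; `scipy.stats.pearsonr` offers a ready-made equivalent that also returns a p-value.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Made-up toy values (assumption): temperature anomaly and production volume.
temperature_anomaly = [0.1, 0.2, 0.3, 0.4, 0.5]
production_volume = [30.0, 28.0, 26.0, 24.0, 22.0]

r = pearson_r(temperature_anomaly, production_volume)
print(f"Pearson r = {r:.3f}")
```

Because the toy data is perfectly linear and decreasing, `r` comes out at -1 up to floating-point error; real production and temperature series would be far noisier, and a correlation alone would not establish a causal effect.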

sql-employee-database

Designing tables (with entity-relationship diagrams) to hold the data from the CSV files, importing the CSV files into a SQL database, and then answering questions about the data. That is, performing data modeling, data engineering, and data analysis, respectively.

sqlalchemy-trip-planner

Planning a trip and running a feasibility analysis by analyzing the climate of the area using SQLAlchemy ORM queries, Pandas, and Matplotlib.

vba-stock-market-analysis

Analyzing stock market data with an Excel VBA script that loops through the datasets and generates output calculations.

weather-analysis

Using Python requests, APIs, and JSON traversals to answer a fundamental question: "What is the weather like as we approach the equator?"

web-scraping-data-collection

Identifying id and class attributes on websites and using this knowledge to extract information via automated browsing with Splinter and HTML parsing with Beautiful Soup.

wine_and_climate_change

Utilized Pandas for data cleaning and integration to create a high-quality, uniform dataset of global wine production and historical temperature records. Employed SciPy for statistical analysis to ensure data integrity and accuracy. Generated visualizations using Matplotlib and Seaborn to effectively communicate findings.
