
Hi there 👋

I am a Senior Data Scientist with a proven track record of driving innovative data science initiatives, leading the renewal of Machine Learning structures, and implementing MLOps practices at various organizations, with achievements in drug replacement algorithms, Machine Learning models for authorization of medical claims, and NLP. Previously, I played a key role in establishing and leading the Data Science/BI area, overseeing DW creation using Python + Amazon Redshift + Power BI, and implementing BI processes. I have demonstrated expertise in ETL, data visualization, and ML model applications across various sectors. Explore more about my journey at my Profile!

Platforms & Tools

Experienced with Python, R and Java;

Performed data transformations with PySpark, Pandas, Dask, Oracle Data Integration/Data Flow, and AWS Glue;

Built data engineering pipelines in Databricks, Alteryx, and AWS Glue + Athena;

Started a Data Science area inside a software development company, beginning with just me and leaving with a team of 1 BI Developer, 2 Data Engineers, 1 Product Owner and 1 UI/UX Developer, building a Data Warehouse for Power BI tasks and, once the data had matured, a Data Lakehouse structure, and deploying multiple models to production, with great performance for a streaming process with PySpark;

Formulated AI systems using Machine and Deep Learning tasks, such as Recommendation Systems, HealthCare Audit, forecasts at different granularities, pricing elasticity development and analysis, key driver analysis using Machine Learning models and model interpretability, key driver analysis using Structural Equation Modelling, finding similarity between groups of data using clustering, and using LLMs with vector databases (RAG) to deliver faster results from internal processes and documents with LangChain + different LLM models and Multi-Stage Reasoning for automated processes inside a pipeline, among other applications of Data Science, Machine Learning and Deep Learning (including LLMs!);

Delivered models to a model store in Amazon SageMaker, Databricks, Azure Machine Learning Studio and Oracle Data Science / Model Catalog;

Visualized data with Power BI, Tableau, Plotly/Seaborn in Python, and ggplot2 in R;

Followed DevOps and MLOps practices along the way, helping other developers as Tech Lead / in a senior position, leading different projects and delivering Data Science and Machine Learning / AI systems with great quality and adherence to business objectives, and leading discoveries with different companies in different areas.

Studies

Currently studying the intersection of MLOps and LLMOps and how large language models are made: inside the black box of "binarized models" and APIs, how transformers, attention, and other Deep Learning and Feature Engineering structures transform text into numbers, inside matrices, and back into text, images, sounds, videos, code, etc.

Contact

Gabriel Pehls's Projects

demparser

Parser for .dem files downloaded from Dota 2 match replays. The files contain statistical information in addition to the match replay itself.

descr

Descriptive statistics for R

descritor-de-ativos

Project for the AI in Financial Market course from I2A2 - Data H, providing a report-style description of the current snapshot of an asset, either via yfinance (traditional stock exchange) or a crypto asset (via binance and bittrex).

dota2ia-v0.01

Project for the AI course, with two other earlier modules. --under construction--

gp27_techchallenge

Work completed for the first module of the postgraduate program in data science/analytics, with Python and Streamlit.

gp27_techchallenge_3

Tech Challenge of the Postgraduate in Data Analytics, from FIAP, developing a Data Warehouse with data from PNAD-COVID-19, from IBGE, using PySpark and Google BigQuery for ETL, as well as a variable importance analysis using permutation and a random forest model for classifying the condition of COVID-19, to guide the analysis of the data.

gp27_techchallenge_4

Tech Challenge of the Postgraduate in Data Analytics, from FIAP, analyzing Brent Oil price data in comparison with historical, economic and societal data, integrating correlation and causality analyses of items with prices, as well as developing a forecasting model and an importance analysis through information gain from a gradient-boosted tree model (XGBoost).

healthcare

Software quality coursework, in Android (Java), about health information.

i2a2naivebayes

Challenge of applying naive Bayes to a model that buys and sells stocks using PETR4 as the base, plus a risk management model based on the performance of indicators on the asset.

llm-deploy-locally

First steps into the LangChain universe, using Llama 2 as the model and ChromaDB as the vector database, working toward a local deployment of an LLM as a backend API.

mlops_structure

Full MLOps structure for LLM/ML, from an MLOps study, built with Python.

movieapp

App for viewing movies/trailers, just testing features.
