Hello! I am a Senior AI & Data Science Engineer with 6 years of experience. As a Senior Data Scientist, Machine Learning Engineer, and MLOps expert, I specialize in Natural Language Processing (NLP) and Large Language Models (LLMs) within Generative AI. I design and implement data-driven models tailored to client needs.
Areas of Passion:
- Advanced NLP techniques including:
  - LLMs
  - QLoRA
  - LoRA
  - RAG (retrieval-augmented generation)
- Currently, I'm immersed in exploring augmented data generation techniques for NLP tasks.
- Got questions about NLP, AI, or machine learning? Don't hesitate to ask!
- Large Language Models (LLMs)
- MLE | MLOps
- Data Analysis
- Natural Language Processing
- Dashboard Realization
- Business Intelligence
- Data Management & Transformation
- Machine Learning
- Deep Learning
- Data Visualization
Whether you're looking to tell a compelling story with your data, develop a real-time dashboard with KPIs for monitoring your company's health, or explore natural language processing solutions, I can assist you with your next venture.
Over the years, I've successfully managed projects worth over €1M across 12+ countries for major clients including the European Parliament, Kering, Atos, Renault Nissan Mitsubishi, Damart, and more. I wear multiple hats as a Data Scientist, Data Analyst, and Project Manager, with both functional and managerial expertise.
- Microsoft Certified: Power BI Data Analyst Associate (PL-300)
- Microsoft Certified: Data Analyst Associate (DA-100)
- Microsoft Certified: Azure Fundamentals (AZ-900)
Location: Paris (October 2020)
Prize: €10,000
- Objective: Development of a prediction/recommendation platform based on AI.
- Description:
- Prediction of the environmental impact of Kering's various activities throughout the supply chain.
- Accurate evaluation of environmental impacts: resource depletion, biodiversity, greenhouse gases.
- Decision support for designers, material researchers, and consumers in their choices to reduce the impact of luxury.
- Automatic creation of predictive models from user-provided data or directly from Kering data and integrated native models.
- Recommendation techniques: collaborative filtering, content-based filtering, and a hybrid method.
- Results: The team won 1st place for the best platform for predicting the environmental footprint of Kering's products.
- References:
Location: Paris (February 2019)
- Objective: Production of a Business Intelligence tool based on voice commands.
- Description:
- Real-time monitoring of the actions of each sales agent on the different business units and real-time feedback.
- Automation of the customer prospecting process by centralizing data.
- Streamlining the Manager - Salesperson - Client relationship through an omnichannel approach based on high-value-added artificial intelligence.
- Results: This project allowed decision-makers to gain insights into their activity and monitor the actions of each salesperson in real time.
- References:
Senior Data Scientist | LLM expert - TOTAL ENERGIES, Paris
Objective: SQL Chatbot for Database Management
- Led the development of an advanced SQL chatbot to enhance database querying and data visualization using NLP and LLMs.
- Architected the SQL chatbot leveraging LangChain and OpenAI's GPT-4, enabling intuitive data visualizations and command translations.
- Enhanced model efficiency & performance using LLMs.
- Designed a full-stack solution hosted on Azure SQL Database, integrating Azure Bot Services and Azure Language Understanding (LUIS) for a dynamic user interface.
- Optimized model performance with hyperparameter tuning.
- Adapted the model to cater to different building types.
- Result: Enhanced model efficiency & performance using advanced NLP techniques & LLMs.
- Technical Stack: Python, Azure, PostgreSQL, LangChain, Streamlit, GitLab, Azure DevOps
IFACI - Expert LLM
Objective: Generative AI Assistant for the Auditing Profession
- Role: Generative AI Assistant for the Auditing Profession
- Developed a generative AI base and an assistant for field agents, enhancing natural language understanding capabilities using spaCy.
- Utilized a combination of Azure, LanceDB, RAG, Vector Store, HNSW, and Hybrid search technologies to optimize performance.
- Result: Improved efficiency and accuracy in the auditing process through the implementation of advanced AI techniques.
- Technical Stack: Python, Azure, Neo4j, Azure DevOps, LanceDB, Chroma, Milvus, MLflow, HNSW, Hybrid search
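The hybrid search mentioned above combines a keyword ranking with a vector-similarity ranking. The profile does not say how the two rankings were merged, so the following is only a minimal sketch that assumes reciprocal rank fusion (RRF), a common choice. The function name, the constant `k=60`, and the document IDs are all hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document earns 1 / (k + rank) from every list it appears in
    (rank starts at 1). The fusion method and k are assumptions, not
    details taken from the project described above.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from a keyword search and a vector (e.g. HNSW) search.
keyword_hits = ["doc_audit_plan", "doc_risk_matrix", "doc_controls"]
vector_hits = ["doc_risk_matrix", "doc_sampling", "doc_audit_plan"]

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)
```

Documents that rank well in both lists rise to the top, while documents found by only one retriever are kept but ranked lower.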
GSF - Senior Data Scientist / MLE Architect
Objective: Predictive Maintenance for Cleaning Services
- Role: Workplace Accident Prediction + Explainability
- Spearheaded the implementation of CI/CD pipelines and developed a system for the evaluation of prediction explainability.
- Used Azure Machine Learning, Azure Data Factory, Azure Pipelines, and Azure DevOps, integrated with Snowflake and Control-M for workflow management.
- Result: Improved workplace safety through accurate accident prediction and enhanced model explainability.
- Technical Stack: Python, Azure, MLflow, Databricks, Pandas, Terraform, TensorFlow, Scikit-learn, PyTest, Docker, Azure Machine Learning, Azure Data Factory, Azure Pipelines, Azure DevOps, Snowflake, Control-M
Expert NLP LLM / Senior Data Scientist - THUASNE
Objective: Email Order System (1K orders/day)
- Created an email order management system and a multimodal model for information extraction employing BERT, Azure, ChatGPT, NLP, LLMs, and Melusine.
- Enhanced the system with Azure Document AI for advanced document processing.
- Improved order processing efficiency & anomaly detection.
- Used explainability tools like LIMETextExplainer, ELI5NLP, SHAP, and AnchorsNLP.
- Optimized system performance with hyperparameter tuning.
- Adapted the system to cater to different orthopedic domains.
- Result: Enhanced system efficiency & performance using advanced NLP techniques, LLMs, and Melusine tool.
- Technical Stack: BERT, Azure, OpenAI, NLP, LLMs, Melusine, Azure Document AI, Scikit-learn, Docker
Senior Data Scientist - ADELAIDE
Objective: Automatic Email Processing (10K emails/day)
- Developed an explainability module for email classification and automatic responses using open-source tools such as Melusine, LIMETextExplainer, ELI5NLP, SHAP, AnchorsNLP, and integrations with Hugging Face and RASA.
- Managed version control and continuous integration using Git and CI/CD practices.
- Result: Streamlined email processing and improved response accuracy through the implementation of advanced NLP techniques and explainability tools.
- Technical Stack: Melusine, LIMETextExplainer, CNN, ELI5NLP, SHAP, AnchorsNLP, Hugging Face, RASA, Git, CI/CD
Expert LLM - ACOSS/URSSAF/CNAF/CNAM
Objective: Documentary AI for Social Fraud Prevention
- Developed a demonstrator for multimodal processing of large data volumes using Transformers, LayoutLM, OCR, NLP, and Topic Modeling to detect fraud.
- Result: Enhanced fraud detection capabilities through the implementation of advanced AI techniques for multimodal data processing.
- Technical Stack: Transformers, OpenCV, PyTorch, CNN, LayoutLM, OCR, NLP, Topic Modeling
Senior Data Scientist - Quantmetry, Paris
Objective: Documentary AI Fraud Demonstrator
- Designed a demonstrator for document processing (insurance invoices).
- Detected fraudulent patterns & document falsification.
- Extracted key invoice fields & verified their consistency.
- Identified potentially suspicious overbilling cases.
- Result: Significant improvement in fraud detection using AI, outperforming traditional OCR techniques.
- Technical Stack: Python, Azure, OCR, OpenCV, PyTorch, CNN, NLP, Machine Learning, Scikit-learn, Docker
Senior Data Scientist - Stellantis, Paris
Objective: Part Forecasting (PFO) - Technical Lead / Technical Expert
- Provided 18-month forecasts to suppliers, mitigating semiconductor crisis impact.
- Developed PFO architecture as a Streamlit web app hosted in Azure.
- Integrated data from Oracle Exadata Database.
- Used Azure Data Factory for file transfer & processing tasks.
- Containerized the PFO app using Docker & deployed to Azure Container Registry.
- Result: Enhanced inventory management & supplier collaboration, improving part prediction accuracy.
- Technical Stack: Streamlit, MLflow, Airflow, Terraform, PyTest, Databricks, Azure, Oracle Exadata Database, Azure Data Factory, Docker, Azure Container Registry
Project Manager / Technical Expert - ATOS, Grenoble
Objective: Travel & Expense Dashboard Atos (€10M+/year)
- Developed a KPI dashboard to monitor Atos' expenses in real time, including the geolocation and carbon footprint of travel.
- Recovered over 10% more VAT and billable expense reports (an annual gain of over €1 million).
- Saved more than 3,214 hours of work per year.
- Result: Automation of weekly reports, significant cost savings, and improved efficiency through real-time expense monitoring and analysis.
- Technical Stack: Pandas, Matplotlib, NumPy, Scikit-learn, Jupyter Notebook, Power BI
Lead Data Scientist - ATOS, Grenoble
Objective: R&D - Expressive TTS System
- Collected and adapted a large corpus of interactive behaviors in English (LJ Speech) and French (M-AILABS).
- Developed and trained an expressive TTS system based on the Tacotron2 model by NVIDIA.
- Implemented a methodology for evaluating the learning quality of the prototype based on the distribution of lengths (number of spectrogram frames) of the predicted clips compared to the originals.
- Prepared a scientific paper: "Linking Utterances via Punctuations for Improved End-to-End Speech Synthesis".
- Captured the variability of speaking styles and emotional states and matched their synthesis to the user profile, improving the predictions of the text-to-speech (TTS) system.
- Result: Improved robustness and accuracy of TTS mel-spectrogram generation, with control over the generation of styles, verbal behaviors, and prosody based on the user.
- Technical Stack: PyTorch, TensorFlow, Python, LSTM, Transformers, Attention Mechanism
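The length-based evaluation described above compares how many spectrogram frames the predicted clips contain against the original recordings. The profile does not name the statistic that was used, so this is only a minimal sketch that assumes a mean length ratio plus a two-sample Kolmogorov-Smirnov statistic. The function names and the frame counts are hypothetical.

```python
from bisect import bisect_right
from statistics import mean


def ks_statistic(sample_a, sample_b):
    """Largest gap between the empirical CDFs of two samples (0 = identical)."""
    a, b = sorted(sample_a), sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    gap = 0.0
    for x in set(a) | set(b):
        cdf_a = bisect_right(a, x) / len(a)  # share of a that is <= x
        cdf_b = bisect_right(b, x) / len(b)  # share of b that is <= x
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


def length_report(original_frames, predicted_frames):
    """Compare frame counts of predicted clips with their paired originals."""
    if len(original_frames) != len(predicted_frames):
        raise ValueError("expected one predicted clip per original clip")
    ratios = [p / o for p, o in zip(predicted_frames, original_frames)]
    return {
        "mean_length_ratio": mean(ratios),
        "ks_statistic": ks_statistic(original_frames, predicted_frames),
    }


# Hypothetical frame counts; the fourth prediction is a runaway clip.
original = [412, 530, 298, 645, 377]
predicted = [405, 541, 310, 1000, 380]

print(length_report(original, predicted))
```

A mean ratio close to 1 and a small KS statistic suggest the model stops decoding at roughly the right point. Clips that are far too long or too short push both numbers away from those values.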
Data Scientist - Renault Nissan Mitsubishi, Paris
Objective: Automotive Industry - Home to Car Next Generation Alliance
- Built prototypes and developed the first generation of voice assistants for the Renault Nissan Mitsubishi alliance.
- Integrated Google Assistant with Nissan cars to receive information from the car and control it remotely from a phone or a Google Home.
- Connected to the authentication servers of the RNM Alliance and complied with cybersecurity specifications.
- Deployed Alexa and Google Actions project environments fully configured and ready to use.
- Documented user journey to configure the service.
- Result: Launched these features with the Nissan Juke at the 2019 Frankfurt Motor Show.
- Technical Stack: Python, Azure, TensorFlow, Keras, Dialogflow, LUIS, Reddit, Alexa Skill, Bot Framework
Data Scientist - ATOS (European Parliament), Grenoble
Objective: Service - SAMBOT, an intelligent conversational agent for room reservation (3,000+ users)
- Developed a multilingual chatbot for room reservation in natural language (text and voice).
- Implemented a recommendation system based on user habits, locations, and room occupancy.
- Paired with Outlook calendars & email systems (Skype).
- Documented functional, technical, and user journey aspects.
- Result: Enabled room reservations in record time, taking user habits into account.
- Technical Stack: Python, Azure, OCR, OpenCV, PyTorch, CNN, NLP, Machine Learning, Scikit-learn, Docker
- Project Management
  - Preparation
  - Planning
  - Management
  - Evaluation
- Monitoring and Control of:
  - Resources
  - Calendar
  - Costs
  - Scope
  - Risk
  - Quality
  - Requirements
  - Value
  - Satisfaction
- Tools: TFS, MS Project, Gantt, PERT