
Project Overview

This document lists and describes various GitHub projects and pages which are relevant to my work. This is a living document and will be updated regularly to highlight new ideas and directions.

The document is divided into four main sections: Public Projects; Pre-Release (Private) Projects; Projects Based on DeepLearning.AI, OpenAI, Coursera, etc.; and Projects Inspired by Others.

Public Projects

These projects are completed and publicly available for use or contribution.

  1. Project Delta: Placeholder for a publicly available project.

Pre-Release (Private) Projects

These are projects currently under development or in a pre-release stage. They are not yet publicly available but are significant to the overall development roadmap.

  1. FigLang 2024 Euphemisms: Work associated with the FigLang 2024 Shared Task on Euphemisms.

  2. Project Beta: This project is related to ...

Projects Based on DeepLearning.AI, OpenAI, Coursera, etc.

  1. Gen-AI-for-everyone: Generative AI for Everyone
  2. Build-Eval-AdvRAG
  3. xxx: Prompt Engineering / Fine Tuning LLMs.
  4. Align-LLM-DPO: Align LLMs with Direct Preference Optimization (DPO).
  5. Knowledge-Graphs-for-RAG: Knowledge Graphs for RAG.
  6. Microsoft AutoGen: An open-source framework that allows developers to build LLM applications via multiple agents that can converse with each other to accomplish tasks.

Projects Inspired by Others

This section acknowledges projects and ideas that have inspired me. They may be in various states (some may not be working) and are listed here mainly for reference.

  1. flairNLP: FlairNLP for named entity recognition (NER)
  2. ollama-voice-mac: Offline voice assistant using Mistral 7B via Ollama and Whisper speech recognition models
  3. Automatic Prompt Engineer: This repo contains code for "Large Language Models Are Human-Level Prompt Engineers"
  4. Stanford-DSPy: DSPy: Programming with Foundation Models
  5. Chroma-core: Chroma, the open-source embedding database
  6. 584-final: Sentence Embeddings using Supervised Contrastive Learning. Danqi Liao.
  7. ACLPUB: The official tool for creating proceedings for conferences of the Association for Computational Linguistics (ACL).
  8. annotated-transformer: http://nlp.seas.harvard.edu/2018/04/03/attention.html
  9. BERTopic: Leveraging BERT and c-TF-IDF to create easily interpretable topics.
  10. BERT_basic: BERT repository to demonstrate basic functionality
  11. Contrastive-Tension: State of the art Semantic Sentence Embeddings
  12. COVID-19: Novel Coronavirus (COVID-19) Cases, provided by JHU CSSE
  13. diseaseBERT: Code and dataset of EMNLP 2020 paper "Infusing Disease Knowledge into BERT for Health Question Answering, Medical Inference and Disease Name Recognition"
  14. FastChat: The release repo for "Vicuna: An Open Chatbot Impressing GPT-4"
  15. Fine-Tuning-BERT: Example BERT fine-tuned to perform spam classification
  16. huggingface_hub: All the open source things related to the Hugging Face Hub.
  17. introduction_to_ml_with_python: Notebooks and code for the book "Introduction to Machine Learning with Python"
  18. ISHate: This repository contains the dataset and implementation details of the paper "An In-depth Analysis of Implicit and Subtle Hate Speech Messages" accepted at EACL 2023.
  19. KPA_2021_shared_task: Shared task hosted by IBM in the ArgMining workshop in EMNLP
  20. langchain: ⚡ Building applications with LLMs through composability ⚡
  21. LeafNATS: Learning Framework for Neural Abstractive Text Summarization
  22. llama: Inference code for LLaMA models
  23. medium_articles: Scripts/Notebooks used for articles published regarding Time series and asset allocation as reference for a data science class
  24. NATS: Neural Abstractive Text Summarization with Sequence-to-Sequence Models
  25. nlp-with-transformers: Jupyter notebooks for the Natural Language Processing with Transformers book
  26. PythonClass: Looks to be stale
  27. Reddit-Data-Mining: How to extract and analyse different parts of reddit threads and comments
  28. redditDataExtractor: The reddit Data Extractor is a cross-platform GUI tool for downloading almost any content posted to reddit. Downloads from specific users, specific subreddits, users by subreddit, and with filters on the content is supported. Some intelligence is built in to attempt to avoid downloading duplicate external content.
  29. rogue-dimensions: replication code for EMNLP 2021 paper
  30. sent-summary: Looks to be stale
  31. sentence-transformers: Multilingual Sentence & Image Embeddings with BERT
  32. SimCSE: EMNLP'2021: SimCSE: Simple Contrastive Learning of Sentence Embeddings
  33. stl-scraper: Scrape short-term listings providers (Airbnb)
  34. tensor2tensor: Library of deep learning models and datasets designed to make deep learning more accessible and accelerate ML research.
  35. text-summarization-tensorflow: Tensorflow seq2seq Implementation of Text Summarization.

v1.0

Todd Firsich's Projects

584-final

Sentence Embeddings using Supervised Contrastive Learning. Danqi Liao.

aclpub

The official tool for creating proceedings for conferences of the Association for Computational Linguistics (ACL).

bert_basic

BERT repository to demonstrate basic functionality

bertopic

Leveraging BERT and c-TF-IDF to create easily interpretable topics.

covid-19

Novel Coronavirus (COVID-19) Cases, provided by JHU CSSE

diseasebert

Code and dataset of EMNLP 2020 paper "Infusing Disease Knowledge into BERT for Health Question Answering, Medical Inference and Disease Name Recognition"

fastchat

The release repo for "Vicuna: An Open Chatbot Impressing GPT-4"

flairnlp

A very simple framework for state-of-the-art Natural Language Processing (NLP)

gstat-r

Spatial and spatio-temporal geostatistical modelling, prediction and simulation

ishate

This repository contains the dataset and implementation details of the paper "An In-depth Analysis of Implicit and Subtle Hate Speech Messages" accepted at EACL 2023.

langchain

⚡ Building applications with LLMs through composability ⚡

leafnats

Learning Framework for Neural Abstractive Text Summarization

llama

Inference code for LLaMA models

llama_index

LlamaIndex is a data framework for your LLM applications
