tianbingsz

followers: 25 following: 0 repos: 52 gists: 0

Name: Tianbing Xu

Type: User

Bio: Researcher: Large Language Models, Reinforcement Learning, Deep Learning. Software Engineer: proficient in C++, Java, Python, and Go.

Location: California

Blog: https://sites.google.com/view/tianbing

Tianbing Xu's Projects

alignment-handbook

Robust recipes to align language models with human and AI preferences

alpaca_farm

A simulation framework for RLHF and alternatives. Develop your RLHF method without collecting human data.

baselines

OpenAI Baselines: high-quality implementations of reinforcement learning algorithms

codingpractise

C++ coding exercises

d3po

[CVPR 2024] Code for the paper "Using Human Feedback to Fine-tune Diffusion Models without Any Reward Model"

dart

deep-learning-pytorch-huggingface

epg

Open-sourced code for Evolved Policy Gradients

gpt-2

Code for the paper "Language Models are Unsupervised Multitask Learners"

gpt-fast

Simple and efficient pytorch-native transformer text generation in <1000 LOC of python.

halos

A library with extensible implementations of DPO, KTO, PPO, and other human-centered loss functions (HALOs).

install-guides

Various installation guides for Large Language Models

lit-llama

Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, Int8 and GPTQ 4bit quantization, LoRA and LLaMA-Adapter fine-tuning, pre-training. Apache 2.0-licensed.

llama-adapter

[ICLR 2024] Fine-tuning LLaMA to follow Instructions within 1 Hour and 1.2M Parameters

llama3

The official Meta Llama 3 GitHub site

llama3-from-scratch

llama3 implementation one matrix multiplication at a time

llava

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

llm

Sharing LLM basic ideas and code

llm-course

Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.

llm.c

LLM training in simple, raw C/CUDA

llm101n

LLM101n: Let's build a Storyteller

magpie

Official repository for "Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing". Your efficient and high-quality synthetic data generation pipeline!