Topic: dpo (Goto Github)
Something interesting about dpo
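Most entries below concern Direct Preference Optimization (DPO), a method for aligning language models on preference pairs without training a separate reward model. As orientation, a minimal sketch of the DPO loss for a single preference pair, assuming the summed log-probabilities of the chosen and rejected responses under the policy and a frozen reference model are already computed (function and argument names are illustrative, not from any repository listed here):

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one preference pair.

    Each argument is the summed log-probability of the chosen or
    rejected response under the policy or the frozen reference model.
    beta scales the implicit reward (log-probability ratio).
    """
    chosen_ratio = policy_chosen_logp - ref_chosen_logp
    rejected_ratio = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_ratio - rejected_ratio)
    # -log(sigmoid(x)), written stably as log(1 + exp(-x))
    return math.log1p(math.exp(-logits))

# With the policy equal to the reference, both ratios vanish
# and the loss is log(2); raising the chosen response's
# log-probability lowers the loss.
baseline = dpo_loss(-10.0, -12.0, -10.0, -12.0)
improved = dpo_loss(-9.0, -12.0, -10.0, -12.0)
```

In practice, libraries such as TRL (used by several repositories below) compute this loss batched over token-level log-probabilities; the sketch only shows the per-pair objective.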
dpo,An open-source framework designed to adapt pre-trained Large Language Models (LLMs), such as Llama, Mistral, and Mixtral, to a wide array of domains and languages.
User: adithya-s-k
dpo,Notus is a collection of fine-tuned LLMs using SFT, DPO, SFT+DPO, and/or any other RLHF techniques, while always keeping a data-first approach
Organization: argilla-io
dpo,SiLLM simplifies the process of training and running Large Language Models (LLMs) on Apple Silicon by leveraging the MLX framework.
User: armbues
dpo,Examples for using the SiLLM framework for training and running Large Language Models (LLMs) on Apple Silicon
User: armbues
dpo,A library with extensible implementations of DPO, KTO, PPO, ORPO, and other human-aware loss functions (HALOs).
Organization: contextualai
Home Page: https://arxiv.org/abs/2402.01306
dpo,Introducing Filtered Direct Preference Optimization (fDPO), which enhances language-model alignment with human preferences by discarding training samples of lower quality than those generated by the learning model
Organization: cyberagentailab
Home Page: https://arxiv.org/abs/2404.13846
dpo,EasyRLHF aims to provide an easy and minimal interface to train aligned language models, using off-the-shelf solutions and datasets
User: daehankim
dpo,Project created during the React v2 immersion course
User: developermiranda
Home Page: https://dpoquiz.vercel.app/
dpo,Unofficial PHP wrapper for Direct Pay Online API
Organization: dipnot
dpo,This is the DPO Group plugin for Magento 2.
Organization: directpay-online
dpo,This is the DPO Group plugin for WHMCS.
Organization: directpay-online
dpo,This is the DPO Group plugin for Gravity Forms.
Organization: dpo-group
dpo,This is the DPO Pay plugin for WooCommerce.
Organization: dpo-group
dpo,Align a Large Language Model (LLM) with DPO loss
User: ducnh279
dpo,Implementation for "Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs"
Organization: dvlab-research
dpo,This repository contains the source code used for finetuning the phi-2 LLM with several techniques, such as DPO.
User: eyess-glitch
dpo,Unofficial Go library for DPO Group
Organization: golang-malawi
Home Page: https://nndi.cloud/oss/go-dpo/
dpo,(MultiDi)Graph Morphism Example Generator
User: itnef
dpo,Towards Robust Alignment of Language Models: Distributionally Robustifying Direct Preference Optimization
User: junkangwu
dpo,Re-usable & scalable RLHF training pipeline with Dagster and Modal.
Organization: kyryl-opens-ml
Home Page: https://kyrylai.com/2024/06/10/rlhf-with-dagster-and-modal/
dpo,CodeUltraFeedback: aligning large language models to coding preferences
User: martin-wey
Home Page: https://arxiv.org/abs/2403.09032
dpo,Use PEFT or Full-parameter to finetune 300+ LLMs or 60+ MLLMs. (Qwen2, GLM4v, Internlm2.5, Yi, Llama3.1, Llava-Video, Internvl2, MiniCPM-V-2.6, Deepseek, Baichuan2, Gemma2, Phi3-Vision, ...)
Organization: modelscope
Home Page: https://swift.readthedocs.io/zh-cn/latest/LLM/index.html
dpo,Privacy Mapping Open Source Software
User: nycyberlawyer
Home Page: https://sites.google.com/mydpo.us/mydpous/home
dpo,This repository contains Jupyter Notebooks, scripts, and datasets used in our finetuning experiments. The project focuses on Direct Preference Optimization (DPO), a method that simplifies the traditional finetuning process by using the model itself as a feedback mechanism.
User: omarmnfy
dpo,Awesome tools and information for Data Protection Officers - GDPR professionals
User: pforret
dpo,Align Anything: Training All Modality Model with Feedback
Organization: pku-alignment
dpo,DPO Group Payment gateway PHP SDK
Organization: razor-informatics
Home Page: https://razorinformatics.co.ke
dpo,Various training, inference, and validation code and results related to open LLMs that were pretrained (fully or partially) on the Dutch language.
User: robinsmits
dpo,Step-aware Preference Optimization: Aligning Preference with Denoising Performance at Each Step
User: rockeycoss
Home Page: https://arxiv.org/abs/2406.04314
dpo,Direct Preference Optimization of GPT-2 using the TRL library
User: sharathhebbar
dpo,MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training Pipeline. Trains medical large language models, implementing incremental pre-training (PT), supervised fine-tuning (SFT), RLHF, DPO, and ORPO.
User: shibing624
dpo,Fine-tune large language models with the DPO algorithm; simple and easy to get started with.
User: sugarandgugu
dpo,[ACL 2024] Self-Training with Direct Preference Optimization Improves Chain-of-Thought Reasoning
User: tianduowang
Home Page: https://arxiv.org/abs/2407.18248
dpo,A Deep Learning NLP repository using TensorFlow, covering everything from text preprocessing to downstream tasks for recent models such as Topic Models, BERT, GPT, and LLMs.
User: ukairia777
dpo,Data and models for the paper "Configurable Safety Tuning of Language Models with Synthetic Preference Data"
User: vicgalle
Home Page: https://arxiv.org/abs/2404.00495
dpo,A Laravel package to simplify using the DPO Payment API in your application. https://dpogroup.com
Organization: zepson-tech
Home Page: https://zepsonhost.com/