This repository contains the code and documentation for a course project in STATS C263 at UCLA, created by Team C: Youngzie (Zoe) Lee, Jun Yu Chen, Andrea Kang, and Shuhao Fu.
This project uses Large Language Models (LLMs) to generate synthetic data, with the goal of improving the prediction of depression scores. We create synthetic DAIC datasets with two models, Azure's OpenAI GPT API and a fine-tuned LLaMA 3 model, and use them to enhance prediction accuracy.
The repository is organized into three branches, each containing the code for a different component of the project:
- Depression score prediction using the DAIC-WOZ dataset.
- Synthetic DAIC dataset generation using Azure's OpenAI GPT API.
- LLaMA 3 fine-tuning and synthetic DAIC dataset generation.
- Clone the repository:
  `git clone https://github.com/fushuhao6/daic_depression_detection.git`
- Check out the desired branch:
  `git checkout [branch-name]`
- Follow the instructions in the branch-specific README file to set up and run the code. The dataset is not included in this repository because it is not publicly available at this time.
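
The clone and checkout steps above can be combined into a small helper. This is only a sketch: the function name `clone_and_checkout` and the default target directory are assumptions made for illustration and are not part of the repository. The branch name is left as an argument because it depends on which component you want.

```shell
#!/usr/bin/env bash
# Sketch: clone a repository, list its remote branches, and check one out.
# clone_and_checkout is a hypothetical helper, not something the repo provides.

REPO_URL="https://github.com/fushuhao6/daic_depression_detection.git"

clone_and_checkout() {
    # $1: repository URL
    # $2: branch to check out (pass the actual branch name)
    # $3: optional target directory (assumed default below)
    target_dir="${3:-daic_depression_detection}"

    git clone "$1" "$target_dir" || return 1

    # Print the remote branches so the available names can be verified.
    git -C "$target_dir" --no-pager branch -r

    # Creates a local tracking branch if only origin/<branch> exists.
    git -C "$target_dir" checkout "$2"
}
```

After sourcing the script, a call such as `clone_and_checkout "$REPO_URL" [branch-name]` clones the repository and switches to the chosen branch. The printed branch list is a convenient way to confirm the exact branch names before following a branch-specific README.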
- Youngzie (Zoe) Lee
- Jun Yu Chen
- Andrea Kang
- Shuhao Fu