This project finetunes a lightweight Gemma-2B model on a custom dataset for humor detection. The project involves several steps:
The first step is to import the necessary libraries and load the dataset. The dataset is split into training and testing sets, and the training set is converted into a Dataset object from the datasets library.
The Gemma-2B tokenizer and model are loaded using the AutoTokenizer and AutoModelForCausalLM classes from the transformers library.
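This loading step might look as follows; the checkpoint name is an assumption (the Gemma repos on the Hugging Face Hub are gated, so an authenticated login is required before downloading):

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

MODEL_ID = "google/gemma-2b"  # assumed checkpoint name; gated repo, login required

def load_gemma_2b():
    # device_map="auto" (needs the accelerate package) places the weights
    # on a GPU when one is available, otherwise on CPU
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")
    return tokenizer, model
```

The actual download happens only when load_gemma_2b is called, so the rest of the pipeline can be prepared offline.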
The baseline model is then tested on the test set: it is prompted to classify the provided text as humorous or not. The function classify_humor_2b takes a text as input and returns whether the text contains humor.
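A sketch of what classify_humor_2b could look like; the prompt wording and the five-token generation budget are assumptions, since the source does not give the exact template:

```python
import torch

def build_prompt(text):
    # Prompt wording is an assumption; the project's exact template may differ
    return (
        "Decide whether the following text is humorous. "
        "Answer with exactly 'humor' or 'no humor'.\n"
        f"Text: {text}\nAnswer:"
    )

def classify_humor_2b(text, tokenizer, model):
    inputs = tokenizer(build_prompt(text), return_tensors="pt").to(model.device)
    with torch.no_grad():
        output = model.generate(**inputs, max_new_tokens=5)
    # Decode only the newly generated tokens that follow the prompt
    answer = tokenizer.decode(
        output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )
    return "humor" if answer.strip().lower().startswith("humor") else "no humor"
```

Mapping the free-form generation back to a fixed label makes the output directly comparable with the dataset's ground truth.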
The model is finetuned using the SFTTrainer class from the trl library, and the finetuned model is saved for later use.
The finetuned model is then reloaded and tested on the test set. The function classify_humor_tuned2b mirrors classify_humor_2b, but runs the finetuned model instead of the baseline.
Finally, the results of the finetuned model are compared against the baseline to quantify the performance improvement; the classification report and confusion matrix are generated and saved for further analysis.
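The comparison step can be sketched with scikit-learn; the label vectors below are toy values standing in for the real test-set outputs:

```python
from sklearn.metrics import classification_report, confusion_matrix

# Toy labels standing in for the real test-set outputs (illustrative only)
y_true  = [1, 0, 1, 0, 1, 0]
y_base  = [1, 1, 0, 0, 1, 0]  # baseline predictions
y_tuned = [1, 0, 1, 0, 1, 0]  # finetuned predictions

for name, y_pred in [("baseline", y_base), ("finetuned", y_tuned)]:
    # Per-class precision/recall/F1, plus the raw error breakdown
    report = classification_report(y_true, y_pred, target_names=["no humor", "humor"])
    print(name, report, confusion_matrix(y_true, y_pred), sep="\n")
```

In practice the printed report and matrix would be written to files so both runs can be compared side by side later.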