mrshoenel / hmm-paper-2020-r-experiments

A repository holding R experiments and data for the paper "Exploiting Relations, Sojourn-Times and Joint Conditional Probabilities for Automated Commit Classification"

License: GNU General Public License v3.0

R 4.77% HTML 89.14% Jupyter Notebook 6.09%

HMM Experiments

This repository holds data, experimental setups, and code closely related to the paper "Exploiting Relations, Sojourn-Times and Joint Conditional Probabilities for Automated Commit Classification".

While the name implies hidden Markov models, those were just one type of model tested here. We also fit dependent mixture models and joint conditional density models, and we try traditional machine learning. We reverse-engineer rules for manually labeling commits and create labels for a few hundred commits. Then, we produce datasets of adjacent, labeled commits that can be used with the mentioned models:

commits_t-0.csv: Approx. 300 newly labeled commits. Some of these were previously contained in Levin's dataset, and we labeled them again here to see whether we would reach the same consensus. All commits have size properties from Git-Density attached and also come with the usual information, such as author, committer, email, timestamps, messages, hashes, etc.
commits_t-1.csv, commits_t-2.csv, commits_t-3.csv: These are the "interesting" datasets. They have the same feature names as commits_t-0.csv, but each commit also comes with its 1/2/3 directly preceding commits, so they have 2/3/4 times as many features as commits_t-0.csv. The predecessors' features keep the same names, suffixed by _t_{1,2,3}.
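To make the column layout concrete, here is a minimal sketch of how one wide row could be split back into a chain of per-commit records. It assumes the predecessor columns end literally in `_t_1`, `_t_2`, or `_t_3`, as described above; the feature names `label` and `density`, the label values, and the helper `split_chain` are hypothetical and only serve as illustration. Check the actual CSV headers before relying on it.

```python
import re

# Assumption: predecessor columns end in "_t_1", "_t_2" or "_t_3";
# columns without such a suffix belong to the current commit.
SUFFIX = re.compile(r"^(?P<name>.+)_t_(?P<step>[123])$")


def split_chain(row):
    """Split one wide row into per-commit dicts; index 0 is the current commit."""
    steps = {0: {}}
    for column, value in row.items():
        match = SUFFIX.match(column)
        if match:
            step = int(match.group("step"))
            steps.setdefault(step, {})[match.group("name")] = value
        else:
            steps[0][column] = value
    # Order the chain from the current commit back to the oldest predecessor.
    return [steps[k] for k in sorted(steps)]


# Hypothetical row shaped like commits_t-2.csv (names and values are made up).
row = {
    "label": "c", "density": "0.8",
    "label_t_1": "a", "density_t_1": "0.5",
    "label_t_2": "p", "density_t_2": "0.9",
}
chain = split_chain(row)
```

Rows read with `csv.DictReader` from one of the files could be passed to `split_chain` directly, since each row is a dict keyed by column name.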

The latter type of commit chains can be exploited for sequential learning. For example, is there any value in knowing the activity that was carried out in the previous commit(s)?
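One simple way to probe this question, not necessarily the approach taken in the paper, is to tabulate how often each label follows each previous label. The sketch below estimates first-order transition probabilities from (previous, current) label pairs. The label values and the pairs are made up for illustration and are not taken from the repository's data.

```python
from collections import Counter, defaultdict


def transition_probabilities(pairs):
    """Estimate P(current label | previous label) from (previous, current) pairs."""
    counts = defaultdict(Counter)
    for previous, current in pairs:
        counts[previous][current] += 1
    # Normalize each row of counts so that it sums to one.
    return {
        previous: {label: n / sum(row.values()) for label, n in row.items()}
        for previous, row in counts.items()
    }


# Made-up label pairs, NOT taken from the datasets in this repository.
pairs = [("a", "a"), ("a", "c"), ("c", "c"), ("c", "c"), ("c", "p"), ("p", "c")]
probs = transition_probabilities(pairs)
```

If the rows of such a table differ clearly from the overall label frequencies, the previous commit's activity carries information about the current one. If they look alike, it adds little.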
