Stanford ML Project CS229'18: epsilon

1 Abstract



We envision using cutting-edge machine learning techniques and novel features to deliver higher-accuracy assignee predictions for tickets. For the duration of this project, we focus on reducing the prediction error and view this project as an Application Result. In the future, we envision capabilities to infer priority and to generalize the model to EHR diagnosis prediction, ticket triaging for customer support, bug reports, and exception tracking.

2 Introduction

Web and SaaS companies handle high volumes of tickets in the form of exceptions, support requests, user-reported bugs, and crash reports. JIRA and Asana are among the most widely used task and ticketing systems to tame this beast, typically with dedicated teams that aggregate, triage, and assign these tickets to the right individual or team. Effective automation is essential to improve productivity and remove the tedious work of manually triaging tickets.

Project Epsilon aims to eliminate this overhead by experimenting with supervised-learning classifiers that automatically assign tickets to a developer. It builds on LinkedIn's autotriager, a simple classifier that routes support tickets to pre-defined teams using an SVM with the ticket's text body as its only input feature. That autotriager has a high failure rate and predicts only a pre-defined set of classes (teams). We aim to predict the assignee of a new ticket from past tickets with higher accuracy, using more robust methods.

The input to our algorithm is a collection of historic JIRA tickets in JSON format. These tickets are preprocessed and featurized, after which we predict the assignee for new or unassigned tickets using three different methods: SVM, Naive Bayes, and deep neural network classifiers. We then compare the performance of these three methods on two primary input datasets: a public Expium-generated dataset and a dataset of real JIRA tickets from LinkedIn.
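
The preprocessing and featurization step described above can be sketched roughly as follows. This is a minimal, dependency-free illustration, not the project's actual pipeline: the field names (`summary`, `description`, `assignee`), the sample tickets, and the helper functions are all assumptions made for the example, and real JIRA exports nest these fields differently.

```python
import json
import re
from collections import Counter

# Hypothetical, flattened ticket records; real JIRA JSON is more deeply nested.
SAMPLE_JSON = """
[
  {"summary": "Login page crashes on submit",
   "description": "Null pointer in auth handler", "assignee": "alice"},
  {"summary": "Checkout total is wrong",
   "description": "Tax rounding error in cart", "assignee": "bob"},
  {"summary": "Crash when saving profile",
   "description": "", "assignee": null}
]
"""

TOKEN_RE = re.compile(r"[a-z0-9]+")


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return TOKEN_RE.findall(text.lower())


def featurize(ticket):
    """Return a bag-of-words Counter over the ticket's summary and description."""
    text = " ".join([ticket.get("summary") or "", ticket.get("description") or ""])
    return Counter(tokenize(text))


def split_tickets(tickets):
    """Separate tickets with a known assignee (training data) from unassigned ones."""
    labeled, unassigned = [], []
    for ticket in tickets:
        features = featurize(ticket)
        if ticket.get("assignee"):
            labeled.append((features, ticket["assignee"]))
        else:
            unassigned.append(features)
    return labeled, unassigned


tickets = json.loads(SAMPLE_JSON)
labeled, unassigned = split_tickets(tickets)
```

Here `labeled` would feed classifier training, while `unassigned` holds the feature vectors for tickets that still need a predicted assignee.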

In the future, we envision extending Epsilon’s capabilities to infer priority and generalize the model to work with EHR diagnosis prediction, ticket triaging for customer support, bug reports, and exception tracking.

3 Introduction

Classification of open bug reports with supervised learning is a fairly common area of work. Common strategies are Naive Bayes and SVM classification on input featurized as a bag of words under a multinomial event model. The primary research revolves around interpreting ticket text to predict the assignee based on observed history. Additional research on time-slicing methods, the applicability of reinforcement learning, principal component analysis, and feature extraction using neural networks appears promising.
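
To make the bag-of-words Naive Bayes strategy concrete, here is a small sketch of a multinomial Naive Bayes classifier with add-alpha (Laplace) smoothing. The class name, the toy training tickets, and the assignee names are hypothetical and chosen only for illustration; this is not the code used in the project or in the cited prior work.

```python
import math
from collections import Counter, defaultdict


class MultinomialNaiveBayes:
    """Multinomial event model over word counts with add-alpha smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, documents, labels):
        self.class_counts = Counter(labels)       # documents per assignee
        self.word_counts = defaultdict(Counter)   # word counts per assignee
        self.vocabulary = set()
        for tokens, label in zip(documents, labels):
            self.word_counts[label].update(tokens)
            self.vocabulary.update(tokens)
        self.total_docs = len(labels)
        return self

    def log_posterior(self, tokens, label):
        """Unnormalized log P(label | tokens)."""
        score = math.log(self.class_counts[label] / self.total_docs)
        total_words = sum(self.word_counts[label].values())
        denom = total_words + self.alpha * len(self.vocabulary)
        for token in tokens:
            if token not in self.vocabulary:
                continue  # skip words never seen during training
            count = self.word_counts[label][token]
            score += math.log((count + self.alpha) / denom)
        return score

    def predict(self, tokens):
        return max(self.class_counts,
                   key=lambda label: self.log_posterior(tokens, label))


# Toy tickets, already tokenized; assignee names are made up.
train_docs = [
    "login page crashes on submit".split(),
    "password reset email never arrives".split(),
    "checkout total shows wrong tax".split(),
    "cart drops items after refresh".split(),
]
train_labels = ["alice", "alice", "bob", "bob"]

model = MultinomialNaiveBayes().fit(train_docs, train_labels)
prediction = model.predict("login email crashes".split())
```

Working in log space avoids numeric underflow when many word probabilities are multiplied, and the smoothing term keeps a word unseen for one assignee from zeroing out that assignee's score.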

Project Epsilon aims to eliminate the overhead of manual triage by experimenting with supervised-learning classifiers that automatically assign tickets to a developer. Given the high failure rates of state-of-the-art solutions, we aim to predict the assignee of a new ticket from past tickets with higher accuracy, using deep learning models.

After completing a grid search with 5-fold cross-validation across multiple parameters, the best-performing configuration was found to be 3-32.relu.0.005, with a mean train accuracy of 0.999379073598363 and a mean test accuracy of 0.28183346116770486. The difference between the two (test minus train) is -0.7175456124306581, a gap that indicates substantial overfitting.
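
The evaluation procedure above (a grid search scored by 5-fold cross-validation, reporting mean train and test accuracy) can be sketched as follows. This is an illustration under stated assumptions: a toy 1-D k-nearest-neighbors model stands in for the project's neural network, and the data, the parameter grid, and all function names are hypothetical.

```python
import itertools
import random
from collections import Counter
from statistics import mean


def k_fold_indices(n_samples, n_folds, seed=0):
    """Shuffle indices once and deal them into n_folds roughly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::n_folds] for i in range(n_folds)]


def knn_predict(train, x, k):
    """Toy stand-in model: majority label among the k nearest 1-D points."""
    nearest = sorted(train, key=lambda pair: abs(pair[0] - x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def accuracy(train, samples, k):
    """Fraction of samples whose predicted label matches the true label."""
    hits = sum(knn_predict(train, x, k) == y for x, y in samples)
    return hits / len(samples)


def grid_search(data, param_grid, n_folds=5):
    """Return (best_params, results); results maps each parameter tuple to
    (mean train accuracy, mean test accuracy) across the folds."""
    folds = k_fold_indices(len(data), n_folds)
    names = sorted(param_grid)
    results = {}
    for values in itertools.product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        train_scores, test_scores = [], []
        for held_out in folds:
            held = set(held_out)
            train = [data[i] for i in range(len(data)) if i not in held]
            test = [data[i] for i in held_out]
            train_scores.append(accuracy(train, train, **params))
            test_scores.append(accuracy(train, test, **params))
        results[values] = (mean(train_scores), mean(test_scores))
    # Select the configuration with the highest mean test accuracy.
    best = max(results, key=lambda key: results[key][1])
    return dict(zip(names, best)), results


# Two well-separated toy clusters with made-up labels.
data = [(float(x), "a") for x in range(10)] + [(float(x), "b") for x in range(20, 30)]
best_params, results = grid_search(data, {"k": [1, 3]})
```

Comparing the two means per configuration is what exposes a train/test gap like the one reported above: a model can score near 1.0 on the folds it was fit on while doing much worse on the held-out fold.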

The next step is to adopt word2vec and pandas to explore the available data.

We invite you to look over the report and code for more details.
