MachineLearning

Upton, Lance T.

A pattern-recognition algorithm that classifies questions asked at a library service desk, based on a term-document matrix (TDM).
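As a rough illustration of what a TDM is, the sketch below builds one in Python using only the standard library. It is not the repository's code (the summaries in this README look like R output), and the `build_tdm` helper and the two example questions are hypothetical. Rows are terms, columns are documents, and each cell counts how often the term appears in the document.

```python
from collections import Counter


def build_tdm(documents):
    """Return (terms, matrix), where matrix[i][j] counts term i in document j."""
    # Assumed tokenization: lowercase and split on whitespace.
    tokenized = [doc.lower().split() for doc in documents]
    terms = sorted({tok for toks in tokenized for tok in toks})
    counts = [Counter(toks) for toks in tokenized]
    matrix = [[c[term] for c in counts] for term in terms]
    return terms, matrix


# Hypothetical example questions, not taken from the data set.
questions = ["where is the printer", "is the printer free"]
terms, tdm = build_tdm(questions)

for term, row in zip(terms, tdm):
    print(f"{term:<8} {row}")
```

Each row of the matrix can then serve as a feature for a classifier, or the matrix can be transposed so that each question becomes one feature vector.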

Raw Data

Predictors

##        location         format           date           
##  info desk :   79   walk-up:48738   Min.   :2015-05-01  
##  media     :   56   phone  : 6989   1st Qu.:2016-01-16  
##  tech desk : 5301                   Median :2016-08-22  
##  staff desk:    5                   Mean   :2016-07-12  
##  west desk :24596                   3rd Qu.:2017-01-12  
##  east desk :25654                   Max.   :2017-08-01  
##  off-desk  :   36                                       
##     datetime                     question            answer         
##  Min.   :2015-05-01 11:18:00   Length:55727       Length:55727      
##  1st Qu.:2016-01-16 15:50:00   Class :character   Class :character  
##  Median :2016-08-22 10:51:00   Mode  :character   Mode  :character  
##  Mean   :2016-07-12 19:31:22                                        
##  3rd Qu.:2017-01-12 14:48:30                                        
##  Max.   :2017-07-31 20:56:00                                        
## 

Categories

##             tag1                        tag2                    tag3      
##  Reference    :27830   internal_directions:21505   other/na       :11692  
##  Directional  :25362   policy             :10895   classroom/space: 7704  
##  Miscellaneous: 2535   technology         : 6849   equipment      : 7408  
##                        known_item_lookup  : 3500   desk           : 3255  
##                        supplies           : 3265   hours          : 2987  
##                        external_directions: 3017   account        : 1994  
##                        (Other)            : 6696   (Other)        :20687

NOTE: Categories are hierarchical (tag1 > tag2 > tag3).

Method

Step 1: Base Model

  1. The question text is cleaned.
  2. A TDM is generated from the cleaned questions, and features are extracted from it.
  3. The data is split into training and testing sets.
  4. Models are created to predict tag1.
  5. The models are trained on the training set and evaluated on the testing set.
  6. A model with satisfactory accuracy is used to split the data by tag1.
  7. Within each tag1 subset, tag2 is predicted in the same way, and then tag3.
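The steps above can be sketched end to end in Python. This is a minimal sketch under several assumptions, not the author's implementation: the README does not say which cleaning rules or models were used, so the `clean` rule and the multinomial naive Bayes classifier are stand-ins, all function names are hypothetical, and the four example rows are invented (only the tag names come from the summaries above). Training data is split by its known tag1, and new questions are routed by the predicted tag1.

```python
import math
import random
import re
from collections import Counter, defaultdict


def clean(text):
    """Lowercase and keep alphabetic tokens only (an assumed cleaning rule)."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayes:
    """Multinomial naive Bayes with add-one smoothing (a stand-in model)."""

    def fit(self, docs, labels):
        self.label_counts = Counter(labels)
        self.term_counts = defaultdict(Counter)
        for tokens, label in zip(docs, labels):
            self.term_counts[label].update(tokens)
        self.vocab = {t for c in self.term_counts.values() for t in c}
        return self

    def predict(self, tokens):
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.label_counts.items():
            counts = self.term_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n / total)
            for t in tokens:
                if t in self.vocab:  # terms unseen in training are ignored
                    score += math.log((counts[t] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of the rows and cut it into training and testing sets."""
    rows = rows[:]
    random.Random(seed).shuffle(rows)
    cut = int(len(rows) * (1 - test_fraction))
    return rows[:cut], rows[cut:]


def accuracy(model, rows, tag):
    """Fraction of rows whose predicted tag matches the recorded tag."""
    hits = sum(model.predict(clean(r["question"])) == r[tag] for r in rows)
    return hits / len(rows)


def fit_hierarchy(train):
    """Fit one tag1 model, then one tag2 model per tag1 value."""
    docs = [clean(r["question"]) for r in train]
    top = NaiveBayes().fit(docs, [r["tag1"] for r in train])
    by_tag1 = defaultdict(list)
    for r in train:
        by_tag1[r["tag1"]].append(r)
    children = {
        t1: NaiveBayes().fit(
            [clean(r["question"]) for r in rs], [r["tag2"] for r in rs]
        )
        for t1, rs in by_tag1.items()
    }
    return top, children


def predict_hierarchy(top, children, question):
    """Predict tag1 first, then tag2 with the model for that tag1."""
    tokens = clean(question)
    tag1 = top.predict(tokens)
    return tag1, children[tag1].predict(tokens)


# Invented rows for illustration; only the tag names come from the README.
rows = [
    {"question": "Where is the restroom?", "tag1": "Directional", "tag2": "internal_directions"},
    {"question": "Where is the bus stop?", "tag1": "Directional", "tag2": "external_directions"},
    {"question": "How do I print a page?", "tag1": "Reference", "tag2": "technology"},
    {"question": "How do I renew a book?", "tag1": "Reference", "tag2": "policy"},
]

train, test = train_test_split(rows)
top, children = fit_hierarchy(train)
print("held-out tag1 accuracy:", accuracy(top, test, "tag1"))
print(predict_hierarchy(top, children, "Where is the elevator?"))
```

Extending the sketch to tag3 would mean fitting one more layer of models, keyed by each (tag1, tag2) pair. With real data, the accuracy check decides whether the tag1 model is good enough to route questions to the tag2 models.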

Step 2: Add predictors

  1. Different combinations of other predictors are added to the model.
  2. ???
  3. Profit.
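One way to try different combinations of predictors is to enumerate every subset and encode the chosen values as extra pseudo-terms next to the question tokens. The Python sketch below is an assumption about how that could look, not the author's method: the choice of `location`, `format`, and `datetime` as the extra predictors, the `name=value` encoding, and both helper names are hypothetical.

```python
from itertools import combinations

# Assumed set of extra predictors, taken from the column names in the summary.
EXTRA_PREDICTORS = ["location", "format", "datetime"]


def predictor_combinations(predictors):
    """Yield every non-empty subset of predictors, smallest subsets first."""
    for size in range(1, len(predictors) + 1):
        yield from combinations(predictors, size)


def add_predictors(tokens, row, chosen):
    """Return the tokens plus one 'name=value' pseudo-term per chosen predictor."""
    return tokens + [f"{name}={row[name]}" for name in chosen]


# Invented row for illustration; the field values mimic the summary above.
row = {"location": "east desk", "format": "walk-up", "datetime": "2016-08-22 10:51:00"}

for chosen in predictor_combinations(EXTRA_PREDICTORS):
    print(chosen, add_predictors(["where", "is", "the", "printer"], row, chosen))
```

Each augmented token list can be fed to the same model as in Step 1, so the effect of every combination can be compared on the testing set. A raw timestamp is unlikely to be useful as a single term, so in practice it would probably be reduced to something coarser, such as the hour or the weekday.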
