
jas's Introduction

Code associated with our EMNLP paper "A Generative Joint, Additive, Sequential Model of Topics and Speech Acts in Patient-Doctor Communication". Unfortunately, we have not yet been able to secure IRB approval to release the actual data. We nevertheless give the expected data format and sample usage below.

-Byron Wallace
[email protected]
http://www.cebm.brown.edu/byron


Data format
--
The basic data format is as follows:

boundary boundary:case_id=BMC3013_1
19 32 42 56 226 309 558 1889 1,2
558 19 58 145 168 216 1,2
56 35 16 20 3,2
…
boundary boundary:case_id=XXX

Here, each "boundary" line demarcates the start of a new session. On every other line, the final entry is a comma-separated pair giving the topic and the speech act, respectively. The numbers before it are assumed to map to tokens (features).
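
As a rough illustration of the format described above, the following sketch groups the lines of such a file into sessions. The function name `parse_sessions` and the dictionary layout it returns are hypothetical and not part of this repository (the actual loader is `transcripts.tnb_from_file`). It also assumes that tokens and labels are integers, as in the sample.

```python
def parse_sessions(lines):
    """Group utterance lines into sessions (hypothetical helper, not the repo's loader)."""
    sessions = []
    current = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith("boundary"):
            # e.g. "boundary boundary:case_id=BMC3013_1" starts a new session
            case_id = line.split("case_id=", 1)[1] if "case_id=" in line else None
            current = {"case_id": case_id, "utterances": []}
            sessions.append(current)
            continue
        if current is None:
            raise ValueError("utterance found before the first boundary line")
        parts = line.split()
        # Everything but the last entry is a token (feature) id
        tokens = [int(t) for t in parts[:-1]]
        # The last entry is "topic,speech_act" (assumed to be integers)
        topic, speech_act = (int(x) for x in parts[-1].split(","))
        current["utterances"].append((tokens, topic, speech_act))
    return sessions


sample = """\
boundary boundary:case_id=BMC3013_1
19 32 42 56 226 309 558 1889 1,2
558 19 58 145 168 216 1,2
56 35 16 20 3,2
boundary boundary:case_id=XXX
""".splitlines()

sessions = parse_sessions(sample)
print(sessions[0]["case_id"], len(sessions[0]["utterances"]))
```

Each utterance is kept as a `(tokens, topic, speech_act)` tuple so that the two label sequences can be read off per session.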

Sample usage
--
import transcripts
import joint_sequential_SATs

tnb = transcripts.tnb_from_file("data/unigram-cases-joint/train.CRF.speakers.pronoun.question.unigram.joint.0.dat", hold_out_a_set=True)

# train model and make predictions
m = joint_sequential_SATs.JointSequential(tnb)
m.estimate_parameters() # may take a while...
test_cases = transcripts.load_test_cases("data/unigram-cases-joint/test.CRF.speakers.pronoun.question.unigram.joint.0.dat", tnb)
preds_Y, preds_S = m.predict_set_sequential_joint(test_cases)

# now assess performance 
import process_results
test_Y, test_S = transcripts.parse_labels_file("data/unigram-cases-joint/test.CRF.speakers.pronoun.question.unigram.joint.0_labels.dat")
print(process_results.calc_metrics(test_Y, test_S, preds_Y, preds_S))
