dnaaun / openframing

Tools for automatic frame discovery and labeling based on topic modeling and deep learning, made widely accessible to researchers from non-computational backgrounds.

Home Page: http://www.openframing.org

Python 19.91% HTML 11.67% JavaScript 30.85% CSS 33.88% Dockerfile 0.78% Shell 0.05% SCSS 2.87%
framing topic-modeling text-classification flask-application

OpenFraming

Introduction

We introduce OpenFraming, a Web-based system for analyzing and classifying frames in text documents. OpenFraming is designed to lower the barriers to applying machine learning for frame analysis, including giving researchers the capability to build models using their own labeled data. Its architecture is designed to be user-friendly and easily navigable, empowering researchers to comfortably make sense of their text corpora without specific machine learning knowledge.

You can find the preprint of our work on arXiv (arXiv:2008.06974).

Requirements

Docker

You need Docker. Feel free to read up on Docker if you wish. Our best short explanation is that Docker is to deploying applications with complicated dependencies what the printing press was to publishing books: it lets you do it much more quickly and much more reproducibly.

The Docker documentation has guides on how to install Docker on the most popular platforms.

How to install

  1. git clone https://github.com/davidatbu/openFraming.git
  2. cd openFraming
  3. docker-compose build
  4. docker-compose up

You might have to prefix the commands in steps 3 and 4 with sudo.

E-mails

If you want to send actual e-mails through Sendgrid with this system (as opposed to just printing the e-mails that would be sent to the console), please set the environment variables:

export SENDGRID_API_KEY=     # An API key from Sendgrid
export SENGRID_FROM_EMAIL=   # An email address to put in the "from" field. Note that
                             # you'll have to verify this email in Sendgrid as a
                             # "Sender".

If you needed sudo in the section above, pass the -E flag to make sure these environment variables are picked up, i.e.:

sudo -E docker-compose up
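
As a rough illustration of the fallback described above, the sketch below decides between sending through Sendgrid and printing to the console based on whether both variables are set. The function name `pick_email_backend` and the return values are hypothetical and are not taken from the OpenFraming code base; the variable names are copied exactly as spelled in this README.

```python
import os


def pick_email_backend(env=None):
    """Return "sendgrid" when both variables are set and non-empty, else "console".

    Hypothetical helper for illustration; not part of OpenFraming itself.
    """
    env = os.environ if env is None else env
    # Variable names copied as spelled in this README.
    required = ("SENDGRID_API_KEY", "SENGRID_FROM_EMAIL")
    if all(env.get(name) for name in required):
        return "sendgrid"
    return "console"
```

Calling `pick_email_backend()` with no arguments inspects the real environment, which is why sudo needs the -E flag: without it, the variables exported in your shell are not visible to the process.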

Video demonstration

A quick demonstration of our website's features is available as a YouTube video.

Getting help

If you have any questions, concerns, or bug reports, please file an issue in this repository's Issue Tracker and we will respond accordingly.

Citation

@article{smith2020openframing,
  title={OpenFraming: We brought the ML; you bring the data. Interact with your data and discover its frames},
  author={Smith, Alyssa and Tofu, David Assefa and Jalal, Mona and Halim, Edward Edberg and Sun, Yimeng and Akavoor, Vidya and Betke, Margrit and Ishwar, Prakash and Guo, Lei and Wijaya, Derry},
  journal={arXiv preprint arXiv:2008.06974},
  year={2020}
}

Funding

This research is funded by the following NSF Award:

NSF Award #1838193 BIGDATA: IA: Multiplatform, Multilingual, and Multimodal Tools for Analyzing Public Communication in over 100 Languages

Acknowledgement

We are truly grateful to Gerard Shockley, Boston University Cloud Broker, for helping us seamlessly host our website and run it on an Amazon Web Services EC2 instance.

Credits

Alyssa Smith*, David Assefa Tofu*, Mona Jalal*, Edward Edberg Halim, Yimeng Sun, Vidya Prasad Akavoor, Margrit Betke, Prakash Ishwar, Lei Guo, Derry Wijaya

openframing's People

Contributors

asmithh, dnaaun, edward-edberg, monajalal, vidyaap, yimengsun

openframing's Issues

Add LDA & naive bayes-based dataset creation

  • add random article sampler that randomly samples k articles and creates a spreadsheet for labeling
  • add intake for labeled sample and full dataset (assumes the original dataset contains the articles in the labeled sample)
  • use naive Bayes to create a labeled dataset from the small labeled sample
  • create a spreadsheet that can be uploaded to the classifier endpoint
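
The steps listed above could be sketched roughly as follows, using only the Python standard library. This is a minimal illustration under stated assumptions, not the project's implementation: the function names, the CSV column names (`article_id`, `text`, `label`), and whitespace tokenization are all hypothetical, and the classifier is a small hand-written multinomial naive Bayes with add-one smoothing.

```python
import csv
import io
import math
import random
from collections import Counter, defaultdict


def sample_for_labeling(articles, k, seed=0):
    """Pick k distinct articles at random; returns (index, text) pairs."""
    rng = random.Random(seed)
    indices = sorted(rng.sample(range(len(articles)), k))
    return [(i, articles[i]) for i in indices]


def write_labeling_sheet(sample, fileobj):
    """Write a CSV with an empty 'label' column for annotators to fill in."""
    writer = csv.writer(fileobj)
    writer.writerow(["article_id", "text", "label"])
    for article_id, text in sample:
        writer.writerow([article_id, text, ""])


def train_naive_bayes(labeled):
    """Fit multinomial naive Bayes counts on (text, label) pairs."""
    class_counts = Counter(label for _, label in labeled)
    word_counts = defaultdict(Counter)
    for text, label in labeled:
        word_counts[label].update(text.lower().split())
    vocab = {w for counts in word_counts.values() for w in counts}
    return class_counts, word_counts, vocab


def predict(model, text):
    """Return the label with the highest log-posterior for the given text."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)
        # Add-one (Laplace) smoothing over the training vocabulary.
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in text.lower().split():
            if word in vocab:  # ignore words never seen in training
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

Once annotators have filled in the `label` column, the labeled rows can be passed to `train_naive_bayes`, and `predict` can then be applied to every remaining article to produce the spreadsheet for the classifier endpoint.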

Need an endpoint for policy issue name

After the user selects a 'policy issue', we need an endpoint to send the selection to via Ajax.

Current incomplete code for processing it is:

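	// Read the selected policy issue when the submit button is clicked.
	$("button[type='submit']").on('click', function () {
		let radioValue = $("input[name='gunviolence']:checked").val();
		console.log(radioValue);
		// TODO: send radioValue to the policy-issue endpoint via Ajax once it exists.
	});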

If this is a duplicate issue, please link it to the other issue or pull request.

Create queueing protocol for training with BERT

In the event that multiple people want to use the BERT training at one time, we'll need to figure out queueing. David suggested Python's multiprocessing library, which is what I think I'll be using.
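
One possible shape for such a queue is sketched below. It is an assumption-laden illustration rather than the project's design: it uses `multiprocessing.pool.ThreadPool`, the thread-backed pool that ships in the `multiprocessing` package, with a single worker so that jobs run one at a time in submission order. The class name `TrainingQueue` and the stand-in `fake_train` function are hypothetical.

```python
from multiprocessing.pool import ThreadPool


class TrainingQueue:
    """Run submitted training jobs one at a time, in submission order."""

    def __init__(self):
        # One worker means at most one job trains at once; the rest wait.
        self._pool = ThreadPool(processes=1)

    def submit(self, train_fn, *args):
        """Queue train_fn(*args); the returned handle's .get() blocks for the result."""
        return self._pool.apply_async(train_fn, args)

    def shutdown(self):
        """Stop accepting jobs and wait for queued ones to finish."""
        self._pool.close()
        self._pool.join()


def fake_train(job_name, log):
    # Hypothetical stand-in for a BERT fine-tuning run.
    log.append(job_name)
    return f"{job_name}: done"
```

A process-backed `multiprocessing.Pool` offers the same `apply_async` interface and would isolate each training run in its own process, at the cost of requiring picklable arguments and results.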

Create a Signup/Sign-in Page Frontend

Users should enter the following if they choose to sign up and create an account:

  • full name
  • username
  • email
  • password

Users should enter the following if they choose to sign in to their account:

  • username or email (the one they used when creating the account)
  • password

The sign-in form should also offer:

  • an option to reset a forgotten password
  • an option to recover a forgotten username

The page should also have the following at the very minimum:

  • logo of OpenFraming
  • Sign-in (if the user already has an account with us)
  • Sign-up (if the user does not yet have an account with us)

Accuracy scorer for BERT-based classifier

Given a validation set and a fine-tuned BERT-based (binary) classifier, return the following:
{
"macro_f1": float,
"macro_precision": float,
"macro_recall": float,
"accuracy": float
}

May need to do this for each category if we're doing one binary classifier per category.
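
A minimal sketch of such a scorer is shown below, assuming the classifier's predictions and the validation labels are already available as two equal-length lists. The function name `score_predictions` is hypothetical, and macro F1 is taken here to be the unweighted mean of the per-class F1 scores.

```python
def score_predictions(y_true, y_pred):
    """Return macro-averaged precision, recall, F1, and accuracy."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and the same length")
    pairs = list(zip(y_true, y_pred))
    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        # Undefined ratios (no predictions or no true examples) count as 0.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(labels)
    return {
        "macro_f1": sum(f1s) / n,
        "macro_precision": sum(precisions) / n,
        "macro_recall": sum(recalls) / n,
        "accuracy": sum(t == p for t, p in pairs) / len(pairs),
    }
```

With one binary classifier per category, the same function could be called once per category on that category's 0/1 labels.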

Add an input box to get the email address of the user

We need the user's email address so that we can send them the results when they are available.

We should also validate the input to make sure the user enters a correctly formatted email address.
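
A server-side format check might look like the sketch below. The pattern is deliberately simple and is an assumption, not a full implementation of the email address grammar: it only requires a single "@", no whitespace, and a dot somewhere in the domain part. The function name `is_plausible_email` is hypothetical.

```python
import re

# Deliberately simple pattern: one "@", no whitespace, and a dot in the domain.
_EMAIL_RE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def is_plausible_email(address):
    """Return True if the address has the rough shape local@domain.tld."""
    return _EMAIL_RE.fullmatch(address.strip()) is not None
```

A check like this only catches obvious typos; whether the mailbox actually exists can only be confirmed by sending a message to it.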

Architecture for training multilabel BERT classifier

We should probably discuss this in more depth, but are there any specific considerations I should have in mind when designing the architecture for training multiple binary classifiers? We /are/ doing multiple binary classifiers, right (not a multiclass classifier)? In that case, will we need to modify the dev set accuracy to include scoring for all classes?
