
tweetmapping's Introduction

Semantic Mapping of Twitter Healthcare Related Posts to Formal Ontologies

This repository satisfies the research task requirement for the Coursera Data Science Real World project, as defined by the Coursolve need "Healthcare Twitter Analysis". The research task explores the feasibility and usefulness of mapping tweet content (healthcare keywords) to formal ontologies at the NCBO Bioportal.

tweet_mapping.py is a Python app to perform the search and annotation via REST API. It creates output file(s) containing the links that can be analyzed and visualized.

This is where you can find the Bioportal REST API documentation

The NCBO REST API examples are included here, modified so that you place your API key in a file named ncbo_api_key.txt. This file is in the ignore list and will not be uploaded to GitHub. However, there is an EXAMPLE_ncbo_api_key.txt that shows where to put your API key.

You will need to register on the Bioportal to get your own key. Once you register, your key will be available in your account settings.
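As a rough illustration of the key-file convention described above, the following sketch reads the key from ncbo_api_key.txt and builds an authorization header. It assumes the file contains nothing but the key (check EXAMPLE_ncbo_api_key.txt for the actual layout), and the `apikey token=` header form is taken from NCBO's published samples and should be confirmed against the Bioportal documentation. The function names `load_api_key` and `auth_header` are hypothetical and not part of this repository.

```python
import io


def load_api_key(path="ncbo_api_key.txt"):
    """Return the API key stored in `path`, stripped of surrounding whitespace.

    Assumption: the file holds only the key text.
    """
    with io.open(path, "r", encoding="utf-8") as handle:
        key = handle.read().strip()
    if not key:
        raise ValueError("No API key found in %s" % path)
    return key


def auth_header(api_key):
    """Build the HTTP header dict carrying the key.

    Assumption: the service accepts 'Authorization: apikey token=<key>'.
    """
    return {"Authorization": "apikey token=" + api_key}
```

Keeping the key in an ignored file rather than in the source means the scripts can be committed without exposing credentials.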

Documentation for the Bioportal REST API is on their website. The wiki contains a lot of useful information as well.
I suggest you use these examples to test your API key. They are not needed for the application source tree.

Usage:

This project requires Python 2.7.

See the file ConfigREADME.md for details on setting functionality and data paths.

Once the configuration file is complete, run the executable file:

$ ./tweetmapping.py

Example:

The example data is from the main Google Drive repository labeled as 'June'. These data files are CSV files with headers. The headers are:

firstpost_date,url,trackback_author_nick,content,score,trackback_permalink,trackback_author_url

This input format is required.
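Because the input format is required, it can be useful to check a file's header before running the mapper. The sketch below is one possible way to do that with the standard csv module; `read_tweets` is a hypothetical helper, not a function from this repository, and the sample row is made up purely for illustration.

```python
import csv

# Column names copied from the required input header.
REQUIRED_COLUMNS = [
    "firstpost_date", "url", "trackback_author_nick", "content",
    "score", "trackback_permalink", "trackback_author_url",
]


def read_tweets(lines):
    """Parse CSV lines (an open file or a list of strings) into dict rows.

    Raises ValueError if any required column is missing from the header.
    """
    reader = csv.DictReader(lines)
    present = reader.fieldnames or []
    missing = [name for name in REQUIRED_COLUMNS if name not in present]
    if missing:
        raise ValueError("Missing required columns: " + ", ".join(missing))
    return list(reader)


# Made-up sample data, for illustration only.
SAMPLE_LINES = [
    ",".join(REQUIRED_COLUMNS),
    "2014-06-01,http://example.org/a,example_user,Flu shots available today,"
    "1.0,http://twitter.com/example_user/status/1,"
    "http://twitter.com/example_user",
]
```

Calling `read_tweets(SAMPLE_LINES)` yields one dict per tweet, keyed by column name, so the tweet text is available as `row["content"]`.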

The output file(s) are CSV files with the following format:

trackback_permalink,char_from,char_to,matched,ontology,code,ann_link

  • trackback_permalink - the unique ID of the tweet

  • char_from - the starting character in the tweet text for this annotation match

  • char_to - the ending character in the tweet text for this annotation match

  • matched - the text that was matched in the ontology/vocabulary

  • ontology - the abbreviation for the matched ontology/vocabulary

  • code - the unique code in the ontology/vocabulary for the matched text

  • ann_link - the PURL (permanent) link to the code on the Bioportal

The trackback_permalink can be used to find the original tweet.
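As a starting point for analyzing the output, the following sketch counts annotations per ontology. It assumes the output file begins with a header row matching the format listed above; if it does not, pass `has_header=False`. The helper name `count_by_ontology`, the ontology abbreviations, and the sample rows are all illustrative and not taken from real output.

```python
import csv
from collections import Counter

# Column names copied from the documented output format.
OUTPUT_COLUMNS = [
    "trackback_permalink", "char_from", "char_to",
    "matched", "ontology", "code", "ann_link",
]


def count_by_ontology(lines, has_header=True):
    """Count annotation rows per ontology abbreviation.

    `lines` is an open file or a list of CSV strings. When the data has
    no header row, the documented column order is applied instead.
    """
    fieldnames = None if has_header else OUTPUT_COLUMNS
    reader = csv.DictReader(lines, fieldnames=fieldnames)
    return Counter(row["ontology"] for row in reader)


# Made-up rows, for illustration only.
SAMPLE_OUTPUT = [
    ",".join(OUTPUT_COLUMNS),
    "http://twitter.com/example_user/status/1,1,3,FLU,MESH,X1,http://example.org/x1",
    "http://twitter.com/example_user/status/1,1,3,FLU,SNOMEDCT,Y1,http://example.org/y1",
    "http://twitter.com/example_user/status/2,5,9,FEVER,MESH,X2,http://example.org/x2",
]
```

Grouping on trackback_permalink instead of ontology would give the number of annotations per tweet, which can then be joined back to the input file on the same column.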
