Connector

The Freshwater Waterhackweek Connector is a tool to visualize and summarize connections between researchers in the hydrologic sciences. This project is currently in the planning and early development phase.

A demonstration visualization from Waterhackweek 2019 is available on the University of Washington Freshwater website.

To run the Python code or explore the visualization with NetworkX, click the Binder badge to launch the Binder software environment.

Big picture

The ultimate goal is to produce an interactive network visualization of freshwater scientists. Ideally, it would be similar to the ERC-citations tool hosted on nanoHUB. It might even make sense to use the ERC tool directly or with some small modifications.

The ERC-citations tool takes a BibTeX file as input to construct the network. This approach has both advantages and drawbacks.

What works

  • BibTeX is a standardized format for representing collaborations as digital objects
  • BibTeX records can be easily obtained from ORCID
  • web resources such as GitHub repositories and HydroShare resources that have a DOI are easy to include in BibTeX

Known Issues

  • any resource that lacks a straightforward BibTeX form will have to be translated to BibTeX programmatically, which could be difficult to standardize
  • it is complicated to include entities like "Waterhackweek" as centralized nodes
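As a minimal sketch of the programmatic translation mentioned above, a web resource could be written out as a BibTeX @misc entry. The helper name resource_to_bibtex and its parameters are hypothetical and are not part of the Connector code; the choice of @misc and of the fields shown is an assumption.

```python
def resource_to_bibtex(key, title, authors, year, doi=None, url=None):
    """Format one web resource as a BibTeX @misc entry (hypothetical helper)."""
    fields = {
        "title": title,
        # BibTeX separates multiple authors with the word "and".
        "author": " and ".join(authors),
        "year": str(year),
    }
    # Only include optional identifiers that were actually supplied.
    if doi:
        fields["doi"] = doi
    if url:
        fields["url"] = url
    body = ",\n".join(f"  {name} = {{{value}}}" for name, value in fields.items())
    return f"@misc{{{key},\n{body}\n}}"
```

A sketch like this does not escape special characters in titles or names, which is one of the places where standardization is likely to get difficult.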

Data

Users' information will be captured through some channel (online form, Waterhackweek application, etc.). This information will include:

  • Name
  • GitHub username
  • ORCID
  • HydroShare username

These identifiers will be used to retrieve each participant's activity through the respective APIs for GitHub and HydroShare, and by parsing the BibTeX file containing the journal and paper publications of the Waterhackweek (WHW) participants.
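One possible way to hold the four fields listed above is a small record type. This is only a sketch: the class name Participant, its attribute names, and the known_sources helper are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Participant:
    """One person in the network; every identifier except the name is optional."""
    name: str
    github_username: Optional[str] = None
    orcid: Optional[str] = None
    hydroshare_username: Optional[str] = None

    def known_sources(self):
        """Return the services for which this participant supplied an identifier."""
        sources = []
        if self.github_username:
            sources.append("github")
        if self.orcid:
            sources.append("orcid")
        if self.hydroshare_username:
            sources.append("hydroshare")
        return sources
```

Keeping the identifiers optional lets a record be created from whichever channel the person used, and filled in later.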

We welcome feedback in our GitHub Issues

Any feedback on the Connector is welcome! Please post your review of the Connector as a GitHub Issue in the following Q/A format:

  • Question 1: Is the visualization intuitive? Can you interact with all the individual elements?

  • Question 2: Is there anything that is not working as expected?

  • Question 3: Do you have any suggestions that would enhance the visualization?

GitHub username

GitHub has a fully-featured API that should allow for the following information to be retrieved for any given user:

  • public repos that they own
  • public repos that they have contributed to
  • public GitHub organizations they belong to
  • other contributors to public repos they own
  • other contributors to public repos they have contributed to
  • other members of the GitHub organizations they belong to
  • ...more

There are a number of Python wrappers for the GitHub API (https://developer.github.com/v3/libraries/). It is probably best to use one of these as it should be easier to maintain than using the raw JSON returned by a lower-level library like requests.
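To make the list above concrete without assuming any particular wrapper's interface, the sketch below uses only the standard library and the public GitHub REST endpoints for a user's repositories, a user's organizations, and a repository's contributors. The function names are hypothetical, pagination and rate limits are not handled, and collaborator_edges is just one assumed way to turn the results into network edges.

```python
import json
import urllib.request

API_ROOT = "https://api.github.com"


def user_repos_url(username):
    """URL listing the public repositories owned by a user."""
    return f"{API_ROOT}/users/{username}/repos"


def user_orgs_url(username):
    """URL listing the public organizations a user belongs to."""
    return f"{API_ROOT}/users/{username}/orgs"


def repo_contributors_url(owner, repo):
    """URL listing the contributors to one repository."""
    return f"{API_ROOT}/repos/{owner}/{repo}/contributors"


def fetch_json(url, token=None):
    """GET a GitHub API URL and decode the JSON body (performs a network call)."""
    headers = {"Accept": "application/vnd.github+json"}
    if token:
        headers["Authorization"] = f"token {token}"
    request = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def collaborator_edges(username, contributors_by_repo):
    """Turn {repo_name: [contributor logins]} into (username, other, repo) edges."""
    edges = []
    for repo, logins in contributors_by_repo.items():
        for login in logins:
            if login != username:  # skip the self-loop
                edges.append((username, login, repo))
    return edges
```

A maintained wrapper would likely take care of pagination and authentication details that this sketch leaves out.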

ORCID

Like GitHub, ORCID already has a public API that can be queried with Python. There is even a Python wrapper for the ORCID API that should make things easier.

However, ORCID requires user permission, even for read-only access.
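Before any ORCID query is made, a collected ORCID iD can be checked locally. ORCID iDs are 16 characters whose last character is a check digit computed with ISO 7064 MOD 11-2; the sketch below validates that checksum. The function names are hypothetical, and whether the Connector should reject or merely flag invalid iDs is left open.

```python
import re


def orcid_check_digit(base_digits):
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid):
    """Return True if the string is a well-formed ORCID iD with a correct checksum."""
    compact = orcid.replace("-", "").upper()
    # 15 digits followed by a digit or "X" as the check character.
    if not re.fullmatch(r"[0-9]{15}[0-9X]", compact):
        return False
    return orcid_check_digit(compact[:15]) == compact[15]
```

A check like this catches typos in the submitted form without needing any permission from the user.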

HydroShare

HydroShare also has an API and a Python client that will allow for easier extraction of a user's HydroShare Resources and collaborators.

Paper Publications

Most of the paper and journal publications are available as BibTeX files, which can be easily parsed using the bibtexparser library.
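Once the entries have been parsed, co-authorship edges can be derived from each entry's author field. The sketch below assumes the entries are already available as a list of dictionaries with an "author" key, and relies only on the BibTeX convention that authors are separated by the word "and". The function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def split_authors(author_field):
    """Split a BibTeX author field on the " and " separator."""
    flattened = " ".join(author_field.split())  # collapse line breaks and extra spaces
    return [name.strip() for name in flattened.split(" and ") if name.strip()]


def coauthor_edges(entries):
    """Count how often each pair of authors appears on the same entry."""
    counts = Counter()
    for entry in entries:
        authors = sorted(set(split_authors(entry.get("author", ""))))
        for pair in combinations(authors, 2):
            counts[pair] += 1
    return counts
```

The resulting counts could serve as edge weights in the network. Names are compared as exact strings here, so variants of the same person's name would still need to be reconciled separately.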

New Developers:

  1. Fork, clone, and branch the repository:

git clone https://github.com/waterhackweek/Connector.git

  2. You will need a token authorizing your access to the GitHub repository from which you are accessing user data:

       • Go to your GitHub profile and click through Settings / Developer settings / Personal access tokens.
       • Select the scope by checking "repo: Full control of private repositories".
       • Generate the token.
       • Copy the token to a file in a safe place that is NOT inside the Git repository directory.

  3. You will need the HydroShare user ID and password, and ownership access to the group.
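A script can then read the token from outside the repository instead of hard-coding it. The sketch below is one way to do that; the environment variable name GITHUB_TOKEN, the function names, and the idea of falling back to a token file are assumptions, not the Connector's actual configuration.

```python
import os
from pathlib import Path


def load_github_token(env_var="GITHUB_TOKEN", token_file=None):
    """Return the token from an environment variable, or else from a file."""
    token = os.environ.get(env_var, "").strip()
    if token:
        return token
    if token_file is not None:
        path = Path(token_file).expanduser()
        if path.is_file():
            return path.read_text().strip()
    raise RuntimeError(
        f"No GitHub token found in ${env_var} or in the token file; "
        "generate one and store it outside the repository."
    )


def auth_headers(token):
    """Build the HTTP header that GitHub accepts for token authentication."""
    return {"Authorization": f"token {token}"}
```

Keeping the lookup in one function makes it harder for a token to end up in committed code or notebooks.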

Database

Currently, the data generated for the visualization in Tableau consists of the following tables:

WHW_XY (Person, X-coordinate, Y-coordinate)
WHW_edge_collabs (Person1, Person2, Link, Collaboration)
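As a rough sketch, the two tables could be created in SQLite as follows. The table names and the Person, Person1, Person2, Link, and Collaboration columns come from the list above; the column types, the short X and Y column names, and the helper functions are assumptions.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS WHW_XY (
    Person TEXT PRIMARY KEY,
    X REAL NOT NULL,   -- X-coordinate of the node in the layout
    Y REAL NOT NULL    -- Y-coordinate of the node in the layout
);
CREATE TABLE IF NOT EXISTS WHW_edge_collabs (
    Person1 TEXT NOT NULL,
    Person2 TEXT NOT NULL,
    Link TEXT,
    Collaboration TEXT
);
"""


def create_database(path=":memory:"):
    """Open a SQLite database and make sure both tables exist."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def edges_for(conn, person):
    """Return every edge in which the given person appears on either side."""
    rows = conn.execute(
        "SELECT Person1, Person2, Collaboration FROM WHW_edge_collabs "
        "WHERE Person1 = ? OR Person2 = ? ORDER BY Person1, Person2",
        (person, person),
    )
    return rows.fetchall()
```

The default in-memory database is convenient for experimenting; passing a file path instead would keep the tables between runs.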

Contributors ✨

Thanks goes to these wonderful people (emoji key):


Christina Bandaragoda

💻 📖 🤔 🔍 📆 👀

Madhavi Srinivasan

💻 📖 🤔

Tony Castronova

🤔

Jacob Deppen

🤔 📖 💻

National Science Foundation

💵

This project follows the all-contributors specification. Contributions of any kind welcome!

Connector Issues

Scoping next steps for code development and automation

  • Use Jenkins to run scripts that build daily updates.

  • Use hstools to replace the resource on HydroShare automatically on a periodic basis.

  • Add code to create a Tableau workbook from the SQL database.

  • Add code that lets a user drop in their own BibTeX file or list of users to create a visualization for their own community, event, project, etc.

  • Improve the visuals of the graph network.

User Input for Search Engine Workflow

What steps are needed to create a graph network connected by search terms instead of authors?
The current connections create edges from a list of people.

New type of graph network: create edges from a list of terms (tools or topics).
Humans (leaders and organizers) determine the short list of search terms and publish the visualizations in a report.
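A term-based network of this kind can be sketched as a set of (owner, term) edges, where an edge exists whenever a search term appears in text associated with a person or resource. The function name term_edges and the plain substring matching are assumptions; a real workflow would probably need better text processing.

```python
def term_edges(documents, terms):
    """Return sorted (owner, term) edges for each term found in the owner's text.

    documents is a list of (owner, text) pairs; matching is a case-insensitive
    substring test, which is only a rough stand-in for real keyword extraction.
    """
    edges = set()
    for owner, text in documents:
        lowered = text.lower()
        for term in terms:
            if term.lower() in lowered:
                edges.add((owner, term))
    return sorted(edges)
```

Two people who share a term then become connected through that term's node rather than through a direct edge.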

  • Promote and recruit the collection of updated BibTeX files (manual)

  • Request permission to scrape ORCID, GitHub, and data repositories, and get a list of IDs and repositories

  • Add code to scrape ORCID

  • String manipulation to reconcile names and develop a consistent ID connection for nodes

  • Scrape HydroShare (to do: other data repositories)

  • Scrape GitHub based on an input 'Organization Name'; this requires permission and either Admin access or a generated key to access the API

  • Organizer/Coordinator user inputs = BibTeX file, list provided by network members

  • Natural language processing on abstracts, READMEs, and all metadata fields (the HydroShare Discover search engine, see github.com/hydroshare, does this, for example). Output = list of terms

  • Mapping of keywords corresponding to each term/keyword.

  • View the graph network with nodes and links color-coded by the tags in the search. MVP = pick 5 keywords and make a graph network view that provides a filter with check boxes that turn sources such as GitHub, HydroShare, and Journals on and off

  • Print the filter selection to make a table and report by tags
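The name-reconciliation step in the list above could start from a simple normalization function such as the one sketched here. It is a heuristic under stated assumptions: the function name normalize_name is hypothetical, and the rules (reordering "Last, First", stripping accents and punctuation, lowercasing) will not resolve every variant, such as initials versus full first names.

```python
import re
import unicodedata


def normalize_name(name):
    """Reduce a personal name to a lowercase 'first last' key (heuristic)."""
    # "Last, First" (common in BibTeX) becomes "First Last".
    if "," in name:
        last, first = name.split(",", 1)
        name = f"{first} {last}"
    # Decompose accented characters and drop the combining marks.
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Remove punctuation such as periods after initials, keeping hyphens.
    cleaned = re.sub(r"[^\w\s-]", "", stripped)
    # Lowercase and collapse runs of whitespace.
    return " ".join(cleaned.lower().split())
```

Names that map to the same key can then be treated as candidates for the same node, ideally confirmed against a stable identifier such as an ORCID iD.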
