
participation-dashboard's Introduction

nlpsandbox

Home repository

participation-dashboard's People

Contributors

andrewelamb, tschaffter


participation-dashboard's Issues

Interview MCW and Mayo Clinic on the content of the current dashboard

Share the dashboards with Bradley (MCW) and Hongfang (Mayo Clinic) and ask for their feedback. Regarding the participation dashboard generated for each data site, ask whether there are additional metrics they would like to include, for example for grant reports.

This ticket depends on the completion of the following tasks:

Increase visibility of current best-performing tool in performance dashboard

Take the following plot as example:

[Screenshot: performance-over-time plot from one of the performance dashboards]

As time passes, the point of the current best-performing tool gets pushed and compressed toward the left side of the plot. At some point, an observer may easily overlook this point when it sits on the far left side of the plot.

Idea:

  • Draw a horizontal segment between the point of the best-performing tool and the right side of the plot (which corresponds to the time of the most recent submission). When the mouse cursor hovers over this segment, display the information about the best-performing tool (see the sketch below).
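A minimal plotly sketch of this idea (the data frame `submissions` and its columns `datetime`, `score`, and `tool` are assumptions, not the repo's actual schema):

```r
library(plotly)

# Assumed: one row per submission with columns datetime, score, tool.
best <- submissions[which.max(submissions$score), ]

plot_ly(submissions, x = ~datetime, y = ~score,
        type = "scatter", mode = "markers", name = "Submissions") %>%
  # Horizontal segment from the best point to the most recent submission;
  # hovering over it shows the details of the best-performing tool.
  add_segments(x = best$datetime, xend = max(submissions$datetime),
               y = best$score, yend = best$score,
               name = "Current best", hoverinfo = "text",
               text = paste0(best$tool, " (score: ", round(best$score, 3), ")"))
```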

plotly not installed

docker run -e SYNAPSE_AUTH_TOKEN=... nlpsandbox/participation-dashboard:0.2.0 Rscript render_markdown.R --source_table_synapse_id syn23633030 --destination_folder_synapse_id syn26127246
Unable to find image 'nlpsandbox/participation-dashboard:0.2.0' locally
0.2.0: Pulling from nlpsandbox/participation-dashboard
16ec32c2132b: Already exists 
43156c469c75: Pull complete 
ff9235179179: Pull complete 
150c98294b3b: Pull complete 
562c1c9e2b91: Pull complete 
31d357ac90ff: Pull complete 
45c622a91246: Pull complete 
ddea88b3b92c: Pull complete 
1e24733cf252: Pull complete 
a70478df8afc: Pull complete 
Digest: sha256:c7ab5fee1e8cf780667311f67c1464dcc5f925fe56cfdee6b11b69d55a73a9d9
Status: Downloaded newer image for nlpsandbox/participation-dashboard:0.2.0
Welcome, Thomas Yu!NULL


processing file: participation-dashboard.Rmd
[... knitr progress output elided: setup chunk, then unnamed-chunk-1 ...]
Rows: 165 Columns: 64
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (27): ROW_ETAG, name, status, dockerrepositoryname, dockerdigest, orgSag...
dbl (37): ROW_ID, ROW_VERSION, id, createdOn, createdBy, modifiedOn, evaluat...

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
[... knitr progress output elided: unnamed-chunk-2 through unnamed-chunk-9 ...]
Quitting from lines 153-191 (participation-dashboard.Rmd) 
Error in loadNamespace(x) : there is no package called 'plotly'
Calls: <Anonymous> ... loadNamespace -> withRestarts -> withOneRestart -> doWithOneRestart
In addition: Warning message:
Problem with `mutate()` column `datetime`.
ℹ `datetime = lubridate::as_datetime(.data$createdOn/1000, origin = "1970-01-01")`.
ℹ One or more parsing issues, see `problems()` for details 

Execution halted
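The likely fix is to install plotly in the Docker image alongside the other R dependencies; a minimal sketch of the missing step (where exactly it belongs in the image build is an assumption):

```r
# To be run during the image build, next to the other R package installs:
install.packages("plotly", repos = "https://cloud.r-project.org")
```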

Report number of active submitters over time

Goal

Report the number of unique submitter IDs (they can be user IDs or team IDs) that made at least one submission during a given month. Report for the past 12 months. The motivation is to follow the evolution of the number of users who actively use the NLP Sandbox over time (instead of just looking at the number of registered users).
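A sketch of the computation with dplyr/lubridate, assuming a data frame `submissions` pulled from the Main Admin View with columns `submitterid` and `createdOn` (assumed to be milliseconds since epoch, as elsewhere in this repo):

```r
library(dplyr)
library(lubridate)

active_by_month <- submissions %>%
  mutate(month = floor_date(as_datetime(createdOn / 1000), "month")) %>%
  filter(month >= floor_date(Sys.time(), "month") - months(12)) %>%
  group_by(month) %>%
  # A submitter ID counts once per month, however many submissions it made.
  summarise(n_active_submitters = n_distinct(submitterid))
```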

Similar to

Create data site participation dashboard

Coming back to the idea of having our first dashboard rendered with different (sub)sets of the data. At this point I'm looking for two renderings of the dashboard:

  • Overall participation dashboard, fed with all the data currently being used (done)
  • For each data site, the same participation dashboard, fed with only the data for that data site

The best way to select the data to use is probably at the level of the submission queues. As a reminder, for each NLP Sandbox task, we have an entrypoint, public-facing submission queue. We then have an internal queue for each data site that supports this task (not all data sites support all the tasks). Taking the example of the Date Annotation Task:

  • Entrypoint queue ID: 9614652
  • Sage internal queue ID: Sage is actually an exception and taps into the entrypoint queue. :)
  • MCW internal queue ID: 9614719

Overall participation dashboard

Below are all the entrypoint queue IDs that can be used for the overall participation dashboard:

  • Date annotation: 9614652
  • Person name annotation: 9614657
  • Contact annotation: 9614799
  • ID annotation: 9614797
  • Location annotation: 9614658

Note: The current overall dashboard takes as input all the queues and reports 6 open tasks. However, the sixth task, COVID-19 annotation, is not open yet. Filtering by submission queue will solve this issue.

How to identify the submissions (rows) to keep in the Main Admin View table, from which the participation data are obtained (see the query sketch below):

  • Keep all the rows in Main Admin View (syn22277124) where evaluationid takes a value in the set of queue IDs listed above.
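A sketch of that filter with synapser (the table and queue IDs are the ones listed in this issue):

```r
library(synapser)
synLogin()

# Entrypoint queue IDs listed above.
entrypoint_queues <- c(9614652, 9614657, 9614799, 9614797, 9614658)

query <- sprintf("SELECT * FROM syn22277124 WHERE evaluationid IN (%s)",
                 paste(entrypoint_queues, collapse = ", "))
overall_submissions <- synTableQuery(query)$asDataFrame()
```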

MCW participation dashboard

MCW internal queue ID:

  • Date annotation: 9614719
  • Person name annotation: 9614720
  • Contact annotation: 9614804
  • ID annotation: 9614803
  • Location annotation: 9614721

How to identify the submissions (rows) to keep in the Main Admin View table, from which the participation data are obtained (a sketch follows the steps):

  1. Select all the rows in the table MCW (syn22277124) where evaluationid takes a value in the set of queue IDs listed above. From these rows, create a set A of the unique submission IDs found in the column "name".
  2. In the Main Admin View (syn22277124), keep all the rows where id takes a value in A.
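A sketch of the two steps with dplyr, assuming both tables have already been downloaded as data frames (`mcw_table` and `main_admin_view` are placeholder names):

```r
library(dplyr)

mcw_queues <- c(9614719, 9614720, 9614804, 9614803, 9614721)

# Step 1: the set A of unique submission IDs from the "name" column.
a <- mcw_table %>%
  filter(evaluationid %in% mcw_queues) %>%
  distinct(name) %>%
  pull(name)

# Step 2: keep the Main Admin View rows whose id is in A.
mcw_submissions <- main_admin_view %>%
  filter(id %in% a)
```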

Participation dashboard template

The titles of the rendered dashboards should be:

  • NLP Sandbox
  • NLP Sandbox - {data site name}
    • NLP Sandbox - MCW
    • NLP Sandbox - Sage Bionetworks (in the future; for now it is not convenient because the Sage Bionetworks data site is not like the other data sites, see note above)

Configuration

Consider adding a (JSON) configuration file to the code base to customize the rendering of the participation dashboard, and soon of other dashboards. At this point, for the participation dashboard, the configuration file should define the submission queues used to filter the data and the title of the dashboard.
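A minimal sketch of such a configuration file (the field names are hypothetical):

```json
{
  "title": "NLP Sandbox - MCW",
  "submission_queue_ids": [9614719, 9614720, 9614804, 9614803, 9614721]
}
```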

Report on the number of submissions made over time

Goal

Report on the number of submissions made over time to visualize whether adoption of the NLP Sandbox is increasing.

Implementation

Suggestion:

  • Use a bar plot:
    • x: time with one bar per month
    • y: total number of submissions
  • Report data only for the last 12 months

The motivation is that we should have at least one submission per month ^^. In this early phase of the project, we don't get a submission every week.
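A sketch of the suggested bar plot, reusing plotly as elsewhere in this repo (`monthly` is an assumed data frame with one row per month and columns `month` and `n`):

```r
library(plotly)

# One bar per month over the last 12 months of submissions.
plot_ly(monthly, x = ~month, y = ~n, type = "bar") %>%
  layout(xaxis = list(title = "Month"),
         yaxis = list(title = "Total number of submissions"))
```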

Finalize and publish performance dashboards

  • The current filenames include a mix of dashes and underscores separators. Use only dashes.
    • E.g. performance-dashboard_contact_annotation.html => performance-dashboard-contact-annotation.html
  • Remove Andrew Lamb from the header
  • Update title to NLP Sandbox Contact Annotators (adapt for other annotators)
  • Remove Overview box
  • Push the HTML files to the public folder https://www.synapse.org/#!Synapse:syn26156540

New dashboard - Dashboard that reports the performance of the best solution for each task

Goal

The NLP Sandbox decomposed the PHI annotation task into smaller, modular tasks like the date annotation task, person name annotation task, etc. One of the motivations is to enable tool developers to identify where their time would be best invested by looking at the leaderboard of each task. For example, if there are multiple solutions with a near-perfect score for the date annotation task but no satisfying solution yet for the person name annotation task, this indicates to developers that their time would be best spent working on a new solution for the person name annotation task.

Instead of visiting all the leaderboards in order to obtain this information, we could compile this information in a small dashboard made of tiles (square or rectangle), one for each task. Each tile should include the following information:

  • The name of the task
  • The score of the best submission submitted to this task
  • Color the background of the tile based on the performance.
    • Example: red (min score) to green (max score)

Therefore, just by looking at the color of the tiles, one would be able to identify the challenging tasks for which no satisfying solution has been submitted yet (red-orange colors).
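A sketch of the score-to-color mapping (assuming scores normalized to [0, 1]):

```r
# 101-step gradient from red (min score) to green (max score).
tile_palette <- colorRampPalette(c("red", "green"))(101)

tile_color <- function(score) {
  tile_palette[round(score * 100) + 1]
}

tile_color(0.95)  # near-green hex color for a near-perfect score
```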

@andrewelamb We can discuss offline how to get the above information if needed.

Note that for each of the current tasks, we report a score for two datasets, so we can either:

  • Select one dataset that we consider representative of the two
  • Average the scores obtained on the two datasets (ideally a weighted average)

Also, since we report more than one performance metric, we would need to select one.

And to complicate things further, we have tasks like the Location annotation task that report scores for two variants of the task. :)

Prototype

  • For now, consider only the performance for the i2b2 dataset
  • For now, consider only the tasks that have only one variant: date annotation and person name annotation

Generate participation dashboards for data sites

Dashboards:

  • Overall participation dashboard: fed with all the data
  • One participation dashboard per (external) data site: fed only with the data from that data site. Currently we have one external data site, MCW.

Report number of successful submissions

Goal

Report the number/ratio of submissions that are successful. A submission fails either because of a bug in the benchmarking infrastructure or because of a mistake/bug in the submitted tool. It would be nice to be able to distinguish the two, as only the first type of error informs on the stability of the infrastructure. Yet, we can indirectly contribute to reducing the second type of error with good documentation. Anyway, this metric will be an incentive for us to improve the system/documentation.

Implementation

Suggested layout:

[Mockup: suggested layout]

  • Main number: the number of submissions with status = ACCEPTED (always based on the Main Admin View unless specified otherwise), computed as sketched below
  • Total number: the total number of submissions in the table
  • Keep only the submissions made over the last month.
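A sketch of the computation (the `status` and `createdOn` column names follow the Main Admin View as used above):

```r
library(dplyr)
library(lubridate)

last_month <- submissions %>%
  filter(as_datetime(createdOn / 1000) >= Sys.time() - months(1))

n_accepted <- sum(last_month$status == "ACCEPTED")  # main number
n_total <- nrow(last_month)                         # total number
```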

Create Dockerfile

Specification

  • The Docker image should get the Synapse personal access token as an environment variable SYNAPSE_TOKEN.
  • Ultimately, the Docker image will be used to build multiple notebooks (Rmd => html) and push them to different locations on Synapse.
  • The program in the Docker image must also be runnable locally, for example by mounting a volume to the Docker container where the HTML files will be saved. The user can then access and review the HTML files in the mounted volume/folder (an example invocation follows).
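An example invocation of that local workflow, modeled on the docker run command shown earlier on this page (the container-side path /output is an assumption):

```
docker run \
  -e SYNAPSE_TOKEN=... \
  -v "$(pwd)/html:/output" \
  nlpsandbox/participation-dashboard \
  Rscript render_markdown.R --source_table_synapse_id syn23633030 --destination_folder_synapse_id syn26127246
```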

New dashboard - Report the evolution of the performance of all the submissions for a given task

Goal

For each task, draw a plot similar to this one from paperswithcode. In this plot, the performance of each submission is represented by a dot.

  • X axis: time (range: determined by the timestamps of the submissions)
  • Y axis: performance (range: min score to max score)

The blue line represents the evolution of the best submission.
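A sketch of the dots-plus-best-line plot with plotly (`task_submissions` with columns `datetime` and `score` is an assumed input):

```r
library(dplyr)
library(plotly)

task <- task_submissions %>%
  arrange(datetime) %>%
  # Running maximum: the best score achieved up to each submission.
  mutate(best_so_far = cummax(score))

plot_ly(task, x = ~datetime) %>%
  add_markers(y = ~score, name = "Submissions") %>%
  add_lines(y = ~best_so_far, name = "Best submission",
            line = list(color = "blue"))
```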

It would be great to include this plot at the top of the leaderboard page of each task. Therefore, an individual HTML file should be rendered for each task.
