
8knot's Issues

UI: add descriptions to visualizations

New users coming to the page would benefit from context on each visualization. The format is to be decided by the contributor who works on this, but the following should be done:

  • Create a consistent format for the descriptions added to each visualization, visible only when someone hovers over or clicks something. The format should be reusable for every future visualization.
  • Write a description for each existing visualization, with guidance on its toggles.

Determine next page to start on for explorer

We should start to think about the next page(s) for explorer once the overview page reaches the "plug and chug" phase for visualizations.
Questions:

  • What page should be next?
  • What are some proposed visualizations for this page?
  • What are the pages we plan on doing? How many? What grouping?

Adding user persistence - settings and history

Several opportunities for users to save instance state relating to their workflow have become evident. We therefore ought to see how we can integrate user persistence where possible and explore best practices for doing this with Dash.

We could integrate it into a Javascript / React-based webapp:
https://dash.plotly.com/integrating-dash

Or we could try to embed the Dash app inside of a Flask app:
https://hackersandslackers.com/plotly-dash-with-flask/

Or we could ultimately move to a flask / plotly stack rather than relying on Dash:
https://towardsdatascience.com/web-visualization-with-plotly-and-flask-3660abf9c946

Interestingly, the last option seems to be more extendible in this respect, but it would obviously take us out of the nice sandbox that Dash allows us to play in.

More to come.

Visualization: Love-orbit

Updates to come with more details. As a starting point while we explore anonymity, we will start with a graph plotting the number of contributors over time with a variable "love score". This will allow for seeing active contributors based on multiple different inputs scaled over time (issues, PRs, comments, etc.).

Inspiration: https://github.com/orbit-love/orbit-model
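
One way such a score could be prototyped is sketched below. This is not the Orbit model's formula: the action weights, the (contributor, action, date) event shape, and the function names are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import date

# Hypothetical weights: how much each action type counts toward the score.
ACTION_WEIGHTS = {"commit": 3.0, "pr": 2.0, "issue": 1.0, "comment": 0.5}


def love_scores_by_month(events, weights=ACTION_WEIGHTS):
    """Sum weighted activity per (year, month) and contributor.

    `events` is an iterable of (contributor_id, action, date) tuples.
    Actions without a weight contribute nothing.
    """
    scores = defaultdict(lambda: defaultdict(float))
    for contributor, action, when in events:
        scores[(when.year, when.month)][contributor] += weights.get(action, 0.0)
    return {month: dict(per_person) for month, per_person in scores.items()}


def active_contributors(scores, threshold):
    """Count, per month, the contributors whose score meets the threshold."""
    return {
        month: sum(1 for score in per_person.values() if score >= threshold)
        for month, per_person in sorted(scores.items())
    }


events = [
    ("a", "commit", date(2022, 1, 5)),
    ("a", "comment", date(2022, 1, 9)),
    ("b", "issue", date(2022, 1, 20)),
    ("b", "pr", date(2022, 2, 2)),
]
scores = love_scores_by_month(events)
print(active_contributors(scores, threshold=2.0))
```

Making the threshold a user input would let the graph show how the "active" population changes as the bar for activity moves.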

Chaoss collaboration trial

As we transition into the visualization creation phase of this project, the goal of explorer remains to provide a deeper perspective than other available tooling. The CHAOSS metrics working group has spent time collaborating on the theoretical side of this, the "what to measure and why" side, with diverse perspectives from across the open source landscape. More details to come on the breakdown of tasks.

Visualization to be created (more to be added):

  • Drive By/Repeat Contributions
  • First time Contributions Per Quarter
  • Contributor types over time
  • #52
  • #53

New Visualization: Response rate to Bugs

Please describe the background and context for this new visualization
Response/Close rate to issues with the bug/defect tag

Describe the perspective you'd like the final visual to give
We would like to show the response rate to an issue that can be assumed to be associated with a bug.

Describe the acceptance criteria for the issue and visualization to be complete
This should be completed in the finalized visualization tool or as a demo testing tools out


Active contributors by Action

This visualization will show the number of contributors by time interval per action.

User inputs:

  • selects from drop down which action to look at
  • chooses # of months for time buckets

This will be implemented using a histogram.
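
The bucketing behind such a histogram might look like the sketch below. The event tuples and function names are hypothetical, and buckets are aligned to calendar months, so a bucket size that does not divide 12 will not line up with year boundaries.

```python
from collections import defaultdict
from datetime import date


def bucket_start(when, months_per_bucket):
    """Return the (year, month) that starts the bucket containing `when`."""
    index = when.year * 12 + (when.month - 1)   # months counted from year 0
    start = index - index % months_per_bucket   # snap down to the bucket edge
    return (start // 12, start % 12 + 1)


def contributors_per_bucket(events, action, months_per_bucket):
    """Count distinct contributors per time bucket for one action type.

    `events` is an iterable of (contributor_id, action, date) tuples.
    """
    seen = defaultdict(set)
    for contributor, event_action, when in events:
        if event_action == action:
            seen[bucket_start(when, months_per_bucket)].add(contributor)
    return {bucket: len(people) for bucket, people in sorted(seen.items())}


events = [
    ("a", "pr", date(2022, 1, 10)),
    ("a", "pr", date(2022, 2, 11)),
    ("b", "pr", date(2022, 3, 1)),
    ("b", "issue", date(2022, 3, 2)),
    ("c", "pr", date(2022, 5, 20)),
]
print(contributors_per_bucket(events, action="pr", months_per_bucket=3))
```

Counting distinct contributors rather than events keeps one very active person from inflating a bucket.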

Integrate non-visualization "metrics" to Overview Page

Refer to the miro board design for reference

The work pipeline for non-visualization data points is TBA.

Non-visualizations required for completion (shorthand, refer to the Miro board for more details):

  • README Description if available
  • mailing list (if accessible from repo)
  • Code of Conduct (Y/N)
  • License
  • Contributor Guideline (Y/N)
  • Website Link
  • Social Media Link
  • Languages in Repo
  • Company Involvement List
  • # of Releases
  • Last Release, number and date

Question: Can a search bar be connected across pages?

Once the search bar is created on the opening page, is there a way to:

  • Show the selected options on all pages
  • Show each selection as a box with an "x" (a better way to describe this is still needed) so that people can deselect options and the callback triggers the visualizations to reload
  • Connect the search bar across pages so it is all linked (some form of "global variable"-esque thing)

Search bar- Object across pages

We want users to be able to change their search bar inputs on any page, with the change triggering updates across all pages. More details to come.

Collaborate for UX/UI design guidance

TODO:

  • meet with tigger and liz to get input
  • determine color scheme
  • determine basic design template for pages
  • design for opening page

dash bootstrap themes

Bug fix: ~12 second wait time for app to load

Apparently not linked to getting or setting data, our app can take upward of 12 seconds to become usable upon initialization.

I'm going to profile the callbacks to determine the origin of this bottleneck.

EDIT:
How will a dynamic callback for the dropdown affect this problem?

  • Implement the dynamic dropdown
  • Check its speed compared to the current solution
  • Verify whether the dynamic dropdown is the alternative we should be using

Charming data Dash Overview

@cdolfi will be going through the following videos and documenting notes and reflections here:

  1. Introduction to Dash Plotly Data Visualization in Python - https://youtu.be/hSPmj7mK6ng
  2. Introduction to Plotly Data Visualization - https://youtu.be/_b2KXL0wHQg
  3. The Dash Callback Input, Output, State, and more - https://youtu.be/mTsZL-VmRVE
  4. Dropdown Selector Python Dash Plotly - https://youtu.be/UYH_dNSX1DM
  5. Pie Chart (Dropdowns) Python Dash Plotly - https://youtu.be/iV51JqP6y_Q
  6. All about the Graph Component Python Dash Plotly - https://youtu.be/G8r2BB3GFVY
  7. Complete Guide to Bootstrap Dashboard Apps Dash - https://youtu.be/0mfIK8zxUds

Repo Link: https://github.com/Coding-with-Adam/Dash-by-Plotly (all credit goes to @Coding-with-Adam for content)

New Visualization: Seasonality of Community:

The seasonality of a community pertains to the different time based cycles on community activity. As described by @GregSutcliffe :
"

  • The day-of-week is useful to maintainers when thinking about things like when to post news, event invites etc, and when to be available/have office hours. #203
  • Day-of-year gives maintainers some way to adjust expectation for the coming time-horizon (i.e is it March? I might expect a dip in contribution then, and should not panic when it happens). Clearly this does not work for new projects (STL requires 2 time periods, so 2 years for this) #205
  • Not shown, but other seasonalities are possible. Hourly? Might reveal geographic information (i.e. if you get peaks at UTC +/-1 then you perhaps have an EU-centric community) #205
  • Non-periodic things can be done too, holidays are common. Also I once used an STL with "holidays" corresponding to release dates to analyse the effect of new releases on people upgrading from old versions.

The trend is useful to maintainers and contributors alike - the former will want to know how the project is faring, the latter will want to know if the project is alive and worth contributing to"

Criteria for acceptance (to be updated):

  • Determine which of the above seasonalities to measure, why, and how
  • Create a small write-up to go with the charts to make them accessible and understandable for all
  • Make conscious design choices that keep the charts readable for people without a data science or statistics background
  • Create a notebook with the visuals
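
As a first step before any STL decomposition, the day-of-week view could be approximated with a plain count. The sketch below assumes activity is available as a list of dates. The function name is hypothetical.

```python
from collections import Counter
from datetime import date

DAY_NAMES = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def activity_by_weekday(dates):
    """Count events per weekday, always returning all seven days."""
    counts = Counter(d.weekday() for d in dates)   # Monday is 0
    return {name: counts.get(i, 0) for i, name in enumerate(DAY_NAMES)}


activity = [date(2022, 3, 7), date(2022, 3, 8), date(2022, 3, 14)]
print(activity_by_weekday(activity))
```

Returning zero for quiet days keeps the x-axis stable when the chart is redrawn for a different repo.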

How does layering pages work with column width?

The overall view of a Dash app has 12 column units to work with. With the layered page view, the page area has 9 units. When coding the pages (not the index page), do we have 12 units to work with or 9? Will figure out better wording to explain this question.
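
In Bootstrap's grid, which Dash Bootstrap layouts build on, a row nested inside a column is again divided into 12 units relative to that column. Assuming the page content is rendered as a row inside the 9-unit column, the arithmetic can be sketched as follows (the helper name is hypothetical):

```python
from fractions import Fraction

GRID_UNITS = 12


def effective_fraction(*nested_widths):
    """Fraction of the full viewport width taken by nested grid columns.

    Each width is given in units of its parent's 12-unit grid,
    listed from the outermost column to the innermost.
    """
    fraction = Fraction(1)
    for width in nested_widths:
        if not 1 <= width <= GRID_UNITS:
            raise ValueError(f"column width must be 1..{GRID_UNITS}, got {width}")
        fraction *= Fraction(width, GRID_UNITS)
    return fraction


# A full-width (12) column inside the 9-unit page area spans 3/4 of the screen.
print(effective_fraction(9, 12))
# A half-width (6) column inside the same area spans 3/8 of the screen.
print(effective_fraction(9, 6))
```

Under that assumption, a page would be coded against 12 units, and those 12 units would together fill the 9-unit area.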

Bus factor graph

Bus factor-esque graph: show the number/percent of files that have been updated by 1, 2, 3, etc. contributors in x amount of time (input by user)

2 Bucket Bus Factor:

  • Files changed in past 18 months
  • Contributors active in past 18 months
  • Intersection
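
The file-level count could be prototyped as in the sketch below, assuming the query returns one (file path, author, date) row per change. The row shape and function name are assumptions, not the query referred to in the TODOs.

```python
from collections import Counter, defaultdict
from datetime import date, timedelta


def files_by_contributor_count(changes, today, window_days):
    """Map "number of distinct authors" to "number of files" in the window.

    `changes` is an iterable of (file_path, author, date) tuples.
    """
    cutoff = today - timedelta(days=window_days)
    authors = defaultdict(set)
    for path, author, when in changes:
        if when >= cutoff:
            authors[path].add(author)
    histogram = Counter(len(people) for people in authors.values())
    return dict(sorted(histogram.items()))


changes = [
    ("app.py", "a", date(2022, 5, 1)),
    ("app.py", "b", date(2022, 5, 2)),
    ("index.py", "a", date(2022, 4, 1)),
    ("index.py", "a", date(2022, 4, 9)),
    ("old.py", "c", date(2019, 1, 1)),   # outside the window, ignored
]
print(files_by_contributor_count(changes, today=date(2022, 6, 1), window_days=540))
```

Files that map to a count of 1 are the ones a single departure would leave without a recent maintainer.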

Current TODOs:

  • @sgoggins creating initial query for this visualization

Create testing framework

Our application lacks any testing process before it is deployed or merged into dev or main. A difficulty in testing our app is that the data we are using are constantly updating- we need to target functions of the app that are data-agnostic.

The following are good candidate technical tests:

  • Augur database is available
  • Query to Augur with known return (up to max historical time) matches previously expected return value (past data isn't changing underfoot)
  • Augur query of reasonably large size doesn't take too long
  • App doesn't raise any runtime errors on deployment (run the app, hooks into logging for graph rendering)
  • Large query takes any amount of time but doesn't error out.
  • Moving between tabs w/ web driver changes url route as expected (Selenium)

We should have the long-run ability to test:

  • different versions of python
  • different operating systems (good practice for contributor accessibility)

We'll begin by implementing tests in pytest and move to using tox later when we want to test across environments. GitHub Actions can run tests across these OS environments as well, so that's an option.
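
A data-agnostic test of the "known return" and "doesn't take too long" candidates might look like this sketch. The query function is a stand-in for a real Augur query bounded by a fixed end date. The helper names and the 5-second budget are assumptions.

```python
import hashlib
import json
import time


def fingerprint(rows):
    """Hash a result set so it can be compared without storing every row."""
    canonical = json.dumps(sorted(rows), default=str)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def run_timed(query_fn, max_seconds):
    """Run a zero-argument query function and fail if it is too slow."""
    start = time.perf_counter()
    rows = query_fn()
    elapsed = time.perf_counter() - start
    assert elapsed <= max_seconds, f"query took {elapsed:.2f}s"
    return rows


def fake_historical_query():
    # Stand-in for a real query whose date range ends in the past.
    return [(2, "2020-02-01", 7), (1, "2020-01-01", 5)]


# In practice this value would be recorded once and committed with the tests.
EXPECTED = fingerprint([(1, "2020-01-01", 5), (2, "2020-02-01", 7)])


def test_historical_query_is_stable_and_timely():
    rows = run_timed(fake_historical_query, max_seconds=5.0)
    assert fingerprint(rows) == EXPECTED
```

Sorting before hashing makes the check independent of row order, which a database is free to change between runs.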

Dash code review?

As we are starting the development process with Dash, can we find someone with significant Dash experience to do some code review? Especially while @JamesKunstle and I are still getting acquainted with Dash, it would be beneficial to have some experienced eyes on our work.

Bug: Time out error on query call back

⛑️ Callback error updating contributions.data (2:54:12 PM)

504 Gateway Time-out

The server didn't respond in time.

@sgoggins this is on the query you made, any thoughts? We might have to move to a materialized view.

Add visualizations to pages -- Overview

Refer to the miro board design for reference

The visualizations will be generated via jupyter notebook in the sandiego repo. An issue is to be created for each integration once the corresponding notebook is created.

Visualizations Required for Completion(short hand, refer to miro board for more details):

Chaoss page: graph formatting

When working on the Dash app locally, the two graphs are on the same row, but they appear on separate rows in the OpenShift deployment.

New visualization: Response/Merge time analysis comparing contributors

There may be some value in comparing the speed and engagement in responding to and merging PRs between the top (10ish?) contributors and the median/mean contributors. This visualization is intentionally not fully fleshed out: the aim is to explore different ways of looking at this to determine whether it is useful and in which form.

Acceptance criteria:

  • Determine if the number of "top contributors" should be a set number or a percent of the contributor count
  • Response: time to first response should be tracked, but is there a value in number of response or other metrics in that area?
  • Determine if mean, median, (or something else) should be the comparison group
  • Would a histogram of some sort be better for this?
  • Determine if this should be added to an existing notebook or stand on its own
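
To make the open questions concrete, the sketch below splits PRs by whether the author is among the top N by PR count and compares median time to first response. The (author, hours) input shape, the fixed-N cutoff, and the choice of median are assumptions, and they are exactly the points the criteria above leave undecided.

```python
from collections import Counter
from statistics import median


def compare_response_times(prs, top_n):
    """Median hours to first response for top-N authors versus everyone else.

    `prs` is a list of (author, hours_to_first_response) tuples.
    """
    counts = Counter(author for author, _ in prs)
    top = {author for author, _ in counts.most_common(top_n)}
    top_hours = [hours for author, hours in prs if author in top]
    rest_hours = [hours for author, hours in prs if author not in top]
    return {
        "top": median(top_hours) if top_hours else None,
        "rest": median(rest_hours) if rest_hours else None,
    }


prs = [("a", 1), ("a", 3), ("a", 5), ("b", 10), ("c", 20)]
print(compare_response_times(prs, top_n=1))
```

Replacing `top_n` with a share of the contributor count would cover the percent-based variant from the first criterion.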

New Visualization: Bus Factor

Please describe the background and context for this new visualization
The Bus Factor is a compelling metric because it visualizes the question "how many contributors can we lose before a project stalls?" by hypothetically having these people get run over by a bus (more pleasantly, how many would have to win in a lottery and decide to move on).

The Bus Factor is the smallest number of people that make 50% of contributions.

Describe the perspective you'd like the final visual to give
https://chaoss.community/metric-bus-factor/

Describe the acceptance criteria for the issue and visualization to be complete
When the issue is taken on, the people working on it should edit this issue to describe the steps necessary to complete the visualization in the explorer dashboard.
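
The definition given above (the smallest number of people that make 50% of contributions) can be computed as in the sketch below. What counts as one contribution (a commit, a PR, and so on) is left open here, and the function name is hypothetical.

```python
from collections import Counter


def bus_factor(contribution_authors, share=0.5):
    """Smallest number of people who together made `share` of contributions.

    `contribution_authors` has one author entry per contribution.
    """
    counts = Counter(contribution_authors)
    total = sum(counts.values())
    if total == 0:
        return 0
    covered = 0
    # Walk authors from most to least active until the share is reached.
    for people, (_, made) in enumerate(counts.most_common(), start=1):
        covered += made
        if covered >= share * total:
            return people
    return len(counts)


print(bus_factor(["a"] * 6 + ["b"] * 3 + ["c"]))
print(bus_factor(["a", "b", "c", "d"]))
```

In the first call a single author made 6 of 10 contributions, so the result is 1. In the second, work is spread evenly and two people are needed to reach half.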

Reorganize repo to enable multipage callbacks

Page layouts of non-main pages don't have access to callbacks of app object because they're 'lower' in the directory hierarchy than app.py.

The repository needs to be reorganized to fix this, and the best way to do this is likely with URL routing via the index.py invocation page.

Backend OO rework

To support faster integration of visualizations from sandiego-rh/sandiego into sandiego-rh/explorer, it is necessary to flesh out a more organized backend.

Connecting to the Augur DB, setting the dbschema, and getting a repo's id from its name, among other operations, are common to all visualization scripts and ought to be presented to the user through a simple interface.
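
A minimal sketch of what such an interface could look like follows. The class name, the `repo` table, and its `repo_id` and `repo_name` columns are assumptions for illustration, and an in-memory SQLite database stands in for the Augur database so the example runs on its own.

```python
import sqlite3


class RepoDirectory:
    """Hypothetical helper that hides connection details from notebooks."""

    def __init__(self, connection):
        # Any DB-API connection works; callers never touch cursors directly.
        self._connection = connection

    def repo_id(self, repo_name):
        """Return the id for a repo name, or None if it is unknown."""
        cursor = self._connection.cursor()
        # "?" is sqlite3's placeholder; PostgreSQL drivers usually use "%s".
        cursor.execute("SELECT repo_id FROM repo WHERE repo_name = ?", (repo_name,))
        row = cursor.fetchone()
        cursor.close()
        return row[0] if row else None


def demo_connection():
    """Build a throwaway database with one known repo."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE repo (repo_id INTEGER PRIMARY KEY, repo_name TEXT)")
    conn.execute("INSERT INTO repo VALUES (1, 'augur')")
    return conn


directory = RepoDirectory(demo_connection())
print(directory.repo_id("augur"))
```

Passing the connection in, rather than creating it inside the class, keeps credentials and schema setup in one place and makes the helper easy to test.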

Component Update: DropDown component on index page.

The Dropdown contributes ~12 seconds of wait time for the user because it is doing background work, likely loading data into memory. We would like to speed this up, and one option is to make a dynamic callback.

#49

The above issue explores this option.

For this issue, our definition of done is:

  • Try dynamic callbacks for the Dropdown bar
  • Ensure case-insensitivity and look into very conservative fuzzy-matching.
  • Implement the dynamic callback for the dropdown bar, ensure that it's more timely.
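
The matching step could be sketched as below, using only the standard library. In a Dash app, a function like this would typically be called from a callback on the Dropdown's `search_value`. The function name, result limit, and 0.8 similarity cutoff are assumptions.

```python
import difflib


def search_options(query, options, limit=10, cutoff=0.8):
    """Return options matching `query`, ignoring case.

    Substring matches are preferred; fuzzy matching is only a fallback,
    with a high cutoff to keep it conservative.
    """
    needle = query.strip().casefold()
    if not needle:
        return []
    # Map case-folded text back to the original option label.
    folded = {option.casefold(): option for option in options}
    hits = [original for key, original in folded.items() if needle in key]
    if not hits:
        close = difflib.get_close_matches(needle, list(folded), n=limit, cutoff=cutoff)
        hits = [folded[key] for key in close]
    return hits[:limit]


options = ["chaoss/augur", "chaoss/grimoirelab", "sandiego-rh/explorer"]
print(search_options("AUGUR", options))
print(search_options("chaoss/augr", options))
```

Returning only a short list per keystroke means the full set of repos never has to be sent to the browser up front, which is the cost the static dropdown pays at load time.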

ERROR: Exceeds Quota

When storing data for a large org, Dash gives the following error: "Failed to execute 'setItem' on 'Storage': Setting the value of 'commits-data' exceeded the quota."

Despite the error, the graph still updates with new data and performs as expected. The only concern is whether the graph is then showing just a subset of the data. More investigation to come.

We need to investigate what the data transfer limitations are and how we need to format queries to work within them.
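
As part of that investigation, the size of a payload can be estimated before it is stored. The sketch below uses a 5 MiB limit, which is an assumption: browser storage quotas vary, and some browsers count characters rather than UTF-8 bytes, so this is only a rough check. The helper names are hypothetical.

```python
import json

# Assumed limit; real browser storage quotas differ by browser.
ASSUMED_QUOTA_BYTES = 5 * 1024 * 1024


def payload_size_bytes(records):
    """Approximate stored size: the UTF-8 length of the JSON encoding."""
    return len(json.dumps(records).encode("utf-8"))


def fits_in_quota(records, quota=ASSUMED_QUOTA_BYTES):
    return payload_size_bytes(records) <= quota


def trim_columns(records, keep):
    """Drop every field a visualization does not need before storing."""
    return [{key: row[key] for key in keep if key in row} for row in records]


rows = [{"id": i, "date": "2022-01-01", "message": "x" * 200} for i in range(50_000)]
print(fits_in_quota(rows))
print(fits_in_quota(trim_columns(rows, keep=["id", "date"])))
```

Selecting only the needed columns in the query itself would achieve the same reduction without transferring the extra data at all.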

Code/Repo Clean up

Now that we are a little further along with the Dash app, the following things need to be done to improve the repo and give new developers a better experience when they come in:

TODO:

  • Move the db_interface folder
  • Clean up the code that was created for repeated functions and make it applicable to what we know now
  • Make a .py file that holds all callback code for queries to keep the index page clean
  • Add comments in the index page explaining where the callbacks are and how to add more
  • Update all of the inline comments for the current app
  • Clean the code from a design perspective (CSS, formatting) to make sure it is all consistent

Notes on the last bullet point: We use a lot of different template code with different formatting strategies (class_name, style, etc.). Before things build up too much, let's make this all consistent so that there are no odd formatting issues as we build and we have items to use as templates.

Create basic Search bar

This will be the initial step toward a search bar that is accessible across pages.
TODO:

  • Determine best dash object for search bar to be connected across many visualizations and pages (is there a single object for it or must it be directly connected to a single graph?)
  • create search bar with repos/orgs as searchable inputs
  • allow for multiple selections
  • show selected options on search page

New Visualization: Metric License Coverage

Please describe the background and context for this new visualization
Determine how many files are covered by licenses and number of files covered by each license

Describe the perspective you'd like the final visual to give
https://chaoss.community/metric-license-coverage/
https://chaoss.community/metric-license-declared/

Describe the acceptance criteria for the issue and visualization to be complete
Not to be done in a notebook. Either used to test a visualization tooling option or completed when the tool is established


REMINDER:
Before a visualization issue can be closed, there must be clear documentation in the notebook of the decisions made at each step and the "why." Also, any ML ideas generated from this process should be created as an issue with the ml request tag.

Query for gathering repo_ids from multiple github urls

Given the input from the search bar, generate the query and any additional code necessary to output the necessary repo_ids for the explorer pages

Input could be:

  • one or many repos
  • one or many orgs
  • a combination of both
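
Before any query is built, the search bar input has to be split into orgs and repos. The sketch below does only that step and leaves the actual repo_id lookup open. The function name and the lowercase normalization are assumptions.

```python
from urllib.parse import urlparse


def classify_github_urls(urls):
    """Split GitHub URLs into org names and (org, repo) pairs.

    URLs on other hosts, or with no path, are skipped.
    """
    orgs, repos = set(), set()
    for url in urls:
        cleaned = url.strip()
        if "://" not in cleaned:
            cleaned = "https://" + cleaned   # tolerate "github.com/org/repo"
        parsed = urlparse(cleaned)
        parts = [part for part in parsed.path.split("/") if part]
        if parsed.netloc.lower() not in ("github.com", "www.github.com") or not parts:
            continue
        if len(parts) == 1:
            orgs.add(parts[0].lower())
        else:
            name = parts[1][:-4] if parts[1].endswith(".git") else parts[1]
            repos.add((parts[0].lower(), name.lower()))
    return sorted(orgs), sorted(repos)


print(classify_github_urls([
    "https://github.com/chaoss",
    "github.com/chaoss/augur.git",
    "https://github.com/orbit-love/orbit-model/",
    "https://example.com/not-github",
]))
```

The two lists can then be passed as parameters to whichever query resolves them to repo_ids, which avoids building SQL from raw user input.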

Enhancement: Inform user of Plotly view change ability

Plotly lets users change the view of a figure, which matters when a table or graph shows an edited view and the user wants to see a different subset. Determine how to surface this functionality to users. It may already be a Plotly feature that just needs to be turned on.

UX Enhancements

Through different conversations with @harishpillay and @sgoggins, the following UX suggestions will be implemented:

  • Notes on the start page to help guide users, as a placeholder until we do more design implementations
  • Add buttons for GH issue creation: bug report, new visualization request, new repo/org request
  • Loading bar as graphs/search bar update (see #37)
  • Create issue templates to link to the buttons

Add deterministic requirements using pipenv

Right now we install our Dash application requirements from a requirements.txt file.

We ought to have closer control over which versions of modules we use, so that our production state matches our development state as nearly as possible and we avoid "gotchas" arising from unvetted library incompatibilities that appear in prod but not in dev.

We can do this by converting from the typical Python "pip install -r requirements.txt" workflow to a pipenv workflow.
