oss-aspen / 8knot
Dash app in development to serve open source community visualizations using GitHub data from Augur. Hosted app: https://eightknot.osci.io
License: MIT License
Please describe the background and context for this new visualization
Response/Close rate to issues with the bug/defect tag
Describe the perspective you'd like the final visual to give
We would like to show the response rate to an issue that can be assumed to be associated with a bug.
Describe the acceptance criteria for the issue and visualization to be complete
This should be completed in the finalized visualization tool or as a demo testing tools out
Our application lacks any testing process before it is deployed or merged into dev or main. One difficulty in testing our app is that the data we use is constantly updating, so we need to target functions of the app that are data-agnostic.
The following are good candidate technical tests:
We should have the long-run ability to test:
We'll begin by implementing tests in pytest and move to Tox later when we want to test across environments. GitHub Actions can also run tests across OS environments, so that's an option.
We should start to think about the next page(s) for explorer once the overview page is in the "plug and chug" phase for visualizations.
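A minimal sketch of what a data-agnostic test could look like. The helper `count_by_month` is hypothetical and not taken from the 8Knot codebase; the point is only that the test feeds in fixed, hand-written input, so it never depends on live Augur data.

```python
from collections import Counter
from datetime import date


def count_by_month(dates):
    """Hypothetical helper: count events per (year, month) bucket."""
    return dict(Counter((d.year, d.month) for d in dates))


def test_count_by_month_uses_fixed_input():
    # Fixed input: the expected result never changes as the database updates.
    dates = [date(2022, 1, 3), date(2022, 1, 20), date(2022, 2, 1)]
    assert count_by_month(dates) == {(2022, 1): 2, (2022, 2): 1}


def test_count_by_month_handles_empty_input():
    assert count_by_month([]) == {}
```

Saved in a file whose name starts with `test_`, these functions would be picked up by running `pytest` from the repository root.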
Questions:
This visualization will show the number of contributors by time interval per action.
User inputs:
This will be implemented using a histogram.
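One possible way to prepare the counts behind such a histogram, as a sketch only. The `(contributor_id, action, date)` record shape and the monthly interval are assumptions for illustration; the real inputs would come from the Augur queries and the user's selections.

```python
from collections import defaultdict
from datetime import date


def contributors_per_interval(events):
    """Count distinct contributors per (action, month).

    `events` is an iterable of (contributor_id, action, date) tuples;
    this record shape is an assumption for illustration.
    """
    seen = defaultdict(set)
    for contributor_id, action, day in events:
        bucket = (day.year, day.month)  # monthly interval, assumed here
        seen[(action, bucket)].add(contributor_id)
    return {key: len(ids) for key, ids in seen.items()}


events = [
    ("alice", "commit", date(2022, 3, 1)),
    ("alice", "commit", date(2022, 3, 15)),  # same person, same month: counted once
    ("bob", "commit", date(2022, 3, 9)),
    ("bob", "issue_opened", date(2022, 4, 2)),
]
counts = contributors_per_interval(events)
```

Each `(action, interval)` key would then map to one bar in the chart.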
Callback error updating contributions.data
@sgoggins this is on the query you made, any thoughts? We might have to move to a materialized view.
Various opportunities for users to save instance state relating to their workflow have become evident. Therefore, we ought to see how we can integrate user persistence where possible and explore best practices for doing this with Dash.
We could integrate it into a Javascript / React-based webapp:
https://dash.plotly.com/integrating-dash
Or we could try to embed the Dash app inside of a Flask app:
https://hackersandslackers.com/plotly-dash-with-flask/
Or we could ultimately move to a flask / plotly stack rather than relying on Dash:
https://towardsdatascience.com/web-visualization-with-plotly-and-flask-3660abf9c946
Interestingly, the last option seems to be the most extensible in this respect, but it would obviously take us out of the nice sandbox that Dash allows us to play in.
More to come.
No issues with using SQLAlchemy; I just want to make sure, this early in the dev process, that we research other options a little to confirm this is the correct path.
With the transition into the visualization creation phase of this project, the goal of explorer is to provide a deeper perspective than other available tooling. The CHAOSS metrics working group has spent time collaborating and working through the theoretical side of this, the "what to measure and why" side, with diverse perspectives in the open source landscape. More details to come on the breakdown of tasks.
Visualization to be created (more to be added):
Please describe the background and context for this new visualization
The Bus Factor is a compelling metric because it visualizes the question "how many contributors can we lose before a project stalls?" by hypothetically having these people get run over by a bus (more pleasantly, how many would have to win the lottery and decide to move on).
The Bus Factor is the smallest number of people that make 50% of contributions.
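The definition above can be sketched as a small function. This is an illustration under assumptions: contributions are given as a per-contributor count, and "make 50%" is read as "reach at least 50% of the total". The function name and input shape are hypothetical.

```python
def bus_factor(contributions, threshold=0.5):
    """Smallest number of contributors whose combined contributions
    reach `threshold` (default 50%) of the total."""
    total = sum(contributions.values())
    if total == 0:
        return 0
    running = 0
    # Take the largest contributors first so the count is minimal.
    ranked = sorted(contributions.values(), reverse=True)
    for n, count in enumerate(ranked, start=1):
        running += count
        if running >= threshold * total:
            return n
    return len(contributions)


commits = {"alice": 50, "bob": 30, "carol": 15, "dave": 5}
print(bus_factor(commits))
```

In this made-up example one contributor alone accounts for half of all commits, so the project's bus factor is 1.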
Describe the perspective you'd like the final visual to give
https://chaoss.community/metric-bus-factor/
Describe the acceptance criteria for the issue and visualization to be complete
When the issue is taken on, the people working on it should edit this description to list the steps necessary to complete this as a visualization in the explorer dashboard.
The Dropdown contributes ~12 seconds of wait time for the user because it's doing background work, likely loading data into memory. We would like to speed this up, and one option is to make a dynamic callback.
The above issue explores this option.
For this issue, our definition of done is:
TODOs to come as more is known
Given the input from the search bar, generate the query and any additional code needed to output the repo_ids required by the explorer pages.
input could be:
-one or many repos
-one or many orgs
-combination of both
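A rough sketch of how the three input cases could be turned into one parameterized query. Everything specific here is an assumption: the "org/repo" versus bare-org convention, and the table and column names (`repos`, `repo_name`, `org_name`), which are placeholders to be checked against the actual Augur schema.

```python
def split_search_input(selections):
    """Split search-bar selections into repos and orgs.

    Assumption for illustration: a repo is written "org/repo" and an
    org is a bare name with no slash.
    """
    repos = [s.strip() for s in selections if "/" in s]
    orgs = [s.strip() for s in selections if "/" not in s]
    return repos, orgs


def build_repo_id_query(selections):
    """Return (sql, params) selecting repo_ids for the given selections.

    Table and column names are hypothetical placeholders.
    """
    repos, orgs = split_search_input(selections)
    clauses, params = [], {}
    if repos:
        names = [f"r{i}" for i in range(len(repos))]
        params.update(zip(names, repos))
        clauses.append("repo_name IN (" + ", ".join(f":{n}" for n in names) + ")")
    if orgs:
        names = [f"o{i}" for i in range(len(orgs))]
        params.update(zip(names, orgs))
        clauses.append("org_name IN (" + ", ".join(f":{n}" for n in names) + ")")
    if not clauses:
        raise ValueError("at least one repo or org is required")
    sql = "SELECT repo_id FROM repos WHERE " + " OR ".join(clauses)
    return sql, params


sql, params = build_repo_id_query(["chaoss/augur", "oss-aspen"])
```

The values travel as bind parameters rather than being interpolated into the SQL string, which keeps user-typed input out of the query text. The `:name` placeholder style matches what SQLAlchemy's `text()` accepts.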
Once the search bar is created on opening page, is there a way to
Through different conversations with @harishpillay and @sgoggins, the following UX suggestions will be implemented:
Given the loaded meaning behind the term "drive-by", we should probably change the term to something like "infrequent" or "irregular".
To support faster integration of visualizations from sandiego-rh/sandiego into sandiego-rh/explorer, it is necessary to flesh out a more organized backend.
Connecting to the Augur DB, setting the dbschema, and getting a repo's id from its name, among other operations, are common to all visualization scripts and ought to be presented to the user as a simple interface.
Our app can take upward of 12 seconds to become usable upon initialization, and this apparently isn't linked to getting or setting data.
I'm going to profile the callbacks to determine the origin of this bottleneck.
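One way such profiling could be done with the standard library's `cProfile`, shown as a sketch. `slow_callback` is a stand-in and not actual 8Knot code; in practice the body of a real callback would be passed in instead.

```python
import cProfile
import io
import pstats
import time


def slow_callback():
    """Stand-in for a Dash callback body; hypothetical, not 8Knot code."""
    time.sleep(0.05)
    return sum(range(10_000))


def profile_call(func, *args, **kwargs):
    """Run func under cProfile and return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time and keep the ten most expensive entries.
    stats.sort_stats("cumulative").print_stats(10)
    return result, buffer.getvalue()


result, report = profile_call(slow_callback)
print(report)
```

Sorting by cumulative time puts the functions that dominate the wait at the top of the report, which is usually the quickest way to see where the 12 seconds go.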
EDIT:
How will a dynamic callback for the dropdown affect this problem?
For new users coming onto the page, context on each visualization would be helpful. The format is TBD by the contributor who works on this, but the following should be done:
Plotly gives users the ability to change the view, which matters when a user has edited a table's view and wants to see a different subset. Determine how to surface this functionality. It may already exist in Plotly and just need to be turned on.
TODO:
The seasonality of a community pertains to the different time-based cycles in community activity. As described by @GregSutcliffe:
"The trend is useful to maintainers and contributors alike - the former will want to know how the project is faring, the latter will want to know if the project is alive and worth contributing to"
Criteria for acceptance (to be updated):
When working on the Dash app locally, the two graphs are on the same row, but in the OpenShift deployment they are on separate rows.
Refer to the miro board design for reference
The work pipeline for non visualization data points is TBA
Non-Visualizations Required for Completion (shorthand, refer to miro board for more details):
As we are starting the development process with Dash, can we find someone with significant Dash experience to do some code review? Especially as @JamesKunstle and I are still getting acquainted with Dash, it would be beneficial to have some experienced eyes on our work.
When storing a large org, Dash gives the following error: "Failed to execute 'setItem' on 'Storage': Setting the value of 'commits-data' exceeded the quota."
Even so, the graph still updates with new data and performs as expected. My only concern is whether the graph is showing just a subset of the data. More investigation to come.
We need to investigate what the data transfer limitations are and how we need to format queries to work within them.
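As a starting point for that investigation, the serialized size of a payload can be estimated before it is handed to browser storage. This is only a sketch: the quota constant is an assumed placeholder (real limits vary by browser), and browsers may store strings in a different encoding than UTF-8, so the numbers are approximate.

```python
import json

# Placeholder value for illustration; real limits vary by browser.
ASSUMED_QUOTA_BYTES = 5 * 1024 * 1024


def payload_size_bytes(records):
    """Size of the records once serialized to JSON and encoded as UTF-8."""
    return len(json.dumps(records).encode("utf-8"))


def fits_in_quota(records, quota=ASSUMED_QUOTA_BYTES):
    """True if the serialized payload is no larger than the assumed quota."""
    return payload_size_bytes(records) <= quota


small = [{"repo_id": 1, "commits": 42}]
large = [{"repo_id": i, "sha": "0" * 40} for i in range(200_000)]

print(payload_size_bytes(small), fits_in_quota(small))
print(payload_size_bytes(large), fits_in_quota(large))
```

A check like this could help decide when a query result needs to be trimmed, aggregated, or kept server-side instead of being stored in the browser.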
TODO:
dash bootstrap themes
Bus factor-esque graph: show the number/percent of files that have been updated by 1, 2, 3, etc. contributors in x amount of time (input by user)
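A sketch of the counting behind this graph, under assumptions: file changes arrive as `(file_path, contributor_id, date)` tuples, and the user's time input is expressed as a `since` date. Both the shape and the names are hypothetical.

```python
from collections import Counter, defaultdict
from datetime import date


def files_by_contributor_count(changes, since):
    """Map 'number of distinct contributors' to 'number of files'.

    `changes` is an iterable of (file_path, contributor_id, date) tuples
    (an assumed shape); only changes on or after `since` are counted.
    """
    authors = defaultdict(set)
    for path, contributor_id, day in changes:
        if day >= since:
            authors[path].add(contributor_id)
    return dict(Counter(len(ids) for ids in authors.values()))


changes = [
    ("app.py", "alice", date(2022, 5, 1)),
    ("app.py", "bob", date(2022, 5, 3)),
    ("index.py", "alice", date(2022, 5, 4)),
    ("README.md", "carol", date(2021, 1, 1)),  # outside the window
]
histogram = files_by_contributor_count(changes, since=date(2022, 1, 1))
```

Dividing each value by the total number of files in the window would give the percent view instead of the raw count.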
2 Bucket Bus Factor:
Current TODOs:
After getting a little further into the Dash app, the following things need to be done to improve the repo and give new developers a better experience coming in:
TODO:
Notes on the last bullet point: We use a lot of different template code with different formatting strategies (class_name, style, etc). Before things build up too much, let's make this all consistent so that we avoid odd formatting issues as we build and have items to use as templates.
There may be some value in comparing the speed and engagement in responding to and merging PRs between the top (10ish?) contributors and the median/mean contributors. This visualization is not fully fleshed out; the purpose is to explore some different ways of looking at this, to determine whether it is useful and in which form.
Acceptance criteria:
Refer to the miro board design for reference
The visualizations will be generated via Jupyter notebooks in the sandiego repo. An issue is to be created for each integration once the corresponding notebook is created.
Visualizations Required for Completion (shorthand, refer to miro board for more details):
This will be the initial step toward making the search bar accessible across pages.
TODO:
The overall view of a Dash app has 12 units to work with. When we have a layered page view, we have 9 units to work with. When coding the pages (not the index page), do we have 12 to work with or 9? Will figure out better wording to explain this question.
We want our users to be able to change their search bar inputs on any page, with the change triggering updates across pages. More details to come.
Page layouts of non-main pages don't have access to the app object's callbacks because they're 'lower' in the directory hierarchy than app.py.
The repository needs to be reorganized to fix this, and the best way to do this is likely with URL routing via the index.py invocation page.
Right now we install our Dash application requirements from a requirements.txt file.
We ought to have closer control of which versions of modules we use when necessary to ensure that our production state matches our development state as nearly as possible, avoiding "gotchas" arising from unvetted library incompatibilities that appear in prod but not in dev.
We can do this by converting from the typical python "pip install -r requirements.txt" workflow to a pipenv workflow.
@cdolfi will be going through the following videos and documenting notes and reflections here:
Repo Link: https://github.com/Coding-with-Adam/Dash-by-Plotly (all credit goes to @Coding-with-Adam for content)
Discussion needed on whether we should change our page structure to be consistent with the Dash documentation. Include conversation around using validation_layout.
See resources:
https://community.plotly.com/t/introducing-dash-pages-a-dash-2-x-feature-preview/57775
https://dash.plotly.com/urls
https://www.youtube.com/watch?v=RMBSQ6leonU
https://www.youtube.com/watch?v=sxGO1FAeQwU
Please describe the background and context for this new visualization
Determine how many files are covered by licenses and number of files covered by each license
Describe the perspective you'd like the final visual to give
https://chaoss.community/metric-license-coverage/
https://chaoss.community/metric-license-declared/
Describe the acceptance criteria for the issue and visualization to be complete
Not to be done in a notebook. Either used to test a visualization tooling option or completed when the tool is established
REMINDER:
Before a visualization issue can be closed, there must be clear documentation in the notebook of the decisions made at each step and the "why." Also, any ML ideas generated from this process should be turned into an issue with the ml request tag.
Updates to come with more details. As a starting point as we explore anonymity, we will start with a graph plotting the number of contributors over time with a variable "love score". This will allow for seeing active contributors based on multiple different inputs scaled over time (issues, PRs, comments, etc).
Inspiration: https://github.com/orbit-love/orbit-model
Currently, a baby version of the Dash Overview page is supposed to be deployed to OpenShift, with the help of Misc, but it's broken. Next steps are to fix this deployment and learn how to deploy on OpenShift in the future.
The connection to Augur sometimes drops and causes a callback exception. We're working on minimizing this problem with the pessimistic connection-checking approach detailed in the SQLAlchemy documentation, using 'pool_pre_ping'.
Currently the queries are making the page load slowly. We need to add a loading bar to inform users that the app is working and just loading. Add a loading view for the visualizations as well.
https://www.youtube.com/watch?v=t1bKNj021do&list=PLh3I780jNsiS3xlk-eLU2dpW3U-wCq4LW&index=6