mozilla / glean-dictionary
Public-facing dictionary of Glean (and Glean-derived) metadata
Home Page: https://dictionary.telemetry.mozilla.org
License: Mozilla Public License 2.0
Storybook snapshot tests in CI (where we verify that stories still render after a pull request) can often catch problems, e.g. it was helpful in the iodide project (see iodide-project/iodide#2506). I'm not sure how to set this up in Svelte but I'm guessing it should be possible. You can see the aforementioned PR for iodide for some ideas.
While reading the proposal for glean-dictionary I came across this, which proposes that the Metrics section on the ping page should be filterable.
Currently, if we navigate to this page:
http://localhost:5000/#!/apps/fenix
it shows a long list of metrics for the fenix app, which demonstrates that the Metrics section is not yet filterable.
I think it would be great to add the FilterInput.svelte component to the Metrics section to make it filterable.
As per the Glean dictionary proposal:
The application page should have a link to the application's source repository, which is currently missing.
Where to add:
How to add:
We might need a pre-existing database with this information
OR
we would have to store it in JSON format, which would need maintenance whenever an application is added.
Code to change: src/pages/AppDetail.svelte
This issue is intended as an onboarding task for potential outreachy applicants. Please do not work on it unless you have completed the initial qualification task and it has been assigned to you.
Despite having a bunch of metadata in them, we show almost nothing about the metrics on the metric page.
For a start, let's show everything the earlier glean dictionary prototype did:
https://glean-dictionary.netlify.app/?metric=media_state_play
The code that needs to be modified is here:
https://github.com/mozilla/glean-dictionary/blob/main/src/pages/MetricDetail.svelte
All the information we want to display should already be extracted. If you have the server running locally, go to e.g.:
http://localhost:5000/data/mach/metrics/mach.system.memory.json
This is the dataset corresponding to:
http://localhost:5000/#!/apps/mach/metrics/mach.system.memory
You can skip the "help" parts of this dialog for now. We'll tackle that in a separate issue. Also, don't worry about styling the component too much-- can just use a table for now (like we do in the table view already: http://localhost:5000/#!/apps/mach/tables/usage)
Currently there's only an example test. It would be a good idea to add some testing maybe with something like svelte-testing-library along with Jest. This way we can familiarise ourselves with the codebase through writing some tests.
It would be super handy to provide searchfox search URLs that search the code base for uses of a given metric. It should be enough to search for category_name.metric_name as a string.
Depends on https://bugzilla.mozilla.org/show_bug.cgi?id=1668547
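As a rough illustration of the URL construction described above, here is a minimal Python sketch. It assumes the metric lives in mozilla-central and that Searchfox's `search?q=` endpoint is the right target; the function name and the base URL constant are hypothetical and not part of the Glean Dictionary code.

```python
from urllib.parse import urlencode

# Assumption: the metric is used in mozilla-central. Other repositories
# indexed by Searchfox would need a different base URL.
SEARCHFOX_BASE = "https://searchfox.org/mozilla-central/search"


def searchfox_url(category_name: str, metric_name: str) -> str:
    """Build a Searchfox URL that searches for `category_name.metric_name`."""
    query = f"{category_name}.{metric_name}"
    return f"{SEARCHFOX_BASE}?{urlencode({'q': query})}"


if __name__ == "__main__":
    print(searchfox_url("browser.search", "ad_clicks"))
```

The same string formatting could be done directly in the Svelte component that renders the metric page.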
This bug requires specialized knowledge and access to Mozilla's internal systems, so is not a good issue for contributors
The Glean Dictionary currently has some ETL code which you need to run ad hoc to create a bunch of static data assets in https://github.com/mozilla/glean-dictionary/blob/12ff6ac8f603f1245f6c6dcb9e4be9b85e28b135/scripts/build-glean-metadata.py. We may eventually move to storing this in an Elasticsearch cluster (see discussion here: https://docs.google.com/document/d/1OkTWA3rsSJ0m5g9GDnxXVUMkJP-xJMQk_bDgDq-Z9xM/edit#heading=h.tn5dtaq0zat6) but this seems like the easiest approach for now. While we're in this phase, we should schedule this ETL code to run after a schema deploy and upload its output to the bucket we're using with protosaur (#60). Need to talk to someone from dataops when we're ready to do this.
On the ping page, if there are multiple emails in the Notification Emails section, they are shown one after another without any gap, which makes them unreadable.
Where to change:
This issue is for tracking purposes only, it is not meant to be assigned
Tracking issue for the ping page feature of the MVP Glean dictionary.
According to the proposal, there should be a filterable list of pings associated with the application.
Currently, no filter has been added for pings on our application page. The page looks like this:
Where to change: https://github.com/mozilla/glean-dictionary/blob/main/src/pages/AppDetail.svelte
Or maybe we should show both? (in the details view perhaps)? I find categories helpful in understanding the organization of metrics, so it would be nice to be able to more easily pick them out.
Opening an external link in a new tab allows one to explore the other site as much as they want without having to hit the back button. It helps keep focus in one place without losing the other website's information.
Currently, the metrics page opens external links in a new tab. The pings page and tables page need to be updated.
One more thing to note: we should not open internal links in a new tab because it might confuse users. The Glean dictionary already lets users navigate back or elsewhere as needed, so keeping them in the same tab helps them understand the navigation flow better.
In the BigQuery table view we currently allow the user to search through the column names to find the ones of interest:
http://localhost:5000/#!/apps/mozphab/tables/usage
It would be very handy if we could persist that search in the URL (and restore it when they visit it), so that people could link to specific views and have the search prepopulated. In the above example, that would be:
http://localhost:5000/#!/apps/mozphab/tables/usage?search=build
To perform this task, have a look at the documentation for page.js, which is what we currently use for routing: https://visionmedia.github.io/page.js/
You will probably need to modify the main router (App.svelte) in addition to the component for the table view.
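To make the intended round trip concrete, here is a small Python sketch of reading and writing a `search` parameter inside a hashbang URL. It only illustrates the string handling; the real change would use page.js and Svelte, and both function names are hypothetical.

```python
from urllib.parse import parse_qs, urlencode


def get_search_term(url: str) -> str:
    """Return the `search` parameter from the hashbang part of a URL, or ''."""
    _, _, fragment = url.partition("#!")
    _, _, query = fragment.partition("?")
    return parse_qs(query).get("search", [""])[0]


def with_search_term(url: str, term: str) -> str:
    """Return `url` with its hashbang query replaced by `?search=<term>`."""
    prefix, sep, fragment = url.partition("#!")
    path, _, _ = fragment.partition("?")
    if not term:
        # An empty search box should leave a clean URL behind.
        return f"{prefix}{sep}{path}"
    return f"{prefix}{sep}{path}?{urlencode({'search': term})}"


if __name__ == "__main__":
    base = "http://localhost:5000/#!/apps/mozphab/tables/usage"
    linked = with_search_term(base, "build")
    print(linked, "->", get_search_term(linked))
```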
This issue is for tracking purposes only, it is not meant to be assigned
Tracking issue for the searchbar feature of the MVP Glean dictionary.
We don't have a ton of Glean application metadata currently, but we have more than we're currently displaying (name + description).
In particular, we should include the following:
application id and source code URL should only be displayed in the application detail screen. For now, just display them as a table, as we do for the BigQuery table view e.g. http://localhost:5000/#!/apps/fenix-nightly/tables/activation
"deprecated" should be displayed as a pill (in both the application list and application detail screen). Create a new svelte component using tailwind with a pleasing style. You can see some documentation on how to create a pill here:
https://tailwindcss.com/docs/border-radius#pills-and-circles
In mozilla/probe-scraper#244 we added a flag that signals when an app is still in the prototype stage.
We should have the dictionary show that in the UI somehow.
This bug should be handled in two stages:
This issue is intended as an onboarding task for potential outreachy applicants. Please do not work on it unless you have completed the initial qualification task and it has been assigned to you.
Currently the filter search in the table view (e.g. http://localhost:5000/#!/apps/mach/tables/usage) only filters for content in the last "node" in the structure. e.g. if you have
client_info.app_version
client.info.app_display_version
client_info.client_id
...
and you search for "client", it will filter out all of the above except for client_info.client_id. Ideally the search would include all entries whose parent elements have the term in them (in the example above, this would be everything that includes client).
To fix this, you'll need to edit the schemaviewer component:
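The difference between the current and the desired behaviour can be sketched in Python as follows. This assumes columns are available as flat dotted names; the actual schema viewer works on a nested structure, so the function names and the flat representation here are illustrative only.

```python
def matches_last_node(name: str, term: str) -> bool:
    """Current behaviour: only the last dotted segment is searched."""
    return term.lower() in name.split(".")[-1].lower()


def matches_any_node(name: str, term: str) -> bool:
    """Desired behaviour: a match in any segment (parents included) counts."""
    needle = term.lower()
    return any(needle in node.lower() for node in name.split("."))


if __name__ == "__main__":
    columns = [
        "client_info.app_version",
        "client.info.app_display_version",
        "client_info.client_id",
    ]
    print([c for c in columns if matches_last_node(c, "client")])
    print([c for c in columns if matches_any_node(c, "client")])
```

With the three example columns, the first filter keeps only `client_info.client_id`, while the second keeps all of them.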
If I click on the reference_browser app, it takes me to the AppDetail.svelte page.
The reference_browser app doesn't have any data available for pings:
http://localhost:5000/data/reference_browser/pings/
As in this screenshot of the page:
It would be nice to show some meaningful information here.
Navigating to a page for ping information (e.g. http://localhost:5000/#!/apps/mozregression/pings/usage) leads to a broken page with the following error:
Uncaught (in promise) Error: {#each} only iterates over array-like objects.
validate_each_argument index.mjs:1615
create_then_block bundle.js:2938
update index.mjs:1032
handle_promise index.mjs:1065
In Glean, we want to encourage users to remove deprecated labels from their metrics. But it would still be good to document in the dictionary all the labels that have been used on historical data.
This would basically require going through the entire history for a metric and collecting all of the labels used. Bonus points for flagging the ones that no longer exist in the latest revision.
See https://bugzilla.mozilla.org/show_bug.cgi?id=1587430 for additional context.
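One possible shape for the label collection described above is sketched below in Python. The input format is an assumption: a list of metric revisions ordered oldest to newest, each a dict with an optional `labels` list. The real history data may look different, and `collect_labels` is a hypothetical helper.

```python
def collect_labels(history: list) -> list:
    """Collect every label ever used and flag those gone from the latest revision.

    `history` is assumed to be ordered from oldest to newest revision.
    """
    seen = []
    for revision in history:
        for label in revision.get("labels") or []:
            if label not in seen:
                seen.append(label)  # preserve first-seen order
    current = set(history[-1].get("labels") or []) if history else set()
    return [{"label": label, "removed": label not in current} for label in seen]


if __name__ == "__main__":
    history = [{"labels": ["a", "b"]}, {"labels": ["b", "c"]}]
    for entry in collect_labels(history):
        print(entry)
```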
Issue description
For glean.page.path, the bugs and data reviews hyperlinks lead to an invalid Bugzilla page.
Steps to reproduce the issue:
Go to http://localhost:5000/data/glean-js/metrics/glean.page.path.json and check the JSON key-value pairs for bugs and data_reviews:
"bugs": ["https://bugzilla.mozilla.org/show_bug.cgi?id=actually-we-dont-have-this"], "data_reviews": ["https://bugzilla.mozilla.org/show_bug.cgi?id=actually-we-dont-have-this"],
What can be done here:
Solving this issue might also require spec'ing what the UI should say when such bugs/links are found.
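One way to detect such placeholder links is to check that a Bugzilla `show_bug.cgi` URL carries a numeric `id`. The Python sketch below assumes only Bugzilla links need this check (GitHub or other links would need separate handling), and `has_numeric_bug_id` is a hypothetical helper name.

```python
from urllib.parse import parse_qs, urlparse


def has_numeric_bug_id(url: str) -> bool:
    """Return True if `url` is a bugzilla.mozilla.org show_bug link with a numeric id."""
    parsed = urlparse(url)
    if parsed.netloc != "bugzilla.mozilla.org" or parsed.path != "/show_bug.cgi":
        return False
    bug_ids = parse_qs(parsed.query).get("id", [])
    return len(bug_ids) == 1 and bug_ids[0].isascii() and bug_ids[0].isdigit()


if __name__ == "__main__":
    for link in [
        "https://bugzilla.mozilla.org/show_bug.cgi?id=actually-we-dont-have-this",
        "https://bugzilla.mozilla.org/show_bug.cgi?id=1587430",
    ]:
        print(link, has_numeric_bug_id(link))
```

Links that fail the check could be rendered as plain text or hidden, depending on what the UI spec decides.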
This issue is for tracking purposes only, it is not meant to be assigned
Tracking issue for the BigQuery table page feature of the MVP Glean dictionary.
cc @spasovski
I know this is a pre-alpha, but I thought to share my feedback on this anyway :-) There's a few small nits that I believe would make this page a bit more digestible (see the relative colored numbers on the image):
relevant bugs [1](link to the first bug/GH issue), [2](...).
lifetime is usually confusing.
Timing_distribution: this should probably drop the _ and also link to the proper glean docs (e.g. https://mozilla.github.io/glean/book/user/metrics/timing_distribution.html) - note that the name of the metric type is also the same name of the documentation for that metric type. This is on purpose, so that you can do https://mozilla.github.io/glean/book/user/metrics/{metric_type_name}.html
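The URL pattern above lends itself to a tiny helper. The Python sketch below builds the documentation link and a display label without the underscore; both function names are hypothetical, and the capitalisation of the label is just one possible choice.

```python
GLEAN_DOCS_PATTERN = (
    "https://mozilla.github.io/glean/book/user/metrics/{metric_type_name}.html"
)


def metric_type_docs_url(metric_type: str) -> str:
    """Return the Glean book URL for a metric type such as 'timing_distribution'."""
    return GLEAN_DOCS_PATTERN.format(metric_type_name=metric_type)


def metric_type_label(metric_type: str) -> str:
    """Return a human-readable label with underscores replaced by spaces."""
    return metric_type.replace("_", " ").capitalize()


if __name__ == "__main__":
    print(metric_type_label("timing_distribution"))
    print(metric_type_docs_url("timing_distribution"))
```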
For pages like http://localhost:5000/#!/apps/fenix, navigation becomes difficult when there is a lot of data.
Apart from filtering, as mentioned in another issue, we could also add Next and Previous buttons that show items 1 to 100 (or some other page limit) at a time.
This would be useful when the user does not have an exact search term to filter on.
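The paging logic behind Next and Previous buttons could look roughly like this Python sketch. The page size of 100 comes from the suggestion above; the function names and 1-based page numbering are assumptions.

```python
def page_count(items: list, page_size: int = 100) -> int:
    """Number of pages needed to show `items` (at least 1, even when empty)."""
    return max(1, -(-len(items) // page_size))  # ceiling division


def paginate(items: list, page: int, page_size: int = 100) -> list:
    """Return the slice of `items` shown on 1-based page number `page`."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    start = (page - 1) * page_size
    return items[start:start + page_size]


if __name__ == "__main__":
    metrics = [f"metric_{i}" for i in range(250)]
    print(page_count(metrics), len(paginate(metrics, 3)))
```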
For example on https://glean-dictionary-dev.netlify.app/#!/apps/fenix we see this on the second page:
This is clearly incorrect, which you can see if you navigate to the metric detail:
https://glean-dictionary-dev.netlify.app/#!/apps/fenix/metrics/browser.search.ad_clicks
It's unclear to me right now whether this is a bug in the ETL or the display, but it's hopefully not too hard to fix.
Currently we just silently fail if the user navigates to an entity that does not exist. e.g.:
http://localhost:5000/#!/apps/burnham2
http://localhost:5000/#!/apps/fenix-nightly/pings/activation-doesnotexist
It would be better if we displayed some kind of friendly error page saying something like "Could not find application burnham2" or "Could not find ping activation-doesnotexist". I don't expect this to happen frequently, but this sort of thing can happen occasionally (e.g. if an application is added and then withdrawn).
To accomplish this task, you'll want to create a new Svelte component to cover this functionality and update each page to use/display it in the event that fetching information fails.
When we search for something in the filter box and it doesn't match any item, just a blank page stares at us. There should be a message saying that the search doesn't match any application or metric.
GET http://localhost:5000/favicon.png
[HTTP/1.1 404 Not Found 0ms]
This issue is for tracking purposes only, it is not meant to be assigned
Tracking issue for the metric page feature of the MVP Glean dictionary.
Currently the Metric detail page shows only JSON data.
We should add information about:
Where:
Below the description of the metric
More to do:
This issue is intended as an onboarding task for potential outreachy applicants. Please do not work on it unless you have completed the initial qualification task and it has been assigned to you.
GLAM has a footer element with some useful information at the bottom of each page:
We should have something similar for the Glean dictionary.
You can mostly copy over the existing footer from GLAM (putting it into the src/components/ directory) and then put it into each of the pages we display:
Some styling should be adjusted and obviously the links should be different (e.g. the link to a slack channel should instead be a link to our channel on Matrix: https://chat.mozilla.org/#/room/#glean-dictionary:mozilla.org)
I added a link to the BigQuery table view when working on the initial skeleton. With some of the recent changes, it now looks pretty out of place:
It should be an item in the table below (just after "notification email"). I would propose the following structure:
You may need to update the metadata gathering step in scripts/build-glean-metadata.py to fetch the name of the stable table to put in the ping view.
This issue is intended as an onboarding task for potential outreachy applicants. Please do not work on it unless you have completed the initial qualification task and it has been assigned to you.
Currently we're only gathering metrics and ping data from the applications themselves, not those specified by the libraries (e.g. the events ping is part of glean-core: https://github.com/mozilla/glean/blob/2261845761251d91b6968f29846dc3aabbc0cc45/glean-core/pings.yaml#L80)
Mozilla Schema Generator (which does something similar to what we do) enumerates dependencies for each application and gathers data for them e.g. https://github.com/mozilla/mozilla-schema-generator/blob/1264148ff82adc3357de33a25541f1313f93449f/mozilla_schema_generator/glean_ping.py#L126
We should modify our build-glean-metadata.py script (https://github.com/mozilla/glean-dictionary/blob/main/scripts/build-glean-metadata.py) with similar logic.
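The merging step might be sketched as below. Everything about the data shapes is an assumption made for illustration: `app` is taken to be a dict with `pings` and `dependencies` keys, and `library_pings` a mapping from library name to that library's ping definitions. Neither reflects the actual probe-scraper payloads or the Mozilla Schema Generator code.

```python
def pings_for_app(app: dict, library_pings: dict) -> dict:
    """Merge an application's own pings with those defined by its dependencies.

    Pings defined by the application itself take precedence over library pings
    with the same name.
    """
    pings = dict(app.get("pings", {}))
    for dependency in app.get("dependencies", []):
        for name, definition in library_pings.get(dependency, {}).items():
            pings.setdefault(name, definition)
    return pings


if __name__ == "__main__":
    # Hypothetical data, for illustration only.
    app = {"pings": {"usage": {}}, "dependencies": ["glean-core"]}
    library_pings = {"glean-core": {"events": {}, "baseline": {}}}
    print(sorted(pings_for_app(app, library_pings)))
```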
Steps to reproduce:
http://localhost:5000/#!/apps/reference-browser/metrics/experiments.metrics.active_experiment
http://localhost:5000/#!/apps/reference-browser/metrics/toolbar.events.url_committed
Here is the screenshot of the error it throws when accessed.
Fetching metrics data fails for some metrics of the fenix app.
The issue arises only when the metric name has the form metrics.any_metric.
Navigate to http://localhost:5000/#!/apps/fenix
Search for a metric that starts with metrics.default_browser. Use this link: http://localhost:5000/#!/apps/fenix/metrics/metrics.default_browser
Check the browser's console; it throws this error:
Uncaught (in promise) TypeError: Failed to fetch
This issue is for tracking purposes only, it is not meant to be assigned
Tracking issue for the landing page (aka "application browser") feature of the MVP Glean dictionary.
This is an off-the-wall idea that came up in the data science team meeting today: it would be super helpful if data scientists had a place to leave comments on probes to discuss their behaviour and share warnings for future travellers!
Maybe the answer is "just use Bugzilla," or maybe there's another place, but this is a place that many data scientists look and so it seems like it could profitably live here.
Moderation or authentication is an obvious concern; possibly this could link out to Discourse threads or some other already-moderated Mozilla space, but it would be great if we could see whether there's a comment available, and ideally what it is.
This issue is intended as an onboarding task for potential outreachy applicants. Please do not work on it unless you have completed the initial qualification task and it has been assigned to you.
If you look at the bigquery table view, you'll see that there's a filter box that lets you easily search for the subset of columns that you're interested in:
We should have a similar widget on the main page to filter through the list of applications where e.g. putting "firefox" into the box will only show applications with Firefox in the name.
This issue is intended as an onboarding task for potential outreachy applicants. Please do not work on it unless you have completed the initial qualification task and it has been assigned to you.
We show only a tiny subset of the data related to the ping on the ping page right now. You can see an example of markdown documentation which more fully represents the metadata we have here:
https://github.com/mozilla/mozregression/blob/master/docs/glean/metrics.md#pings
The code that needs to be modified is here:
https://github.com/mozilla/glean-dictionary/blob/main/src/pages/PingDetail.svelte
All the information we want to display should already be extracted. If you have the server running locally, go to e.g.:
http://localhost:5000/data/mozregression/pings/usage.json
This is the dataset corresponding to:
http://localhost:5000/#!/apps/mozregression/pings/usage
Don't worry about styling the component too much-- can just use a table for now (like we do in the table view already: http://localhost:5000/#!/apps/mach/tables/usage)
Currently we're not rendering markdown in Glean description fields, which doesn't look great:
http://localhost:5000/#!/apps/mozregression
We should render these types of fields with markdown. For the schema dictionary I used the marked parser, which seems to work pretty well: https://www.npmjs.com/package/marked
It seems most likely that we'll define a svelte component for rendering this type of information, in which case we should add a story for it in: https://github.com/mozilla/glean-dictionary/tree/main/stories
This issue is intended as an onboarding task for potential outreachy applicants. Please do not work on it unless you have completed the initial qualification task and it has been assigned to you.
We have a single story for the schema viewer, but it doesn't look great:
This is because the Tailwind css components aren't being imported directly into the story. It should be possible to fix this with some configuration changes. This repository may have some hints on how to configure things (my suggestion would be to look at the postcss configuration):
This issue is for tracking purposes only, it is not meant to be assigned
Tracking issue for the application page feature of the MVP Glean dictionary.
Right now the notification email for mozphab is not in the right format:
This is due to an error in the source data https://probeinfo.telemetry.mozilla.org/glean/repositories.
We could update our email in the static data file to fix this, however I wonder if this could be better addressed by fixing it on a higher level.
We currently display email addresses on the ping page:
http://localhost:5000/#!/apps/mozregression/pings/usage
The repository metadata also has this information, however, and it would be good to display it on the application page (http://localhost:5000/#!/apps/mozregression).
As part of this implementation, let's create a component for rendering this type of information (maybe EmailAddresses?) nicely and create a story for it. One nicety we could add would be creating a mailto: URL for the email address, to make it a little easier to send a mail to the relevant party.
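The data preparation for such a component could look like the Python sketch below: each address is trimmed and paired with a `mailto:` href, so the component can render the entries with visible separation. The function name and the output shape are hypothetical.

```python
from urllib.parse import quote


def mailto_links(emails: list) -> list:
    """Return one {'address', 'href'} entry per non-empty email address."""
    links = []
    for email in emails:
        address = email.strip()
        if not address:
            continue  # skip blank entries in the source data
        links.append({"address": address, "href": "mailto:" + quote(address, safe="@")})
    return links


if __name__ == "__main__":
    for link in mailto_links(["first@example.com", " second@example.com "]):
        print(link["address"], "->", link["href"])
```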
Many of the probes in the existing probe dictionary are expired or deprecated (and they don't always have build end dates). This causes confusion, since it might not be clear that no data will be available in those probes for currently-released products. While the ability to access historical probes should always exist, we should optimize for what is presumably the common case of looking at new data flowing in.
This might include:
This issue is intended as an onboarding task for potential outreachy applicants. Please do not work on it unless you have completed the initial qualification task and it has been assigned to you.
Currently we list all Glean applications in the UI by default on the home page (https://github.com/mozilla/glean-dictionary/blob/12ff6ac8f603f1245f6c6dcb9e4be9b85e28b135/src/pages/AppList.svelte), regardless of deprecation status. Instead we should have a checkbox that allows you to show/hide the applications that are deprecated. Something like this:
You can use the deprecated property in the apps json file to accomplish this task. Assuming you have the application running, have a look at this JSON payload:
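The checkbox logic amounts to a simple filter on that property. The Python sketch below assumes each application entry is a dict with an optional boolean `deprecated` key; the other field names in the example data are made up, and in the UI this would live in the AppList.svelte component.

```python
def visible_apps(apps: list, show_deprecated: bool = False) -> list:
    """Return the apps to list, hiding deprecated ones unless requested."""
    return [
        app for app in apps
        if show_deprecated or not app.get("deprecated", False)
    ]


if __name__ == "__main__":
    # Hypothetical entries, for illustration only.
    apps = [
        {"name": "fenix"},
        {"name": "old-app", "deprecated": True},
    ]
    print([app["name"] for app in visible_apps(apps)])
    print([app["name"] for app in visible_apps(apps, show_deprecated=True)])
```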
This issue is intended as an initial onboarding task for potential outreachy applicants. Please do not work on it unless you have completed the initial qualification task and it has been assigned to you.
We only have a minimal amount of python code so far, but we will accumulate more over time. To keep the quality level up, we should enable linting with flake8, black, and isort.
This is a somewhat non-trivial issue, as it will involve:
Some prior art which might be helpful is the mozregression repository where I recently added some linter code:
Note however that it uses travis rather than circleci. You'll need to do some extra research to get this going with CircleCI.
This bug requires specialized knowledge and access to Mozilla's internal systems, so is not a good issue for contributors
We should deploy a copy of the Glean dictionary to protosaur.dev on a regular basis. protosaur currently requires auth (making it inaccessible to those outside Mozilla), but that should be fixed soon by mozilla/protodash#16
According to the proposal, the ping page should have a backlink to the application that produced the ping.
Where to change: https://github.com/mozilla/glean-dictionary/blob/main/src/pages/PingDetail.svelte