Comments (9)
Since we're taking the time to fix this, wouldn't Cloud Logging be a better substitute for BQ, GCS, and Firebase?
from weather-tools.
I'm less familiar with Cloud Logging. How would it suit our needs here, which are primarily to have a central record of our download status?
from weather-tools.
Cloud Logging is the new name for Stackdriver, and it is where Dataflow pipeline logs show up too. Since it's very unlikely that users are interested in any complex analytics on the manifest logs, BQ is not a great option. Stackdriver is cheaper, easier to use, and the permissions are less complex to manage.
I've created and tested a tiny Stackdriver manifest implementation here. The logs will show up in Cloud Console like this:
from weather-tools.
Whatever the service or implementation, I ideally want to be able to answer the following questions quickly (say, with a local script or even a dashboard):
- What data is currently in progress? What's queued for download?
- What data have we already ingested?
- Approximately, how long will it take to finish this job?
Do you think that Cloud Logging would help us answer these questions? The Manifest is not really a logging system, but rather a database that we intend to query to answer these questions.
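To make the three questions concrete, here is a rough sketch of how they could be answered from manifest records once they are queryable. The `ManifestEntry` shape, the status names, and the `workers` parameter are assumptions for illustration, not the actual weather-dl manifest schema, and the time estimate is a naive average that ignores retries and rate limits.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class ManifestEntry:
    # Hypothetical record shape; the real manifest schema may differ.
    location: str
    status: str  # assumed values: "scheduled", "in-progress", "success", "failure"
    duration_s: Optional[float] = None  # download time in seconds, once finished


def summarize(entries, workers=1):
    """Count entries per status and roughly estimate the remaining time."""
    counts = Counter(e.status for e in entries)
    finished = [e.duration_s for e in entries
                if e.status == "success" and e.duration_s is not None]
    remaining = counts["scheduled"] + counts["in-progress"]
    eta_s = None
    if finished:
        # Naive estimate: mean past duration times remaining items,
        # divided across the assumed number of parallel workers.
        mean_s = sum(finished) / len(finished)
        eta_s = mean_s * remaining / max(workers, 1)
    return counts, eta_s


entries = [
    ManifestEntry("gs://bucket/a.nc", "success", 100.0),
    ManifestEntry("gs://bucket/b.nc", "success", 200.0),
    ManifestEntry("gs://bucket/c.nc", "in-progress"),
    ManifestEntry("gs://bucket/d.nc", "scheduled"),
    ManifestEntry("gs://bucket/e.nc", "scheduled"),
]
counts, eta_s = summarize(entries, workers=2)
print(dict(counts), eta_s)
```

Whichever backend stores the manifest, the point is that it needs to return these counts and durations cheaply, which is a query workload rather than a logging one.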
from weather-tools.
To put this feature in terms of a problem: Right now, our default implementation for the manifest is Firestore. We have a CLI script we include with weather-dl that checks what the current status is:
https://github.com/google/weather-tools/blob/main/weather_dl/download-status
Unfortunately, due to our database choice and current number of records, this script is not performant and hits API limits right away. The inspiration for this issue was:
- it would make much more sense if we used a traditional RDBMS instead of a key-value store
- we'd prefer to use BigQuery rather than other DB implementations to start, since our project is pretty wed to BQ.
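As a sketch of why a relational store helps here: a status summary becomes a single aggregate query instead of a scan over every record. The example below uses Python's built-in `sqlite3` purely as a local stand-in for BigQuery; the `manifest` table and its columns are hypothetical, and the real schema and client code would differ.

```python
import sqlite3

# In-memory SQLite database as a local stand-in for a BigQuery table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE manifest (location TEXT PRIMARY KEY, status TEXT NOT NULL)"
)

# Hypothetical manifest rows: (target location, download status).
rows = [
    ("gs://bucket/2020-01.nc", "success"),
    ("gs://bucket/2020-02.nc", "success"),
    ("gs://bucket/2020-03.nc", "in-progress"),
    ("gs://bucket/2020-04.nc", "scheduled"),
]
conn.executemany("INSERT INTO manifest VALUES (?, ?)", rows)

# One aggregate query replaces a per-record scan of the manifest.
status_counts = dict(
    conn.execute("SELECT status, COUNT(*) FROM manifest GROUP BY status")
)
print(status_counts)
conn.close()
```

The same `GROUP BY` shape should carry over to BigQuery SQL, with the grouping done server-side rather than in the status script.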
from weather-tools.
How about using custom metrics and a monitoring dashboard?
This requires modifying the existing pipeline code so that it updates our custom metrics; the dashboard would then be able to show everything in one place.
This monitoring can be extended with an alerting mechanism too.
from weather-tools.
I like the idea of using metrics to answer some of these questions. However, it looks like custom metrics only last 30 days, so they wouldn't solve the problem of having a record of what was downloaded.
from weather-tools.
Further, it looks like we could possibly run into quota limits for storing values as metrics, whereas a traditional DB would allow us to have unlimited writes.
from weather-tools.
Closing this issue as it has been addressed by the changes made in PR #295 which has been merged.
from weather-tools.