Comments (7)
Above issue is fixed, we can now target this for v2.
from slo-generator.
@ocervell Just to make sure we are on the same page, are we talking about this MQL?
@lvaylet yes!
@ocervell Taking `samples/stackdriver/slo_gae_app_availability.yaml` as an example, do you confirm that supporting MQL will let us write `backend.measurement.filter_good` and `backend.measurement.filter_valid` like:
```yaml
---
service_name: gae
feature_name: app
slo_description: Availability of App Engine app
slo_name: availability
slo_target: 0.95
backend:
  class: Stackdriver
  method: good_bad_ratio
  project_id: ${STACKDRIVER_HOST_PROJECT_ID}
  measurement:
    filter_good: >
      fetch gae_app
      | metric 'appengine.googleapis.com/http/server/response_count'
      | filter
          resource.project_id == '${GAE_PROJECT_ID}'
          && ( metric.response_code == 429 ||
               metric.response_code == 200 ||
               metric.response_code == 201 ||
               metric.response_code == 202 ||
               metric.response_code == 203 ||
               metric.response_code == 204 ||
               metric.response_code == 205 ||
               metric.response_code == 206 ||
               metric.response_code == 207 ||
               metric.response_code == 208 ||
               metric.response_code == 226 ||
               metric.response_code == 304 )
    filter_valid: >
      fetch gae_app
      | metric 'appengine.googleapis.com/http/server/response_count'
      | filter
          resource.project_id == '${GAE_PROJECT_ID}'
exporters:
  - class: Stackdriver
    project_id: ${STACKDRIVER_HOST_PROJECT_ID}
```
or even replace both fields with a new `ratio` field that leverages the native features of MQL, like:
```yaml
---
service_name: gae
feature_name: app
slo_description: Availability of App Engine app
slo_name: availability
slo_target: 0.95
backend:
  class: Stackdriver
  method: good_bad_ratio
  project_id: ${STACKDRIVER_HOST_PROJECT_ID}
  measurement:
    ratio: >
      fetch gae_app
      | metric 'appengine.googleapis.com/http/server/response_count'
      | filter resource.project_id == '${GAE_PROJECT_ID}'
      | { filter
            ( metric.response_code == 429 ||
              metric.response_code == 200 ||
              metric.response_code == 201 ||
              metric.response_code == 202 ||
              metric.response_code == 203 ||
              metric.response_code == 204 ||
              metric.response_code == 205 ||
              metric.response_code == 206 ||
              metric.response_code == 207 ||
              metric.response_code == 208 ||
              metric.response_code == 226 ||
              metric.response_code == 304 )
          ;
          ident
        }
      | ratio
exporters:
  - class: Stackdriver
    project_id: ${STACKDRIVER_HOST_PROJECT_ID}
```
Extra questions:
- Can I assume that the `good_bad_ratio` method of the `StackdriverBackend` class should be reused and adjusted to perform different operations based on the presence/absence of the YAML fields above? Or shall we come up with a new method with a different return value and/or return type, as `good_bad_ratio` currently returns a tuple with the number of good and bad events?
- Shall we introduce a new flag for the language used in the queries (one of `legacy` or `mql`, with a default value of `legacy` and a deprecation notice for `v3` in favor of MQL)? Or shall we detect the language automatically, on a best-effort basis? I would rather be explicit and go for the extra argument. Automatic detection could be tricky and not as simple as `.startsWith('fetch')`.
What do you think? Did you have something in mind already?
@lvaylet yes for your main question, that's exactly how we should be able to write MQL. The second way to do MQL with `ratio` looks good too. I think we can support both, but the ratio could be another method called `query_sli` (similar to what we need for the Prometheus backend here), and that one will return one value (the SLI) instead of a tuple (good, bad).
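A minimal sketch of the return-type contrast discussed above. The method names `good_bad_ratio` and `query_sli` come from this thread, but the bodies are purely hypothetical: the real implementations would fetch time series from the Cloud Monitoring API, whereas plain lists of numbers stand in for the query results here.

```python
# Hypothetical sketch contrasting the two contracts. Real
# implementations would query Cloud Monitoring; lists of numbers
# stand in for the returned time series.

def good_bad_ratio(good_points, valid_points):
    """Existing contract: return a (good_count, bad_count) tuple."""
    good = sum(good_points)
    valid = sum(valid_points)
    return good, valid - good

def query_sli(ratio_points):
    """Proposed contract for MQL `ratio` queries: return the SLI
    itself as a single float, e.g. the latest ratio value."""
    return ratio_points[-1]
```

For example, `good_bad_ratio([90], [100])` yields the `(90, 10)` tuple the SLO computation currently consumes, while `query_sli([0.93, 0.95])` hands back `0.95` directly.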
For your extra questions:
- Yes, I think we should be using the same `good_bad_ratio` method, even though it can call a different instance method under the hood (like `query_mql` instead of `query`) if we're using MQL.
- I think you're right about auto-detection, there are a lot of edge cases... I think a new flag, `lang=mql` or `lang=mqf` (MQF = Monitoring Query Filters), would be more explicit. If not passed, the flag defaults to `mqf` for now, until we deprecate it. Not sure when Monitoring Query Filters will be deprecated, but currently all our users use this instead of MQL, so we might even target > v4 for deprecation.
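One possible shape for that dispatch, as a sketch: the flag values `mqf`/`mql` and the method names `query`/`query_mql` are taken from this discussion, but the helper itself is hypothetical and not part of the existing backend.

```python
def select_query_method(backend_config):
    """Map the proposed `lang` flag to the backend method to call.

    Defaults to 'mqf' (Monitoring Query Filters) when the flag is
    absent, as discussed above. 'query' and 'query_mql' are the
    instance-method names mentioned in this thread.
    """
    lang = backend_config.get("lang", "mqf")
    methods = {"mqf": "query", "mql": "query_mql"}
    try:
        return methods[lang]
    except KeyError:
        raise ValueError(f"Unsupported query language: {lang!r}")
```

Being explicit like this keeps the error case trivial to report, whereas auto-detection would have to guess from query syntax.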
@ocervell As mentioned in googleapis/python-monitoring#47, support for MQL was added in `google-cloud-monitoring` version 2.2.0.
Unless there is a specific reason to target version 1.x.x of `google-cloud-monitoring` (like supporting Python 2.7), is it OK if I bump the `google-cloud-monitoring` version in `setup.py`? It is currently set to `'google-cloud-monitoring < 2.0.0'` for the `cloud_monitoring` and `cloud_service_monitoring` extras. Then I am not sure whether we should also bump the version of `google-api-python-client`, also set to `'google-api-python-client < 2.0.0'` for both extras?
Again in `setup.py`, we might have to bump the required Python version from `>=3.4` to `>=3.6`, as `google-cloud-monitoring` 2.0.0 requires Python 3.6+ (refer to the 2.0.0 Migration Guide for more details).
Upon investigation, bumping the version of `google-cloud-monitoring` to v2 requires significant updates to the existing code. As mentioned in the 2.0.0 Migration Guide:

> The 2.0 release of the google-cloud-monitoring client is a significant upgrade based on a next-gen code generator, and includes substantial interface changes. Existing code written for earlier versions of this library will likely require updates to use this version.

As a consequence, most of the Cloud Monitoring backend must be rewritten. For example, the static method `get_window(timestamp, window)`:
```python
@staticmethod
def get_window(timestamp, window):
    measurement_window = monitoring_v3.types.TimeInterval()
    measurement_window.end_time.seconds = int(timestamp)
    measurement_window.end_time.nanos = int(
        (timestamp - measurement_window.end_time.seconds) * 10**9)
    measurement_window.start_time.seconds = int(timestamp - window)
    measurement_window.start_time.nanos = measurement_window.end_time.nanos
    LOGGER.debug(pprint.pformat(measurement_window))
    return measurement_window
```
must be rewritten as:
```python
@staticmethod
def get_window(timestamp, window):
    seconds = int(timestamp)
    nanos = int((timestamp - seconds) * 10**9)
    measurement_window = monitoring_v3.TimeInterval({
        "end_time": {"seconds": seconds, "nanos": nanos},
        "start_time": {"seconds": int(seconds - window), "nanos": nanos},
    })
    LOGGER.debug(pprint.pformat(measurement_window))
    return measurement_window
```
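Note that the timestamp decomposition itself is unchanged between the two versions; only the way `TimeInterval` is constructed differs. A library-free sketch of the seconds/nanos split that both snippets rely on:

```python
def split_timestamp(timestamp):
    """Split a float epoch timestamp into (seconds, nanos), as both
    the v1 and v2 versions of get_window do before building the
    TimeInterval."""
    seconds = int(timestamp)
    nanos = int((timestamp - seconds) * 10**9)
    return seconds, nanos
```

For instance, `split_timestamp(1600000000.25)` gives `(1600000000, 250000000)`.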
At the end of the day, supporting MQL is not as simple as instantiating a new `QueryServiceClient` next to the existing `MetricServiceClient` and letting it handle the MQL queries.
In addition to the backend code, we might have to migrate the unit tests too, and maybe write more of them to make sure the whole backend is covered before such a major refactoring.
@ocervell As discussed offline, let's target a minor release like v2.1 if using the new client does not introduce any breaking change for the end users, or v3 in case we need to introduce breaking changes.