
Comments (12)

Pascal-Delange commented on August 28, 2024

Thanks for the answers.

OK, I had a bit of a hard time extracting only the minimal parts of the code, but here you go:

package main

import (
	"context"
	"fmt"
	"log"
	"time"

	"github.com/GoogleCloudPlatform/opentelemetry-operations-go/exporter/trace"
	"github.com/exaring/otelpgx"
	"github.com/jackc/pgx/v5/pgxpool"

	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/propagation"
	"go.opentelemetry.io/otel/sdk/resource"
	sdktrace "go.opentelemetry.io/otel/sdk/trace"
	semconv "go.opentelemetry.io/otel/semconv/v1.4.0"
	oteltrace "go.opentelemetry.io/otel/trace"
)

func main() {
	// Typed context key, to avoid collisions when storing the tracer in the context.
	type ContextKey int

	var ContextKeyOpenTelemetryTracer ContextKey = 0

	connectionString := fmt.Sprintf("host=%s user=%s password=%s database=%s sslmode=disable port=%s",
		"localhost", "postgres", "password", "database", "5432")

	// Exporter that ships spans to Google Cloud Trace.
	exporter, err := trace.New(trace.WithProjectID("projectId"))
	if err != nil {
		panic(err)
	}

	res, err := resource.New(context.Background(),
		resource.WithTelemetrySDK(),
		resource.WithAttributes(
			semconv.ServiceNameKey.String("test app"),
		),
	)
	if err != nil {
		panic(err)
	}

	// Register the tracer provider and the W3C trace-context/baggage propagators globally.
	tp := sdktrace.NewTracerProvider(
		sdktrace.WithBatcher(exporter),
		sdktrace.WithResource(res),
	)
	otel.SetTextMapPropagator(
		propagation.NewCompositeTextMapPropagator(
			propagation.TraceContext{},
			propagation.Baggage{},
		),
	)
	otel.SetTracerProvider(tp)

	tracer := tp.Tracer("test app")

	connectionPool, err := newPostgresConnectionPool(connectionString)
	if err != nil {
		log.Fatal("error creating postgres connection to database", err.Error())
	}

	ctx := context.Background()
	ctx = context.WithValue(ctx, ContextKeyOpenTelemetryTracer, tracer)

	// One span per iteration, each wrapping a trivial query through the instrumented pool.
	for i := range 1000 {
		fmt.Println(i)
		tracer, _ := ctx.Value(ContextKeyOpenTelemetryTracer).(oteltrace.Tracer)
		ctx, span := tracer.Start(ctx, fmt.Sprintf("span %v", i))
		_, err := connectionPool.Exec(ctx, "SELECT 1")
		if err != nil {
			panic(err)
		}
		span.End()
		time.Sleep(2 * time.Second)
	}
}

func newPostgresConnectionPool(connectionString string) (*pgxpool.Pool, error) {
	cfg, err := pgxpool.ParseConfig(connectionString)
	if err != nil {
		return nil, fmt.Errorf("parse connection pool config: %w", err)
	}
	cfg.ConnConfig.Tracer = otelpgx.NewTracer() // instrument pgx queries with OpenTelemetry spans
	cfg.MaxConns = 10
	cfg.MaxConnIdleTime = 5 * time.Minute

	pool, err := pgxpool.NewWithConfig(context.Background(), cfg)
	if err != nil {
		return nil, fmt.Errorf("unable to create connection pool: %w", err)
	}
	return pool, nil
}

(It's perhaps a bit clumsy, but it should illustrate what happens.)

So after running this test, it appears that what is actually happening is that upgrading beyond v0.154.0 makes the trace client trace itself. It does not seem to do so abnormally (it flushes every few seconds and takes ~200-300 ms each time), but if the activity in the app is not too high, this still drowns the trees (app request traces) behind a forest of the trace client's self-traces.

See the screenshot below, where the "fast" traces are from the db query spans and the "slow" traces are from the tracer client.
[Screenshot: Capture d’écran 2024-06-14 à 22 30 40]

This is with the full go.mod as attached:
go.mod.txt

from google-api-go-client.

quartzmo commented on August 28, 2024

Thanks for the quick response. The next thing I think we should try is to programmatically disable the trace context propagation in the Cloud Trace client (cloud.google.com/go/trace) with option.WithTelemetryDisabled. In your script above, change this line:

exporter, err := trace.New(trace.WithProjectID("projectId"))

to

exporter, err := trace.New(trace.WithProjectID("projectId"), trace.WithTraceClientOptions([]option.ClientOption{option.WithTelemetryDisabled()}))

This should disable the trace context propagation (source), which was the only OpenTelemetry-related code change in v0.154.0.
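For reference, here is the same change written out as a small helper with the extra import included. This is a sketch: the project ID is a placeholder, and it assumes the exporter type exposed by the opentelemetry-operations-go trace package is `trace.Exporter`.

```go
package main

import (
	"github.com/GoogleCloudPlatform/opentelemetry-operations-go/exporter/trace"
	"google.golang.org/api/option"
)

// newExporter builds the Cloud Trace exporter with the trace client's own
// telemetry disabled, so the exporter's export RPCs are not themselves traced.
func newExporter() (*trace.Exporter, error) {
	return trace.New(
		trace.WithProjectID("projectId"), // placeholder project ID
		trace.WithTraceClientOptions([]option.ClientOption{
			option.WithTelemetryDisabled(),
		}),
	)
}
```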


Pascal-Delange commented on August 28, 2024

Hi,
Considering that tracing is meant to estimate how long things (tasks, requests...) take to execute, and that the tracing operation itself happens asynchronously (except perhaps the odd "flush" operation), I would say it's safe to assume that the traces from the trace client sending traces itself will be noise >99.99% of the time (more or less visible, hence more or less annoying, depending on the type of app being traced).
(It may still show up if you're profiling the application, but that is a different story.)
A default behavior of not sending these traces would seem reasonable to me.


quartzmo commented on August 28, 2024

@Pascal-Delange Can you please provide some more detail about your configuration? Are you using the OpenCensus Bridge? Please show your imports as well as your complete OpenCensus and OpenTelemetry configuration. If you can write a simple demo that I can easily run that reproduces the loop, that would be ideal, thanks!


quartzmo commented on August 28, 2024

Are you using the OpenCensus Bridge?

Because you do not directly require go.opencensus.io in the attached go.mod file, I think you are not using the bridge. But please provide your full configuration, thanks.


quartzmo commented on August 28, 2024

@Pascal-Delange Are you able to upgrade libraries such as cloud.google.com/go/storage, firebase.google.com/go/v4 and google.golang.org/api to the latest versions?


quartzmo commented on August 28, 2024

How are you changing the version of google.golang.org/api if it is not a direct dependency in go.mod? Can you also provide the < v0.154.0 go.mod for the example you just posted? I am wondering what the version differences are for the direct dependencies that indirectly require google.golang.org/api. It would be good to establish which change in the direct dependency (or dependencies) results in the issue.

Within google.golang.org/api itself, there really isn't very much telemetry code. I believe it's just the trace context propagation support for http and grpc. Please correct me if I'm wrong.


Pascal-Delange commented on August 28, 2024

Hi,
I pinpointed google.golang.org/api as the likely root cause by narrowing down at which version my issue appears, after observing that bumping any GCP library also bumped google.golang.org/api's version along. I specifically pinned the version by running go get google.golang.org/api@version.
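For concreteness, the pinning boils down to commands like these (versions taken from the attached go.mod files):

```shell
# Pin the transitive dependency to the last known-good version...
go get google.golang.org/api@v0.153.0
# ...or to the first version that shows the issue.
go get google.golang.org/api@v0.154.0
```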
Attached are the go.mod files with v0.153.0 and v0.154.0, along with the diff below:

pascal % diff go.mod_153 go.mod_154
20a21
> 	github.com/felixge/httpsnoop v1.0.4 // indirect
31a33,34
> 	go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc v0.46.1 // indirect
> 	go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp v0.46.1 // indirect
40c43
< 	google.golang.org/api v0.153.0 // indirect
---
> 	google.golang.org/api v0.154.0 // indirect

So there are 3 additional indirect dependencies included if I switch to v0.154.0, with the following (or similar) dependency path:

pascal % go mod why -m go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc
# go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc
github.com/checkmarble/marble-backend
github.com/GoogleCloudPlatform/opentelemetry-operations-go/exporter/trace
cloud.google.com/go/trace/apiv2
google.golang.org/api/transport/grpc
go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc

go.mod_153.txt
go.mod_154.txt


quartzmo commented on August 28, 2024

Release v0.154.0 added OpenTelemetry trace context propagation via 46421d4. It also added the following (direct and indirect) dependencies to go.mod:

	go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc v0.46.1
	go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp v0.46.0
	github.com/felixge/httpsnoop v1.0.4 // indirect
	github.com/go-logr/logr v1.3.0 // indirect
	github.com/go-logr/stdr v1.2.2 // indirect
	go.opentelemetry.io/otel v1.21.0 // indirect
	go.opentelemetry.io/otel/metric v1.21.0 // indirect
	go.opentelemetry.io/otel/trace v1.21.0 // indirect

Can you modify your v153 demo script above to directly import these dependencies, and specify these versions (or whatever versions are in your v154 script) in your go.mod, while keeping google.golang.org/api at v0.153.0? The goal is to isolate the problem at its source. If you can still reproduce the problem while google.golang.org/api is at v0.153.0, we will know that the context propagation mechanism in v0.154.0 is not the problem. And if you can't reproduce, then I will investigate the context propagation mechanism in v0.154.0.
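A possible sequence for that experiment, assuming the versions listed above (adjust to whatever versions your v154 go.mod actually resolves):

```shell
# Add the new OpenTelemetry modules at their v0.154.0-era versions...
go get go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc@v0.46.1 \
       go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp@v0.46.0 \
       go.opentelemetry.io/otel@v1.21.0 \
       go.opentelemetry.io/otel/metric@v1.21.0 \
       go.opentelemetry.io/otel/trace@v1.21.0
# ...while keeping the suspect module pinned to the previous release.
go get google.golang.org/api@v0.153.0
```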


Pascal-Delange commented on August 28, 2024

Thanks for bearing with me, and for your helpful pointers.

With the modified script including the additional direct dependencies added in v0.154.0 (script and modified go.mod attached), the trace client does not trace itself.
go.mod_153_modified.txt
main.go.txt

If I now bump to v0.154.0 from this base, I get the self-tracing issue again. Note that bumping to v0.154.0 also bumps the (direct) dependency go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp from v0.46.0 to v0.46.1, but doing just that bump while staying on v0.153.0 does not replicate the issue either.


Pascal-Delange commented on August 28, 2024

Thank you! Passing this option seems to do the job.
I honestly did look for such an option, but I didn't find it and didn't think of moving up to the "google.golang.org/api/option" package.
All good for me then. For developer convenience, it might be good to expose a WithoutTelemetry option directly on the trace exporter :-)


quartzmo commented on August 28, 2024

@Pascal-Delange Thanks for your quick responses and for doing the legwork on this issue. Very much appreciated!

I am wondering if the Cloud Trace client (cloud.google.com/go/trace) should internally set WithTelemetryDisabled. Can you think of any use cases in which the Cloud Trace client should propagate the context for its own tracing to the Cloud Trace service? Or will it always just cause extra noise?

