Git Product home page Git Product logo

Comments (5)

Luckyboys avatar Luckyboys commented on August 23, 2024

in go issue was said rand.Rand need use in only one goroutine.
using a mutex is the right solution for multiple goroutine.
golang/go#3611

so, either we make the RandomSampler has a mutex to control the write operation only one for a goroutine, or let the rand.Rand has multiple copies in every goroutine (it like a random computer pool).

from go2sky.

wu-sheng avatar wu-sheng commented on August 23, 2024

I am not sure in Go, but at least in Java, we keep one random seed per thread. A lock is not acceptable, because sampling is for better performance, lock is on the opposite way.

from go2sky.

Luckyboys avatar Luckyboys commented on August 23, 2024

In this https://github.com/SkyAPM/go2sky/blame/master/sampler.go.(RandomSampler) implement, we only create one RandomSampler and set it to the tracer.

no matter how we created the tracer, we using the sampler entity is only one.
so, if multiple goroutine to use it to call IsSampled,
the IsSampled implement

func (s *RandomSampler) IsSampled(operation string) bool {
	return s.threshold > s.rand.Intn(100)
}

it has propability called s.rand.Intn(100) in multiple goroutine.

if the mutex solution is not acceptable,
I suggest use a only one goroutine to create the ramdom number and push to a channel,
other read goroutine is pull the random number from the channel.
that's will be reduces performance loss due to locking.

from go2sky.

hanahmily avatar hanahmily commented on August 23, 2024

I suggest use a only one goroutine to create the ramdom number and push to a channel,
other read goroutine is pull the random number from the channel.
that's will be reduces performance loss due to locking.

How about using a new rand in every goroutine? it's very neat, might be faster than mutex. If we have concerns about the overhead of the object allocation, the object pool should improve it.

Anyway, a benchmark is always helpful to find the ideal solution.

from go2sky.

Luckyboys avatar Luckyboys commented on August 23, 2024

How about using a new rand in every goroutine? it's very neat, might be faster than mutex.

a question is how to known current goroutine was created the rand.Rand entity.
As I knew, goroutine is memory shared.
It doesn't like working mechanism of thread.
so, we can't check the variable was created in this goroutine.

And I was benchmarked current RandomSampler, used mutex solution, and used channel to store generate random number solution.

Go Benchmark Results

CPU: Apple silicon M1 Pro 10 cores
Memory: 16GB
goos: darwin
goarch: arm64
pkg: github.com/SkyAPM/go2sky
BenchmarkRandomSampler_IsSampled-10               	195834091	         5.965 ns/op
BenchmarkRandomMutexSampler_IsSampled-10          	 8900761	       144.6 ns/op
BenchmarkRandomGlobalSampler_IsSampled-10         	 8900469	       146.2 ns/op
BenchmarkRandomChannelSampler0_IsSampled-10       	 2612180	       442.4 ns/op
BenchmarkRandomChannelSampler1_IsSampled-10       	 2770339	       386.1 ns/op
BenchmarkRandomChannelSampler128_IsSampled-10     	 4160923	       296.5 ns/op
BenchmarkRandomChannelSampler1024_IsSampled-10    	 4435966	       293.8 ns/op
BenchmarkRandomPoolSampler_IsSampled-10           	190921699	         6.359 ns/op
PASS
ok  	github.com/SkyAPM/go2sky	16.800s

The conclusion is non mutex is very fast, but the rand.Rand is not concurrency safe.
so it can't use in production.
However, if we need concurrency safe, it's at least lock once when use the rand.Rand to generate a random number.

for example, the entity pool solution is need a lock to get and put the rand.Rand entity.
or the go channel solution, because go channel it using a lock in internal implement.
but if use sync.pool to implement pool, the lock is use internal lock, it will try to use optimistic lock in internal, so the performance will great better than to use mutex to impletement it.

Sampler Implement Codes

// RandomMutexSampler Use mutex to implement concurrent-safe for randomizer.
type RandomMutexSampler struct {
	mutex        sync.Mutex
	samplingRate float64
	rand         *rand.Rand
	threshold    int
}

// IsSampled implements IsSampled() of Sampler.
func (s *RandomMutexSampler) IsSampled(operation string) bool {
	s.mutex.Lock()
	defer s.mutex.Unlock()
	return s.threshold > s.rand.Intn(100)
}

func (s *RandomMutexSampler) init() {
	s.rand = rand.New(rand.NewSource(time.Now().Unix()))
	s.threshold = int(s.samplingRate * 100)
}

func NewRandomMutexSampler(samplingRate float64) *RandomMutexSampler {
	s := &RandomMutexSampler{
		samplingRate: samplingRate,
	}
	s.init()
	return s
}

// RandomChannelSampler Use channel to implement concurrent-safe for randomizer.
type RandomChannelSampler struct {
	samplingRate        float64
	randomNumberChannel chan int
	rand                *rand.Rand
	threshold           int
}

// IsSampled implements IsSampled() of Sampler.
func (s *RandomChannelSampler) IsSampled(operation string) bool {
	n := <-s.randomNumberChannel
	return s.threshold > n
}

func (s *RandomChannelSampler) init() {
	s.rand = rand.New(rand.NewSource(time.Now().Unix()))
	s.threshold = int(s.samplingRate * 100)
	go func() {
		for {
			s.randomNumberChannel <- s.rand.Intn(100)
		}
	}()
}

func NewRandomChannelSampler(samplingRate float64, channelSize int) *RandomChannelSampler {
	s := &RandomChannelSampler{
		samplingRate:        samplingRate,
		randomNumberChannel: make(chan int, channelSize),
	}
	s.init()
	return s
}

// RandomGlobalSampler Use global randomizer to ensure concurrent-safe, it's used a local lock too.
type RandomGlobalSampler struct {
	samplingRate float64
	threshold    int
}

// IsSampled implements IsSampled() of Sampler.
func (s *RandomGlobalSampler) IsSampled(operation string) bool {
	return s.threshold > rand.Intn(100)
}

func (s *RandomGlobalSampler) init() {
	s.threshold = int(s.samplingRate * 100)
}

func NewRandomGlobalSampler(samplingRate float64) *RandomGlobalSampler {
	s := &RandomGlobalSampler{
		samplingRate: samplingRate,
	}
	s.init()
	return s
}

// RandomPoolSampler Use sync.Pool to implement concurrent-safe for randomizer.
type RandomPoolSampler struct {
	samplingRate float64
	threshold    int
	initCount    int
	pool         sync.Pool
}

// IsSampled implements IsSampled() of Sampler.
func (s *RandomPoolSampler) IsSampled(operation string) bool {

	return s.threshold > s.generateRandomNumber()
}

func (s *RandomPoolSampler) init() {
	s.threshold = int(s.samplingRate * 100)
	for i := 0; i < s.initCount; i++ {
		s.pool.Put(s.newRand())
	}
	s.pool.New = s.newRand
}

func (s *RandomPoolSampler) generateRandomNumber() int {

	r := s.getRandomizer()

	randomNumber := r.Intn(100)

	s.returnRandomizer(r)

	return randomNumber
}

func (s *RandomPoolSampler) returnRandomizer(r *rand.Rand) {
	s.pool.Put(r)
}

func (s *RandomPoolSampler) getRandomizer() *rand.Rand {

	var r *rand.Rand

	generator := s.pool.Get()
	if generator == nil {
		generator = s.newRand()
	}

	r, ok := generator.(*rand.Rand)
	if !ok {
		r = s.newRand().(*rand.Rand) // it must be *rand.Rand
	}

	return r
}

func (s *RandomPoolSampler) newRand() interface{} {
	return rand.New(rand.NewSource(time.Now().UnixNano()))
}

func NewRandomPoolSampler(samplingRate float64, initCount int) *RandomPoolSampler {
	s := &RandomPoolSampler{
		samplingRate: samplingRate,
		initCount:    initCount,
	}
	s.init()
	return s
}

Go Test Codes

func BenchmarkRandomSampler_IsSampled(b *testing.B) {
	sampler := NewRandomSampler(0.5)
	operationName := "op"
	// will be panic
	/*b.RunParallel(func(pb *testing.PB) {
		for pb.Next() {
			sampler.IsSampled(operationName)
		}
	})*/
	// Only run serial
	for i := 0; i < b.N; i++ {
		sampler.IsSampled(operationName)
	}
}

func BenchmarkRandomMutexSampler_IsSampled(b *testing.B) {
	sampler := NewRandomMutexSampler(0.5)
	operationName := "op"
	b.RunParallel(func(pb *testing.PB) {
		for pb.Next() {
			sampler.IsSampled(operationName)
		}
	})
}

func BenchmarkRandomGlobalSampler_IsSampled(b *testing.B) {
	sampler := NewRandomGlobalSampler(0.5)
	operationName := "op"
	b.RunParallel(func(pb *testing.PB) {
		for pb.Next() {
			sampler.IsSampled(operationName)
		}
	})
}

func BenchmarkRandomChannelSampler0_IsSampled(b *testing.B) {
	sampler := NewRandomChannelSampler(0.5, 0)
	operationName := "op"
	b.RunParallel(func(pb *testing.PB) {
		for pb.Next() {
			sampler.IsSampled(operationName)
		}
	})
}

func BenchmarkRandomChannelSampler1_IsSampled(b *testing.B) {
	sampler := NewRandomChannelSampler(0.5, 1)
	operationName := "op"
	b.RunParallel(func(pb *testing.PB) {
		for pb.Next() {
			sampler.IsSampled(operationName)
		}
	})
}

func BenchmarkRandomChannelSampler128_IsSampled(b *testing.B) {
	sampler := NewRandomChannelSampler(0.5, 128)
	operationName := "op"
	b.RunParallel(func(pb *testing.PB) {
		for pb.Next() {
			sampler.IsSampled(operationName)
		}
	})
}

func BenchmarkRandomChannelSampler1024_IsSampled(b *testing.B) {
	sampler := NewRandomChannelSampler(0.5, 1024)
	operationName := "op"
	b.RunParallel(func(pb *testing.PB) {
		for pb.Next() {
			sampler.IsSampled(operationName)
		}
	})
}

func BenchmarkRandomPoolSampler_IsSampled(b *testing.B) {
	sampler := NewRandomPoolSampler(0.5, runtime.NumCPU())
	operationName := "op"
	b.RunParallel(func(pb *testing.PB) {
		for pb.Next() {
			sampler.IsSampled(operationName)
		}
	})
}

from go2sky.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.