cloudfoundry / go-diodes Goto Github PK

View Code? Open in Web Editor NEW

452.0 452.0 22.0 2.73 MB

Diodes are ring buffers manipulated via atomics.

License: Apache License 2.0

Go 100.00%

go-diodes's People

Contributors

Stargazers

Watchers

go-diodes's Issues

Batch allocate buckets

There has been discussion about batch allocating buckets to prevent the single bucket allocation on every write from slowing down the write path. The idea is that perhaps Go can allocate a contiguous block of buckets faster than it can allocate individual buckets. Additionally, these buckets will likely be part of the same cache line since they are contiguous. I am creating this issue to document the results. If the results are positive we should consider making this change.

OneToOne - This is easy to implement since the single writer can batch allocate for itself when it runs out of buckets.
ManyToOne - This is more difficult since we will need to synchronize access to the buckets and coordinate what goroutine will do the allocation. This might be a net loss but worth attempting.

Add a GH action to lint the codebase with golangci-lint

Len calculation in diodes

re: #2

The diodes are frequently doing this thing:

idx := someUint64Variable % len(buffer)

Very, very occasionally data will be lost if the buffer length does not divide evenly into int64 = 18446744073709551615 . Probably not something worth stressing about, but thought I'd mention. The amount of data lost increases as the buffer size increases.

Move to code.cloudfoundry.org

This will require:

Rewriting test imports.
Adding some documentation to the README.md for how to go get this package.

is this better than go channel?

var strChan = make(chan string, 1024)

func Set(str string) {
	select {
	case strChan <- str:
	default:
		// drop
	}
}

I think what go-diodes better than go-channel is it override previous, but go-channel keeps previous and drops incoming.

Please configure GITBOT

Pivotal uses GITBOT to synchronize Github issues and pull requests with Pivotal Tracker.
Please add your new repo to the GITBOT config-production.yml in the Gitbot configuration repo.
If you don't have access you can send an ask ticket to the CF admins. We prefer teams to submit their changes via a pull request.

Steps:

Fork this repo: cfgitbot-config
Add your project to config-production.yml file
Submit a PR

If there are any questions, please reach out to [email protected].

Benchmarks measure diode and `sync.WaitGroup` setup.

In many of the benchmarks where we are not calling b.ResetTimer() we are also measuring the creation time of the diode/channel under test. This seems odd since a majority of the performance demands of the lifetime of a diode are in writes/reads and not instantiation.

Additionally, we are measuring sync.WaitGroup setup. This also seems odd.

go-diodes/benchmarks_test.go

Lines 20 to 40 in 1273221

 func BenchmarkOneToOnePoller(b *testing.B) { 

 d := diodes.NewPoller(diodes.NewOneToOne(b.N, diodes.AlertFunc(func(missed int) { 

 panic("Oops...") 

 }))) 

 var wg sync.WaitGroup 

 wg.Add(1) 

 defer wg.Wait() 

 go func() { 

 defer wg.Done() 

 for i := 0; i < b.N; i++ { 

 data := []byte("some-data") 

 d.Set(diodes.GenericDataType(&data)) 

 } 

 }() 

 for i := 0; i < b.N; i++ { 

 d.Next() 

 } 

 }

StartTimer in benchmarks

It seems odd to call b.StartTimer in the benchmarks since the benchmark framework starts the timer before the benchmark code is ran. I'm thinking this was meant to be b.ResetTimer.

go-diodes/benchmarks_test.go

Line 128 in c187a7c

b.StartTimer()

@apoydence

Diode `TryNext` returns unnecessary values.

This API returns two values. The second is unnecessary as the first value will always be nil if the second is false:

data, ok := d.TryNext()
if !ok {
	// handle failed read
}

data := d.TryNext()
if data == nil {
	// handle failed read
}

This conveys to the user that nil reads (nil, true) and failed reads with data (data, false) are possible.

Slow polling from a diode

I'm using a diode to read price feed data (my ws has a frequency of around 100ms) and drop old prices if reacting to some other event is keeping main busy. I also want to be notified when a new price arrives to be as reactive as possible when main is idle. Is there a way to do this naturally with go-diodes ?

For now i'm using:

a diode to overwrite old data
a 1-buffered Notifier channel to notify me when a new price arrives that accepts input only when its empty to avoid blocking
(code below)

This model seems to work for a while until it doesn't: the d.d.Next() op starts taking 100ms per read blocking everything else. I feel like this "notifier + diode" hack is not the proper way to do it but I can't find a better solution.

type OneToOne struct {
	d *diodes.Poller
}

func NewOneToOne(size int, alerter diodes.Alerter) *OneToOne {
	return &OneToOne{
		d: diodes.NewPoller(diodes.NewOneToOne(size, alerter)),
	}
}

func (d *OneToOne) Set(data []byte) {
	d.d.Set(diodes.GenericDataType(&data))
}

func (d *OneToOne) Next() *PriceFeed {
	t := time.Now()
	data := d.d.Next()
	golog.Stdlogger.Debugf("Next price took %v", time.Since(t))

	bytes := *(*[]byte)(data)
	return Unpack(bytes)
}

func Unpack(msg []byte) *PriceFeed {
	t := time.Now()
	feed := PriceFeed{}
	json.Unmarshal(msg, &feed)
	golog.Stdlogger.Debugf("Unpacking price took %v", time.Since(t))
	return &feed
}

Writing loop:

	for {
		if time.Since(tlastprice) > PRICE_TIMEOUT {
			panic(fmt.Sprintf("Didn't recieve new price feed in %s", PRICE_TIMEOUT))
		}
		messageType, msg, err := priceWS.ReadMessage()
		if err == nil {
			if err := priceWS.WriteMessage(messageType, msg); err == nil {
				tlastprice = time.Now()
				diode.Set(msg)
				if len(priceNotifier) == 0 {
					priceNotifier <- 1
				}
			}
               }
        }

reading in main, only place where Next is called:

for {
      select {
		// Update state when new price arrives
		case <-priceNotifier:
			t := time.Now()
			priceFeed = priceDiode.Next()
			tdiode += time.Since(t)
			t = time.Now()
                // other cases
            }
}

Writers can get ahead of readers

We have demonstrated that it is possible for readers to get ahead of writers. When this happens the writer must write to catch up to the reader before the reader can continue. Additionally, messages are delayed for quite some time. Imagine the following 4 element diode:

|  0  |  1  |  2  |  3  |
|  4  |  5  |  6  |  7  |
|  8  |  9  |  10 |  11 |

Both the read and write heads start at 0. | nil | nil | nil | nil | r: 0, w: 0
The write head writes values to 0 and 1. | 0 | 1 | nil | nil | r: 0, w: 2
The read head reads values 0 and 1. | nil | nil | nil | nil | r: 2, w: 2
In the mean time the write head writes values 2, 3, 4, 5, and 6. | 4 | 5 | 6 | 3 | r: 2, w: 7
The read head reads value 6, as 2 was overwritten by the writer. The read head detects this and fast forwards. | 4 | 5 | nil | 3 | r: 7, w: 7
The read head then reads value 3 expecting value 7. It reads the value happily and continues to read 4, and 5. | nil | nil | nil | nil | r: 10, w: 7

As you can see, we now have the reader ahead of the writer. This is undesirable. Both the many-to-one and one-to-one diodes suffer this bug.

Remove Travis CI

It doesn't run anyway and there's some security vulnerabilities associated with it.

Upgrade testing to use ginkgo v2

Not swapping with nil on reads.

Currently we swap in nil every time we do a read:

atomic.SwapPointer(&d.buffer[idx], nil)

It was discussed previously that we might want to switch this to a load:

atomic.LoadPointer(&d.buffer[idx])

This may give us a performance improvement at the cost of memory use. Some things we should try:

Benchmark using atomic.LoadPointer to see what the performance advantages are on modern hardware.
If there are substantial performance gains, create new diodes that trade off memory use for the performance gain.

Allow nil alerter func.

If I don't care about alerting it would be nice to be able to pass in nil as the Alerter. This would clean up the API.

diodes.NewManyToOne(10000, nil)

diodes.NewManyToOne(10000, diodes.AlertFunc(func(int) {
	// NOP
}))

If doing the if d.alerter == nil check when alerting is a performance consideration we can do the nil check in the constructor and simply create the nop alert func for them.

Another consideration is doing something like:

diodes.NewManyToOne(10000) // no alert func
diodes.NewManyToOne(10000, diodes.WithAlertFunc(func(int) {
	// some alert
})))

but that would be a breaking change.

Are our benchmarks doing anything?

Benchmarks allocate memory during the benchmark

A lot of the benchmarks allocate memory during the bench. This skews -benchmem and ns/op results.

go-diodes/benchmarks_test.go

Line 32 in 1273221

data := []byte("some-data")

Diode len problem (uint64 vs int64)

The diodes are frequently doing this thing:

idx := someUint64Variable % len(buffer) // len = int = int64 on most compilers

This is a problem, I think. Notes:

int64 max = 9223372036854775807 // 63 bits
uint64 max = 18446744073709551615 // 64 bits
difference = 9223372036854775808 // about half, b/c uint64 has 1 more bit

So, the buffer will go like this:

fill to 100% buffer size (63 bits, 50% read index)
begin back at index 0 (0% buffer size) because of mod, but readindex is 50%+
at 50% buffer size, begin back at 0 because readIndex wrapped

test issue

Rewrite go diodes with generics?

Could we improve the functionality of go diodes by rewriting them to use go generics.

	func BenchmarkOneToOnePoller(b *testing.B) {
	d := diodes.NewPoller(diodes.NewOneToOne(b.N, diodes.AlertFunc(func(missed int) {
	panic("Oops...")
	})))

	var wg sync.WaitGroup
	wg.Add(1)
	defer wg.Wait()

	go func() {
	defer wg.Done()
	for i := 0; i < b.N; i++ {
	data := []byte("some-data")
	d.Set(diodes.GenericDataType(&data))
	}
	}()

	for i := 0; i < b.N; i++ {
	d.Next()
	}
	}

cloudfoundry / go-diodes Goto Github PK

go-diodes's People

Contributors

Stargazers

Watchers

Forkers

go-diodes's Issues

Recommend Projects

Recommend Topics

Recommend Org