Git Product home page Git Product logo

Comments (19)

tchakabam avatar tchakabam commented on July 26, 2024

I am using LockableStorage in combination with Crosstab to avoid concurrency issues: https://github.com/tchakabam/LockableStorage

from crosstab.

nevcos avatar nevcos commented on July 26, 2024

I've tested 10 more times and I can't reproduce it anymore, but certainly there's a hole somewhere.

Thank you by your suggestion @tchakabam.
Actually, checking the code, there should be no need on a LockableStorage, because like is described on the README, it's an "lockless system that is entirely event driven".

from crosstab.

tchakabam avatar tchakabam commented on July 26, 2024

that doesn't mean that it never needs a lock :D

as soon as you want to cover transactions (concurrent reading and writing to local storage) from several tabs behavior is undefined.

i very much assume that to determine a master tab some kind of "lock" has to be acquired and held by one of the tabs.

we have an own mechanism in our app to determine a master tab for something application specific (independently of what crosstab does to do its stuff) and this definitely requires some kind off locking mechanism like a mutex simply because there is no way to take ownership of something without doing some kind of atomic read & write.

you might be hitting a very rare race condition where in the end no-one has acquired because of a very specific order of reads and writes and subsequent events.

from crosstab.

nevcos avatar nevcos commented on July 26, 2024

After some more tests, I've found a way to reproduce it some times.
I've installed the Awesome Reload All Tabs Button extension and had the following results:

Steps:

  1. Refresh Tabs 1 and 2
  2. (921ms) Tab 1 borns as master
  3. (984ms) Tab 2 borns as master
  4. (1110ms) Tab 1 is demoted to slave (probably received the Tab 2 event)
  5. (1196ms) Tab 2 is demoted to slave (probably received the Tab 1 event)

Also happened:

  1. Refresh Tabs 1 and 2
  2. (274ms) Tab 2 is promoted to master
  3. (283ms) Tab 2 is demoted to slave (don't know why)
  4. (301ms) Tab 1 borns as master
  5. (364ms) Tab 1 is demoted to slave (probably received the Tab 2 event)

from crosstab.

tejacques avatar tejacques commented on July 26, 2024

@tchakabam the issue with locking via localStorage (which is how I assume you are doing "locking") is that the way localStorage propagation is implemented in most browsers is not atomic and cannot be used to implement a proper mutex. You can always end up in situation where both tabs believe they hold the lock. Because of that, I gave up on the idea of locking and instead went for more of a distributed system that notifies other tabs of its own state.

From this issue it seems like there is a bug or race condition. It's likely that it has to do with the way events are cited on unload as the tab refresh is done. I'll look into it.

from crosstab.

tchakabam avatar tchakabam commented on July 26, 2024

@tejacques you shouldn't give up, there always is a way ;)

you are right localStorage stuff is not atomic, but using some tricks you can do something that does locking.

i am not sure it does exactly cover 100% of all cases and is completely fail-proof, but we use an implementation based on this article [1]. See implementation [2].

[1] http://balpha.de/2012/03/javascript-concurrency-and-locking-the-html5-localstorage/

[2] https://github.com/tchakabam/LockableStorage

from crosstab.

tejacques avatar tejacques commented on July 26, 2024

@tchakabam hate to break it to you, but I originally based crosstab's first implementation off the very same article. While the idea in theory works on a regular system with shared memory, this is not the way localStorage engines work these days in most browsers. The following browsers: Chrome, Firefox, Safari, IE, Opera end up sending the change in localStorage to the process via message passing, and it is not applied until the event loop is free (essentially when the StorageEvent is processed). Because of this, it's possible (and I've verified it on the above browsers) for every tab to believe it holds the lock by doing some heavy computing in a synchronous block.

I did not test your implementation, but if you open ~10-20 tabs on a test page that just locks and bumps a counter a set number of times in localStorage, in a synchronous block with some king of busy loop you will likely discover it the final number isn't correct when you refresh all tabs, even accounting for the right delay to use in the algorithm.

Because of that, and the fact that locking with a delay can add a significant chunk of latency, I removed it and tried to go for a design that didn't have shared state mutatable by more than one tab.

from crosstab.

tchakabam avatar tchakabam commented on July 26, 2024

ah ok very interesting :) thanks a lot for sharing that insight!

no i really didn't know that and probably trusted too much that the assumptions around the mutex algorithm were applicable here.

but can one basically say that they are applicable under the condition where no heavy computation is done on the main loop in any tab?

from crosstab.

tejacques avatar tejacques commented on July 26, 2024

Sadly no :( that was just a trick I discovered to force the race condition. The article's method becomes more reliable the longer you wait to verify you still hold the lock (and nobody else was trying to acquire it), but that adds a lot of latency anytime you want to do something which is a killer.

from crosstab.

tejacques avatar tejacques commented on July 26, 2024

This is a really fascinating issue. This is due to a race condition, where when starting up, each tab has an inconsistent state of the other tabs, and they may disagree about who won the election (specifically, each believes it has won the election).

Here's an example of what happened:

  • Tabs 1-4 all start at the same time and read from localStorage at the same time, finding nothing there
  • They each broadcast and updateTab status in their keepalive loop.
  • This causes each of them to run a masterTabElection, wherein their own tab wins, because it is the only tab they know of
  • Each of the tabPromoted events are sent out and processed in some order, in this case: Tab4, Tab3, Tab1, Tab2. However, Tab2 will process it's own event FIRST, meaning it will see Tab1 as the most recently promoted tab, while ALL other tabs see Tab2 as the most recently promoted tab.
  • This leads to no master tab

The fix is actually fairly straightforward -- any time a promotion occurs, if your tab is actually more senior, respond by broadcasting a tabPromoted event with your own id, and continue being master.

There is sadly a fairly real downside to this, which is that if you have a lot of tabs open to the same domain, and close them all, then bring them all back, they will likely all begin believing they are the master, and do the regular masterTab setup stuff. Ideally we could come up with a better solution here, where not all of them end up running all of this. Perhaps adding in a random number of event loop deferals before running the election would work? That way they would be more likely to receive updates.

from crosstab.

nevcos avatar nevcos commented on July 26, 2024

Why the case where you "bring them (tabs) all back" is different from the first case?
And, can't we use the "tabId" number to untie instead of using an random number?

Maybe there should be a period of time where an promotion could be accepted/rejected by the other tabs, what did you think? Example:

  • Tab1 (id=1) proposes himself as Master
  • Tab2 (id=2) proposes himself as Master
  • Tab1 receives Tab2 propose and rejects it because id 1< id 2 and t/2 timeout not reached
  • Tab2 receives Tab1 propose and accepts it because id 1 < id 2
  • Tab2 received the rejection so it will continue as slave
  • Tab1 received the accept from Tab2
  • Tab1 reaches the timeout (t) for receiving feedback
  • Tab1 is promoted to Master and send the promoted event
  • Tab2 receives Tab1 promoted event

Was thinking that this is very similar with the Bully algorithm, maybe it's a good option to use it.

from crosstab.

tejacques avatar tejacques commented on July 26, 2024

It sort of uses the bully algorithm now, just without correcting or bullying out the other master if they try to claim it -- that will definitely be added.

Your idea to introduce latency works, but I try be very careful about not introducing latency wherever possible, because that's going to slow down the page. Even 50ms is a pretty big deal. It's also really only an issue in the case of refreshing all open tabs. Maybe the right answer here is to use that fact to our advantage -- if the tab is not currently active/visible, it can more safely introduce latency. If that ends up being too complicated I think I will just use several event loop deferrals.

from crosstab.

tchakabam avatar tchakabam commented on July 26, 2024

@tejacques so basically you would say that the current Crosstab implementation to determine a master tab is more reliable than what "Lockable Storage" does?

Does Crosstab also reliably detect closing tabs (in order to redetermine the master role) and what do you do when you don't get a 'closed' even from a tab. Are you able to detect inconsistent state and clean it up reliably?

Sorry if I am kind off hi-jacking this discussion to gain general knowledge .. ;)

from crosstab.

tchakabam avatar tchakabam commented on July 26, 2024

and about the current issue and in general fixing inconsistent state: i tried to push a heartbeat from the master to all the tabs every 3 seconds or so to at least detect such a problem. when a child tab doesn't receive the heartbeat (using a timeout over 3+ seconds) it should try to acquire the master role (that is not Crosstab master role, but master role in my app). however this fails a lot because it seems that sometimes crosstab propagation actually takes really long (apparently sometimes over double of the heartbeat period). can you explain that?

from crosstab.

tejacques avatar tejacques commented on July 26, 2024

@tchakabam I mean the fact that this issue exists means there are still issues :), but the general idea of a distributed algorithm without shared state should be more reliable than trying to implement a mutex given the restrictions that localStorage has. That was the conclusion I came to and why I am trying to solve it that way. I also wanted to be very careful about latency, as I've mentioned, and concluded that going the locking route simply wasn't viable given those constraints.

The tabs do evict stale tabs that have not been updated for TAB_TIMEOUT (5s), they heartbeat every TAB_KEEPALIVE (3s).

I do not know why you would be seeing propagation take a long time unless a tab was in a busy/blocking loop which prevented the heartbeat. I'd have to see your code and the issue to say for sure.

from crosstab.

tchakabam avatar tchakabam commented on July 26, 2024

OK cool. Thanks for this explanation!

So basically a broken state would get cleaned up via this distributed algorithm in about a couple of seconds?

Doesn't that mean that this issue here should actually "fix itself" in a few seconds after the tabs reloading as they figure out nobody is the master since there is no heartbeat propagated?

Maybe adding some randomization to the timeout durations in each tabs will help for making it also work with a concurrent reload of all tabs?

I found that this whole part of master role distribution (which can be very useful to applications) is not really documented, although it is probably simple by using the events here [1] which are usually only for "under-the-hood".

So I can just listen to these events as an API consumer? How about collision with application events with same naming?

I would probably like to contribute some API documentation ;) and maybe some other stuff ...

[1] https://github.com/tejacques/crosstab/blob/master/src/crosstab.js#L180-L186

from crosstab.

tejacques avatar tejacques commented on July 26, 2024

Yes -- that's pretty much spot on. There might originally be some incorrect state, but because they're all talking to each other, they know "hey this thing is wrong, here's the right thing", and it gets corrected in a few cycles.

Documentation is severely lacking. As part of the v1.0.0 release I'd love to have everything thoroughly documented with examples and explanations, but right now I'm just trying to get it to work correctly. If you'd like to help contribute that would be very appreciated!

All that said, I should have a fix for this issue in tonight.

from crosstab.

tejacques avatar tejacques commented on July 26, 2024

@nevcos This should be resolved in master. Will release an update to npm soon.

from crosstab.

nevcos avatar nevcos commented on July 26, 2024

Thank you @tejacques!
I've done some tests and seems to be working well. Good job 👍 !

from crosstab.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.