Comments (12)

danielchalef commented on August 23, 2024

Good point. An exponential backoff may work better here. I'm loath to increase the max backoff time beyond ~10 seconds, preferring that the client time out and retry. See #330

from zep.

nicoeiris11 commented on August 23, 2024

@danielchalef thank you so much for releasing a new version with the changes so fast.

I'll let you know as soon as I have updates from the new tag testing.

danielchalef commented on August 23, 2024

The Postgres website has good guidance on tuning. I've also found this tool useful: https://pgtune.leopard.in.ua/

danielchalef commented on August 23, 2024

Thanks for raising this. Candidly, we did not design user metadata handling with high concurrency requirements in mind. It would be helpful to understand how you're using user metadata. What data are you storing in the user object?

nicoeiris11 commented on August 23, 2024

Hi @danielchalef, thanks for your quick response.

My data flow is the following:

  • The client requests a new user, attaching a text file associated with it (this file contains 2 sections to be summarized at the same time).
  • My FastAPI service creates the user in Zep with empty metadata and triggers 2 Celery tasks associated with the user (each async task does the heavy processing of summarizing one section).
  • At this point, I queue both Celery tasks in the "main" thread and update the user metadata to save both Celery task IDs.
  • Each Celery async task summarizes its section and, when it's done, updates the user metadata in Zep with the resulting summary.

So in the user metadata I save:

  1. User info
  2. Celery task 1 ID
  3. Celery task 2 ID
  4. Celery task 1 summary result (section 1 of the file)
  5. Celery task 2 summary result (section 2 of the file)

What happens is that sometimes both async tasks, or the "main" thread and one of the async tasks, attempt to update the user metadata at the same time, and the app breaks with an APIError (produced by the advisory lock in the Postgres DB).

Ideally, one of the concurrent updates should keep waiting for the release to happen. Or, at least, the API should provide a method like zep.user(user_id).is_locked() to avoid the APIError and wait.
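
Until the server waits on the lock itself, the "wait instead of failing" behavior described above could be approximated on the client side. The following is a minimal sketch under stated assumptions, not part of Zep's API: `LockedError` is a hypothetical stand-in for whatever exception the client raises on a lock conflict, and `update` is any zero-argument callable that performs the metadata update.

```python
import time


class LockedError(Exception):
    """Hypothetical stand-in for the error raised when the user record is locked."""


def call_with_wait(update, attempts=5, wait_seconds=0.5,
                   retry_on=(LockedError,), sleep=time.sleep):
    """Call update(); on a lock error, wait and try again.

    The last failure is re-raised once `attempts` calls have been made.
    """
    for attempt in range(1, attempts + 1):
        try:
            return update()
        except retry_on:
            if attempt == attempts:
                raise  # give up and surface the original error
            sleep(wait_seconds)  # give the other writer time to release the lock
```

Each Celery task would wrap its own update call this way, passing the real exception type through `retry_on`. Because the tasks run in separate worker processes, an in-process lock would not help here, which is why the sketch retries rather than synchronizes.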

danielchalef commented on August 23, 2024

Candidate fix in #329

danielchalef commented on August 23, 2024

@nicoeiris11 Zep v0.24.0 includes an experimental approach to locking user metadata that should cope better with high-concurrency updates to the same user record. Please try it out and let me know if this fixes your issue. We'll apply this fix more widely if so.

nicoeiris11 commented on August 23, 2024

@danielchalef, thank you so much for the quick fix.

I tested v0.24.0 and the number of failed updates decreased significantly, but I still had some cases in which Zep failed after the 3 retries. Is it possible to experiment with a retry policy with exponential backoff and more time between attempts?
For an environment with a heavy load and multiple users updated concurrently, 200ms doesn't seem to be enough time for locks to be released. What about something like starting with 5 seconds with exponential backoff?
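
To make the proposal concrete, here is a rough sketch of how an exponential backoff schedule grows. It is illustrative only and does not reflect Zep's actual retry code: the function name and parameters are hypothetical, the 200ms starting delay comes from the current behavior mentioned above, and the 10 second cap is an assumed upper bound.

```python
def backoff_delays(base=0.2, factor=2.0, max_delay=10.0, attempts=8):
    """Return the wait (in seconds) before each retry.

    The delay starts at `base`, is multiplied by `factor` after every
    attempt, and never exceeds `max_delay`.
    """
    return [min(base * factor ** attempt, max_delay) for attempt in range(attempts)]


if __name__ == "__main__":
    # Starting at 200ms: the cap is only reached on the 7th retry.
    print(backoff_delays())
    # Starting at 5 seconds: the cap is reached on the 2nd retry.
    print(backoff_delays(base=5.0, attempts=3))
```

With a 200ms start, the first six waits add up to 12.6 seconds, so a small starting delay still gives the lock a fair amount of time to clear without making the uncontended case slower.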

I appreciate your time and dedication to improving the user experience/development of the tool.

danielchalef commented on August 23, 2024

v0.25.0 - Use Exponential Backoff for Metadata Lock Fails is building. Please let me know your thoughts once you've had a chance to check it out.

nicoeiris11 commented on August 23, 2024

@danielchalef I want to confirm that with v0.25.0 my load test passed 100% ok.

There will always be a request-load threshold beyond which Zep times out due to the number of concurrent DB operations. But in my latest experiments, I no longer saw any APIError caused by locks (only timeouts when I manually forced a very high load scenario).

I want to thank you for your hard work and dedication to maintaining the repo and addressing developers' issues.
Best regards!

danielchalef commented on August 23, 2024

@nicoeiris11 Great to hear! If you're using Zep's default docker-compose setup, the Postgres instance is not tuned for production use. There is, however, plenty of literature online on sizing and tuning Postgres deployments.

nicoeiris11 commented on August 23, 2024

@danielchalef Yes, I'm using docker-compose in production, pointing to the stable release tag.
Do you have any specific recommendations or resources on Postgres tuning to get more out of this Docker service?
