Comments (10)

baileygm commented on June 9, 2024

That's correct, no other extra flags were passed

I'll try to retest with the setting you requested before the end of the week

shivzone commented on June 9, 2024

In our internal experiments we noticed that with table sizes in the range of 10MB-3GB, setting restore_multipart_chunksize to 10MB performs better. Can you share the gpbackup and gprestore flags you used as well?
In the meantime we'll continue evaluating.

baileygm commented on June 9, 2024

These are the settings we are using:

backup_max_concurrent_requests: 6
backup_multipart_chunksize: 100M
restore_max_concurrent_requests: 6
restore_multipart_chunksize: 100M
encryption: off
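
For reference, these options live in the YAML config file that is passed to gpbackup and gprestore via --plugin-config. A minimal sketch of what such a file might look like (the executable path, region, bucket, and folder are placeholders, and AWS credentials are elided):

```sh
# Sketch of a gpbackup-s3-plugin config file; all paths and bucket
# details below are placeholders. Note the MB suffix on the chunk
# sizes (see the clarification further down the thread).
cat > /home/gpadmin/s3_config.yaml <<'EOF'
executablepath: /usr/local/bin/gpbackup_s3_plugin
options:
  region: us-east-1
  bucket: my-backup-bucket
  folder: greenplum/backups
  backup_max_concurrent_requests: 6
  backup_multipart_chunksize: 100MB
  restore_max_concurrent_requests: 6
  restore_multipart_chunksize: 100MB
  encryption: "off"
EOF
```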

baileygm commented on June 9, 2024

Our table sizes vary from 1MB to 217GB.

shivzone commented on June 9, 2024

Just to clarify, no extra flags were passed during either gpbackup or gprestore (--single-data-file, --no-compression, --include-table, --jobs, etc.)?
Also, can you run gprestore with restore_multipart_chunksize reset to 10MB?
(Note: also make sure you use 10MB or 100MB and not 10M or 100M.)
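
A hypothetical way to apply that re-test, assuming the config file sketched earlier in the thread and a placeholder backup timestamp:

```sh
# Drop restore_multipart_chunksize to 10MB ("10MB", not "10M") in the
# plugin config, then re-run the restore. The timestamp and the config
# path are placeholders.
sed -i 's/^  restore_multipart_chunksize:.*/  restore_multipart_chunksize: 10MB/' /home/gpadmin/s3_config.yaml
gprestore --timestamp 20240601120000 --plugin-config /home/gpadmin/s3_config.yaml
```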

baileygm commented on June 9, 2024

It appears that there were errors in my config, so my tests were all using the default setting for restore_multipart_chunksize.

I have now fixed that and run some tests restoring a 40GB schema.

with S3 Plugin version 1.16.0

chunk size = 5MB, time = 3m45s
chunk size = 10MB, time = 3m17s
chunk size = 100MB, time = 4m10s

with S3 Plugin version 1.17.0

chunk size = 5MB, time = 3m23s
chunk size = 10MB, time = 3m46s
chunk size = 20MB, time = 3m23s

I cannot draw firm conclusions from these brief tests, but I am surprised to see much closer performance between the two plugin versions.

I will try to repeat the tests next week on our full system.

shivzone commented on June 9, 2024

Yes please. In our internal experiments, switching the chunk size based on the workload has always helped. You can start with a 500MB chunk size (best suited when the average table size is large) and lower it to 10-50MB when your median table size is small (~1GB).
We also have a separate effort/feature planned to dynamically adjust the chunk size based on the size of each table being backed up or restored. That will be made available in a future release.
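
Applying that heuristic to the config sketched earlier in the thread, as one possible illustration (pick one of the two depending on the workload):

```sh
# Large average table size: favor large multipart chunks.
sed -i 's/^  restore_multipart_chunksize:.*/  restore_multipart_chunksize: 500MB/' /home/gpadmin/s3_config.yaml

# Small median table size (~1GB or less): favor 10-50MB chunks.
sed -i 's/^  restore_multipart_chunksize:.*/  restore_multipart_chunksize: 10MB/' /home/gpadmin/s3_config.yaml
```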

shivzone commented on June 9, 2024

@baileygm if you aren't seeing these issues anymore with the configuration updates, can I close this issue?

baileygm commented on June 9, 2024

Unfortunately, I am still seeing the same poor performance when using the S3 plugin 1.17.0.
I have been working on other tasks, but I am returning to this issue now and will update this ticket with any further information.

baileygm commented on June 9, 2024

I can now confirm that I have found settings with which v1.17.0 gives me the same performance as v1.16.0. From my tests, the most critical parameter was restore_multipart_chunksize.

If this is set above 100MB, restoration slows to a very low rate; the sweet spot for our system appears to be between 5MB and 50MB.

I think the problem with the validation check was also hindering my testing, so it may be worth creating a new v1.18.0 release that incorporates the fix.

Thanks for your help and advice
