Comments (10)

baileygm commented on June 9, 2024

That's correct, no other extra flags were passed

I'll try to retest with the setting you requested before the end of the week

shivzone commented on June 9, 2024

In our internal experiments we noticed that with table sizes in the range of 10MB-3GB, setting restore_multipart_chunksize to 10MB performs better. Can you share the gpbackup and gprestore flags you used as well?
In the meantime we'll continue evaluating.

baileygm commented on June 9, 2024

These are the settings we are using:

backup_max_concurrent_requests: 6
backup_multipart_chunksize: 100M
restore_max_concurrent_requests: 6
restore_multipart_chunksize: 100M
encryption: off
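
For reference, these options live in the YAML config file that is passed to gpbackup and gprestore via --plugin-config. A minimal sketch of what such a file might look like (the executable path, region, bucket, and folder are placeholders, and AWS credentials are elided):

```sh
# Sketch of a gpbackup-s3-plugin config file; all paths and bucket
# details below are placeholders. Note the MB suffix on the chunk
# sizes (see the clarification further down the thread).
cat > /home/gpadmin/s3_config.yaml <<'EOF'
executablepath: /usr/local/bin/gpbackup_s3_plugin
options:
  region: us-east-1
  bucket: my-backup-bucket
  folder: greenplum/backups
  backup_max_concurrent_requests: 6
  backup_multipart_chunksize: 100MB
  restore_max_concurrent_requests: 6
  restore_multipart_chunksize: 100MB
  encryption: "off"
EOF
```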

baileygm commented on June 9, 2024

Our table sizes vary from 1MB to 217GB.

shivzone commented on June 9, 2024

Just to clarify, no extra flags were passed during either gpbackup or gprestore (--single-data-file, --no-compression, --include-table, --jobs, etc.)?
Also, can you run gprestore with restore_multipart_chunksize reset to 10MB?
(Note: also make sure you use 10MB or 100MB and not 10M or 100M.)
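
A hypothetical way to apply that re-test, assuming the config file sketched earlier in the thread and a placeholder backup timestamp:

```sh
# Drop restore_multipart_chunksize to 10MB ("10MB", not "10M") in the
# plugin config, then re-run the restore. The timestamp and the config
# path are placeholders.
sed -i 's/^  restore_multipart_chunksize:.*/  restore_multipart_chunksize: 10MB/' /home/gpadmin/s3_config.yaml
gprestore --timestamp 20240601120000 --plugin-config /home/gpadmin/s3_config.yaml
```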

baileygm commented on June 9, 2024

It appears that there were errors in my config, so my tests were all using the default setting for restore_multipart_chunksize.

I have now fixed that and run some tests restoring a 40GB schema.

with S3 Plugin version 1.16.0

chunk size = 5MB, time = 3m45s
chunk size = 10MB, time = 3m17s
chunk size = 100MB, time = 4m10s

with S3 Plugin version 1.17.0

chunk size = 5MB, time = 3m23s
chunk size = 10MB, time = 3m46s
chunk size = 20MB, time = 3m23s

I cannot draw firm conclusions from these brief tests, but I am surprised to see much closer performance between the two plugin versions.

I will try to repeat the tests next week on our full system.

shivzone commented on June 9, 2024

Yes please. In our internal experiments, switching the chunk size based on the workload has always helped. You can start with a 500MB chunk size (best suited when the average table size is large) and lower it to 10-50MB when your median table size is small (~1GB).
We also have a separate effort/feature planned to dynamically adjust the chunk size based on the size of each table being backed up or restored. That will be made available in a future release.
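
Applying that heuristic to the config sketched earlier in the thread, as one possible illustration (pick one of the two depending on the workload):

```sh
# Large average table size: favor large multipart chunks.
sed -i 's/^  restore_multipart_chunksize:.*/  restore_multipart_chunksize: 500MB/' /home/gpadmin/s3_config.yaml

# Small median table size (~1GB or less): favor 10-50MB chunks.
sed -i 's/^  restore_multipart_chunksize:.*/  restore_multipart_chunksize: 10MB/' /home/gpadmin/s3_config.yaml
```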

shivzone commented on June 9, 2024

@baileygm if you aren't seeing these issues anymore with the configuration updates, can I close this issue?

baileygm commented on June 9, 2024

Unfortunately, I am still seeing the same poor performance when using the S3 plugin 1.17.0.
I have been working on other tasks, but I am returning to this issue now and will update this ticket with any further information.

baileygm commented on June 9, 2024

I can now confirm that I have found settings with which v1.17.0 gives me the same performance as v1.16.0. From my tests, the most critical parameter was restore_multipart_chunksize.

If this is set above 100MB, restoration slows to a very low rate; the sweet spot for our system appears to be between 5MB and 50MB.

I think the problem with the validation check was also hindering my testing, so it may be worth creating a new v1.18.0 release that incorporates the fix.

Thanks for your help and advice
