Comments (10)
That's correct; no other extra flags were passed. I'll try to retest with the setting you requested before the end of the week.
from gpbackup-s3-plugin.
In our internal experiments we noticed that with table sizes in the range 10MB-3GB, setting restore_multipart_chunksize to 10MB works better. Could you also share the flags you passed to both gpbackup and gprestore? In the meantime we'll continue evaluating.
These are the settings we are using:

```
backup_max_concurrent_requests: 6
backup_multipart_chunksize: 100M
restore_max_concurrent_requests: 6
restore_multipart_chunksize: 100M
encryption: off
```
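For context, these options sit under the `options:` section of the plugin config file that is passed to gpbackup/gprestore via `--plugin-config`. A minimal sketch, assuming the standard gpbackup S3 plugin config layout; the path, region, bucket, and folder values below are placeholders:

```yaml
executablepath: /usr/local/bin/gpbackup_s3_plugin
options:
  region: us-east-1
  bucket: my-backup-bucket
  folder: greenplum/backups
  backup_max_concurrent_requests: 6
  backup_multipart_chunksize: 100MB
  restore_max_concurrent_requests: 6
  restore_multipart_chunksize: 100MB
  encryption: off
```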
Our table sizes vary from 1 MB to 217 GB.
Just to clarify, were no extra flags passed during either gpbackup or gprestore (--single-data-file, --no-compression, --include-table, --jobs, etc.)?
Also, can you rerun gprestore with restore_multipart_chunksize set to 10MB?
(Note: also make sure you use 10MB or 100MB, not 10M or 100M.)
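The 10MB-vs-10M distinction matters because a value the config validation cannot parse may silently fall back to the default. A hypothetical sketch of that kind of check (illustrative only; `parse_chunksize` is not the plugin's actual parser, which is written in Go):

```python
import re

UNITS = {"MB": 1024 ** 2, "GB": 1024 ** 3}
DEFAULT_CHUNKSIZE = 500 * 1024 ** 2  # assumed default for illustration

def parse_chunksize(value, default=DEFAULT_CHUNKSIZE):
    """Parse a size string like '10MB' into bytes.

    Strings such as '10M' (missing the trailing 'B') do not match the
    pattern and silently fall back to the default, which mirrors the
    behaviour described in the comments above.
    """
    match = re.fullmatch(r"(\d+)(MB|GB)", value.strip())
    if not match:
        return default
    number, unit = match.groups()
    return int(number) * UNITS[unit]

print(parse_chunksize("10MB"))  # 10485760 (10 MB)
print(parse_chunksize("10M"))   # 524288000 (fell back to the default)
```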
It appears that there were errors in my config, so my tests were all using the default setting for restore_multipart_chunksize. I have now fixed that and run some tests restoring a 40 GB schema.

With S3 plugin version 1.16.0:
- chunk size = 5MB: time = 3m45s
- chunk size = 10MB: time = 3m17s
- chunk size = 100MB: time = 4m10s

With S3 plugin version 1.17.0:
- chunk size = 5MB: time = 3m23s
- chunk size = 10MB: time = 3m46s
- chunk size = 20MB: time = 3m23s

I cannot draw any conclusions from these brief tests, but I am surprised to see much closer performance between the two plugin versions. I will try to repeat the tests next week on our full system.
Yes please. In our internal experiments across workloads, switching the chunk size has always helped. You can start with a 500MB chunk size (best suited when the average table size is large) and lower it to 10-50MB when your median table size is small (~1GB).
We also have a separate effort/feature planned to dynamically adjust the chunk size based on the size of each table being backed up or restored. That will be made available in a future release.
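A back-of-the-envelope sketch of the tradeoff behind this advice: smaller chunks mean many more multipart parts (and S3 requests) per large table, while a single small table always needs just one part regardless of chunk size. The function name and sizes here are illustrative, not from the plugin:

```python
import math

MB = 1024 ** 2
GB = 1024 ** 3

def multipart_parts(table_size, chunk_size):
    """Number of multipart-upload parts needed for one table."""
    return max(1, math.ceil(table_size / chunk_size))

# A 217 GB table split into 10MB chunks needs far more parts
# (and therefore far more S3 requests) than with 500MB chunks.
print(multipart_parts(217 * GB, 10 * MB))   # 22221 parts
print(multipart_parts(217 * GB, 500 * MB))  # 445 parts
# A 1 MB table needs a single part whatever the chunk size is.
print(multipart_parts(1 * MB, 500 * MB))    # 1 part
```

Note that S3 itself caps a multipart upload at 10,000 parts, so a very small chunk size cannot be applied verbatim to a very large table; the AWS SDK's upload manager grows the part size as needed to stay under that cap.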
@baileygm, if you are no longer seeing these issues after the configuration updates, can I close this issue?
I am unfortunately still seeing the same poor performance when using S3 plugin 1.17.0.
I have been working on other tasks, but I am returning to this issue now and will update this ticket with any further information.
I can confirm now that I have established settings that give me the same performance from v1.17.0 as I was getting from v1.16.0. From my tests, the most critical parameter seems to be restore_multipart_chunksize. If it is set above 100MB, restoration slows to a very low rate; the sweet spot for our system appears to be between 5MB and 50MB.
I think the problem with the validation check was not helping my testing, so it may be worth creating a new v1.18.0 release incorporating the fix.
Thanks for your help and advice.