Comments (5)
To begin initial experimentation, I'm setting binlog_format='ROW' on my-test-machine and am waiting for the binlog to fill up. It should take around 1:30 hours. I will then restore binlog_format='STATEMENT' on this server.
I will take the ROW binlog offline, but on the same machine, and will measure how long it takes to read and parse it from within a Go app that:
- invokes mysqlbinlog --verbose --base64-output=DECODE-ROWS
- parses all statements to extract schema.table
- parses queries of a given table
We must have the parsing time << 1:30 hours. Like, really shorter than that. If it takes more than 50% of that time (i.e. more than 45 minutes) I consider this to be a failure. I would like to see it run in 10% of the time, at most.
from gh-ost.
mysql-bin.012323 is on shlomi-noach@my-test-machine:~/tmp. It is a 1G file generated between Mar 21 03:47 and Mar 21 04:33, i.e. over 46 minutes of production traffic.
Our aim is to parse it in much less than 23 minutes.
from gh-ost.
Initial parsing experiment (#2), running on my-test-machine on an offline binlog file while the server is running and replicating (so under normal load), shows:
shlomi-noach@my-test-machine:~$ time /tmp/gh-osc --debug --mysql-basedir=/usr --mysql-datadir=/home/shlomi-noach/tmp/ --binlog-file=mysql-bin.012323 --internal-experiment=true 2> /dev/null
real 2m40.214s
user 3m28.010s
sys 0m23.715s
This is not yet the full-blown parsing (right now it only parses start pos, end_log_pos, statement type, schema & table).
On the other hand, we subject every line of output to three regexes, since there is no automaton in place (not yet?), so we just blindly and aggressively parse everything. This means we're generating a lot of load.
2m40s is about 6% of the 46 minutes it took to generate the file, and this is a good result!
from gh-ost.
BTW output looks like this:
2016-03-23 04:46:32 INFO starting gh-osc
2016-03-23 04:46:32 DEBUG starting experiment
2016-03-23 04:46:32 DEBUG Next chunk range 4 - 33554436
2016-03-23 04:46:32 DEBUG execCmd: /usr/bin/mysqlbinlog --verbose --base64-output=DECODE-ROWS --start-position=4 --stop-position=33554436 /home/shlomi-noach/tmp/mysql-bin.012323
2016-03-23 04:46:32 DEBUG [/tmp/gh-osc-process-cmd-130566840]
2016-03-23 04:46:33 DEBUG entry: {LogPos:322 EndLogPos:1070 StatementType:UPDATE DatabaseName:xxxxxxxx TableName:xxxxxxxx}
2016-03-23 04:46:33 DEBUG entry: {LogPos:1264 EndLogPos:1362 StatementType:UPDATE DatabaseName:xxxxxxxx TableName:xxxxxxxx}
2016-03-23 04:46:33 DEBUG entry: {LogPos:1583 EndLogPos:1775 StatementType:UPDATE DatabaseName:xxxxxxxx TableName:xxxxxxxx}
2016-03-23 04:46:33 DEBUG entry: {LogPos:2008 EndLogPos:2755 StatementType:UPDATE DatabaseName:xxxxxxxx TableName:xxxxxxxx}
2016-03-23 04:46:33 DEBUG entry: {LogPos:2949 EndLogPos:3047 StatementType:UPDATE DatabaseName:xxxxxxxx TableName:xxxxxxxx}
2016-03-23 04:46:33 DEBUG entry: {LogPos:3258 EndLogPos:3840 StatementType:UPDATE DatabaseName:xxxxxxxx TableName:xxxxxxxx}
2016-03-23 04:46:33 DEBUG entry: {LogPos:3925 EndLogPos:4095 StatementType:UPDATE DatabaseName:xxxxxxxx TableName:xxxxxxxx}
2016-03-23 04:46:33 DEBUG entry: {LogPos:4316 EndLogPos:4508 StatementType:UPDATE DatabaseName:xxxxxxxx TableName:xxxxxxxx}
2016-03-23 04:46:33 DEBUG entry: {LogPos:4741 EndLogPos:4909 StatementType:UPDATE DatabaseName:xxxxxxxx TableName:xxxxxxxx}
2016-03-23 04:46:33 DEBUG entry: {LogPos:5117 EndLogPos:5182 StatementType:INSERT DatabaseName:xxxxxxxx TableName:xxxxxxxx}
2016-03-23 04:46:33 DEBUG entry: {LogPos:5415 EndLogPos:5583 StatementType:UPDATE DatabaseName:xxxxxxxx TableName:xxxxxxxx}
2016-03-23 04:46:33 DEBUG entry: {LogPos:5777 EndLogPos:5875 StatementType:UPDATE DatabaseName:xxxxxxxx TableName:xxxxxxxx}
2016-03-23 04:46:33 DEBUG entry: {LogPos:6119 EndLogPos:6515 StatementType:UPDATE DatabaseName:xxxxxxxx TableName:xxxxxxxx}
2016-03-23 04:46:33 DEBUG entry: {LogPos:6723 EndLogPos:6841 StatementType:UPDATE DatabaseName:xxxxxxxx TableName:xxxxxxxx}
2016-03-23 04:46:33 DEBUG entry: {LogPos:7055 EndLogPos:7135 StatementType:INSERT DatabaseName:xxxxxxxx TableName:xxxxxxxx}
2016-03-23 04:46:33 DEBUG entry: {LogPos:7328 EndLogPos:7504 StatementType:INSERT DatabaseName:xxxxxxxx TableName:xxxxxxxx}
2016-03-23 04:46:33 DEBUG entry: {LogPos:7725 EndLogPos:7967 StatementType:UPDATE DatabaseName:xxxxxxxx TableName:xxxxxxxx}
2016-03-23 04:46:33 DEBUG entry: {LogPos:8200 EndLogPos:9056 StatementType:UPDATE DatabaseName:xxxxxxxx TableName:xxxxxxxx}
2016-03-23 04:46:33 DEBUG entry: {LogPos:9289 EndLogPos:10145 StatementType:UPDATE DatabaseName:xxxxxxxx TableName:xxxxxxxx}
2016-03-23 04:46:33 DEBUG entry: {LogPos:10378 EndLogPos:10538 StatementType:UPDATE DatabaseName:xxxxxxxx TableName:xxxxxxxx}
2016-03-23 04:46:33 DEBUG entry: {LogPos:10771 EndLogPos:10931 StatementType:UPDATE DatabaseName:xxxxxxxx TableName:xxxxxxxx}
2016-03-23 04:46:33 DEBUG entry: {LogPos:11139 EndLogPos:11257 StatementType:UPDATE DatabaseName:xxxxxxxx TableName:xxxxxxxx}
2016-03-23 04:46:33 DEBUG entry: {LogPos:11468 EndLogPos:12050 StatementType:UPDATE DatabaseName:xxxxxxxx TableName:xxxxxxxx}
2016-03-23 04:46:33 DEBUG entry: {LogPos:12135 EndLogPos:12305 StatementType:UPDATE DatabaseName:xxxxxxxx TableName:xxxxxxxx}
2016-03-23 04:46:33 DEBUG entry: {LogPos:12511 EndLogPos:13292 StatementType:UPDATE DatabaseName:xxxxxxxx TableName:xxxxxxxx}
2016-03-23 04:46:33 DEBUG entry: {LogPos:13376 EndLogPos:13466 StatementType:INSERT DatabaseName:xxxxxxxx TableName:xxxxxxxx}
2016-03-23 04:46:33 DEBUG entry: {LogPos:13710 EndLogPos:14116 StatementType:UPDATE DatabaseName:xxxxxxxx TableName:xxxxxxxx}
2016-03-23 04:46:33 DEBUG entry: {LogPos:14324 EndLogPos:14442 StatementType:UPDATE DatabaseName:xxxxxxxx TableName:xxxxxxxx}
2016-03-23 04:46:33 DEBUG entry: {LogPos:14656 EndLogPos:14760 StatementType:UPDATE DatabaseName:xxxxxxxx TableName:xxxxxxxx}
2016-03-23 04:46:33 DEBUG entry: {LogPos:14993 EndLogPos:15903 StatementType:UPDATE DatabaseName:xxxxxxxx TableName:xxxxxxxx}
2016-03-23 04:46:33 DEBUG entry: {LogPos:16096 EndLogPos:16258 StatementType:INSERT DatabaseName:xxxxxxxx TableName:xxxxxxxx}
2016-03-23 04:46:33 DEBUG entry: {LogPos:16479 EndLogPos:16671 StatementType:UPDATE DatabaseName:xxxxxxxx TableName:xxxxxxxx}
2016-03-23 04:46:33 DEBUG entry: {LogPos:16885 EndLogPos:16989 StatementType:UPDATE DatabaseName:xxxxxxxx TableName:xxxxxxxx}
2016-03-23 04:46:33 DEBUG entry: {LogPos:17187 EndLogPos:17312 StatementType:INSERT DatabaseName:xxxxxxxx TableName:xxxxxxxx}
2016-03-23 04:46:33 DEBUG entry: {LogPos:17396 EndLogPos:17489 StatementType:INSERT DatabaseName:xxxxxxxx TableName:xxxxxxxx}
2016-03-23 04:46:33 DEBUG entry: {LogPos:17573 EndLogPos:17667 StatementType:INSERT DatabaseName:xxxxxxxx TableName:xxxxxxxx}
2016-03-23 04:46:33 DEBUG entry: {LogPos:17757 EndLogPos:18377 StatementType:UPDATE DatabaseName:xxxxxxxx TableName:xxxxxxxx}
2016-03-23 04:46:33 DEBUG entry: {LogPos:18482 EndLogPos:18946 StatementType:UPDATE DatabaseName:xxxxxxxx TableName:xxxxxxxx}
2016-03-23 04:46:33 DEBUG entry: {LogPos:19179 EndLogPos:19872 StatementType:UPDATE DatabaseName:xxxxxxxx TableName:xxxxxxxx}
2016-03-23 04:46:33 DEBUG entry: {LogPos:20065 EndLogPos:20236 StatementType:INSERT DatabaseName:xxxxxxxx TableName:xxxxxxxx}
2016-03-23 04:46:33 DEBUG entry: {LogPos:20480 EndLogPos:20948 StatementType:UPDATE DatabaseName:xxxxxxxx TableName:xxxxxxxx}
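The "Next chunk range 4 - 33554436" line in this output suggests the file is read in chunks of 32 MiB (33554436 = 4 + 32 * 1024 * 1024), starting at position 4, just past the binlog file's 4-byte magic header. A minimal sketch of that arithmetic follows; the names chunkRange and nextChunkRange are hypothetical, and it assumes that each subsequent chunk starts at the last end_log_pos already parsed, since mysqlbinlog's --start-position needs to point at the start of an event.

```go
package main

import "fmt"

const (
	firstEventPos int64 = 4                // events begin after the 4-byte magic header
	chunkSize     int64 = 32 * 1024 * 1024 // inferred from the logged range
)

// chunkRange is a [Start, Stop] pair to pass as --start-position / --stop-position.
type chunkRange struct {
	Start, Stop int64
}

// nextChunkRange returns the range to request next. start should be either
// firstEventPos or the EndLogPos of the last entry parsed from the previous chunk.
func nextChunkRange(start, size, fileSize int64) chunkRange {
	stop := start + size
	if stop > fileSize {
		stop = fileSize
	}
	return chunkRange{Start: start, Stop: stop}
}

func main() {
	fileSize := int64(1 << 30) // assumption: the "1G" file is exactly 1 GiB
	r := nextChunkRange(firstEventPos, chunkSize, fileSize)
	fmt.Printf("Next chunk range %d - %d\n", r.Start, r.Stop)
}
```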
from gh-ost.
Closing this for now, having nice success with #5
from gh-ost.