
Comments (5)

shlomi-noach commented on August 19, 2024

To begin initial experimentation, I'm setting binlog_format='ROW' on my-test-machine and waiting for a binlog file to fill up. It should take around 1:30 hours. I will then restore binlog_format='STATEMENT' on this server.

I will take the ROW binlog offline, but keep it on the same machine, and measure how long it takes to read and parse it from within a Go app that:

  • invokes mysqlbinlog --verbose --base64-output=DECODE-ROWS
  • parses all statements to extract schema.table
  • parses queries of a given table
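
A minimal sketch of the parsing step listed above, assuming the usual shape of `mysqlbinlog --verbose` output (`# at N` lines, `end_log_pos N` in the event header, and `### UPDATE` style pseudo-SQL lines naming the schema and table). The regexes, the `BinlogEntry` type and the function names are illustrative assumptions, not the actual code of the app:

```go
package main

import (
	"bufio"
	"io"
	"regexp"
	"strconv"
	"strings"
)

// BinlogEntry is a hypothetical record of what we extract per row statement.
type BinlogEntry struct {
	LogPos        uint64
	EndLogPos     uint64
	StatementType string
	DatabaseName  string
	TableName     string
}

// Assumed line shapes in the decoded output:
//   # at 322
//   #160321  3:47:01 server id 1  end_log_pos 1070 ...
//   ### UPDATE `schema`.`table`
var (
	startPosRegexp  = regexp.MustCompile(`^# at ([0-9]+)$`)
	endLogPosRegexp = regexp.MustCompile(`^#[0-9]{6} .*end_log_pos ([0-9]+)`)
	statementRegexp = regexp.MustCompile("^### (INSERT|UPDATE|DELETE)(?: INTO| FROM)? `([^`]+)`[.]`([^`]+)`$")
)

// ParseEntries scans decoded mysqlbinlog output line by line and emits one
// entry per "### <statement>" line, tagged with the most recent positions.
// An event that touches several rows therefore yields several entries.
func ParseEntries(r io.Reader) ([]BinlogEntry, error) {
	var entries []BinlogEntry
	var current BinlogEntry
	scanner := bufio.NewScanner(r)
	// Decoded row images can be long; allow lines up to 16 MiB.
	scanner.Buffer(make([]byte, 0, 64*1024), 16*1024*1024)
	for scanner.Scan() {
		line := scanner.Text()
		if m := startPosRegexp.FindStringSubmatch(line); m != nil {
			pos, err := strconv.ParseUint(m[1], 10, 64)
			if err != nil {
				return nil, err
			}
			// A new event starts: reset everything but the start position.
			current = BinlogEntry{LogPos: pos}
		} else if m := endLogPosRegexp.FindStringSubmatch(line); m != nil {
			pos, err := strconv.ParseUint(m[1], 10, 64)
			if err != nil {
				return nil, err
			}
			current.EndLogPos = pos
		} else if m := statementRegexp.FindStringSubmatch(line); m != nil {
			current.StatementType = m[1]
			current.DatabaseName = m[2]
			current.TableName = m[3]
			entries = append(entries, current)
		}
	}
	return entries, scanner.Err()
}

// ParseEntriesFromString is a convenience wrapper for already captured output.
func ParseEntriesFromString(output string) ([]BinlogEntry, error) {
	return ParseEntries(strings.NewReader(output))
}
```

In a real run, `ParseEntries` would read from the stdout pipe of the `mysqlbinlog` process; filtering for a given table is then a comparison on `DatabaseName` and `TableName`.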

We must have the parsing time << 1:30 hours. Like, really shorter than that. If it is more than 50% of the time (i.e. more than 45 minutes) I consider this to be a failure. I would like to see it run in 10% of the time, at most.

from gh-ost.

shlomi-noach commented on August 19, 2024

mysql-bin.012323 is on shlomi-noach@my-test-machine:~/tmp. It is a 1G file generated between Mar 21 03:47 and Mar 21 04:33, i.e. over 46 minutes of production traffic.

Our aim is to parse it in much less than 23 minutes.

shlomi-noach commented on August 19, 2024

Initial parsing experiment (#2), running on my-test-machine against an offline binlog file while the server is running and replicating (so under normal load), shows:

shlomi-noach@my-test-machine:~$ time /tmp/gh-osc --debug --mysql-basedir=/usr --mysql-datadir=/home/shlomi-noach/tmp/ --binlog-file=mysql-bin.012323 --internal-experiment=true 2> /dev/null

real    2m40.214s
user    3m28.010s
sys     0m23.715s

This is not yet the full-blown parsing: right now we only parse the start pos, end_log_pos, statement type, and schema & table.
On the other hand, we subject every line of output to three regexes, since there is no automaton in place (not yet?), so we just blindly and aggressively parse everything. This means we're generating a lot of load.

2m40s is about 6% of the 46 minutes it took to generate the file, and this is a good result!

shlomi-noach commented on August 19, 2024

BTW output looks like this:

2016-03-23 04:46:32 INFO starting gh-osc
2016-03-23 04:46:32 DEBUG starting experiment
2016-03-23 04:46:32 DEBUG Next chunk range 4 - 33554436
2016-03-23 04:46:32 DEBUG execCmd: /usr/bin/mysqlbinlog --verbose --base64-output=DECODE-ROWS --start-position=4 --stop-position=33554436 /home/shlomi-noach/tmp/mysql-bin.012323
2016-03-23 04:46:32 DEBUG [/tmp/gh-osc-process-cmd-130566840]
2016-03-23 04:46:33 DEBUG entry: {LogPos:322 EndLogPos:1070 StatementType:UPDATE DatabaseName:xxxxxxxx TableName:xxxxxxxx}
2016-03-23 04:46:33 DEBUG entry: {LogPos:1264 EndLogPos:1362 StatementType:UPDATE DatabaseName:xxxxxxxx TableName:xxxxxxxx}
2016-03-23 04:46:33 DEBUG entry: {LogPos:1583 EndLogPos:1775 StatementType:UPDATE DatabaseName:xxxxxxxx TableName:xxxxxxxx}
2016-03-23 04:46:33 DEBUG entry: {LogPos:2008 EndLogPos:2755 StatementType:UPDATE DatabaseName:xxxxxxxx TableName:xxxxxxxx}
2016-03-23 04:46:33 DEBUG entry: {LogPos:2949 EndLogPos:3047 StatementType:UPDATE DatabaseName:xxxxxxxx TableName:xxxxxxxx}
2016-03-23 04:46:33 DEBUG entry: {LogPos:3258 EndLogPos:3840 StatementType:UPDATE DatabaseName:xxxxxxxx TableName:xxxxxxxx}
2016-03-23 04:46:33 DEBUG entry: {LogPos:3925 EndLogPos:4095 StatementType:UPDATE DatabaseName:xxxxxxxx TableName:xxxxxxxx}
2016-03-23 04:46:33 DEBUG entry: {LogPos:4316 EndLogPos:4508 StatementType:UPDATE DatabaseName:xxxxxxxx TableName:xxxxxxxx}
2016-03-23 04:46:33 DEBUG entry: {LogPos:4741 EndLogPos:4909 StatementType:UPDATE DatabaseName:xxxxxxxx TableName:xxxxxxxx}
2016-03-23 04:46:33 DEBUG entry: {LogPos:5117 EndLogPos:5182 StatementType:INSERT DatabaseName:xxxxxxxx TableName:xxxxxxxx}
2016-03-23 04:46:33 DEBUG entry: {LogPos:5415 EndLogPos:5583 StatementType:UPDATE DatabaseName:xxxxxxxx TableName:xxxxxxxx}
2016-03-23 04:46:33 DEBUG entry: {LogPos:5777 EndLogPos:5875 StatementType:UPDATE DatabaseName:xxxxxxxx TableName:xxxxxxxx}
2016-03-23 04:46:33 DEBUG entry: {LogPos:6119 EndLogPos:6515 StatementType:UPDATE DatabaseName:xxxxxxxx TableName:xxxxxxxx}
2016-03-23 04:46:33 DEBUG entry: {LogPos:6723 EndLogPos:6841 StatementType:UPDATE DatabaseName:xxxxxxxx TableName:xxxxxxxx}
2016-03-23 04:46:33 DEBUG entry: {LogPos:7055 EndLogPos:7135 StatementType:INSERT DatabaseName:xxxxxxxx TableName:xxxxxxxx}
2016-03-23 04:46:33 DEBUG entry: {LogPos:7328 EndLogPos:7504 StatementType:INSERT DatabaseName:xxxxxxxx TableName:xxxxxxxx}
2016-03-23 04:46:33 DEBUG entry: {LogPos:7725 EndLogPos:7967 StatementType:UPDATE DatabaseName:xxxxxxxx TableName:xxxxxxxx}
2016-03-23 04:46:33 DEBUG entry: {LogPos:8200 EndLogPos:9056 StatementType:UPDATE DatabaseName:xxxxxxxx TableName:xxxxxxxx}
2016-03-23 04:46:33 DEBUG entry: {LogPos:9289 EndLogPos:10145 StatementType:UPDATE DatabaseName:xxxxxxxx TableName:xxxxxxxx}
2016-03-23 04:46:33 DEBUG entry: {LogPos:10378 EndLogPos:10538 StatementType:UPDATE DatabaseName:xxxxxxxx TableName:xxxxxxxx}
2016-03-23 04:46:33 DEBUG entry: {LogPos:10771 EndLogPos:10931 StatementType:UPDATE DatabaseName:xxxxxxxx TableName:xxxxxxxx}
2016-03-23 04:46:33 DEBUG entry: {LogPos:11139 EndLogPos:11257 StatementType:UPDATE DatabaseName:xxxxxxxx TableName:xxxxxxxx}
2016-03-23 04:46:33 DEBUG entry: {LogPos:11468 EndLogPos:12050 StatementType:UPDATE DatabaseName:xxxxxxxx TableName:xxxxxxxx}
2016-03-23 04:46:33 DEBUG entry: {LogPos:12135 EndLogPos:12305 StatementType:UPDATE DatabaseName:xxxxxxxx TableName:xxxxxxxx}
2016-03-23 04:46:33 DEBUG entry: {LogPos:12511 EndLogPos:13292 StatementType:UPDATE DatabaseName:xxxxxxxx TableName:xxxxxxxx}
2016-03-23 04:46:33 DEBUG entry: {LogPos:13376 EndLogPos:13466 StatementType:INSERT DatabaseName:xxxxxxxx TableName:xxxxxxxx}
2016-03-23 04:46:33 DEBUG entry: {LogPos:13710 EndLogPos:14116 StatementType:UPDATE DatabaseName:xxxxxxxx TableName:xxxxxxxx}
2016-03-23 04:46:33 DEBUG entry: {LogPos:14324 EndLogPos:14442 StatementType:UPDATE DatabaseName:xxxxxxxx TableName:xxxxxxxx}
2016-03-23 04:46:33 DEBUG entry: {LogPos:14656 EndLogPos:14760 StatementType:UPDATE DatabaseName:xxxxxxxx TableName:xxxxxxxx}
2016-03-23 04:46:33 DEBUG entry: {LogPos:14993 EndLogPos:15903 StatementType:UPDATE DatabaseName:xxxxxxxx TableName:xxxxxxxx}
2016-03-23 04:46:33 DEBUG entry: {LogPos:16096 EndLogPos:16258 StatementType:INSERT DatabaseName:xxxxxxxx TableName:xxxxxxxx}
2016-03-23 04:46:33 DEBUG entry: {LogPos:16479 EndLogPos:16671 StatementType:UPDATE DatabaseName:xxxxxxxx TableName:xxxxxxxx}
2016-03-23 04:46:33 DEBUG entry: {LogPos:16885 EndLogPos:16989 StatementType:UPDATE DatabaseName:xxxxxxxx TableName:xxxxxxxx}
2016-03-23 04:46:33 DEBUG entry: {LogPos:17187 EndLogPos:17312 StatementType:INSERT DatabaseName:xxxxxxxx TableName:xxxxxxxx}
2016-03-23 04:46:33 DEBUG entry: {LogPos:17396 EndLogPos:17489 StatementType:INSERT DatabaseName:xxxxxxxx TableName:xxxxxxxx}
2016-03-23 04:46:33 DEBUG entry: {LogPos:17573 EndLogPos:17667 StatementType:INSERT DatabaseName:xxxxxxxx TableName:xxxxxxxx}
2016-03-23 04:46:33 DEBUG entry: {LogPos:17757 EndLogPos:18377 StatementType:UPDATE DatabaseName:xxxxxxxx TableName:xxxxxxxx}
2016-03-23 04:46:33 DEBUG entry: {LogPos:18482 EndLogPos:18946 StatementType:UPDATE DatabaseName:xxxxxxxx TableName:xxxxxxxx}
2016-03-23 04:46:33 DEBUG entry: {LogPos:19179 EndLogPos:19872 StatementType:UPDATE DatabaseName:xxxxxxxx TableName:xxxxxxxx}
2016-03-23 04:46:33 DEBUG entry: {LogPos:20065 EndLogPos:20236 StatementType:INSERT DatabaseName:xxxxxxxx TableName:xxxxxxxx}
2016-03-23 04:46:33 DEBUG entry: {LogPos:20480 EndLogPos:20948 StatementType:UPDATE DatabaseName:xxxxxxxx TableName:xxxxxxxx}
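
The "Next chunk range 4 - 33554436" line suggests the file is read in 32 MiB windows (33554436 - 4 = 33554432), starting right after the 4-byte binlog magic header. A minimal sketch of how such a range and the matching `mysqlbinlog` arguments could be built; the flags are the ones visible in the `execCmd` line, while the constant and function names are assumptions:

```go
package main

import "strconv"

const (
	// firstEventPos skips the 4-byte magic header at the start of a binlog file.
	firstEventPos uint64 = 4
	// chunkSize is inferred from the range printed in the debug log (32 MiB).
	chunkSize uint64 = 32 * 1024 * 1024
)

// chunkRange returns the start and stop positions of the chunk beginning at start.
func chunkRange(start uint64) (uint64, uint64) {
	return start, start + chunkSize
}

// mysqlbinlogArgs builds the argument list for one chunk, mirroring the
// flags shown in the execCmd debug line.
func mysqlbinlogArgs(binlogPath string, start, stop uint64) []string {
	return []string{
		"--verbose",
		"--base64-output=DECODE-ROWS",
		"--start-position=" + strconv.FormatUint(start, 10),
		"--stop-position=" + strconv.FormatUint(stop, 10),
		binlogPath,
	}
}
```

The slice can be handed to `exec.Command` together with the path to `mysqlbinlog`. How the start of the following chunk is chosen is not shown in the log; since `--start-position` presumably has to point at an event boundary, one plausible choice is the last `EndLogPos` seen in the previous chunk.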

shlomi-noach commented on August 19, 2024

Closing this for now, having had nice success with #5.
