Parse Redis dump.rdb files, Analyze Memory, and Export Data to JSON

License: MIT License

Rdbtools is a parser for Redis' dump.rdb files. The parser generates events similar to an XML SAX parser, and is very memory efficient.

In addition, rdbtools provides utilities to :

Generate a Memory Report of your data across all databases and keys
Convert dump files to JSON
Compare two dump files using standard diff tools

Rdbtools is written in Python, though there are similar projects in other languages. See FAQs for more information.

See https://rdbtools.com for a GUI to administer Redis, commercial support, and other enterprise features.

Installing rdbtools

Pre-Requisites :

python-lzf is optional but highly recommended to speed up parsing.
redis-py is optional and only needed to run test cases.

To install from PyPI (recommended) :

pip install rdbtools python-lzf

To install from source :

git clone https://github.com/sripathikrishnan/redis-rdb-tools
cd redis-rdb-tools
sudo python setup.py install

Command line usage examples

Every run of the rdb tool requires a command that indicates what should be done with the parsed RDB data. Valid commands are: json, diff, justkeys, justkeyvals, memory and protocol.

JSON from a two database dump:

> rdb --command json /var/redis/6379/dump.rdb

[{
"user003":{"fname":"Ron","sname":"Bumquist"},
"lizards":["Bush anole","Jackson's chameleon","Komodo dragon","Ground agama","Bearded dragon"],
"user001":{"fname":"Raoul","sname":"Duke"},
"user002":{"fname":"Gonzo","sname":"Dr"},
"user_list":["user003","user002","user001"]},{
"baloon":{"helium":"birthdays","medical":"angioplasty","weather":"meteorology"},
"armadillo":["chacoan naked-tailed","giant","Andean hairy","nine-banded","pink fairy"],
"aroma":{"pungent":"vinegar","putrid":"rotten eggs","floral":"roses"}}]

Filter parsed output

Only process keys that match the regex, and only print key and values:

> rdb --command justkeyvals --key "user.*" /var/redis/6379/dump.rdb

user003 fname Ron,sname Bumquist,
user001 fname Raoul,sname Duke,
user002 fname Gonzo,sname Dr,
user_list user003,user002,user001
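The --key option takes a regular expression. The following is a minimal sketch of that kind of filtering in plain Python, using the keys from database 0 of the example dump. It assumes the pattern is matched from the start of each key (re.match); rdb's internal matching may differ.

```python
import re

# Keys from database 0 of the example dump shown earlier.
keys = ["user003", "lizards", "user001", "user002", "user_list"]

# Assumption: the pattern is anchored at the start of the key (re.match).
pattern = re.compile("user.*")
selected = [key for key in keys if pattern.match(key)]

print(selected)
```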

Only process hashes starting with "a", in database 2:

> rdb -c json --db 2 --type hash --key "a.*" /var/redis/6379/dump.rdb

[{},{
"aroma":{"pungent":"vinegar","putrid":"rotten eggs","floral":"roses"}}]

Converting dump files to JSON

The json command output is UTF-8 encoded JSON. By default, the callback tries to parse RDB data as UTF-8, escaping non 'ASCII printable' characters with the \u notation and bytes that are not valid UTF-8 with \x. Attempting to decode RDB data can corrupt binary data; this can be avoided by using the --escape raw option. Another option is to use -e base64 for Base64 encoding of binary data.
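To see why the raw and base64 options preserve binary data, here is a small standalone sketch. It does not call rdbtools; it only mimics the two ideas with the standard library, using the five bytes behind the bin_data value in this section's example dump. Treating "raw" as a one-code-point-per-byte mapping (latin-1) is an assumption based on the example output, not a statement about the implementation.

```python
import base64
import json

# The five bytes behind "bin_data" in this section's example dump.
bin_data = b"\xfe\x00\xc3\xa2\xf2"

# Raw-style escaping: one code point per byte, which is what latin-1 does.
raw_text = bin_data.decode("latin-1")
raw_json = json.dumps(raw_text)

# Base64: ASCII-only text that decodes back to exactly the same bytes.
b64_text = base64.b64encode(bin_data).decode("ascii")

print(raw_json)
print(b64_text)
```

Both forms can be turned back into the original bytes, which is not guaranteed once the data has been decoded as UTF-8.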

Parse the dump file and print the JSON on standard output:

> rdb -c json /var/redis/6379/dump.rdb

[{
"Citat":["B\u00e4ttre sent \u00e4n aldrig","Bra karl reder sig sj\u00e4lv","Man ska inte k\u00f6pa grisen i s\u00e4cken"],
"bin_data":"\\xFE\u0000\u00e2\\xF2"}]

Parse the dump file to raw bytes and print the JSON on standard output:

> rdb -c json /var/redis/6379/dump.rdb --escape raw

[{
"Citat":["B\u00c3\u00a4ttre sent \u00c3\u00a4n aldrig","Bra karl reder sig sj\u00c3\u00a4lv","Man ska inte k\u00c3\u00b6pa grisen i s\u00c3\u00a4cken"],
"bin_data":"\u00fe\u0000\u00c3\u00a2\u00f2"}]

Generate Memory Report

Running with -c memory generates a CSV report with the approximate memory used by each key. --bytes C and --largest N can be used to limit output to keys larger than C bytes, or to the N largest keys.

> rdb -c memory /var/redis/6379/dump.rdb --bytes 128 -f memory.csv
> cat memory.csv

database,type,key,size_in_bytes,encoding,num_elements,len_largest_element
0,list,lizards,241,quicklist,5,19
0,list,user_list,190,quicklist,3,7
2,hash,baloon,138,ziplist,3,11
2,list,armadillo,231,quicklist,5,20
2,hash,aroma,129,ziplist,3,11

The generated CSV has the following columns: database number, data type, key, memory used in bytes, RDB encoding type, number of elements, and length of the largest element. Memory usage includes the key, the value and any other overheads.

Note that the memory usage is approximate. In general, the actual memory used will be slightly higher than what is reported.

You can filter the report on keys or database number or data type.

The memory report should help you detect memory leaks caused by your application logic. It will also help you optimize Redis memory usage.
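As one way to work with the report, the sketch below totals size_in_bytes per data type using only the standard library. The rows are copied from the sample report above; in practice you would open memory.csv instead of the in-memory string. The helper name bytes_by_type is made up for this example.

```python
import csv
import io
from collections import defaultdict

# Sample rows copied from the memory report above.
REPORT = """database,type,key,size_in_bytes,encoding,num_elements,len_largest_element
0,list,lizards,241,quicklist,5,19
0,list,user_list,190,quicklist,3,7
2,hash,baloon,138,ziplist,3,11
2,list,armadillo,231,quicklist,5,20
2,hash,aroma,129,ziplist,3,11
"""

def bytes_by_type(csv_file):
    """Sum the size_in_bytes column for each value of the type column."""
    totals = defaultdict(int)
    for row in csv.DictReader(csv_file):
        totals[row["type"]] += int(row["size_in_bytes"])
    return dict(totals)

totals = bytes_by_type(io.StringIO(REPORT))
print(totals)
```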

Find Memory used by a Single Key

Sometimes you just want to find the memory used by a particular key, and running the entire memory report on the dump file is time consuming.

In such cases, you can use the redis-memory-for-key command:

> redis-memory-for-key person:1

> redis-memory-for-key -s localhost -p 6379 -a mypassword person:1

Key 			person:1
Bytes				111
Type				hash
Encoding			ziplist
Number of Elements		2
Length of Largest Element	8

NOTE :

This command was added in redis-rdb-tools version 0.1.3
This command depends on the redis-py package

Comparing RDB files

First, use the --command diff option, and pipe the output to the standard sort utility

> rdb --command diff /var/redis/6379/dump1.rdb | sort > dump1.txt
> rdb --command diff /var/redis/6379/dump2.rdb | sort > dump2.txt

Then, run your favourite diff program

> kdiff3 dump1.txt dump2.txt

To limit the size of the files, you can filter on keys using the --key option
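If no graphical diff program is at hand, the same comparison can be sketched with Python's difflib. The input lines below are made-up placeholders, not real `rdb --command diff` output, and diff_sorted_dumps is a hypothetical helper name.

```python
import difflib

def diff_sorted_dumps(old_lines, new_lines):
    """Return unified-diff lines between two sorted dump listings."""
    return list(difflib.unified_diff(
        sorted(old_lines), sorted(new_lines),
        fromfile="dump1.txt", tofile="dump2.txt", lineterm=""))

# Placeholder lines, not real `rdb --command diff` output.
old = ["key_a -> 1", "key_b -> 2"]
new = ["key_b -> 3", "key_a -> 1"]

changes = diff_sorted_dumps(old, new)
print("\n".join(changes))
```

Sorting both sides first means that a different key order in the two dumps does not show up as a change.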

Emitting Redis Protocol

You can convert an RDB file into a stream of Redis protocol using the protocol command.

> rdb -c protocol /var/redis/6379/dump.rdb

*4
$4
HSET
$9
users:123
$9
firstname
$8
Sripathi

You can pipe the output to netcat and re-import a subset of the data. For example, if you want to shard your data into two redis instances, you can use the --key flag to select a subset of data, and then pipe the output to a running redis instance to load that data. Read Redis Mass Insert for more information on this.
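For reference, the framing shown in the output above (an array header, then one length-prefixed bulk string per argument) can be reproduced with a few lines of Python. This is an illustrative encoder with a made-up name, encode_command, not the code rdbtools uses.

```python
def encode_command(*args):
    """Encode one command as a Redis protocol array of bulk strings."""
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg if isinstance(arg, bytes) else str(arg).encode("utf-8")
        # The length prefix counts bytes, not characters.
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(parts)

encoded = encode_command("HSET", "users:123", "firstname", "Sripathi")
print(encoded.decode("ascii"))
```

Counting bytes rather than characters matters as soon as a key or value contains non-ASCII data.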

When printing protocol output, the --escape option can be used with printable or utf8 to avoid non printable/control characters.

By default, expire times are emitted verbatim if they are present in the rdb file, causing all keys that expire in the past to be removed. If this behaviour is unwanted the -x/--no-expire option will ignore all key expiry commands.

Otherwise you may want to set an expiry time in the future with -a/--amend-expire option which adds an integer number of seconds to the expiry time of each key which is already set to expire. This will not change keys that do not already have an expiry set.

Using the Parser

from rdbtools import RdbParser, RdbCallback
from rdbtools.encodehelpers import bytes_to_unicode

class MyCallback(RdbCallback):
    ''' Simple example to show how callback works.
        See RdbCallback for all available callback methods.
        See JsonCallback for a concrete example
    '''

    def __init__(self):
        super(MyCallback, self).__init__(string_escape=None)

    def encode_key(self, key):
        return bytes_to_unicode(key, self._escape, skip_printable=True)

    def encode_value(self, val):
        return bytes_to_unicode(val, self._escape)

    def set(self, key, value, expiry, info):
        print('%s = %s' % (self.encode_key(key), self.encode_value(value)))

    def hset(self, key, field, value):
        print('%s.%s = %s' % (self.encode_key(key), self.encode_key(field), self.encode_value(value)))

    def sadd(self, key, member):
        print('%s has {%s}' % (self.encode_key(key), self.encode_value(member)))

    def rpush(self, key, value):
        print('%s has [%s]' % (self.encode_key(key), self.encode_value(value)))

    def zadd(self, key, score, member):
        print('%s has {%s : %s}' % (str(key), str(member), str(score)))


callback = MyCallback()
parser = RdbParser(callback)
parser.parse('/var/redis/6379/dump.rdb')

Other Pages

Frequently Asked Questions
Redis Dump File Specification
Redis Dump File Version History - this also has notes on converting a dump file to an older version.

License

rdbtools is licensed under the MIT License. See LICENSE

Maintained By

Sripathi Krishnan : @srithedabbler

redis-rdb-tools's Issues

Account for Embedded String Encoding when calculating memory used by a key

Antirez has a new memory optimization branch - redis/redis@unstable...memopt

Smaller strings (less than 32 bytes) are inlined. This gets rid of an extra pointer, and so saves some memory (not a whole lot). Not a big deal, but something that we should account for nevertheless.

This should be fixed only after the optimization branch reaches unstable/master branch.

redis-memory-for-key installation issues

Running redis-memory-for-key fails unless I independently install a "global" redis python lib:

pip install redis

I noticed the setup.py is missing a:

install_requires=[
        "redis",

    ],

So I guess this is the issue. I can send a PR that solves this, if that is acceptable to you.

Also: Thanks for this great tool! saved me this week

Report progress of memory report

Could we add progress reporting (ideally with an ETA, but that's a bonus) on STDERR? (I see we're using STDOUT for actual output.)

I know it's an OSS project and so PRs are welcome :) so this issue is to solicit opinions on whether it should be added / how to do so.
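One possible shape for this, as a rough sketch only: wrap the dump file in a reader that counts bytes and writes a percentage to STDERR. ProgressReader is a hypothetical helper, not part of rdbtools, and it assumes the parser consumes the file only through read().

```python
import io
import sys

class ProgressReader:
    """Wrap a binary file and report how much of it has been read."""

    def __init__(self, fileobj, total_bytes, out=sys.stderr, step=10):
        self._file = fileobj
        self._total = total_bytes
        self._out = out
        self._step = step          # report every `step` percent
        self._read = 0
        self._next_mark = step

    def read(self, size=-1):
        data = self._file.read(size)
        self._read += len(data)
        percent = 100 * self._read // self._total if self._total else 100
        # Emit every mark that has been passed since the last call.
        while percent >= self._next_mark and self._next_mark <= 100:
            self._out.write("parsed %d%%\n" % self._next_mark)
            self._next_mark += self._step
        return data

# Demo with an in-memory "file"; a real run would wrap open(path, "rb")
# and pass os.path.getsize(path) as total_bytes.
log = io.StringIO()
reader = ProgressReader(io.BytesIO(b"x" * 1000), 1000, out=log, step=25)
while reader.read(100):
    pass
print(log.getvalue(), end="")
```

An ETA could be derived from the same byte counter plus the elapsed time, at the cost of being only as accurate as the assumption that parse time is proportional to bytes read.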

When the "-f" parameter is given, the "-k" parameter does not work

python ./rdbtools/cli/rdb.py --command json -f "test.json"  -k 'abc__*'  dump.rdb

I ran this, but found that the parameter -k 'abc__*' has no effect. I then checked the source code:

parser = RdbParser(callback)

If the parameter '-f' is given, the filters are not passed to RdbParser. Why is that?

PS: please forgive my poor English...

How do I merge multiple .rdb files into one?

Throwing error when encoding None key

It seems a None (null) key is not handled while encoding.

Error trace:

Traceback (most recent call last):
  File "bin/rdb", line 9, in <module>
    load_entry_point('rdbtools==0.1.7', 'console_scripts', 'rdb')()
  File "/home/ec2-user/python-benchmark/env/local/lib/python2.7/dist-packages/rdbtools/cli/rdb.py", line 82, in main
    parser.parse(dump_file)
  File "/home/ec2-user/python-benchmark/env/local/lib/python2.7/dist-packages/rdbtools/parser.py", line 293, in parse
    self.parse_fd(open(filename, "rb"))
  File "/home/ec2-user/python-benchmark/env/local/lib/python2.7/dist-packages/rdbtools/parser.py", line 337, in parse_fd
    self._callback.end_database(db_number)
  File "/home/ec2-user/python-benchmark/env/local/lib/python2.7/dist-packages/rdbtools/memprofiler.py", line 131, in end_database
    self._stream.next_record(record)
  File "/home/ec2-user/python-benchmark/env/local/lib/python2.7/dist-packages/rdbtools/memprofiler.py", line 82, in next_record
    self._out.write("%d,%s,%s,%d,%s,%d,%d\n" % (record.database, record.type, encode_key(record.key),
  File "/home/ec2-user/python-benchmark/env/local/lib/python2.7/dist-packages/rdbtools/callbacks.py", line 92, in encode_key
    return _encode(s, quote_numbers=True)
  File "/home/ec2-user/python-benchmark/env/local/lib/python2.7/dist-packages/rdbtools/callbacks.py", line 89, in _encode
    return _encode_basestring_ascii(s)
  File "/home/ec2-user/python-benchmark/env/local/lib/python2.7/dist-packages/rdbtools/callbacks.py", line 70, in _encode_basestring_ascii
    return '"' + str(ESCAPE_ASCII.sub(replace, s)) + '"'
TypeError: expected string or buffer

I printed the record that triggers this error:
MemoryRecord(database=0, type='dict', key=None, bytes=104.0, encoding=None, size=None, len_largest_element=None)

RDB file is hereby attached.
dump.zip

small fix in readme

Hello, I just found your project and it would be great to have working git command - so instead of:
git checkout git@github.com:sripathikrishnan/redis-rdb-tools.git

something like:
git clone https://github.com/sripathikrishnan/redis-rdb-tools

Thanks :)

Invalid json format by missing comma when handling hash

When a hash's size is greater than 253, commas are missing from the output JSON file.

Here's an example data. Copy it to a formatter and check it out.

{"2012-12-26 03:45":"139#0#117#1.0","2012-12-26 12:55":"461#0#383#1.0","2012-12-26 22:50":"244#0#214#1.0","2012-12-26 23:10":"205#0#186#1.0","2012-12-26 14:00":"424#0#357#1.0","2012-12-26 00:55":"195#0#139#1.0","2012-12-26 10:50":"354#0#290#1.0","2012-12-26 09:25":"301#0#261#1.0","2012-12-26 15:45":"428#0#378#1.0","2012-12-26 20:20":"345#0#285#1.0","2012-12-26 07:20":"210#0#181#1.0","2012-12-26 02:00":"184#0#156#1.0","2012-12-26 13:15":"423#0#345#1.0","2012-12-26 01:15":"199#0#158#1.0","2012-12-26 22:25":"294#0#241#1.0","2012-12-26 04:05":"145#0#124#1.0","2012-12-26 16:30":"452#0#369#1.0","2012-12-26 06:35":"180#0#146#1.0","2012-12-26 01:40":"160#0#145#1.0","2012-12-26 10:25":"318#0#270#1.0","2012-12-26 16:05":"420#0#358#1.0","2012-12-26 04:30":"153#0#125#1.0","2012-12-26 18:35":"444#0#354#1.0","2012-12-26 09:50":"326#0#283#1.0","2012-12-26 13:40":"415#0#351#1.0","2012-12-26 11:10":"377#0#304#1.0","2012-12-26 19:20":"412#0#326#1.0","2012-12-26 20:45":"355#0#299#1.0","2012-12-26 15:10":"398#0#330#1.0","2012-12-26 07:45":"229#0#192#1.0","2012-12-26 05:15":"163#0#136#1.0","2012-12-26 21:30":"351#0#279#1.0","2012-12-26 02:50":"154#0#130#1.0","2012-12-26 08:05":"288#0#226#1.0","2012-12-26 02:25":"156#0#141#1.0","2012-12-26 08:30":"229#0#194#1.0","2012-12-26 06:00":"169#0#129#1.0","2012-12-26 14:25":"480#0#390#1.0","2012-12-26 23:35":"173#0#152#1.0","2012-12-26 18:00":"442#0#362#1.0","2012-12-26 16:55":"467#0#365#1.0","2012-12-26 04:55":"165#0#137#1.0","2012-12-26 03:10":"176#0#140#1.0","2012-12-26 00:20":"178#0#156#1.0","2012-12-26 19:45":"411#0#344#1.0","2012-12-26 21:05":"370#0#314#1.0","2012-12-26 17:40":"418#0#344#1.0","2012-12-26 05:40":"172#0#140#1.0","2012-12-26 12:20":"416#0#344#1.0","2012-12-26 17:15":"446#0#377#1.0","2012-12-26 11:35":"397#0#337#1.0","2012-12-26 14:50":"446#0#355#1.0","2012-12-26 06:50":"173#0#142#1.0","2012-12-26 23:00":"230#0#194#1.0","2012-12-26 01:05":"175#0#152#1.0","2012-12-26 18:50":"391#0#322#1.0","2012-12-26 
07:10":"196#0#170#1.0","2012-12-26 22:15":"300#0#240#1.0","2012-12-26 09:40":"327#0#280#1.0","2012-12-26 03:35":"184#0#133#1.0","2012-12-26 13:05":"445#0#340#1.0","2012-12-26 20:10":"404#0#326#1.0","2012-12-26 12:45":"397#0#327#1.0","2012-12-26 19:10":"424#0#345#1.0","2012-12-26 06:25":"183#0#152#1.0","2012-12-26 13:30":"454#0#384#1.0","2012-12-26 00:45":"181#0#155#1.0","2012-12-26 21:55":"291#0#247#1.0","2012-12-26 04:20":"131#0#111#1.0","2012-12-26 10:40":"361#0#299#1.0","2012-12-26 18:25":"406#0#350#1.0","2012-12-26 01:30":"175#0#143#1.0","2012-12-26 09:15":"319#0#266#1.0","2012-12-26 08:55":"298#0#255#1.0","2012-12-26 15:35":"428#0#356#1.0","2012-12-26 11:00":"345#0#287#1.0","2012-12-26 10:15":"367#0#291#1.0","2012-12-26 22:40":"285#0#236#1.0","2012-12-26 16:20":"427#0#358#1.0","2012-12-26 05:30":"126#0#111#1.0","2012-12-26 23:50":"231#0#200#1.0","2012-12-26 16:45":"529#0#419#1.0","2012-12-26 11:50":"436#0#358#1.0","2012-12-26 23:25":"226#0#183#1.0","2012-12-26 19:35":"455#0#372#1.0","2012-12-26 07:35":"202#0#178#1.0","2012-12-26 03:00":"173#0#142#1.0","2012-12-26 12:10":"372#0#299#1.0","2012-12-26 08:20":"216#0#189#1.0","2012-12-26 14:15":"430#0#360#1.0","2012-12-26 05:05":"141#0#121#1.0","2012-12-26 00:10":"196#0#157#1.0","2012-12-26 20:35":"382#0#304#1.0","2012-12-26 21:20":"373#0#289#1.0","2012-12-26 17:30":"397#0#347#1.0","2012-12-26 14:40":"463#0#375#1.0","2012-12-26 02:15":"177#0#146#1.0","2012-12-26 04:45":"112#0#102#1.0","2012-12-26 02:40":"157#0#138#1.0","2012-12-26 11:25":"447#0#344#1.0","2012-12-26 17:05":"425#0#348#1.0","2012-12-26 13:55":"449#0#375#1.0","2012-12-26 01:55":"186#0#157#1.0","2012-12-26 15:00":"463#0#376#1.0","2012-12-26 17:55":"410#0#340#1.0","2012-12-26 06:15":"146#0#126#1.0","2012-12-26 19:00":"402#0#329#1.0","2012-12-26 09:05":"315#0#262#1.0","2012-12-26 15:25":"432#0#367#1.0","2012-12-26 13:20":"455#0#366#1.0","2012-12-26 18:15":"478#0#387#1.0","2012-12-26 09:30":"318#0#274#1.0","2012-12-26 07:00":"190#0#169#1.0","2012-12-26 
15:50":"457#0#380#1.0","2012-12-26 08:45":"264#0#216#1.0","2012-12-26 16:10":"418#0#357#1.0","2012-12-26 10:30":"326#0#265#1.0","2012-12-26 03:25":"129#0#114#1.0","2012-12-26 22:30":"296#0#238#1.0","2012-12-26 20:00":"377#0#316#1.0","2012-12-26 06:40":"177#0#156#1.0","2012-12-26 18:40":"418#0#351#1.0","2012-12-26 00:35":"196#0#157#1.0","2012-12-26 21:45":"324#0#258#1.0","2012-12-26 03:50":"121#0#104#1.0","2012-12-26 04:10":"119#0#106#1.0","2012-12-26 22:05":"308#0#242#1.0","2012-12-26 10:05":"310#0#261#1.0","2012-12-26 12:35":"383#0#320#1.0","2012-12-26 01:20":"171#0#156#1.0","2012-12-26 05:55":"179#0#146#1.0","2012-12-26 05:20":"150#0#126#1.0","2012-12-26 02:30":"184#0#155#1.0","2012-12-26 08:10":"265#0#223#1.0","2012-12-26 14:30":"469#0#369#1.0","2012-12-26 17:20":"473#0#390#1.0","2012-12-26 04:35":"118#0#103#1.0","2012-12-26 09:55":"332#0#285#1.0","2012-12-26 20:50":"336#0#269#1.0","2012-12-26 19:50":"366#0#310#1.0","2012-12-26 19:25":"421#0#352#1.0","2012-12-26 21:10":"330#0#278#1.0","2012-12-26 02:05":"177#0#143#1.0","2012-12-26 13:45":"457#0#351#1.0","2012-12-26 11:40":"414#0#324#1.0","2012-12-26 23:15":"220#0#191#1.0","2012-12-26 07:25":"214#0#184#1.0","2012-12-26 00:00":"177#0#143#1.0","2012-12-26 23:40":"178#0#166#1.0","2012-12-26 20:25":"367#0#304#1.0","2012-12-26 11:15":"402#0#328#1.0","2012-12-26 01:45":"157#0#131#1.0","2012-12-26 16:35":"431#0#364#1.0","2012-12-26 12:00":"405#0#340#1.0","2012-12-26 22:55":"203#0#182#1.0","2012-12-26 10:55":"402#0#321#1.0","2012-12-26 14:05":"386#0#330#1.0","2012-12-26 07:50":"233#0#194#1.0","2012-12-26 00:50":"227#0#171#1.0","2012-12-26 06:05":"163#0#138#1.0","2012-12-26 22:20":"275#0#225#1.0","2012-12-26 12:50":"400#0#340#1.0","2012-12-26 17:45":"439#0#358#1.0","2012-12-26 16:00":"450#0#368#1.0","2012-12-26 18:30":"415#0#351#1.0","2012-12-26 06:30":"181#0#148#1.0","2012-12-26 10:20":"376#0#308#1.0","2012-12-26 12:25":"373#0#324#1.0","2012-12-26 13:10":"433#0#357#1.0","2012-12-26 03:15":"140#0#125#1.0","2012-12-26 
21:35":"372#0#289#1.0","2012-12-26 00:25":"208#0#166#1.0","2012-12-26 15:15":"485#0#385#1.0","2012-12-26 05:45":"144#0#128#1.0","2012-12-26 03:40":"146#0#122#1.0","2012-12-26 08:35":"264#0#228#1.0","2012-12-26 01:10":"203#0#177#1.0","2012-12-26 14:55":"467#0#376#1.0","2012-12-26 04:00":"123#0#103#1.0","2012-12-26 09:20":"309#0#254#1.0","2012-12-26 18:05":"437#0#352#1.0","2012-12-26 02:55":"134#0#118#1.0","2012-12-26 15:40":"412#0#363#1.0","2012-12-26 21:00":"352#0#305#1.0","2012-12-26 16:25":"439#0#343#1.0","2012-12-26 14:20":"451#0#373#1.0","2012-12-26 07:15":"196#0#166#1.0","2012-12-26 11:05":"355#0#304#1.0","2012-12-26 16:50":"425#0#350#1.0","2012-12-26 07:40":"214#0#186#1.0","2012-12-26 10:45":"371#0#301#1.0","2012-12-26 22:45":"229#0#194#1.0","2012-12-26 18:55":"448#0#352#1.0","2012-12-26 17:10":"452#0#384#1.0","2012-12-26 06:55":"200#0#168#1.0","2012-12-26 20:40":"351#0#294#1.0","2012-12-26 20:15":"390#0#323#1.0","2012-12-26 11:30":"387#0#324#1.0","2012-12-26 02:20":"166#0#143#1.0","2012-12-26 23:30":"227#0#193#1.0","2012-12-26 08:00":"184#0#167#1.0","2012-12-26 09:45":"312#0#264#1.0","2012-12-26 04:25":"139#0#124#1.0","2012-12-26 19:40":"384#0#320#1.0","2012-12-26 04:50":"145#0#112#1.0","2012-12-26 23:05":"227#0#198#1.0","2012-12-26 05:10":"122#0#108#1.0","2012-12-26 13:35":"460#0#380#1.0","2012-12-26 19:15":"409#0#337#1.0","2012-12-26 01:35":"185#0#158#1.0","2012-12-26 14:45":"420#0#354#1.0","2012-12-26 05:35":"156#0#127#1.0","2012-12-26 09:10":"287#0#236#1.0","2012-12-26 08:50":"269#0#219#1.0","2012-12-26 08:25":"225#0#179#1.0","2012-12-26 21:50":"310#0#258#1.0","2012-12-26 01:00":"207#0#175#1.0","2012-12-26 15:30":"427#0#365#1.0","2012-12-26 22:10":"292#0#243#1.0","2012-12-26 03:30":"160#0#109#1.0","2012-12-26 11:55":"410#0#327#1.0","2012-12-26 02:45":"171#0#148#1.0","2012-12-26 18:20":"445#0#353#1.0","2012-12-26 12:15":"397#0#329#1.0","2012-12-26 17:35":"464#0#368#1.0","2012-12-26 13:00":"435#0#365#1.0","2012-12-26 00:15":"167#0#146#1.0","2012-12-26 
12:40":"394#0#318#1.0","2012-12-26 00:40":"188#0#156#1.0","2012-12-26 06:20":"192#0#155#1.0","2012-12-26 03:05":"143#0#125#1.0","2012-12-26 10:10":"374#0#298#1.0","2012-12-26 21:25":"376#0#297#1.0","2012-12-26 15:05":"471#0#391#1.0","2012-12-26 16:15":"423#0#344#1.0","2012-12-26 05:00":"142#0#117#1.0","2012-12-26 22:35":"254#0#216#1.0","2012-12-26 07:05":"180#0#148#1.0","2012-12-26 07:30":"212#0#176#1.0","2012-12-26 17:00":"453#0#369#1.0","2012-12-26 01:25":"159#0#143#1.0","2012-12-26 06:45":"176#0#142#1.0","2012-12-26 01:50":"162#0#137#1.0","2012-12-26 13:50":"431#0#361#1.0","2012-12-26 16:40":"460#0#384#1.0","2012-12-26 13:25":"443#0#361#1.0","2012-12-26 03:55":"152#0#117#1.0","2012-12-26 02:10":"193#0#155#1.0","2012-12-26 19:05":"395#0#335#1.0","2012-12-26 15:55":"466#0#381#1.0","2012-12-26 18:45":"391#0#317#1.0","2012-12-26 23:20":"220#0#197#1.0","2012-12-26 04:40":"120#0#103#1.0","2012-12-26 04:15":"135#0#120#1.0""2012-12-26 20:05":"383#0#337#1.0""2012-12-26 10:35":"360#0#296#1.0""2012-12-26 11:20":"424#0#323#1.0""2012-12-26 09:35":"277#0#234#1.0""2012-12-26 19:30":"336#0#288#1.0""2012-12-26 20:30":"383#0#304#1.0""2012-12-26 14:10":"417#0#356#1.0""2012-12-26 22:00":"285#0#247#1.0""2012-12-26 03:20":"148#0#116#1.0""2012-12-26 00:30":"177#0#149#1.0""2012-12-26 05:50":"167#0#134#1.0""2012-12-26 12:30":"401#0#324#1.0""2012-12-26 14:35":"438#0#372#1.0""2012-12-26 10:00":"315#0#269#1.0""2012-12-26 17:50":"552#0#418#1.0""2012-12-26 23:45":"192#0#161#1.0""2012-12-26 12:05":"365#0#311#1.0""2012-12-26 21:40":"320#0#257#1.0""2012-12-26 18:10":"434#0#350#1.0""2012-12-26 02:35":"151#0#142#1.0""2012-12-26 05:25":"136#0#123#1.0""2012-12-26 07:55":"254#0#189#1.0""2012-12-26 19:55":"381#0#329#1.0""2012-12-26 21:15":"353#0#284#1.0""2012-12-26 00:05":"169#0#137#1.0""2012-12-26 08:15":"225#0#190#1.0""2012-12-26 08:40":"268#0#236#1.0""2012-12-26 17:25":"450#0#364#1.0""2012-12-26 09:00":"320#0#256#1.0""2012-12-26 20:55":"339#0#278#1.0""2012-12-26 15:20":"479#0#401#1.0""2012-12-26 
06:10":"182#0#149#1.0""2012-12-26 11:45":"394#0#323#1.0"}

Redis 4.0 support

Attempting to process a Redis 4.0rc2 dump.rdb file yields:

Traceback (most recent call last):
  File "/usr/bin/rdb", line 11, in <module>
    load_entry_point('rdbtools==0.1.7', 'console_scripts', 'rdb')()
  File "/usr/lib64/python2.7/site-packages/rdbtools-0.1.7-py2.7.egg/rdbtools/cli/rdb.py", line 79, in main
    parser.parse(dump_file)
  File "/usr/lib64/python2.7/site-packages/rdbtools-0.1.7-py2.7.egg/rdbtools/parser.py", line 304, in parse
    self.parse_fd(open(filename, "rb"))
  File "/usr/lib64/python2.7/site-packages/rdbtools-0.1.7-py2.7.egg/rdbtools/parser.py", line 309, in parse_fd
    self.verify_version(f.read(4))
  File "/usr/lib64/python2.7/site-packages/rdbtools-0.1.7-py2.7.egg/rdbtools/parser.py", line 697, in verify_version
    raise Exception('verify_version', 'Invalid RDB version number %d' % version)
Exception: ('verify_version', 'Invalid RDB version number 8')

Not sure if it's a major deal to add support for 4.0 dumps or not.

Key bytes is bigger than Redis memory used

I used redis-memory-for-key to analyse a key, and it reported the key's size as 589M. But my Redis memory used is only 502M. What exactly does the Bytes value of a key mean?

ValueError: year is out of range (memory report)

Attempting a memory report on our redis dump files fails as per below

$ sudo rdb -c memory dump.rdb.pre-purge > rdb_memory.csv
Traceback (most recent call last):
  File "/usr/local/bin/rdb", line 9, in <module>
    load_entry_point('rdbtools==0.1.6', 'console_scripts', 'rdb')()
  File "/usr/local/lib/python2.7/dist-packages/rdbtools/cli/rdb.py", line 82, in main
    parser.parse(dump_file)
  File "/usr/local/lib/python2.7/dist-packages/rdbtools/parser.py", line 284, in parse
    self._expiry = to_datetime(read_unsigned_long(f) * 1000)
  File "/usr/local/lib/python2.7/dist-packages/rdbtools/parser.py", line 705, in to_datetime
    dt = datetime.datetime.utcfromtimestamp(seconds_since_epoch)
ValueError: year is out of range

Abandonware?

Is this abandonware?

The last commit was nearly two years ago
There are 16 open pull requests

@sripathikrishnan perhaps you'd consider asking for assistance in maintaining this most useful piece of open source software?

Failing that, time to go looking for forks, or alternatives, I guess.

Create Script to find Top N keys by Used Memory

A very frequent use case is finding the top N keys by memory usage. For small databases, this is easily achieved by taking the CSV file and sorting it.

But for databases with more than 64K keys, the CSV approach fails because Excel cannot handle that many rows.

The solution is to have a script that does a linear pass over all keys, maintains the top N keys by memory usage, and finally prints them out. Perhaps we could maintain some more statistics as well.

See discussion on this over here - http://stackoverflow.com/questions/13673058/what-is-the-easiest-way-to-find-the-biggest-objects-in-redis/13681596#comment18794833_13681596
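A rough sketch of such a script, assuming the input is the CSV produced by rdb -c memory: heapq.nlargest makes a single pass and keeps only N rows in memory. The function name largest_keys is made up for this example, and the sample rows come from the README's memory report.

```python
import csv
import heapq
import io

# Sample rows from the README's memory report; a real run would open the CSV file.
REPORT = """database,type,key,size_in_bytes,encoding,num_elements,len_largest_element
0,list,lizards,241,quicklist,5,19
0,list,user_list,190,quicklist,3,7
2,hash,baloon,138,ziplist,3,11
2,list,armadillo,231,quicklist,5,20
2,hash,aroma,129,ziplist,3,11
"""

def largest_keys(csv_file, n):
    """Return the n rows with the highest size_in_bytes, largest first."""
    rows = csv.DictReader(csv_file)
    return heapq.nlargest(n, rows, key=lambda row: int(row["size_in_bytes"]))

top = [(row["key"], int(row["size_in_bytes"]))
       for row in largest_keys(io.StringIO(REPORT), 2)]
print(top)
```

Because the rows are streamed, the size of the CSV does not matter; only the N current candidates are held at any time.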

Import of RDB file to redis-cli pipe fails

Hi, while I can see this project hasn't been worked on for a while I thought it might be useful to document an issue we came across today while importing an RDB file from a read replica we run into a fresh AWS ElastiCache cluster.

Traceback (most recent call last):
  File "/usr/local/bin/rdb", line 9, in <module>
    load_entry_point('rdbtools==0.1.6', 'console_scripts', 'rdb')()
  File "/usr/local/lib/python2.7/dist-packages/rdbtools/cli/rdb.py", line 82, in main
    parser.parse(dump_file)
  File "/usr/local/lib/python2.7/dist-packages/rdbtools/parser.py", line 306, in parse
    self.read_object(f, data_type)
  File "/usr/local/lib/python2.7/dist-packages/rdbtools/parser.py", line 369, in read_object
    self._callback.rpush(self._key, val)
  File "/usr/local/lib/python2.7/dist-packages/rdbtools/callbacks.py", line 340, in rpush
    self.emit('RPUSH', key, value)
  File "/usr/local/lib/python2.7/dist-packages/rdbtools/callbacks.py", line 298, in emit
    self._out.write(u"$" + unicode(len(unicode(arg))) + u"\r\n")
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 4614: ordinal not in range(128)
All data transferred. Waiting for the last reply...
ERR Protocol error: expected '$', got ' '

If this is a simple fix or something we are doing wrong would love to know. We installed using pip and the command we ran was rdb --command protocol ~/redisprod-2015-08-27-100001.rdb | redis-cli -h xxxxxxx.xxxxxx.xx.xxxx.xxxx.cache.amazonaws.com --pipe.

Running the same command without piping into redis-cli produces the following error:

UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 4614: ordinal not in range(128)

Install error

  Using cached rdbtools-0.1.8.tar.gz
    Complete output from command python setup.py egg_info:
    Traceback (most recent call last):
      File "<string>", line 20, in <module>
      File "/private/var/folders/fj/d9k5_d914m51bd5_8f0s92bh0000gn/T/pip-build-gmjvsvbw/rdbtools/setup.py", line 3, in <module>
        from rdbtools import __version__
      File "/private/var/folders/fj/d9k5_d914m51bd5_8f0s92bh0000gn/T/pip-build-gmjvsvbw/rdbtools/rdbtools/__init__.py", line 2, in <module>
        from rdbtools.callbacks import JSONCallback, DiffCallback, ProtocolCallback

    ----------------------------------------
    Command "python setup.py egg_info" failed with error code 1 in /private/var/folders/fj/d9k5_d914m51bd5_8f0s92bh0000gn/T/pip-build-gmjvsvbw/rdbtools

Expiry time incorrectly calculated

Expiry time read from the rdb isn't converted correctly to microseconds. Note that the "to_datetime()" function receives input in microseconds. Consider the following patch:

--- ../../redis-rdb-tools/rdbtools/parser.py    (revision 1545)
+++ ../../redis-rdb-tools/rdbtools/parser.py    (working copy)
@@ -283,10 +283,10 @@
                 data_type = read_unsigned_char(f)

                 if data_type == REDIS_RDB_OPCODE_EXPIRETIME_MS :
-                    self._expiry = to_datetime(read_unsigned_long(f))
+                    self._expiry = to_datetime(read_unsigned_long(f) * 1000)
                     data_type = read_unsigned_char(f)
                 elif data_type == REDIS_RDB_OPCODE_EXPIRETIME :
-                    self._expiry = to_datetime(read_unsigned_int(f) * 1000)
+                    self._expiry = to_datetime(read_unsigned_int(f) * 1000000)
                     data_type = read_unsigned_char(f)

                 if data_type == REDIS_RDB_OPCODE_SELECTDB :
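The unit arithmetic behind the patch can be checked in isolation. The to_datetime below is a stand-in written for this example (it only assumes, as stated above, that the function takes microseconds since the epoch); the two expiry values are arbitrary and describe the same instant.

```python
import datetime

EPOCH = datetime.datetime(1970, 1, 1)

def to_datetime(usecs_since_epoch):
    """Convert microseconds since the Unix epoch to a naive UTC datetime."""
    return EPOCH + datetime.timedelta(microseconds=usecs_since_epoch)

# The same instant, stored in milliseconds and in seconds.
expiry_ms = 1000000000000
expiry_s = 1000000000

from_ms = to_datetime(expiry_ms * 1000)      # milliseconds -> microseconds
from_s = to_datetime(expiry_s * 1000000)     # seconds -> microseconds
print(from_ms, from_s)
```

With the factors from the patch, both paths land on the same datetime; with the old factors each would be off by a factor of 1000.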

Support RDB version 6 : New ziplist encoding

Ziplists will soon start using variable encoding for 8 and 24 bit numbers. See https://github.com/antirez/redis/compare/zipenc

rdb-tools should add support for this version as soon as it moves to unstable / 2.6

TypeError: sequence item 1: expected a bytes-like object, str found

Command: c:\Python34\Scripts>rdb --command json C:/Users/Sam/Downloads/dump.rdb
OS: Windows 10
Python Version: 3.4

c:\Python34\Scripts>rdb --command json C:/Users/Sam/Downloads/dump.rdb
[{
Traceback (most recent call last):
  File "c:\Python34\Scripts\rdb-script.py", line 11, in <module>
    load_entry_point('rdbtools==0.1.7', 'console_scripts', 'rdb')()
  File "C:\Python34\lib\site-packages\rdbtools-0.1.7-py3.4.egg\rdbtools\cli\rdb.py", line 79, in main
    parser.parse(dump_file)
  File "C:\Python34\lib\site-packages\rdbtools-0.1.7-py3.4.egg\rdbtools\parser.py", line 304, in parse
    self.parse_fd(open(filename, "rb"))
  File "C:\Python34\lib\site-packages\rdbtools-0.1.7-py3.4.egg\rdbtools\parser.py", line 357, in parse_fd
    self.read_object(f, data_type)
  File "C:\Python34\lib\site-packages\rdbtools-0.1.7-py3.4.egg\rdbtools\parser.py", line 425, in read_object
    self._callback.set(self._key, val, self._expiry, info={'encoding':'string'})
  File "C:\Python34\lib\site-packages\rdbtools-0.1.7-py3.4.egg\rdbtools\callbacks.py", line 151, in set
    self._out.write('%s:%s' % (encode_key(key), encode_value(value)))
  File "C:\Python34\lib\site-packages\rdbtools-0.1.7-py3.4.egg\rdbtools\callbacks.py", line 102, in encode_value
    return _encode(s, quote_numbers=False)
  File "C:\Python34\lib\site-packages\rdbtools-0.1.7-py3.4.egg\rdbtools\callbacks.py", line 96, in _encode
    return _encode_basestring_ascii(s)
  File "C:\Python34\lib\site-packages\rdbtools-0.1.7-py3.4.egg\rdbtools\callbacks.py", line 77, in _encode_basestring_ascii
    return '"' + str(ESCAPE_ASCII.sub(replace, s)) + '"'
TypeError: sequence item 1: expected a bytes-like object, str found

Can someone please take a look at this and tell me what the issue is? Any help is appreciated. Thanks!

Converting rdb to protocol (UnicodeDecodeError: 'ascii' codec can't decode byte 0xd5)

uporabnik@lju2134:~$ rdb --command protocol dump-redis-6401.rdb
*2
$6
SELECT
$1
0
*3
$3
SET
$37
pb-urn:sspcreativeapprovalstatus:6741
Traceback (most recent call last):
File "/usr/local/bin/rdb", line 9, in
load_entry_point('rdbtools==0.1.8', 'console_scripts', 'rdb')()
File "/usr/local/lib/python2.7/dist-packages/rdbtools/cli/rdb.py", line 82, in main
parser.parse(dump_file)
File "/usr/local/lib/python2.7/dist-packages/rdbtools/parser.py", line 293, in parse
self.parse_fd(open(filename, "rb"))
File "/usr/local/lib/python2.7/dist-packages/rdbtools/parser.py", line 346, in parse_fd
self.read_object(f, data_type)
File "/usr/local/lib/python2.7/dist-packages/rdbtools/parser.py", line 401, in read_object
self._callback.set(self._key, val, self._expiry, info={'encoding':'string'})
File "/usr/local/lib/python2.7/dist-packages/rdbtools/callbacks.py", line 310, in set
self.emit('SET', key, value)
File "/usr/local/lib/python2.7/dist-packages/rdbtools/callbacks.py", line 299, in emit
self._out.write(u"$" + unicode(len(unicode(arg))) + u"\r\n")
UnicodeDecodeError: 'ascii' codec can't decode byte 0xd5 in position 1: ordinal not in range(128)

version used:
sudo pip install git+https://github.com/sripathikrishnan/redis-rdb-tools@28ccd5b5dec0ac6e8724af3c61d167771b4d9c77 --upgrade

Binary data output

Hi, I store binary data as values for my keys. I see multiple bytes are combined into a single unicode character when I use rdb-tools to dump my data in json.

This is what I get when dumping from redis using redis-cli:

redis 127.0.0.1:6381> get "43542948993:Brown Home Improvement"
"\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01Y\xcc\x9c\x01"

This is what I get from the rdb-tools dump:
[{
"43542948993:Brown Home Improvement":"\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0001Y\u031c\u0001"}]

The \xcc\x9c is being combined into \u031c. I am guessing the tool reads the data as UTF-8 and converts it to unicode. Is there a way to preserve the bytes?
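The behaviour described above can be reproduced in a few lines of Python. This is only an illustration of the encoding arithmetic, not rdbtools code; the variable names are made up for the example, and Latin-1 is shown as one possible byte-preserving choice, not as something the tool offers.

```python
# Tail of the value shown by redis-cli above.
raw = b"\x01Y\xcc\x9c\x01"

# Decoding as UTF-8 merges the two bytes 0xCC 0x9C into one code point, U+031C.
as_utf8 = raw.decode("utf-8")

# Latin-1 maps each byte 0x00-0xFF to the code point with the same value,
# so every byte survives and the original data can be recovered exactly.
as_latin1 = raw.decode("latin-1")
round_trip = as_latin1.encode("latin-1")

print(len(raw), len(as_utf8), len(as_latin1))
```

With UTF-8 the five bytes become four characters; with Latin-1 they stay five characters, one per byte.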

AUTH support for redis-memory-for-key

I just tried to use redis-memory-for-key on a redis instance that requires authentication, and apparently there is no way to do so.

PS thank you anyway for the tool!

Exception in read_zipmap_next_length

When I run with "--command diff" or "-c memory" I get the following exception:

Traceback (most recent call last):
File "./rdb", line 80, in
main()
File "./rdb", line 77, in main
parser.parse(dump_file)
File "/home/danielm/Downloads/sripathikrishnan-redis-rdb-tools-94d13ef/rdbtools/parser.py", line 306, in parse
self.read_object(f, data_type)
File "/home/danielm/Downloads/sripathikrishnan-redis-rdb-tools-94d13ef/rdbtools/parser.py", line 401, in read_object
self.read_zipmap(f)
File "/home/danielm/Downloads/sripathikrishnan-redis-rdb-tools-94d13ef/rdbtools/parser.py", line 577, in read_zipmap
next_length = self.read_zipmap_next_length(buff)
File "/home/danielm/Downloads/sripathikrishnan-redis-rdb-tools-94d13ef/rdbtools/parser.py", line 598, in read_zipmap_next_length
raise Exception('read_zipmap_next_length', 'Unexpected value in length field - %d' % num)
Exception: ('read_zipmap_next_length', 'Unexpected value in length field - 254')

ValueError: year is out of range

I'm trying to dump my RDB to a JSON file filtered by a particular key. I get an error (ValueError: year is out of range) regardless of the key filter being applied. I'm running Python 2.7.1 on OSX 10.7.5

rdb --command json --key "user.*" dump.rdb

Error:

[{Traceback (most recent call last):
  File "/usr/local/bin/rdb", line 8, in <module>
    load_entry_point('rdbtools==0.1.5', 'console_scripts', 'rdb')()
  File "/Library/Python/2.7/site-packages/rdbtools-0.1.5-py2.7.egg/rdbtools/cli/rdb.py", line 79, in main
    parser.parse(dump_file)
  File "/Library/Python/2.7/site-packages/rdbtools-0.1.5-py2.7.egg/rdbtools/parser.py", line 284, in parse
    self._expiry = to_datetime(read_unsigned_long(f) * 1000)
  File "/Library/Python/2.7/site-packages/rdbtools-0.1.5-py2.7.egg/rdbtools/parser.py", line 705, in to_datetime
    dt = datetime.datetime.utcfromtimestamp(seconds_since_epoch)
ValueError: year is out of range
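One defensive way to convert an expiry value is sketched below. It assumes the value is in milliseconds since the Unix epoch; the helper name expiry_from_millis is hypothetical and this is not the project's to_datetime function. Values that fall outside what datetime can represent yield None instead of an exception.

```python
import datetime

EPOCH = datetime.datetime(1970, 1, 1)

def expiry_from_millis(ms_since_epoch):
    """Return a naive UTC datetime, or None if the value is out of range.

    Hypothetical helper: assumes the expiry is milliseconds since the epoch.
    """
    try:
        # Both the timedelta constructor and the addition raise OverflowError
        # when the result cannot be represented.
        return EPOCH + datetime.timedelta(milliseconds=ms_since_epoch)
    except OverflowError:
        return None

print(expiry_from_millis(1000))
print(expiry_from_millis(2 ** 63))
```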

Support Redis 3.x

Getting the following error when using Redis 3.2

Traceback (most recent call last):
  File "/usr/bin/rdb", line 9, in <module>
    load_entry_point('rdbtools==0.1.6', 'console_scripts', 'rdb')()
  File "/usr/lib/python2.7/site-packages/rdbtools/cli/rdb.py", line 82, in main
    parser.parse(dump_file)
  File "/usr/lib/python2.7/site-packages/rdbtools/parser.py", line 274, in parse
    self.verify_version(f.read(4))
  File "/usr/lib/python2.7/site-packages/rdbtools/parser.py", line 612, in verify_version
    raise Exception('verify_version', 'Invalid RDB version number %d' % version)
Exception: ('verify_version', 'Invalid RDB version number 7')

with the dump file

REDIS0007�    redis-ver�3.2.0�
redis-bits�@��ctime�d�2W��used-mem�ؔ��������hello�world��w9�=Z��
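The dump above starts with the magic string REDIS followed by a four-digit version, here 0007. A minimal sketch of reading that header follows; the function name read_rdb_version is an assumption for illustration and is not the parser's verify_version.

```python
def read_rdb_version(header):
    """Return the RDB version number from the first 9 bytes of a dump file.

    Hypothetical helper: expects the magic string b"REDIS" followed by
    four ASCII digits, as seen in the dump above.
    """
    if len(header) < 9 or header[:5] != b"REDIS":
        raise ValueError("not an RDB file: bad magic string")
    return int(header[5:9].decode("ascii"))

print(read_rdb_version(b"REDIS0007\xfa"))
```

A parser that only knows versions up to 6 would reject this file at the version check, which matches the exception in the report.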

LZF encoding bit is wrong in redis-2.8.13

"""
Compressed Strings
First read the section “Length Encoding”, specifically the part when the first two bits are 11. In this case, the remaining 6 bits are read.
If the value of those 6 bits is 4, it indicates that a compressed string follows.
"""
The hint above is not correct.
In version 2.8.13, the byte 'c3', not 'c4', marks an LZF-compressed string.
So "If the value of those 6 bits is 4" should change to "If the value of those 6 bits is 3"
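The bit arithmetic behind this report can be checked with a short sketch. The function name classify_length_byte and the returned labels are made up for the example; the special-encoding values 0 to 3 follow the Redis source constants for 8, 16, and 32 bit integers and LZF.

```python
def classify_length_byte(first_byte):
    """Classify the first byte of an RDB length encoding (illustrative only)."""
    top_two_bits = (first_byte & 0xC0) >> 6
    low_six_bits = first_byte & 0x3F
    if top_two_bits != 3:
        # 00, 01 and 10 introduce an ordinary length.
        return "plain length"
    # Top bits 11: the remaining 6 bits select a special encoding.
    special = {0: "int8", 1: "int16", 2: "int32", 3: "lzf"}
    return special.get(low_six_bits, "unknown")

print(classify_length_byte(0xC3))
print(classify_length_byte(0xC4))
```

The byte 0xC3 has the top two bits set and the value 3 in the low six bits, so it is the one that selects LZF.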

Readme should use 'sudo' on pip command

The readme file directs the user to use the command 'pip install rdbtools'. It should be 'sudo pip install rdbtools', at least that was required in order to get it to work on my OSX system.

long type is not supported

Great tool, congrats.

I'm having a problem though. While doing a

rdb -c memory mydata.rdb

On a redis 2.4.14 RDB file I get the following stacktrace:

Traceback (most recent call last):
File "/usr/local/bin/rdb", line 9, in
load_entry_point('rdbtools==0.1.2', 'console_scripts', 'rdb')()
File "/usr/local/lib/python2.6/dist-packages/rdbtools-0.1.2-py2.6.egg/rdbtools/cli/rdb.py", line 79, in main
parser.parse(dump_file)
File "/usr/local/lib/python2.6/dist-packages/rdbtools-0.1.2-py2.6.egg/rdbtools/parser.py", line 306, in parse
self.read_object(f, data_type)
File "/usr/local/lib/python2.6/dist-packages/rdbtools-0.1.2-py2.6.egg/rdbtools/parser.py", line 401, in read_object
self.read_zipmap(f)
File "/usr/local/lib/python2.6/dist-packages/rdbtools-0.1.2-py2.6.egg/rdbtools/parser.py", line 593, in read_zipmap
self._callback.hset(self._key, key, value)
File "/usr/local/lib/python2.6/dist-packages/rdbtools-0.1.2-py2.6.egg/rdbtools/memprofiler.py", line 139, in hset
if(element_length(value) > self._len_largest_element) :
File "/usr/local/lib/python2.6/dist-packages/rdbtools-0.1.2-py2.6.egg/rdbtools/memprofiler.py", line 336, in element_length
return len(element)
TypeError: object of type 'long' has no len()
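The traceback shows len() being called on an integer value. One possible guard is sketched below; the fixed size of 8 for integers is an assumption chosen for illustration, not a figure taken from the project.

```python
def element_length(element):
    """Return a length for an element that may be bytes, str, or an integer.

    Sketch only: integers have no len(), so a fixed size is assumed here.
    """
    if isinstance(element, int):
        return 8  # assumed size of an integer-encoded value
    return len(element)

print(element_length(b"abc"), element_length(12345678901))
```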

Reporting HTML template not included in PyPI download.

I suspect that's because it doesn't match anything in MANIFEST.in and so isn't included in the source distribution file.

Decode error while emitting redis protocol

Hello,
I have a hash in Redis:

127.0.0.1:6379> hvals shop:47
1) "{\"data\":{\"cost\":\"4490\",\"country\":\"\xd0\x9a\xd1\x80\xd0\xbe\xd1\x81\xd1\x81\xd0\xbe\xd0\xb2\xd0\xba\xd0\xb8 Zoom Breathe 2K11\",\"pict300\":\"files/images/bonsport/p300_NI068SH95GYY.jpg\",\"discount\":\"0\",\"old_price\":\"4490\",\"pict_mini\":\"files/images/bonsport/m_NI068SH95GYY.jpg\",\"url\":\"asd\",\"pict\":\"files/images/bonsport/p240_NI068SH95GYY.jpg\"},\"id\":\"ABC123\"}"

When I try to convert the rdb file to the Redis protocol

 sudo rdb --command protocol /var/lib/redis/redis.rdb > /tmp/1

I get this error:

sudo rdb --command protocol /var/lib/redis/redis.rdb > /tmp/1
Traceback (most recent call last):
  File "/usr/local/bin/rdb", line 9, in <module>
    load_entry_point('rdbtools==0.1.6', 'console_scripts', 'rdb')()
  File "/usr/local/lib/python2.7/dist-packages/rdbtools/cli/rdb.py", line 82, in main
    parser.parse(dump_file)
  File "/usr/local/lib/python2.7/dist-packages/rdbtools/parser.py", line 306, in parse
    self.read_object(f, data_type)
  File "/usr/local/lib/python2.7/dist-packages/rdbtools/parser.py", line 398, in read_object
    self._callback.hset(self._key, field, value)
  File "/usr/local/lib/python2.7/dist-packages/rdbtools/callbacks.py", line 318, in hset
    self.emit('HSET', key, field, value)
  File "/usr/local/lib/python2.7/dist-packages/rdbtools/callbacks.py", line 298, in emit
    self._out.write(u"$" + unicode(len(unicode(arg))) + u"\r\n")
UnicodeDecodeError: 'ascii' codec can't decode byte 0xd0 in position 34: ordinal not in range(128)

I don't know Python and I don't understand why this error is raised.
Is it a bug?
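The failing line mixes unicode and byte strings while writing the protocol. A byte-oriented emitter avoids any implicit ASCII decoding. The sketch below is written for Python 3 and is not the project's ProtocolCallback; the function name emit mirrors the traceback, but its signature here is an assumption.

```python
import io

def emit(out, *args):
    """Write one command in the Redis protocol to a binary stream."""
    out.write(b"*" + str(len(args)).encode("ascii") + b"\r\n")
    for arg in args:
        if isinstance(arg, str):
            arg = arg.encode("utf-8")
        # The length is counted in bytes, not characters.
        out.write(b"$" + str(len(arg)).encode("ascii") + b"\r\n")
        out.write(arg + b"\r\n")

buf = io.BytesIO()
emit(buf, b"HSET", b"shop:47", b"country", b"\xd0\x9a")
print(buf.getvalue())
```

Because the values are never decoded, non-ASCII bytes such as 0xd0 pass through unchanged.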

Can't report rdb file for redis 3.2

╭─sky@sky-linux ~  
╰─➤  rdb -c memory dump_unicat_6579.rdb > unicat.csv
Traceback (most recent call last):
  File "/usr/local/bin/rdb", line 9, in <module>
    load_entry_point('rdbtools==0.1.7', 'console_scripts', 'rdb')()
  File "/usr/local/lib/python2.7/dist-packages/rdbtools-0.1.7-py2.7.egg/rdbtools/cli/rdb.py", line 72, in main
    parser.parse(dump_file)
  File "/usr/local/lib/python2.7/dist-packages/rdbtools-0.1.7-py2.7.egg/rdbtools/parser.py", line 302, in parse
    self.parse_fd(open(filename, "rb"))
  File "/usr/local/lib/python2.7/dist-packages/rdbtools-0.1.7-py2.7.egg/rdbtools/parser.py", line 325, in parse_fd
    self._callback.end_database(db_number)
  File "/usr/local/lib/python2.7/dist-packages/rdbtools-0.1.7-py2.7.egg/rdbtools/memprofiler.py", line 153, in end_database
    self._stream.next_record(record)
  File "/usr/local/lib/python2.7/dist-packages/rdbtools-0.1.7-py2.7.egg/rdbtools/memprofiler.py", line 93, in next_record
    self._out.write("%d,%s,%s,%d,%s,%d,%d\n" % (record.database, record.type, encode_key(record.key), 
  File "/usr/local/lib/python2.7/dist-packages/rdbtools-0.1.7-py2.7.egg/rdbtools/callbacks.py", line 91, in encode_key
    return _encode(s, quote_numbers=True)
  File "/usr/local/lib/python2.7/dist-packages/rdbtools-0.1.7-py2.7.egg/rdbtools/callbacks.py", line 88, in _encode
    return _encode_basestring_ascii(s)
  File "/usr/local/lib/python2.7/dist-packages/rdbtools-0.1.7-py2.7.egg/rdbtools/callbacks.py", line 69, in _encode_basestring_ascii
    return '"' + str(ESCAPE_ASCII.sub(replace, s)) + '"'
TypeError: expected string or buffer

Converting rdb to protocol

Traceback (most recent call last):
  File "/usr/local/bin/rdb", line 9, in <module>
    load_entry_point('rdbtools==0.1.6', 'console_scripts', 'rdb')()
  File "/usr/local/lib/python2.7/dist-packages/rdbtools/cli/rdb.py", line 82, in main
    parser.parse(dump_file)
  File "/usr/local/lib/python2.7/dist-packages/rdbtools/parser.py", line 306, in parse
    self.read_object(f, data_type)
  File "/usr/local/lib/python2.7/dist-packages/rdbtools/parser.py", line 389, in read_object
    score = float(score)
ValueError: invalid literal for float(): 8US-08-554690:G:3647.07874343085132US-07-495050:G

big rdb

Hi, I have a huge rdb file, about 50G. Analyzing memory with this tool takes a very long time. Any advice?

UnicodeDecodeError: 'ascii' codec can't decode byte 0x80 in position 264: ordinal not in range(128)

rdb --command protocol dump.rdb
...
Traceback (most recent call last):
File "/usr/local/bin/rdb", line 9, in
load_entry_point('rdbtools==0.1.8', 'console_scripts', 'rdb')()
File "/usr/local/lib/python2.7/dist-packages/rdbtools/cli/rdb.py", line 82, in main
parser.parse(dump_file)
File "/usr/local/lib/python2.7/dist-packages/rdbtools/parser.py", line 293, in parse
self.parse_fd(open(filename, "rb"))
File "/usr/local/lib/python2.7/dist-packages/rdbtools/parser.py", line 346, in parse_fd
self.read_object(f, data_type)
File "/usr/local/lib/python2.7/dist-packages/rdbtools/parser.py", line 401, in read_object
self._callback.set(self._key, val, self._expiry, info={'encoding':'string'})
File "/usr/local/lib/python2.7/dist-packages/rdbtools/callbacks.py", line 310, in set
self.emit('SET', key, value)
File "/usr/local/lib/python2.7/dist-packages/rdbtools/callbacks.py", line 299, in emit
self._out.write(u"$" + unicode(len(unicode(arg))) + u"\r\n")
UnicodeDecodeError: 'ascii' codec can't decode byte 0x80 in position 264: ordinal not in range(128)

My OS is Ubuntu 14.04.4 64 bits.

I saw a few similar issues, such as #75, where you said this was fixed, but it seems not to be: unicode() is still there and I don't see str() replacing it. Maybe you fixed it in the code but not yet in the distributed package? Or did I install a stale version? I installed it with "pip install ..." as the README says.

Thanks.

regular expression doesn't match

--command json -k "tw_id:111.*" dump.rdb

matches everything including:
"tw_id:524869205:username"

it should be matching:
"tw_id:11123456:username"
"tw_id:111:username"
"tw_id:111999:username"
etc

No?
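As a sanity check, the pattern from the report can be tried directly against Python's re module. This sketch assumes the filter uses re.match semantics, which the report does not confirm; the key list is taken from the examples above.

```python
import re

pattern = re.compile("tw_id:111.*")

keys = [
    "tw_id:524869205:username",
    "tw_id:11123456:username",
    "tw_id:111:username",
    "tw_id:111999:username",
]

# re.match anchors at the start of the key.
matched = [k for k in keys if pattern.match(k)]
print(matched)
```

Under these assumptions the pattern rejects tw_id:524869205:username, which may suggest the problem lies in how the filter is applied, not in the expression itself.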

Include expiration in memory report

In my analysis, I want to distinguish between keys that are set to expire and keys that are permanent. So for this I need expiration time to be included in the memory report.

Report memory used by a given redis key using the dump command on a running redis server

See discussion on this mailing list https://groups.google.com/d/topic/redis-db/JaI-paZ0xoA/discussion

Verify checksum in rdb file

The dump file has an 8-byte CRC-64 checksum at the end of the file. The parser currently ignores the checksum. Instead, it should raise an exception if the checksum does not match.

Support RDB version 5

Redis 2.6 will introduce rdb version 5 - which adds an 8 byte checksum at the end of file. The checksum is based on CRC64.

As a first step, we can simply ignore the checksum.
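The first step described above, setting the checksum aside without verifying it, could look like the sketch below. The function name split_checksum is hypothetical, and the little-endian byte order is an assumption about how Redis writes the field; no CRC64 computation is attempted here.

```python
import struct

def split_checksum(data):
    """Split an RDB v5+ file image into (payload, checksum).

    Sketch only: assumes the last 8 bytes hold the checksum as a
    little-endian unsigned 64-bit integer.
    """
    if len(data) < 8:
        raise ValueError("file too short to contain a checksum")
    payload, tail = data[:-8], data[-8:]
    (checksum,) = struct.unpack("<Q", tail)
    return payload, checksum

payload, checksum = split_checksum(b"REDIS0005\xff" + b"\x01" + b"\x00" * 7)
print(payload, checksum)
```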

Support for RDB version 6

Hello,

Today the RDB version was bumped to 6. There are big changes in ziplist.c, which now contains three new encodings. I would love to see the changes in your specification and support in redis-rdb-tools.

You can find the details at the top of ziplist.c, I updated the comments inside the file to reflect the new implementation, but I'll be very happy to reply to any question if needed.

Thanks,
Salvatore

Support Sharding : Add a command to split dump.rdb

Sharding an existing redis database is a big pain. There isn't a way to do it without data loss. For example, exporting to json using any of the tools loses the TTL. There is also no good way to maintain set v/s list semantics. Writing a script to iterate over all keys is doable, but not appealing.

Using the rdbparser, we can split the dump file directly into several shards. This method will retain the data type, ttl, as well as the internal representation. It will likely be faster than existing methods, and should also be safer.

For flexibility, we should allow sharding by database, by key, by datatype, or any combination thereof.

Proposed API -

redis-shard dump.rdb config.json

dump.rdb is the input dump file
config.json is a configuration file that declares how to shard the data
Any keys not matching the configuration file will be moved to default-shard.rdb

Sample config file :

[
    {
        "shard-name" : "shard1",
        "db" : [0],
        "keys" : ["user:1.*", "user-friends:1.*"],
        "data-type" : ["dict", "list", "set"]
    },
    {
        "shard-name" : "shard2",
        "keys" : ["user:2.*", "user-friends:2.*"]
    }
]
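How such a configuration might be applied to a single key is sketched below. Nothing here exists in rdbtools: the function pick_shard, the rule that a missing field matches everything, and the first-match-wins order are all assumptions made for illustration.

```python
import re

CONFIG = [
    {"shard-name": "shard1", "db": [0],
     "keys": ["user:1.*", "user-friends:1.*"],
     "data-type": ["dict", "list", "set"]},
    {"shard-name": "shard2",
     "keys": ["user:2.*", "user-friends:2.*"]},
]

def pick_shard(config, db, key, data_type):
    """Return the name of the first shard whose rules all match."""
    for shard in config:
        # A field that is absent from the shard places no restriction.
        if "db" in shard and db not in shard["db"]:
            continue
        if "data-type" in shard and data_type not in shard["data-type"]:
            continue
        if "keys" in shard and not any(re.match(p, key) for p in shard["keys"]):
            continue
        return shard["shard-name"]
    # Keys that match no shard go to the default shard.
    return "default-shard"

print(pick_shard(CONFIG, 0, "user:123", "dict"))
print(pick_shard(CONFIG, 1, "user:234", "list"))
print(pick_shard(CONFIG, 1, "session:9", "set"))
```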

Add docs for redis-profiler command - I looked around and couldn't find any.

Is this just an internal part of the rdb command? If so, why is it exposed in the CLI?

'ascii' codec can't decode

Use command
rdb --command protocol ~/Downloads/842a7322da7611e4-2016-04-06-11-28-00.rdb

Traceback (most recent call last):
  File "/usr/local/bin/rdb", line 9, in <module>
    load_entry_point('rdbtools==0.1.6', 'console_scripts', 'rdb')()
  File "/Library/Python/2.7/site-packages/rdbtools/cli/rdb.py", line 82, in main
    parser.parse(dump_file)
  File "/Library/Python/2.7/site-packages/rdbtools/parser.py", line 306, in parse
    self.read_object(f, data_type)
  File "/Library/Python/2.7/site-packages/rdbtools/parser.py", line 369, in read_object
    self._callback.rpush(self._key, val)
  File "/Library/Python/2.7/site-packages/rdbtools/callbacks.py", line 340, in rpush
    self.emit('RPUSH', key, value)
  File "/Library/Python/2.7/site-packages/rdbtools/callbacks.py", line 298, in emit
    self._out.write(u"$" + unicode(len(unicode(arg))) + u"\r\n")
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe5 in position 9: ordinal not in range(128)

Error while parsing my RDB file.

redis_version: 2.4.14

Traceback (most recent call last):
File "/usr/local/bin/rdb", line 9, in <module>
  load_entry_point('rdbtools==0.1.3', 'console_scripts', 'rdb')()
File "/usr/local/lib/python2.7/dist-packages/rdbtools-0.1.3-py2.7.egg/rdbtools/cli/rdb.py", line 79, in main
  parser.parse(dump_file)
File "/usr/local/lib/python2.7/dist-packages/rdbtools-0.1.3-py2.7.egg/rdbtools/parser.py", line 287, in parse
  self._expiry = to_datetime(read_unsigned_int(f) * 1000000)
File "/usr/local/lib/python2.7/dist-packages/rdbtools-0.1.3-py2.7.egg/rdbtools/parser.py", line 705, in to_datetime
  dt = datetime.datetime.utcfromtimestamp(seconds_since_epoch)
ValueError: timestamp out of range for platform time_t

Pypi

Can you add this to pypi?

Need Support for Python 3.3

Here's the error message:

$ python setup.py build
Traceback (most recent call last):
File "setup.py", line 3, in <module>
from rdbtools import __version__
File "c:\Users\yupan\code\redis-rdb-tools\rdbtools\__init__.py", line 2, in <module>
from rdbtools.callbacks import JSONCallback, DiffCallback, ProtocolCallback
File "c:\Users\yupan\code\redis-rdb-tools\rdbtools\callbacks.py", line 8
ESCAPE = re.compile(ur'[\x00-\x1f"\b\f\n\r\t\u2028\u2029]')
^
SyntaxError: invalid syntax
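The syntax error comes from the ur'' string prefix, which Python 3 does not accept. One way to write an equivalent pattern that parses on Python 3.3 and later is sketched below; it is a suggestion, not the project's fix. The regex escapes are double-escaped so the re module interprets them, while the two Unicode separators are placed in the pattern as literal characters.

```python
import re

# Same character class as the failing line, without the ur'' prefix.
# Inside a character class, \b means backspace.
ESCAPE = re.compile(u'[\\x00-\\x1f"\\b\\f\\n\\r\\t\u2028\u2029]')

print(bool(ESCAPE.search(u"line\u2028separator")))
print(bool(ESCAPE.search(u"plain text")))
```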

Memory Dump for very large RDB files (> 30 GBs) is Slow

For very large RDB files, the memory dump can take upwards of 30 minutes. The "key" feature is even slower, because it requires a sequential scan over the whole file.

Finally, I am trying to further introspect a data structure such as a hash, list, or set to find out which field takes up the most memory. In my case I use celery as a worker queue, and some tasks can be gigantic.

So I've made some enhancements, such as the following:
i) Reduce time to about 5 minutes to dump in quick mode
ii) Allow re-seeking for key contents in seconds, and limit mode
iii) Allow for verbose dumping of hash/list/set to file structure

Parser performance not optimal ~1min for a 24MB file

Profiler output for a 24MB dump.rdb file

     44009161 function calls (44008966 primitive calls) in 205.628 CPU seconds

Ordered by: standard name

ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 205.628 205.628 :1()
1 0.000 0.000 0.000 0.000 :1(DecimalTuple)
10 0.000 0.000 0.000 0.000 StringIO.py:119(read)
1 0.000 0.000 0.000 0.000 StringIO.py:30()
10 0.000 0.000 0.000 0.000 StringIO.py:38(_complain_ifclosed)
1 0.000 0.000 0.000 0.000 StringIO.py:42(StringIO)
2 0.000 0.000 0.000 0.000 StringIO.py:54(init)
2 0.000 0.000 0.000 0.000 UserDict.py:17(getitem)
1 0.000 0.000 0.000 0.000 UserDict.py:57(get)
1 0.000 0.000 0.000 0.000 UserDict.py:69(contains)
1 0.000 0.000 0.000 0.000 future.py:48()
1 0.000 0.000 0.000 0.000 future.py:74(_Feature)
7 0.000 0.000 0.000 0.000 future.py:75(init)
1 0.046 0.046 0.297 0.297 init.py:1()
1 0.000 0.000 0.000 0.000 init.py:49(normalize_encoding)
1 0.000 0.000 0.013 0.013 init.py:71(search_function)
9/5 0.000 0.000 0.000 0.000 abc.py:137(subclasscheck)
37 0.000 0.000 0.000 0.000 abc.py:7(abstractmethod)
19 0.001 0.000 0.003 0.000 abc.py:78(new)
60 0.000 0.000 0.001 0.000 abc.py:81()
5 0.000 0.000 0.000 0.000 abc.py:97(register)
1 0.060 0.060 0.173 0.173 callbacks.py:1()
1 0.000 0.000 0.000 0.000 callbacks.py:194(DiffCallback)
1 0.020 0.020 0.033 0.033 callbacks.py:26(_floatconstants)
1 0.000 0.000 0.000 0.000 callbacks.py:264(MemoryCallback)
1 0.000 0.000 0.000 0.000 callbacks.py:269(init)
1 0.000 0.000 0.000 0.000 callbacks.py:279(start_rdb)
1 0.000 0.000 0.000 0.000 callbacks.py:282(start_database)
1 0.000 0.000 0.000 0.000 callbacks.py:285(end_database)
1 0.000 0.000 0.000 0.000 callbacks.py:288(end_rdb)
1000001 24.914 0.000 110.527 0.000 callbacks.py:291(set)
2 0.000 0.000 0.000 0.000 callbacks.py:299(start_hash)
2 0.000 0.000 0.000 0.000 callbacks.py:323(start_set)
6 0.000 0.000 0.000 0.000 callbacks.py:327(sadd)
2 0.000 0.000 0.000 0.000 callbacks.py:332(end_set)
1000003 3.987 0.000 9.397 0.000 callbacks.py:383(end_key)
1000003 3.457 0.000 5.409 0.000 callbacks.py:388(newline)
2000004 21.586 0.000 24.588 0.000 callbacks.py:391(sizeof_string)
1000003 5.104 0.000 11.793 0.000 callbacks.py:409(top_level_object_overhead)
1000003 1.560 0.000 1.560 0.000 callbacks.py:416(key_expiry_overhead)
1000003 3.552 0.000 5.197 0.000 callbacks.py:433(hashtable_entry_overhead)
1000003 9.298 0.000 19.548 0.000 callbacks.py:45(_encode_basestring_ascii)
2000006 3.137 0.000 3.137 0.000 callbacks.py:454(sizeof_pointer)
1000003 7.538 0.000 30.889 0.000 callbacks.py:72(_encode)
1000003 3.676 0.000 34.565 0.000 callbacks.py:91(encode_key)
1 0.000 0.000 0.000 0.000 callbacks.py:97(JSONCallback)
1 0.000 0.000 0.000 0.000 codecs.py:77(new)
1 0.004 0.004 0.004 0.004 collections.py:1()
1 0.001 0.001 0.001 0.001 collections.py:13(namedtuple)
34 0.000 0.000 0.000 0.000 collections.py:43()
4 0.000 0.000 0.000 0.000 collections.py:60()
4 0.000 0.000 0.000 0.000 collections.py:61()
6 0.000 0.000 0.000 0.000 copy.py:100(_copy_immutable)
6 0.000 0.000 0.000 0.000 copy.py:65(copy)
1 0.002 0.002 0.068 0.068 decimal.py:116()
1 0.000 0.000 0.000 0.000 decimal.py:158(DecimalException)
1 0.000 0.000 0.000 0.000 decimal.py:181(Clamped)
1 0.000 0.000 0.000 0.000 decimal.py:193(InvalidOperation)
1 0.000 0.000 0.000 0.000 decimal.py:222(ConversionSyntax)
1 0.000 0.000 0.000 0.000 decimal.py:232(DivisionByZero)
1 0.000 0.000 0.000 0.000 decimal.py:248(DivisionImpossible)
1 0.000 0.000 0.000 0.000 decimal.py:259(DivisionUndefined)
1 0.000 0.000 0.000 0.000 decimal.py:270(Inexact)
1 0.000 0.000 0.000 0.000 decimal.py:282(InvalidContext)
1 0.000 0.000 0.000 0.000 decimal.py:296(Rounded)
1 0.000 0.000 0.000 0.000 decimal.py:308(Subnormal)
1 0.000 0.000 0.000 0.000 decimal.py:319(Overflow)
1 0.000 0.000 0.000 0.000 decimal.py:357(Underflow)
1 0.000 0.000 0.000 0.000 decimal.py:3611(_ContextManager)
1 0.000 0.000 0.000 0.000 decimal.py:3626(Context)
3 0.000 0.000 0.000 0.000 decimal.py:3645(init)
1 0.000 0.000 0.000 0.000 decimal.py:4925(_WorkRep)
1 0.000 0.000 0.000 0.000 decimal.py:503(Decimal)
6 0.000 0.000 0.021 0.003 decimal.py:512(new)
1 0.000 0.000 0.000 0.000 decimal.py:5158(_Log10Memoize)
1 0.000 0.000 0.000 0.000 decimal.py:5162(init)
8 0.000 0.000 0.000 0.000 genericpath.py:15(exists)
3 0.000 0.000 0.000 0.000 gettext.py:130(_expand_lang)
1 0.000 0.000 0.001 0.001 gettext.py:421(find)
1 0.000 0.000 0.001 0.001 gettext.py:476(translation)
1 0.000 0.000 0.001 0.001 gettext.py:542(dgettext)
1 0.000 0.000 0.001 0.001 gettext.py:580(gettext)
1 0.000 0.000 0.000 0.000 hex_codec.py:27(hex_decode)
1 0.000 0.000 0.000 0.000 hex_codec.py:45(Codec)
1 0.000 0.000 0.000 0.000 hex_codec.py:52(IncrementalEncoder)
1 0.000 0.000 0.000 0.000 hex_codec.py:57(IncrementalDecoder)
1 0.000 0.000 0.000 0.000 hex_codec.py:62(StreamWriter)
1 0.000 0.000 0.000 0.000 hex_codec.py:65(StreamReader)
1 0.000 0.000 0.000 0.000 hex_codec.py:70(getregentry)
1 0.000 0.000 0.000 0.000 hex_codec.py:8()
1 0.000 0.000 0.000 0.000 io.py:1030(BufferedWriter)
1 0.000 0.000 0.000 0.000 io.py:1117(BufferedRWPair)
1 0.000 0.000 0.000 0.000 io.py:1183(BufferedRandom)
1 0.000 0.000 0.000 0.000 io.py:1247(TextIOBase)
1 0.000 0.000 0.000 0.000 io.py:1295(IncrementalNewlineDecoder)
1 0.000 0.000 0.000 0.000 io.py:1371(TextIOWrapper)
1 0.000 0.000 0.000 0.000 io.py:1850(StringIO)
1 0.000 0.000 0.000 0.000 io.py:267(_DocDescriptor)
1 0.000 0.000 0.000 0.000 io.py:276(OpenWrapper)
1 0.000 0.000 0.000 0.000 io.py:290(UnsupportedOperation)
1 0.000 0.000 0.000 0.000 io.py:294(IOBase)
1 0.028 0.028 0.036 0.036 io.py:35()
1 0.000 0.000 0.000 0.000 io.py:566(RawIOBase)
1 0.000 0.000 0.000 0.000 io.py:621(FileIO)
1 0.000 0.000 0.000 0.000 io.py:643(BufferedIOBase)
1 0.000 0.000 0.000 0.000 io.py:715(_BufferedIOMixin)
1 0.000 0.000 0.000 0.000 io.py:72(BlockingIOError)
1 0.000 0.000 0.000 0.000 io.py:792(_BytesIO)
1 0.000 0.000 0.000 0.000 io.py:898(BytesIO)
1 0.000 0.000 0.000 0.000 io.py:905(BufferedReader)
1 0.000 0.000 0.000 0.000 keyword.py:11()
3 0.000 0.000 0.000 0.000 locale.py:316(normalize)
1 0.000 0.000 0.000 0.000 numbers.py:13(Number)
1 0.000 0.000 0.000 0.000 numbers.py:169(Real)
1 0.000 0.000 0.000 0.000 numbers.py:270(Rational)
1 0.000 0.000 0.000 0.000 numbers.py:295(Integral)
1 0.000 0.000 0.000 0.000 numbers.py:34(Complex)
1 0.000 0.000 0.002 0.002 numbers.py:6()
6 0.000 0.000 0.001 0.000 optparse.py:1007(add_option)
1 0.000 0.000 0.001 0.001 optparse.py:1185(init)
1 0.000 0.000 0.000 0.000 optparse.py:1237(_create_option_list)
1 0.000 0.000 0.001 0.001 optparse.py:1242(_add_help_option)
1 0.000 0.000 0.001 0.001 optparse.py:1252(_populate_option_list)
1 0.000 0.000 0.000 0.000 optparse.py:1262(_init_parsing_state)
1 0.000 0.000 0.000 0.000 optparse.py:1271(set_usage)
1 0.000 0.000 0.000 0.000 optparse.py:1307(_get_all_options)
1 0.000 0.000 0.000 0.000 optparse.py:1313(get_default_values)
1 0.000 0.000 0.000 0.000 optparse.py:1356(_get_args)
1 0.000 0.000 0.000 0.000 optparse.py:1362(parse_args)
1 0.000 0.000 0.000 0.000 optparse.py:1401(check_values)
1 0.000 0.000 0.000 0.000 optparse.py:1414(_process_args)
2 0.000 0.000 0.000 0.000 optparse.py:1511(_process_short_opts)
1 0.000 0.000 0.000 0.000 optparse.py:200(init)
1 0.000 0.000 0.000 0.000 optparse.py:224(set_parser)
1 0.000 0.000 0.000 0.000 optparse.py:365(init)
6 0.000 0.000 0.001 0.000 optparse.py:560(init)
6 0.000 0.000 0.000 0.000 optparse.py:579(_check_opt_strings)
6 0.000 0.000 0.000 0.000 optparse.py:588(_set_opt_strings)
6 0.000 0.000 0.000 0.000 optparse.py:609(_set_attrs)
6 0.000 0.000 0.000 0.000 optparse.py:629(_check_action)
6 0.000 0.000 0.000 0.000 optparse.py:635(_check_type)
6 0.000 0.000 0.000 0.000 optparse.py:665(_check_choice)
6 0.000 0.000 0.000 0.000 optparse.py:678(_check_dest)
6 0.000 0.000 0.000 0.000 optparse.py:693(_check_const)
6 0.000 0.000 0.000 0.000 optparse.py:699(_check_nargs)
6 0.000 0.000 0.000 0.000 optparse.py:708(_check_callback)
2 0.000 0.000 0.000 0.000 optparse.py:752(takes_value)
2 0.000 0.000 0.000 0.000 optparse.py:764(check_value)
2 0.000 0.000 0.000 0.000 optparse.py:771(convert_value)
2 0.000 0.000 0.000 0.000 optparse.py:778(process)
2 0.000 0.000 0.000 0.000 optparse.py:790(take_action)
6 0.000 0.000 0.000 0.000 optparse.py:832(isbasestring)
1 0.000 0.000 0.000 0.000 optparse.py:837(init)
1 0.000 0.000 0.000 0.000 optparse.py:932(init)
1 0.000 0.000 0.000 0.000 optparse.py:943(_create_option_mappings)
1 0.000 0.000 0.000 0.000 optparse.py:959(set_conflict_handler)
1 0.000 0.000 0.000 0.000 optparse.py:964(set_description)
6 0.000 0.000 0.000 0.000 optparse.py:980(_check_conflict)
1 0.042 0.042 0.078 0.078 parser.py:1()
1 0.000 0.000 0.000 0.000 parser.py:239(RdbParser)
1 0.000 0.000 0.000 0.000 parser.py:258(init)
1 11.802 11.802 205.146 205.146 parser.py:267(parse)
2000007 12.386 0.000 33.266 0.000 parser.py:312(read_length_with_encoding)
1 0.000 0.000 0.000 0.000 parser.py:330(read_length)
2000006 11.267 0.000 51.153 0.000 parser.py:333(read_string)
1000003 7.220 0.000 141.652 0.000 parser.py:356(read_object)
1 0.000 0.000 0.000 0.000 parser.py:42(RdbCallback)
2 0.000 0.000 0.000 0.000 parser.py:466(read_intset)
1 0.000 0.000 0.000 0.000 parser.py:602(verify_magic_string)
1 0.000 0.000 0.000 0.000 parser.py:606(verify_version)
1 0.000 0.000 0.000 0.000 parser.py:611(init_filter)
2000006 9.695 0.000 14.738 0.000 parser.py:639(matches_filter)
1000003 1.779 0.000 1.779 0.000 parser.py:649(get_logical_type)
3000012 15.629 0.000 27.115 0.000 parser.py:710(read_unsigned_char)
6 0.000 0.000 0.000 0.000 parser.py:716(read_unsigned_short)
4 0.000 0.000 0.000 0.000 parser.py:722(read_unsigned_int)
1 0.000 0.000 0.000 0.000 parser.py:739(DebugCallback)
8 0.000 0.000 0.000 0.000 posixpath.py:59(join)
1 0.027 0.027 205.613 205.613 rdb:2()
1 0.001 0.001 205.288 205.288 rdb:8(main)
10 0.000 0.000 0.055 0.005 re.py:188(compile)
10 0.000 0.000 0.054 0.005 re.py:229(_compile)
19 0.000 0.000 0.032 0.002 sre_compile.py:184(_compile_charset)
19 0.001 0.000 0.031 0.002 sre_compile.py:213(_optimize_charset)
75 0.000 0.000 0.000 0.000 sre_compile.py:24(_identityfunction)
8 0.001 0.000 0.001 0.000 sre_compile.py:264(_mk_bitmap)
2 0.006 0.003 0.009 0.004 sre_compile.py:307(_optimize_unicode)
22 0.000 0.000 0.000 0.000 sre_compile.py:360(_simple)
10 0.000 0.000 0.007 0.001 sre_compile.py:367(_compile_info)
64/10 0.002 0.000 0.029 0.003 sre_compile.py:38(_compile)
20 0.000 0.000 0.000 0.000 sre_compile.py:480(isstring)
10 0.000 0.000 0.036 0.004 sre_compile.py:486(_code)
10 0.000 0.000 0.054 0.005 sre_compile.py:501(compile)
8 0.000 0.000 0.021 0.003 sre_compile.py:57(fixup)
104 0.000 0.000 0.003 0.000 sre_parse.py:132(len)
226 0.001 0.000 0.001 0.000 sre_parse.py:136(getitem)
22 0.000 0.000 0.000 0.000 sre_parse.py:140(setitem)
96 0.000 0.000 0.000 0.000 sre_parse.py:144(append)
81/32 0.001 0.000 0.001 0.000 sre_parse.py:146(getwidth)
10 0.000 0.000 0.000 0.000 sre_parse.py:184(init)
945 0.004 0.000 0.006 0.000 sre_parse.py:188(next)
209 0.001 0.000 0.001 0.000 sre_parse.py:201(match)
857 0.002 0.000 0.008 0.000 sre_parse.py:207(get)
69 0.000 0.000 0.000 0.000 sre_parse.py:216(isident)
13 0.000 0.000 0.000 0.000 sre_parse.py:222(isname)
12 0.000 0.000 0.000 0.000 sre_parse.py:231(_class_escape)
14 0.000 0.000 0.000 0.000 sre_parse.py:263(_escape)
33/10 0.000 0.000 0.018 0.002 sre_parse.py:307(_parse_sub)
38/10 0.003 0.000 0.017 0.002 sre_parse.py:385(_parse)
10 0.000 0.000 0.018 0.002 sre_parse.py:669(parse)
10 0.000 0.000 0.000 0.000 sre_parse.py:73(__init)
18 0.000 0.000 0.000 0.000 sre_parse.py:78(opengroup)
18 0.000 0.000 0.000 0.000 sre_parse.py:89(closegroup)
64 0.000 0.000 0.000 0.000 sre_parse.py:96(init)
1 0.001 0.001 0.006 0.006 threading.py:1()
2 0.000 0.000 0.000 0.000 threading.py:176(Condition)
1 0.000 0.000 0.000 0.000 threading.py:179(_Condition)
2 0.000 0.000 0.000 0.000 threading.py:181(init)
1 0.000 0.000 0.000 0.000 threading.py:221(_is_owned)
1 0.000 0.000 0.000 0.000 threading.py:272(notify)
1 0.000 0.000 0.000 0.000 threading.py:290(notifyAll)
1 0.000 0.000 0.000 0.000 threading.py:299(_Semaphore)
1 0.000 0.000 0.000 0.000 threading.py:347(_BoundedSemaphore)
1 0.000 0.000 0.000 0.000 threading.py:359(Event)
1 0.000 0.000 0.000 0.000 threading.py:362(_Event)
1 0.000 0.000 0.000 0.000 threading.py:366(init)
1 0.000 0.000 0.000 0.000 threading.py:376(set)
1 0.000 0.000 0.000 0.000 threading.py:414(Thread)
1 0.000 0.000 0.000 0.000 threading.py:426(init)
1 0.000 0.000 0.000 0.000 threading.py:510(_set_ident)
1 0.000 0.000 0.000 0.000 threading.py:57(_Verbose)
4 0.000 0.000 0.000 0.000 threading.py:59(init)
1 0.000 0.000 0.000 0.000 threading.py:64(_note)
1 0.000 0.000 0.000 0.000 threading.py:713(_Timer)
1 0.000 0.000 0.000 0.000 threading.py:742(_MainThread)
1 0.000 0.000 0.000 0.000 threading.py:744(init)
1 0.000 0.000 0.000 0.000 threading.py:752(_set_daemon)
1 0.000 0.000 0.000 0.000 threading.py:783(_DummyThread)
1 0.000 0.000 0.000 0.000 threading.py:99(_RLock)
1 0.000 0.000 0.000 0.000 traceback.py:1()
1 0.000 0.000 0.001 0.001 warnings.py:45(filterwarnings)
1 0.012 0.012 0.013 0.013 {import}
10 0.000 0.000 0.000 0.000 {_sre.compile}
35 0.021 0.001 0.021 0.001 {_sre.getlower}
3000023 4.948 0.000 4.948 0.000 {_struct.unpack}
3 0.000 0.000 0.000 0.000 {abs}
4 0.000 0.000 0.000 0.000 {all}
1 0.000 0.000 0.000 0.000 {binascii.a2b_hex}
26 0.001 0.000 0.001 0.000 {built-in method new of type object at 0x82e5e0}
3 0.000 0.000 0.000 0.000 {built-in method acquire}
10 0.000 0.000 0.000 0.000 {built-in method group}
1000006 3.285 0.000 3.285 0.000 {built-in method match}
2 0.000 0.000 0.000 0.000 {built-in method release}
1000003 2.831 0.000 2.831 0.000 {built-in method search}
1000003 5.982 0.000 5.982 0.000 {built-in method sub}
32 0.000 0.000 0.000 0.000 {chr}
1 0.015 0.015 205.628 205.628 {execfile}
6 0.000 0.000 0.000 0.000 {filter}
400 0.001 0.000 0.001 0.000 {getattr}
8 0.000 0.000 0.000 0.000 {globals}
4 0.000 0.000 0.000 0.000 {hasattr}
3000302 5.241 0.000 5.241 0.000 {isinstance}
20/11 0.000 0.000 0.000 0.000 {issubclass}
2002286/2002258 3.008 0.000 3.008 0.000 {len}
4 0.000 0.000 0.000 0.000 {locals}
2 0.000 0.000 0.000 0.000 {map}
7 0.000 0.000 0.000 0.000 {max}
4 0.000 0.000 0.000 0.000 {method '__contains__' of 'frozenset' objects}
2 0.000 0.000 0.000 0.000 {method '__enter__' of 'file' objects}
9 0.000 0.000 0.000 0.000 {method '__subclasses__' of 'type' objects}
9 0.000 0.000 0.000 0.000 {method '__subclasshook__' of 'object' objects}
73 0.000 0.000 0.000 0.000 {method 'add' of 'set' objects}
2000814 3.395 0.000 3.395 0.000 {method 'append' of 'list' objects}
1 0.000 0.000 0.000 0.000 {method 'copy' of 'dict' objects}
1 0.000 0.000 0.013 0.013 {method 'decode' of 'str' objects}
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
24 0.000 0.000 0.000 0.000 {method 'endswith' of 'str' objects}
8 0.000 0.000 0.000 0.000 {method 'extend' of 'list' objects}
9 0.000 0.000 0.000 0.000 {method 'find' of 'str' objects}
85 0.000 0.000 0.000 0.000 {method 'get' of 'dict' objects}
1 0.000 0.000 0.000 0.000 {method 'insert' of 'list' objects}
30 0.000 0.000 0.000 0.000 {method 'isalnum' of 'str' objects}
4 0.000 0.000 0.000 0.000 {method 'isdigit' of 'str' objects}
33 0.000 0.000 0.000 0.000 {method 'items' of 'dict' objects}
3 0.000 0.000 0.000 0.000 {method 'join' of 'str' objects}
1 0.000 0.000 0.000 0.000 {method 'keys' of 'dictproxy' objects}
4 0.000 0.000 0.000 0.000 {method 'lower' of 'str' objects}
1 0.000 0.000 0.000 0.000 {method 'lstrip' of 'str' objects}
4 0.000 0.000 0.000 0.000 {method 'pop' of 'list' objects}
5000020 13.233 0.000 13.233 0.000 {method 'read' of 'file' objects}
18 0.000 0.000 0.000 0.000 {method 'remove' of 'list' objects}
8 0.000 0.000 0.000 0.000 {method 'replace' of 'str' objects}
3 0.000 0.000 0.000 0.000 {method 'reverse' of 'list' objects}
544 0.002 0.000 0.002 0.000 {method 'setdefault' of 'dict' objects}
2 0.000 0.000 0.000 0.000 {method 'setter' of 'property' objects}
6 0.000 0.000 0.000 0.000 {method 'split' of 'str' objects}
156 0.000 0.000 0.000 0.000 {method 'startswith' of 'str' objects}
3 0.000 0.000 0.000 0.000 {method 'strip' of 'str' objects}
2 0.000 0.000 0.000 0.000 {method 'tolist' of 'array.array' objects}
2 0.000 0.000 0.000 0.000 {method 'tostring' of 'array.array' objects}
1 0.000 0.000 0.000 0.000 {method 'translate' of 'str' objects}
8 0.000 0.000 0.000 0.000 {method 'upper' of 'str' objects}
2000006 5.662 0.000 5.662 0.000 {method 'write' of 'file' objects}
135 0.000 0.000 0.000 0.000 {min}
2 0.139 0.069 0.139 0.069 {open}
68 0.000 0.000 0.000 0.000 {ord}
8 0.000 0.000 0.000 0.000 {posix.stat}
9 0.000 0.000 0.000 0.000 {range}
1 0.000 0.000 0.000 0.000 {repr}
109 0.000 0.000 0.000 0.000 {setattr}
1 0.000 0.000 0.000 0.000 {sys._getframe}
3 0.000 0.000 0.000 0.000 {thread.allocate_lock}
2 0.000 0.000 0.000 0.000 {thread.get_ident}

Don't convert anything if regex doesn't match

I have a Redis DB with some hashes whose keys look like user:abc.
Doing a dump like this:

rdb --command json --key "user-nothing:.*" -f user.json dump.rdb

or like this:

rdb --command json --type hash --key "user-nothing:.*" -f user.json dump.rdb

fills the JSON file with all the data, even though nothing should have matched.

What am I missing?
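
For reference, this is the behaviour one would expect from a regex key filter. It is only a sketch, not rdbtools' own code: `filter_keys` and the sample keys are made up for illustration, and it assumes the pattern is applied with `re.match`, which anchors at the start of the key.

```python
import re


def filter_keys(keys, pattern):
    """Return only the keys whose beginning matches the regex `pattern`.

    Hypothetical helper for illustration; not part of rdbtools.
    """
    regex = re.compile(pattern)
    return [key for key in keys if regex.match(key)]


if __name__ == "__main__":
    sample_keys = ["user:abc", "user:def", "session:1"]
    # No key starts with "user-nothing:", so nothing should be exported.
    print(filter_keys(sample_keys, "user-nothing:.*"))
    # A pattern that does match selects only the user hashes.
    print(filter_keys(sample_keys, "user:.*"))
```

With a filter like this, a pattern that matches no key yields an empty result, which is what the commands above were expected to produce.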

Add command line option to escape stdout

When using the command line tool with RDBs that contain Unicode or binary data, it can be useful to escape the printed bytes. Printing binary data or unsupported Unicode code points can be confusing, and can even corrupt other printed output or trigger terminal actions.

I suggest adding an --escape=ENC parameter with the options raw (default), non-ascii, and non-utf8.

raw - emit raw bytes, as is done now.
non-ascii - escape any non-ASCII byte.
non-utf8 - escape bytes that cannot be read as part of a UTF-8 encoded code point.
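
A minimal sketch of how the three proposed modes could behave. The function name `escape_bytes` is hypothetical and the `\xNN` escape syntax is an assumption, since the proposal does not specify one. It relies on Python 3's `backslashreplace` error handler for decoding.

```python
def escape_bytes(data: bytes, mode: str = "raw") -> bytes:
    """Escape `data` according to one of the proposed --escape modes.

    Hypothetical helper: the mode names come from the proposal above,
    the \\xNN output format is an assumption.
    """
    if mode == "raw":
        # Emit the bytes untouched.
        return data
    if mode == "non-ascii":
        # Every byte >= 0x80 fails ASCII decoding and becomes \xNN.
        return data.decode("ascii", errors="backslashreplace").encode("ascii")
    if mode == "non-utf8":
        # Valid UTF-8 sequences pass through; stray bytes become \xNN.
        return data.decode("utf-8", errors="backslashreplace").encode("utf-8")
    raise ValueError("unknown escape mode: %r" % mode)


if __name__ == "__main__":
    value = b"caf\xc3\xa9 \xff"  # valid UTF-8 for 'cafe' with accent, then a stray 0xff
    for mode in ("raw", "non-ascii", "non-utf8"):
        print(mode, escape_bytes(value, mode))
```

Note that this sketch does not escape backslashes already present in the data, so the output cannot always be reversed unambiguously; a real implementation would need to decide how to handle that.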

I ran into a problem parsing an RDB file.

Hi, I like your RDB parser tools, they are very good!

[jun.zeng@YZSJHL21-221 redisbinlog]$ ~/redis/redis-2.4.16/src/redis-cli -p 63790
redis 127.0.0.1:63790> keys *

"page"
redis 127.0.0.1:63790> ZRANGE page 0 -1
"renren"
"sina"
"qq"
"taobao"

This is my Redis content. I checked its RDB file:
[jun.zeng@YZSJHL21-221 redisbinlog]$ od -c dump_63790_0.rdb
0000000 R E D I S 0 0 0 2 0 \f 004 p a g
0000020 e @ D @ U 004 U \0 \0 \0 P 003 006 \b
0000040 \0 \0 006 r e n 002 005 \b 022 1 . 1 0
0000060 004 \0 \v 1 024 004 s i n a 006 022 2 . 2
0000100 004 030 026 0 2 024 002 q q 004 5 \0 004 006 t
0000120 a o b a o \b 6 \0
0000133

[jun.zeng@YZSJHL21-221 redisbinlog]$ ./test
0x52 0x45 0x44 0x49 0x53 0x30 0x30 0x30 0x32 0xfffffffe 0x0 0xc 0x4 0x70 0x61 0x67 0x65 0xffffffc3 0x40 0x44 0x40 0x55 0x4 0x55 0x0 0x0 0x0 0x50 0x20 0x3 0x6 0x8 0x0 0x0 0x6 0x72 0x65 0x6e 0x20 0x2 0x5 0x8 0x12 0x31 0x2e 0x31 0x30 0xffffffe0 0x4 0x0 0xb 0x31 0x14 0x4 0x73 0x69 0x6e 0x61 0x6 0x12 0x32 0x2e 0x32 0xffffffe0 0x4 0x18 0x16 0x30 0x32 0x14 0x2 0x71 0x71 0x4 0xffffffc0 0x5 0x0 0x4 0x6 0x74 0x61 0x6f 0x62 0x61 0x6f 0x8 0xffffffc0 0x6 0x0 0xffffffff 0xffffffff
0x52 0x45 0x44 0x49 0x53 is "REDIS"
0x30 0x30 0x30 0x32 is the version
0xfffffffe 0x0 means the database number is 0
0xc means the type is a zset in ziplist encoding
0x4 0x70 0x61 0x67 0x65 is the key "page"
According to your program, 0xffffffc3 0x40 0x44 0x40 is the zlbytes?
0x55 0x4 0x55 0x0 is the tail?
0x0 0x0 is the length? 0?

My email is [email protected]
Please help me parse this file, thank you!
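
One possible reading of these bytes, offered as a sketch and not as the parser's actual code: in the RDB length encoding, the top two bits of 0xc3 are 11, which marks a special encoding, and the low six bits are 3, which means an LZF-compressed string. Under that reading, 0x40 0x44 is the compressed length (68) and 0x40 0x55 is the uncompressed length (85), so zlbytes, zltail and zllen only become visible after decompression. The 0xffffff.. prefixes in the ./test output look like sign extension from printing signed chars. The helper names below (`read_length`, `lzf_decompress`, `decode_value`) are made up for this example, and the bytes are copied from the dump above.

```python
import struct

# Value bytes copied from the ./test output above, with the sign extension
# (0xffffffc3 -> 0xc3) removed. The final RDB end-of-file 0xff is left out.
VALUE = bytes.fromhex(
    "c3 40 44 40 55"
    " 04 55 00 00 00 50 20 03 06 08 00 00 06 72 65 6e 20 02 05 08"
    " 12 31 2e 31 30 e0 04 00 0b 31 14 04 73 69 6e 61 06 12 32 2e"
    " 32 e0 04 18 16 30 32 14 02 71 71 04 c0 05 00 04 06 74 61 6f"
    " 62 61 6f 08 c0 06 00 ff"
)


def read_length(buf, pos):
    """Decode one RDB length; returns (value, is_special_encoding, next_pos)."""
    first = buf[pos]
    kind = first >> 6
    if kind == 0:  # 00: length in the low 6 bits
        return first & 0x3F, False, pos + 1
    if kind == 1:  # 01: 14-bit length spread over two bytes
        return ((first & 0x3F) << 8) | buf[pos + 1], False, pos + 2
    if kind == 2:  # 10: 32-bit big-endian length in the next 4 bytes
        return struct.unpack(">I", buf[pos + 1:pos + 5])[0], False, pos + 5
    return first & 0x3F, True, pos + 1  # 11: special encoding


def lzf_decompress(data, expected_len):
    """Minimal LZF decompressor (literal runs and back references)."""
    out = bytearray()
    i = 0
    while i < len(data):
        ctrl = data[i]
        i += 1
        if ctrl < 32:  # literal run of ctrl + 1 bytes
            out += data[i:i + ctrl + 1]
            i += ctrl + 1
        else:  # back reference into the output produced so far
            length = ctrl >> 5
            if length == 7:
                length += data[i]
                i += 1
            ref = len(out) - ((ctrl & 0x1F) << 8) - data[i] - 1
            i += 1
            for _ in range(length + 2):  # byte by byte: ranges may overlap
                out.append(out[ref])
                ref += 1
    if len(out) != expected_len:
        raise ValueError("unexpected uncompressed length")
    return bytes(out)


def decode_value(buf):
    """Decompress the value and read the three ziplist header fields."""
    flag, special, pos = read_length(buf, 0)
    if not (special and flag == 3):
        raise ValueError("not an LZF-compressed string")
    clen, _, pos = read_length(buf, pos)
    ulen, _, pos = read_length(buf, pos)
    raw = lzf_decompress(buf[pos:pos + clen], ulen)
    zlbytes, zltail, zllen = struct.unpack("<IIH", raw[:10])
    return {"clen": clen, "ulen": ulen, "zlbytes": zlbytes,
            "zltail": zltail, "zllen": zllen, "raw": raw}


if __name__ == "__main__":
    info = decode_value(VALUE)
    print({k: v for k, v in info.items() if k != "raw"})
    print(info["raw"])
```

If this reading is right, the decompressed ziplist reports zlbytes = 85, zltail = 80 and zllen = 8, which would fit four members plus four scores.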
