WeChat (微信), the most popular mobile IM app in China, doesn't give users any way to export their message history in a well-formatted form. This tool can parse and export WeChat messages from a rooted Android phone.
Right now it can dump messages in text-only mode, or generate a single-file HTML page containing voice messages, images, emoji, etc.
NEWS: WeChat 6.0+ uses silk to encode audio. The code is updated.
NEWS: WeChat 6.3 uses a new avatar storage. The code is updated.
HELP NEEDED: Starting from May 2016, the first 1KB of every emoji file in resource/emoji is encrypted. Right now I'm using the emoji URL instead, which covers most of them.
If you are good at cryptography / reverse engineering, or you work at Tencent, feel free to contact me and help take a look. It is also possible to recover the image without knowing the first 1KB (by detecting chunks without the metadata), but I don't have time to do that either.
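The chunk-detection idea can be sketched as follows. This is only an illustration and not part of the tool: it assumes the damaged file is a PNG (emoji may well be stored in other formats), and the function name `find_png_chunks` and the list of chunk types are made up for this example. It relies on the standard PNG chunk layout (4-byte big-endian length, 4-byte type, data, CRC-32 over type and data) to locate intact chunks after the encrypted prefix.

```python
import struct
import zlib

# Chunk types to look for; an illustrative, non-exhaustive list.
PNG_CHUNK_TYPES = (b"IDAT", b"IEND", b"PLTE", b"tRNS")
# Size of the prefix reported as encrypted (1KB).
ENCRYPTED_PREFIX = 1024


def find_png_chunks(data, start=ENCRYPTED_PREFIX):
    """Return (offset, type, length) for each CRC-valid PNG chunk at or after `start`."""
    found = []
    pos = start
    # The smallest possible chunk is 12 bytes: length + type + empty data + CRC.
    while pos + 12 <= len(data):
        ctype = data[pos + 4:pos + 8]
        if ctype in PNG_CHUNK_TYPES:
            (length,) = struct.unpack(">I", data[pos:pos + 4])
            end = pos + 8 + length + 4
            if end <= len(data):
                (crc,) = struct.unpack(">I", data[end - 4:end])
                # The CRC covers the chunk type and the chunk data.
                if zlib.crc32(data[pos + 4:end - 4]) & 0xFFFFFFFF == crc:
                    found.append((pos, ctype.decode("ascii"), length))
                    pos = end  # jump to the byte right after this chunk
                    continue
        pos += 1  # no valid chunk here; slide forward one byte
    return found
```

Finding the chunks is only the first half of the problem: the image header (size, color type) lives in the first 1KB and would still have to be guessed or brute-forced.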
If this tool works for you, please take a moment to add your phone/OS to the wiki. If it doesn't work, please open an issue and include your phone/OS/WeChat version.
- python-PIL
- PyQuery
- pysox
- pysqlcipher
- numpy
- csscompressor (suggested, optional)
- adb and a rooted Android phone connected to a Linux/Mac OS computer.
- Silk audio decoder (included; just run ./third-party/compile_silk.sh)
- gnu-sed
Note that commands involving ./android-interact.sh are meant to be run on the computer.
- (Requires Linux or Mac) Get the decrypted WeChat database and the avatar index:
  - Automatic: ./android-interact.sh db-decrypt
    - Requires rooted adb. If the OS distribution does not come with adb support, you can download an app such as https://play.google.com/store/apps/details?id=eu.chainfire.adbd
  - Manual:
    - Figure out your ${userid} by inspecting the contents of /data/data/com.tencent.mm/MicroMsg on the root filesystem of the device. It should be a 32-character-long name consisting of hexadecimal digits.
    - Get /data/data/com.tencent.mm/MicroMsg/${userid}/{EnMicroMsg.db,sfs/avatar.index} from the device. Possible ways are:
      - ./android-interact.sh db
      - Use your rooted file system manager app
    - Get the WeChat uin (an integer). Possible ways are:
      - ./android-interact.sh uin, which pulls the value from /data/data/com.tencent.mm/shared_prefs/system_config_prefs.xml
      - Log in to web WeChat and get wxuin=1234567 from document.cookie
    - Get your phone's IMEI number (a positive integer). Possible ways are:
      - ./android-interact.sh imei
      - Dial *#06# on your phone
      - Find the IMEI in system settings
    - Decrypt the database, which produces decrypted.db: ./decrypt-db.py <path to EnMicroMsg.db> <imei> <uin>
    - NOTE: you may need to try different ways of getting the IMEI and uin, because things behave differently on different phones. Also, if decryption doesn't work with pysqlcipher, try the version of sqlcipher in legacy.
- Copy the WeChat user resource directory /mnt/sdcard/tencent/MicroMsg/${userid}/{emoji,image2,sfs,video,voice2} from the phone's SD card to the resource directory: ./android-interact.sh res
  - You might need to change RES_DIR in the script if the default is incorrect on your phone.
  - This can take a very long time. Some manual ways to do it faster:
    - If there's enough free space on your phone, you can archive all required files via busybox tar, with or without compression, in adb shell, and use adb pull to copy the archive. Note that BusyBox is needed because the Android system's tar may choke on long paths.
    - Alternatively, you can use pipes. This is slower, but doesn't require any free space on your phone.
      # This will copy the whole 'MicroMsg' to the current directory:
      adb shell 'cd /mnt/sdcard/tencent && busybox tar czf - MicroMsg 2>/dev/null | busybox base64' | base64 -di | tar xzf -
  - What you'll need in the end is a resource directory with the following subdirectories: emoji, image2, sfs, video, voice2.
- (Optional) Download and uncompress the emoji cache from here and put it under wechat-dump. This avoids downloading lots of emojis during rendering.
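Before running the dump scripts, the manual steps above can be sanity-checked with a few small helpers. This is a minimal sketch, not code from this repository: all function names are hypothetical. The key derivation is an assumption based on the commonly described scheme (first 7 hex characters of the MD5 of IMEI followed by uin); this README does not state it, and decrypt-db.py remains the authoritative implementation.

```python
import hashlib
import os
import re

# A ${userid} directory name: 32 hexadecimal digits.
USERID_RE = re.compile(r"[0-9a-fA-F]{32}")
# The wxuin entry inside a document.cookie string.
WXUIN_RE = re.compile(r"(?:^|;\s*)wxuin=(\d+)")
# Subdirectories the resource directory should end up with.
REQUIRED_SUBDIRS = ("emoji", "image2", "sfs", "video", "voice2")


def looks_like_userid(name):
    """True if `name` has the shape of a ${userid} directory name."""
    return USERID_RE.fullmatch(name) is not None


def uin_from_cookie(cookie):
    """Extract the wxuin value from a document.cookie string, or None if absent."""
    match = WXUIN_RE.search(cookie)
    return match.group(1) if match else None


def candidate_key(imei, uin):
    """ASSUMPTION: key = first 7 hex chars of md5(imei + uin); see decrypt-db.py."""
    digest = hashlib.md5((str(imei) + str(uin)).encode("ascii")).hexdigest()
    return digest[:7]


def missing_resource_subdirs(resource_dir):
    """Return the required subdirectories that are not present in `resource_dir`."""
    return [d for d in REQUIRED_SUBDIRS
            if not os.path.isdir(os.path.join(resource_dir, d))]
```

For example, `missing_resource_subdirs("resource")` returning an empty list suggests the copy step finished, and `candidate_key` can be called once per (IMEI, uin) combination when it is unclear which way of obtaining them gives the right values.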
- Parse and dump the text messages of every chat (requires decrypted.db): ./dump-msg.py decrypted.db output_dir
- List all chats (requires decrypted.db): ./list-chats.py decrypted.db
- Generate a statistical report on text messages (requires output_dir from ./dump-msg.py): ./count-message.sh output_dir
- Dump the messages of one contact to HTML, containing voice messages, emojis, and images (requires decrypted.db, avatar.index, and resource): ./dump-html.py decrypted.db avatar.index resource "<contact_name>" output.html
See here for an example html.
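Since dump-html.py handles one contact per invocation, exporting several contacts means repeating the command. The sketch below builds the argument list for each run; it is an illustration only. The helper names and the output file naming scheme are hypothetical, and the default paths simply mirror the file names used in the command above.

```python
import re


def output_name(contact):
    """Turn a contact name into a conservative file name (hypothetical scheme)."""
    # Keep word characters (including non-ASCII letters), dots, and dashes.
    safe = re.sub(r"[^\w.-]+", "_", contact).strip("_")
    return (safe or "contact") + ".html"


def dump_html_command(contact, db="decrypted.db", avatar="avatar.index",
                      resource="resource"):
    """Build the argv list for one ./dump-html.py run."""
    return ["./dump-html.py", db, avatar, resource, contact, output_name(contact)]
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root. Passing a list rather than a shell string avoids quoting problems with contact names that contain spaces.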
Screenshots of generated html:
- Search by uid/username
- Attack the emoji encryption problem
- Use pipes by default to copy a directory from android
- Fix rare unhandled types: > 10000 and < 0
- Better user experience... see grep 'TODO' wechat -R
- Easier to use for non-programmers (GUI?)