It is the 'top' app in the mobile world: almost immediately after the 'give me your mobile number' request comes the question 'Do you have WhatsApp?'. Clearly this application is changing the concept of free SMS messaging.
Alberto warned about the security problems in how WhatsApp transmits data in plain text, and what this means in shared environments.
Today we have to talk about the inside: the way in which WhatsApp stores and manages its data. Looking at the application's file structure, we find two files called msgstore.db and wa.db (their locations vary, of course, between Android and iPhone). These files are in SQLite format.
Once we open these files with a tool that lets us browse their contents (e.g. SQLite Manager), here comes the first surprise: none of the information they contain is encrypted. Contacts are stored in wa.db and EVERY sent message is in msgstore.db.
Wait a sec, did I say EVERY? Absolutely: every sent and received message is there. And why is "EVERY" in uppercase? Simply because, although in theory WhatsApp lets us delete conversations through its graphical interface, in reality they remain in the database ad infinitum.
And the issue gets even more fun if we sent or received messages at a time when GPS was enabled, because WhatsApp also stores the coordinates in msgstore.db.
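For readers who prefer a script to a GUI browser, the two SQLite files described above can also be inspected with Python's standard sqlite3 module. The sketch below only lists the tables and column names it finds. It assumes nothing about WhatsApp's schema, and the function name list_tables is my own, chosen for illustration.

```python
import sqlite3
import sys


def list_tables(db_path):
    """Return {table_name: [column names]} for every table in an SQLite file."""
    conn = sqlite3.connect(db_path)
    try:
        tables = [row[0] for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
        schema = {}
        for table in tables:
            # Quote the table name so unusual names do not break the PRAGMA
            quoted = '"%s"' % table.replace('"', '""')
            # PRAGMA table_info yields one row per column; index 1 is its name
            columns = conn.execute("PRAGMA table_info(%s)" % quoted).fetchall()
            schema[table] = [col[1] for col in columns]
        return schema
    finally:
        conn.close()


if __name__ == "__main__":
    # Usage: python list_tables.py msgstore.db wa.db
    for path in sys.argv[1:]:
        for table, columns in sorted(list_tables(path).items()):
            print("%s: %s: %s" % (path, table, ", ".join(columns)))
```

Working on a copy of the database files rather than the originals is the safer choice, since sqlite3.connect opens the file read-write by default.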
In the case of Android there are even more important things stored that might be of interest to a forensic investigator, or maybe a jealous boyfriend/girlfriend. Apparently WhatsApp is configured by default with a very verbose level of logging, and it stores its log files in the directory /files/Logs.
These files record every XMPP transaction made by the application at a very verbose (debug) level, with the timestamp of when it receives or sends a message (among other things). A typical entry looks like this:
2011-06-09 00:47:21.799 xmpp/reader/read/message 346XXXXXXX@s.whatsapp.net 1307XXXXXX-30 0 false false
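To make the structure of such an entry concrete, here is a minimal sketch that splits one log line into its parts. It assumes the layout seen in the sample above (date, time, a slash-separated tag, then an address ending in @s.whatsapp.net); the meaning of the remaining fields is not documented here, so they are kept as an uninterpreted list. The function name parse_log_line is hypothetical.

```python
from datetime import datetime


def parse_log_line(line):
    """Split one log line into timestamp, tag, address and remaining fields.

    Returns None when the line does not match the assumed layout.
    """
    parts = line.split()
    if len(parts) < 4:
        return None
    try:
        # The first two tokens are assumed to be the date and the time
        timestamp = datetime.strptime(parts[0] + " " + parts[1],
                                      "%Y-%m-%d %H:%M:%S.%f")
    except ValueError:
        return None
    return {
        "timestamp": timestamp,
        "tag": parts[2],    # e.g. xmpp/reader/read/message
        "jid": parts[3],    # e.g. 346XXXXXXX@s.whatsapp.net
        "rest": parts[4:],  # fields whose meaning is not known here
    }


sample = ("2011-06-09 00:47:21.799 xmpp/reader/read/message "
          "346XXXXXXX@s.whatsapp.net 1307XXXXXX-30 0 false false")
entry = parse_log_line(sample)
# The part of the address before the @ is the mobile number
print("%s %s %s" % (entry["timestamp"], entry["tag"], entry["jid"].split("@")[0]))
```

Keeping the timestamp as a datetime object makes it easy to sort or filter entries by time later on.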
These files are easy to parse in order to extract the list of mobile numbers that have had some kind of conversation with us. I created a small script that parses the file and pulls out this list of numbers:
import re
import sys

# Path of the log file, given as the first command-line argument
logfile = sys.argv[1]
with open(logfile, "r") as logdata:
    dump = logdata.readlines()

numerosin = []   # numbers we received messages from
numerosout = []  # numbers we sent messages to

for line in dump:
    # Number that follows a reader "message" entry
    m = re.search(r'(?<=xmpp/reader/read/message )\d+', line)
    if m and m.group(0) not in numerosin:
        numerosin.append(m.group(0))
    # Number that follows a writer "receipt" entry
    m = re.search(r'(?<=xmpp/writer/write/message/receipt )\d+', line)
    if m and m.group(0) not in numerosout:
        numerosout.append(m.group(0))

print("Messages received from\n")
print("\n".join(numerosin))
print("\nMessages sent to\n")
print("\n".join(numerosout))
Running the script prints the information as follows:
$ python whatsnumbers.py whatsapp-2011-06-08.1.log
Messages received from
Messages sent to