Several people have already done a fantastic job of breaking down the file format and writing scripts to parse these cookies. If Perl is your flavor, check out these handy tools from Jake Cunningham. If you love Python, the script from Satishb3 does a great job of parsing the information.
While both of the above scripts do a fantastic job of parsing and presenting the information in the Cookies.binarycookies file, I wanted a way to parse a directory full of these binarycookies files as well as the Google Analytics values from the cookies.
The awesome thing about open source is the ability to not only learn by looking at someone else's code, but to build on top of what they have done and create or tailor something for what you need (then hopefully turn around and share it again with others).
When I was reviewing the Satishb3 Python script, I did not see a specific licensing agreement distributed with the code. I reached out to Satishb3 for permission to reuse his code and, luckily for me, he graciously wrote back granting me permission.
This saved me a lot of time and enabled me to focus my efforts on adding the features that I needed. I sat down with some Dr. Pepper and the handy, dandy SIFT Workstation, and wrote a Python script that parses the binarycookies file with the following additions:
1) Parses a directory full of cookies
2) Parses the Google Analytics values from the cookies (utma, utmb, utmz)
3) Added an option to output into TLN format
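To give a feel for what parsing a Google Analytics value involves, here is a minimal sketch for the utma cookie. It is not the code from bc_parser.py. It assumes the commonly documented utma layout of six dot-separated fields (domain hash, visitor ID, first visit, previous visit, current visit, session count) with the three visit times stored as Unix epoch seconds. The names parse_utma and epoch_to_utc are hypothetical, and the sketch targets Python 3.

```python
from datetime import datetime, timezone


def epoch_to_utc(seconds):
    """Render a Unix epoch value (in seconds) as a UTC date/time string."""
    moment = datetime.fromtimestamp(int(seconds), tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S")


def parse_utma(value):
    """Split a utma cookie value into named fields (hypothetical helper).

    Assumed layout:
    hash.visitor_id.first_visit.previous_visit.current_visit.sessions
    """
    parts = value.split(".")
    if len(parts) != 6:
        # Anything else is not the layout this sketch understands.
        raise ValueError("expected 6 dot-separated fields, got %d" % len(parts))
    domain_hash, visitor_id, first, previous, current, sessions = parts
    return {
        "DomainHash": domain_hash,
        "VisitorID": visitor_id,
        "FirstVisit": epoch_to_utc(first),
        "PreviousVisit": epoch_to_utc(previous),
        "CurrentVisit": epoch_to_utc(current),
        "Sessions": int(sessions),
    }
```

The strict field count is a deliberate choice for the sketch: a value that does not fit the assumed layout raises an error the caller can catch and log, rather than producing silently shifted timestamps.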
Usage Examples
To process one file:
bc_parser.py -f Cookies.binarycookies -o myoutput.tsv
To process a directory of cookies:
bc_parser.py -d /full/path/to/cookies -o myoutput.tsv
To have the output in TLN format (this can be used with the file or directory option):
bc_parser.py -f Cookies.binarycookies -o myoutput.tsv -t -H MariPC -u Mari
-f is the binary cookie filename, -o is the output file, -t means TLN output, -H is the host (optional) and -u is the username (optional).
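For readers who have not seen TLN before, it is a five-field, pipe-delimited timeline layout: Time|Source|Host|User|Description, with the time given as a Unix epoch. The sketch below shows how one such line could be assembled. It is an illustration only, not the code in bc_parser.py: the function name to_tln, the "COOKIE" source tag, and the choice to swap embedded pipes for slashes are all assumptions.

```python
def to_tln(epoch, description, host="", user="", source="COOKIE"):
    """Build one TLN line: Time|Source|Host|User|Description.

    The "COOKIE" source tag is an assumed placeholder, not necessarily
    the label the real script writes.
    """
    fields = [str(int(epoch)), source, host, user, description]
    # A literal pipe inside a field would break the five-column layout
    # (utmz values contain pipes), so this sketch replaces it with a slash.
    return "|".join(field.replace("|", "/") for field in fields)


line = to_tln(1376906434, "Cookie created: example.com", host="MariPC", user="Mari")
```

Leaving host and user empty still yields five fields, which matches the way -H and -u are optional on the command line.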
Example Cookie Output:
Google Analytic Output, utmz:
TLN (Timeline Output):
Download the bc_parser Python script.
I'm getting an error when running this, and I don't know Python or the Safari cookie format well enough to fix it:
>python bc_parser.py -f Cookies.binarycookies -o output.tsv
Traceback (most recent call last):
  File "bc_parser.py", line 489, in
    cookies_and_ga = parse_file(f,options.infile)
  File "bc_parser.py", line 351, in parse_file
    utmz_values = parse_utmz(cookie_value["URL"],cookie_value["Value"])
  File "bc_parser.py", line 146, in parse_utmz
    utmz_value["LastUpdate_Epoch"] = int(utmz_values[1])/1000
ValueError: invalid literal for int() with base 10: '1376906434014:183831097'
Thanks for the feedback. It looks like there may be a non-standard time format in the utmz field. It's hard to tell without the actual Cookies.binarycookies data you're using. I will fix the code to skip over that line in your file. Or, if you can send me the Cookies.binarycookies file, I can see if I can add code to parse it correctly.
Check back in a day or so and I'll have an updated version.
OK, I know it's been a while, but I fixed the script for the above error :-) Someone was able to provide me with a file that was causing the same error. Visit my downloads page for the most recent version: http://az4n6.blogspot.com/p/downloads.html
It turns out some domain hashes in the Google Analytics values included an extra period "." in them, which threw the script off.
I appreciate the feedback. If you're running into problems, please let me know and I'll be happy to take a look.
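To illustrate why an extra period in the domain hash shifts every field that follows it, and one way to tolerate it, here is a hedged sketch. It is not the fix that went into the updated script, which is not shown in this post. It assumes the usual utmz layout of hash.timestamp.sessions.campaigns.campaign-data, where the campaign data starts with "utm" and is made of pipe-separated key=value pairs. The name parse_utmz_tolerant is hypothetical, and the sketch targets Python 3.

```python
import re

# Anchor on the three all-digit fields that sit just before the campaign
# data, so the domain hash is free to contain periods of its own.
UTMZ_PATTERN = re.compile(
    r"^(?P<hash>.+?)\.(?P<timestamp>\d+)\.(?P<sessions>\d+)"
    r"\.(?P<campaigns>\d+)\.(?P<campaign>utm.*)$"
)


def parse_utmz_tolerant(value):
    """Return the utmz fields as a dict, or None if the value does not fit."""
    match = UTMZ_PATTERN.match(value)
    if match is None:
        # Unknown layout: let the caller skip this cookie instead of crashing.
        return None
    fields = match.groupdict()
    # Campaign data is pipe-separated key=value pairs, e.g. utmcsr=google.
    fields["campaign"] = dict(
        pair.split("=", 1)
        for pair in fields["campaign"].split("|")
        if "=" in pair
    )
    return fields
```

Returning None for a value that does not match means a single odd cookie, such as one with a non-numeric timestamp field, can be reported and skipped without stopping the rest of the file from being parsed.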
Is this script still working on iOS 7? I get the following error:
C:\>python bc_parser_V2.0.py -f Cookies.binarycookies -o myoutput.tsv
  File "bc_parser_V2.0.py", line 229
    print "Error converting time for: " + URL + " " + cookie_value
What version of python are you using?
Python version 3.3.3 on Windows 8
I tested the script on Python 2.6.4, so that may be part of the problem. This error pops up when it runs into a time it can't convert. Normally it would print a URL and the cookie value so you might be able to see why it's throwing the error, but I think this print statement doesn't work in Python 3.
Are you able to email me the test data, or run it on Python 2 to get more information about the error? You can email me at arizona4n6 at gmail.com.
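For anyone hitting the same thing: in Python 3, print is a function, so the Python 2 statement form shown in the error above is rejected as a syntax error before the script even starts. A minimal sketch of a form that runs under both versions follows. The function name report_time_error is hypothetical and is not part of bc_parser.py. It works on Python 2 because a single parenthesized argument is still a valid print statement there.

```python
def report_time_error(url, cookie_value):
    """Print the same diagnostic on Python 2 and Python 3."""
    message = "Error converting time for: " + url + " " + cookie_value
    # One parenthesized argument is valid as a Python 2 print statement
    # and as a Python 3 print() call.
    print(message)
    return message
```

Building the message as one string first is what keeps this portable. Passing several comma-separated arguments inside the parentheses would print a tuple on Python 2.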