When opening documents in Office 2008 and 2010 (not sure about other versions) on a Mac, the user is presented with a dialog box for recent documents, called the Workbook Gallery. As you can see in the screenshot below, Recent Documents tracks the File Name, Last Opened date and File Path of the recent documents:
This information is stored in the com.microsoft.office.plist file under the user's profile: /Users/%Username%/Library/Preferences/.
Some notes about the com.microsoft.office.plist file:
- It can contain A LOT of entries. I found close to 1500 entries spanning 4 years.
- It can also have User Information such as name and email address.
- It has Volume names, so you can see if files were opened from an external drive, etc.
Looking at this plist file through the Xcode Plist Editor shows the following:
Now the Access Date looks familiar: Hex values. The File Alias looks like it's in Hex too.
Time to view the data in a Hex viewer to see what’s going on:
Ah ha! File Paths, File names, and the timestamp information.
Now the trick: figuring out the timestamp. By doing some testing (i.e. opening up files in MS Office on a Mac and checking the changes in the timestamp values) and brainstorming with Brian Moran, we were able to figure out that the timestamp appeared to be a 32-bit HFS+ value (seconds since January 1, 1904) stored in little-endian byte order:
B95120CE = Thu, 01 August 2013 10:56:09 -0700
We couldn't quite figure out what the last two bytes, 0xEB6A, were for. Maybe milliseconds? Further testing will need to be done to confirm this.
Time to tie it all together. Take the Data field from File Alias in the Mac Plist Editor and convert the Hex value to ASCII (try this website) to get your File Paths and File Name:
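For anyone who would rather check the math than trust a converter, here is a minimal Python sketch of the conversion described above. It assumes the four bytes are an unsigned little-endian count of seconds since 1904-01-01 UTC, which is what reproduces the value we observed; the function name `decode_hfs_le` is just something I made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# HFS+ timestamps count seconds from 1904-01-01 00:00:00 (treated here as UTC).
HFS_EPOCH = datetime(1904, 1, 1, tzinfo=timezone.utc)


def decode_hfs_le(hex_bytes):
    """Decode a hex string as a little-endian HFS+ timestamp (assumed layout)."""
    raw = bytes.fromhex(hex_bytes)
    # Little endian: B9 51 20 CE is read as 0xCE2051B9.
    seconds = int.from_bytes(raw, byteorder="little")
    return HFS_EPOCH + timedelta(seconds=seconds)


accessed = decode_hfs_le("B95120CE")
print(accessed.isoformat())  # 2013-08-01T17:56:09+00:00

# Same instant shown at UTC-7, the offset used in the example above.
print(accessed.astimezone(timezone(timedelta(hours=-7))).isoformat())
```

The UTC result, shifted by seven hours, lines up with the 10:56:09 -0700 value shown above.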
00000000 01960002 00000a4d 44544855 4d424452 56000000 00000000 00000000 00000000 00000000 00004244 0001ffff ffff1645 6d706c6f 79656520 53616c61 72696573 2e786c73 78000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 0000ffff ffff0000 00000000 00000000 0000ffff ffff0000 0a024953 00000000 00000000 00000000 0018436f 6d70616e 7920546f 70205365 63726574 2046696c 65730002 00442f3a 566f6c75 6d65733a 4d445448 554d4244 52563a43 6f6d7061 6e792054 6f702053 65637265 74204669 6c65733a 456d706c 6f796565 2053616c 61726965 732e786c 7378000e 002e0016 0045006d 0070006c 006f0079 00650065 00200053 0061006c 00610072 00690065 0073002e 0078006c 00730078 000f0016 000a004d 00440054 00480055 004d0042 00440052 00560012 00302f43 6f6d7061 6e792054 6f702053 65637265 74204669 6c65732f 456d706c 6f796565 2053616c 61726965 732e786c 73780013 00132f56 6f6c756d 65732f4d 44544855 4d424452 5600ffff 0000
Company Top Secret Files
D/:Volumes:MDTHUMBDRV:Company Top Secret Files:Employee Salaries.xlsx
0/Company Top Secret Files/Employee Salaries.xlsx
(In this example, the volume name of my thumbdrive was MDTHUMBDRV)
Then use DCode (or whatever you like) to convert the Hex timestamp (remember to remove the last two bytes):
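If you would rather not paste evidence into a website, the Hex-to-ASCII step can be sketched in a few lines of Python. This is only a rough "strings"-style pass over an excerpt of the alias data shown above, not a real alias parser; the helper name `printable_strings` and the minimum length of 4 are my own choices for illustration.

```python
import re
from typing import List

# Excerpt of the File Alias hex shown above (just the colon-separated path record).
alias_hex = (
    "0002 00442f3a 566f6c75 6d65733a 4d445448 554d4244 52563a43 "
    "6f6d7061 6e792054 6f702053 65637265 74204669 6c65733a 456d706c "
    "6f796565 2053616c 61726965 732e786c 7378000e"
)


def printable_strings(data: bytes, min_len: int = 4) -> List[str]:
    """Return runs of printable ASCII at least min_len bytes long."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [match.decode("ascii") for match in re.findall(pattern, data)]


data = bytes.fromhex(alias_hex)  # spaces between the hex groups are ignored
for text in printable_strings(data):
    print(text)
```

On this excerpt the output is the `D/:Volumes:MDTHUMBDRV:...` line shown above. The leading "D" appears to be the byte 0x44 (decimal 68, which is the length of the path that follows) being rendered as ASCII rather than part of the path itself.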
If you do not have a Mac at your disposal, no worries, you can still use the Windows plist editor.
From the plist Editor for Windows, convert the Access Date from Base64 to Hex (try this website):
AAC5USDO62o= = 0000B95120CEEB6A
Then use DCode to convert B95120CE as shown above.
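The Base64 step can also be done locally with Python's standard library. A minimal sketch, assuming the Access Date is always laid out like the sample above (two leading bytes, four timestamp bytes, two trailing bytes); that layout comes from my test data only, so treat the slice offsets as an assumption.

```python
import base64

access_date_b64 = "AAC5USDO62o="

# Base64 -> raw bytes -> Hex, same as the website conversion.
raw = base64.b64decode(access_date_b64)
hex_value = raw.hex().upper()
print(hex_value)  # 0000B95120CEEB6A

# Assumed layout: skip the first two bytes and drop the last two,
# leaving the four timestamp bytes to feed into DCode.
timestamp_hex = raw[2:6].hex().upper()
print(timestamp_hex)  # B95120CE
```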
For the File Alias, convert the Data field from Base64 to ASCII (try this website for the conversion).
Or go for door number two: you can use the Python script I wrote, OfficePlistParser, to parse the file.
The script will pull the MRU ID (so you can refer back to the plist for verification), Access Date in UTC, and Full Path. Long file names appear to be concatenated with a random set of numbers, like so:
In Office 2010, the long file names are supplied in the file aliases which are parsed by the script.
It also pulls User information, which is output to the screen. A note on the User information: I noticed in some of my test data that the username in the file may be that of the person who first registered the product, or who entered their user information into MS Word first. This did not correspond to the user who opened the file.
For example, on another Mac I created a profile for testing, opened a document, then parsed the file. The owner's name was listed in the plist file, not mine or my account user name. Some more research will need to be done here. If you're looking at carved plist files, it's something to be aware of, as the username may not be representative of who opened the files.
Since Python does not have native support for reading binary plist files, the library biplist is required. This can be installed on the SIFT workstation using easy_install:
sudo easy_install biplist
Here is an example of some parsed content:
You can get the script here. Enjoy, and any feedback/issues with the script are appreciated. It worked on my test data but I don't know what type of shenanigans your clients may be up to...
Also, quick shout outs to Cheeky4N6Monkey and @brianjmoran for their help. Three minds are better than one, right?