It is well known that Microsoft Office files store internal metadata that can be very revealing during forensic examinations (Author, Last Saved By, Creation Time, Last Saved time, etc.). What may not be as well known are the timestamps maintained within the OLE data structures of the Office 97-2003 files and how these timestamps may be used in a forensic examination. If a file was opened and closed without saving (thus not updating the internal Last Saved time or potentially any file system timestamp) and you do not have access to operating system artifacts to demonstrate file access, examination options are limited. However, I’ve found that in some cases – specifically with Microsoft Excel – OLE timestamps may be used to determine the last time a file was opened, even if the file was closed before saving.
The Details
When a spreadsheet is saved in the Microsoft Excel 97-2003 format, the last modified time of the Root Entry within the OLE file should either be zeroed out or updated to reflect the time that the file was saved (depending on the version of Excel used to save the file). This may not be very helpful as the last save information is already available through other known metadata (i.e. the Summary Information stream). However, the last modified timestamp of the Root Entry appears to be updated when the Excel file is opened. If the file is then closed without being saved, this modification time remains and reflects the last time the file was opened. This means that it may be possible to detect the last time a Microsoft Excel file in 97-2003 format was opened if the file was not saved and an examiner is provided with nothing more than the file itself.
Updates to the last modified time of the Root Entry directory entry remained consistent in my testing of Excel 2000, Excel 2007, and Excel 2010 (I did not have Excel 2003 or 2013 available to me at the time of testing). Further, the timestamp was updated regardless of the version of Excel that created or opened the file. When the “Protected View” warning bar appears (requiring the user to click “Enable Editing” to edit the spreadsheet), it appears that the update to the OLE Root Entry modification timestamp will depend on the volume from which the file was opened. Opening a file that was downloaded from the Internet but stored on the local hard disk results in an update to the modification time (regardless of whether the “Enable Editing” button is clicked by the user). Opening a file from a network resource will not update the modification timestamp unless the user clicks the “Enable Editing” button. It should be noted though that my testing has been limited with regard to the Protected View functionality.
Finding the Timestamp
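As a minimal sketch of where the timestamp lives, the Python below reads the Root Entry creation and modification times straight from the raw bytes of a compound file. It relies on the layout described in [MS-CFB]: the sector shift is at header offset 30, the first directory sector location is at header offset 48, the Root Entry is the first 128-byte directory entry, and its creation and modified times are 64-bit FILETIME values at entry offsets 100 and 108. The function names are my own, and the sketch is meant for illustration rather than as a replacement for a validated parser.

```python
import struct
from datetime import datetime, timedelta, timezone

CFB_SIGNATURE = bytes.fromhex("D0CF11E0A1B11AE1")
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_datetime(value):
    """Convert a 64-bit FILETIME (100 ns ticks since 1601-01-01 UTC).

    A value of 0 means the timestamp is not set, so None is returned.
    """
    if value == 0:
        return None
    return FILETIME_EPOCH + timedelta(microseconds=value // 10)


def root_entry_times(data):
    """Return (created, modified) for the Root Entry of a compound file.

    `data` is the full file content as bytes. Hypothetical helper name;
    offsets follow the [MS-CFB] header and directory entry layout.
    """
    if data[:8] != CFB_SIGNATURE:
        raise ValueError("not an OLE compound file")
    # Sector size is 2**sector_shift (512 for version 3, 4096 for version 4).
    (sector_shift,) = struct.unpack_from("<H", data, 30)
    (first_dir_sector,) = struct.unpack_from("<I", data, 48)
    sector_size = 1 << sector_shift
    # Sector 0 starts right after the header, which fills one sector.
    entry_offset = (first_dir_sector + 1) * sector_size
    # Creation time at +100, modified time at +108 within the entry.
    created, modified = struct.unpack_from("<QQ", data, entry_offset + 100)
    return filetime_to_datetime(created), filetime_to_datetime(modified)
```

To use it on a file, read the file in binary mode and pass the bytes in, for example `root_entry_times(open(path, "rb").read())`, where `path` is whatever .xls file you are examining. Working from a copy opened read-only avoids any risk of altering the evidence.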
Forensic Implications
When an examiner is provided with a limited set of data (e.g. a flash drive or external hard drive), the options for analysis will likely be limited. Without the common operating system artifacts that we are used to examining, determining activity with regard to a particular file or set of files can be difficult. However, if an examiner is provided with a media device storing files in Excel 97-2003 format, he or she may be able to determine if and when each file was opened without being saved. Comparing the Last Saved time in the Summary Information stream to the last modified time of the OLE Root Entry may be revealing. If the last modified time of the OLE Root Entry is later than the Last Saved time, the file may have been opened and closed without saving after the last time that the file was saved. This information may be very helpful when the mere fact that a file was opened after a particular date is significant.
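The comparison described above can be sketched in a few lines of Python. The function takes the two timestamps as timezone-aware datetime values, however they were extracted. The function name and the optional `tolerance` parameter are my own additions: the tolerance is an assumption to absorb small differences between the two values when a save writes both, and it is not something I measured in testing.

```python
from datetime import datetime, timedelta, timezone


def opened_after_last_save(last_saved, root_modified, tolerance=timedelta(0)):
    """Return True if the Root Entry modified time is later than Last Saved.

    Hypothetical helper. None for either value (for example, a zeroed
    Root Entry timestamp) yields False, since no conclusion can be drawn.
    """
    if last_saved is None or root_modified is None:
        return False
    return root_modified - last_saved > tolerance


# Illustrative values only, not taken from a real case.
last_saved = datetime(2013, 1, 10, 15, 30, tzinfo=timezone.utc)
root_modified = datetime(2013, 2, 2, 9, 5, tzinfo=timezone.utc)
print(opened_after_last_save(last_saved, root_modified))
```

A True result only indicates that the file may have been opened without saving after the last save; it should be weighed alongside the Excel version and Protected View caveats noted earlier.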
While this post (and my testing) has focused on Microsoft Office 97-2003 Excel files, it’s important to note that the OLE Root Entry last modified and creation timestamps are not limited to Microsoft Office files. There are a number of other files that use the OLE compound file format, such as jump lists (*.automaticDestinations-ms), thumbs.db files, and sticky notes. Further research into the behavior of the OLE timestamps with regard to other file types may reveal interesting and useful information for forensic examinations.
Resources
OLE Compound File Format
[MS-CFB]: Compound File Binary File Format
Forensics Wiki: OLE Compound File
Hi,
You should try the 'metacompound' module in DFF 1.3. It extracts all the OLE streams with their metadata (minifat / difat specific attributes, …) as well as the metadata specific to the DOC and PPT streams (the "Last saved time", …). It also automatically extracts pictures and text from DOC and PPT files.
Solal,
Thanks for the tip, I haven't tried DFF yet. Is the metacompound module part of the free edition?
Jason,
Great post! It really validates my thoughts that analysts need to know more about the data structures that they encounter, in order to get the most out of them.
Even though for some, the OLE format MS Office documents may no longer be on their radar, this is valuable information in that the OLE format is in use in multiple file formats on more recent versions of Windows, as well as in more recent applications.
Again, thanks for writing and posting this…
Harlan,
Thanks, and good point about OLE being used in other formats as well. It will be interesting to see if/how this type of information is helpful in analyzing files from other (and possibly more current) applications.
Yes, Jason. You can download it from http://www.digital-forensic.org.
(I'm sorry, but most distributions haven't updated to the 1.3 package yet, and the module is not in version 1.2.)
To test it, click 'open evidence' and then the 'green cross' to add your .doc file (or other compound document). Then double-click the document that appears in 'Logical Files', and a tree with a node for each stream will be created. Each node (stream) has its own metadata (shown in the right panel), and the document-specific metadata is added to the root node too. You can then use the search engine to compare the metadata of multiple documents (or write a Python script).
Solal,
I just tried DFF 1.3.0 and was able to extract a good deal of OLE metadata (version, byte order, sector locations, etc.), but it doesn't appear to extract the last modified date from the Root Entry. I manually verified the date in the hex preview of DFF, but the timestamp doesn't appear to be parsed when applying the compound module to the file…
Jason,
If you checked the node with the 'blue cross' (which appears once the module is applied, to indicate that some content was expanded), this is normal.
On this node, the only metadata that appears is:
* the general metadata from the FAT/DIFAT and minifat tables (under the attribute metacompound.Compound document)
* the "metacompound.DocumentSummaryInformation (Root Entry)" and "metacompound.SummaryInformation (Root Entry)" attributes, which correspond to the information of the first embedded document (there could be other Summary Information and Document Summary Information entries if, for example, a Word document is embedded inside another Word document).
To find the "metacompound.Creation time" and the "metacompound.Modified time", double-click the node and select the "Root Entry" stream; the metadata should be set on that node. These two attributes are also available for all other nodes (streams), but most of the time the value is set to 0 (1980-00-01 00:00:00).
Can you tell me if you find the right result on the "Root Entry" node?
If not, it would be good to share a Word document so I could patch the module.
Thanks.
Solal,
Thanks for the follow-up. I just tested this and the modification time listed for the Root Entry node was correct.
Jason.
Have you seen any occasions where the OLE "Last Opened Time" is updated but the NTFS Last Modified is NOT updated? i.e. OLE date is later than NTFS date.
Yes. In my testing, the NTFS Last Modified ($STD_INFO) timestamp was not updated when a file was opened and closed without being saved. In that case, the OLE Root Entry Last Modified (i.e. Last Opened Time) would be later than the NTFS $STD_INFO Last Modified time.