I was asked what this Fix-up thing was that I mentioned in my last post.
Fix-ups are used by NTFS to keep track of sectors that are part of specific data structures within the file system. This is done for a variety of reasons: detecting corruption from a failed disk sector, from a failed write, or from the moons of Jupiter not being properly aligned. By tagging all of the sectors that make up a specific structure, NTFS can check to ensure that the sector it just read was read correctly and does indeed belong to the structure it is trying to read. This is accomplished by a coordinated effort from two objects: the Update Sequence Number (USN) and the Update Sequence Array (USA).
The Update Sequence Number (USN) is a uint16 (two-byte unsigned integer) that is stored in the header of the structure and in the last two bytes of every sector consumed by the structure. When this number is written into the end of each sector, it overwrites bytes that were part of information we probably cared about, such as file names, date/time stamps, etc. We need a way to store and recover that overwritten data, which brings us to the USA.
The Update Sequence Array (USA) is a series of uint16 (two-byte unsigned integer) entries that hold the original last two bytes of each sector consumed by the structure, one entry per sector, in sector order. Any time we read a sector that is part of that structure, we must account for the USN. First we check whether the last two bytes of the sector are the same as the USN from the structure’s header, to confirm that we read the sector correctly and that it really is part of the structure we meant to read. Once we are happy with that, the USN copy at the end of the sector has done its job and can be discarded. Next we step through the USA to the entry that corresponds to the sector we are reading, and substitute those two bytes for the last two bytes of the sector we just read.
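The check-then-substitute steps above can be sketched in Python. This is a minimal sketch, not a full parser. It assumes 512-byte sectors and the commonly documented record header layout, in which bytes 4-5 hold the offset of the USN and bytes 6-7 hold a count of two-byte words covering the USN plus the USA entries that follow it. The function name apply_fixups is made up for illustration.

```python
import struct

SECTOR_SIZE = 512  # assumption: 512-byte sectors


def apply_fixups(record: bytes, sector_size: int = SECTOR_SIZE) -> bytes:
    """Return a copy of `record` with the original sector-tail bytes restored."""
    # Assumed header layout: offset 4 = where the USN lives,
    # offset 6 = number of 16-bit words (the USN plus one USA entry per sector).
    usa_offset, usa_count = struct.unpack_from("<HH", record, 4)
    usn = record[usa_offset:usa_offset + 2]
    fixed = bytearray(record)
    for i in range(1, usa_count):
        tail = i * sector_size - 2  # last two bytes of sector i - 1
        if tail + 2 > len(record):
            raise ValueError("header describes more sectors than the record holds")
        # Step 1: the sector tail must match the USN from the header.
        if record[tail:tail + 2] != usn:
            raise ValueError(f"sector {i - 1}: tail does not match the USN")
        # Step 2: swap in the saved bytes from the matching USA entry.
        entry = usa_offset + 2 * i
        fixed[tail:tail + 2] = record[entry:entry + 2]
    return bytes(fixed)
```

A ValueError here means either the read went wrong or the sector does not belong to the record, which is exactly the situation the USN is meant to flag.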
The types of data structures that we typically find USN and USA for are:
– $MFT FILE Records
– INDX Records for directories and other indexes
– $LogFile RCRD Records
– $LogFile RSTR Records
This can be especially problematic when searching a drive in a forensics-style “look for this file name anywhere on the system” keyword search. If there is a fix-up in the middle of the file name, the keyword search won’t find it. Most point-and-click forensics tools (EnCase, FTK, etc.) will correct the data when displaying it in the pretty tables that are part of the tool, but still can’t account for the USN when it is time to do a keyword search. Also, anyone parsing the file system manually, such as those who have to deal with damaged media or a corrupt file system, needs to understand this concept.
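To illustrate the keyword-search problem, here is a rough Python sketch of one possible workaround: treat the last two bytes of every sector as wildcards while matching. It is not how any particular forensics tool works. It assumes 512-byte sectors and a buffer that starts on a sector boundary, and the function name is hypothetical. It can also produce false positives, since any two bytes are accepted at a sector tail.

```python
def find_ignoring_sector_tails(data: bytes, needle: bytes, sector_size: int = 512) -> list:
    """Return every offset where `needle` matches, skipping sector-tail bytes."""
    hits = []
    for start in range(len(data) - len(needle) + 1):
        for j, wanted in enumerate(needle):
            pos = start + j
            if pos % sector_size >= sector_size - 2:
                continue  # possible fix-up bytes: accept anything here
            if data[pos] != wanted:
                break
        else:
            hits.append(start)
    return hits


# A UTF-16LE file name that straddles the end of the first sector.
name = "evidence.txt".encode("utf-16-le")
image = bytearray(1024)
image[500:500 + len(name)] = name
image[510:512] = b"\x07\x00"  # a USN stamped over two bytes of the name

print(bytes(image).find(name))                        # plain search misses it
print(find_ignoring_sector_tails(bytes(image), name))  # wildcard search finds it
```

This brute-force loop is slow on large images; it is only meant to show why a plain byte search misses the name.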
Examples:
1) MFT Records are 1024 bytes long. So, with 512-byte sectors, each record consumes two sectors. In the FILE Record header, there will be a two-byte USN and a four-byte USA (two two-byte entries, one per sector). The USN will be found repeated at the end of each sector, so at offsets 510-511 and 1022-1023. Depending on what attributes make up the FILE record, there could be important information at those offsets, most likely at 510-511, that gets overwritten.
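The offsets in this example follow directly from the record and sector sizes. A small Python sketch of the arithmetic, assuming 512-byte sectors (the helper name is made up):

```python
def sector_tail_offsets(record_size: int, sector_size: int = 512) -> list:
    """Return the (first, last) byte offsets of each sector's two-byte tail."""
    return [(end - 2, end - 1)
            for end in range(sector_size, record_size + 1, sector_size)]


print(sector_tail_offsets(1024))  # [(510, 511), (1022, 1023)]
```

The number of tuples returned is also the number of USA entries the record needs, at two bytes each.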
2) INDX Records that make up directories are 4096 bytes long, so with 512-byte sectors each one consumes eight sectors. If there are more entries than can fit in that amount of space, then a second INDX record, for another 4096 bytes, is used. Each INDX Record will have a two-byte USN and a 16-byte USA (eight two-byte entries, one per sector). Only the allocated part of the INDX will be written to. So, say we have a directory that contains a lot of files, then all but a few get deleted. The amount of the INDX actually in use will shrink, but it will still be allocated in 4K chunks. There will be records in the slack space towards the bottom of that 4K chunk that reference files that may or may not still exist. The sectors consumed by that slack space will have a different USN saved at their tails.
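To see which sectors of a record no longer carry the header's USN, something like the following Python sketch could be used. It assumes 512-byte sectors and that bytes 4-5 of the record header hold the offset of the USN, as in the commonly documented layout. The function name is hypothetical.

```python
import struct


def sectors_not_matching_usn(record: bytes, sector_size: int = 512) -> list:
    """Return the indexes of sectors whose last two bytes differ from the USN."""
    usa_offset = struct.unpack_from("<H", record, 4)[0]  # assumed header field
    usn = record[usa_offset:usa_offset + 2]
    mismatched = []
    for index in range(len(record) // sector_size):
        tail = (index + 1) * sector_size - 2
        if record[tail:tail + 2] != usn:
            mismatched.append(index)
    return mismatched
```

A mismatch does not say why the tail differs. It could be stale slack as described above, or it could be a damaged sector, so the result is only a starting point for a closer look.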