A Comprehensive Data Carving Architecture for Digital Forensics
September 15, 2006 - Dr. Golden G. Richard III was recently awarded an NSF Cyber grant for research into next-generation file carving. The $260K grant runs for three years and funds Lodovico Marziale, a Ph.D. student working in digital forensics. One of the most important software techniques in digital forensics is file carving, which uses a database of characteristics of commonly encountered file types to search physical disks for files of these types. File carving is particularly powerful because files can be retrieved from raw disk images regardless of the type of filesystem, even if file metadata has been destroyed. File carving can find data associated with deleted files, data in slack space and swap files, and data "hidden" outside the filesystem. The goal is to identify the starting and ending locations of files in the disk images and "carve" (copy) the corresponding sequences of bytes into regular files so their value as evidence can be assessed.
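The basic header/footer approach described above can be sketched in a few lines of Python. This is only a minimal illustration, not the method used by any particular tool: the SIGNATURES table, the carve function, and the max_size limit are hypothetical names chosen for this example, and the sketch assumes the whole image fits in memory and that files are stored contiguously.

```python
# Hypothetical signature table for illustration: type -> (header, footer).
# The byte patterns are the well-known JPEG and PDF start/end markers.
SIGNATURES = {
    "jpg": (b"\xff\xd8\xff", b"\xff\xd9"),
    "pdf": (b"%PDF-", b"%%EOF"),
}


def carve(image: bytes, max_size: int = 10 * 1024 * 1024):
    """Return (type, start, end) tuples for header/footer matches in a raw image.

    Assumes contiguous (unfragmented) files; `end` is exclusive.
    """
    found = []
    for name, (header, footer) in SIGNATURES.items():
        start = image.find(header)
        while start != -1:
            # Search for the first footer after this header, within max_size bytes.
            limit = min(len(image), start + max_size)
            end = image.find(footer, start + len(header), limit)
            if end != -1:
                found.append((name, start, end + len(footer)))
            # Continue scanning for further headers of the same type.
            start = image.find(header, start + 1)
    return sorted(found, key=lambda hit: hit[1])


def extract(image: bytes, hits):
    """Copy ("carve") each matched byte range out of the image."""
    return [(name, image[start:end]) for name, start, end in hits]
```

Because the scan works on raw bytes only, it never consults filesystem metadata, which is why the technique still works when that metadata is gone. It also shows where false positives come from: any byte sequence that happens to match a header and a later footer is reported as a file.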
Currently, the scope of file carving is narrow and there are significant limitations. Patterns for new file types are typically created manually, which is a tedious and error-prone process. Current-generation tools generate many false positives, carving files whose formats are incorrect. This wastes the time of investigators, who already have overwhelming caseloads. In addition, carving has not been applied extensively to volatile media such as memory and network flows. RAM carving is useful for low-level detection of installed malware, where high-level detection (e.g., via OS facilities) might be thwarted by the malware. Network data flow carving can help investigators determine whether specific documents, or versions of those documents, were transmitted. This research explores techniques for better automatic identification of documents by expanding on the header/milestone/footer search generally employed in current-generation tools. The research also expands the scope of data carving to include memory analysis and network flow analysis. The entire architecture will be incorporated into the distributed digital forensics framework created by the PI and his collaborators at the University of New Orleans.
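As a rough illustration of how a header/milestone/footer search can cut down on false positives, the following Python sketch rejects a carved candidate unless it contains a set of expected intermediate markers, in order, between its header and footer. The announcement does not define "milestone", so this reading of the term is an assumption, and the function name passes_milestones and the choice of JPEG markers are hypothetical examples rather than the project's actual rules.

```python
from typing import Sequence


def passes_milestones(
    candidate: bytes,
    header: bytes,
    milestones: Sequence[bytes],
    footer: bytes,
) -> bool:
    """Check header, in-order milestones, and footer for a carved candidate.

    Assumption: a "milestone" is a byte pattern expected somewhere between
    the header and the footer of a well-formed file of this type.
    """
    if not candidate.startswith(header) or not candidate.endswith(footer):
        return False
    pos = len(header)
    body_end = len(candidate) - len(footer)
    for marker in milestones:
        # Each milestone must appear after the previous one, before the footer.
        pos = candidate.find(marker, pos, body_end)
        if pos == -1:
            return False
        pos += len(marker)
    return True


# Illustrative JPEG rule: the JFIF identifier, then a quantization-table marker.
JPEG_HEADER = b"\xff\xd8\xff"
JPEG_MILESTONES = [b"JFIF", b"\xff\xdb"]
JPEG_FOOTER = b"\xff\xd9"
```

A candidate that merely begins and ends with the right bytes but lacks the expected internal structure is discarded, so an investigator never has to open it. A real validator would go further and parse the format, since simple marker checks can still be fooled by coincidental byte sequences.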
