Television crime dramas have made “forensics” a household word. Computer science professor Vassil Roussev is working to give forensics investigators a universal language: a data query language, that is.
Roussev has been awarded a nearly $300,000 National Science Foundation grant to develop the language, called “nugget,” which he says seeks to make digital forensic investigations quicker for analysts.
“What this work does is create a uniform language to describe what you’re doing,” to arrive at specific results or conclusions, Roussev said. “And the reason this is useful and important is that it documents it exactly, in effective computer code that is understandable to the analyst.”
Digital forensics is the science of tracing or tracking evidence from any digital system or source, such as a computer’s hard drive, a video, an audio file, a cell phone or an email. As the volume of data from these digital sources continues to grow, so does the need for investigators to sift quickly and efficiently through mounds of data to find potential evidence, Roussev said.
“When people talk about forensic analysis, they essentially figure out what happened,” he said. “Forensics is basically analysis after the fact or when you suspect something has happened.”
Currently, investigator notes are the main guide for analysts trying to reproduce information gleaned from a colleague’s data search. However, many forensic tools are proprietary and are not designed to be used across different software tools or systems, Roussev said. Having to manually comb through another investigator’s notes to determine how they arrived at a particular conclusion slows the investigatory process, he said.
“Nugget,” an idea that originated with Roussev and is being executed by Christopher Stelly, a doctoral research student, is designed as an open-source project that would operate regardless of the software the analyst uses.
As it relates to usability and performance requirements in digital forensics and incident response investigations, nugget seeks to: provide investigators with the means to easily and completely specify the data flow of a forensic inquiry from data source to final results; allow the fully automatic—and optimized—execution of the forensic computation; and provide a complete, formal and auditable log of the inquiry.
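The three goals above can be illustrated with a small Python sketch. It does not show nugget's syntax or implementation, which this article does not describe; the `Inquiry` class, its `apply` method, the two filter steps and the sample data are all hypothetical names chosen for illustration. The sketch assumes each step takes bytes and returns bytes, and it records a SHA-256 hash after every step so that a second analyst could rerun the same steps and compare logs.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


class Inquiry:
    """Hypothetical helper (not nugget's API): runs steps and logs each one."""

    def __init__(self, source_name: str, data: bytes) -> None:
        self.data = data
        # The first log entry identifies the data source by name and hash.
        self.log = [{"step": "source", "name": source_name,
                     "sha256": sha256_hex(data)}]

    def apply(self, name, func):
        """Apply one bytes-to-bytes step and record the hash of its output."""
        self.data = func(self.data)
        self.log.append({"step": name, "sha256": sha256_hex(self.data)})
        return self  # returning self lets steps be chained


def keep_email_lines(data: bytes) -> bytes:
    """Keep only the lines that contain an '@' character."""
    return b"\n".join(line for line in data.splitlines() if b"@" in line)


def lowercase(data: bytes) -> bytes:
    """Normalize the data to lowercase."""
    return data.lower()


def run_example():
    """Run the illustrative data flow and return (final data, audit log)."""
    sample = b"Alice@Example.com\nnot an address\nbob@example.org\n"
    inquiry = Inquiry("sample.txt", sample)
    inquiry.apply("keep_email_lines", keep_email_lines).apply("lowercase", lowercase)
    return inquiry.data, inquiry.log


if __name__ == "__main__":
    result, log = run_example()
    print(result.decode())
    for entry in log:
        print(entry)
```

Because every step is a named function and every intermediate result is hashed, running the same flow on the same source should produce an identical log, which is the kind of check handwritten notes cannot offer.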
The investigation process has to be not only understandable but also reproducible, which is a key requirement both in science and in court cases, Roussev said.
“If (the results) can’t be reproduced, it’s not science,” said Roussev, referencing a basic tenet of scientific research methods.