Applying the Science of Similarity to Computer Forensics
DoD Cyber Crime Conference, 2011

Slides (pdf)    

Computers are fantastic at finding identical pieces of data, but terrible at finding similar data. Part of the problem is defining what "similar" means in a given context. The relationships between similar pictures are different from the relationships between similar pieces of malware. This talk will explore the different kinds of similar, a scientific approach to finding similar things, and how these apply to computer forensics. Fuzzy hashing was just the beginning! Topics will include wavelet decomposition, control flow graphs, cosine similarity, and lots of other fun mathy stuff that will make your life easier.
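To make the identical-versus-similar gap concrete, here is a minimal Python sketch, not taken from the talk or its slides. It assumes a byte-frequency histogram as the feature vector, which is only one of many possible choices, and the helper names `byte_histogram` and `cosine_similarity` are hypothetical. A cryptographic hash can only report that two inputs differ, while cosine similarity over the histograms gives a graded score.

```python
import hashlib
import math
from collections import Counter


def byte_histogram(data: bytes) -> Counter:
    """Count how often each byte value occurs (an assumed, very simple feature vector)."""
    return Counter(data)


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse count vectors; 1.0 means same direction."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for empty input
    return dot / (norm_a * norm_b)


original = b"The quick brown fox jumps over the lazy dog"
modified = b"The quick brown fox jumps over the lazy cat"

# Exact matching: changing three bytes yields a completely different digest.
same_hash = hashlib.sha256(original).digest() == hashlib.sha256(modified).digest()

# Similarity: the byte histograms barely change, so the score stays close to 1.
score = cosine_similarity(byte_histogram(original), byte_histogram(modified))

print("identical by SHA-256:", same_hash)
print("cosine similarity of byte histograms:", round(score, 4))
```

Byte histograms ignore ordering entirely, so two files with the same bytes in a different order would score 1.0. That is exactly the kind of context-dependent definition of "similar" that has to be chosen per problem.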

