Malware Forensic Field Guides: Tool Box 

Chapter 5     File Identification and Profiling

 Command Line Hashing Utilities


Name: Microsoft File Checksum Integrity Verifier (FCIV)
Page Reference:  244
Author/Distributor:  Microsoft
Available From:
Description: FCIV is a flexible command-line utility that allows the digital investigator to hash a single file or recursively scan a directory, generating MD5 or SHA1 hash values for the target files. FCIV also enables the user to limit hashing to specific types of files.




Compute hash and send to output (default screen)
-type Conducts hashing for specific file types; ex: -type *.exe
-exec file List of directories that should not be computed
-wp Without full path name (default: store full path)
-md5 Specifies to use MD5 hashing
Specifies to use SHA1 hashing

Name: GNU Core Utilities
Page Reference:  244
Author/Distributor:  GNU Project
Available From:
Description: The GNU core utilities for Windows are a collection of basic file, shell, and text manipulation utilities that closely comport with the GNU utilities for *nix systems; included in this suite are the CLI md5sum and sha1sum tools.
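
The hashing workflow these command-line tools support (hash one file, or walk a directory and hash only certain file types) can be sketched in Python with the standard library. This is a minimal illustration, not a reimplementation of FCIV or md5sum; the function names hash_file and hash_tree and the pattern argument are hypothetical.

```python
import fnmatch
import hashlib
import os


def hash_file(path, algorithms=("md5", "sha1"), chunk_size=65536):
    """Return {algorithm: hex digest} for one file, reading it in chunks."""
    hashers = {name: hashlib.new(name) for name in algorithms}
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            for hasher in hashers.values():
                hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}


def hash_tree(root, pattern="*"):
    """Recursively hash files under root whose names match pattern (e.g. *.exe)."""
    results = {}
    for folder, _dirs, files in os.walk(root):
        for name in fnmatch.filter(files, pattern):
            full_path = os.path.join(folder, name)
            results[full_path] = hash_file(full_path)
    return results
```

Calling hash_tree on an evidence directory with the pattern "*.exe" would return a dictionary keyed by full path, which can then be compared against a list of known hash values.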

 GUI Hashing Utilities

Name: Hash Quick
Page Reference:  244
Author/Distributor:  Ted Lindsey
Available From:
Description: A lightweight utility with a clean interface, Hash Quick provides drag-and-drop hashing of files and folders using either the MD5 or SHA1 cryptographic algorithm. Further, Hash Quick allows the digital investigator to quickly conduct batch and recursive hashing, functionality that is particularly helpful when examining or comparing multiple files, directories, or subdirectories.

Name:  WinMD5
Page Reference:  244
Author/Distributor:  Edwin Olson
Available From:
Description: WinMD5 is a robust and flexible GUI-based MD5 hashing utility, allowing for both drag-and-drop hashing of target files and folders and hash value comparison (requires the installation of the Microsoft .NET framework on the analysis system).


Name:  MD5Summer
Page Reference:  244
Author/Distributor:  Luke Pascoe
Available From:
Description:  MD5summer enables the digital investigator to select a file or folder and generate MD5
hash values for the contents of each respective file.


Name:  HashOnClick
Page Reference:  244
Author/Distributor:  2BrightSparks
Available From:
Description:  HashOnClick provides hash calculation through Windows Explorer shell extensions upon
right-clicking a target file and offers the additional choices of calculating a hash value with either the
SHA1 or CRC32 algorithms.

Name:    Graphical MD5sum
Page Reference:  244
Author/Distributor:  Toast442
Available From:
Description:  Graphical MD5sum is a relatively lightweight and intuitive MD5 GUI hashing tool that
provides for multiple file drag-and-drop functionality. Results can be quickly and easily copied and pasted into a report or other document using the built-in “To Clipboard” feature.

Name:  Malcode Analyst Pack (MAP)
Page Reference:  244
Author/Distributor:  iDefense
Available From:
Description:  The MAP, a series of tools developed by iDefense Labs (owned by VeriSign, Inc.) to assist investigators with both static and dynamic malware analysis, provides a simple, clean MD5 hash calculation utility that offers hash calculation through Windows Explorer shell extensions upon right-clicking a target file.

Name:  Visual MD5
Page Reference:  244
Author/Distributor:  Protect Folder Plus Team
Available From:   Previously available from
Description:  An intuitive MD5 GUI hashing tool that provides for multiple file drag-and-drop
functionality, Visual MD5 also has features such as displaying the full system path of target files, date
and time stamp reporting of hash generation, and a “copy to clipboard” option for quick collection of
results for pasting into a document.


 File Similarity Indexing

Name:  SSDeep
Page Reference:  245
Author/Distributor:  Jesse Kornblum
Available From:
Description: A fuzzy hashing tool that computes a series of randomly sized checksums for a file, allowing file association between files that are similar in file content but not identical.



Verbose mode, displays filename as it's being processed
Pretty matching mode, similar to -d but includes all matches
Recursive mode
Directory mode, compare all files in a directory
Silent mode, all errors are suppressed
-b Uses only the bare name of files, all path information omitted
Uses relative paths for filenames
Prints output in CSV format
-t Only displays matches above the given threshold
-m Match files against known hashes in file
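
The piecewise idea behind fuzzy hashing can be illustrated with a deliberately simplified sketch. It is not ssdeep's algorithm and does not produce ssdeep-compatible signatures: the rolling sum, the block size of 64, and the function names piecewise_signature and similarity are all assumptions chosen for illustration. A rolling value computed over a small window decides where each piece ends, each piece contributes one character to the signature, and two signatures are then compared for overlap.

```python
import hashlib
from difflib import SequenceMatcher

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"


def _piece_char(piece: bytes) -> str:
    """Reduce one piece of the file to a single signature character."""
    return ALPHABET[hashlib.md5(piece).digest()[0] % 64]


def piecewise_signature(data: bytes, block_size: int = 64, window: int = 7) -> str:
    """Split data at content-defined boundaries and hash each piece."""
    signature = []
    piece_start = 0
    rolling = 0  # sum of the last `window` bytes
    for index, byte in enumerate(data):
        rolling += byte
        if index >= window:
            rolling -= data[index - window]
        # A boundary is triggered by the content itself, not by a fixed offset.
        if rolling % block_size == block_size - 1:
            signature.append(_piece_char(data[piece_start:index + 1]))
            piece_start = index + 1
    if piece_start < len(data):
        signature.append(_piece_char(data[piece_start:]))
    return "".join(signature)


def similarity(sig_a: str, sig_b: str) -> int:
    """Score two signatures from 0 (nothing shared) to 100 (identical)."""
    return round(100 * SequenceMatcher(None, sig_a, sig_b).ratio())
```

Because boundaries depend only on nearby bytes, appending or changing data in one region leaves the signature characters for the untouched pieces unchanged, which is why similar files yield overlapping signatures while their MD5 values differ completely.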

Name:  SSDeepFE
Page Reference: 245
Author/Distributor:  Richard F. McQuown
Available From:
Description:  SSDeepFE is a slick GUI front-end for ssdeep that allows for quick and efficient file
hashing. SSDeepFE is particularly useful for comparing unknown files against a preexisting piecewise
hash file list, as shown in the figure below.

Name:  DeepToad
Page Reference:  245
Author/Distributor:  Joxean Koret
Available From:
Description:  Inspired by ssdeep, DeepToad is a Python library and tool for clustering similar files using fuzzy hashing techniques. The tool’s menu and usage are shown below:

DeepToad v1.0, Copyright (c) 2009, 2010 Joxean Koret
Usage:
C:\Python26\ [parameters] <directory>
Common parameters:
-o=<directory> Not yet implemented
-e=<extensions> Exclude extensions (separated by comma)
-i=<extensions> Clusterize only specified extensions (separated by comma)
-m=<value> Clusterize a maximum of <value> file(s)
-d=<distance> Specify the maximum edit distance (by default, 16 or 33%)
-ida Ignore files created by IDA
-spam Enable spam mode (remove space characters)
-dspam Disable spam mode
-p Just print the generated hashes
-c Compare the files
-echo=<msg> Print a message (usefull to generate reports)
Advanced parameters:
-b=<block size> Specify the block size (by default, 512)
-r=<ignore range> Specify the range of bytes to be ignored (by default, 2)
-s=<output size> Specify the signature's size (by default, 32)
-f Use faster (but weaker) algorithm
-x Use eXperimental algorithm
-simple Use the simplified algorithm
-na Use non aggresive method (only applicable to default
-ag Use aggresive method (default)
-nb Ignore null blocks (default)
-cb Consider null blocks
Analyze a maximum of 25 files excluding zip and rar files:
C:\Python26\,.rar -m=25 /home/luser/samples

File Visualization

Name:  CryptoVisualizer (Part of the Crypto Implementations Analysis Toolkit)
Page Reference:  246
Author/Distributor:  Omar Herrera
Available From:
Description:  The Crypto Implementations Analysis Toolkit is a suite of tools for the detection and
analysis of encrypted byte sequences in files. CryptoVisualizer displays the data contents of a target file in a graphical histogram, allowing the digital investigator to identify pattern or content anomalies.
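
The kind of byte-frequency histogram described here can be approximated in a few lines of Python. This is only a text-mode sketch of the general idea, not CryptoVisualizer's implementation; the function names and the choice of 16 buckets are assumptions, and the bucket count is assumed to divide 256 evenly.

```python
from collections import Counter


def byte_histogram(data: bytes) -> list:
    """Return a 256-element list: how often each byte value occurs."""
    counts = Counter(data)
    return [counts.get(value, 0) for value in range(256)]


def render_histogram(data: bytes, buckets: int = 16, width: int = 40) -> str:
    """Group byte values into buckets and draw one text bar per bucket."""
    hist = byte_histogram(data)
    per_bucket = 256 // buckets
    totals = [sum(hist[i * per_bucket:(i + 1) * per_bucket]) for i in range(buckets)]
    peak = max(totals) or 1  # avoid dividing by zero on empty input
    lines = []
    for i, total in enumerate(totals):
        bar = "#" * round(width * total / peak)
        low, high = i * per_bucket, (i + 1) * per_bucket - 1
        lines.append(f"0x{low:02X}-0x{high:02X} {bar}")
    return "\n".join(lines)
```

Plain text and most code tend to cluster in a few buckets, whereas encrypted or compressed content tends to produce bars of nearly equal length across the whole range.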

Name:  BinVis
Page Reference:  246
Author/Distributor:  Gregory Conti/Marius Ciepluch
Available From:
Description:  BinVis is a binary file visualization framework that enables the digital investigator to view binary structures in unique ways. As shown in the figure below, BinVis provides eight distinct
visualization modes that render alternative graphical perspectives on the target file structure, data patterns, and contents. Particularly useful for analysis is the interconnectedness of the views; for example, if the digital investigator opens the byteplot display and the strings viewer, clicking a region in the byteplot viewer automatically displays the same area of the target file in the strings viewer.

Hexadecimal Editors

Name:  McAfee FileInsight
Page Reference:  248
Author/Distributor:  McAfee
Available From:
Description:  FileInsight is a versatile hexadecimal editor geared toward suspicious file and malicious code analysis. In addition to traditional hexadecimal and strings parsing functionality, enhanced file parsing and navigation capabilities can be implemented with custom plug-ins and scripting. Lastly, a remote acquisition feature allows the digital investigator to acquire and input files hosted on remote URLs, even through a proxy server.

Name: 010 Editor
Page Reference:  248
Author/Distributor:  SweetScape Software
Available From:
Description:  A Swiss Army Knife of hex editors, 010 Editor uses unique Binary Templates that allow the digital investigator to parse the particularized file structures within a myriad of binary files. Similar to
other plug-in or scripting language-based tools, a number of freely available templates have been developed by other 010 Editor users. In the figure below, a PDF file is parsed with the PDF Template developed by Didier Stevens. 010 Editor can also be used to compare two different files and generate hash values and histograms of data contents.


Name:  FlexHex
Page Reference:  248
Author/Distributor:  FlexHex
Available From:
Description:  A valuable hex editor for examining malicious binaries and document files, FlexHex can
parse OLE compound files and present the file structures for examination in a separate navigation pane.

 File Identification and Classification

Name:  GT2
Page Reference:  249
Author/Distributor:  Philip Helger (also known as “PHaX”)
Available From:
Description:  In addition to identifying an unknown binary’s file format, GT2 details the file’s target
operating system and architecture, file resources, dependencies, and metadata. GT2 can also parse a variety of file formats, identifying file structures and enumerating offsets.

Name:  File Identifier
Page Reference: 249
Author/Distributor: OptimaSC
Available From:
Description:  A command-line utility that is close to the functional equivalent of the Linux file
command, with additional metadata extraction and reporting features.

Helpful Switches:

Print out some information
Don't modify the last access date of the resource
-v Verbose mode
-cb File identification only (no metadata)
-cs Standard identification search
-ch Extended identification search (slower)
-eh0 Print to HTML report
-ec Print to CSV report
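
File identification tools of this kind generally start by comparing the leading bytes of a file against known signatures. The sketch below shows that first step only, using a handful of widely documented signatures; the table is far smaller than what any real tool ships, and the names SIGNATURES, identify, and identify_file are hypothetical.

```python
# A few well-known leading-byte signatures; real tools carry thousands.
SIGNATURES = [
    (b"MZ", "DOS/Windows executable (MZ header)"),
    (b"%PDF-", "PDF document"),
    (b"\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1", "OLE compound file (legacy Office document)"),
    (b"PK\x03\x04", "ZIP archive (also Office Open XML, JAR)"),
    (b"ITSF", "Compiled HTML Help (CHM)"),
    (b"\x7fELF", "ELF executable"),
]


def identify(data: bytes) -> str:
    """Return a label for the first signature that matches the start of data."""
    for magic, label in SIGNATURES:
        if data.startswith(magic):
            return label
    return "unknown"


def identify_file(path: str) -> str:
    """Identify a file on disk by reading only its first 16 bytes."""
    with open(path, "rb") as handle:
        return identify(handle.read(16))
```

A match on the leading bytes is only a starting point: a file extension can be changed freely, and a signature match still needs to be confirmed by parsing the rest of the structure.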

Name:  The Digital Record Object Identifier (DROID)
Page Reference: 251
Author/Distributor: British National Archives, Digital Preservation Department
Available From: and for tool
download, go to
Description: DROID is a GUI tool with similar functionality to TrIDNet. Developed by the British
National Archives Digital Preservation Department, as part of its PRONOM technical registry project,
DROID performs automated batch identification of file formats.


Name:  FileAlyzer
Page Reference: 251
Author/Distributor: Patrick Kolla/
Available From:
Description: A GUI-based utility for file identification and basic file analysis, including type
identification, hash value, properties, contents, and structure. A multipurpose tool, FileAlyzer also serves as a hex viewer, strings extractor, and PE file viewer.

 Embedded Artifact Extraction


Name:  TextScan
Page Reference: 258
Author/Distributor:  AnalogX
Available From:
Description: One good alternative or supplemental GUI-based strings extraction tool is TextScan. Like
BinText, TextScan has simple load functionality, will extract all of the ASCII and Unicode text contained inside the file (minimum character length can be adjusted), and will attempt to identify certain entities, such as function calls and DLLs.
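
The core of any strings extraction tool, pulling runs of printable ASCII and Unicode (UTF-16LE) characters above a minimum length, can be sketched as follows. This is an illustration of the general technique rather than TextScan's implementation; extract_strings and min_length are hypothetical names, and each result is paired with its file offset.

```python
import re


def extract_strings(data: bytes, min_length: int = 4):
    """Return (ascii_hits, wide_hits) as lists of (offset, text) tuples."""
    # Runs of printable ASCII bytes.
    ascii_pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    # Runs of printable characters each followed by a null byte (UTF-16LE).
    wide_pattern = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_length)
    ascii_hits = [
        (match.start(), match.group().decode("ascii"))
        for match in ascii_pattern.finditer(data)
    ]
    wide_hits = [
        (match.start(), match.group().decode("utf-16-le"))
        for match in wide_pattern.finditer(data)
    ]
    return ascii_hits, wide_hits
```

Raising min_length reduces noise from random byte runs at the cost of missing short but meaningful strings.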


Name:  Malcode Analyst Pack (MAP)
Page Reference: 258
Author/Distributor:   David Zimmer/iDefense
Available From:

Description: Another handy strings-parsing utility is the strings shell extension in the iDefense Malcode Analyst Pack (MAP). As previously mentioned in the Tool Box section in the context of hash values, MAP was developed by iDefense to assist investigators with both static and dynamic malware analysis. The strings shell extension is handy and simple: simply right-click on the file to be examined and choose the “Strings” shell extension. The strings in the file are parsed out into an easily navigable interface. The tool also provides a search function if a particular string is sought within the file. Like BinText and TextScan, the MAP Strings tool extracts both ASCII and Unicode strings and expressly bifurcates these results in the tool’s output.


Name: Binary Text Scan
Page Reference: 258
Author/Distributor:  Brian Enigma
Available From:  Previously hosted on and archived on
Description:  An older and little-known tool, BinaryTextScan is now difficult to find on the Internet.
Written by Brian Enigma, BinaryTextScan offers a simple output interface and identifies the corresponding file offset of discovered strings. Like other GUI strings analysis tools, BinaryTextScan also provides a string search function.


Name: TextExtract
Page Reference:  258
Author/Distributor:  Ultima Thule Ltd.
Available From:  Previously hosted on and now archived on

Description: Another GUI-based strings extraction tool is Ultima Thule Ltd.’s TextExtract. TextExtract differs a bit from the tools referenced above, particularly in that it pipes output into a text file as opposed to directly into the interface.

 Symbolic and Debug References

Page Reference: 260-261
Author/Distributor: Tim “Chez” Tabor
Available From:

Description: A sleek front-end for DUMPBIN, which includes dumpbinCHM. It is a shell context menu
extension: the digital investigator right-clicks the target file and selects the DUMPBIN argument to be
applied against it.


 File Dependencies


Name:   LDD-win32 (altbinutils-pe)
Page Reference: 259
Author/Distributor:  Minimalist GNU for Windows (MinGW)
Available From:
Description:  A Windows port of ldd, a Linux tool for identifying a target file’s shared library dependencies.

Name:   PEBrowse Professional
Page Reference: 259
Author/Distributor: SmidgeonSoft
Available From:

Description:  PEBrowse Professional is a GUI-based static analysis tool and disassembler for
Win32/Win64 Portable Executable files. Using the toggle button features of PEBrowse, the digital
investigator can drill down into a suspect binary’s file dependencies and associated API functions.
Further, upon double-clicking an API function, a memory offset for the reference is displayed in a
separate viewing pane.


 File Metadata

Name:   ExifTool GUI
Page Reference:  263
Author/Distributor: Bogdan Hrastnik
Available From:

Description:  An intuitive graphical front-end to exiftool for recursively extracting metadata from a myriad of
file types.


 File Obfuscation: Packers and Cryptors

 Packing and Cryptor Identification

Name: PEiD Plug-ins
Page Reference:  269
Author/Distributor: Various authors and contributors
Available From:
Description:  PEiD is the freeware packer and cryptor detection tool most predominantly used by digital
investigators, both because of its high detection rates (more than 600 different signatures) and its easy-to-use GUI, which allows for multiple file and directory scanning with heuristic scanning options. PEiD contains a plug-in interface and a myriad of plug-ins that afford additional detection functionality, as described in the table below.

Name:  PE Detective
Page Reference: 270
Author/Distributor:  Daniel Pistelli/NTCore
Available From:
Description:  PE Detective, created by Daniel Pistelli, can scan a single PE file or recursively scan entire directories to identify compilation and obfuscation signatures. PE Detective is deployed along with the Signature Explorer, shown in the figure below, which is an advanced signature manager used to check collisions and to handle, update, and retrieve signatures. To examine a file in PE Detective, simply identify a suspect file through the browsing function, or drag and drop the file into the tool interface. The output from the tool will appear in the main “matches” pane. If there are multiple signature results, they will be listed in descending priority. The data for each identified match reveals the signature name, the number of matches (meaning how many bytes in the signature match), and possible comments regarding the signature.

Name:  Mandiant Red Curtain
Page Reference: 270
Author/Distributor: Mandiant
Available From:

Description:  Another excellent utility for identifying both binary obfuscation mechanisms and other
malicious file characteristics and identifiers is Mandiant’s Red Curtain (MRC). MRC examines a
Windows executable file and determines its level of “suspiciousness” by evaluating it against a set of
certain criteria. In particular, MRC examines multiple aspects of a suspect executable, including entropy, indicia of obfuscation, compiler packing signatures, the presence of digital signatures, and other characteristics, and then generates a threat “score” as a preliminary “litmus test” in deciding whether a particular file requires further, more extensive investigation. Upon querying a target file, MRC produces an XML report detailing its analysis. The user interface displays the report in a grid, much like a typical spreadsheet application, allowing the digital investigator to arrange the various columns contained in the report, as shown in the figure below.

Another interesting and valuable feature of MRC is that it offers a “roaming” mode, allowing the
installation of an Agent on removable media to quickly gather information from other systems without
having to install the full MRC application (which requires .NET). Agent-gathered information
subsequently can be opened in the MRC user interface for analysis.

Moreover, unlike traditional packing detection utilities that simply scan a target binary to detect the
presence of a known packer or cryptor signature, MRC also focuses on file entropy or the measure of
“randomness” in the code. In addition to evaluating the entropy of a file, MRC examines a number of
other properties in a queried specimen file, including the digital signatures embedded in the file, PE
structure anomalies, unusual imported .dlls, and section permissions to calculate an aggregate “Threat
Score.” The Threat Scores and correlating values as defined by Mandiant are shown in the figure below.
In addition to the main graphical grid interface, MRC provides the user with an additional interface to
inspect the particular portions of the executable specimen that were evaluated by MRC in calculating the aggregate threat score assigned to the specimen.
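
The entropy measure referred to above is commonly computed as Shannon entropy over byte values, which ranges from 0 bits per byte (a single repeated value) to 8 bits per byte (all 256 values equally likely). The sketch below shows that calculation only; it is not Mandiant's scoring logic, and the function name shannon_entropy is hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    entropy = 0.0
    for count in Counter(data).values():
        probability = count / total
        entropy -= probability * math.log2(probability)
    return entropy
```

Values approaching 8 suggest compressed, packed, or encrypted content, though legitimate compressed resources score high as well, so the figure is an indicator to weigh alongside other file characteristics rather than a verdict.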

Name: StudPE
Page Reference:  270
Author/Distributor: “Christi G”
Available From:
Description: StudPE, a powerful multipurpose PE analysis tool written by “Christi G,” offers a flexible packer signature identification feature and provides the ability to query a suspect file against a built-in or external signature database.

Name:  RDG
Page Reference:  270
Author/Distributor:  RDGMax
Available From:
Description:  RDG is the only GUI-based packer and compiler detection tool exclusively in the Spanish language. Previous “hacked” versions exist in English, but these are often hosted on shadier Internet forums. In addition to compiler and packer detection, RDG offers numerous other malicious binary analysis utilities, such as an entropy calculator, cryptographic algorithm detection, OEP detection, and custom signature creation, among others.

Name: Protection ID
Page Reference:  270
Author/Distributor:  cdkiller
Available From:  http:/
Description:  Protection ID is a GUI-based packing detection scanner for programs relating to Compact Disc copy protection mechanisms, as well as obfuscated executable files. The tool offers a series of options, such as “Context Menu,” “Aggressive Scan,” and “Smart Scan,” but without supporting documentation describing their respective functionalities.

 Windows Executable File Format

Name:  PEView
Page Reference:  273
Author/Distributor:  Wayne J. Radburn
Available From:
Description:  PEView is a dual-paned graphical PE file parsing tool, providing the digital investigator
with an intuitive view of PE file structure and contents; toggle buttons allow for hierarchical drilling down
deeper into the target file.

Name:  Anywhere PE Viewer
Page Reference:  273
Author/Distributor:  Artem Kuroptev/UCWare
Available From:

Description:  Written in Java, Anywhere PE Viewer is a cross-platform PE file viewer that provides for
convenient drag-and-drop target file loading. The analyst interface is divided into four tabs for separate
viewing of the PE Header, Import Table, Export Table, and Resources.


Name:  PE Explorer
Page Reference:  273
Author/Distributor:  Heaven Tools
Available From:
Description: One of the few commercial PE analysis tools, PE Explorer is a robust graphical utility that allows the digital investigator to conduct deep analysis into a suspect PE file’s structure and contents to develop a file profile. PE Explorer includes a PE file viewer, Resource Viewer, Dependency Scanner, and Symbol/Debug information viewer, among other features.

Name:  InspectEXE
Page Reference:  273
Author/Distributor:  Silurian Software
Available From:
Description:  InspectEXE is a PE viewing utility that can be invoked through right-clicking a target file
and selecting “Properties.” Like FileAlyzer, InspectEXE identifies PE structure information, version
information, and other granular details about the target file, as seen in the figure below.

Name:  Exeinfo
Page Reference: 273
Author/Distributor:  Nir Sofer/Nirsoft
Available From:
Description: A great drag-and-drop GUI tool for obtaining these details, including for .dll and driver files, is Nirsoft’s Exeinfo. Simply drag a suspect file into the interface and the tool will query the file and print the results within the interface, as illustrated in the figure below. In addition to identifying the file type, Exeinfo presents basic executable structure details, Created and Modified dates and times, and file metadata, if available.
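
The basic structure details that PE viewers report come from fixed locations in the file: the MZ header stores the offset of the PE signature at 0x3C, and the COFF file header that follows the signature begins with the machine type, section count, and compile time stamp. A minimal Python sketch of reading those fields is shown below; parse_pe_header and MACHINE_TYPES are hypothetical names, and only two machine types are mapped.

```python
import struct
from datetime import datetime, timezone

# Two common COFF machine type values; anything else is reported in hex.
MACHINE_TYPES = {0x014C: "x86", 0x8664: "x64"}


def parse_pe_header(data: bytes) -> dict:
    """Read machine type, section count, and time stamp from a PE file."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # e_lfanew: offset of the PE signature, stored at 0x3C in the MZ header.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    machine, section_count, timestamp = struct.unpack_from(
        "<HHI", data, pe_offset + 4
    )
    return {
        "machine": MACHINE_TYPES.get(machine, hex(machine)),
        "sections": section_count,
        "compiled": datetime.fromtimestamp(timestamp, tz=timezone.utc),
    }
```

The compile time stamp is written by the linker and can be forged, so it should be corroborated with other artifacts before being relied upon.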

Malicious Document Analysis

 Malicious Document Analysis: PDF Files

Name:  Origami
Page Reference:  286-287
Author/Distributor:  Guillaume Delugré, Frédéric Raynal (Contributor)
Available From:
Description:  Origami is a framework of tools written in Ruby designed to parse and analyze malicious
PDF documents as well as to generate malicious PDF documents for research purposes. Origami contains a series of Ruby parsers, or core scripts (described in the table below); additional scripts; and Walker, a GTK GUI for examining suspect PDF files (depicted in the figure below).
Helpful Switches:

pdfscan.rb Parses the contents and structures of a target PDF file
extractjs.rb Extracts JavaScript from a target PDF file specimen
detectsig.rb Detects malicious signatures in a target PDF file specimen
pdfclean.rb Disables common malicious trigger functions
printmetadata.rb Extracts file metadata from a target PDF file specimen


Name:  PDF Toolkit (pdftk)
Page Reference:  291
Author/Distributor:  PDF Labs
Available From:
Description: Although not specifically geared toward malicious PDF analysis, pdftk, a multifunctional
CLI tool, has a number of functions that can assist the digital investigator in probing PDF data, including metadata extraction (shown below) and stream decompression.

C:\Malware Lab>pdftk.exe c:\Malware\PDFs\CMSIconf.pdf dump_data
InfoKey: ModDate
InfoValue: D:20100629103444+08'00'
InfoKey: CreationDate
InfoValue: D:20100629103353+08'00'
PdfID0: c86a7444fab1b41a530d5d29cc77d7a
PdfID1: 897f9215590643a9a3d611ffe01aa0
NumberOfPages: 1
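
The ModDate and CreationDate values in this output use the PDF date string format (D:YYYYMMDDHHmmSS followed by a time zone offset). Converting them into proper time stamps makes it easier to place a document on a timeline. The parser below is a minimal sketch: parse_pdf_date is a hypothetical helper, it handles only full-length date strings, and it treats a missing offset as UTC, which is an assumption.

```python
import re
from datetime import datetime, timedelta, timezone

PDF_DATE = re.compile(
    r"D:(\d{4})(\d{2})(\d{2})(\d{2})(\d{2})(\d{2})"
    r"(?:([+-])(\d{2})'(\d{2})'?|Z)?"
)


def parse_pdf_date(value: str) -> datetime:
    """Convert a full-length PDF date string into an aware datetime."""
    match = PDF_DATE.fullmatch(value.strip())
    if match is None:
        raise ValueError("unsupported PDF date string: %r" % value)
    groups = match.groups()
    year, month, day, hour, minute, second = (int(part) for part in groups[:6])
    sign, offset_hours, offset_minutes = groups[6:]
    offset = timedelta(0)  # assumption: no offset given means UTC
    if sign:
        offset = timedelta(hours=int(offset_hours), minutes=int(offset_minutes))
        if sign == "-":
            offset = -offset
    return datetime(year, month, day, hour, minute, second, tzinfo=timezone(offset))
```

Applied to the two values above, the parsed time stamps show that the document was modified 51 seconds after it was created, in a UTC+8 time zone.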

Name:  Jsunpack-n
Page Reference:  290
Author/Distributor:  Blake Hartstein
Available From:
Description:  Jsunpack-n, “a generic JavaScript unpacker,” is a suite of tools written in Python designed to emulate browser functionality when navigating to URLs. Although a powerful tool for researchers to identify client-side browser vulnerabilities and exploits, Jsunpack-n is also a favorite tool of digital investigators for examining suspect PDF files and extracting embedded JavaScript. In the figure below, the script is used to extract JavaScript from a suspect PDF file specimen and write it to a separate file for further analysis.

/home/malwarelab/Desktop/merry_christmas\ UNZIPPED.pdf
processing /home/malwarelab/Desktop/merry_christmas UNZIPPED.pdf!!!
parsing /home/malwarelab/Desktop/merry_christmas UNZIPPED.pdf
failed to decompress object 26 0
Found JavaScript in 31 0 (3106 bytes)
children []
tags [['Filter', ''], ['FlateDecode', ''], ['Length', '1213']]
indata = <</Filter[/FlateDecode]/Length
Wrote JavaScript (9085 bytes -- 5979 headers / 3106 code) to file
/home/malwarelab/Desktop/merry_christmas UNZIPPED.pdf.out
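
The output above refers to FlateDecode streams, which are zlib-compressed. The decompression step can be sketched with the Python standard library as follows. This is not how Jsunpack-n is implemented: the regular expression is a rough heuristic that ignores the PDF object structure, other filters, and filter chains, and inflate_streams is a hypothetical name.

```python
import re
import zlib

# Rough heuristic: capture everything between "stream" and "endstream".
STREAM = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)


def inflate_streams(pdf_bytes: bytes) -> list:
    """Return decompressed stream bodies; None marks a stream that failed."""
    results = []
    for match in STREAM.finditer(pdf_bytes):
        raw = match.group(1)
        try:
            # decompressobj tolerates the end-of-line bytes left after the data.
            results.append(zlib.decompressobj().decompress(raw))
        except zlib.error:
            results.append(None)
    return results
```

Streams that fail to decompress, like object 26 in the output above, are worth a closer manual look, since they may use a different filter or be deliberately malformed.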

Name:  PDF Structazer
Page Reference:  393
Author/Distributor:  Eric Filiol et al./Ecole Supérieure d’Informatique, Electronique et Automatique
Available From:

Description: PDF Structazer is a GUI-based PDF analysis tool, allowing the digital investigator to
examine the structure and contents of PDF files.


Name:  PDFMiner
Page Reference:  291
Author/Distributor:  Yusuke Shinyama
Available From:
Description: Python PDF parser and analyzer. PDFMiner consists of numerous Python scripts to
examine the textual data inside of a PDF file, including scripts that extract text contents from a PDF
file and dump the internal contents of a PDF file in pseudo-XML format.

Name:    PDF Stream Dumper
Page Reference:   293
Author/Distributor:  SandSprite
Available From:
Description: PDF Stream Dumper is a feature-rich GUI-based malicious PDF analysis tool. Useful for
every phase of suspect PDF file profiling, PDF Stream Dumper has numerous specialized tools to
examine the PDF file structure, individual elements, and objects; scan for known exploits; and extract
obfuscated JavaScript.

Name:   Malzilla
Page Reference:  293
Author/Distributor:  Boban Spasic aka bobby
Available From:
Description:  Described by the developer as a malware hunting tool, Malzilla is commonly used by
malicious code researchers to navigate to potentially malicious URLs in an effort to probe the contents for malicious code and related artifacts. However, Malzilla has a variety of valuable decoding and shellcode analysis features, making it an essential tool in the digital investigator’s arsenal for exploring malicious PDF files.

Name:  PDF-Analyzer
Page Reference:  293
Author/Distributor:  Ingo Schmoekel
Available From:
Description:  Although not geared toward malicious PDF forensics, PDF-Analyzer is a graphical PDF
analysis tool that can be used by the digital investigator to extract metadata and view file structures and
properties in a target PDF specimen.

Name:  Open PDF Analysis Framework (OPAF)
Page Reference:  291
Author/Distributor:  Felipe Andres Manzano
Available From:
Description:  OPAF is a suite of eight Python scripts to parse and extract PDF elements.

Malicious Document Analysis: Microsoft Office Files

Name:  STG
Page Reference:  297-298
Author/Distributor:  Microsoft
Available From:
Description:  STG is a basic GUI utility to browse OLE Structured Storage files.

Name:  BiffView
Page Reference:  297-298
Author/Distributor: DIaLOGIKa
Available From:
Description:  Microsoft Office Excel workbooks are compound files saved in Binary Interchange File
Format (BIFF), which contain storages and numerous streams. As a part of the Office Binary (doc, xls,
ppt) Translator to Open XML project, BiffView was developed in an effort to analyze the BIFF file
structure. Upon processing a target file, BiffView prints an easily navigable HTML file containing the
structures of the target file.

Name:  SSView
Page Reference: 297-298
Author/Distributor:  MiTeC
Available From:
Description:  Useful for examining a suspect document for indicators of malice, SSView is a lightweight graphical tool for parsing the structures and contents of Microsoft OLE Structured Storage files.
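
Before a structured storage file is opened in a viewer, its header can be sanity-checked directly. The sketch below reads the signature, version, byte order, and sector sizes from the fixed header fields of an OLE compound file; it does not walk the directory or streams as the tools in this section do, and parse_ole_header is a hypothetical name.

```python
import struct

OLE_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def parse_ole_header(data: bytes) -> dict:
    """Read basic fields from the header of an OLE compound file."""
    if data[:8] != OLE_MAGIC:
        raise ValueError("not an OLE compound file")
    # Fields at offset 0x18: minor version, major version, byte order,
    # sector shift, mini sector shift (each a 16-bit little-endian value).
    minor, major, byte_order, sector_shift, mini_shift = struct.unpack_from(
        "<HHHHH", data, 0x18
    )
    return {
        "version": f"{major}.{minor}",
        "little_endian": byte_order == 0xFFFE,
        "sector_size": 1 << sector_shift,
        "mini_sector_size": 1 << mini_shift,
    }
```

Header values that fall outside the usual combinations (for example, a sector size other than 512 or 4096 bytes) can be an early sign of a malformed or deliberately crafted document.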

Malicious Document Analysis: CHM Files

Name:  CHM-2-HTML
Page Reference:  309
Author/Distributor:  MacroObject
Available From:
Description:  Although not designed as a malicious CHM analysis tool, as a CHM to HTML converter, CHM-2-HTML quickly converts the elements of a CHM into an HTML page, while extracting and separating out executable files.