Today marks a significant milestone for Malcat users with the release of version 0.9.6, introducing Kesakode, a remote hash lookup service.
This tool is tightly integrated into Malcat’s UI and is designed to match known functions, strings, and constant sets against a comprehensive database of clean, malware, and library files.
This article delves into the features, functionality, and potential use cases of Kesakode.
How Does Kesakode Work?
Indexing Process
According to Malcat’s blog, a vast library comprising over 300 recent malware families and a million unique, clean programs and libraries has been built over the past months.
This includes the extensive 2000+ malware families corpus from Malpedia. For each sample, three sets of features are extracted:
- Function Hashes: Hashes of every interesting function found in the sample, with their absolute offsets masked.
- String Hashes: Hashes of every interesting string found in the sample, utilizing Malcat’s scoring system.
- Fuzzy Hash: A single fuzzy hash computed over the list of interesting code immediates and data constants identified by the constant scanner.
These hashes are stored in a massive relational database linked to their corresponding samples.
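To illustrate why absolute offsets are masked before hashing, here is a minimal Python sketch. It is not Malcat's implementation: the `Instruction` class, the `<ADDR>` placeholder, and the choice of SHA-256 are assumptions made for this example only. The point it shows is that the same function loaded at two different base addresses produces the same hash once its absolute addresses are masked.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Instruction:
    """Hypothetical, simplified view of one disassembled instruction."""
    mnemonic: str
    operands: tuple = ()
    # Indexes of the operands that hold absolute addresses (assumption).
    absolute: frozenset = frozenset()


def masked_function_hash(instructions):
    """Hash a function body after replacing absolute addresses with a placeholder."""
    digest = hashlib.sha256()
    for ins in instructions:
        ops = [
            "<ADDR>" if idx in ins.absolute else op
            for idx, op in enumerate(ins.operands)
        ]
        digest.update(f"{ins.mnemonic} {','.join(ops)}\n".encode("utf-8"))
    return digest.hexdigest()


def make_function(base):
    """Build the same toy function as if it were loaded at address `base`."""
    return [
        Instruction("push", ("ebp",)),
        Instruction("mov", ("eax", hex(base + 0x2000)), frozenset({1})),
        Instruction("call", (hex(base + 0x1500),), frozenset({0})),
        Instruction("ret"),
    ]


if __name__ == "__main__":
    # Both lines print the same digest although the base address differs.
    print(masked_function_hash(make_function(0x400000)))
    print(masked_function_hash(make_function(0x10000000)))
```

Without the masking step, the two copies would hash differently and a relocated or recompiled sample would no longer match the indexed function.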
Query Time
When a Kesakode query is made, the same three sets of hashes are computed and sent to the matching service.
The cloud service then queries the database to identify if and where these hashes have been seen before. The decision tree used is straightforward:
- Library Code: If a function hash is found in a library, it is labeled as LIBRARY code, and the library name is returned.
- Clean Program: The hash is labeled as CLEAN if found in a clean program.
- Malicious: If only seen in malware, it is labeled as MALICIOUS, and all malware family names where it was found are returned.
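The decision tree above can be sketched in a few lines of Python. This is only an illustration of the described priority order (library first, then clean, then malware-only). The function name, the `(kind, name)` input format, and the `UNKNOWN` label for hashes that were never seen are assumptions of this sketch, not part of the Kesakode service.

```python
def classify_function_hash(sightings):
    """Label a hash from the places where it was seen.

    `sightings` is an iterable of (kind, name) pairs, where kind is one of
    "library", "clean" or "malware" (a hypothetical format for this sketch).
    Returns a (label, names) tuple.
    """
    sightings = list(sightings)

    # 1. Any library sighting wins: return the library names.
    libraries = sorted({name for kind, name in sightings if kind == "library"})
    if libraries:
        return ("LIBRARY", libraries)

    # 2. Otherwise, a sighting in a clean program makes the hash CLEAN.
    if any(kind == "clean" for kind, _ in sightings):
        return ("CLEAN", [])

    # 3. Only seen in malware: return every family it was found in.
    families = sorted({name for kind, name in sightings if kind == "malware"})
    if families:
        return ("MALICIOUS", families)

    # Never seen before (label assumed for this sketch).
    return ("UNKNOWN", [])


if __name__ == "__main__":
    print(classify_function_hash([("malware", "FamilyA"), ("malware", "FamilyB")]))
    print(classify_function_hash([("malware", "FamilyA"), ("library", "zlib")]))
```

The ordering matters: a function shared between a malware family and a common library is reported as LIBRARY, so it does not count as evidence for that family.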
For code immediates and data constants, a different approach is taken.
Only malicious samples are considered, and a single fuzzy hash summarizing the whole set of constants and immediates is stored for each of them.
At query time, this fuzzy hash is compared to its nearest neighbors, and all malware families with a similarity score greater than 80% are returned.
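The article does not name the fuzzy hashing algorithm, so the Python sketch below substitutes plain Jaccard similarity over raw constant sets as a stand-in. Only the 80% threshold comes from the description above; the function names and the sample family names are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two collections of constants (0.0 to 1.0)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similar_families(query_constants, family_constants, threshold=0.8):
    """Return (family, score) pairs scoring strictly above the threshold.

    `family_constants` maps a family name to its known constant set.
    Results are sorted from most to least similar.
    """
    scored = (
        (family, jaccard(query_constants, constants))
        for family, constants in family_constants.items()
    )
    matches = [(family, score) for family, score in scored if score > threshold]
    return sorted(matches, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    query = set(range(1, 11))
    known = {
        "FamilyA": set(range(1, 12)),                    # 10 shared of 11 total
        "FamilyB": set(range(1, 9)),                     # exactly 0.8, excluded
        "FamilyC": {1, 2, 3, 4, 5, 20, 21, 22, 23, 24},  # 5 shared of 15 total
    }
    print(similar_families(query, known))
```

A real fuzzy hash compresses the constant set into a compact digest so that nearest neighbors can be searched efficiently. The comparison and threshold logic stays the same in spirit.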
Kesakode is designed to cope with malware that uses deception techniques such as code obfuscation or data/string encryption.
Because it relies on three different sets of features, a sample that defeats one of them can still be matched through the others, which lets Kesakode identify the vast majority of malware samples.
Typical lookup queries take between 1 and 4 seconds, depending on the number of functions and strings in the program.
Users can help improve the database by submitting false positives and negatives with just a few clicks.
Use Cases
Malware Identification
Kesakode’s primary use is malware identification, helping users determine to which malware family a sample belongs.
Unlike public Yara rules, which can be incomplete or outdated, Kesakode matches a sample against a much larger set of patterns: function hashes, string hashes, and constant sets.
It works best on unpacked/dumped samples like those from the Triage sandbox.
Detection Engineering
For detection engineers, Kesakode aids in writing Yara rules by identifying unique portions of code and strings in malware.
The coloring scheme (LIBRARY, MALICIOUS, and CLEAN labels) is applied across Malcat’s views, making it easier to build Yara rules.
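As a rough illustration of that workflow, the Python sketch below turns a list of labeled strings into the text of a simple Yara rule, keeping only those labeled MALICIOUS. The helper names, the input format, and the sample strings are hypothetical, and the script does not talk to Malcat or Kesakode. It also assumes the strings are printable and contain no line breaks.

```python
def yara_escape(text):
    """Escape backslashes and double quotes for a Yara text string."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def build_rule(rule_name, labelled_strings, min_matches=2):
    """Build Yara rule text from (string, label) pairs.

    Only strings labelled MALICIOUS are kept, since LIBRARY and CLEAN
    strings would also match benign files.
    """
    unique = [s for s, label in labelled_strings if label == "MALICIOUS"]
    if not unique:
        raise ValueError("no MALICIOUS strings to build a rule from")

    lines = [f"rule {rule_name}", "{", "    strings:"]
    for index, value in enumerate(unique):
        lines.append(f'        $s{index} = "{yara_escape(value)}" ascii wide')

    # Never require more matches than there are strings in the rule.
    needed = min(min_matches, len(unique))
    lines += ["    condition:", f"        {needed} of them", "}"]
    return "\n".join(lines)


if __name__ == "__main__":
    labelled = [
        ("inflate 1.2.11 Copyright", "LIBRARY"),
        ("C:\\Users\\Public\\stage2.bin", "MALICIOUS"),
        ('cmd.exe /c "del %s"', "MALICIOUS"),
        ("Microsoft Visual C++ Runtime", "CLEAN"),
    ]
    print(build_rule("Example_Family", labelled))
```

A rule generated this way is only a starting point and should be reviewed and tested against clean files before deployment.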
Faster Reverse Engineering
Even for standard application reverse engineering, Kesakode helps identify low-value functions, allowing users to focus on unique parts of the program.
The disassembly view in Malcat labels and colors every known function found in CLEAN programs or LIBRARY code, speeding up the reversing process.
Kesakode is a powerful addition to Malcat, offering a robust solution for identifying malware samples, assisting in detection engineering, and speeding up reverse engineering processes.
With its comprehensive database and efficient query system, Kesakode is set to become an indispensable tool for cybersecurity professionals.