The ability to swiftly and accurately analyze malware is paramount. Traditional reverse engineering and code analysis methods are often too slow to keep pace with the sheer volume of new threats.
Enter Gemini 1.5 Flash, Google’s latest lightweight and cost-effective model, designed to revolutionize malware analysis with remarkable speed and efficiency.
In this article, we delve into Gemini 1.5 Flash’s capabilities, real-world performance, and the infrastructure supporting its deployment.
Gemini 1.5 Flash: A Game Changer in Malware Analysis
According to the Google Cloud report, Gemini 1.5 Flash builds on the robust capabilities of Gemini 1.5 Pro and is engineered for rapid inference and cost-effective deployment.
Both models share the same multimodal capabilities and can handle a context window of over 1 million tokens.
However, Gemini 1.5 Flash stands out with its optimized efficiency and speed, achieved through parallel computation of attention and feedforward components and online distillation techniques.
These architectural optimizations enable Gemini 1.5 Flash to process up to 1,000 requests per minute and 4 million tokens per minute.
This capacity is crucial for platforms like VirusTotal, which sees an average of 1.2 million unique new files every day.
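To put those figures side by side, here is a rough back-of-the-envelope check in Python. It uses only the numbers quoted above and assumes, purely for illustration, that each file maps to exactly one request; the real pipeline may batch or split files differently.

```python
# Figures quoted in the article.
REQUESTS_PER_MINUTE = 1_000
TOKENS_PER_MINUTE = 4_000_000
NEW_FILES_PER_DAY = 1_200_000
MINUTES_PER_DAY = 24 * 60


def daily_request_capacity(rpm: int = REQUESTS_PER_MINUTE) -> int:
    """Requests that fit in one day at a sustained per-minute rate."""
    return rpm * MINUTES_PER_DAY


def required_rpm(files_per_day: int = NEW_FILES_PER_DAY) -> float:
    """Sustained request rate needed if every file is one request (assumption)."""
    return files_per_day / MINUTES_PER_DAY


def avg_tokens_per_request_at_full_rate() -> float:
    """Average token budget per request when both limits are saturated."""
    return TOKENS_PER_MINUTE / REQUESTS_PER_MINUTE


if __name__ == "__main__":
    print("Daily request capacity:", daily_request_capacity())
    print("Required requests/minute:", round(required_rpm(), 2))
    print("Avg tokens per request at full rate:", avg_tokens_per_request_at_full_rate())
```

Under that one-request-per-file assumption, the quoted request limit leaves some headroom over the daily file volume, while the token limit implies a fairly small average token budget per request if the request rate is fully used.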
Real-World Performance: Speed and Accuracy
To evaluate Gemini 1.5 Flash’s real-world performance, we analyzed 1,000 Windows executables and DLLs randomly selected from VirusTotal’s incoming stream.
This diverse selection included both legitimate software and various types of malware. The results were impressive, with Gemini 1.5 Flash processing each file in an average of 12.72 seconds, excluding the unpacking and decompilation stages.
Example 1: Dispelling a False Positive in 1.51 Seconds
In one instance, Gemini 1.5 Flash analyzed the file goopdate.dll (103.52 KB) in just 1.51 seconds. This file triggered a single anti-virus detection on VirusTotal, a common occurrence requiring time-consuming manual review.
Gemini 1.5 Flash quickly identified the file as a simple executable launcher for the BraveUpdate.exe application, allowing analysts to confidently dismiss the alert as a false positive.
Example 2: Resolving Another False Positive
Another example involved the file BootstrapPackagedGame-Win64-Shipping.exe (302.50 KB), flagged by two anti-virus engines on VirusTotal.
Gemini 1.5 Flash analyzed the decompiled code in just 4.01 seconds, revealing that the file was a game launcher. This insight allowed analysts to categorize the file as legitimate, avoiding unnecessary time and effort spent on a potential false positive.
Example 3: Longest Processing with Obfuscated Code
The file svrwsc.exe (5.91 MB) required the longest processing time: 59.60 seconds. Despite obfuscation techniques like XOR encryption, Gemini 1.5 Flash completed its analysis in less than a minute.
It identified the sample as malicious and pinpointed its backdoor functionality, which was designed to exfiltrate data and connect to command-and-control (C2) servers.
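The article does not describe how svrwsc.exe applied XOR, so the following is only a generic sketch of the technique: a repeating-key XOR routine plus a brute-force search over single-byte keys. The function names, the marker, and the placeholder string are all hypothetical and are not taken from the sample.

```python
from typing import Optional, Tuple


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR data with a repeating key; applying it twice restores the input."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def brute_force_single_byte(data: bytes, marker: bytes) -> Optional[Tuple[int, bytes]]:
    """Try every non-zero single-byte key and return the first whose output
    contains the expected marker (for example b"http")."""
    for key in range(1, 256):
        candidate = xor_bytes(data, bytes([key]))
        if marker in candidate:
            return key, candidate
    return None


if __name__ == "__main__":
    # Placeholder string, obfuscated here only to demonstrate the round trip.
    hidden = xor_bytes(b"http://c2.example.invalid/beacon", b"\x5a")
    print(brute_force_single_byte(hidden, b"http"))
```

Real samples often use longer or derived keys, so a single-byte search like this is a starting point rather than a general decoder.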
Example 4: Cryptominer Analysis
In this example, Gemini 1.5 Flash analyzed the decompiled code of a crypto-miner named colto.exe. The model received only the decompiled code as input, without additional metadata or context from VirusTotal.
Within just 12.95 seconds, Gemini 1.5 Flash produced a comprehensive analysis, identifying the malware as a crypto-miner.
It highlighted obfuscation techniques and extracted key Indicators of Compromise (IOCs), such as the download URL, file path, mining pool, and wallet address.
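For comparison, a conventional way to pull simple IOCs out of decompiled text is pattern matching. The sketch below is not how Gemini extracts IOCs; it is a minimal regex-based illustration covering only URLs and IPv4 addresses, and the function name and sample values are invented placeholders.

```python
import re
from typing import Dict, List

# Deliberately simple patterns; real IOC extraction needs far more care.
URL_RE = re.compile(r"https?://[^\s\"'<>)]+")
IPV4_RE = re.compile(
    r"\b(?:(?:25[0-5]|2[0-4]\d|1?\d?\d)\.){3}(?:25[0-5]|2[0-4]\d|1?\d?\d)\b"
)


def extract_iocs(pseudo_c: str) -> Dict[str, List[str]]:
    """Return de-duplicated, sorted URLs and IPv4 addresses found in the text."""
    return {
        "urls": sorted(set(URL_RE.findall(pseudo_c))),
        "ipv4": sorted(set(IPV4_RE.findall(pseudo_c))),
    }


if __name__ == "__main__":
    # Made-up pseudo-C using reserved documentation hosts and addresses.
    sample = '''
    download("http://files.example.com/payload.bin");
    connect_pool("203.0.113.7", 3333);
    '''
    print(extract_iocs(sample))
```

Patterns like these miss anything that is encoded or assembled at runtime, which is where an analysis of the code's logic has an advantage.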
Example 5: Unmasking a Zero-Hour Keylogger
This example demonstrates the power of analyzing code for malicious behavior, particularly in detecting threats that traditional security solutions miss.
The executable AdvProdTool.exe (87 KB) was submitted to VirusTotal, where it evaded all antivirus engines, sandboxes, and detection systems during its initial upload and analysis.
However, Gemini 1.5 Flash uncovered its true nature. In just 4.7 seconds, the model analyzed the decompiled code, identified it as a keylogger, and revealed the IP address and port to which it exfiltrates stolen data.
The analysis highlighted the code’s use of OpenSSL to establish a secure TLS connection to the IP address on port 443.
Crucially, Gemini pointed out the suspicious use of keyboard input capture functions (GetAsyncKeyState, GetKeyState) and their connection to data transmission over the secure channel (SSL_write).
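The combination described here, key-state polling together with an encrypted send, can be approximated with a crude static check. The sketch below is a hypothetical heuristic written for illustration and is not the model's reasoning: it only flags pseudo-C that calls both a keyboard-capture function and SSL_write, without proving that captured data actually flows into the send.

```python
import re
from typing import Set

# API names taken from the article; the grouping into sets is an assumption.
CAPTURE_APIS = {"GetAsyncKeyState", "GetKeyState"}
SEND_APIS = {"SSL_write"}

# Matches an identifier immediately followed by an opening parenthesis.
CALL_RE = re.compile(r"\b([A-Za-z_]\w*)\s*\(")


def called_functions(pseudo_c: str) -> Set[str]:
    """Collect names that appear in call position in the pseudo-C text."""
    return set(CALL_RE.findall(pseudo_c))


def looks_like_keylogger(pseudo_c: str) -> bool:
    """True when key-capture and TLS-send calls appear in the same code."""
    calls = called_functions(pseudo_c)
    return bool(calls & CAPTURE_APIS) and bool(calls & SEND_APIS)


if __name__ == "__main__":
    snippet = "if (GetAsyncKeyState(vk) & 1) { SSL_write(ssl, buf, len); }"
    print(looks_like_keylogger(snippet))
```

A check like this would produce false positives on legitimate software that uses both APIs for unrelated purposes, so it is at best a triage signal.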
Infrastructure and Workflow
The impressive performance of Gemini 1.5 Flash is supported by a robust infrastructure built on Google Compute Engine.
The multi-stage workflow includes scaled unpacking and decompilation stages, which rely on Mandiant Backscatter and the Hex-Rays Decompiler.
Mandiant Backscatter
Mandiant Backscatter, an internal cloud-based malware analysis service, dynamically unpacks incoming binaries. The unpacked binaries are then processed by a cluster of Hex-Rays Decompilers running on Google Compute Engine.
This setup ensures that the decompiled code is concise and ready for analysis by Gemini 1.5 Flash.
Hex-Rays Decompiler
The Hex-Rays IDA Pro Decompilers provide the scalable decompilation power necessary for this pipeline. The resulting decompiled pseudo-C code is stored in a Google Cloud Storage bucket, ready for analysis by Gemini 1.5 Flash.
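The unpack, decompile, and analyze stages described above can be sketched as a simple function pipeline. Everything in this example is a hypothetical stand-in: the stage callables are stubs and do not represent the Backscatter, Hex-Rays, or Gemini interfaces. The timer wraps only the analysis stage, mirroring the way the article reports per-file times that exclude unpacking and decompilation.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class Report:
    sample_name: str
    verdict: str
    analysis_seconds: float


def run_pipeline(
    sample_name: str,
    binary: bytes,
    unpack: Callable[[bytes], bytes],      # stand-in for the unpacking service
    decompile: Callable[[bytes], str],     # stand-in for the decompiler cluster
    analyze: Callable[[str], str],         # stand-in for the model call
) -> Report:
    """Run the three stages in order and time only the analysis stage."""
    unpacked = unpack(binary)
    pseudo_c = decompile(unpacked)
    start = time.perf_counter()
    verdict = analyze(pseudo_c)
    elapsed = time.perf_counter() - start
    return Report(sample_name, verdict, elapsed)


if __name__ == "__main__":
    # Stub stages so the sketch runs without any external service.
    report = run_pipeline(
        "example.exe",
        b"MZ...",
        unpack=lambda raw: raw,
        decompile=lambda raw: "int main() { return 0; }",
        analyze=lambda code: "benign (stub verdict)",
    )
    print(report)
```

Passing the stages in as callables keeps each one replaceable, which fits the article's point that output quality depends on every stage before the model.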
That said, Gemini 1.5 Flash’s impressive performance depends heavily on the quality of the preceding unpacking and decompilation stages.
Continuous improvement in these areas is essential to ensure the highest quality output for analysis. Ongoing development efforts focus on enhancing Gemini’s analytical capabilities and refining the unpacking and decompilation stages.
Gemini 1.5 Flash represents a significant advancement in malware analysis, offering speed and efficiency that traditional methods cannot match.
By leveraging the power of AI and a robust infrastructure, Gemini 1.5 Flash is poised to transform how we approach malware dissection, making the digital world safer.
Sample Details
The following table contains details on the binary samples discussed in this post.
Filename | SHA-256 |
goopdate.dll | 0d2115d3de900bcd5aeca87b9af0afac90f99c5a009db7c162101a200fbfeb2c |
BootstrapPackagedGame-Win64-Shipping.exe | 07db922be22e4feedbacea7f92983f51404578bd0c495abaae3d4d6bf87ae6d0 |
svrwsc.exe | 0cdb71e81b07247ee9d4ea1e1005c9454a5d3eb5f1078279a905f0095fd88566 |
colto.exe | 091e505df4290f1244b3d9a75817bb1e7524ac346a2f28b0ef3c689c445beb45 |
3DViewer2009.exe | 08f20e0a2d30ba259cd3fe2a84ead6580b84e33abfcec4f151c5b2e454602f81 |
AdvProdTool.exe | 04af0519d0dbe20bc8dc8ba4d97a791ae3e3474c6372de83087394d219babd47 |