PDFly Variant Uses Custom PyInstaller Modification, Forcing Analysts to Reverse-Engineer Decryption


A new variant of the PDFly malware ships inside a modified PyInstaller executable that standard extraction tools cannot unpack.

This makes it difficult for security teams to examine the code and understand how the threat operates.

The modified build changes key identifiers and encrypts the Python bytecode in multiple layers, so analysts must reverse-engineer the decryption routine by hand.

PDFly first came to attention when security researcher Luke Acha mentioned the application on social media.

A similar sample called PDFClick was discovered later, showing that threat actors are actively developing this technique. Both samples share the same core modification strategy, which suggests they belong to a broader campaign built to evade detection.

The modified PyInstaller stub contains corrupted strings and uses a custom magic cookie value that differs from standard implementations, preventing automated tools like PyInstxtractor from recognizing the file structure.


Samplepedia analysts identified the encryption scheme after detailed investigation of the malware’s internal components.

When standard extraction tools failed to process the executable, researchers had to examine the file using disassemblers to locate the modified elements.

The investigation revealed that the encryption was not embedded in the PyInstaller stub itself but rather in separate bootstrap files that handle archive extraction during runtime.

The malware developers added a layered encryption scheme to protect the PYZ archive contents from analysis.

After modifying the PyInstxtractor script to recognize the custom magic cookie and removing validation checks, researchers found that extracted files remained encrypted.

Modified PyInstaller cookie structure showing custom magic value (Source - Samplepedia)

Further analysis of the pyimod01_archive.pyc file revealed a multi-stage decryption process involving XOR operations with two different keys, followed by zlib decompression and data reversal before unmarshaling the Python code objects.

Decryption Process and Technical Implementation

The encryption algorithm follows a specific sequence that must be reversed to access the malicious code. First, the archived data undergoes XOR decryption using a 13-byte key labeled SCbZtkeMKAvyU.

The result then passes through zlib decompression. A second XOR operation with a 7-byte key, KYFrLmy, removes a further layer of obfuscation.

Finally, the bytes are reversed before Python’s marshal module processes them into executable code objects.
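The four steps above can be sketched in a few lines of Python. This is a minimal illustration, not the researchers' tool: it assumes each XOR key repeats cyclically from the first byte of the buffer, which the write-up does not state explicitly. The function names are hypothetical, and `encrypt_entry` exists only to build test data for a round trip.

```python
import zlib
from itertools import cycle

# Keys as reported for this sample; other variants may use different values.
KEY_STAGE1 = b"SCbZtkeMKAvyU"  # 13 bytes, applied before decompression
KEY_STAGE2 = b"KYFrLmy"        # 7 bytes, applied after decompression


def xor_repeat(data: bytes, key: bytes) -> bytes:
    """XOR data with key, repeating the key as needed (assumed behavior)."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


def decrypt_entry(blob: bytes) -> bytes:
    """Undo the layers in the reported order and return marshaled bytes."""
    stage1 = xor_repeat(blob, KEY_STAGE1)    # 1. first XOR layer
    stage2 = zlib.decompress(stage1)         # 2. zlib decompression
    stage3 = xor_repeat(stage2, KEY_STAGE2)  # 3. second XOR layer
    return stage3[::-1]                      # 4. byte reversal


def encrypt_entry(payload: bytes) -> bytes:
    """Inverse of decrypt_entry, used here only to create test input."""
    stage3 = xor_repeat(payload[::-1], KEY_STAGE2)
    return xor_repeat(zlib.compress(stage3), KEY_STAGE1)
```

The output of `decrypt_entry` is what would be passed to `marshal.loads`. That call only succeeds under a Python interpreter matching the version the sample was built with, and untrusted marshal data is best handled in an isolated analysis environment.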

Python bytecode disassembly showing XOR decryption implementation (Source - Samplepedia)

Security researchers developed a generic extractor tool to handle multiple variants with different encryption keys.

The tool automatically searches for valid cookie structures in the PE overlay and validates them by checking package length, table-of-contents offset, and Python version fields.
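A rough sketch of that kind of magic-independent search is shown below. It assumes the stock PyInstaller 2.1+ cookie layout as parsed by PyInstxtractor (8-byte magic, package length, TOC offset, TOC length, Python version, 64-byte library name, all big-endian) and that the variant changes only the magic bytes. The plausibility thresholds and the function names are illustrative choices, not taken from the researchers' extractor.

```python
import struct

# magic, package length, TOC offset, TOC length, Python version, lib name
COOKIE_FMT = "!8sIIii64s"
COOKIE_SIZE = struct.calcsize(COOKIE_FMT)  # 88 bytes


def plausible_pyver(pyver: int) -> bool:
    """Accept values such as 27, 38, 310 or 311 (major/minor packed)."""
    major, minor = divmod(pyver, 100) if pyver >= 100 else divmod(pyver, 10)
    return major in (2, 3) and 0 <= minor <= 20


def find_cookies(data: bytes) -> list:
    """Scan backwards and return candidates whose fields are consistent."""
    hits = []
    for pos in range(len(data) - COOKIE_SIZE, -1, -1):
        magic, pkg_len, toc_off, toc_len, pyver, _lib = struct.unpack_from(
            COOKIE_FMT, data, pos
        )
        cookie_end = pos + COOKIE_SIZE
        # The package ends with the cookie, so it must fit before cookie_end.
        if not (COOKIE_SIZE < pkg_len <= cookie_end):
            continue
        # The table of contents must lie inside the package, before the cookie.
        if toc_len <= 0 or toc_off + toc_len > pkg_len - COOKIE_SIZE:
            continue
        if not plausible_pyver(pyver):
            continue
        hits.append({
            "offset": pos,
            "magic": magic,
            "package_start": cookie_end - pkg_len,
            "toc_offset": toc_off,
            "toc_length": toc_len,
            "pyver": pyver,
        })
    return hits
```

A byte-by-byte scan in pure Python is slow on large files, so in practice the search would likely be limited to the PE overlay, as described above. Candidates can still be false positives and need a follow-up check, such as trying to parse the table of contents.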

Once located, the extractor parses the pyimod01_archive.pyc bytecode to extract XOR keys from generator expressions within the ZlibArchiveReader class, enabling automated decryption of future samples.
