Automated Kernel Analysis to Detect Vulnerabilities


Kernel vulnerabilities are prevalent in operating systems and can affect billions of devices. One of the most widely used tools for kernel fuzzing is Syzkaller, which generates syscall sequences based on predefined specifications written in syzlang, its own description language.

Those specifications are still largely written by hand, although earlier research has tried to automate their generation. A new research paper proposes using LLMs (Large Language Models) to generate Syzkaller specifications automatically, with the aim of improving fuzzing. The approach is named “KernelGPT”.

KernelGPT Auto-Detects Vulnerabilities

LLMs have seen large amounts of kernel code, documentation, and use cases during pre-training, and this knowledge can be leveraged to produce valid syscalls. Additionally, KernelGPT uses an iterative approach to derive all specification components automatically.
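The iterative idea can be sketched as a worklist loop: each analysis step may reveal new unknowns (a helper function, a struct type) that are queued for a later step. This is only an illustration under stated assumptions. `analyze` is a hypothetical stand-in for an LLM query, `FAKE_SOURCE` is made-up data, and none of the names come from KernelGPT itself.

```python
from collections import deque

# Made-up "kernel source": each item maps to the items it refers to.
# In a real system this information would come from code extraction
# plus an LLM query; here it is hard-coded for illustration.
FAKE_SOURCE = {
    "example_ioctl": ["example_do_cmd", "struct example_req"],
    "example_do_cmd": ["struct example_info"],
    "struct example_req": ["struct example_info"],
    "struct example_info": [],
}


def analyze(item):
    """Hypothetical stand-in for one analysis step: return referenced names."""
    return FAKE_SOURCE.get(item, [])


def resolve_all(entry_point):
    """Keep analyzing until no step reports anything new."""
    resolved = []
    seen = {entry_point}
    queue = deque([entry_point])
    while queue:
        item = queue.popleft()
        resolved.append(item)
        for dep in analyze(item):
            if dep not in seen:  # queue each unknown only once
                seen.add(dep)
                queue.append(dep)
    return resolved
```

Calling `resolve_all("example_ioctl")` visits the handler first and then every function and type it transitively refers to, stopping once nothing new turns up.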

Initial results showed that KernelGPT helped Syzkaller achieve higher coverage and find multiple previously unknown bugs. According to the researchers, this is the first automated approach to use LLMs for kernel fuzzing.

Workflow of KernelGPT and the traditional method (Source: arXiv)

Kernel and Device Drivers

The syscall interface is where userspace and the kernel interact. Kernel bugs and crashes that userspace applications can trigger are highly risky, as they can affect every application running on the system and bypass all kernel-enforced security policies.

Device drivers, on the other hand, register their syscall handlers with the kernel during initialization. Many drivers also require unique control logic that has no counterpart in the syscall interface; hence, they rely on a generic syscall (such as ioctl) for dispatch.
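To make the dispatch idea concrete, here is a toy user-space model (not kernel code). A driver registers one generic handler under a device path, and that handler branches on a command value. The registry, the device path `/dev/example`, and the command numbers are all invented for illustration.

```python
import errno

# Toy registry standing in for the kernel's table of driver handlers.
HANDLERS = {}

# Invented command values; real drivers define theirs in header files.
EXAMPLE_GET_VERSION = 0x1001
EXAMPLE_SET_MODE = 0x1002


def register_driver(path, handler):
    """Model of a driver registering its handler at initialization."""
    HANDLERS[path] = handler


def example_ioctl(cmd, arg):
    """Generic handler: driver-specific logic is selected by `cmd`."""
    if cmd == EXAMPLE_GET_VERSION:
        return 0, {"major": 1, "minor": 0}
    if cmd == EXAMPLE_SET_MODE:
        if arg not in (0, 1):
            return -errno.EINVAL, None  # argument outside the accepted set
        return 0, None
    return -errno.ENOTTY, None  # unknown command for this device


def ioctl(path, cmd, arg=None):
    """Model of the generic syscall: look up the handler and dispatch."""
    handler = HANDLERS.get(path)
    if handler is None:
        return -errno.ENODEV, None
    return handler(cmd, arg)


register_driver("/dev/example", example_ioctl)
```

A caller that does not know the valid command values only ever reaches the final "unknown command" branch, which is why a fuzzer benefits from a description of each handler's commands and arguments.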

Several methods are used to detect kernel bugs despite the complexity and continuous evolution of OS kernels. One of the most effective techniques is fuzz testing, which generates syscalls and executes them on the target kernel.
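A rough sketch of why specifications matter for fuzzing: the toy target below does real work for only two command values, so blind random inputs almost never get past the first check, while inputs drawn from a small specification do. The target, the command values, and the branch labels are invented. Real fuzzers such as Syzkaller measure kernel code coverage rather than collecting returned labels.

```python
import random

VALID_CMDS = (0x1001, 0x1002)  # invented command values


def toy_target(cmd, arg):
    """Stand-in for a kernel handler; returns a label for the branch taken."""
    if cmd not in VALID_CMDS:
        return "reject_cmd"
    if cmd == 0x1001:
        return "get_version"
    return "set_mode_ok" if arg in (0, 1) else "set_mode_bad_arg"


def fuzz(gen_input, rounds, seed=0):
    """Run the target on generated inputs and collect the distinct branches."""
    rng = random.Random(seed)  # seeded so runs are repeatable
    covered = set()
    for _ in range(rounds):
        cmd, arg = gen_input(rng)
        covered.add(toy_target(cmd, arg))
    return covered


def blind_input(rng):
    # No knowledge of the interface: any 32-bit command, any small argument.
    return rng.getrandbits(32), rng.randrange(256)


def spec_input(rng):
    # Specification-guided: commands come from the known-valid set.
    return rng.choice(VALID_CMDS), rng.randrange(4)
```

Comparing `fuzz(blind_input, 200)` with `fuzz(spec_input, 200)` shows the gap: the blind generator is stuck on the rejection branch, while the guided one exercises the branches behind the command check.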

KernelGPT uses a code extractor and an analysis LLM to generate driver specifications that can enhance kernel fuzzing. For each device's generic handler, it determines the command values, argument types, and type definitions needed to describe it.
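The three components named above can be pictured as a small data model. The classes and the rendered text below are a simplified illustration, loosely modeled on the shape of Syzkaller descriptions. They are not KernelGPT's output format, and the device, command, and struct names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class StructDef:
    """A type definition: struct name plus (field name, field type) pairs."""
    name: str
    fields: list


@dataclass
class IoctlCommand:
    """One command of a generic handler: its value name and argument type."""
    name: str
    arg_type: str
    direction: str = "inout"


@dataclass
class DriverSpec:
    device_path: str
    resource: str
    commands: list = field(default_factory=list)
    types: list = field(default_factory=list)

    def render(self):
        """Emit a simplified, syzlang-like text description."""
        lines = [f"resource {self.resource}[fd]"]
        lines.append(f"# device: {self.device_path}")
        for c in self.commands:
            lines.append(
                f"ioctl${c.name}(fd {self.resource}, cmd const[{c.name}], "
                f"arg ptr[{c.direction}, {c.arg_type}])"
            )
        for t in self.types:
            lines.append(f"{t.name} {{")
            lines.extend(f"\t{fname}\t{ftype}" for fname, ftype in t.fields)
            lines.append("}")
        return "\n".join(lines)


# Hypothetical driver with one command and one struct type.
spec = DriverSpec(
    device_path="/dev/example",
    resource="fd_example",
    commands=[IoctlCommand("EXAMPLE_GET_INFO", "example_info", "out")],
    types=[StructDef("example_info", [("version", "int32"), ("flags", "int32")])],
)
```

Calling `spec.render()` produces one line per command plus a block per struct, which is roughly the information a fuzzer needs to construct a well-formed call to the handler.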

A complete report on KernelGPT has been published, which provides detailed information on its approach, methods, and techniques.


