HackRead

Kernel Observability for Data Movement


Disclaimer: This article was submitted by Hilt as a contributed piece. Hackread.com has not independently verified all claims and does not endorse any specific product or service mentioned.

There is a recurring pattern in post-incident reviews that security teams rarely articulate explicitly: in most breaches, the underlying activity was not invisible. Data movement occurred. Processes accessed files outside of their expected scope. Network connections were established to previously unseen destinations. In retrospect, the sequence of system events forms a clear and traceable chain. The failure is not the absence of signals, but the absence of visibility at the layer where those signals originate.

This reflects a structural limitation in modern security architectures, where most monitoring systems are not positioned at the level where data movement actually occurs.

The Visibility Problem in Modern Infrastructure

Most security organizations already consider their environments “monitored.” Infrastructure typically includes centralized logging pipelines, host-based agents, application telemetry, SIEM systems, and alerting layers aggregating events from multiple sources.

However, a more precise operational question is rarely answerable in a reliable way: Which process on which host is currently accessing which data, and where is that data going next? In most production environments, this question cannot be answered comprehensively, not due to a lack of instrumentation, but because telemetry is primarily collected at the wrong abstraction layer.

Modern observability stacks are predominantly based on user-space instrumentation: application logs, SDK-level hooks, runtime libraries, and API-level event capture. These mechanisms reflect what applications choose to expose rather than what the operating system actually executes. As a result, they provide partial visibility into system behavior, bounded by developer-defined instrumentation points.

By contrast, the actual movement of data occurs at the operating system level.

Operations such as file reads, network writes, process creation, memory mapping, and inter-process communication are all implemented as kernel-mediated events. These operations are not optional from the perspective of application logic; they are enforced through system calls and executed by the kernel as part of standard process execution.

As a result, the kernel observes a complete record of system-level data movement that does not depend on application instrumentation or developer logging decisions.
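As a minimal illustration of this point, consider a trivial data transfer in Python. Every step, however the application chooses (or declines) to log it, resolves to a kernel-mediated system call; the syscall names in the comments are the standard Linux ones underlying each library call:

```python
import os
import socket
import tempfile

payload = b"customer-records"

# Writing and re-reading a file -> openat/write/read syscalls under the hood.
fd, path = tempfile.mkstemp()
os.write(fd, payload)              # write(2)
os.close(fd)

rfd = os.open(path, os.O_RDONLY)   # openat(2)
data = os.read(rfd, 1024)          # read(2)
os.close(rfd)
os.unlink(path)                    # unlink(2)

# Moving the data between endpoints -> socket-family syscalls.
a, b = socket.socketpair()         # socketpair(2)
a.sendall(data)                    # sendmsg(2)
received = b.recv(1024)            # recvmsg(2)
a.close()
b.close()

print(received == payload)         # every hop above was kernel-mediated
```

None of these operations depend on the application emitting a log line; the kernel saw each of them regardless.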

What Adversaries Already Exploit

When an attacker establishes a foothold in a production environment, their activity necessarily executes through kernel-mediated pathways. Process execution, privilege escalation, persistence mechanisms, file system access, and network communication all rely on system calls enforced by the operating system. Even when malicious behavior is initiated in user space, its execution is ultimately scheduled, validated, and mediated by the kernel. The kernel, therefore, captures events such as execve, openat, connect, and related system calls regardless of how or where they were triggered.

This gap is not theoretical. In post-incident investigations, it is common to find that malicious activity was fully observable at the system-call level, but was never surfaced by higher-level telemetry pipelines due to a lack of instrumentation at the kernel boundary.

The same limitation applies to non-adversarial failures. Misconfigured services may write sensitive data to unintended destinations, third-party libraries may initiate unexpected network communication, and scheduled jobs may access unauthorized files without generating any application-level logs. These behaviors are not absent; they are simply not captured by user-space observability systems.

Traditional Data Loss Prevention (DLP) systems attempt to mitigate this through static policy enforcement over predefined data flows. However, static rules are fundamentally limited in dynamic distributed systems. The key question is not whether a rule was violated, but whether actual runtime behavior deviated from expected system behavior.
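The difference between the two approaches can be sketched in a few lines. Here, a learned baseline of kernel-observed interactions replaces a static allow/deny rule set; the service name, event tuples, and baseline shape are illustrative assumptions, not Hilt's actual data model:

```python
# Baseline: interactions this service has historically been observed performing.
baseline = {
    ("billing-svc", "connect", "db.internal:5432"),
    ("billing-svc", "openat", "/etc/billing/config.yaml"),
}

# Runtime stream of kernel-derived events for the same service.
observed = [
    ("billing-svc", "connect", "db.internal:5432"),        # matches baseline
    ("billing-svc", "connect", "203.0.113.9:443"),         # never seen before
    ("billing-svc", "openat", "/home/admin/.ssh/id_rsa"),  # never seen before
]

# Flag deviation from observed history rather than violation of a static rule.
deviations = [event for event in observed if event not in baseline]
for svc, op, target in deviations:
    print(f"deviation: {svc} {op} {target}")
```

A static DLP rule would need to have anticipated the exfiltration destination in advance; the behavioral comparison only needs to know what the service has done before.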

Why Data Movement Is the Correct Abstraction

Security architectures are traditionally organized around static assets: databases, APIs, storage systems, and network boundaries. While intuitive, this model does not reflect how data behaves in distributed systems.

In modern infrastructure, sensitive data is continuously transformed and transferred across multiple execution contexts. A single query result may propagate through application memory, be processed by downstream services, pass through messaging systems, be cached, and eventually transmitted over network interfaces. At each stage, both the location and the set of processes interacting with the data change.

Securing individual assets in isolation does not guarantee the security of the system as a whole. A system may enforce strong protections at the database layer while still allowing downstream services to exfiltrate data after legitimate access. In these cases, the compromise does not occur at a protected boundary, but during runtime data movement between trusted components. Security and compliance should be defined in terms of data movement, not static data location.

The relevant questions become:

  • Where did the data originate?
  • Which processes accessed it?
  • Where did it go next?
  • Does this sequence match expected system behavior?
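The questions above amount to tracing a provenance chain through a stream of kernel-derived events. A hedged sketch, with illustrative event records and field names (not a real event schema), might look like this:

```python
# Simulated kernel-derived events: process 101 reads a database file and
# writes an export; process 202 reads the export and sends it over the network.
events = [
    {"pid": 101, "op": "read",    "src": "/data/customers.db"},
    {"pid": 101, "op": "write",   "dst": "/tmp/export.csv"},
    {"pid": 202, "op": "read",    "src": "/tmp/export.csv"},
    {"pid": 202, "op": "connect", "dst": "198.51.100.7:443"},
]

def trace_forward(origin):
    """Follow data from an origin, through processes that touched it,
    to every downstream destination."""
    chain = []
    tainted = {origin}                      # resources known to carry the data
    for e in events:
        if e.get("src") in tainted:
            chain.append(e)                 # a process read tainted data
        elif "dst" in e and e["pid"] in {c["pid"] for c in chain}:
            chain.append(e)                 # a tainted process wrote somewhere
            tainted.add(e["dst"])
    return chain

for e in trace_forward("/data/customers.db"):
    print(e["pid"], e["op"], e.get("dst", e.get("src")))
```

The chain answers origin, accessors, and destinations in one pass; comparing the resulting sequence against historical sequences answers the fourth question.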

What Kernel-Level Visibility Provides

Answering the data movement question requires observing the layer where movement actually occurs. Every file read, network write, or process interaction with a resource is mediated by the operating system kernel, the component that sits between software and hardware and arbitrates all access to physical resources. User-space processes cannot directly access hardware or memory-backed resources; every operation must pass through a kernel-controlled execution path, which enforces permissions, translates requests into hardware instructions, and ultimately drives the underlying devices.

From a systems perspective, this makes the kernel the lowest and most complete observation point for data movement within a machine. No user-space process can bypass it when accessing resources, as all such operations must be executed through kernel-controlled system call interfaces. It is the complete and authoritative record of data movement.

At Hilt, we built our observability architecture on this foundation using eBPF, a technology that allows instrumentation programs to run directly within the kernel, attached to the system calls and execution points where data movement occurs. When a process opens a file, we see it. When data is written to a network socket, we see it.

When a process spawns a child, establishes an IPC channel, or maps a memory region, we see it. None of this requires any change to the applications being monitored: no SDK, no code modification, no developer instrumentation work. Observation happens entirely at the layer below the application. This design provides a ground-truth representation of system behavior, reflecting actual runtime execution across processes, containers, and hosts.

From Kernel Events to Behavioral Models

Raw kernel telemetry, while complete, produces high-volume event streams that are not directly interpretable for security or compliance decision-making.

To address this, we model system activity as a continuously evolving behavioral graph:

  • Nodes represent processes, files, network endpoints, and IPC channels
  • Edges represent observed interactions derived from kernel events
  • The graph is updated in real time as system execution evolves

This structure allows for queries that are difficult to answer using log-based systems, such as whether a process has previously communicated with a given network destination or whether a service’s file access pattern is consistent with historical behavior. In production deployments, this architecture processes millions of kernel-level events per second across distributed infrastructure, with measured CPU overhead below 0.2% per host. In many environments, kernel-level scheduling efficiency reduces overall system latency relative to user-space instrumentation approaches.
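A stripped-down sketch of the graph structure and the novelty query described above, with invented names throughout (this is an illustration of the idea, not Hilt's implementation):

```python
from collections import defaultdict

class BehaviorGraph:
    """Nodes are processes, files, and network endpoints; edges are
    kernel-observed interactions, accumulated as events stream in."""

    def __init__(self):
        self.edges = defaultdict(set)   # actor -> {(action, target), ...}

    def observe(self, actor, action, target):
        """Incrementally record an edge derived from a kernel event."""
        self.edges[actor].add((action, target))

    def is_novel(self, actor, action, target):
        """Has this actor ever been seen performing this interaction?"""
        return (action, target) not in self.edges[actor]

g = BehaviorGraph()
g.observe("api-gw", "connect", "10.0.0.5:6379")       # historical behavior
g.observe("api-gw", "openat", "/etc/gateway.conf")

print(g.is_novel("api-gw", "connect", "10.0.0.5:6379"))    # seen before
print(g.is_novel("api-gw", "connect", "203.0.113.42:22"))  # first sighting
```

Because the graph is an incremental index rather than a log to be searched, "has this process talked to this destination before" is a constant-time set lookup instead of a retrospective scan.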

Data Movement Governance with Hilt

Hilt integrates kernel-level instrumentation, temporal reconstruction, causal analysis, and adaptive behavioral detection into a single system that operates continuously across distributed environments. Rather than treating observability, security monitoring, and compliance auditing as separate layers, the platform unifies them into a single kernel-derived data model of system behavior.

In production, I lead Hilt’s technical direction as CTO. The system is deployed within infrastructure environments supporting organizations that manage billions of dollars in assets under management. The platform is designed to operate without requiring application-level modification, allowing for deployments across heterogeneous stacks and maintaining consistent visibility across hosts, containers, and services.

The core engineering work behind these systems is developed under my technical leadership at Hilt, in collaboration with engineers including David Gu and Zin Bitar, who contribute to kernel instrumentation, distributed data processing, and real-time behavioral pipeline implementation.

(Photo by Ferenc Almasi on Unsplash)




