Understanding Kernel Monitoring in Windows and Linux
A look at kernel introspection on Windows and Linux: eBPF and syscall hooking as used by cloud-native tools such as Falco, Tetragon, and Sysdig.
The cybersecurity landscape is undergoing a significant shift, moving from security tools monitoring applications running within userspace to advanced, real-time approaches that monitor system activity directly and safely within the kernel by using eBPF. This evolution in kernel introspection is particularly evident in the adoption of projects like Falco, Tetragon, and Tracee in Linux environments. These tools are especially prevalent in systems running containerized workloads under Kubernetes, where they play a crucial role in the real-time monitoring of dynamic and ephemeral workloads.
The open-source project Falco exemplifies this trend. It employs various instrumentation techniques to scrutinize system workloads, relaying security events from the kernel to user space. These instrumentations are referred to as ‘drivers’ within Falco, reflecting their operation in kernel space. The driver is pivotal because it supplies the syscall event source, which is integral for monitoring activities closely tied to the syscall context. When deploying Falco, the kernel module is typically installed via the falco-driver-loader script included in the binary package. This process integrates Falco’s monitoring capabilities into the system, enabling real-time detection and response to security threats at the kernel level.
How Do System Calls Work?
System calls (syscalls for short) are a fundamental aspect of how software interacts with the operating system. They are essential mechanisms in any operating system’s kernel, serving as the primary interface between user-space applications and the kernel.
Syscalls are functions used by applications to request services from the operating system’s kernel. These services include operations like reading and writing files, sending network data, and accessing hardware devices.
- When a user-space application needs to perform an operation that requires the kernel’s intervention, it makes a syscall.
- The application typically uses a high-level API provided by the operating system, which abstracts the details of the syscall.
- The syscall switches the processor from user mode to kernel mode, where the kernel has access to protected system resources.
- The kernel executes the requested service and then returns the result to the user-space application, switching back to user mode.
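The layering described in the steps above can be sketched in a few lines of Python. This is only an illustration, not taken from any of the tools discussed here: the function names are hypothetical, and the exact syscalls issued depend on the platform and C library (on current Linux systems, for instance, an open request is usually served by the openat syscall).

```python
import os
import tempfile


def read_with_high_level_api(path: str) -> bytes:
    """Read a file through Python's buffered file object (the high-level API)."""
    with open(path, "rb") as f:
        return f.read()


def read_with_syscall_wrappers(path: str, chunk_size: int = 4096) -> bytes:
    """Read the same file through os.open/os.read/os.close.

    These functions are thin wrappers over the C library's open, read and
    close, so each call results in a transition into kernel mode and back.
    """
    fd = os.open(path, os.O_RDONLY)  # the kernel hands back a file descriptor
    try:
        chunks = []
        while True:
            chunk = os.read(fd, chunk_size)  # one read request per iteration
            if not chunk:  # an empty result means end of file
                break
            chunks.append(chunk)
        return b"".join(chunks)
    finally:
        os.close(fd)


def demo_read_paths() -> bool:
    """Write a temporary file, read it both ways, and compare the results."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"hello from user space\n")
        path = tmp.name
    try:
        return read_with_high_level_api(path) == read_with_syscall_wrappers(path)
    finally:
        os.unlink(path)
```

Both functions return the same bytes. The difference is how much of the user-to-kernel round trip is hidden from the caller.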
Types of System Calls
System calls can be categorized into several types, such as:
- File management: Operations like open, read, write, and close files
- Process control: Creation and termination of processes, and process scheduling
- Memory management: Allocating and freeing memory
- Device management: Requests to access hardware devices
- Information maintenance: System information requests and updates
- Communication: Creating and managing communication channels
Examples of Linux System Calls
- open(): Used to open a file
- read(): Used to read data from a file or a network
- write(): Used to write data to a file or a network
- fork(): Used to create a new process
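As a rough sketch of how these calls combine, the snippet below forks a child process that writes a message into a pipe, which the parent then reads. It assumes a POSIX system (os.fork is not available on Windows), and the helper name child_says is made up for this example. Note that the C library may implement fork on top of a related syscall such as clone.

```python
import os


def child_says(message: bytes) -> bytes:
    """Fork a child that writes `message` to a pipe; return what the parent reads."""
    read_fd, write_fd = os.pipe()   # pipe(): a kernel-managed byte channel
    pid = os.fork()                 # fork(): duplicate the calling process
    if pid == 0:
        # Child: write the message and exit without running cleanup handlers.
        os.close(read_fd)
        os.write(write_fd, message)  # write()
        os._exit(0)
    # Parent: close the unused write end so read() can see end of stream.
    os.close(write_fd)
    data = b""
    while True:
        chunk = os.read(read_fd, 1024)  # read(): returns b"" at end of stream
        if not chunk:
            break
        data += chunk
    os.close(read_fd)
    os.waitpid(pid, 0)              # reap the child so it does not linger
    return data
```

Every one of these operations passes through the syscall interface, which is why a tool that watches syscalls can see the process being created and the data being exchanged.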
Why System Calls Are Necessary for Kernel Introspection
System calls provide a controlled interface for user-space applications to access the hardware and resources managed by the kernel. They ensure security and stability by preventing applications from directly accessing critical system resources that could potentially harm the system if misused.
Kernel Introspection Performance Considerations
System calls involve context switching between user mode and kernel mode, which can be relatively expensive in terms of performance. Therefore, efficient use of system calls is important in application development.
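To make the cost argument concrete, here is a small sketch that counts how many write requests reach the operating system when the same data is written byte by byte, with and without user-space buffering. The class and function names are hypothetical, and the sink is the null device, so only the call counts are meaningful, not timings.

```python
import io
import os


class CountingRawWriter(io.RawIOBase):
    """Raw writer that forwards to os.write and counts how often it is called."""

    def __init__(self, fd: int) -> None:
        super().__init__()
        self._fd = fd
        self.write_calls = 0

    def writable(self) -> bool:
        return True

    def write(self, data) -> int:
        self.write_calls += 1  # each call here leads to one write request
        return os.write(self._fd, data)


def count_write_calls(payload: bytes, buffered: bool) -> int:
    """Write `payload` one byte at a time and return the number of os.write calls."""
    fd = os.open(os.devnull, os.O_WRONLY)
    try:
        raw = CountingRawWriter(fd)
        # With buffering, bytes accumulate in user space until flush().
        sink = io.BufferedWriter(raw, buffer_size=8192) if buffered else raw
        for i in range(len(payload)):
            sink.write(payload[i:i + 1])
        sink.flush()
        return raw.write_calls
    finally:
        os.close(fd)
```

For a 1,000-byte payload, the unbuffered path issues 1,000 write calls and the buffered path issues one. Fewer transitions between user mode and kernel mode mean less overhead.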
In summary, system calls are crucial for the operation of any computer system, acting as gateways through which applications request and receive services from the operating system’s kernel. They play a critical role in resource management, security, and abstraction, allowing applications to perform complex operations without needing to directly interact with the low-level details of the hardware and operating system internals.
A Shift to eBPF in Linux
In recent years, we have seen a shift towards a technology called extended Berkeley Packet Filter (eBPF for short). eBPF is a revolutionary technology with origins in the Linux kernel that can run sandboxed programs in a privileged context, such as the operating system kernel. It is used to safely and efficiently extend the capabilities of the kernel without requiring changes to kernel source code or the loading of kernel modules, which can make it a safer alternative to the traditional kernel module.
Historically, the operating system has always been an ideal place to implement observability, security, and networking functionality due to the kernel’s privileged ability to oversee and control the entire system. At the same time, an operating system kernel is hard to evolve due to its central role and high requirement for stability and security. The rate of innovation at the operating system level has thus traditionally been lower compared to functionality implemented outside of the operating system.
The most noticeable impact on a host comes from the number of times an event has to be sent to user space and the amount of work that needs to be done in user space to handle this event. In other words, the earlier an event can be confidently dropped and ignored, the better. This is why programmable solutions like eBPF or kernel modules are beneficial. Having the ability to develop fine-grained in-kernel filters to control the amount of data sent from kernel space to user space is a huge benefit in Linux.
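The effect of dropping events early can be modeled in plain Python. This is a simulation only: there is no eBPF here, the Event type and the workload are invented for the example, and the returned list merely stands in for the buffer shared between kernel space and user space.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Event:
    syscall: str
    pid: int


def deliver(events: Iterable[Event], kernel_filter: Callable[[Event], bool]) -> List[Event]:
    """Return only the events that pass the in-kernel filter.

    Every element of the result represents one event that must be copied to
    user space and processed there, so a shorter list means less overhead.
    """
    return [event for event in events if kernel_filter(event)]


def forward_everything(event: Event) -> bool:
    """No in-kernel filtering: user space has to sort out what matters."""
    return True


def only_execve(event: Event) -> bool:
    """Fine-grained filter: drop everything except process executions."""
    return event.syscall == "execve"


def sample_workload() -> List[Event]:
    """A made-up burst of activity dominated by uninteresting reads."""
    noise = [Event("read", 100)] * 97
    return noise + [Event("execve", 200), Event("connect", 200), Event("execve", 300)]
```

With this workload, forwarding everything hands 100 events to user space, while the execve filter hands over 2. The other 98 are discarded before they cost anything beyond the filter check.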
Falco, for example, has the ability to select specific syscalls to monitor through Adaptive Syscall Selection. This empowers users with granular control, optimizing system performance by reducing CPU load through selective syscall monitoring. After mapping the event strings from the rules to their corresponding syscall IDs, Falco uses a dedicated eBPF map to inject this information into the sys_enter and sys_exit tracepoints within the driver.
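The mapping step can be pictured with a short model. The sketch below is not Falco's code: the function names are hypothetical, a Python set stands in for the eBPF map, and the syscall numbers are the x86-64 Linux values (they differ on other architectures).

```python
from typing import Dict, Iterable, Set

# x86-64 Linux syscall numbers for a handful of calls (architecture specific).
SYSCALL_IDS: Dict[str, int] = {
    "read": 0,
    "write": 1,
    "open": 2,
    "close": 3,
    "connect": 42,
    "execve": 59,
    "openat": 257,
}


def build_interest_map(rule_events: Iterable[str]) -> Set[int]:
    """Translate event names used in rules into the syscall IDs to monitor."""
    names = list(rule_events)
    unknown = sorted(set(names) - set(SYSCALL_IDS))
    if unknown:
        raise ValueError("no syscall ID known for: " + ", ".join(unknown))
    return {SYSCALL_IDS[name] for name in names}


def on_sys_enter(syscall_id: int, interest_map: Set[int]) -> bool:
    """Model of the check at the sys_enter tracepoint: True means emit an event."""
    return syscall_id in interest_map
```

A rule set that mentions only execve and openat yields the IDs 59 and 257, so a read (ID 0) is ignored at the tracepoint and never produces an event.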
Falco’s modern eBPF probe is an alternative driver to the default kernel module. The main advantage it brings to the table is that it is embedded into Falco, which means that you don’t have to download or build anything. If your kernel is recent enough, Falco will automatically inject it, providing increased portability for end-users.
How To Handle Kernel Introspection in Windows and Linux
Syscalls in Windows and Linux fundamentally operate in the same way, providing an interface between user-space applications and the operating system’s kernel. However, there are notable differences in their implementation and usage, which also contribute to the variations in system call monitoring tools and the adoption of technologies like eBPF in these environments. Here are some of the clear differences in syscalls between Windows and Linux:
Implementation and API Differences
- Linux: Uses a consistent set of syscalls across different distributions. Linux system calls are well-documented and relatively stable across versions.
- Windows: Applications rarely invoke Windows syscalls directly. They call the Win32 API instead, which can be more complex due to its broader range of functionality and legacy support, and which reaches the kernel through the lower-level Native API in ntdll.dll. The Windows API includes a set of functions, interfaces, and protocols for building Windows applications.
Syscall Invocation
- In Linux, system calls are invoked with a dedicated trap instruction (historically the int 0x80 software interrupt on 32-bit x86, and the syscall instruction on x86-64), which switches the processor from user mode to kernel mode. For example, when a Linux program needs to read a file, it invokes the read syscall, usually through a thin C library wrapper, which is a straightforward interface to the kernel’s file-reading capabilities.
- In contrast, Windows uses a similar mechanism but also includes additional layers of APIs that can abstract the underlying system calls more significantly. For instance, in Windows, a program might use the ReadFile function from the Win32 API to read a file.
This function, in turn, interacts with lower-level system calls to perform the operation. The Win32 API provides a more user-friendly interface and hides the complexity of direct system call usage, which is a common approach in Windows to provide additional functionality and manage compatibility across different versions of the operating system.
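The two entry points can be compared side by side with ctypes. Treat this as a hedged sketch: the helper names are invented for the example, the Windows branch could not be exercised on a Linux host, and the numeric constants are the documented values of GENERIC_READ, FILE_SHARE_READ, OPEN_EXISTING, and FILE_ATTRIBUTE_NORMAL.

```python
import ctypes
import os
import sys
import tempfile


def read_via_libc(path: str, size: int = 4096) -> bytes:
    """POSIX: call the C library's open/read/close, which issue the syscalls."""
    libc = ctypes.CDLL(None, use_errno=True)
    libc.read.argtypes = [ctypes.c_int, ctypes.c_void_p, ctypes.c_size_t]
    libc.read.restype = ctypes.c_ssize_t
    fd = libc.open(os.fsencode(path), os.O_RDONLY)
    if fd < 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
    try:
        buf = ctypes.create_string_buffer(size)
        n = libc.read(fd, buf, size)
        if n < 0:
            err = ctypes.get_errno()
            raise OSError(err, os.strerror(err))
        return buf.raw[:n]
    finally:
        libc.close(fd)


def read_via_win32(path: str, size: int = 4096) -> bytes:
    """Windows: call ReadFile from the Win32 API, which forwards to lower layers."""
    from ctypes import wintypes

    k32 = ctypes.WinDLL("kernel32", use_last_error=True)
    k32.CreateFileW.restype = wintypes.HANDLE
    k32.CreateFileW.argtypes = [wintypes.LPCWSTR, wintypes.DWORD, wintypes.DWORD,
                                wintypes.LPVOID, wintypes.DWORD, wintypes.DWORD,
                                wintypes.HANDLE]
    k32.ReadFile.restype = wintypes.BOOL
    k32.ReadFile.argtypes = [wintypes.HANDLE, wintypes.LPVOID, wintypes.DWORD,
                             ctypes.POINTER(wintypes.DWORD), wintypes.LPVOID]
    k32.CloseHandle.argtypes = [wintypes.HANDLE]
    # GENERIC_READ, FILE_SHARE_READ, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL
    handle = k32.CreateFileW(path, 0x80000000, 0x1, None, 3, 0x80, None)
    if handle is None or handle == wintypes.HANDLE(-1).value:  # INVALID_HANDLE_VALUE
        raise ctypes.WinError(ctypes.get_last_error())
    try:
        buf = ctypes.create_string_buffer(size)
        read = wintypes.DWORD(0)
        if not k32.ReadFile(handle, buf, size, ctypes.byref(read), None):
            raise ctypes.WinError(ctypes.get_last_error())
        return buf.raw[:read.value]
    finally:
        k32.CloseHandle(handle)


def read_file_natively(path: str) -> bytes:
    """Pick the platform's own entry point for reading a file."""
    return read_via_win32(path) if sys.platform == "win32" else read_via_libc(path)


def demo_native_read() -> bool:
    """Round-trip a small temporary file through the native read path."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"same bytes, different entry points\n")
        path = tmp.name
    try:
        return read_file_natively(path) == b"same bytes, different entry points\n"
    finally:
        os.unlink(path)
```

The Linux branch sits one thin wrapper away from the read syscall. The Windows branch goes through ReadFile, which passes the request down to the Native API before the kernel is entered.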
Syscall Monitoring Tools
- Linux: The open-source nature and the standardized system call interface in Linux make it easier to develop and use system call monitoring tools. Tools like auditd, Sysdig Inspect, and eBPF-based technologies are commonly used for monitoring system calls in Linux.
- Windows: System call monitoring tools are less common in Windows partly due to the complexity and variability of the Windows API and kernel. The closed-source nature of Windows also limits the development of external monitoring tools. There are a couple of tools from the Sysinternals suite, such as Procmon and Sysmon, which have existed for a long time. Needless to say, both are closed-source, Microsoft proprietary software. However, Windows does provide its own set of tools and APIs to extend Kernel visibility for monitoring, like Event Tracing for Windows (ETW) and Windows Management Instrumentation (WMI).
Implementing User-Space Hooking Techniques in Windows
- In addition to Procmon and Sysmon, many Windows products utilize kernel drivers, often augmented with user-space hooking techniques, to monitor system calls. User-space hooking refers to the method of intercepting function calls, messages, or events passed between software components in user space, outside the kernel. This technique allows for the monitoring and manipulation of interactions within an application without requiring changes to the underlying operating system kernel.
- User-space hooking is particularly useful in scenarios where kernel-level access is either not feasible or too risky, such as when dealing with security applications, system utilities, or performance monitoring tools. By leveraging user-space hooking, developers can gather valuable data on application behavior, enhance security measures, or modify functionality without the need for deep integration into the operating system’s core.
- Despite these approaches, Windows also offers its own set of tools and APIs to facilitate kernel visibility for monitoring purposes. ETW and WMI are the prime examples. ETW provides detailed event logging and tracing capabilities, allowing for the collection of diagnostic and performance information, while WMI offers a framework for accessing management information in an enterprise environment. Both are instrumental in extending visibility for kernel introspection. However, it’s still worth noting that many endpoint detection tools still rely on user-space hooking techniques that provide limited system visibility.
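The idea behind user-space hooking, and one reason its visibility is limited, can be shown with a deliberately simple Python analogy. Real Windows products patch native code in a process rather than Python attributes, so the following is only a model. The install_hook helper is hypothetical, and the bypass at the end is an illustration added here, not a claim about any specific product.

```python
import functools
import os
from typing import Callable, List, Tuple


def install_hook(module, name: str, log: List[Tuple[str, tuple]]) -> Callable[[], None]:
    """Replace module.<name> with a logging wrapper; return a function that removes it."""
    original = getattr(module, name)

    @functools.wraps(original)
    def hooked(*args, **kwargs):
        log.append((name, args))          # observe the call ...
        return original(*args, **kwargs)  # ... then forward it unchanged

    setattr(module, name, hooked)

    def uninstall() -> None:
        setattr(module, name, original)

    return uninstall


def demo_hook() -> List[str]:
    """Show which calls a user-space hook sees and which ones slip past it."""
    log: List[Tuple[str, tuple]] = []
    saved_reference = os.getcwd  # taken before the hook is installed
    uninstall = install_hook(os, "getcwd", log)
    try:
        os.getcwd()              # resolved through the module: logged
        saved_reference()        # bypasses the hook entirely: not logged
    finally:
        uninstall()
    os.getcwd()                  # hook removed: not logged
    return [name for name, _ in log]
```

Only the call that goes through the patched entry point is recorded. Code that reaches the underlying function by another route is invisible to the hook, which is the kind of blind spot that kernel-level event sources are meant to close.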
eBPF for Windows
The eBPF for Windows initiative is an ongoing project designed to bring the functionality of eBPF, a feature predominantly used in the Linux environment, to Windows. Essentially, this project integrates existing eBPF tools and APIs into the Windows platform. It does so by incorporating existing eBPF projects as submodules and creating an intermediary layer that enables their operation on Windows.
The primary goal of this project is to ensure compatibility at the source code level for programs that utilize standard hooks and helpers, which are common across different operating systems. In essence, eBPF for Windows aims to allow applications originally written for Linux to be compatible with Windows.
While Linux offers a wide array of hooks and helpers, some are highly specific to its internal structures and may not be transferable to other platforms. However, there are many hooks and helpers with more general applications, and the eBPF for Windows project focuses on supporting these in cross-platform eBPF programs.
Additionally, the project makes the Libbpf APIs available on Windows. This is intended to maintain source code compatibility for applications interacting with eBPF programs, further bridging the gap between Linux and Windows environments in terms of eBPF program development and execution.
As of 2024, the eBPF for Windows project is still a work in progress. There are, of course, challenges to adoption in Windows eBPF. The beta status of eBPF for Windows means that it has yet to see the widespread adoption otherwise observed in Linux systems. The challenges include ensuring compatibility with Windows kernel architecture, integrating with existing Windows security and monitoring tools, and adapting Linux-centric eBPF toolchains to the Windows environment.
However, if successfully implemented, eBPF for Windows could bring powerful kernel introspection and programmability capabilities, similar to those in Linux, to Windows environments. This would significantly enhance the ability to monitor and secure Windows systems using advanced eBPF-based tools.
While there are inherent differences in how system calls are implemented and monitored in Windows and Linux, efforts like the eBPF for Windows project represent an ongoing endeavor to bridge these gaps. The potential for bringing Linux’s advanced monitoring capabilities to Windows could open up new possibilities in system security and management, although it faces significant developmental challenges. Currently, Windows cannot interpret Linux system calls.
Kernel Introspection for Windows
There are, of course, alternative approaches for Windows kernel introspection. The Fibratus project (fibratus.io) presents itself as a modern tool for Windows kernel exploration and observability with a focus on security. Fibratus captures system events through Event Tracing for Windows (ETW) rather than through a kernel driver of its own. Many kernel developers will discover that building a kernel driver in Windows is very tedious because of Microsoft’s stringent requirements regarding certification, quality lab testing, and more. Beyond that, writing kernel code is, in general, much more time-consuming, and a crash in a single kernel driver may crash the entire system.
Right now, ETW looks like the best approach for deep kernel insights, since the eBPF for Windows implementation is still somewhat limited to network-stack observability use cases, such as eXpress Data Path (XDP) for DDoS mitigation. ETW is implemented in the Windows operating system and provides developers with a fast, reliable, and versatile set of event-tracing features with very little impact on performance. You can dynamically enable or disable tracing without rebooting your computer or reloading your application or driver. Unlike debugging statements that you add to your code during development, you can use ETW in your production code. Similar to the syscall approaches we mentioned for Linux systems, ETW provides a mechanism to trace and log events that are raised by user-mode applications and kernel-mode drivers.
Kernel Introspection: A Conclusion
Windows security vendors typically maintain a level of confidentiality about the inner workings of their Endpoint Detection & Response (EDR) products. However, it’s widely recognized that many of these products leverage kernel drivers or the Event Tracing for Windows (ETW) framework, sometimes supplemented with user-space hooking techniques. The specific methodologies and implementations often remain under wraps, aligning with industry norms for proprietary technology.
The introduction of eBPF, a technology with roots in the Linux kernel, into Windows environments marks a significant and promising development. eBPF’s transition to Windows is particularly notable for its potential in production environments. Its capability to dynamically load and unload programs without a reboot or a kernel rebuild is a major advancement. This feature greatly facilitates system administration, allowing for more efficient debugging and problem-solving in live environments.
The gradual roll-out of eBPF in Windows signifies a step towards more flexible and powerful system diagnostics and management tools, mirroring some of the advanced capabilities long available in Linux systems. This evolution reflects the ongoing convergence of Linux and Windows operational paradigms and toolsets, enhancing the capabilities and utility of Windows systems in complex, production-grade applications.
Published at DZone with permission of Nigel Douglas. See the original article here.