The Noticeable Shift in SIEM Data Sources
Early SIEM solutions faced hurdles, especially from their data sources. Learn about their evolution over time from cloud to IoT Devices, among others.
SIEM solutions did not work well when they were first introduced in the early 2000s, partly because of their architecture and functionality at the time, but also because of flaws in the data and data sources that fed them.
During this period, data inputs were often rudimentary, lacked scalability, and necessitated extensive manual intervention across operational phases. Three of those data sources stood out.
1. Hand-Coded Application Layer Security
Coincidentally, application layer security became a thing when SIEM solutions were first introduced. Around that time, it became obvious that defending the perimeter, hosts, and endpoints was not sufficient security for applications.
Some developers experimented with hand-coding application security layers to bolster protection against functionality-specific attacks. While this approach added a layer of security, it failed to provide SIEM solutions with accurate data.
This was because the developers were accustomed to writing code to handle use cases, not abuse cases. They lacked the experience to anticipate all likely attacks and to write the complex code needed to log, or control access to, data related to those attacks. Moreover, many sophisticated attacks could only be detected by correlating events across multiple applications and data sources, which was beyond what any individual application could monitor.
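To illustrate why cross-application correlation matters, here is a minimal sketch in Python. The event format, the application names, and the `correlate_failed_logins` function are all hypothetical, invented for this example. It flags a user whose failed logins span several applications within a short window, a pattern that no single application would see on its own.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical, simplified events as three separate applications might log them.
EVENTS = [
    {"app": "web-portal", "user": "alice", "action": "login_failed",
     "time": datetime(2024, 1, 1, 9, 0, 0)},
    {"app": "vpn", "user": "alice", "action": "login_failed",
     "time": datetime(2024, 1, 1, 9, 1, 0)},
    {"app": "mail", "user": "alice", "action": "login_failed",
     "time": datetime(2024, 1, 1, 9, 2, 0)},
    {"app": "web-portal", "user": "bob", "action": "login_failed",
     "time": datetime(2024, 1, 1, 9, 0, 0)},
]


def correlate_failed_logins(events, window=timedelta(minutes=5), min_apps=3):
    """Return users with failed logins in at least `min_apps` distinct
    applications inside a single time window of length `window`."""
    by_user = defaultdict(list)
    for event in events:
        if event["action"] == "login_failed":
            by_user[event["user"]].append(event)

    flagged = []
    for user, user_events in by_user.items():
        user_events.sort(key=lambda e: e["time"])
        # Slide a window that starts at each event in turn.
        for i, first in enumerate(user_events):
            apps = {
                e["app"]
                for e in user_events[i:]
                if e["time"] - first["time"] <= window
            }
            if len(apps) >= min_apps:
                flagged.append(user)
                break
    return sorted(flagged)


print(correlate_failed_logins(EVENTS))
```

Each application here sees only one failed login per user, which looks harmless in isolation. Only the combined view shows that one account is being probed across three systems.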
2. SPAN and TAP Ports
SPAN ports, also known as mirror ports or monitor ports, were configured on network switches or routers to copy and forward traffic from one or more source ports to a designated monitoring port. They operated within the network infrastructure and allowed admins to monitor network traffic without disrupting the flow of data to the intended destination.
TAPs (test access points), on the other hand, were hardware devices that passively copied network traffic from a link and forwarded it to a monitoring tool. They operated independently of network switches and routers, yet still provided complete visibility into the traffic on the links they monitored, regardless of network topology or configuration.
Although they exposed all the traffic on the links they monitored, these ports fell out of favor for SIEM integration because the data they produced lacked context. The raw packet data that SPAN and TAP ports collected did not carry the contextual information needed for effective threat detection and analysis. Other challenges included visibility limited to the monitored segments, complex configuration, and the inability to inspect encrypted traffic.
3. The 2000s REST API
As a successor to SOAP API, REST API revolutionized data exchange with its simplicity, speed, efficiency, and statelessness. Aligned with the rise of cloud solutions, REST API served as an ideal conduit between SIEM and cloud environments, offering standardized access to diverse data sources.
However, it had downsides, one of which was network inefficiency.
REST APIs sometimes over-fetched or under-fetched data, which made data transfer between the API and the SIEM solution inefficient. Evolving schemas were another problem. Without a strongly typed schema, SIEM solutions found it difficult to map incoming data fields accurately to their predefined schema, which led to parsing errors or data mismatches.
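The mapping problem can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the `SIEM_SCHEMA` fields, the `FIELD_ALIASES` table, and the `normalize` function do not come from any particular SIEM product. The sketch drops over-fetched fields, reports under-fetched ones, and catches type mismatches caused by a schema that changed between API versions.

```python
# Hypothetical schema the SIEM expects: field name -> required Python type.
SIEM_SCHEMA = {"timestamp": str, "source_ip": str, "event_type": str}

# Hypothetical aliases for fields that an API renamed between versions.
FIELD_ALIASES = {"src_ip": "source_ip", "type": "event_type"}


def normalize(record):
    """Map one API record onto SIEM_SCHEMA.

    Returns (event, errors): `event` holds the fields that mapped cleanly,
    `errors` lists the fields that were missing or had the wrong type.
    """
    # Translate known old or alternative field names to the schema's names.
    renamed = {FIELD_ALIASES.get(key, key): value for key, value in record.items()}

    event, errors = {}, []
    for field, expected_type in SIEM_SCHEMA.items():
        if field not in renamed:
            errors.append(f"missing field: {field}")  # under-fetched
        elif not isinstance(renamed[field], expected_type):
            errors.append(f"wrong type for {field}")  # schema drift
        else:
            event[field] = renamed[field]
    # Fields outside the schema (over-fetched data) are dropped silently.
    return event, errors
```

A record such as `{"timestamp": 1704099600, "source_ip": "10.0.0.5"}` would come back with two errors: the timestamp is a number rather than a string, and `event_type` is absent. Surfacing these errors explicitly is one way to avoid silently ingesting mismatched data.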
Then there was the issue of complexity and the learning curve. Implementing a REST API integration can be complex, especially when it comes to managing authentication, pagination, rate limiting, and error handling. Because of this complexity, the security analysts and admins responsible for configuring SIEM data sources found integrations difficult to handle, and some needed additional training to manage them effectively. This also led to configuration errors, which in turn affected data collection and analysis.
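To give a sense of the moving parts, here is a minimal sketch of a collector loop in Python. It assumes a hypothetical `fetch_page(cursor)` callable that returns a dictionary with `events` and `next_cursor` keys and raises a hypothetical `RateLimited` exception when the API answers with HTTP 429. Real APIs differ in how they paginate and signal rate limits, so treat the names and response shape as placeholders.

```python
import time


class RateLimited(Exception):
    """Raised by the hypothetical client when the API answers HTTP 429."""


def collect_all(fetch_page, max_retries=3, base_delay=1.0, sleep=time.sleep):
    """Follow cursor-based pagination until the API reports no further page.

    Rate-limited calls are retried with exponential backoff; after
    `max_retries` failed retries the exception is re-raised to the caller.
    """
    events, cursor = [], None
    while True:
        for attempt in range(max_retries + 1):
            try:
                page = fetch_page(cursor)
                break
            except RateLimited:
                if attempt == max_retries:
                    raise
                # Wait 1x, 2x, 4x, ... the base delay before retrying.
                sleep(base_delay * 2 ** attempt)
        events.extend(page["events"])
        cursor = page.get("next_cursor")
        if cursor is None:
            return events
```

Even this small loop has to get three things right: remembering the cursor, backing off when throttled, and giving up cleanly. Authentication and partial-failure handling would add further branches, which is where many of the configuration errors mentioned above crept in.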
While some of these data sources are still in use, their technologies have greatly improved, and they now integrate far more smoothly with SIEM solutions.
Most Recently Used SIEM Data Sources
1. Cloud Logs
Cloud computing entered the mainstream in 2006 when Amazon launched AWS EC2, followed by Salesforce's Service Cloud in 2009. The cloud offers unparalleled scalability, empowering organizations to manage vast volumes of log data. It also provides centralized logging and monitoring capabilities, streamlining data collection and analysis for SIEM solutions. With built-in security features and compliance controls, cloud logs enable SIEM solutions to swiftly detect and respond to security threats.
However, challenges accompany these advantages.
According to Adam Praksch, a SIEM administrator at IBM, SIEM solutions often struggle to keep pace with the rapid evolution of cloud solutions, resulting in the accumulation of irrelevant events or inaccurate data.
Furthermore, integrating SIEM solutions with both on-premises and cloud-based systems increases complexity and cost, as noted by Mohamed El Bagory, a SIEM Technical Instructor at LogRhythm.
Nevertheless, El Bagory acknowledged the vast potential of cloud data for SIEM solutions, emphasizing the need to look beyond basic information from SSH logins and Chrome tabs and to include data from command lines and process statistics.
2. IoT Device Logs
As Praksch rightly said, any IT or OT technology that creates logs or reports about its operation is already used for security purposes. This is because IoT devices are known to generate a wealth of rich data about their operations, interactions, and environments.
Because IoT devices produce diverse data types such as logs, telemetry, and alerts, they are considered a favorite data source for SIEM solutions. This diversity allows SIEM solutions to analyze different aspects of the network and identify anomalies or suspicious behavior.
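As a rough illustration of spotting anomalies in telemetry, the Python sketch below compares new readings against a device's baseline. The `find_anomalies` function, the threshold of three standard deviations, and the sample temperature values are all assumptions chosen for this example. Production SIEM analytics are considerably more sophisticated.

```python
from statistics import mean, stdev


def find_anomalies(baseline, new_readings, threshold=3.0):
    """Return the readings that lie more than `threshold` standard
    deviations away from the mean of the baseline readings."""
    mu = mean(baseline)
    sigma = stdev(baseline)  # sample standard deviation; needs >= 2 values
    if sigma == 0:
        # A perfectly flat baseline: any different value is unusual.
        return [r for r in new_readings if r != mu]
    return [r for r in new_readings if abs(r - mu) / sigma > threshold]


# Hypothetical temperature telemetry (degrees Celsius) from one sensor.
baseline = [20.0, 21.0, 19.0, 20.0, 20.0]
print(find_anomalies(baseline, [20.5, 35.0]))
```

Here the reading of 35.0 stands out against a baseline centered on 20.0, while 20.5 falls within normal variation. The same idea extends to other numeric telemetry, such as message rates or connection counts.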
Conclusion
As Praksch put it, "The more data a SIEM solution can work with, the higher its chances of successfully monitoring an organization's environment against cyber threats."
So, while most SIEM data sources date back to the inception of the technology, they have evolved through several stages to deliver accurate and meaningful data for threat detection.