Before discussing effective approaches to security analytics, I think it is important to start with a clear definition of its goal. The ultimate goal of security analytics is to deliver technology solutions that assist human security analysts in detecting, responding to and mitigating cyber threats.
This simple statement hides an area of technological endeavor that is simultaneously fascinating, important and complex. While a full exploration of the many facets of security analytics is beyond the scope of this post, it is useful to outline a general, high-level approach that breaks the complex problem statement into more digestible pieces.
Security information and event management (SIEM) as a centerpiece
As is always the case, analytics systems perform only as well as their input data allows. Effective security analytics depends on a diverse and highly normalized set of security-relevant data. This data set typically needs to contain host and network data originating from many types of systems, such as operating systems, applications, networking gear, and border and internal security systems. Collecting and normalizing this data is a function typically handled by enterprise-grade security information and event management (SIEM) systems. The SIEM platform provides the broad data collection and coherent data normalization that downstream analytics systems need.
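To make the normalization step a little more concrete, here is a minimal Python sketch. It assumes two hypothetical raw record shapes (a firewall record and an OS authentication record) and a hypothetical common schema. The field names and the functions `normalize_firewall` and `normalize_auth` are invented for illustration and do not reflect any particular SIEM product.

```python
from datetime import datetime, timezone


def normalize_firewall(raw):
    """Map a hypothetical firewall record onto the shared schema."""
    return {
        # Epoch seconds converted to an ISO 8601 UTC timestamp.
        "timestamp": datetime.fromtimestamp(raw["ts"], tz=timezone.utc).isoformat(),
        "source_type": "firewall",
        "host": raw["src"],
        "user": None,  # firewall records carry no user in this sketch
        "action": "connection",
        "outcome": "allowed" if raw["action"] == "ACCEPT" else "blocked",
    }


def normalize_auth(raw):
    """Map a hypothetical OS authentication record onto the shared schema."""
    # The raw time is assumed to be ISO 8601 with an explicit UTC offset.
    ts = datetime.fromisoformat(raw["time"]).astimezone(timezone.utc)
    return {
        "timestamp": ts.isoformat(),
        "source_type": "os_auth",
        "host": raw["host"],
        "user": raw["user"],
        "action": "login",
        "outcome": "allowed" if raw["result"] == "success" else "blocked",
    }


# Two very different sources end up with the same keys and time format.
events = [
    normalize_firewall({"ts": 0, "src": "10.0.0.5", "action": "DROP"}),
    normalize_auth({"time": "2024-01-01T12:00:00+02:00", "user": "alice",
                    "host": "web-01", "result": "success"}),
]
```

Once every source is mapped to the same keys and the same time zone, downstream analytics can treat the events uniformly instead of handling each vendor format separately.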
Security analytics, broken down
The general problem statement of security analytics can be broken down into two distinct objectives: anomaly detection and threat determination. The security analytics system must be able to detect anomalies in user and endpoint behavior with respect to previously known or calculated baseline behavior. Then the system must determine whether the anomalies pose sufficient security risk to bring them to the attention of human analysts.
In reality, many activities observed in dynamic computing environments are anomalous yet pose no security risk. Differentiating between anomalies that pose security risk and those that do not is an essential function for effective security analytics. Separating the problem into these two spaces, anomaly detection and threat determination, helps to simplify the focus of the analysis in each stage.
Objective no. 1: The anomaly detection stage
In the anomaly detection stage, we focus on discovering changes in user and/or endpoint behavior without regard to their security relevance. There are many algorithmic approaches to performing anomaly detection on security-relevant data, including correlation/pattern matching, basic statistical anomaly detection (e.g. univariate averages and variance) and methods generally termed computational statistics and machine learning.
Each analytic technique has its strengths, weaknesses and regimes in which it is best applied. Each method may be sensitive to different types of change in the data and has its own accuracy characteristics (i.e. false positive and false negative probabilities). By combining the results of multiple anomaly detection methods, we hope to reduce false positives and build a more complete picture of anomalous user or endpoint behavior.
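As an illustration of the simplest of these approaches, basic univariate statistical detection, the following Python sketch flags an observation that deviates from a baseline by more than a chosen number of standard deviations. The function name `is_anomalous`, the threshold of 3.0 and the sample login counts are assumptions made for this example, not recommended settings.

```python
from statistics import mean, stdev


def is_anomalous(baseline, observed, threshold=3.0):
    """Flag `observed` if it lies more than `threshold` sample standard
    deviations from the baseline mean (a simple z-score test).

    `baseline` needs at least two values for stdev() to be defined.
    """
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        # A perfectly constant baseline: any deviation counts as a change.
        return observed != mu
    return abs(observed - mu) / sigma > threshold


# Hypothetical daily login counts for one user over eight days.
logins_per_day = [10, 12, 11, 9, 10, 11, 12, 10]

print(is_anomalous(logins_per_day, 11))  # a typical day
print(is_anomalous(logins_per_day, 40))  # a sharp spike
```

Note that this only says the behavior changed. Whether the spike matters from a security standpoint is a separate question, handled in the threat determination stage.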
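One simple way to combine detectors is a vote: report an anomaly only when several independent methods agree. The Python sketch below shows that idea. The `combine_detectors` function, the three toy detectors and their thresholds are all hypothetical, and voting is only one of many possible combination strategies.

```python
def combine_detectors(event, detectors, min_votes=2):
    """Run every detector on the event and report an anomaly only if at
    least `min_votes` of them fire. Returns (is_anomaly, names_that_fired)."""
    fired = [name for name, detect in detectors.items() if detect(event)]
    return len(fired) >= min_votes, fired


# Three toy detectors standing in for pattern matching, a statistical
# test and a time-of-day rule. The thresholds are illustrative only.
detectors = {
    "pattern_match": lambda e: e["failed_logins"] >= 5,
    "statistical": lambda e: e["bytes_out"] > 1_000_000,
    "off_hours": lambda e: e["hour"] < 6,
}

event = {"failed_logins": 7, "bytes_out": 2_000, "hour": 3}
is_anomaly, fired = combine_detectors(event, detectors)
```

Requiring agreement trades some sensitivity for fewer false positives, and the list of detectors that fired gives the analyst a fuller picture of what changed.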
Objective no. 2: The threat determination stage
Once an anomaly has been detected, the security analytics system must then evaluate the security risk of the identified anomaly or collection of anomalies. This stage requires injection of additional sources of data that provide security domain knowledge, environmental context and threat intelligence.
Security domain knowledge can be introduced either in the form of known anomalies for specific compromise indicators or through supervised learning feedback provided by the human security analyst.
Environmental context takes into account the risk associated with assets or users involved in the detected anomaly and is typically specific to each end user’s environment. For example, if the system involved in anomalous activity is identified as a payment card processing system (a valuable asset with a high risk level) or if the users involved are identified as highly privileged users (e.g. a domain administrator), then the associated anomaly can be elevated in terms of risk.
In addition, external threat intelligence can provide information on hosts, files or other assets that are known to be related to specific security threats. If the detected anomaly involves assets identified in threat intelligence feeds as security risks, the anomaly risk can be elevated.
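The risk elevation described in this stage can be sketched as a scoring function that starts from the anomaly's own score and raises it when environmental context or threat intelligence applies. Everything in the Python example below is an assumption for illustration: the `Anomaly` fields, the asset and user names, the threat intelligence entry and the multipliers.

```python
from dataclasses import dataclass


@dataclass
class Anomaly:
    host: str          # internal system involved in the anomaly
    user: str          # account involved in the anomaly
    remote_host: str   # external peer observed in the activity
    base_score: float  # score assigned by the anomaly detection stage


# Hypothetical environmental context and threat intelligence.
HIGH_VALUE_ASSETS = {"payment-proc-01"}
PRIVILEGED_USERS = {"domain-admin"}
THREAT_INTEL_HOSTS = {"203.0.113.7"}


def risk_score(anomaly):
    """Elevate the base score using context and threat intelligence.
    The multipliers are arbitrary illustrative weights."""
    score = anomaly.base_score
    if anomaly.host in HIGH_VALUE_ASSETS:
        score *= 2.0   # valuable asset, e.g. payment card processing
    if anomaly.user in PRIVILEGED_USERS:
        score *= 1.5   # highly privileged account
    if anomaly.remote_host in THREAT_INTEL_HOSTS:
        score *= 3.0   # peer listed in a threat intelligence feed
    return score


routine = Anomaly("laptop-17", "alice", "198.51.100.2", base_score=1.0)
critical = Anomaly("payment-proc-01", "domain-admin", "203.0.113.7", base_score=1.0)
```

Two anomalies with the same detection score can end up far apart once context is applied, which is what lets the system decide which ones deserve a human analyst's attention.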
The keys to an effective approach to security analytics
- Employ an enterprise-grade SIEM to perform comprehensive and consistent data collection and to play a role in determining the threat posed by observed anomalies.
- Break down the security analytics problem into two separate considerations: anomaly detection and threat determination.