Exploring the 3 Major Threat Detection Methods: Signature, Behavior, Machine Learning

I kid you not: more than once I’ve been asked, “Which event IDs indicate a security intrusion so I can set up an alert in my SIEM?” If only it were that easy. How do you effectively detect attacker activity within your organization? At first, this question may seem overwhelming when you think of all the different detection technologies and the logs they consume, but you can basically classify them into three types of detection:
  • Signature
  • Behavior
  • Machine Learning (ML)
It’s safe to say that these are listed in the order in which they developed over time and in order of increasing sophistication. Signature and Behavior remain viable and vital tools, but they have significant limitations, and it’s exciting that ML is fulfilling its promise in threat detection.
Detecting threats today is not just about which methods you use, but also which data. Endpoint server and workstation logs are a start, but major blind spots exist unless your threat detection visibility extends to the network and cloud levels as well.
In this real training for free session, Geoff Mattson, founder of MistNet, and Randy Franklin Smith dove into these three detection methods. They looked at what data to use, what the science tells us we can do with it, and what to expect, and they demonstrated real-world examples of each. They discussed where each method is most appropriate and looked at the limitations of each. Here’s a sample of what they covered:
Signature
A signature is a straightforward definition of artifacts or activity that, within the applicable context, indicates an intrusion. Example: a known file name associated with dropper malware, like c:\windows\system32\bigdrop.exe. Or a file with a hash matching known malware.
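To make that concrete, here is a minimal sketch of an exact-match signature check in Python. The indicator sets are hypothetical stand-ins (the hash is just the SHA-256 of a placeholder byte string, not real malware), and a real deployment would load them from a threat intelligence feed.

```python
import hashlib

# Hypothetical indicators for illustration only; real ones come from a threat-intel feed.
KNOWN_BAD_PATHS = {r"c:\windows\system32\bigdrop.exe"}
KNOWN_BAD_SHA256 = {hashlib.sha256(b"pretend malware bytes").hexdigest()}


def sha256_of(content: bytes) -> str:
    """Hex SHA-256 digest of a file's content."""
    return hashlib.sha256(content).hexdigest()


def matches_signature(path: str, content: bytes) -> list[str]:
    """Return the names of the exact-match signatures this file triggers."""
    hits = []
    # Windows paths are case-insensitive, so compare in lower case.
    if path.lower() in KNOWN_BAD_PATHS:
        hits.append("known_bad_path")
    if sha256_of(content) in KNOWN_BAD_SHA256:
        hits.append("known_bad_hash")
    return hits
```

Note how brittle this is: renaming the file or changing a single byte defeats both checks, which is exactly the limitation of pure signature matching.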
But there are more generalized signatures too, such as new values showing up in registry keys that attackers frequently use to gain persistence, PowerShell scripts with base64 encoding, or Microsoft Word kicking off a PowerShell script.
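A generalized signature matches a pattern rather than one exact artifact. The sketch below checks a process-creation event for two of the patterns just mentioned. The `ProcessEvent` fields are hypothetical (they are not tied to any particular log source), and the regular expression covers only a few common spellings of PowerShell's encoded-command switch, so treat it as an illustration rather than a complete rule.

```python
import re
from dataclasses import dataclass


@dataclass
class ProcessEvent:
    """Hypothetical, simplified process-creation record."""
    parent_image: str
    image: str
    command_line: str


# Matches -e, -en, -enc, -ec or -EncodedCommand followed by a long base64-looking argument.
ENCODED_PS = re.compile(
    r"\s-(?:ec|e(?:n(?:c(?:odedcommand)?)?)?)\s+[A-Za-z0-9+/=]{20,}",
    re.IGNORECASE,
)
POWERSHELL_NAMES = {"powershell.exe", "pwsh.exe"}
WORD_NAMES = {"winword.exe"}


def basename(path: str) -> str:
    """Lower-cased file name from a Windows-style path."""
    return path.replace("/", "\\").rsplit("\\", 1)[-1].lower()


def generalized_signatures(ev: ProcessEvent) -> list[str]:
    """Return the names of the pattern-based signatures this event triggers."""
    hits = []
    is_powershell = basename(ev.image) in POWERSHELL_NAMES
    if is_powershell and ENCODED_PS.search(ev.command_line):
        hits.append("powershell_base64_command")
    if is_powershell and basename(ev.parent_image) in WORD_NAMES:
        hits.append("word_spawned_powershell")
    return hits
```

Both rules still describe a fixed pattern, but they already say something about how processes relate to each other.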
But where does that stop being signature detection and start becoming behavior analysis?
Behavior
Here we are watching the activity associated with a user, computer, or process and asking either:
  1. Does that look malicious?
  2. Does that look abnormal?
If you see a script running from a file created by your email client or a web browser, and that script starts accessing low-level system APIs associated with injecting code into another running process, that obviously looks malicious, because it employs several known attack techniques, and it doesn’t resemble normal user activity either. So, you could argue that this example is a positive for both questions above.
But the question “what looks abnormal?” begs the qualifier “compared to ...”. You need a basis for normality. That basis can be statically described by the security analyst based on knowledge of the differences between normal end-user activity and known attack techniques. Perhaps you’d call this a logical model.
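The example above can be sketched as a rule that correlates where a script came from with what it does. The `ScriptActivity` record and the list of delivery applications are hypothetical. The three API names are Windows functions commonly associated with process injection, but the set is illustrative, not exhaustive.

```python
from dataclasses import dataclass, field

# Assumed list of applications that typically deliver files from outside the organization.
DELIVERY_APPS = {"outlook.exe", "chrome.exe", "msedge.exe", "firefox.exe"}
# Windows APIs commonly associated with injecting code into another process.
INJECTION_APIS = {"VirtualAllocEx", "WriteProcessMemory", "CreateRemoteThread"}


@dataclass
class ScriptActivity:
    """Hypothetical summary of one script's observed behavior."""
    script_path: str
    file_created_by: str  # process that wrote the script file to disk
    api_calls: list[str] = field(default_factory=list)


def looks_malicious(activity: ScriptActivity) -> bool:
    """True when a script delivered by email or web also calls injection APIs."""
    from_delivery_app = activity.file_created_by.lower() in DELIVERY_APPS
    touches_injection_api = any(call in INJECTION_APIS for call in activity.api_calls)
    return from_delivery_app and touches_injection_api
```

Neither condition alone is alarming: users download scripts, and some legitimate tools call these APIs. It is the combination that matters.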
Or the system can analyze activity assumed to be normal over a period of time to build a model of normalcy. The context of the model can be general or specific. Here’s an example. You might analyze the network activity of many end-user workstations to establish a baseline of destination subnets and TCP port numbers, which tells you which parts of your network contain the front-end servers you should expect end-user PCs to access. Or you might build that baseline for Tom, Alice and Shadrik individually and compare future activity against each user’s specific baseline. Either way, you are asking “does that look normal?” compared to an empirical model.
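Here is a minimal sketch of such an empirical model, assuming each connection is a hypothetical `(user, destination IP, TCP port)` tuple and that a /24 prefix is a reasonable definition of "subnet" for your network. It learns a general baseline and per-user baselines from history, then flags any connection not seen before.

```python
import ipaddress
from collections import defaultdict


def subnet_of(ip: str, prefix: int = 24) -> str:
    """Map an IPv4 address to its containing subnet, e.g. 10.1.2.15 -> 10.1.2.0/24."""
    return str(ipaddress.ip_network(f"{ip}/{prefix}", strict=False))


def build_baselines(events):
    """Learn (subnet, port) pairs seen overall and per user from assumed-normal history."""
    general = set()
    per_user = defaultdict(set)
    for user, dest_ip, port in events:
        key = (subnet_of(dest_ip), port)
        general.add(key)
        per_user[user].add(key)
    return general, per_user


def is_abnormal(event, general, per_user, use_user_baseline=True):
    """True if the connection's (subnet, port) pair is absent from the chosen baseline."""
    user, dest_ip, port = event
    key = (subnet_of(dest_ip), port)
    baseline = per_user.get(user, set()) if use_user_baseline else general
    return key not in baseline
```

If Tom has only ever reached 10.1.2.0/24 on port 443, a connection from Tom to a database subnet on port 1433 is abnormal against his own baseline even if Alice makes that connection every day. The general baseline would not flag it, which is the trade-off between the two contexts.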
But at some point, behavioral modeling starts to look more like machine learning.
Machine learning
Machine learning is the newest of these three threat detection methods, and it’s exciting to have gotten beyond the hype stage of ML and to now be seeing real progress in this area. In this session we looked at what data to use and what the science tells us we can do with it. We also discussed what you can expect from ML-based detection. What are the basic limits of this technology, especially the limitations of supervised and unsupervised learning? What approach is optimal now? From here, Geoff facilitated the discussion on ML, where he covered:
  • The Fundamentals of Machine Learning and Data Science
  • What Data is Needed for Detection
  • Machine Learning Limits and Considerations
  • Putting it All Together for Threat Detection
  • Demo of Multi-Stage Attack Detection
Watch this discussion to get our take on the boundaries between the detection methods and how to best utilize each.