Microsoft Azure is one of the fastest-growing cloud platforms on the market. Often, when an emerging technology grows this fast, organizations end up with knowledge silos. In the case of Azure, your DevOps team may already be up and running in production and, as an afterthought, toss accountability for securing the environment over the fence to you.
If you’re new to Azure monitoring, these six tips will help you monitor and secure your Azure environment.
Getting Started with Azure Cloud Environment
Over the last year, Microsoft consolidated its event logging approach for most of its cloud services into EventHub. At LogRhythm, we have been hard at work building broad and flexible support for both current and future services that deliver or will deliver logs through this route.
If you’re a LogRhythm customer and are ready to start your Azure journey, you’ll need to install and configure the LogRhythm Open Collector and EventHub integration. More details, including documentation for configuring both Azure Monitor and the Open Collector, can be found in the Documentation & Downloads section of the LogRhythm Community.
Tip #1: Audit the Activity Log
The Azure Activity Log is a core part of Azure Monitor and makes up the “Control Plane” logs. If you think of any resource in Azure as a black box, the Activity Log captures everything that happens to the box from the outside, not what happens inside it. It’s similar in concept to AWS CloudTrail.
Common examples include:
- Creating a storage account
- Restarting a virtual machine
- Deleting a Key Vault
The Activity Log is easy to monitor. You (or your Azure Team) can perform a one-time (per subscription) export to an Event Hub, then use LogRhythm’s Open Collector to bring the logs into your deployment.
As with most logs, the format doesn’t lend itself to easy reading by humans. It’s a JSON-formatted log, but at a glance, it doesn’t immediately surface the key information. This is where LogRhythm’s normalization and enrichment of log data comes in. LogRhythm processes the data into individual metadata fields and enriches it with context such as geolocation, user identity, and risk.
Any time you see an Activity Log, you should be able to determine the who, what, when, and where. LogRhythm normalization helps you determine this much more quickly and easily.
- Who performed the action?
  - User (Origin): The User Principal Name (UPN) of the actor
  - IP (Origin): Where the request initiated; LogRhythm will enrich this with location information
- What resource was impacted?
  - Object is the full identifier of the resource. We break this down into:
    - Serial Number: The subscription ID. Most customers have multiple subscriptions (e.g., a few for development, staging, and production). You’ll probably want to filter your AI Engine alerts and dashboards based on the subscription (e.g., only alerting on production instances)
    - Group: The resource group, another powerful filter for your analytics
    - Object Type: The resource type (great for dashboards)
    - Object Name: The resource name, as you might see it in the Azure Portal
- What was the action?
  - Vendor Message ID: The operation that was performed on the resource
  - Vendor Info: A description of the operation. It’s not actually in the original log; we’ve pulled it from Microsoft’s documentation
  - Classification & Common Event: LogRhythm’s common taxonomy to describe the event in plain English
- What was the result?
  - Result: The Activity Log often reports multiple logs for a single event; “Accept” and “Start” indicate that the operation has begun, but you’re ultimately looking for “Success” and “Failure”
- What was the risk?
  - Severity: Is this just informational, or did something go wrong?
- When and where did it happen?
  - Normal Message Date: LogRhythm TrueTime™ ensures we use the original time the log was generated, not when it was collected
  - Session Type: The Azure Region; Azure typically reports the region as “global” for Activity Logs
Here’s a basic example:
From this, we can see that Luke successfully started the Virtual Machine named XWing-Navigation.
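If you ever need to read the raw JSON yourself, the breakdown of the Object field into subscription, resource group, type, and name is easy to approximate. The following is a minimal sketch, not LogRhythm’s parsing logic. The sample record is hypothetical and simplified: the field names (`caller`, `callerIpAddress`, `operationName`, `resultType`, `resourceId`, `time`) and all values are assumptions for illustration and may not match exactly what your Event Hub emits.

```python
def split_resource_id(resource_id):
    """Split an Azure resource ID into subscription, group, type, and name."""
    parts = [p for p in resource_id.split("/") if p]
    lowered = [p.lower() for p in parts]

    def value_after(keyword):
        # Return the path segment that follows `keyword`, if present.
        if keyword in lowered:
            i = lowered.index(keyword)
            if i + 1 < len(parts):
                return parts[i + 1]
        return None

    resource_type = resource_name = None
    if "providers" in lowered:
        # After "providers": a namespace, then alternating type/name segments.
        tail = parts[lowered.index("providers") + 1:]
        types, names = tail[1::2], tail[2::2]
        if names:
            resource_type = "/".join([tail[0]] + types[:len(names)])
            resource_name = names[-1]

    return {
        "subscription": value_after("subscriptions"),
        "resource_group": value_after("resourcegroups"),
        "resource_type": resource_type,
        "resource_name": resource_name,
    }


def summarize(record):
    """Reduce a simplified, hypothetical record to who, what, when, and where."""
    summary = {
        "who": record.get("caller"),
        "from_ip": record.get("callerIpAddress"),
        "action": record.get("operationName"),
        "result": record.get("resultType"),
        "when": record.get("time"),
    }
    summary.update(split_resource_id(record.get("resourceId", "")))
    return summary


# Hypothetical sample record; every value here is illustrative only.
sample = {
    "time": "2019-06-01T12:00:00Z",
    "caller": "luke@example.com",
    "callerIpAddress": "203.0.113.10",
    "operationName": "MICROSOFT.COMPUTE/VIRTUALMACHINES/START/ACTION",
    "resultType": "Success",
    "resourceId": "/subscriptions/00000000-0000-0000-0000-000000000000"
                  "/resourceGroups/rebel-fleet/providers/Microsoft.Compute"
                  "/virtualMachines/XWing-Navigation",
}

print(summarize(sample))
```

The keyword lookups are case-insensitive because resource IDs can show up in different casing depending on where the log came from.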
Cloud Logging Nuances
In contrast to traditional on-prem technologies, you’ll find most Activity Log actions have multiple logs associated with them, typically with different values in the Result field (Start → Accept → Success/Failure). This is because the Azure API is asynchronous, and gives you a new log at every step in the process.
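To make the multi-log behavior concrete, here is a rough sketch of collapsing the logs for one action into a single outcome. It assumes that the logs belonging to one operation share a correlation identifier (called `correlationId` here) and carry ISO 8601 timestamps in a `time` field; both field names and the sample values are assumptions for illustration.

```python
from collections import defaultdict

# Results that mean the operation has finished, one way or the other.
TERMINAL_RESULTS = {"success", "failure"}


def final_results(records):
    """Map each correlation ID to its terminal result, or None if still in flight."""
    by_operation = defaultdict(list)
    for record in records:
        by_operation[record["correlationId"]].append(record)

    outcome = {}
    for correlation_id, group in by_operation.items():
        # ISO 8601 timestamps in the same format sort correctly as strings.
        group.sort(key=lambda r: r["time"])
        terminal = [r["resultType"] for r in group
                    if r["resultType"].lower() in TERMINAL_RESULTS]
        outcome[correlation_id] = terminal[-1] if terminal else None
    return outcome


# Hypothetical logs: operation "a" completed, operation "b" has only started.
logs = [
    {"correlationId": "a", "time": "2019-06-01T12:00:05Z", "resultType": "Success"},
    {"correlationId": "a", "time": "2019-06-01T12:00:00Z", "resultType": "Start"},
    {"correlationId": "a", "time": "2019-06-01T12:00:01Z", "resultType": "Accept"},
    {"correlationId": "b", "time": "2019-06-01T12:03:00Z", "resultType": "Start"},
]

print(final_results(logs))
```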
Unfortunately, the activity logs don’t provide the same context at every stage. A good example is creating Network Security Group Rules, which is analogous to an AWS VPC ACL or standard firewall rule. The log shown in the top row of the table above, where the Result is “Success” looks like the following:
You can see here that the NSG Rule (named “EXHAUSTPORT”) was successfully created; however, the underlying Azure log doesn’t provide the content of the rule. Is it an allow or a deny? Which ports and IPs were opened?
To answer those questions, there are a few options:
- Pivot to the same log where Result = “Accept” or “Start”
- Find the rule in the Azure Portal
- Use the Azure SmartResponse™ Automation plugin to get the content of the rule
Our preference is LogRhythm SmartResponse. In a perfect world, you’d be investigating within minutes. In reality, we know you may be investigating hours or days later. By using the SmartResponse Automation plugin, you get real-time context and information about the rule.
One final note: Microsoft doesn’t differentiate between a create and an update. To determine whether the resource already existed, perform a pivot search on the Object field and look for earlier logs.
Tip #2: Get Familiar with Your Azure Environment
Once you understand what content and fields are available in an Azure Monitor log, you can begin to catalog your Azure implementation. This will empower you to determine what’s normal and what’s not, threat hunt, and build AI Engine analytics specific to your environment.
Top Resource Types
The Object Type field contains the Azure Resource Type. This is an interesting field to build a Top X or Trend widget.
Identify Which Regions You’re Active in
Many customers only have resources in specific Azure Regions. By understanding in which regions you have resources, you can determine if there is unexpected activity down the road. A region can be found in the Session Type field in Azure Resource logs; it is not applicable in Azure Active Directory diagnostic logs.
Note that the Azure Activity Log sometimes lists the region as “global.”
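One simple way to act on this is to learn your regions from a baseline period and then flag anything new. The sketch below assumes each event has already been reduced to a dictionary with a `region` key (a stand-in for the Session Type field); the key name and sample regions are hypothetical.

```python
def learn_regions(baseline_events):
    """Collect the set of regions seen during a known-good baseline period."""
    return {e["region"].lower() for e in baseline_events if e.get("region")}


def unexpected_activity(events, known_regions):
    """Return events whose region was never seen in the baseline."""
    return [e for e in events
            if e.get("region") and e["region"].lower() not in known_regions]


# Hypothetical baseline and new events.
baseline = [{"region": "eastus"}, {"region": "global"}, {"region": "EastUS"}]
new_events = [{"region": "eastus"}, {"region": "brazilsouth"}, {"region": "global"}]

known = learn_regions(baseline)
print(unexpected_activity(new_events, known))
```

Including “global” in the baseline keeps Activity Log entries that report no specific region from being flagged.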
Catalog Your Subscriptions
You probably have multiple Azure subscriptions, each used for different purposes like staging versus production.
You can run a report or build a Top X widget on the Serial Number field, which contains Azure subscription IDs from your logs. Your Azure team can help you identify which subscriptions those map to. From there, you can build LogRhythm lists and filter dashboards, searches, and AI Engine rules based on the type of environment. For example, you may create an AI Engine threshold rule to alert when multiple errors occur, but only for your production Azure subscriptions.
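The production-only threshold idea can be sketched in a few lines. This is an illustration of the logic, not how AI Engine implements threshold rules. The subscription IDs, the `subscription` and `severity` keys, and the threshold value are all hypothetical.

```python
from collections import Counter

# Hypothetical mapping you would build with help from your Azure team.
SUBSCRIPTION_ENVIRONMENTS = {
    "11111111-1111-1111-1111-111111111111": "production",
    "22222222-2222-2222-2222-222222222222": "staging",
}

ERROR_SEVERITIES = {"Critical", "High", "Error"}


def production_error_alerts(events, threshold=3):
    """Return production subscriptions with at least `threshold` error events."""
    counts = Counter()
    for event in events:
        subscription = event["subscription"].lower()
        environment = SUBSCRIPTION_ENVIRONMENTS.get(subscription, "unknown")
        if environment == "production" and event["severity"] in ERROR_SEVERITIES:
            counts[subscription] += 1
    return [sub for sub, n in counts.items() if n >= threshold]


PROD = "11111111-1111-1111-1111-111111111111"
STAGING = "22222222-2222-2222-2222-222222222222"
events = (
    [{"subscription": PROD, "severity": "Error"}] * 3
    + [{"subscription": STAGING, "severity": "Error"}] * 5
    + [{"subscription": PROD, "severity": "Informational"}]
)

print(production_error_alerts(events))
```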
Tip #3: Hunt for Anomalies
Hunting for anomalies in your Azure environment gets significantly easier once you know what “normal” looks like. Until then, there are some events and behaviors you can monitor.
Find Specific Events
Based on Microsoft’s documentation, we compiled a list of around 5,000 Activity Log events and descriptions, which you can find attached to the Community post linked at the end of this blog. As you discover more interesting events, put them into a LogRhythm List for comparison against the Vendor Message ID field.
Here are a handful that we found interesting:
- MICROSOFT.INSIGHTS/LOGPROFILES/WRITE: someone may have changed your Azure Monitor Activity Log settings
- MICROSOFT.INSIGHTS/DIAGNOSTICSETTINGS/DELETE: someone deleted or turned off diagnostic settings
- MICROSOFT.SQL/SERVERS/DATABASES/TRANSPARENTDATAENCRYPTION/WRITE: someone modified encryption settings on a database
- MICROSOFT.AUTHORIZATION/POLICYASSIGNMENTS/DELETE: someone deleted an Azure policy
- MICROSOFT.SECURITY/POLICIES/WRITE: someone modified a security center policy
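Comparing incoming operations against such a list amounts to a case-insensitive lookup. The sketch below uses the five operations listed above; the function name and the shape of the input (a plain list of operation strings) are assumptions for illustration.

```python
# Operations of interest, keyed in upper case, with a short reason for each.
WATCHLIST = {
    "MICROSOFT.INSIGHTS/LOGPROFILES/WRITE":
        "Activity Log settings may have changed",
    "MICROSOFT.INSIGHTS/DIAGNOSTICSETTINGS/DELETE":
        "Diagnostic settings deleted or turned off",
    "MICROSOFT.SQL/SERVERS/DATABASES/TRANSPARENTDATAENCRYPTION/WRITE":
        "Database encryption settings modified",
    "MICROSOFT.AUTHORIZATION/POLICYASSIGNMENTS/DELETE":
        "Azure policy deleted",
    "MICROSOFT.SECURITY/POLICIES/WRITE":
        "Security center policy modified",
}


def flag_sensitive(operations):
    """Return (operation, reason) pairs for operations on the watchlist."""
    flagged = []
    for operation in operations:
        # Normalize case, since logs may not use consistent casing.
        reason = WATCHLIST.get(operation.upper())
        if reason:
            flagged.append((operation, reason))
    return flagged


observed = [
    "Microsoft.Insights/DiagnosticSettings/Delete",
    "MICROSOFT.COMPUTE/VIRTUALMACHINES/START/ACTION",
]
print(flag_sensitive(observed))
```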
Deleting Azure resources should be a normal occurrence in any environment, but spikes in this behavior may indicate an operational or security incident. Monitor the Common Event “Object Deleted/Removed” through a dashboard or AI Engine trend rule.
To filter a Web Console widget, use the Lucene syntax: commonEventName:"Object Deleted/Removed"
As with delete events, you should monitor errors, warnings, and other high-severity logs for spikes. You can find this in the Severity field; look for values of “Critical”, “High”, and “Error.”
To filter a Web Console widget, use the Lucene syntax: severity:("Critical" "High" "Error")
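If you want a feel for what “spike” means before building a rule, the following sketch flags any hour whose delete count sits well above the hours before it. It is a generic mean-plus-standard-deviation check, swapped in for illustration; it is not the algorithm behind AI Engine trend rules, and the parameter defaults and sample counts are arbitrary assumptions.

```python
from statistics import mean, pstdev


def find_spikes(hourly_counts, min_history=5, sigmas=3.0):
    """Return indexes of hours whose count exceeds the prior hours' baseline."""
    spikes = []
    for i in range(min_history, len(hourly_counts)):
        history = hourly_counts[:i]
        baseline = mean(history)
        # Floor the spread at 1 so a flat history does not flag tiny changes.
        spread = max(pstdev(history), 1.0)
        if hourly_counts[i] > baseline + sigmas * spread:
            spikes.append(i)
    return spikes


# Hypothetical hourly counts of "Object Deleted/Removed" events.
deletes_per_hour = [2, 3, 2, 4, 3, 2, 40, 3]
print(find_spikes(deletes_per_hour))
```

Note that once a spike enters the history it raises the baseline for later hours, so a production rule would likely exclude or cap outliers.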
LogRhythm’s CloudAI for UEBA is a great way to monitor user behavior and detect suspicious activity, especially from Azure Active Directory sign-in logs. In addition, LogRhythm enriches the log with geolocation information including country, region, and city to quickly identify where the request originated.
When an Azure service generates a request, you may not recognize the IP or region; there may also be no associated user.
Tip #4: Monitor Your Azure Active Directory Audit and Sign-In Logs
Azure Active Directory underlies both Office 365 and Azure. You might already be bringing in Azure AD logs through the Office 365 Management API integration. These logs include Sign-In and Audit data, and follow a different schema than the Azure Monitor Activity Log.
There are benefits to using the Azure Monitor integration, primarily a richer set of data in the logs. They fall under the Azure Monitor category of “Diagnostic Logs.” To enable, navigate to “Azure Active Directory” in the Azure Portal. Find “Diagnostic Settings” on the left menu and add a new Diagnostic Setting to stream “AuditLogs” and “SigninLogs” to your Event Hub.
Here are some highlights from Azure AD sign-in log metadata:
- User (Origin): The UserPrincipalName of the user who signed in
- Policy: List of conditional access policies applied
- Session Type: The type of MFA used
- Object: The Application the user is signing into
- Result, Reason: A Result of 0 indicates success; the description of the error code will appear in Reason.
- Severity, Threat Name, State: Azure AD risk event data, from risk level during sign-in, risk event types, and risk state, respectively
  - These will only be populated if your Azure Tenant has an “Azure AD Premium P2” license
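As a rough illustration of how the Result and Reason fields relate, the sketch below reduces a sign-in record to a few of the highlights above. The record layout is a simplified assumption: the names `resultType`, `resultDescription`, and the nested `properties` keys (`userPrincipalName`, `appDisplayName`, `riskLevelDuringSignIn`) are used here for illustration and should be checked against your own logs. The sample values are hypothetical.

```python
def summarize_sign_in(record):
    """Reduce a simplified, hypothetical sign-in record to key fields."""
    properties = record.get("properties", {})
    code = str(record.get("resultType", ""))
    succeeded = code == "0"  # a Result of 0 indicates success
    return {
        "user": properties.get("userPrincipalName"),
        "application": properties.get("appDisplayName"),
        "succeeded": succeeded,
        # The error description is only meaningful for failed sign-ins.
        "reason": None if succeeded else record.get("resultDescription"),
        # Risk data may be absent without an Azure AD Premium P2 license.
        "risk_level": properties.get("riskLevelDuringSignIn"),
    }


# Hypothetical records; the error code and description are placeholders.
ok = {"resultType": "0",
      "properties": {"userPrincipalName": "leia@example.com",
                     "appDisplayName": "Azure Portal"}}
failed = {"resultType": "12345",
          "resultDescription": "Example failure description",
          "properties": {"userPrincipalName": "han@example.com",
                         "appDisplayName": "Azure Portal"}}

print(summarize_sign_in(ok))
print(summarize_sign_in(failed))
```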
Azure AD audit logs provide visibility into user, group, service principal, directory, and tenant configuration changes. Unfortunately, many of these raw logs are missing the actor’s IP address.
You can find a list of example events in the Audit Activity Reference documentation, but the list is incomplete; it omits critical activities such as “Add User” and “Disable Account.”
Tip #5: Detect and Respond to Security Center Threats
Azure Security Center alerts are part of the Activity Log, so if you’re auditing that, you’re likely already getting them in LogRhythm. When an alert is activated, you’ll get:
- Vendor Message ID: “Microsoft.Security/locations/alerts/activate/action”
- Object: The Alert ID, which can be used in Azure to find the alert
- Subject: The name of the alert
- Vendor Info: The description of the alert
- Command, Process, User, Hostname, etc.: Alert metadata
  - For example, the command run on the host that Azure Security Center detected as malicious
The log below is an alert generated when we ran Greg Foss’s Quick-Mimikatz on an Azure Virtual Machine.
You can bring these alerts into LogRhythm for a single pane of glass. Using AI Engine’s Log Observed rule block, create a primary filter for the Azure Event Hub Log Source Type and an include filter for a Vendor Message ID of “Microsoft.Security/locations/alerts/activate/action.” Group the rule block by Object.
You can also leverage the SmartResponse Automation actions to:
- Automatically get the Azure Security Center alert details
- Dismiss the Azure Security Center alert directly from LogRhythm once the investigation and response are complete
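The filter-and-group logic of the Log Observed rule described earlier in this tip can be sketched as follows. This only illustrates the idea; AI Engine rules are configured in the product, not written in Python. The `vendor_message_id` and `object` keys and the sample alert IDs are hypothetical stand-ins for the normalized fields.

```python
from collections import defaultdict

ALERT_OPERATION = "microsoft.security/locations/alerts/activate/action"


def group_alerts(events):
    """Keep Security Center alert activations and group them by alert ID."""
    grouped = defaultdict(list)
    for event in events:
        # Compare case-insensitively, since casing can vary between logs.
        if event.get("vendor_message_id", "").lower() == ALERT_OPERATION:
            grouped[event["object"]].append(event)
    return dict(grouped)


# Hypothetical normalized events.
events = [
    {"vendor_message_id": "Microsoft.Security/locations/alerts/activate/action",
     "object": "alert-001", "subject": "Example alert"},
    {"vendor_message_id": "MICROSOFT.COMPUTE/VIRTUALMACHINES/START/ACTION",
     "object": "vm-001"},
    {"vendor_message_id": "MICROSOFT.SECURITY/LOCATIONS/ALERTS/ACTIVATE/ACTION",
     "object": "alert-001", "subject": "Example alert"},
]

print(group_alerts(events))
```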
Tip #6: Enforce Best Practices with Azure Policy
Azure Policy is an effective way to audit and enforce controls within your Azure environment, analogous to AWS Config. There are a variety of out-of-the-box policies, as well as a few from the community, such as the Azure Monitor Onboarding policies built by Microsoft’s John Kemnetz.
Some examples include:
- Auditing whether diagnostics are enabled for resources
- Enforcing diagnostics enabled for resources
- Auditing the number of owners of a subscription
- Restricting the types of resources that can be created
- Restricting the virtual machine SKUs that can be created
When an Azure Policy is violated, you’ll get a log with a Severity of “Warning” and one of the following Vendor Message IDs:
If you’re using Azure Policy to keep your environment secure, it is critical to closely monitor any modification or deletion of policy assignments with the following Vendor Message IDs:
Azure EventHub provides access to a wide range of logs across many Azure cloud services. LogRhythm’s Open Collector integrates with EventHub and provides collection and enrichment of Azure logs, enabling visibility, audit, threat hunting and enforcement across your Azure environment. Gain the same level of visibility into your cloud computing resources as you have on your on-premises resources, and easily correlate activities as your users move across both traditional and cloud delivered resources.
If you’re a LogRhythm customer and ready to get started on your Azure journey, you’ll need to install and configure the LogRhythm Open Collector and EventHub integration.
More details, including documentation for configuring both Azure Monitor and the Open Collector, can be found in the Documentation and Downloads section of the LogRhythm Community.
Additional resources from this blog can be found on the LogRhythm Community.
- Azure Monitor Dashboard
- The Azure Security Center AI Engine Rule
- The list of Azure Monitor Operations
- A LogRhythm Echo Use Case with sample Azure logs