Azure Monitor is Azure's native monitoring solution, and it can send alerts when a monitored value meets conditions you define. For an overview, see https://docs.microsoft.com/en-us/azure/azure-monitor/overview
Of Azure Monitor's many capabilities, the one of most interest to AVD administrators and engineers is the ability to monitor VMs, storage accounts, and other resources used by Nerdio and AVD. While Nerdio does not incorporate alert functionality, custom Azure Monitor alerts can be constructed to achieve the same effect. This guide describes example alerts that may be useful to Nerdio for AVD users and covers common practices in Azure for VM compute, storage account, database, and App Service resources.
Virtual Machines (Session Hosts)
An alert rule can be created to monitor VM resources. To start, go to a session host VM and navigate to Monitoring > Alerts > New Alert Rule.
When creating alert rules, you can select multiple resources of the same type. In this case, we will select all VMs in the resource group so that alerting does not need to be enabled on each VM individually. Whether to do this is entirely at your discretion.
For this example, we will create an alert for data disk IOPS. A common issue with some VM sizes is that the VM disk bandwidth is too low, so loading a large FSLogix profile causes long logon times. Monitoring for this will tell us whether the VMs are under-provisioned.
After selecting the specific value to be measured, we will be prompted for more parameters. In this example, we want to know if disk usage is nearing an unhealthy level. These settings will need to be tuned for your own environment; a basic setting is provided here as a starting point. For information on how to set these parameters and further details, see https://docs.microsoft.com/en-us/azure/azure-monitor/platform/alerts-metric-overview
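To make the condition settings more concrete, here is a minimal Python sketch of how a static-threshold condition behaves: the samples in the look-back window are aggregated, and the result is compared with the threshold. The sample values, the 95 percent threshold, and the function name are illustrative assumptions, not recommended settings and not Azure Monitor's actual implementation.

```python
from statistics import mean

# Aggregation types, mapped to plain Python functions.
AGGREGATIONS = {
    "Average": mean,
    "Minimum": min,
    "Maximum": max,
    "Total": sum,
}

# Comparison operators, mapped to Python predicates.
OPERATORS = {
    "GreaterThan": lambda value, threshold: value > threshold,
    "GreaterThanOrEqual": lambda value, threshold: value >= threshold,
    "LessThan": lambda value, threshold: value < threshold,
    "LessThanOrEqual": lambda value, threshold: value <= threshold,
}


def condition_met(samples, aggregation, operator, threshold):
    """Aggregate one evaluation window of samples and compare with the threshold."""
    if not samples:
        return False  # no data in the window: this sketch treats it as "not firing"
    aggregated = AGGREGATIONS[aggregation](samples)
    return OPERATORS[operator](aggregated, threshold)


# Hypothetical one-minute samples of consumed data disk IOPS (percent), 5-minute window.
window = [97.0, 99.5, 98.0, 100.0, 96.5]
print(condition_met(window, "Average", "GreaterThan", 95))  # True: the average is 98.2
```

A short window with a "Maximum" aggregation reacts to brief spikes, while a longer window with "Average" only fires on sustained load, which is usually the better fit for detecting under-provisioned disks.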
After setting the value, select the actions to take when the alert fires. If no action groups have been created previously, we can create one now. For more information on action groups, see https://docs.microsoft.com/en-us/azure/azure-monitor/platform/action-groups?WT.mc_id=Portal-Microsoft_Azure_Monitoring
In this example, we'll set up a simple email notification and then proceed to review and create. (Other triggers can be set, but notifications are the most common for the purpose of alerting.)
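For readers who prefer to keep alerting configuration in a template rather than clicking through the portal, the same email action group could be sketched roughly as an ARM-style resource definition. This is a minimal sketch, not the method used in this guide: the group name, short name, receiver name, email address, and the chosen API version are all placeholder assumptions.

```python
import json


def email_action_group(name, short_name, email_address):
    """Return an ARM-style definition of an action group with one email receiver."""
    if len(short_name) > 12:
        raise ValueError("action group short names are limited to 12 characters")
    return {
        "type": "Microsoft.Insights/actionGroups",
        "apiVersion": "2021-09-01",  # assumption: substitute a version available to you
        "name": name,
        "location": "Global",
        "properties": {
            "groupShortName": short_name,
            "enabled": True,
            "emailReceivers": [
                {
                    "name": "email-admins",  # hypothetical receiver name
                    "emailAddress": email_address,
                    "useCommonAlertSchema": True,
                }
            ],
        },
    }


# Placeholder values for illustration only.
group = email_action_group("avd-alerts", "avdalerts", "admins@example.com")
print(json.dumps(group, indent=2))
```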
Finish creating the alert rule by giving it a name and description. Severity is up to your preference; it is used to sort alerts in the Alerts panel, which is described here: https://docs.microsoft.com/en-us/azure/azure-monitor/platform/alerts-managing-alert-instances
If the alert is triggered, in this case, an email will be sent as defined in the action group. It will look something like this:
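The pieces chosen so far (scope, signal, condition, action group, name, and severity) can also be written down as a single ARM-style metric alert definition. The sketch below is an illustration under assumptions, not an export of the rule built in this guide: the metric name "Data Disk IOPS Consumed Percentage", the 95 percent threshold, the evaluation frequency and window, the region, and every resource ID are placeholders to be replaced with your own values.

```python
import json


def metric_alert_rule(name, description, severity, scopes, resource_type, region,
                      metric_name, aggregation, operator, threshold, action_group_id):
    """Return an ARM-style metric alert with one static-threshold condition."""
    if severity not in range(5):
        raise ValueError("severity must be between 0 (critical) and 4 (verbose)")
    return {
        "type": "Microsoft.Insights/metricAlerts",
        "apiVersion": "2018-03-01",
        "name": name,
        "location": "global",
        "properties": {
            "description": description,
            "severity": severity,
            "enabled": True,
            "scopes": scopes,  # for example, the resource group holding the session hosts
            "targetResourceType": resource_type,
            "targetResourceRegion": region,
            "evaluationFrequency": "PT1M",  # assumption: evaluate every minute
            "windowSize": "PT5M",           # assumption: 5-minute look-back window
            "criteria": {
                "odata.type": "Microsoft.Azure.Monitor.MultipleResourceMultipleMetricCriteria",
                "allOf": [
                    {
                        "criterionType": "StaticThresholdCriterion",
                        "name": "condition1",
                        "metricName": metric_name,
                        "timeAggregation": aggregation,
                        "operator": operator,
                        "threshold": threshold,
                    }
                ],
            },
            "actions": [{"actionGroupId": action_group_id}],
        },
    }


# All IDs, names, and values below are placeholders.
rule = metric_alert_rule(
    name="session-host-data-disk-iops",
    description="Data disk IOPS close to the VM size limit",
    severity=2,
    scopes=["/subscriptions/<subscription-id>/resourceGroups/<resource-group>"],
    resource_type="Microsoft.Compute/virtualMachines",
    region="<region>",
    metric_name="Data Disk IOPS Consumed Percentage",  # assumed metric for "Data IOPS"
    aggregation="Average",
    operator="GreaterThan",
    threshold=95,
    action_group_id="<action-group-resource-id>",
)
print(json.dumps(rule, indent=2))
```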
These alerts can be customized for whatever resource metric is desired, such as CPU consumption or network bandwidth usage.
Storage Account Metrics (FSLogix Profiles and AppAttach)
The same Metric Alert rules can be created for other resources, such as storage accounts. Here we will follow steps similar to creating a VM metric alert above, but instead of VM Resources we will select Storage Account Used Capacity as follows:
We will set a size threshold, in GB, that is near the quota. Note: In most cases the quota is set far larger than the space actually needed, because IOPS performance is tied to the file share quota size. This alert is not useful in those situations. However, for cost savings or in smaller environments, a small quota or a standard-tier share may be provisioned, and there the alert is worthwhile.
In this example, the threshold is set to 2 GiB. Make sure to select the correct value under "Unit": the default is B (bytes), not GiB. The granularity and frequency of evaluation can be set as desired.
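The unit matters because a threshold of 2 entered with the default unit means 2 bytes, which would fire almost immediately. As a quick sanity check, the sketch below converts a threshold to bytes using binary units; the helper name and unit table are illustrative, not part of any Azure tooling.

```python
# Binary unit multipliers (1 GiB = 1024 ** 3 bytes).
UNIT_BYTES = {
    "B": 1,
    "KiB": 1024,
    "MiB": 1024 ** 2,
    "GiB": 1024 ** 3,
    "TiB": 1024 ** 4,
}


def to_bytes(value, unit):
    """Convert a threshold expressed in the given unit to a whole number of bytes."""
    return int(value * UNIT_BYTES[unit])


print(to_bytes(2, "GiB"))  # 2147483648
print(to_bytes(2, "B"))    # 2
```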
We will use the action group created earlier:
Finally, give the alert a name and enable it:
SQL Databases (Nerdio App Backend)
SQL databases can be monitored as well. Some operations, such as viewing auto-scaling history, may query and parse a large volume of logs, causing a large demand for DTUs. The process for creating this metric alert is similar to the examples above, except the signal name will be DTU Percentage.
SQL performance is notoriously tricky to evaluate. However, the above settings should suffice to detect significant DTU throttling, which affects Nerdio's functionality and can manifest as errors such as "Execution Timeout Expired".
App Service (NMW Application)
Finally, using the same steps as for the other resources, we can create a monitor for the App Service itself.
First, the health check must be set in the application. Go to Monitoring -> Health Check.
Enable the Health check and input /public/health/status for the path.
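Before relying on the alert, it can be useful to confirm that the health path actually responds. The following is a minimal sketch that requests the path and treats any 2xx status code as healthy, on the assumption that this mirrors how the App Service health check judges a response; the host name shown in the usage comment is a placeholder for your own Nerdio app URL.

```python
from urllib.error import HTTPError, URLError
from urllib.request import urlopen

HEALTH_PATH = "/public/health/status"  # the path entered in the Health check blade


def health_url(base_url):
    """Join the app's base URL and the health check path."""
    return base_url.rstrip("/") + HEALTH_PATH


def is_healthy(status_code):
    """Treat any 2xx response as healthy."""
    return 200 <= status_code <= 299


def probe(base_url, timeout=10):
    """Request the health endpoint once and report whether it looks healthy."""
    try:
        with urlopen(health_url(base_url), timeout=timeout) as response:
            return is_healthy(response.status)
    except HTTPError as error:
        return is_healthy(error.code)  # the server answered with an error status
    except (URLError, OSError):
        return False  # DNS failure, refused connection, timeout, and so on


# Usage (placeholder host name):
# print(probe("https://<your-nerdio-app>.azurewebsites.net"))
```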
A warning may appear about load balancing and single instances. This can be ignored, as we are only using a single instance for Nerdio, and the health check will be used to evaluate whether that single instance is operational.
After the health check path has been configured, we can create an alert rule as before from the App Service blade in Azure. For the condition on the App Service, we will use Health Check status: