Azure VM Details Test

This test auto-discovers the virtual machines used by the target Microsoft Azure subscription, and for each VM, it reveals in-depth metrics such as status, memory utilization, CPU utilization, disk I/O measures, etc. In the process, the test points administrators to resource-hungry VMs.

Target of the Test: A Microsoft Azure Subscription

Agent deploying the test: A remote agent

Output of the test: One set of results for each VM in every resource group of the target Azure subscription

Configurable parameters for the test
Parameters Description

Test Period

How often should the test be executed.

Host

The host for which the test is to be configured.

Subscription ID

Specify the GUID which uniquely identifies the Microsoft Azure Subscription to be monitored. To know the ID that maps to the target subscription, do the following:

  1. Log in to the Microsoft Azure Portal.

  2. When the portal opens, click on the Subscriptions option (as indicated by Figure 1).

    Figure 1 : Clicking on the Subscriptions option

  3. Figure 2 that appears next will list all the subscriptions that have been configured for the target Azure AD tenant. Locate the subscription that is being monitored in the list, and check the value displayed for that subscription in the Subscription ID column.

    Figure 2 : Determining the Subscription ID

  4. Copy the Subscription ID in Figure 2 to the text box corresponding to the SUBSCRIPTION ID parameter in the test configuration page.
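Because the Subscription ID is a GUID, the copied value can be sanity-checked before it is pasted into the test configuration page. The following is a minimal Python sketch and is not part of eG Enterprise; the function name is_valid_subscription_id is hypothetical.

```python
import uuid


def is_valid_subscription_id(value: str) -> bool:
    """Return True if value is a GUID in the canonical 8-4-4-4-12 form."""
    candidate = value.strip()
    try:
        parsed = uuid.UUID(candidate)
    except ValueError:
        return False
    # uuid.UUID also accepts braces and un-hyphenated hex digits, so compare
    # with the canonical string form to reject those variants.
    return str(parsed) == candidate.lower()
```

A value that fails this check was most likely truncated or copied from the wrong column in Figure 2.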

Tenant ID

Specify the Directory ID of the Azure AD tenant to which the target subscription belongs. To know how to determine the Directory ID, refer to Configuring the eG Agent to Monitor the Microsoft Azure App Service.

Client ID and Client Password

The eG agent communicates with the target Microsoft Azure Subscription using Java API calls. To collect the required metrics, the eG agent requires an Access token in the form of an Application ID and the client secret value. To know how to determine the Application ID and the key, refer to Configuring the eG Agent to Monitor the Microsoft Azure App Service. Specify the Application ID of the created Application in the Client ID text box and the client secret value in the Client Password text box.
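To illustrate how a Tenant ID, Client ID, and Client Password fit together, the Python sketch below builds a client-credentials token request for the Microsoft identity platform. This is an assumption for illustration only: this document does not describe how the eG agent obtains its token internally, and the function name build_token_request is hypothetical. The sketch only constructs the request; it does not send it.

```python
from urllib.parse import urlencode


def build_token_request(tenant_id: str, client_id: str, client_secret: str):
    """Build the URL and form body of an OAuth 2.0 client-credentials request."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    body = urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,          # the Application ID
        "client_secret": client_secret,  # the client secret value
        # Scope of the Azure Resource Manager API.
        "scope": "https://management.azure.com/.default",
    })
    return url, body
```

Posting the body to the URL with the content type application/x-www-form-urlencoded would return a JSON document containing the access token, provided the Application has been granted access to the subscription.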

Proxy Host

In some environments, all communication with the Azure cloud may be routed through a proxy server. In such environments, you should make sure that the eG agent connects to the cloud via the proxy server and collects metrics. To enable metrics collection via a proxy, specify the IP address of the proxy server and the port at which the server listens against the Proxy Host and Proxy Port parameters. By default, these parameters are set to none, indicating that the eG agent is not configured to communicate via a proxy.
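The meaning of the none default can be sketched as follows. This Python fragment is only an illustration of the rule described above; the function name proxy_url and the http scheme are assumptions, not eG Enterprise behavior.

```python
from typing import Optional


def proxy_url(proxy_host: str, proxy_port: str) -> Optional[str]:
    """Return a proxy URL, or None when either parameter is left at 'none'."""
    host = proxy_host.strip()
    port = proxy_port.strip()
    if host.lower() == "none" or port.lower() == "none":
        return None  # no proxy configured: connect to the cloud directly
    # The http scheme is assumed here for illustration.
    return f"http://{host}:{int(port)}"
```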

Proxy Username, Proxy Password and Confirm Password

If the proxy server requires authentication, specify a valid proxy user name and password in the Proxy Username and Proxy Password parameters, respectively. Then, confirm the password by retyping it in the Confirm Password text box.

Diagnostic Measures

By default, this flag is set to Off. This means that, by default, this test reports only host-level metrics - e.g., CPU usage, disk usage, and network usage - for each VM.

For deeper insights into VM performance, you may want to collect guest-level metrics and other diagnostic data using the Azure Diagnostics extension. Azure Diagnostics extension is an agent in Azure Monitor that collects monitoring data from the guest operating system of Azure compute resources including virtual machines. To configure this test to use this extension and pull guest-level metrics from VMs, do the following:

  1. Install the Azure Diagnostics Agent on each VM you want guest-level metrics from. To achieve this, follow the procedure outlined in Enabling Guest level Monitoring for Azure Virtual Machines.

  2. Set the DIAGNOSTIC MEASURES flag of this test to On.

Detailed Diagnosis

To make diagnosis more efficient and accurate, eG Enterprise embeds an optional detailed diagnostic capability. With this capability, the eG agents can be configured to run detailed, more elaborate tests as and when specific problems are detected. To enable the detailed diagnosis capability of this test for a particular server, choose the On option. To disable the capability, click on the Off option.

The option to selectively enable/disable the detailed diagnosis capability will be available only if the following conditions are fulfilled:

  • The eG manager license should allow the detailed diagnosis capability
  • Both the normal and abnormal frequencies configured for the detailed diagnosis measures should not be 0.
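The two conditions above can be read as a single boolean check. The Python sketch below assumes that "should not be 0" means that neither frequency may be 0; the function and parameter names are hypothetical.

```python
def detailed_diagnosis_available(license_allows_dd: bool,
                                 normal_frequency: int,
                                 abnormal_frequency: int) -> bool:
    """Return True if the On/Off option for detailed diagnosis would be offered."""
    # Assumed reading: the license must allow the capability and
    # neither the normal nor the abnormal frequency may be 0.
    return license_allows_dd and normal_frequency != 0 and abnormal_frequency != 0
```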
Measures reported by the test:
Measurement Description Measurement Unit Interpretation

Status

Indicates the current state of this virtual machine.

 

The values reported by this measure and its numeric equivalents are mentioned in the table below:

Measure Value Numeric Value
Provisioning succeeded 1
Running 2
Updating 3
Deallocating 4
Starting 5
Stopped 6
Deallocated 7

Note:

By default, this measure reports the Measure Values listed in the table above to indicate the current state of this virtual machine. In the graph of this measure however, the same is represented using the numeric equivalents only.
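When reading values off the graph of this measure, the numeric equivalents can be translated back to state names with a simple lookup. The Python sketch below encodes the table above; the names VM_STATUS_CODES and status_name are hypothetical.

```python
# Measure values and their numeric equivalents, as listed in the table above.
VM_STATUS_CODES = {
    "Provisioning succeeded": 1,
    "Running": 2,
    "Updating": 3,
    "Deallocating": 4,
    "Starting": 5,
    "Stopped": 6,
    "Deallocated": 7,
}

# Reverse lookup: numeric value shown in the graph -> state name.
STATUS_BY_CODE = {code: name for name, code in VM_STATUS_CODES.items()}


def status_name(code: int) -> str:
    """Translate a numeric value from the graph into its state name."""
    return STATUS_BY_CODE.get(code, "Unknown")
```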

Use the detailed diagnosis of this measure to know the IP, location, type, OS, and size of the VM.

Provisioning status

Indicates the provisioning status of this virtual machine.

Number

The values reported by this measure and its numeric equivalents are mentioned in the table below:

Measure Value Numeric Value
Succeeded 1
Updating 2

Note:

By default, this measure reports the Measure Values listed in the table above to indicate the provisioning status of this virtual machine. In the graph of this measure however, the same is represented using the numeric equivalents only.

Total cores

Indicates the total number of cores in this virtual machine.

Number

 

Configured memory

Indicates the amount of memory that is configured for this VM.

GB

 

Maximum disk size

Indicates the maximum size of the disk allocated to this VM.

GB

 

Temporary disk size

Indicates the size of the 'temporary disk' allocated to this VM.

GB

 

Maximum data disks

Indicates the maximum number of the 'Data disks' attached to this VM.

Number

 

Maximum IOPS

Indicates the maximum number of I/O operations that are allowed for this VM.

Number

 

CPU utilization

Indicates the percentage of CPU utilized by this VM.

Percent

A value close to 100% is a cause for concern, as it implies a severe contention for CPU resources on the VM. You may want to look deeper into the VM to figure out if any application is hogging its CPU resources.

Incoming network traffic

Indicates the amount of data received by this VM through all network interfaces.

MB

In the event of network congestion, compare the values of these measures across VMs to know which VM is probably causing it.

Outgoing network traffic

Indicates the amount of data sent out through all the network interfaces by this VM.

MB

Data reads from disk

Indicates the amount of data read from the disk of this VM during the last measurement period.

MB

 

Data writes to disk

Indicates the amount of data written to the disk of this VM during the last measurement period.

MB

 

Disk read operations

Indicates the rate at which data was read from the disk of this VM during the last measurement period.

Operations/sec

 

Disk write operations

Indicates the rate at which data was written to the disk of this VM during the last measurement period.

Operations/sec

 

Total IOPS

Indicates the total number of I/O Operations per second on this VM.

Number

 

Interrupt time

Indicates the percentage of time that the processor of this VM spent receiving and servicing hardware interrupts during the last measurement period.

Percentage

The value of this measure is an indirect indicator of the activity of devices that generate interrupts, such as the system clock, the mouse, disk drivers, data communication lines, network interface cards, and other peripheral devices. These devices normally interrupt the processor when they have completed a task or require attention. Normal thread execution is suspended during interrupts. Most system clocks interrupt the processor every 10 milliseconds, creating a background of interrupt activity.

Processor time

Indicates the percentage of time that the processor of this VM is executing application or operating system processes other than Idle threads.

Percentage

The value of this measure is a primary indicator of processor activity. It is calculated by measuring the time that the processor spends executing the thread of the Idle process in each sample interval, and subtracting that value from 100%. Each processor has an Idle thread which consumes cycles when no other threads are ready to run.

This measure will be reported only if the following conditions are fulfilled:

  1. The Azure Diagnostics Agent should be installed on each VM you want monitored. To achieve this, follow the procedure outlined in Enabling Guest level Monitoring for Azure Virtual Machines.

  2. The DIAGNOSTIC MEASURES flag of this test should be set to On.
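The calculation described for the Processor time measure, idle time subtracted from 100%, can be sketched in Python as follows. The function name is hypothetical and the sketch is only an illustration of the arithmetic.

```python
def processor_time_percent(idle_percent: float) -> float:
    """Derive processor time from the share of the interval spent in the Idle thread."""
    if not 0.0 <= idle_percent <= 100.0:
        raise ValueError("idle_percent must be between 0 and 100")
    return 100.0 - idle_percent
```

For example, a processor that spent 35% of the sample interval in its Idle thread has a processor time of 65%.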

User time

Indicates the percentage of non-idle processor time that is spent in user mode by the processor of this VM.

Percentage

User mode is a restricted processing mode designed for applications, environment subsystems, and integral subsystems. The alternative, privileged mode, is designed for operating system components and allows direct access to hardware and all memory. The operating system switches application threads to privileged mode to obtain operating system services.

This measure will be reported only if the following conditions are fulfilled:

  1. The Azure Diagnostics Agent should be installed on each VM you want monitored. To achieve this, follow the procedure outlined in Enabling Guest level Monitoring for Azure Virtual Machines.

  2. The DIAGNOSTIC MEASURES flag of this test should be set to On.

Privileged time

Indicates the percentage of non-idle processor time spent in privileged mode by the processor of this VM.

Percentage

Privileged mode is a processing mode designed for operating system components and hardware-manipulating drivers. It allows direct access to hardware and all memory. The alternative, user mode, is a restricted processing mode designed for applications, environment subsystems, and integral subsystems. The operating system switches application threads to privileged mode to obtain operating system services. % Privileged Time includes time spent servicing interrupts and DPCs. A high rate of privileged time might be attributable to a large number of interrupts generated by a failing device.

Processor frequency

Indicates the frequency at which the processor of this VM operates.

Number

This measure will be reported only if the following conditions are fulfilled:

  1. The Azure Diagnostics Agent should be installed on each VM you want monitored. To achieve this, follow the procedure outlined in Enabling Guest level Monitoring for Azure Virtual Machines.

  2. The DIAGNOSTIC MEASURES flag of this test should be set to On.

Parking status

Indicates the number of CPU cores that were parked for the processor of this VM.

Number

 

Total processes page faults rate

Indicates the rate at which page faults are occurring in the threads executing in the processes of this VM.

Faults/sec

A page fault occurs when a thread refers to a virtual memory page that is not in its working set in main memory. This does not cause the page to be fetched from disk if it is on the standby list and hence already in main memory, or if it is in use by another process with whom the page is shared.

Total processes handle usage

Indicates the number of handles that are currently utilized by the processes on this VM.

Number

A high value of this measure could indicate a memory leak on the VM.

This measure will be reported only if the following conditions are fulfilled:

  1. The Azure Diagnostics Agent should be installed on each VM you want monitored. To achieve this, follow the procedure outlined in Enabling Guest level Monitoring for Azure Virtual Machines.

  2. The DIAGNOSTIC MEASURES flag of this test should be set to On.

Total processes non-shared data

Indicates the amount of data that the processes on this VM have allocated and that cannot be shared with other processes.

MB

 

Total processes memory

Indicates the current number of bytes in the working set of the processes on this VM.

Number

The working set is the set of memory pages touched recently by the threads in the process. If free memory in the VM is above a certain threshold, pages are left in the working set of a process even if they are not in use. When free memory falls below a certain threshold, pages are trimmed from working sets. If they are needed, they are then soft-faulted back into the working set before they leave main memory.

This measure will be reported only if the following conditions are fulfilled:

  1. The Azure Diagnostics Agent should be installed on each VM you want monitored. To achieve this, follow the procedure outlined in Enabling Guest level Monitoring for Azure Virtual Machines.

  2. The DIAGNOSTIC MEASURES flag of this test should be set to On.

Total processes private memory

Indicates the number of bytes in the working set that are not shared, and cannot be shared, with other processes on this VM.

Number

This measure will be reported only if the following conditions are fulfilled:

  1. The Azure Diagnostics Agent should be installed on each VM you want monitored. To achieve this, follow the procedure outlined in Enabling Guest level Monitoring for Azure Virtual Machines.

  2. The DIAGNOSTIC MEASURES flag of this test should be set to On.

Processes

Indicates the number of system processes in this VM at the time of data collection.

Number

This measure will be reported only if the following conditions are fulfilled:

  1. The Azure Diagnostics Agent should be installed on each VM you want monitored. To achieve this, follow the procedure outlined in Enabling Guest level Monitoring for Azure Virtual Machines.

  2. The DIAGNOSTIC MEASURES flag of this test should be set to On.

Threads

Indicates the number of system threads in this VM at the time of data collection.

Number

 

Context switches

Indicates the rate at which context switches occurred on this VM.

Switches/sec

A context switch occurs when the kernel switches the processor from one thread to another. A context switch might also occur when a thread with a higher priority than the running thread becomes ready or when a running thread must wait for some reason (such as an I/O operation). The Thread\Context Switches/sec counter value increases when the thread gets or loses the time of the processor.

This measure will be reported only if the following conditions are fulfilled:

  1. The Azure Diagnostics Agent should be installed on each VM you want monitored. To achieve this, follow the procedure outlined in Enabling Guest level Monitoring for Azure Virtual Machines.

  2. The DIAGNOSTIC MEASURES flag of this test should be set to On.

Free memory

Indicates the amount of physical memory that is immediately available for allocation to a process or for use by this VM.

MB

A low value for this measure implies excessive memory usage by the VM.

This measure will be reported only if the following conditions are fulfilled:

  1. The Azure Diagnostics Agent should be installed on each VM you want monitored. To achieve this, follow the procedure outlined in Enabling Guest level Monitoring for Azure Virtual Machines.

  2. The DIAGNOSTIC MEASURES flag of this test should be set to On.

Committed memory in use

Indicates the amount of memory in use on this VM for which space has been reserved in the paging file, so that it can be written to disk if required.

MB

This measure will be reported only if the following conditions are fulfilled:

  1. The Azure Diagnostics Agent should be installed on each VM you want monitored. To achieve this, follow the procedure outlined in Enabling Guest level Monitoring for Azure Virtual Machines.

  2. The DIAGNOSTIC MEASURES flag of this test should be set to On.

Cache faults

Indicates the rate at which faults occur when a page sought in the file system cache is not found and must be retrieved from elsewhere in memory (a soft fault) or from disk (a hard fault) of this VM.

Faults/sec

Ideally, the value of this measure should be 0 or very low.

Page reads from disk

Indicates the rate at which the disk of this VM was read to resolve hard page faults.

Pages/sec

Hard page faults occur when a process references a page in virtual memory that is not in its working set or elsewhere in physical memory, and must be retrieved from disk. This measure is a primary indicator of the kinds of faults that cause system-wide delays. It includes read operations to satisfy faults in the file system cache (usually requested by applications) and in noncached mapped memory files. Compare the value of Page Reads/sec to the value of Pages Input/sec to find an average of how many pages were read during each read operation.
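The comparison suggested above amounts to a simple ratio. The Python sketch below assumes that both counter values were sampled over the same interval; the function name is hypothetical.

```python
def average_pages_per_read(pages_input_per_sec: float,
                           page_reads_per_sec: float) -> float:
    """Average number of pages fetched by each disk read operation."""
    if page_reads_per_sec == 0:
        return 0.0  # no read operations occurred, so there is nothing to average
    return pages_input_per_sec / page_reads_per_sec
```

For example, 120 pages input per second across 30 page reads per second means that each read operation brought in 4 pages on average.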

Pages read and written to disk

Indicates the rate at which pages are read from or written to disk of this VM to resolve hard page faults.

Pages/sec

This measure will be reported only if the following conditions are fulfilled:

  1. The Azure Diagnostics Agent should be installed on each VM you want monitored. To achieve this, follow the procedure outlined in Enabling Guest level Monitoring for Azure Virtual Machines.

  2. The DIAGNOSTIC MEASURES flag of this test should be set to On.

Memory paged pool size

Indicates the size of the paged pool which is an area of system memory (physical memory) for objects that can be written to disk of this VM when they are not being used.

MB

This measure will be reported only if the following conditions are fulfilled:

  1. The Azure Diagnostics Agent should be installed on each VM you want monitored. To achieve this, follow the procedure outlined in Enabling Guest level Monitoring for Azure Virtual Machines.

  2. The DIAGNOSTIC MEASURES flag of this test should be set to On.

Non-paged pool kernel memory size

Indicates the size of the nonpaged pool which is an area of system memory (physical memory) for objects that cannot be written to disk of this VM, but must remain in physical memory as long as they are allocated.

MB

This measure will be reported only if the following conditions are fulfilled:

  1. The Azure Diagnostics Agent should be installed on each VM you want monitored. To achieve this, follow the procedure outlined in Enabling Guest level Monitoring for Azure Virtual Machines.

  2. The DIAGNOSTIC MEASURES flag of this test should be set to On.

Committed memory

Indicates the amount of committed virtual memory allocated to this VM.

MB

This measure will be reported only if the following conditions are fulfilled:

  1. The Azure Diagnostics Agent should be installed on each VM you want monitored. To achieve this, follow the procedure outlined in Enabling Guest level Monitoring for Azure Virtual Machines.

  2. The DIAGNOSTIC MEASURES flag of this test should be set to On.

Page Faults rate

Indicates the average number of pages faulted per second.

Faults/sec

This measure will be reported only if the following conditions are fulfilled:

  1. The Azure Diagnostics Agent should be installed on each VM you want monitored. To achieve this, follow the procedure outlined in Enabling Guest level Monitoring for Azure Virtual Machines.

  2. The DIAGNOSTIC MEASURES flag of this test should be set to On.

Transition faults

Indicates the rate at which page faults are resolved by recovering pages that were being used by another process sharing the page, or were on the modified page list or the standby list, or were being written to disk of this VM at the time of the page fault.

Faults/sec

Disk read bytes

Indicates the amount of data read from the disk of this VM.

MB

These measures will be reported only if the following conditions are fulfilled:

  1. The Azure Diagnostics Agent should be installed on each VM you want monitored. To achieve this, follow the procedure outlined in Enabling Guest level Monitoring for Azure Virtual Machines.

  2. The DIAGNOSTIC MEASURES flag of this test should be set to On.

Disk write bytes

Indicates the amount of data written to the disk of this VM.

MB

Connection failures

Indicates the number of times the TCP connection to this VM failed.

Number

This measure is calculated based on the number of times TCP connections have made a direct transition to the CLOSED state from the SYN-SENT state or the SYN-RCVD state along with the number of times TCP connections have made a direct transition to the LISTEN state from the SYN-RCVD state.
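The calculation described above is a sum of three state-transition counts. The Python sketch below illustrates it; the function and parameter names are hypothetical.

```python
def connection_failures(syn_sent_to_closed: int,
                        syn_rcvd_to_closed: int,
                        syn_rcvd_to_listen: int) -> int:
    """Total failed TCP connection attempts, per the transitions described above."""
    # Direct transitions to CLOSED from SYN-SENT or SYN-RCVD,
    # plus direct transitions to LISTEN from SYN-RCVD.
    return syn_sent_to_closed + syn_rcvd_to_closed + syn_rcvd_to_listen
```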

Segments sent

Indicates the rate at which TCP Segments were sent from this VM.

Segments/sec

The value of this measure includes those segments sent from current connections, but excludes those containing only retransmitted bytes.

Segments retransmitted

Indicates the rate at which segments containing one or more previously transmitted bytes were retransmitted by this VM.

Segments/sec

Connections reset

Indicates the number of times that TCP connections from this VM have made a direct transition to the CLOSED state from either the ESTABLISHED or CLOSE-WAIT state.

Number

Segments received

Indicates the rate at which segments were received by this VM, including those received in error.

Segments/sec

The value of this measure includes segments received on currently established connections.

Connections established

Indicates the number of TCP connections established on this VM.

Number

 

Processor idle time

Indicates the percentage of time for which the CPU of this VM has been idle.

Percent

If the CPU utilization measure of a VM reports a value close to 100%, then you may want to compare the value of the Interrupt time, Processor time, User time, and these two measures to understand where CPU was spent - in servicing interrupts? in executing application/OS processes? in user mode? idle? or waiting for input?
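One simple way to make this comparison is to rank the reported percentages. The Python sketch below is only an illustration; the function name and the sample values are hypothetical, and because some of these measures overlap (for instance, Processor time includes User time), the ranking is a starting point rather than a strict breakdown.

```python
def rank_cpu_components(measures: dict) -> list:
    """Sort (measure name, percent) pairs from the largest to the smallest share."""
    return sorted(measures.items(), key=lambda item: item[1], reverse=True)


# Hypothetical sample values for one VM.
sample = {
    "Interrupt time": 5.0,
    "User time": 70.0,
    "Processor idle time": 3.0,
    "Processor wait time": 20.0,
}
top_name, top_value = rank_cpu_components(sample)[0]
```

With the sample values above, top_name is "User time", which would suggest that the CPU is being consumed mainly by application code running in user mode.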

 

Processor wait time

Indicates the percentage of time the processor of this VM was waiting for I/O.

Percent

Page writes on disk

Indicates the rate at which this VM writes pages to disk.

Pages/sec

 

Packet sent error

Indicates the number of error packets sent by this VM.

Number

 

Ideally, the value of these measures should be 0.

Packet received error

Indicates the number of error packets received by this VM.

Number

CPU credits consumed

Indicates the number of CPU credits consumed by this VM.

Number

Some VMs may not need the full performance of the CPU continuously, like web servers, proof of concepts, small databases and development build environments. These workloads typically have burstable performance requirements. To support such requirements, Azure provides you with the ability to purchase a VM size with baseline performance that can build up credits when it is using less than its baseline. Every time a VM uses up a portion of the accumulated CPU credits, it means that that VM is using CPU above its baseline. Ideally therefore, the value of the CPU credits consumed measure should be low, and the CPU credits remaining measure should be high.
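The relationship between CPU usage and credits can be illustrated with a simplified model. The Python sketch below assumes that one credit equals one vCPU running at 100% for one minute, and that a VM banks or spends (baseline - usage) / 100 credits per minute. The function name and the 40% baseline in the example are hypothetical; the actual accounting is performed by Azure.

```python
def credits_delta_per_minute(baseline_percent: float,
                             cpu_usage_percent: float) -> float:
    """Credits banked (positive) or spent (negative) in one minute.

    Simplified model: (baseline - usage) / 100 credits per minute.
    """
    return (baseline_percent - cpu_usage_percent) / 100.0


# Below the baseline the VM accumulates credits; above it, credits are consumed.
banked = credits_delta_per_minute(40.0, 10.0)
spent = credits_delta_per_minute(40.0, 100.0)
```

In this model, a VM with a 40% baseline running at 10% CPU banks credits every minute, while the same VM running at 100% CPU draws down its accumulated credits.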

CPU credits remaining

Indicates the number of CPU credits still unused by this VM.

Number

VM cached bandwidth consumed

Indicates the percentage calculated by the total disk throughput completed over the max cached throughput of this VM.

Percent

Virtual machines that are enabled for both premium storage and premium storage caching have two different storage bandwidth limits.

  • The max uncached disk throughput is the default storage maximum limit that the virtual machine can handle.

  • The max cached storage throughput limit is a separate limit that applies when you enable host caching. Host caching works by bringing storage closer to the VM, so that it can be written to or read from quickly.

If the value of the VM uncached bandwidth consumed is at 100%, it means that the VM has fully utilized its max uncached disk throughput limit. Disk I/O beyond this limit is throttled from this point forward. This can cause the VM to suffer serious and prolonged performance degradations.

If the value of the VM cached bandwidth consumed is at 100%, it means that the VM has fully utilized the max cached storage throughput limit that applies to host caching. You may want to consider raising this limit to improve throughput.
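Both bandwidth measures are described as the disk throughput completed divided by the corresponding maximum throughput, expressed as a percentage. The Python sketch below shows that arithmetic; the function name and the units used in the example are assumptions. The same ratio applies to the cached and uncached IOPS consumed measures, with IOPS in place of throughput.

```python
def consumed_percent(completed: float, limit: float) -> float:
    """Percentage of a throughput (or IOPS) limit that has been consumed."""
    if limit <= 0:
        raise ValueError("limit must be greater than 0")
    return completed / limit * 100.0
```

For example, a VM that completes 96 MB/sec of disk throughput against a limit of 192 MB/sec reports 50% consumed.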

VM uncached bandwidth consumed

Indicates the percentage calculated by the total disk throughput completed over the max uncached throughput of this VM.

Percent

VM cached IOPS consumed

Indicates the percentage calculated by the total IOPS completed over the max cached IOPS limit of this VM.

Percent

Azure virtual machines have input/output operations per second (IOPS) and throughput performance limits based on the virtual machine type and size.

If the value of the VM uncached IOPS consumed measure is 100%, it means the VM performance has been capped. This can happen when the VM is requesting more IOPS or throughput than what is allotted for the virtual machine or its attached disks. When capped, the VM experiences suboptimal performance. This can lead to negative consequences like increased latency. To avoid this, you may want to increase the IOPS limit of the VM.

Reads served by the cache are not included in the disk IOPS and Throughput, hence not subject to disk limits. Cache has its separate IOPS and Throughput limit per VM.

If the VM cached IOPS consumed measure reports the value 100% for a VM, then it means that the VM has exhausted the IOPS limit configured for the cache. This can adversely impact throughput and I/O processing by the VM's cache. To avoid this, you may want to increase the IOPS limit of the cache.

VM uncached IOPS consumed

Indicates the percentage calculated by the total IOPS completed over the max uncached IOPS limit of this VM.

Percent

Uptime

Indicates the uptime of this VM.

Secs

 

Used memory

Indicates the amount of memory used by this VM.

MB

 

Memory utilization

Indicates the percentage of memory used by this VM.

Percent

A value close to 100% indicates excessive memory usage by the VM. A consistent rise in this value could hint at a potential memory shortage on the VM. You may want to allocate more memory to VM to avoid this.
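The Memory utilization measure can be related to the Used memory (MB) and Configured memory (GB) measures of this test. The Python sketch below assumes that utilization is used memory divided by configured memory and that 1 GB equals 1024 MB; this document does not state how the test derives the value, and the function name is hypothetical.

```python
def memory_utilization_percent(used_memory_mb: float,
                               configured_memory_gb: float) -> float:
    """Used memory as a percentage of configured memory."""
    if configured_memory_gb <= 0:
        raise ValueError("configured_memory_gb must be greater than 0")
    configured_memory_mb = configured_memory_gb * 1024.0  # assumes 1 GB = 1024 MB
    return used_memory_mb / configured_memory_mb * 100.0
```

For example, 2048 MB used on a VM configured with 8 GB corresponds to a utilization of 25%.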

Use the detailed diagnosis of the Status measure to know the IP, location, type, OS, and size of the VM.

Figure 3 : The detailed diagnosis of the Status measure of the Azure VM Details test