Memory – ESX Test

This test reports statistics pertaining to the machine memory of the VMware vSphere/ESXi server.

Target of the test : An ESX server host

Agent deploying the test : An internal/remote agent

Outputs of the test : One set of results for every ES

Configurable parameters for the test:
Parameter	Description
Test Period	How often should the test be executed
Host	The host for which the test is to be configured
Port	The port at which the specified HOST listens. By default, this is NULL.
ESX User and ESX Password	In order to enable the test to extract the desired metrics from a target ESX server, you need to configure the test with an ESX USER and ESX PASSWORD. The user credentials to be passed here depend upon the mechanism used by the eG agent for auto-discovering the VMs on the target vSphere server and monitoring the server and its VMs. These discovery/monitoring methodologies and their corresponding configuration requirements are discussed below. Discovering and monitoring by directly connecting to the target vSphere server: Starting with ESX server 3.0, a VMware ESX server offers a web service interface using which the eG agent discovers the guest operating systems on a physical ESX host. The VMware VI SDK is used by the agent to implement the web services interface. To use this interface for discovering the VMs and for monitoring, the eG agent should directly connect to the monitored vSphere/ESX server as an ESX USER with root privileges. However, if, owing to security constraints, you cannot use root user permissions, you can alternatively configure the tests with the credentials of a user who has been assigned the following permissions: Diagnostics; TerminateSession. To see how you can create such a user on the ESX server, refer to the Creating a Special Role on an ESX Server and Assigning the Role to a New User topic. Discovering and monitoring using vCenter: By default, the eG agent connects to each ESX server and discovers the VMs executing on it. While this approach scales well, it requires additional configuration for each server being monitored. For example, separate user accounts may need to be created on each server for accessing VM details. While monitoring large virtualized installations, however, the agents can optionally be configured to perform guest discovery using the VM information already available in vCenter. The same vCenter can also be used to monitor the vSphere server and its VMs. 
In this case, therefore, the ESX USER and ESX PASSWORD that you specify should be those of an Administrator or Virtual Machine Administrator in vCenter. However, if, owing to security constraints, you prefer not to use the credentials of such users, then you can create a special role on vCenter with the following privileges: Diagnostics; Change settings; View and stop sessions. To know how to grant the above-mentioned permissions to a vCenter user, refer to Creating a Special Role on vCenter and Assigning the Role to a New User. If the ESX server for which this test is being configured had been discovered via vCenter, then the eG manager automatically populates the ESX USER and ESX PASSWORD text boxes with the vCenter user credentials using which the ESX discovery was performed.
Confirm Password	Confirm the specified ESX PASSWORD by retyping it here.
SSL	By default, the ESX server is SSL-enabled. Accordingly, the SSL flag is set to Yes by default. This indicates that the eG agent will communicate with the ESX server via HTTPS by default. On the other hand, if the eG agent has been configured to use the VMPerl API or CLI for monitoring (i.e., if the ESX USER parameter is set to none), then the status of the SSL flag is irrelevant. Like the ESX server, vCenter is also SSL-enabled by default. If you have chosen to use vCenter for monitoring all the ESX servers in your environment, then you have to set the SSL flag to Yes.
Webport	By default, in most virtualized environments, the ESX server and vCenter listen on port 80 (if not SSL-enabled) or on port 443 (if SSL-enabled). This implies that while monitoring an SSL-enabled ESX server directly, the eG agent, by default, connects to port 443 of the ESX server to pull out metrics, and while monitoring a non-SSL-enabled ESX server, the eG agent connects to port 80. Similarly, while monitoring an ESX server via an SSL-enabled vCenter, the eG agent connects to port 443 of vCenter to pull out the metrics, and while monitoring via a non-SSL-enabled vCenter, the eG agent connects to port 80 of vCenter. Accordingly, the WEBPORT parameter is set to 80 or 443 depending upon the status of the SSL flag. In some environments however, the default ports 80 or 443 might not apply. In such a case, against the WEBPORT parameter, you can specify the exact port at which the ESX server or vCenter in your environment listens so that the eG agent communicates with that port.
Virtual Center	If the eG manager had discovered the target ESX server by connecting to vCenter, then the IP address of the vCenter server used for discovering this ESX server would be automatically displayed against the VIRTUAL CENTER parameter; similarly, the ESX USER and ESX PASSWORD text boxes will be automatically populated with the vCenter user credentials, using which ESX discovery was performed. If this ESX server has not been discovered using vCenter, but you still want to discover the guests on the ESX server via vCenter, then select the IP address of the vCenter host that you wish to use for guest discovery from the VIRTUAL CENTER list. By default, this list is populated with the IP addresses of all vCenter hosts that were added to the eG Enterprise system at the time of discovery. Upon selection, the ESX USER and ESX PASSWORD that were pre-configured for that vCenter server will be automatically displayed against the respective text boxes. On the other hand, if the IP address of the vCenter server of interest to you is not available in the list, then you can add the details of the vCenter server on-the-fly, by selecting the Other option from the VIRTUAL CENTER list. This will invoke the ADD VCENTER SERVER DETAILS page. Refer to the Adding the Details of a vCenter Server for VM Discovery section to know how to add a vCenter server using this page. Once the vCenter server is added, its IP address, ESX USER, and ESX PASSWORD will be displayed against the corresponding text boxes. On the other hand, if you want the eG agent to behave in the default manner (i.e., communicate with each ESX server for monitoring and VM information), then set the VIRTUAL CENTER parameter to ‘none’.
Detailed Diagnosis	To make diagnosis more efficient and accurate, eG Enterprise embeds an optional detailed diagnostic capability. With this capability, the eG agents can be configured to run detailed, more elaborate tests as and when specific problems are detected. To enable the detailed diagnosis capability of this test for a particular server, choose the On option. To disable the capability, click on the Off option. The option to selectively enable/disable the detailed diagnosis capability will be available only if the following conditions are fulfilled: (a) the eG manager license should allow the detailed diagnosis capability; (b) both the normal and abnormal frequencies configured for the detailed diagnosis measures should not be 0.
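
The relationship between the SSL flag and the WEBPORT parameter described above can be sketched in a few lines of Python. This is only an illustration of the documented defaults (443 with SSL, 80 without, unless a custom port is specified); the function names are hypothetical and are not part of the eG agent.

```python
from typing import Optional


def default_webport(ssl_enabled: bool) -> int:
    """Return the documented default port: 443 if SSL-enabled, else 80."""
    return 443 if ssl_enabled else 80


def effective_webport(ssl_enabled: bool, custom_port: Optional[int] = None) -> int:
    """Return the port to connect to.

    A custom port (hypothetical argument) overrides the default, mirroring
    the case where the ESX server or vCenter listens on a non-default port.
    """
    if custom_port is not None:
        return custom_port
    return default_webport(ssl_enabled)
```

For example, effective_webport(True) falls back to the SSL default, while effective_webport(True, 8443) uses the explicitly specified port.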

Measurements made by the test

Measurement	Description	Measurement Unit	Interpretation

Total physical memory:

Indicates the total amount of machine memory.

Used physical memory:

Indicates the amount of physical memory that is in use.

Ideally, the value of this measure should be low. If this measure reports an abnormally high value, then you can use the detailed diagnosis of this measure to know which VM is contributing to the memory contention.

Free physical memory:

Indicates the amount of machine memory that is free.

A high value is typically desired for this measure.

Percent physical memory free:

Indicates the percentage of machine memory that is available for use.

Percent

A very low value for this measure indicates a shortage of memory resources. If more machine memory is not made available soon, then this could significantly degrade the performance of the guest operating systems.
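
As a rough illustration of how this percentage relates to the Total physical memory and Free physical memory measures, the following Python sketch computes it from the two values. It assumes both values are expressed in the same unit (MB is used here only as an example); the function name is hypothetical.

```python
def percent_physical_memory_free(total_mb: float, free_mb: float) -> float:
    """Return free machine memory as a percentage of total machine memory.

    Both arguments are assumed to be in the same unit (MB here).
    """
    if total_mb <= 0:
        raise ValueError("total memory must be positive")
    return 100.0 * free_mb / total_mb
```

For a host with 1000 MB of machine memory of which 250 MB is free, the function returns 25.0.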

Memory state:

Describes the contention for memory.

Number

This measure takes one of the following values:

0 - high (lot of memory available)
1 - soft
2 - hard
3 - low (memory is overcommitted)

The higher the number, the more constrained the memory state.
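
When post-processing this measure outside the eG console, the numeric values listed above can be translated back into state names with a simple lookup. The sketch below is illustrative only; the dictionary and function names are hypothetical.

```python
# Numeric values reported by the Memory state measure, as listed above.
MEMORY_STATES = {
    0: "high",  # lot of memory available
    1: "soft",
    2: "hard",
    3: "low",   # memory is overcommitted
}


def describe_memory_state(value: int) -> str:
    """Map a reported numeric value to its memory state name."""
    if value not in MEMORY_STATES:
        raise ValueError(f"unknown memory state value: {value}")
    return MEMORY_STATES[value]


def is_more_constrained(value_a: int, value_b: int) -> bool:
    """Return True if state value_a is more constrained than value_b."""
    return value_a > value_b
```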

Memory granted:

Indicates the amount of memory that the VMkernel has allocated to all virtual machines running on the server.

Memory unreserved:

Indicates the current amount of unreserved swap space.

Shared memory:

Indicates the current amount of shared guest operating system memory.

VMware ESX can share common memory pages across VMs. This includes pages from VMs running the same virtual machine OS and applications.

Memory shared common:

Indicates the amount of memory required for a single copy of shared pages in running VMs.

Balloon memory:

Indicates the total amount of physical memory currently reclaimed by the ESX server using the vmmemctl modules.

The vmmemctl driver that is installed on a virtual machine emulates an increase or decrease in memory pressure on the guest operating system; this way, it forces the guest OS to place memory pages into its local swap file. This driver differs from the VMware swap file method as it forces the operating system to determine what memory it wishes to page. Once the memory is paged locally on the guest operating system, the free physical pages of memory may be reallocated to other guests. As the ESX host sees that memory demand has been reduced, it will instruct vmmemctl to “deflate” the balloon and reduce pressure on the guest OS to page memory.

The maximum amount of memory that can be reclaimed from a guest may be configured by modifying the “sched.mem.maxmemctl” advanced option.

If the memory reclaimed from a guest (i.e., the value of this measure) is very low, it indicates excessive memory usage by the guest. Under such circumstances, you might want to consider allocating more memory to the guest.

You can use the detailed memory of the Balloon memory measure to

Percent balloon memory:

Indicates the percentage of balloon memory.

Percent

Current swap used:

Indicates the total amount of swap space used.

This counter reflects the total amount of VMkernel swap usage on the host. In almost any scenario, this counter should be at or close to zero as VMkernel memory swapping is used as a last resort. Significant or consistent memory swapping indicates that ESX host memory is severely overcommitted and that performance degradation is imminent or actively occurring.

Zero memory:

Indicates the amount of memory that is zeroed out.

The “Memory Zero” amount will fluctuate as memory is over-allocated. ESX zeroes out a VM’s memory so that it can be used by other VMs.

Memory reserved capacity:

Indicates the amount of memory currently utilized to satisfy minimum memory values set for all VMs.

Active memory:

Indicates the amount of memory that is actively used.

Use the detailed diagnosis of this measure to understand the active memory usage of each VM.

Kernel memory:

Indicates the amount of machine memory being used by the ESX Server VMKernel.

Swap in rate:

Indicates the rate at which memory is swapped from disk into active memory.

Mbps

A high rate of swap-ins and swap-outs could be indicative of memory contention on the host.

Swap out rate:

Indicates the rate at which memory is swapped from active memory to disk.

Mbps

Memory overhead:

Indicates the total of all overhead metrics for powered-on virtual machines, plus the overhead of running vSphere services on the host.

vSphere/ESXi virtual machines can incur two kinds of memory overhead:

The additional time to access memory within a virtual machine.
The extra space needed by the ESX/ESXi host for its own code and data structures, beyond the memory allocated to each virtual machine.

vSphere/ESXi memory virtualization adds little time overhead to memory accesses. Because the processor’s paging hardware uses page tables (shadow page tables for software-based approach or nested page tables for hardware-assisted approach) directly, most memory accesses in the virtual machine can execute without address translation overhead.

The memory space overhead has two components:

A fixed, system-wide overhead for the VMkernel
Additional overhead for each virtual machine

Overhead memory includes space reserved for the virtual machine frame buffer and various virtualization data structures, such as shadow page tables. Overhead memory depends on the number of virtual CPUs and the configured memory for the guest operating system. vSphere/ESXi also provides optimizations such as memory sharing to reduce the amount of physical memory used on the underlying server. These optimizations can save more memory than is taken up by the overhead.

Is memory overcommitted?

Indicates whether memory is over-committed or not.

Host memory is over-committed when the total memory space allocated (memory granted) to powered-on VMs, plus host memory overhead, is greater than the amount of total physical memory available to the host. However, note that it is unwise to run a virtual machine with a working set that is larger than the host memory. If this is the case, the hypervisor has to reclaim the virtual machine’s active memory through ballooning or hypervisor swapping, which will lead to potentially serious virtual machine performance degradation.

If the host memory is overcommitted, then this measure will report the value Yes. If not, then this measure will report No.

The numeric values that correspond to the measure values discussed above are listed in the table below:

Numeric Value	Measure Value
1	Yes
0	No

Note:

By default, this measure reports the values Yes or No only to indicate whether the host memory is overcommitted or not. The graph of this measure however, represents the host memory state using the numeric equivalents - 0 or 1.
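
The over-commitment condition described for this measure (memory granted to powered-on VMs, plus host memory overhead, exceeding total physical memory) can be sketched as follows. This is a minimal illustration that assumes all three inputs are in the same unit; the function names are hypothetical, and the returned 1/0 values follow the numeric equivalents listed in the table above.

```python
def is_memory_overcommitted(granted: float, overhead: float, total: float) -> int:
    """Return 1 (Yes) if granted memory plus overhead exceeds total
    physical memory, else 0 (No). All inputs share one unit."""
    return 1 if granted + overhead > total else 0


def overcommit_label(numeric_value: int) -> str:
    """Translate the numeric equivalent back to the Yes/No measure value."""
    return "Yes" if numeric_value == 1 else "No"
```

For instance, a host with 64 units of physical memory, 60 units granted, and 10 units of overhead would be reported as overcommitted.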

Memory overcommitted:

Indicates the percentage of memory that is over-committed.

Percent

A very high value for this measure indicates a shortage of memory resources in the host.

Usage of physical memory:

Indicates the memory usage as a percentage of the total configured or available memory.

Percent

A consistent increase in this value could be indicative of a slow but steady erosion of the host physical memory. If the trend continues, it could significantly degrade the performance of the host and the guest operating systems.

If this measure reports a value close to 100%, then you can use the detailed diagnosis of this measure to know which VM is consuming physical memory excessively and contributing to the contention.

Service console memory:

Indicates the amount of memory that is currently reserved for the service console.

Machine memory saving:

Indicates the amount of memory saved due to sharing of memory.

The value of this measure is the difference between the values of the Shared memory and Memory shared common measures (that is, Shared memory minus Memory shared common).

The amount of memory saved by memory sharing depends on workload characteristics. A workload of many nearly identical virtual machines might free up more than thirty percent of memory, while a more diverse workload might result in savings of less than five percent of memory.
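
As a small worked example of the saving, the sketch below subtracts the memory needed for a single copy of the shared pages (Memory shared common) from the total shared guest memory (Shared memory). It assumes both values are in the same unit; the function name is hypothetical.

```python
def machine_memory_saving(shared: float, shared_common: float) -> float:
    """Return the memory saved by page sharing.

    shared        -- value of the Shared memory measure
    shared_common -- value of the Memory shared common measure
    Both are assumed to be in the same unit.
    """
    return shared - shared_common
```

If the guests share 4096 units of memory in total and a single copy of those pages occupies 1024 units, the saving works out to 3072 units.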

Host cache used for swapping

Indicates the space used for caching swapped pages in the host cache.

Datastores that are created on solid state drives (SSD) can be used to allocate space for host cache. The host reserves a certain amount of space for swapping to host cache.

The host cache is made up of files on a low-latency disk that ESXi uses as a write back cache for virtual machine swap files. The cache is shared by all virtual machines running on the host.

When there is severe memory pressure and the hypervisor needs to swap memory pages to disk it will swap to the host cache on the SSD drive instead.

If the value of this measure rises consistently, it indicates that memory pages are constantly being swapped to the host cache. This in turn is indicative of a serious memory crunch on the hypervisor. You may want to add more memory to your hypervisor to avoid this.

Also, if the value of this measure becomes equal to the space allocated to the host cache, it means that the swapped memory pages have completely filled the host cache. Under such circumstances, these memory pages will need to be copied to the regular .vswp file. This is not desirable, as it will decrease the performance of your VMs, since these pages will more than likely need to be swapped in at some point. To avoid this, you may want to resize the host cache.

Memory swap out rate to host cache from active memory

Indicates the rate at which memory is being swapped from active memory to host cache.

Mbps

When there is severe memory pressure and the hypervisor needs to swap memory pages to disk it will swap out to the host cache on the SSD drive instead.

If the memory pages are swapped out to the host cache at a high rate - i.e., if the value of this measure is consistently high - check the amount of free physical memory on the host. A free memory value of 6% or less indicates that the host requires more memory resources.
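
The check described above can be sketched as a two-step test: first confirm that the swap-out rate to host cache is consistently high, then compare the percentage of free physical memory against the 6% mark. The document does not define what counts as a "high" rate, so high_rate below is a hypothetical value chosen by the administrator; the function names are likewise illustrative.

```python
from typing import Sequence


def consistently_high(samples: Sequence[float], high_rate: float) -> bool:
    """Return True if every sample is at or above the chosen high_rate.

    high_rate is an assumed, administrator-chosen value; the source text
    does not specify one.
    """
    return len(samples) > 0 and all(s >= high_rate for s in samples)


def host_needs_more_memory(
    swap_out_samples: Sequence[float],
    percent_free: float,
    high_rate: float,
    free_threshold_percent: float = 6.0,
) -> bool:
    """Apply the rule above: a consistently high swap-out rate combined
    with free physical memory of 6% or less suggests that the host
    requires more memory resources."""
    return (
        consistently_high(swap_out_samples, high_rate)
        and percent_free <= free_threshold_percent
    )
```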

Memory swap in rate from host cache into active memory

Indicates the rate at which memory is being swapped from host cache into active memory.

Mbps

Ideally, the value of this measure should be high.

Latency

Indicates the percentage of time the virtual machines on the host were blocked waiting to access swapped, compressed, or ballooned memory.

Percent

The higher the value of this measure, the more adverse will be the impact on VM performance.

Active write

Indicates the amount of memory actively being written to by the virtual machines on the host.

Low free threshold

Indicates the threshold of free host physical memory below which vSphere will begin reclaiming memory from virtual machines through ballooning and swapping.

If the value of the Free physical memory measure has fallen below the value of this measure, it is a clear indicator of memory contention on the host.
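
The comparison between the Free physical memory measure and this threshold can be expressed as a one-line check. The sketch below is illustrative only, assumes both values are in the same unit, and uses a hypothetical function name.

```python
def memory_contention_indicated(free_physical: float, low_free_threshold: float) -> bool:
    """Return True if free physical memory has fallen below the
    Low free threshold, which the text above treats as a clear
    indicator of memory contention on the host."""
    return free_physical < low_free_threshold
```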

Compression rate

Indicates the rate of memory compression for the virtual machines on the host.

Mbps

If there is a danger of host level swapping, then ESXi will use memory compression to reduce the number of pages that it needs to swap out.

If the value of this measure is high, it could be indicative of memory contention on the host.

Compressed

Indicates the amount of memory compressed.

The higher the value of this measure, the greater the capacity of the host.

Decompression rate

Indicates the rate of memory decompression for the virtual machines on the host.

Mbps

Active memory over commitment

Indicates whether the amount of memory that is actively used by VMs is higher than the machine memory available.

If the machine memory is overcommitted, then this measure will report the value Yes. If not, then this measure will report No.

The numeric values that correspond to the measure values discussed above are listed in the table below:

Numeric Value	Measure Value
1	Yes
0	No

Note:

By default, this measure reports the values Yes or No only to indicate whether the machine memory is overcommitted or not. The graph of this measure however, represents the same using the numeric equivalents - 0 or 1.

Memory swapped in from disk

Indicates the sum of swapin values for all powered-on virtual machines on the host.

Memory swapped out from disk

Indicates the sum of swapout values for all powered-on virtual machines on the host.

A high value for this measure is indicative of severe memory contention on the host.

Heap

Indicates the VMkernel virtual address space dedicated to VMkernel main heap and related data.

The main consumers of VMFS heap are the pointer blocks, which are used to address file blocks in very large files/VMDKs on a VMFS filesystem. Therefore, the larger your VMDKs, the more VMFS heap you can consume.

As a rule of thumb, a single ESXi host should have enough default heap space to address around 10TB of open files/VMDKs on a VMFS-5 volume.

Heap free

Indicates the free address space in the VMkernel main heap.

The value of this measure varies based on number of physical devices and configuration options.

A high value is desired for this measure. If there is no free heap left, then you will be unable to perform any VM operations (power-on, VMotion) and may also be denied access to your VMs. To avoid this, it would be good practice to allocate adequate heap.

VMFS PB cache capacity miss ratio

Indicates the trailing average of the ratio of capacity misses to compulsory misses for the VMFS PB Cache.

Percent

Typically, if a block in memory is accessed for the first time, the block is brought into the cache. This is called a compulsory miss.

If a cache miss occurs because the block requested was discarded from cache owing to lack of memory space, then such a miss is called a capacity miss.

Ideally, the value of this measure should be 0. A high value is indicative of frequent capacity cache misses. To avoid this, make sure that your cache is sized adequately.

VMFS PB cache overhead

Indicates the amount of VMFS heap used by the VMFS PB Cache.

VMFS PB cache size

Indicates the space used for holding VMFS Pointer Blocks in memory.

If the value of the VMFS PB cache capacity miss ratio measure is very high, you may want to compare the value of this measure with that of the Maximum VMFS PB cache size measure to determine whether or not the cache is getting filled up rapidly. If so, then you may want to increase the maximum cache size, so that the cache can grow freely without having to evict any pointer block entries.
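
The comparison suggested above can be sketched by expressing the current cache size as a percentage of the configured maximum and combining it with the capacity miss ratio. The cut-off values (miss_ratio_limit and fill_limit_percent) are hypothetical and would have to be chosen by the administrator, since no specific limits are documented here; the function names are illustrative as well.

```python
def pb_cache_fill_percent(cache_size: float, max_cache_size: float) -> float:
    """Return the VMFS PB cache size as a percentage of its maximum size.

    Both arguments are assumed to be in the same unit.
    """
    if max_cache_size <= 0:
        raise ValueError("maximum cache size must be positive")
    return 100.0 * cache_size / max_cache_size


def consider_raising_max_cache_size(
    capacity_miss_ratio: float,
    cache_size: float,
    max_cache_size: float,
    miss_ratio_limit: float,
    fill_limit_percent: float,
) -> bool:
    """Return True when the capacity miss ratio is above the chosen limit
    and the cache is close to its configured maximum.

    miss_ratio_limit and fill_limit_percent are assumed, site-specific
    values, not documented defaults.
    """
    fill = pb_cache_fill_percent(cache_size, max_cache_size)
    return capacity_miss_ratio > miss_ratio_limit and fill >= fill_limit_percent
```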

Maximum VMFS PB cache size

Indicates the maximum size the VMFS pointer block cache can grow to.

You can configure the minimum and maximum sizes of the pointer block cache on each ESXi host. When the size of the pointer block cache approaches the configured maximum size, an eviction mechanism removes some pointer block entries from the cache.

Base the maximum size of the pointer block cache on the working size of all open virtual disk files that reside on VMFS datastores. All VMFS datastores on the host use a single pointer block cache.