Xen Pool Details Test

A Resource Pool comprises multiple XenServer Host installations, bound together into a single managed entity that can host Virtual Machines. When combined with shared storage, a Resource Pool enables VMs to be started on any XenServer Host that has sufficient memory and then dynamically moved between XenServer Hosts while running, with minimal downtime (XenMotion).

A pool always has at least one physical host, known as the "pool master", that provides a single point of contact for the pool and manages communication with the other members of the pool, known as "slaves", as necessary. If the pool master is shut down or unavailable, you will not be able to connect to the pool until the master is online again or until you nominate one of the other members as the new pool master. However, if a pool is High Availability-enabled, then, upon the failure of the master, another host in the pool is automatically selected as the master. VMs in the pool then automatically restart on the new master.

You can also enable the Workload Balancing component on a pool. Workload Balancing is a XenServer component, packaged as a virtual appliance, that:

  1. Creates reports about VM performance in your XenServer environment
  2. Evaluates resource utilization and locates virtual machines on the best possible hosts in the pool for their workload's needs

Using this test, you can determine whether or not the XenServer being monitored is the pool master, and if so, understand the composition of the pool and know the status of the hosts in the pool. In addition, for the pool master, this test reports whether or not the HA and Workload Balancing features are enabled for the pool.
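As a rough illustration of how the measures reported by this test relate to pool and host state, the sketch below derives them from plain dictionaries. The field names (master, ha_enabled, ha_host_failures_to_tolerate, wlb_enabled, live, enabled) are modelled on the pool and host records of the XenServer API but should be treated as assumptions here, as should the function name and the rule used to count disabled hosts. This is not the eG agent's actual implementation.

```python
def pool_measures(monitored_host, pool, hosts):
    """Derive this test's measures from hypothetical pool and host records.

    pool  -- dict with 'master', 'ha_enabled',
             'ha_host_failures_to_tolerate' and 'wlb_enabled' (assumed names)
    hosts -- dict mapping a host reference to {'live': bool, 'enabled': bool}
    """
    def yes_no(flag):
        return "Yes" if flag else "No"

    is_master = pool["master"] == monitored_host
    measures = {"Is this server pool master?": yes_no(is_master)}
    if not is_master:
        # All other measures are reported only for the pool master.
        return measures

    online = sum(1 for h in hosts.values() if h["live"])
    # Assumption: a host counts as disabled when it is online but not enabled.
    disabled = sum(1 for h in hosts.values() if h["live"] and not h["enabled"])

    measures.update({
        "Is this pool high-availability enabled?": yes_no(pool["ha_enabled"]),
        "High availability host failures to be tolerated":
            pool["ha_host_failures_to_tolerate"],
        "Is this pool work load balancing enabled?": yes_no(pool["wlb_enabled"]),
        "Total hosts in pool": len(hosts),
        "Online hosts in pool": online,
        "Offline hosts in pool": len(hosts) - online,
        "Disabled hosts in pool": disabled,
    })
    return measures
```

Calling pool_measures with the reference of a host that is not the master returns only the first measure, which mirrors the way the remaining measures are reported only for the pool master.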

Target of the test : A XenServer host

Agent deploying the test : An internal/remote agent

Outputs of the test : One set of results for the pool to which the monitored XenServer belongs; if the target XenServer is not part of any pool, then this test will report metrics for a Default descriptor

Configurable parameters for the test
  1. Test period - How often should the test be executed
  2. Host - The host for which the test is to be configured.
  3. PORT - The port at which the specified HOST listens. By default, this is NULL.
  4. XEN user - To enable the eG agent to connect to the XenServer API and collect statistics of interest, this test should log in to the XenServer as a root user. Provide the name of the root user in the XEN USER text box. Root user privileges are mandatory when monitoring XenServer 5.5 (or below). However, if you are monitoring XenServer 5.6 (or above) and you prefer not to expose the credentials of the root user, you have the option of configuring a user with pool-admin privileges as the xen user.

    If you do not want to expose the credentials of a root/pool-admin user either, you can configure the tests with the credentials of a xen user with Read-only privileges to the XenServer. However, if this is done, the Xen Uptime test will not run, and the Xen CPU and Xen Memory tests will not be able to report metrics for the control domain descriptor. To avoid such an outcome, do the following before attempting to configure the eG tests with a xen user who has Read-only privileges to the XenServer:

    • Modify the target XenServer’s configuration in the eG Enterprise system. For this, follow the Infrastructure -> Components -> Add/Modify menu sequence, pick Citrix XenServer as the Component type, and click the Modify button corresponding to the target XenServer.
    • In the modify component details page that then appears, make sure that the os is set to Xen and the Mode is set to ssh.
    • Then, in the same page, proceed to provide the User and Password of a user who has the right to connect to the XenServer console via SSH.
    • Then, click the Update button to save the changes.
  5. Once this is done, you can configure the eG tests with the credentials of a xen user with Read-only privileges.   

  6. xen password - The password of the specified xen user needs to be mentioned here.
  7. confirm password - Confirm the xen password by retyping it here.
  8. ssl - By default, the XenServer is not SSL-enabled. This means that, by default, the eG agent communicates with the XenServer using HTTP. Accordingly, the ssl flag is set to No by default. If you configure the XenServer to use SSL, then make sure that the ssl flag is set to Yes, so that the eG agent communicates with the XenServer using HTTPS. Note that a default SSL certificate comes bundled with every XenServer installation. If you want the eG agent to use this default certificate for communicating with an SSL-enabled XenServer, then no additional configuration is required. However, if you do not want to use the default certificate, then you can generate a self-signed certificate for use by the XenServer. In such a case, you need to explicitly follow the broad steps given below to enable the eG agent to communicate with the XenServer via HTTPS:

    • Obtain the server-certificate for the XenServer
    • Import the server-certificate into the local certificate store of the eG agent

    For a detailed discussion on each of these steps, refer to the Troubleshooting section of this document.

  9. webport - By default, in most virtualized environments, the XenServer listens on port 80 (if not SSL-enabled) or on port 443 (if SSL-enabled). This implies that while monitoring an SSL-enabled XenServer, the eG agent, by default, connects to port 443 of the server to pull out metrics, and while monitoring a non-SSL-enabled XenServer, the eG agent connects to port 80. Accordingly, the webport parameter is set to 80 or 443 depending upon the status of the ssl flag. In some environments, however, the default ports 80 and 443 might not apply. In such a case, against the webport parameter, specify the exact port at which the XenServer in your environment listens, so that the eG agent communicates with that port.
  10. DETAILED DIAGNOSIS - To make diagnosis more efficient and accurate, the eG suite embeds an optional detailed diagnostic capability. With this capability, the eG agents can be configured to run detailed, more elaborate tests as and when specific problems are detected. To enable the detailed diagnosis capability of this test for a particular server, choose the On option. To disable the capability, click on the Off option.

    The option to selectively enable/disable the detailed diagnosis capability will be available only if the following conditions are fulfilled:

    • The eG manager license should allow the detailed diagnosis capability
    • Both the normal and abnormal frequencies configured for the detailed diagnosis measures should not be 0.
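The relationship between the ssl and webport parameters described above can be sketched as follows. This is a minimal illustration only: the function names and the example host name are hypothetical, and the eG agent's internal logic may differ.

```python
def default_webport(ssl_enabled):
    """Return the default port implied by the ssl flag: 443 for SSL, 80 otherwise."""
    return 443 if ssl_enabled else 80


def xenserver_url(host, ssl_enabled, webport=None):
    """Build the base URL used to reach the XenServer.

    webport overrides the default when the server listens on a
    non-standard port; otherwise the default follows the ssl flag.
    """
    scheme = "https" if ssl_enabled else "http"
    port = webport if webport is not None else default_webport(ssl_enabled)
    return f"{scheme}://{host}:{port}"


# Hypothetical host name, used for illustration only.
print(xenserver_url("xen01.example.com", ssl_enabled=False))
print(xenserver_url("xen01.example.com", ssl_enabled=True, webport=8443))
```

Keeping the default in one place means that an explicit webport value is needed only in environments where the standard ports do not apply.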

Measurements made by the test

Each measurement below is described in this order: Measurement, Description, Measurement Unit, Interpretation.

Is this server pool master?:

Indicates whether or not the monitored XenServer is the master in this pool.

If the monitored XenServer is the pool master, then this measure will report the value Yes. If not, this measure will report the value No.

The numeric values that correspond to the above-mentioned measure values are as follows:

    Measure Value    Numeric Value
    Yes              1
    No               0

Note:

By default, this test reports the Measure Values listed in the table above to indicate whether or not a server is the pool master. In the graph of this measure, however, these values are represented using their numeric equivalents.

Is this pool high-availability enabled?:

Indicates whether or not this pool is high-availability (HA) enabled.

This measure is reported only if the XenServer being monitored is the pool master – i.e., only if the ‘Is this server pool master?’ measure reports the value ‘Yes’.

If the pool for which the target XenServer is the master is HA-enabled, then this measure will report the value Yes. If not, then, this measure will report the value No.

The numeric values that correspond to the above-mentioned measure values are as follows:

    Measure Value    Numeric Value
    Yes              1
    No               0

Note:

By default, this test reports the Measure Values listed in the table above to indicate whether or not the pool is HA-enabled. In the graph of this measure, however, these values are represented using their numeric equivalents.

High availability host failures to be tolerated:

Indicates the number of host failures that this pool can tolerate before it is declared to be overcommitted.

Number

This measure is reported only if the XenServer being monitored is the pool master – i.e., only if the ‘Is this server pool master?’ measure reports the value ‘Yes’.

High Availability works by creating a failure plan (that is, by calculating how many hosts can be restarted based on the priorities you set). The number of hosts that can be restarted is based on the available resources (CPU, memory) in the pool. As you specify the restart priority for VMs, XenServer evaluates the resources required to start each VM. When there are not enough resources to restart all the VMs set to be restarted, the pool reaches its Maximum failure capacity and is considered overcommitted. The pool can also be overcommitted for reasons such as not enough free memory or changes to virtual disks and networks that affect which VMs can be restarted on which servers.

To increase the maximum failure capacity for a pool, you need to do one or more of the following:

  • Reduce the number of VMs set to Restart as their restart priority.
  • Increase the amount of RAM on your servers or add more servers to the pool to increase its capacity.
  • Reduce the amount of memory configured on some VMs.
  • Shut down non-essential VMs.
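To make the idea of a maximum failure capacity more concrete, the sketch below estimates how many host failures a pool could absorb using a deliberately simplified, memory-only model. It is not XenServer's actual HA planner: the function names, the data layout, and the first-fit placement heuristic are all assumptions made for illustration, and restart priorities, CPU, storage, and network constraints are ignored.

```python
from itertools import combinations


def can_restart(failed, host_memory, vms):
    """Check whether VMs from the failed hosts fit on the surviving hosts.

    host_memory -- dict mapping host name -> total memory
    vms         -- list of (host name, memory) for VMs set to restart
    """
    # Free memory on each survivor after its own VMs are accounted for.
    free = {h: mem for h, mem in host_memory.items() if h not in failed}
    for host, mem in vms:
        if host in free:
            free[host] -= mem
    # Place displaced VMs largest first (first-fit decreasing heuristic).
    displaced = sorted((mem for host, mem in vms if host in failed), reverse=True)
    for mem in displaced:
        target = next((h for h, f in free.items() if f >= mem), None)
        if target is None:
            return False
        free[target] -= mem
    return True


def max_failures_tolerated(host_memory, vms):
    """Largest k such that any k hosts can fail and all VMs still restart."""
    hosts = list(host_memory)
    tolerated = 0
    for k in range(1, len(hosts)):
        if all(can_restart(set(c), host_memory, vms)
               for c in combinations(hosts, k)):
            tolerated = k
        else:
            break
    return tolerated
```

In this model, reducing the memory configured on VMs or adding a server to the pool raises the result, which corresponds to the remedies listed above. Because first-fit placement is a heuristic, the estimate is conservative: it may report a lower capacity than an exhaustive placement search would find.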

Is this pool work load balancing enabled?

Indicates whether or not this pool is enabled for workload balancing.

This measure is reported only if the XenServer being monitored is the pool master – i.e., only if the ‘Is this server pool master?’ measure reports the value ‘Yes’.

If the pool for which the target XenServer is the master is workload balancing-enabled, then this measure will report the value Yes. If not, then, this measure will report the value No.

The numeric values that correspond to the above-mentioned measure values are as follows:

    Measure Value    Numeric Value
    Yes              1
    No               0

Note:

By default, this test reports the Measure Values listed in the table above to indicate whether or not the pool is workload balancing-enabled. In the graph of this measure, however, these values are represented using their numeric equivalents.

Total hosts in pool:

Indicates the number of XenServer hosts in this pool.

Number

This measure is reported only if the XenServer being monitored is the pool master – i.e., only if the ‘Is this server pool master?’ measure reports the value ‘Yes’.

Online hosts in pool:

Indicates the number of XenServer hosts in this pool that are currently online.

Number

This measure is reported only if the XenServer being monitored is the pool master – i.e., only if the ‘Is this server pool master?’ measure reports the value ‘Yes’.

Offline hosts in pool:

Indicates the number of XenServer hosts in this pool that are currently offline.

Number

This measure is reported only if the XenServer being monitored is the pool master – i.e., only if the ‘Is this server pool master?’ measure reports the value ‘Yes’.

If the value of this measure is equal to the value of the Total hosts in pool measure, it indicates that none of the hosts in the pool are currently available. In this situation, users will be able to access neither the pool nor its VMs.

If the pool master of a pool that is not HA-enabled goes offline, the slaves realize that communication has been lost and each retries for sixty seconds. Each slave then puts itself into emergency mode, in which it accepts only the pool emergency commands. If the master comes back up at this point, it reestablishes communication with its slaves, the slaves leave emergency mode, and operation returns to normal. If the master remains offline, you should choose a slave and promote it to master. Once a slave becomes the master, you need to inform the other slaves who the new master is. Until this process is complete, you will not be able to access the pool.

Now, if the slaves in a pool that is not HA-enabled go offline, they will stop sending heartbeat messages to the master. If no heartbeat has been received for 30 seconds then the master assumes the slave is dead. To recover from this problem, you can repair the slave or instruct the master to forget about the slave node. In the case of the latter, all VMs running on the slave will be marked as ‘offline’ and can be restarted on other hosts.
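The timings described above for pools that are not HA-enabled can be summarized in a small sketch. The constant and function names are hypothetical, and treating the exact 30-second and 60-second marks as already past the threshold is an assumption; the sketch only illustrates the behavior described here, not XenServer's internal implementation.

```python
SLAVE_RETRY_SECONDS = 60               # slave retries this long after losing the master
MASTER_HEARTBEAT_TIMEOUT_SECONDS = 30  # master waits this long for a slave heartbeat


def slave_state(seconds_without_master):
    """State of a slave, given how long it has been unable to reach the master."""
    if seconds_without_master <= 0:
        return "normal"
    if seconds_without_master < SLAVE_RETRY_SECONDS:
        return "retrying"
    # After the retry window, only pool emergency commands are accepted.
    return "emergency"


def master_view_of_slave(seconds_since_heartbeat):
    """How the master regards a slave, given the age of its last heartbeat."""
    if seconds_since_heartbeat >= MASTER_HEARTBEAT_TIMEOUT_SECONDS:
        return "presumed dead"
    return "alive"
```

Note that the two timeouts differ: the master gives up on a silent slave after 30 seconds, whereas a slave keeps retrying a silent master for 60 seconds before entering emergency mode.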

In case of HA-enabled pools, if any host (be it the master or a slave) in the pool goes offline, the HA mechanism automatically moves protected VMs to a healthy host. Additionally, if the host that fails is the master, HA selects another host to take over the master role automatically, meaning that you can continue to manage the XenServer pool.

Disabled hosts in pool:

Indicates the number of XenServer hosts in this pool that are currently disabled.

Number

This measure is reported only if the XenServer being monitored is the pool master – i.e., only if the ‘Is this server pool master?’ measure reports the value ‘Yes’.

If a server in a resource pool is placed in Maintenance mode, all running VMs are automatically migrated from it to another server in the same pool. If the server is the pool master, a new master is also selected for the pool. When all running VMs have been successfully migrated off the server, the server's status is set to Disabled.