vSAN Health Test
In a Virtual SAN (vSAN) enabled cluster, you can use the vSAN health checks to monitor the status of cluster components, diagnose issues, and troubleshoot problems. The health checks cover hardware compatibility, network configuration and operation, advanced vSAN configuration options, storage device health, and virtual machine objects. The vSAN health checks are divided into categories, each of which contains individual health checks.
Health Check Category | Description |
---|---|
Hardware Compatibility | Monitor the cluster components to ensure that they are using supported hardware, software, and drivers. |
Performance Service | Monitor the health of vSAN performance service. |
Network | Monitor vSAN network health. |
Physical disk | Monitor the health of physical devices in the vSAN cluster. |
Data | Monitor vSAN data health. |
Cluster | Monitor vSAN cluster health. |
Capacity utilization | Monitor vSAN cluster capacity. |
Online health | Monitor vSAN cluster health and send to VMware’s analytics backend system for advanced analysis. You must participate in the Customer Experience Improvement Program to use online health checks. |
vSAN Build Recommendation | Monitor vSAN build recommendations for vSphere Update Manager. |
vSAN iSCSI target service | Monitor the iSCSI target service, including the network configuration and runtime status. |
Encryption | Monitor vSAN encryption health. |
Stretched cluster | Monitor the health of a stretched cluster, if applicable. |
Hyperconverged cluster configuration compliance | Monitor the status of hosts and settings configured through the Quickstart workflow. |
The health checks in the table above are executed periodically on the vSAN cluster to verify its health and performance. By continuously tracking these checks, administrators can determine the current health of the cluster and quickly spot alerts on unhealthy conditions. This enables them to act first on the alerts that indicate failure conditions or hardware incompatibility. To help administrators in this regard, eG Enterprise offers the vSAN Health test.
This test monitors the health checks under every health check category on the vSAN cluster and reports, for each category, the number of checks in each state. This helps administrators proactively identify the failures and warnings raised by the health checks and reduces the effort involved in troubleshooting failure conditions.
Note:
This test applies only to the vSAN-enabled clusters in the VMware vCenter server.
Target of the test : A VMware vCenter server
Agent deploying the test : An internal agent
Outputs of the test : One set of results for each vSAN cluster:health check category combination.
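The per-descriptor counting described above can be illustrated with a minimal sketch. It assumes the raw health check outcomes are available as `(cluster, category, state)` tuples; that input shape, the `count_states` name, and the choice to fold unrecognized states into Unknown are hypothetical and are not taken from the eG agent itself.

```python
from collections import Counter, defaultdict

# The six states that the test reports as measures.
STATES = ("Passed", "Skipped", "Info", "Warning", "Failed", "Unknown")


def count_states(results):
    """Count check outcomes per 'cluster:category' descriptor.

    `results` is an iterable of (cluster, category, state) tuples.
    This input shape is an assumption made for illustration only.
    States outside STATES are counted as "Unknown" (also an assumption).
    """
    counts = defaultdict(Counter)
    for cluster, category, state in results:
        descriptor = f"{cluster}:{category}"
        counts[descriptor][state if state in STATES else "Unknown"] += 1
    # Emit every state for every descriptor, including zero counts.
    return {d: {s: c[s] for s in STATES} for d, c in counts.items()}


sample = [
    ("cluster-a", "Network", "Passed"),
    ("cluster-a", "Network", "Failed"),
    ("cluster-a", "Network", "Passed"),
    ("cluster-a", "Physical disk", "Warning"),
]
summary = count_states(sample)
print(summary["cluster-a:Network"])
```

Each key in `summary` corresponds to one descriptor of the test, and each inner dictionary holds one value per measure.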
Parameter | Description |
---|---|
Test Period | How often the test should be executed. |
Host | The host for which this test is to be configured. |
Port | The port at which the specified host listens. |
VC User and VC Password | To connect to vCenter and extract metrics from it, this test should be configured with the name and password of a user with Administrator or Virtual Machine Administrator privileges to vCenter. If security constraints prevent you from using the credentials of such a user, you can instead configure this test with the credentials of a user with Read-only rights to vCenter. To do so, assign the ‘Read-only’ role to a local/domain user to vCenter, and then specify the name and password of this user against the VC User and VC Password text boxes. The steps for assigning this role to a user on vCenter are documented separately. vCenter servers terminate user sessions based on timeout periods. The default timeout period is 30 mins. When you stop an agent, the sessions currently in use by the agent remain open until vCenter times them out at the end of this period. If the agent is restarted within the timeout period, it will open a new set of sessions. If you want the eG agent to close already existing sessions on vCenter before it opens new sessions, then, instead of the ‘Read-only’ user, you can optionally configure the VC User and VC Password parameters with the credentials of a user with permissions to View and Stop Sessions on vCenter. To do so, create a special role on vCenter, grant the View and Stop Sessions privilege (prior to vCenter 4.1, this was called the View and Terminate Sessions privilege) to this role, and then assign the new role to a local/domain user to vCenter. The steps for assigning this role to a user on vCenter are also documented separately. |
Confirm Password | Confirm the password by retyping it in this text box. |
SSL | By default, the vCenter server is SSL-enabled. Accordingly, the SSL flag is set to Yes by default. This indicates that the eG agent will communicate with the vCenter server via HTTPS by default. |
Webport | By default, in most virtualized environments, vCenter listens on port 80 (if not SSL-enabled) or on port 443 (if SSL-enabled) only. This implies that, while monitoring vCenter, the eG agent connects by default to port 80 or 443, depending on whether vCenter is SSL-enabled: if the SSL flag above is set to No, the eG agent connects to vCenter using port 80, and if the SSL flag is set to Yes, the agent-vCenter communication occurs via port 443. Accordingly, the value of the Webport parameter is default unless you change it. In some environments, however, the default ports 80 and 443 might not apply. In such a case, specify against the Webport parameter the exact port at which vCenter in your environment listens, so that the eG agent communicates with that port for collecting metrics from vCenter. |
DDForPassedandInfo | By default, this flag is set to No, indicating that the test does not generate detailed diagnostic measures for the Passed and Info measures. If you want the test to generate and store detailed measures for the Passed and Info measures, set the DDForPassedandInfo flag to Yes. |
DD Frequency | Refers to the frequency with which detailed diagnosis measures are to be generated for this test. The default is 1:1. This indicates that, by default, detailed measures will be generated every time this test runs, and also every time the test detects a problem. You can modify this frequency, if you so desire. To disable the detailed diagnosis capability for this test, specify none against DD Frequency. |
Detailed Diagnosis | To make diagnosis more efficient and accurate, eG Enterprise embeds an optional detailed diagnostic capability. With this capability, the eG agents can be configured to run detailed, more elaborate tests as and when specific problems are detected. To enable the detailed diagnosis capability of this test for a particular server, choose the On option. To disable the capability, choose the Off option. The option to selectively enable/disable the detailed diagnosis capability will be available only if the following conditions are fulfilled: |
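The Webport and DD Frequency rules described in the parameter table can be sketched as two small helpers. This is an illustration only: the function names are hypothetical, and reading the two parts of a value such as 1:1 as (normal runs, problem runs) is an assumption drawn from the parameter description, not from the eG agent's code.

```python
def resolve_web_port(ssl, webport="default"):
    """Return the TCP port on which to contact vCenter.

    `ssl` mirrors the SSL flag (True for Yes, False for No).
    `webport` mirrors the Webport parameter: the literal string
    "default" or an explicit port number.
    """
    if str(webport).strip().lower() == "default":
        # Default ports named in the Webport description.
        return 443 if ssl else 80
    port = int(webport)
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return port


def parse_dd_frequency(value):
    """Parse a DD Frequency value such as "1:1" or "none".

    Returns None when detailed diagnosis is disabled, otherwise a
    pair of integers. Treating the pair as (normal runs, problem
    runs) is an assumption made for this sketch.
    """
    text = value.strip().lower()
    if text == "none":
        return None
    # Unpacking raises ValueError unless there are exactly two parts.
    normal, problem = (int(part) for part in text.split(":"))
    return normal, problem


print(resolve_web_port(ssl=True))           # SSL flag Yes, Webport default
print(resolve_web_port(ssl=False))          # SSL flag No, Webport default
print(resolve_web_port(ssl=True, webport="8443"))
print(parse_dd_frequency("1:1"))
print(parse_dd_frequency("none"))
```

An explicit Webport value always wins over the SSL-derived default, which matches the description of environments where ports 80 and 443 do not apply.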
Measurement | Description | Measurement Unit | Interpretation |
---|---|---|---|
Passed | Indicates the number of tests under this health check category that returned the Passed state. | Number | The detailed diagnosis of this measure, if enabled using the DDForPassedandInfo flag, reveals the names of the tests under each health check category that returned the Passed state, along with the detailed message and health status of each test. |
Skipped | Indicates the number of tests under this health check category that returned the Skipped state. | Number | The detailed diagnosis of this measure lists the names of the tests under each health check category that returned the Skipped state, along with the detailed message and health status of each test. |
Info | Indicates the number of tests under this health check category that returned the Info state. | Number | The detailed diagnosis of this measure, if enabled using the DDForPassedandInfo flag, reveals the names of the tests under each health check category that returned the Info state, along with the detailed message and health status of each test. |
Warning | Indicates the number of tests under this health check category that returned the Warning state. | Number | The detailed diagnosis of this measure lists the names of the tests under each health check category that returned the Warning state, along with the detailed message and health status of each test. |
Failed | Indicates the number of tests under this health check category that returned the Failed state. | Number | The detailed diagnosis of this measure lists the names of the tests under each health check category that returned the Failed state, along with the detailed message and health status of each test. |
Unknown | Indicates the number of tests under this health check category that returned the Unknown state. | Number | The detailed diagnosis of this measure lists the names of the tests under each health check category that returned the Unknown state, along with the detailed message and health status of each test. |
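One way to act on these measures is to rank the descriptors so that those reporting failures surface first. The sketch below is illustrative only: the `prioritize` function and the ordering policy (Failed, then Warning, then Unknown) are assumptions, not behavior defined by the test.

```python
def prioritize(measures):
    """Return descriptors that need attention, most urgent first.

    `measures` maps a 'cluster:category' descriptor to its counts per
    state. Ranking by Failed, then Warning, then Unknown is a
    hypothetical policy chosen for this example.
    """
    def urgency(counts):
        return (
            counts.get("Failed", 0),
            counts.get("Warning", 0),
            counts.get("Unknown", 0),
        )

    # Keep only descriptors with at least one non-zero urgency count.
    flagged = {d: c for d, c in measures.items() if any(urgency(c))}
    # Sort in descending order of urgency.
    return sorted(flagged, key=lambda d: urgency(flagged[d]), reverse=True)


measures = {
    "cluster-a:Network": {"Passed": 10, "Warning": 1, "Failed": 0},
    "cluster-a:Physical disk": {"Passed": 5, "Warning": 0, "Failed": 2},
    "cluster-a:Data": {"Passed": 3},
}
ranked = prioritize(measures)
print(ranked)
```

Descriptors whose Failed, Warning, and Unknown counts are all zero are left out of the result, so an empty list means no category currently needs attention under this policy.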