GPU Sensors - OS Test

This test monitors each GPU available in the hardware unit of the target system and reports the voltage, temperature, and load of each GPU. It also reports the clock speed of each GPU and the average speed of its fans. Using these measures, administrators can be alerted to potential overload conditions and can identify issues that may affect the functioning of the GPU.

Target of the test: A Physical Desktop Group

Agent deploying the test: A remote agent

Output of the test: One set of results for each GPU available in the hardware unit of the target system being monitored.

Configurable parameters for the test
Parameter | Description

Test Period

How often the test should be executed.

Host

The nickname of the Physical Desktop Group component for which this test is to be configured.

Port

Refers to the port on which the specified host listens. By default, this is NULL.

Report Powered OS

This flag is relevant only for those tests that are mapped to the Physical Desktops Details layer. If this flag is set to Yes (the default), the 'inside view' tests report measures even for Windows physical machines that have no users currently logged in. Such desktops are identified by their name rather than as username_on_physicalmachinename. If this flag is set to No, this test does not report measures for physical machines to which no users are currently logged in.

Report By User

This flag is set to Yes by default, and its value cannot be changed. This implies that physical machines will always be identified using the login name of the user. In other words, in VDI environments, this test will, by default, report measures for every username_on_physicalmachinename.

Exclude

By default, this parameter is set to none, which means that the test monitors all the applications launched on the target server. To have the test disregard certain applications, provide a comma-separated list of the process names of those applications in the Exclude text box. For instance: winword.exe,js.exe,taskmgr.exe. The specification can also include wildcard patterns. For example: *win*,js*,*task

DD Frequency

Refers to the frequency with which detailed diagnosis measures are to be generated for this test. For instance, if you set this parameter to 6:1, it means that detailed measures will be generated every time this test runs, and also every time the test detects a problem.
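
The descriptor naming and the Exclude matching described in the parameters above can be illustrated with a minimal Python sketch. It is not the product's implementation. The function names (descriptor, parse_exclude, is_excluded) are hypothetical, and two behaviors are assumptions: that wildcard patterns follow shell-style glob rules, and that matching is case-insensitive.

```python
from fnmatch import fnmatchcase


def descriptor(machine, user=None, report_powered_os=True):
    """Return the name a desktop is reported under, or None if it is skipped."""
    if user:
        # A desktop with a logged-in user is identified as
        # username_on_physicalmachinename.
        return f"{user}_on_{machine}"
    # No user logged in: reported by machine name only when
    # Report Powered OS is Yes.
    return machine if report_powered_os else None


def parse_exclude(spec):
    """Split a comma-separated Exclude value; 'none' means no exclusions."""
    spec = spec.strip()
    if not spec or spec.lower() == "none":
        return []
    return [p.strip() for p in spec.split(",") if p.strip()]


def is_excluded(process_name, patterns):
    """True if the name matches any pattern (glob rules, case-insensitive: assumptions)."""
    name = process_name.lower()
    return any(fnmatchcase(name, p.lower()) for p in patterns)


patterns = parse_exclude("*win*,js*,*task")
for proc in ["winword.exe", "js.exe", "taskmgr.exe", "chrome.exe"]:
    print(proc, is_excluded(proc, patterns))

print(descriptor("desk01", user="alice"))
print(descriptor("desk01"))
print(descriptor("desk01", report_powered_os=False))
```

Under the glob rules assumed here, *task matches only names that end in "task", so taskmgr.exe is not excluded by the example pattern list. The product may interpret patterns differently, so verify the behavior against the actual test before relying on a pattern.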

Measurements made by the test
Measurement | Description | Measurement Unit | Interpretation

Voltage utilized

Indicates the current voltage of this GPU.

Volts

 

Clock Speed

Indicates the current speed of this GPU.

MHz

A very low value for this measure indicates that the GPU is running slowly.

Comparing the value of this measure across GPUs will point you to the GPU that is currently the slowest.

Temperature

Indicates the current temperature of this GPU.

Celsius

The value of this measure should be within permissible limits. A sudden or gradual increase in the value of this measure may affect the functioning of the server and needs immediate attention.

Load Utilized

Indicates the percentage of load handled by this GPU.

Percent

Comparing the value of this measure across GPUs will help you identify the GPU that is handling the maximum load.

Total revolutions

Indicates the average speed of the fans in this GPU.

RPM

The speed of the fans must be within the permissible range. A sudden increase/decrease in the value of this measure is a cause for concern.
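
The interpretation guidance above (comparing measures across GPUs and checking values against permissible limits) can be sketched in Python as follows. This is an illustration only. The GpuReading type, the sample readings, and the limits MAX_TEMPERATURE_C and FAN_RPM_RANGE are hypothetical; real limits depend on the GPU model and are not stated in this document.

```python
from dataclasses import dataclass


@dataclass
class GpuReading:
    name: str
    voltage_v: float      # Voltage utilized (Volts)
    clock_mhz: float      # Clock Speed (MHz)
    temperature_c: float  # Temperature (Celsius)
    load_pct: float       # Load Utilized (Percent)
    fan_rpm: float        # Total revolutions (RPM)


# Hypothetical limits for illustration only.
MAX_TEMPERATURE_C = 85.0
FAN_RPM_RANGE = (800.0, 4500.0)


def summarize(readings):
    """Return the slowest GPU, the most loaded GPU, and any limit violations."""
    slowest = min(readings, key=lambda r: r.clock_mhz)
    busiest = max(readings, key=lambda r: r.load_pct)
    alerts = []
    low, high = FAN_RPM_RANGE
    for r in readings:
        if r.temperature_c > MAX_TEMPERATURE_C:
            alerts.append(f"{r.name}: temperature {r.temperature_c} C above limit")
        if not low <= r.fan_rpm <= high:
            alerts.append(f"{r.name}: fan speed {r.fan_rpm} RPM outside range")
    return slowest.name, busiest.name, alerts


# Made-up sample readings, one per GPU descriptor.
readings = [
    GpuReading("GPU0", 1.05, 1500.0, 71.0, 64.0, 2100.0),
    GpuReading("GPU1", 0.95, 600.0, 88.0, 97.0, 4800.0),
]
slowest, busiest, alerts = summarize(readings)
print("Slowest GPU:", slowest)
print("Busiest GPU:", busiest)
for alert in alerts:
    print("ALERT:", alert)
```

This sketch checks only a single snapshot. Detecting the sudden or gradual changes mentioned in the interpretation column would require comparing readings over several test periods.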