Status Hardware Traps Test
This test monitors the status of various hardware elements present in the Stratus server using SNMP traps.
Target of the test : An SNMP trap
Agent deploying the test : An internal agent
Outputs of the test : One set of results for every OID value monitored.
Parameter | Description
---|---
Test Period | How often the test should be executed.
Host | The host for which the test is to be configured.
Port | The port at which the application listens.
SourceAddress | Specify a comma-separated list of IP addresses or address patterns of the hosts sending the traps. For example, 10.0.0.1,192.168.10.*. A leading '*' signifies any number of leading characters, while a trailing '*' signifies any number of trailing characters.
OIDValue | Provide a comma-separated list of OID and value pairs returned by the traps. Each pair is to be expressed in the form DisplayName:OID-OIDValue. For example, assume that this test is to consider the OIDs .1.3.6.1.4.1.9156.1.1.2 and .1.3.6.1.4.1.9156.1.1.3, which return the values Host_system and Network, respectively. In this case, the OIDValue parameter can be configured as Trap1:.1.3.6.1.4.1.9156.1.1.2-Host_system,Trap2:.1.3.6.1.4.1.9156.1.1.3-Network, where Trap1 and Trap2 are the display names that appear as descriptors of this test in the monitor interface. The test considers a configured OID for monitoring only when the actual value of the OID matches its configured value. For instance, in the example above, if the value of OID .1.3.6.1.4.1.9156.1.1.2 is found to be Host and not Host_system, then the test ignores that OID while monitoring. An * can be used in the OID/value patterns to denote any number of leading or trailing characters (as the case may be). For example, to monitor all the OIDs that return values beginning with the letter 'F', set this parameter to Failed:*-F*.
ShowOID | Selecting the True option against ShowOID ensures that the detailed diagnosis of this test shows the OID strings along with their corresponding values. If you select False, then only the values will appear in the detailed diagnosis page, and not the OIDs.
Detailed Diagnosis | To make diagnosis more efficient and accurate, eG Enterprise embeds an optional detailed diagnostic capability. With this capability, the eG agents can be configured to run detailed, more elaborate tests as and when specific problems are detected. To enable the detailed diagnosis capability of this test for a particular server, choose the On option. To disable the capability, choose the Off option. The option to selectively enable/disable the detailed diagnosis capability will be available only if the following conditions are fulfilled:

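
To illustrate the SourceAddress and OIDValue formats described in the parameter table, the following Python sketch parses an OIDValue specification and applies the '*' wildcard to both trap sources and OID/value pairs. It is not the eG agent's implementation. The names `TrapRule`, `parse_oid_value`, `wildcard_match`, `match_trap` and `source_allowed` are hypothetical, and the sketch assumes that the first '-' in an entry separates the OID pattern from the value pattern.

```python
import re
from typing import List, NamedTuple, Optional


class TrapRule(NamedTuple):
    display_name: str   # descriptor shown in the monitor interface
    oid_pattern: str    # OID, possibly containing '*'
    value_pattern: str  # expected value, possibly containing '*'


def parse_oid_value(spec: str) -> List[TrapRule]:
    """Split a comma-separated list of DisplayName:OID-OIDValue entries."""
    rules = []
    for entry in spec.split(","):
        display_name, _, rest = entry.strip().partition(":")
        # Assumption: the first '-' separates the OID from its value.
        oid_pattern, _, value_pattern = rest.partition("-")
        rules.append(TrapRule(display_name, oid_pattern, value_pattern))
    return rules


def wildcard_match(pattern: str, text: str) -> bool:
    """Match text against a pattern in which '*' stands for any characters.

    The documentation only describes leading and trailing '*'; this sketch
    also accepts '*' in other positions.
    """
    regex = re.escape(pattern).replace(r"\*", ".*")
    return re.fullmatch(regex, text) is not None


def match_trap(rules: List[TrapRule], oid: str, value: str) -> Optional[str]:
    """Return the display name of the first rule matching both OID and value."""
    for rule in rules:
        if wildcard_match(rule.oid_pattern, oid) and wildcard_match(rule.value_pattern, value):
            return rule.display_name
    return None  # no configured rule matches, so the trap is ignored


def source_allowed(source_spec: str, address: str) -> bool:
    """Check a sender address against a comma-separated SourceAddress list."""
    return any(wildcard_match(p.strip(), address) for p in source_spec.split(","))
```

With the example configuration from the table, `match_trap` returns `Trap1` for OID .1.3.6.1.4.1.9156.1.1.2 when its value is Host_system, and `None` when the value is Host, which mirrors the rule that an OID is considered only when its actual value matches the configured one.
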
Measurement | Description | Measurement Unit | Interpretation
---|---|---|---
Empty | Indicates that a slot in the system is in an "empty" state. | Boolean | For a slot, this state indicates that the slot is empty, physically not present, or electrically inaccessible. If the empty device causes the system to go into simplex mode, the device is no longer fault tolerant. In some cases this state represents both a slot and a device. For instance, an instance of an SRA_DIMM in the Empty state means that a slot exists for the DIMM, but that the slot is empty. DIMMs, CPU Boards, IO Boards and Processors are represented by such WMI objects. Sensors go to this state instead of the "Not Present" state when they are not present. Empty devices are generally enumerable.
Not present | Indicates that a device in the system is in a "not present" state. | Boolean | This state indicates that a device is either physically not present or electrically inaccessible. For instance, pulling the power cord on a CPU board makes the DIMMs and Processors on the board go to this state. When a WMI object goes to this state, it is generally not enumerable. Thus, this state only appears in state change events.
Removed | Indicates that a device in the system is in a "removed" state. Usually, this is a final state but it can be a transient state. | Boolean | Usually, this state indicates that a device was intentionally removed from service. When intentionally removed from service, the device remains in this state. Only some devices go to this state when removed from service; other devices go to other offline states. Some devices pass through this state as they are brought online.
Dumping | Indicates that a device is in a "Dumping" state. This is a transient state. | Boolean | This state indicates that a device is in the process of writing a dump to a file.
Diagnostics passed | Indicates that a device is in a "Diagnostics Passed" state. This is a transient state and the device should change to the "online" state when it is brought online. | Boolean | This state indicates that a device has just completed its diagnostic tests.
Initialising | Indicates that a device is in an "Initialising" state. This is a transient state and the device should change to the "online" state when it is brought online. | Boolean | This state indicates that a device is in the process of initializing.
Syncing | Indicates that a device is in a "Syncing" state. This is a transient state and the device should change to the "online" state when it is brought online. | Boolean | This state indicates that a device is synchronizing itself with its partners. For instance, when a CPU is brought up, it synchronizes its memory and its processor state with that of its partners.
Offline | Indicates that a device is in an "offline" state. | Boolean | This state indicates that a device is offline. Only some devices can go to this state, while other devices go into the "Removed From Service" state.
Firmware update complete | Indicates that a device's firmware update procedure has completed. | Boolean |
Diagnostics | Indicates that a device is running diagnostics. | Boolean |
Online | Indicates that a device is in an "online" state. | Boolean | This state indicates that the device is online, but not configured for redundancy. For instance, a working NIC that is not part of a team will be in this state. Although the online state does not indicate whether a device is safe-to-pull or not, on a properly configured system such devices can be assumed safe-to-pull.
Simplex | Indicates that a device is in a "Simplex" state. | Boolean | This state indicates that a device is online, configured for redundancy, and is not safe-to-pull. When applied to a port, it indicates that the port is configured for redundancy, and that whatever is connected to the port is not safe-to-pull.
Duplex | Indicates that a device is in a "Duplex" state. | Boolean | This state indicates that a device is online, configured for redundancy, and is safe-to-pull. When applied to a port, it indicates that the port is configured for redundancy, and that whatever is connected to the port is safe-to-pull.
Shot | Indicates that a device is in a "Shot" state. This is a transient state and the device should transition to either the "broken" or the "online" state once diagnostics are done. | Boolean | This state indicates that a device experienced a problem and will soon move to either an online state or the broken state.
Broken | Indicates that a device is in a "Broken" state. | Boolean | This state indicates that a device is broken. A device can be broken for several reasons, but this state usually points to hardware errors; contact your service provider for service checks. In the case of a port, this state may mean that the port itself is inoperative or that whatever attaches to the port is inoperative: usually either nothing is attached to the port, or whatever should be attached to it is not responding. For example, a NIC port will be in this state when it cannot detect a link.
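
The interpretations in the measurement table can be condensed into a small lookup, for example when post-processing these measures in a script. The Python sketch below is only an illustration under stated assumptions: the helper names are hypothetical, and it assumes each measure is available as a Python boolean keyed by the measurement name shown in the table.

```python
from typing import Dict, List, Optional

# States that the table explicitly describes as transient.
TRANSIENT_STATES = {"Dumping", "Diagnostics passed", "Initialising", "Syncing", "Shot"}

# Safe-to-pull reading of the redundancy states. "Online" is deliberately
# absent: the table says that state alone does not indicate safe-to-pull.
SAFE_TO_PULL = {"Duplex": True, "Simplex": False}


def active_states(measures: Dict[str, bool]) -> List[str]:
    """Return the names of all measures currently reporting true."""
    return sorted(name for name, is_set in measures.items() if is_set)


def is_transient(state: str) -> bool:
    """True if the device is expected to leave this state on its own."""
    return state in TRANSIENT_STATES


def safe_to_pull(state: str) -> Optional[bool]:
    """True/False where the table states it, None where it does not."""
    return SAFE_TO_PULL.get(state)
```

A state such as Shot would therefore be reported as transient, while Broken would not, which matches the advice to contact a service provider only for the latter.
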