v7000 Port Traffic Test

An IBM Storwize V7000 storage system can have two or four hardware components called nodes or node canisters, which provide the virtualization of internal and external volumes, as well as cache and copy services (Remote Copy) functions. Storwize V7000 type 100 node canisters contain four ports for Fibre Channel connection and two ports for 1 Gbps Ethernet connection. Type 300 node canisters contain four ports for Fibre Channel connection, two ports for 1 Gbps Ethernet connection, and an HBA that provides two additional ports for 10 Gbps Ethernet connection. Each node presents a volume to the SAN through four Fibre Channel ports or two FCoE ports.

These ports are therefore the primary handlers of I/O requests from the SAN, and the I/O load on the ports translates directly into load on the volumes. This is why administrators need to continuously monitor the data and commands processed by each port, so that overloaded ports can be quickly identified and the load-balancing algorithm fine-tuned accordingly. Moreover, since port-related errors can deny hosts access to the data stored in the SAN, port monitoring is imperative for detecting such errors quickly and fixing them before they disrupt the normal functioning of the SAN.

The v7000 Port Traffic test helps with both. For each port on a node, this test reports the rate at which data and commands are handled by that port and, for FC ports, the number and nature of the errors/failures encountered. This way, administrators can be proactively alerted to potential port overloads and error conditions (on FC ports), and can rapidly initiate remedial measures to avoid an impending storage system slowdown.

Target of the test : An IBM Storwize v7000 storage system

Agent deploying the test : A remote agent

Outputs of the test : One set of results for each port available in the node of the IBM Storwize v7000 storage system being monitored.

Configurable parameters for the test
Parameter Description

Test Period

How often the test should be executed.

Host

The host for which the test is to be configured.

Port

The port number at which the specified host listens. By default, this is NULL.

Timeout

In the Timeout text box, specify the duration (in seconds) beyond which the test will time out. The default value is 60 seconds.

Measurements made by the test
Measurement Description Measurement Unit Interpretation

Data transmitted to hosts

Indicates the rate at which data is transmitted to the hosts through this port.

MB/Sec

Compare the values of these measures across the ports to identify the ports that are busiest transmitting data to, or receiving data from, the hosts. This way, load-balancing irregularities on the ports can be detected easily.

Data received from hosts

Indicates the rate at which data is received from the hosts through this port.

MB/Sec

Data transmitted to controllers

Indicates the rate at which data is transmitted to the controllers through this port.

MB/Sec

Compare the values of these measures across the ports to identify the ports that are busiest transmitting data to, or receiving data from, the controllers. This way, load-balancing irregularities on the ports can be detected easily.

Data received from controllers

Indicates the rate at which data is received from the controllers through this port.

MB/Sec

Commands initiated to controllers

Indicates the rate at which commands are initiated to the controllers through this port.

Commands/Sec

Compare the values of these measures across the ports to identify the ports that are busiest receiving or initiating commands. This way, load-balancing irregularities on the ports can be detected easily.

Commands received from hosts

Indicates the rate at which commands are received from the hosts through this port.

Commands/Sec
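The cross-port comparison recommended for the data and command measures can be sketched as follows. This is only an illustration, not part of the test itself: the function names, the port labels, and the 1.5x threshold are assumptions chosen for the example, and the input is simply a mapping of port name to the reported rate (MB/Sec or Commands/Sec).

```python
def port_load_shares(rates):
    """Return each port's fraction of the total rate across all ports."""
    total = sum(rates.values())
    if total == 0:
        # No traffic at all: every port carries a zero share.
        return {port: 0.0 for port in rates}
    return {port: rate / total for port, rate in rates.items()}


def overloaded_ports(rates, factor=1.5):
    """List ports whose share exceeds `factor` times an even split.

    `factor` is a hypothetical tuning knob; 1.5 means "50% more than
    the port would carry if the load were spread evenly".
    """
    if not rates:
        return []
    fair_share = 1.0 / len(rates)
    shares = port_load_shares(rates)
    return sorted(p for p, s in shares.items() if s > factor * fair_share)


# Illustrative values only (MB/Sec transmitted to hosts per port).
sample = {"port1": 120.0, "port2": 15.0, "port3": 20.0, "port4": 25.0}
busy = overloaded_ports(sample)
```

With the sample values, only the port carrying two thirds of the total traffic is reported, which is the kind of irregularity the interpretation text asks administrators to look for.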

Link failures

Indicates the number of link failures experienced by this FC port.

Number

Ideally, the value of this measure should be zero. A non-zero value indicates that Fibre Channel connectivity with the port was “broken” that many times. This is likely an indicator of a faulty connector or cable. Link failures also occur when the device connected to the port is restarted or replaced, or when the Fibre Channel cable connected to the port is temporarily disconnected while the device is being serviced.

This measure is applicable only to FC ports.

Loss-of-synchronizations

Indicates the number of times this FC port failed to synchronize.

Number

Ideally, the value of this measure should be zero. A non-zero value for this measure indicates that the port went into the “loss of synchronization” state, where it encountered continuous disparity errors.

This is likely an indicator of a faulty connector or cable. Loss of synchronization also occurs when the device connected to the port is restarted or replaced, or when the Fibre Channel cable connected to the port is temporarily disconnected while the device is being serviced.

If the port remains in the “loss of synchronization” state for longer than a specific period, the port enters the link failure state, which could degrade the performance of the Fibre Channel link.

This measure is applicable only to FC ports.

Loss-of-signal

Indicates the number of times the signal was lost on this FC port.

Number

Ideally, the value of this measure should be zero. A non-zero value for this measure indicates that the port detected a loss of the electrical or optical signal used to transfer data on the port.

This is likely an indicator of a faulty connector or cable. Loss of signal also occurs when the device connected to the port is restarted or replaced, or when the Fibre Channel cable connected to the port is temporarily disconnected while the device is being serviced.

If the port remains in the “loss of signal” state for longer than a specific period, the port enters the link failure state, which could degrade the performance of the Fibre Channel link.

This measure is applicable only to FC ports.

Primitive sequence protocol errors

Indicates the number of Primitive Sequence protocol errors that occurred on this FC port.

Number

Ideally, the value of this measure should be zero.

This measure is applicable only to FC ports.

Invalid transmission words

Indicates the number of invalid transmission words detected on this FC port.

Number

Transmission Words are either data Transmission Words or control Transmission Words. The first two bits of a Transmission Word are the synchronization header, and are set to either binary 01 or binary 10. The remaining 64 bits of the Transmission Word are the output of a scrambler applied to the Transmission Word body. The Transmission Word body is eight bytes that represent a pair of words and/or Special Functions.

On links that use the 8B/10B transmission code, the receiver recognizes an invalid Transmission Word when it detects one of the following conditions:

  • A code violation, as specified by the 8B/10B transmission code, within a Transmission Word. This is referred to as a code violation condition.
  • A K30.7 special character in any character position of a Transmission Word. This indicates that an error condition has been detected at a lower implementation level within the receiver.
  • Any valid special character in the second, third, or fourth character position of a Transmission Word. This is referred to as an invalid special code alignment condition.
  • A defined Ordered Set received with improper beginning running disparity.

Ideally, the value of this measure should be zero.

This measure is applicable only to FC ports.
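The two-bit synchronization header mentioned under Invalid transmission words can be illustrated with a small sketch. It assumes the conventional 64b/66b assignment (binary 01 for data words, binary 10 for control words) and, purely for the example, that a 66-bit word is held in an integer with the header in the two most significant bits. The function names are hypothetical.

```python
BODY_BITS = 64  # scrambled body that follows the two-bit header


def split_transmission_word(word):
    """Split a 66-bit integer into (header, body).

    Assumption for this sketch: the header occupies the two most
    significant bits of the 66-bit value.
    """
    header = (word >> BODY_BITS) & 0b11
    body = word & ((1 << BODY_BITS) - 1)
    return header, body


def classify_sync_header(header):
    """Map a two-bit synchronization header to the word type."""
    if header == 0b01:
        return "data"
    if header == 0b10:
        return "control"
    # 00 and 11 are not valid synchronization headers.
    return "invalid"


example_word = (0b10 << BODY_BITS) | 0xDEADBEEF
header, body = split_transmission_word(example_word)
word_type = classify_sync_header(header)
```

A header of 00 or 11 cannot be a valid synchronization header, so a receiver has no way to interpret the word that follows it.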

Invalid CRCs

Indicates the number of invalid CRCs that occurred on this FC port.

Number

This refers to the number of Fibre Channel frames handled by the port that contain checksum errors.

Ideally, the value of this measure should be zero.

These are usually recoverable errors and will not degrade system performance unless they occur persistently, so that the data cannot be delivered even after retransmissions.

This measure is applicable only to FC ports.
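Since each of the FC error measures above should ideally be zero, a simple check is to list the ones that are not. The sketch below assumes that the measurements for one port are available as a dictionary keyed by the measure names used on this page; that data shape and the function name are assumptions made for illustration.

```python
# Error counters reported for FC ports, named as on this page.
FC_ERROR_MEASURES = (
    "Link failures",
    "Loss-of-synchronizations",
    "Loss-of-signal",
    "Primitive sequence protocol errors",
    "Invalid transmission words",
    "Invalid CRCs",
)


def nonzero_fc_errors(port_measures):
    """Return {measure: value} for every FC error counter above zero.

    Measures missing from the input are treated as zero.
    """
    return {
        name: port_measures[name]
        for name in FC_ERROR_MEASURES
        if port_measures.get(name, 0) > 0
    }


# Illustrative readings for a single FC port.
readings = {
    "Link failures": 0,
    "Loss-of-signal": 2,
    "Invalid CRCs": 7,
    "Data transmitted to hosts": 88.5,  # not an error counter, ignored
}
problems = nonzero_fc_errors(readings)
```

An empty result means the port reported no errors in that measurement period.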

No Buffer credit timer

Indicates the time duration for which this FC port was unable to send frames due to the lack of buffer credit.

Microseconds

Buffer credits, also called buffer-to-buffer credits (BBC), are used as a flow control method by Fibre Channel technology and represent the number of frames a port can store.

Each time a port transmits a frame, that port's BB Credit is decremented by one; for each R_RDY received, that port's BB Credit is incremented by one. Transmission of an R_RDY indicates that the receiving port has processed a frame, freed a receive buffer, and is ready for one more. If the BB Credit is zero, the port cannot transmit until an R_RDY is received back. A high value for this measure therefore indicates that the FC port went without an R_RDY for a long time. This is a cause for concern, as the FC port will not resume communication until the R_RDY is received.
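The credit accounting described above can be modelled in a few lines. This is a simplified sketch for illustration, not how a Fibre Channel port is actually implemented; the class and method names are hypothetical.

```python
class BBCreditCounter:
    """Toy model of buffer-to-buffer credit accounting on one port."""

    def __init__(self, credits):
        if credits < 0:
            raise ValueError("credits must be non-negative")
        self.max_credits = credits  # credits granted to this port
        self.available = credits    # credits currently left

    def can_transmit(self):
        """A frame may be sent only while at least one credit remains."""
        return self.available > 0

    def send_frame(self):
        """Consume one credit per frame; return False if stalled."""
        if not self.can_transmit():
            return False
        self.available -= 1
        return True

    def receive_r_rdy(self):
        """Each R_RDY from the receiver returns one credit."""
        if self.available < self.max_credits:
            self.available += 1


port = BBCreditCounter(credits=2)
sent = [port.send_frame() for _ in range(3)]  # third frame must wait
port.receive_r_rdy()                          # receiver frees a buffer
resumed = port.send_frame()
```

The time a port spends in the stalled state, where `send_frame()` would return False, is what the No Buffer credit timer measure accumulates.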

The solution to this problem is to allocate optimal buffer credits to the FC port. The optimal number of buffer credits is determined by the distance (frame delivery time), the processing time at the receiving port, the link signaling rate, and the size of the frames being transmitted. As the link speed increases, the time taken to place a frame on the link is reduced, so the number of buffer credits must be increased to obtain full link utilization, even in a short-distance environment. Smaller frame sizes need more buffer credits, because each credit covers one frame regardless of its size.

This measure is applicable only to FC ports.
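A rough way to see how distance, link speed, and frame size interact is to estimate how many frames fit into one round trip on the link. The sketch below is a back-of-the-envelope estimate under stated assumptions, not a vendor sizing rule: it assumes about 5 microseconds of propagation delay per kilometre of fibre, 8B/10B encoding (10 bits on the wire per byte), a full-size frame of 2148 bytes, and it ignores the processing time at the receiving port. The function name and defaults are hypothetical.

```python
import math

# Assumed propagation delay of light in optical fibre.
FIBRE_DELAY_US_PER_KM = 5.0


def estimate_bb_credits(distance_km, line_rate_gbaud,
                        frame_bytes=2148, wire_bits_per_byte=10):
    """Estimate the credits needed to keep a link busy.

    The sender needs enough credits to keep transmitting until the
    R_RDY for its first frame returns, i.e. for one round trip.
    """
    # 1 Gbaud carries 1000 bits per microsecond.
    frame_time_us = frame_bytes * wire_bits_per_byte / (line_rate_gbaud * 1000.0)
    round_trip_us = 2 * distance_km * FIBRE_DELAY_US_PER_KM
    return max(1, math.ceil(round_trip_us / frame_time_us))


credits_2g = estimate_bb_credits(10, 2.125)    # 2 Gbps FC over 10 km
credits_4g = estimate_bb_credits(10, 4.25)     # faster link, same distance
credits_small = estimate_bb_credits(10, 2.125, frame_bytes=512)
```

Under these assumptions, doubling the link speed roughly doubles the estimate, and smaller frames raise it further, which matches the guidance that faster links and smaller frame sizes need more buffer credits.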