Citrix HDX Users Test

To ensure that users can access applications/desktops on demand, administrators must closely track each user's accesses, promptly detect probable access latencies, diagnose their root cause, and take steps to avert them well before the user notices and complains. To achieve this, administrators can use the Citrix HDX Users test. This test automatically discovers the users who are currently accessing applications and virtual desktops in a XenApp/XenDesktop infrastructure and, for each user, reports the latencies that user experienced when interacting with the applications/desktops. This way, the test quickly and accurately points administrators to the users who are experiencing slowness, and also indicates what is causing the slowness: the network, or the server hosting the applications/desktops. If a high-latency network is causing the slowness, the test provides detailed insights into network performance and enables administrators to rapidly figure out where the bottleneck lies: on the client-side network or on the server-side network.

Target of the test : An AppFlow-enabled ADC Appliance

Agent deploying the test : A remote agent

Outputs of the test : One set of results for every user who is currently connected to an application/virtual desktop

Configurable parameters for the test
Parameter Description

Test period

How often should the test be executed. It is recommended that you set the test period to 5 minutes, because the eG AppFlow Collector is capable of capturing and aggregating AppFlow data related to the last 5 minutes only.

Host

The host for which the test is to be configured.

Cluster IPs

This parameter applies only if the ADC appliance being monitored is part of an ADC cluster. In this case, configure this parameter with a comma-separated list of IP addresses of all other nodes in that cluster.

If the monitored ADC appliance is down/unreachable, then the eG AppFlow Collector uses the Cluster IPs configuration to figure out which other node in the cluster it should connect to for pulling AppFlow statistics. Typically, the collector attempts to connect to every IP address that is configured against Cluster IPs, in the same sequence in which they are specified. Metrics are pulled from the first cluster node that the collector successfully establishes a connection with.
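The failover order described above can be sketched as follows. This is only an illustration of "try each configured IP in sequence and use the first one that responds"; it is not the collector's actual implementation. The function names, the TCP reachability check, and the port number 443 are assumptions made for the example.

```python
import socket
from typing import Callable, Optional


def tcp_reachable(ip: str, port: int = 443, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to ip:port succeeds (port 443 is assumed)."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        return False


def pick_cluster_node(
    cluster_ips: str,
    is_reachable: Callable[[str], bool] = tcp_reachable,
) -> Optional[str]:
    """Return the first reachable IP from a comma-separated list, in listed order."""
    for ip in (part.strip() for part in cluster_ips.split(",")):
        if ip and is_reachable(ip):
            return ip
    return None  # no node in the list could be reached
```

Because the list is walked in the order in which it was typed, the order of the IP addresses in Cluster IPs determines which node is preferred.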

Enable Logs

This flag is set to No by default. This means that, by default, the eG agent does not create AppFlow logs. You can set this flag to Yes to enable AppFlow logging. If this is done, then the eG agent automatically writes the raw AppFlow records it reads from the collector into individual CSV files. These CSV files are stored in the <EG_AGENT_INSTALL_DIR>\NetFlow\data\<IP_of_Monitored_NetScaler>\hdxappflow\actual_csv folder on the eG agent host. These CSV files provide administrators with granular insights into the HDX appflows, thereby enabling effective troubleshooting.

Note:

By default, the eG agent creates a maximum of 10 CSV files in the actual_csv folder. Beyond this point, the older CSV files will be automatically deleted by the eG agent to accommodate new files with current data. Likewise, a single CSV file can by default contain a maximum of 99999 records only. If the records to be written exceed this default value, then the eG agent automatically creates another CSV file to write the data.
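To get a feel for these defaults, the following sketch computes how many CSV files a given number of records would span, and which files would remain once the retention limit is reached. The function names and the sequential file numbering are hypothetical; only the limits of 10 files and 99999 records come from the note above.

```python
import math
from collections import deque
from typing import List

MAX_FILES = 10        # default number of CSV files retained
MAX_RECORDS = 99999   # default number of records per CSV file


def files_needed(record_count: int, max_records: int = MAX_RECORDS) -> int:
    """Number of CSV files required to hold record_count records."""
    return math.ceil(record_count / max_records)


def retained_files(files_written: int, max_files: int = MAX_FILES) -> List[int]:
    """Sequence numbers of the files still on disk after the oldest are deleted."""
    kept = deque(maxlen=max_files)  # a full deque drops its oldest entry on append
    for number in range(1, files_written + 1):
        kept.append(number)
    return list(kept)
```

For example, under these assumptions 250000 records would need three files, and after the twelfth file is written only files 3 through 12 would remain.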

If required, you can override these default settings. For this, do the following:

  1. Log in to the eG agent host.
  2. Edit the Netflow.Properties file in the <EG_AGENT_INSTALL_DIR>\NetFlow\config directory.
  3. In the file, look for the parameter csv_file_retention_count. This parameter governs the maximum number of CSV files that can be created in the actual_csv folder. By default, it is set to 10. To retain more CSV files at any given point in time, increase the value of this parameter. To retain fewer CSV files, decrease it.
  4. Next, look for the parameter csv_max_flow_record_per_file. This parameter governs the number of flow records that can be written to a single CSV file. By default, it is set to 99999. If you want a single file to accommodate more records, so that the creation of new CSV files is delayed, increase the value of this parameter. If you want to reduce the capacity of a CSV file, so that new CSV files are created sooner, decrease it.
  5. Finally, save the file.
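If you prefer to script the change rather than edit the file by hand, the steps above can be sketched as below. The sketch assumes that the properties file stores one key=value pair per line, which is not confirmed here; check the actual file layout (and keep a backup copy) before applying anything like this. The function name set_property is hypothetical.

```python
def set_property(text: str, key: str, value: int) -> str:
    """Return the file text with key set to value (assumes key=value lines)."""
    lines = []
    found = False
    for line in text.splitlines():
        name = line.split("=", 1)[0].strip()
        if "=" in line and name == key:
            lines.append(f"{key}={value}")  # rewrite only the matching entry
            found = True
        else:
            lines.append(line)              # keep every other line untouched
    if not found:
        raise KeyError(f"{key} not found in properties text")
    return "\n".join(lines) + "\n"


# Illustrative content only; the real file may hold other entries as well.
sample = "csv_file_retention_count=10\ncsv_max_flow_record_per_file=99999\n"
updated = set_property(sample, "csv_file_retention_count", 20)
```

The function works on text rather than on a path, so reading and writing the file is left to the caller.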

DD Frequency

Refers to the frequency with which detailed diagnosis measures are to be generated for this test. The default is 1:1. This indicates that, by default, detailed measures will be generated every time this test runs, and also every time the test detects a problem. You can modify this frequency, if you so desire. Also, if you intend to disable the detailed diagnosis capability for this test, you can do so by specifying none against DD Frequency.
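The sketch below shows one way to read a DD Frequency value. It assumes the format is normal:abnormal and that a value of n means "every nth run", which is an interpretation of the 1:1 default described above rather than a documented rule. The function names are hypothetical.

```python
from typing import Optional, Tuple


def parse_dd_frequency(value: str) -> Optional[Tuple[int, int]]:
    """Parse 'normal:abnormal' into two integers; 'none' disables the capability."""
    text = value.strip().lower()
    if text == "none":
        return None
    normal, abnormal = (int(part) for part in text.split(":"))
    return normal, abnormal


def should_generate_dd(
    frequency: Optional[Tuple[int, int]], run_number: int, problem_detected: bool
) -> bool:
    """Decide whether this run produces detailed measures (assumed semantics)."""
    if frequency is None:
        return False
    normal, abnormal = frequency
    every = abnormal if problem_detected else normal
    return every > 0 and run_number % every == 0
```

With the default of 1:1, this logic generates detailed measures on every run, whether or not a problem was detected.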

Detailed Diagnosis

To make diagnosis more efficient and accurate, eG Enterprise embeds an optional detailed diagnostic capability. With this capability, the eG agents can be configured to run detailed, more elaborate tests as and when specific problems are detected. To enable the detailed diagnosis capability of this test for a particular server, choose the On option. To disable the capability, choose the Off option.

The option to selectively enable/disable the detailed diagnosis capability will be available only if the following conditions are fulfilled:

  • The eG manager license should allow the detailed diagnosis capability
  • Both the normal and abnormal frequencies configured for the detailed diagnosis measures should not be 0.

Measurements made by the test

Measurement

Description

Measurement Unit

Interpretation

Active applications

Indicates the number of applications currently accessed by this user.

Number

To know which applications are actively used by this user, use the detailed diagnosis of this measure.

This measure is reported only for application users, and not desktop users.

Application launches

Indicates the number of applications that were launched by this user.

Number

To know which applications were launched by this user, use the detailed diagnosis of this measure.

This measure is reported only for application users, and not desktop users.

Application terminates

Indicates the number of applications terminated by this user.

Number

To know which applications were terminated by this user, use the detailed diagnosis of this measure.

This measure is reported only for application users, and not desktop users.

Active desktops

Indicates the number of desktops currently accessed by this user.

Number

To know which desktops are actively used by this user, use the detailed diagnosis of this measure.

This measure is reported only for desktop users, and not application users.

Desktop launches

Indicates the number of desktops launched by this user.

Number

To know which desktops were launched by this user, use the detailed diagnosis of this measure.

This measure is reported only for desktop users, and not application users.

Session status

Indicates the current status of this user.

 

The values that this measure can take and the numeric values that correspond to each measure value are listed in the table below:

Measure Value : Numeric Value
Active : 0
SR successful : 1000
Existing ICA session got terminated : 1001
Existing ICA connection got terminated and SR failed : 1002
Existing ICA connection terminated and SR failed and client is trying to do ACR and is successful : 1003

Note:

Typically, this test reports the Measure Values in the table above to indicate session status. In the graph of this measure, however, session status is indicated using the numeric equivalents only.
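When reading the graph or exported data, the numeric values can be translated back to their Measure Values with a simple lookup, as sketched below. The dictionary reproduces the table above; the function name and the handling of unlisted values are assumptions made for the example.

```python
SESSION_STATUS = {
    0: "Active",
    1000: "SR successful",
    1001: "Existing ICA session got terminated",
    1002: "Existing ICA connection got terminated and SR failed",
    1003: (
        "Existing ICA connection terminated and SR failed "
        "and client is trying to do ACR and is successful"
    ),
}


def describe_status(numeric_value: int) -> str:
    """Translate a graphed numeric value into its Measure Value."""
    return SESSION_STATUS.get(numeric_value, f"Unknown ({numeric_value})")
```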

Average application startup duration

Indicates the average time that elapsed between when an application accessed by this user was launched and when it started running.

Msecs

A high value for this measure indicates that one/more applications are starting up slowly on the server. In this case, use the detailed diagnosis of the Active applications measure to know which application is the slowest in starting up.

This measure is reported only for application users, and not desktop users.

RTT

Indicates the screen lag experienced by this user while interacting with applications/desktops.

Msecs

A high value for this measure is indicative of the poor quality of a user’s experience with applications/desktops.

To know the reason for this below-par UX, compare the value of the WAN latency, DC latency, and Host delay measures of that user.

WAN latency

Indicates the average latency experienced by this user due to problems with the client side network.

Msecs

A high value for this measure indicates that the client side network is slow.

If the value of the RTT measure is abnormally high for a user, you can compare the value of this measure with that of the DC latency and Host delay measures of that user to know what is causing the slowness – is it the client side network? the server side network? or the server hosting the applications/desktops?

DC latency

Indicates the average latency experienced by this user due to problems with the server side network.

Msecs

A high value for this measure indicates that the server side network is slow.

If the value of the RTT measure is abnormally high for a user, you can compare the value of this measure with that of the WAN latency and Host delay measures of that user to know what is causing the slowness – is it the client side network? the server side network? or the server hosting the applications/desktops? 

Host delay

Indicates the delay that this user experienced when waiting for the host to process the packets. 

Msecs

A high value for this measure indicates a processing bottleneck with the server hosting the applications.

If the value of the RTT measure is abnormally high for a user, you can compare the value of this measure with that of the WAN latency and DC latency measures of that user to know what is causing the slowness – is it the client side network? the server side network? or the server hosting the applications/desktops?
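The comparison of the WAN latency, DC latency, and Host delay measures can be sketched as a simple "largest contributor" check. This is only an illustration of the comparison suggested above; the function name and the sample values are hypothetical, and real troubleshooting would also weigh each value against its own baseline.

```python
def slowest_segment(wan_latency_ms: float, dc_latency_ms: float, host_delay_ms: float) -> str:
    """Name the segment with the largest delay for one user."""
    segments = {
        "client side network (WAN latency)": wan_latency_ms,
        "server side network (DC latency)": dc_latency_ms,
        "server hosting the applications/desktops (Host delay)": host_delay_ms,
    }
    # max() over the dict keys, ranked by their delay values
    return max(segments, key=segments.get)


# Made-up readings for a user whose RTT is abnormally high
culprit = slowest_segment(wan_latency_ms=180, dc_latency_ms=12, host_delay_ms=9)
```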

Bandwidth

Indicates the bandwidth used by this user.

Kbps

Ideally, the value of this measure should be low.

A high value indicates excessive bandwidth usage by the user.

Compare the value of this measure across users to know which user is consuming bandwidth excessively.

Bytes

Indicates the total bytes consumed by this user's sessions.

Bytes

Compare the value of this measure across users to know which user has the maximum throughput and which has the least.

Client side retransmits

Indicates the number of packets retransmitted on the client side connection during the last measurement period.

Number

Ideally, the value of these measures should be 0.

 

Server side retransmits

Indicates the number of packets retransmitted on the server side connection during the last measurement period.

Number

Client side 0 win count

Indicates how many times this user's client advertised a zero TCP window during the last measurement period.

Number

TCP Zero Window is when the Window size in a machine remains at zero for a specified amount of time.

TCP Window size is the amount of information that a machine can receive during a TCP session and still be able to process the data. Think of it like a TCP receive buffer. When a machine initiates a TCP connection to a server, it will let the server know how much data it can receive by the Window Size.

In many Windows machines, this value is around 64512 bytes. As the TCP session is initiated and the server begins sending data, the client will decrement its Window Size as this buffer fills. At the same time, the client is processing the data in the buffer, and is emptying it, making room for more data. Through TCP ACK frames, the client informs the server of how much room is in this buffer. If the TCP Window Size goes down to 0, the client will not be able to receive any more data until it processes and opens the buffer up again.

The machine (client/server) advertising the Zero Window will not receive any more data from its peer until the window reopens. This is why, ideally, the value of these measures should be 0.

A non-zero value warrants an immediate investigation to determine the reason for the Zero Window. It could be that the client/server was running too many processes at that moment, and its processor was maxed out. Or it could be that there is an error in the TCP receiver, like a Windows registry misconfiguration. Try to determine what the client was doing when the TCP Zero Window happened.

These measures are reported only for application users, and not desktop users.
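The behavior of the receive buffer described above can be illustrated with a small simulation. This is a simplified model under stated assumptions (fixed time ticks, a sender that never exceeds the advertised window); it is not how the ADC appliance counts zero-window events. The function name and the sample traffic are hypothetical.

```python
from typing import Sequence


def count_zero_windows(
    buffer_size: int, offered: Sequence[int], drained: Sequence[int]
) -> int:
    """Count how many times the advertised window drops to zero.

    offered[i] is the number of bytes the sender tries to send in tick i;
    drained[i] is the number of bytes the receiving application reads in tick i.
    """
    buffered = 0
    zero_events = 0
    was_zero = False
    for sent, read in zip(offered, drained):
        window = buffer_size - buffered
        buffered += min(sent, window)     # sender is limited by the window
        buffered -= min(read, buffered)   # application empties the buffer
        is_zero = (buffer_size - buffered) == 0
        if is_zero and not was_zero:
            zero_events += 1              # count each transition into zero once
        was_zero = is_zero
    return zero_events


# A receiver that stalls for two ticks while the sender keeps transmitting
stalled = count_zero_windows(64512, [32256] * 4, [0, 0, 32256, 32256])
```

In this model, a receiver that reads as fast as data arrives never advertises a zero window, while a stalled receiver does so as soon as its buffer fills.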

Server side 0 win count

Indicates how many times during this user's sessions the server advertised a zero TCP window during the last measurement period.

Number

Client RTO

Indicates how many times during the last measurement period the retransmit timeout got invoked on this user's client side connection.

Number

An RTO occurs when the sender is missing too many acknowledgments and decides to take a time out and stop sending altogether. After some amount of time, usually at least one second, the sender cautiously starts sending again, testing the waters with just one packet at first, then two packets, and so on. As a result, an RTO causes, at minimum, a one-second delay on your network. A low value is hence desired for these measures.

These measures are reported only for application sessions, and not desktop sessions.

Server RTO

Indicates how many times during the last measurement period the retransmit timeout got invoked on this user's server side connection.

Number

ACR counts

Indicates the total number of times the client automatically reconnected this user to sessions.

Number

The Automatic Client Reconnect (ACR) policy setting, when enabled, allows automatic reconnection by the same client after a connection has been interrupted. Allowing automatic client reconnect allows users to resume working where they were interrupted when a connection was broken. Automatic reconnection detects broken connections and then reconnects the users to their sessions.

Session reconnects

Indicates the number of times this user's sessions reconnected.

Number

This measure includes only those times a user reconnected to a disconnected session by mechanisms other than the ACR setting.

Client SRTT

Indicates the RTT (round-trip time or screen lag time) for this user smoothed over the client side connection. 

 

Msecs

TCP implementations attempt to predict future round-trip times by sampling the behavior of packets sent over a connection and averaging those samples into a "smoothed" round-trip time estimate, SRTT. When a packet is sent over a TCP connection, the sender times how long it takes for it to be acknowledged, producing a sequence, S, of round-trip time samples: $s_1, s_2, s_3, \ldots$ With each new sample, $s_i$, the new SRTT is computed from the formula:

$$\mathrm{SRTT}_{i+1} = (\alpha \times \mathrm{SRTT}_i) + (1 - \alpha) \times s_i$$

Here, $\mathrm{SRTT}_i$ is the current estimate of the round-trip time, $\mathrm{SRTT}_{i+1}$ is the new computed value, and $\alpha$ is a constant between 0 and 1 that controls how rapidly the SRTT adapts to change. The retransmission time-out ($\mathrm{RTO}_i$), the amount of time the sender will wait for a given packet to be acknowledged, is computed from $\mathrm{SRTT}_i$. The formula is:

$$\mathrm{RTO}_i = \beta \times \mathrm{SRTT}_i$$

Here, $\beta$ is a constant, greater than 1, chosen such that there is an acceptably small probability that the round-trip time for the packet will exceed $\mathrm{RTO}_i$.

These measures are reported only for application sessions, and not desktop sessions.
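The two formulas above can be worked through in a few lines of code. The sketch below is a direct transcription of them; the function names, the choice to seed the estimate with the first sample, and the default values of α (0.875) and β (2.0) are illustrative assumptions, not values stated for this test.

```python
from typing import Sequence


def smoothed_rtt(samples_ms: Sequence[float], alpha: float = 0.875) -> float:
    """Apply SRTT(i+1) = alpha * SRTT(i) + (1 - alpha) * s(i) over all samples.

    The first sample is used as the initial estimate (an assumption).
    """
    if not samples_ms:
        raise ValueError("at least one round-trip time sample is required")
    srtt = samples_ms[0]
    for sample in samples_ms[1:]:
        srtt = alpha * srtt + (1 - alpha) * sample
    return srtt


def retransmission_timeout(srtt_ms: float, beta: float = 2.0) -> float:
    """Apply RTO(i) = beta * SRTT(i); beta must be greater than 1."""
    if beta <= 1:
        raise ValueError("beta must be greater than 1")
    return beta * srtt_ms
```

A value of α close to 1 makes the estimate change slowly when a single sample spikes, whereas a smaller α lets the estimate follow recent samples more closely.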

Server SRTT

Indicates the RTT (round-trip time or screen lag time) of this session, smoothed over the server side connection. 

Msecs

Client jitter

Indicates the client side jitter.

Msecs

Jitter is defined as a variation in the delay of received packets. At the sending side, packets are sent in a continuous stream with the packets spaced evenly apart. Due to network congestion, improper queuing, or configuration errors, this steady stream can become lumpy, or the delay between each packet can vary instead of remaining constant.

A high value for these measures is therefore indicative of a long time gap between ICA packets. To know where the delay is longer, on the client side or on the server side, compare the value of the Client jitter measure with that of the Server jitter measure.

Also, if the value of the RTT measure is abnormally high for a user, you can compare the values of these measures with that of the WAN latency and DC latency measures to know what is causing the problem: the client side network or the server side network.

These measures are reported only for application sessions, and not desktop sessions.
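The definition of jitter given above, a variation in the delay between received packets, can be illustrated as follows. The sketch uses the mean absolute deviation of the gaps between packet arrivals as the variation measure; this is one common choice made for the example and is not necessarily how the ADC appliance computes the value it reports. The function names are hypothetical.

```python
from statistics import mean
from typing import List, Sequence


def inter_arrival_gaps(arrival_times_ms: Sequence[float]) -> List[float]:
    """Gaps between consecutive packet arrival times."""
    return [later - earlier for earlier, later in zip(arrival_times_ms, arrival_times_ms[1:])]


def jitter_ms(arrival_times_ms: Sequence[float]) -> float:
    """Mean absolute deviation of the inter-arrival gaps."""
    gaps = inter_arrival_gaps(arrival_times_ms)
    if not gaps:
        return 0.0
    average_gap = mean(gaps)
    return mean(abs(gap - average_gap) for gap in gaps)
```

Evenly spaced arrivals such as 0, 10, 20, and 30 ms give a jitter of 0, while a "lumpy" stream such as 0, 10, 30, and 40 ms gives a non-zero value.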

Server jitter

Indicates the server side jitter.

Msecs

Use the detailed diagnosis of the Active applications measure to know which applications are being actively used by a user. The application startup time, startup duration, application uptime, and module path are displayed for each active application. From this, you can quickly identify applications that took too long to start up and applications that restarted recently, and investigate the reasons.

Figure 17 : The detailed diagnosis of the Active applications measure reported by the Citrix HDX Users test

Use the detailed diagnosis of the Application launches measure to know which applications were launched by a user.

Figure 18 : The detailed diagnosis of the Application launches measure reported by the Citrix HDX Users test

The detailed diagnosis of the Session status measure provides additional details of a user. If the status of a session is abnormal, you can use these details to know from which client the user is connecting, the client type and version, which server the user is connecting to, the start time, and the uptime of the session. This will help in troubleshooting the abnormal session status.

Figure 19 : The detailed diagnosis of the Session status measure reported by the Citrix HDX Users test

For a desktop user, you can know which desktop that user is currently logged into using the detailed diagnosis of the Active desktops measure. The time at which the desktop started up and the uptime of the desktop are revealed, so that you can instantly figure out whether the desktop experienced any unusual/unscheduled reboot.

Figure 20 : The detailed diagnosis of the Active desktops measure

Use the detailed diagnosis of the Desktop launches measure to know which desktops were recently launched by the user.

Figure 21 : The detailed diagnosis of the Desktop launches measure of the Citrix HDX Users test