Daniel Feller, a lead architect at Citrix, recently posted a couple of very interesting articles on troubleshooting Citrix logon issues. These posts highlight challenging Citrix issues and show how to go about locating their root cause.
Troubleshooting Citrix Problems is Very Challenging
The first example concerns logon times that reached 100+ seconds when users logged onto Citrix Virtual Apps. In this case, Citrix Director reported that VM start, GPO processing, logon script execution and interactive session time all took over 20 seconds. So where is the root cause of the slowness?
Troubleshooting is often very time-consuming and involves a process of elimination. In this case, one can deduce that since the user logon eventually succeeded, the Citrix delivery infrastructure components – StoreFront, Delivery Controller, SQL Server – should have been available and operational. Since the user connected to the Citrix site from an internal end-point in the network, the Citrix ADC/NetScaler could not have caused the problem (only external accesses would have used the Citrix ADC). And since the user did not notice any slowness after the logon succeeded, the virtual desktop that the user connected to during the session was unlikely to be the cause of the issue. Ultimately, the problem was narrowed down to excessive memory utilization on the Citrix Delivery Controller. Since the Citrix Delivery Controller is a central coordinator during logon and session establishment, a resource bottleneck there resulted in an unusually high Citrix logon time, and many of the stages of the logon process showed slowness.
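To make the reasoning above concrete, here is a minimal Python sketch of the first step in that process of elimination: comparing each logon phase against a baseline. The phase names mirror the stages mentioned above, but the durations, the baselines and the `factor` multiplier are assumptions made up for illustration; they are not values from Citrix Director or from the original articles.

```python
# Hypothetical logon-phase durations in seconds (illustrative numbers only).
logon_phases = {
    "vm_start": 24.0,
    "gpo_processing": 22.5,
    "logon_scripts": 21.0,
    "interactive_session": 27.0,
}

# Assumed "normal" duration per phase; real baselines must come from
# the history of your own environment.
baselines = {
    "vm_start": 1.0,
    "gpo_processing": 5.0,
    "logon_scripts": 3.0,
    "interactive_session": 8.0,
}


def slow_phases(durations, baselines, factor=2.0):
    """Return the phases that took more than `factor` times their baseline."""
    return sorted(
        phase
        for phase, seconds in durations.items()
        if seconds > factor * baselines[phase]
    )


def points_to_shared_component(durations, baselines, factor=2.0):
    """True when every phase is slow, hinting at a common upstream bottleneck."""
    return len(slow_phases(durations, baselines, factor)) == len(durations)


print(slow_phases(logon_phases, baselines))
print(points_to_shared_component(logon_phases, baselines))
```

When every phase is slow at once, as in this example, it is usually more productive to look at a component that all phases share (here, the Delivery Controller) than to tune each phase separately.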
Why is Citrix Slow? Where is the Issue?
- Is it on the client side?
- Is it the user’s network?
- Is it due to the Delivery Controller?
- Is it the Citrix Virtual Apps servers or Virtual Desktops VMs?
- Is it due to the user’s workload?
- Is it due to the virtual/cloud platform?
- Is there slowness with the infrastructure services (AD, file storage, profiles etc.)?
- Is it due to storage?
This scenario highlights how routine monitoring of every aspect of Citrix delivery can save organizations a lot of troubleshooting effort. The eG Innovations and DABCC Citrix Performance Survey found that 76% of Citrix professionals spend more than two days a week troubleshooting problems. That’s over 40% of a five-day work week!
Routine monitoring of the Citrix infrastructure would have alerted admins to a memory utilization abnormality on the Delivery Controller. If this had been noticed and corrected, the user would not have even seen slowness during logon, and hours of troubleshooting time could have been saved.
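As a rough illustration of the kind of check that would have caught this, the sketch below flags memory utilization that stays above a threshold for several consecutive samples. The 90% threshold, the three-sample window and the sample values are all assumptions chosen for the example; a real monitoring tool would collect the samples itself and would typically use baselines learned from history.

```python
def sustained_breach(samples, threshold_pct=90.0, min_consecutive=3):
    """Return True if `min_consecutive` samples in a row are at or above the threshold.

    Requiring several consecutive breaches avoids alerting on a single spike.
    """
    run = 0
    for value in samples:
        run = run + 1 if value >= threshold_pct else 0
        if run >= min_consecutive:
            return True
    return False


# Hypothetical memory utilization samples (percent) from a Delivery Controller.
steady_climb = [62, 71, 93, 95, 97, 96]
brief_spikes = [62, 95, 70, 96, 60]

print(sustained_breach(steady_climb))  # sustained high usage
print(sustained_breach(brief_spikes))  # isolated spikes only
```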
In the same example, Daniel also noticed another thing – unusually high VM start times. Normally, in this environment, the VMs were supposed to be ready for user access (power management settings were configured accordingly), so VM start time should have been near zero (0) seconds. Upon further analysis, the problem was narrowed down to a hypervisor issue. The hypervisor on which the already powered on VMs were hosted was over-committed in terms of resources and the VDAs on these VMs were not registering properly with the Citrix Delivery Controller. As a result, the Citrix Delivery Controller was spinning up new VMs for users instead of using the ones that were already powered on.
In this example, with proactive Citrix performance monitoring in place, the administrator would have been alerted to both anomalies – the high resource usage on the hypervisor and the unregistered status of the VDAs. Both of these would have triggered alerts from the monitoring tool and the administrator could have taken remedial action before users noticed the issue.
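The two anomalies described above can be expressed as simple checks. The following is a hedged sketch, not how any particular monitoring product works: the `Host` and `Vda` classes, their fields and the over-commit limit are hypothetical, and the inventory data is invented. In practice these values would come from the hypervisor and from the Citrix site itself.

```python
from dataclasses import dataclass


@dataclass
class Host:
    name: str
    physical_memory_gb: float
    allocated_memory_gb: float  # sum of memory assigned to the VMs on this host


@dataclass
class Vda:
    name: str
    host: str
    powered_on: bool
    registered: bool  # registered with the Delivery Controller


def find_anomalies(hosts, vdas, max_overcommit=1.0):
    """Return alert strings for over-committed hosts and unregistered VDAs."""
    alerts = []
    for host in hosts:
        ratio = host.allocated_memory_gb / host.physical_memory_gb
        if ratio > max_overcommit:
            alerts.append(f"host {host.name}: memory over-committed ({ratio:.2f}x)")
    for vda in vdas:
        # A powered-on VDA that is not registered cannot be brokered to users.
        if vda.powered_on and not vda.registered:
            alerts.append(f"vda {vda.name}: powered on but unregistered")
    return alerts


# Hypothetical inventory for illustration.
hosts = [Host("hv01", 256, 320), Host("hv02", 256, 192)]
vdas = [
    Vda("vda-101", "hv01", powered_on=True, registered=False),
    Vda("vda-102", "hv02", powered_on=True, registered=True),
    Vda("vda-103", "hv02", powered_on=False, registered=False),
]

for alert in find_anomalies(hosts, vdas):
    print(alert)
```

Note that a powered-off VDA is deliberately not reported: being unregistered is only a problem for machines that are supposed to be ready for users.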
Why is Citrix Troubleshooting Hard?
Troubleshooting a Citrix environment is often difficult and time-consuming:
- Firstly, there are many tiers of software and hardware involved in supporting the service. There are many interdependencies between these tiers (e.g., a database outage can cause Citrix Delivery Controller issues). As a result, a problem in one of the supporting tiers – in the above example, it was the hypervisor – can result in Citrix slowness.
- Secondly, user complaints don’t help identify the cause of the problem. Users are not aware of the different Citrix tiers or the supporting services involved. Their complaints often relate to the front-end service that they are accessing. When a user says that “Citrix logon is slow,” administrators often start troubleshooting from the front-end. If the real cause of the problem is a bottleneck in a supporting tier, it may take several hours before the problem is identified.
- Thirdly, insights into the performance of the entire Citrix stack and the supporting infrastructure are not available in one unified console today. The eG Innovations and DABCC survey found that 67% of Citrix admins often use between two and five tools to manage their Citrix infrastructure.
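The interdependencies mentioned in the first point are one reason a unified view helps: if the monitoring tool knows which tier depends on which, it can separate causes from effects. Below is a minimal sketch of that idea. The dependency map is a simplified assumption based on the examples in this post (the Delivery Controller depending on the database, VDAs depending on the hypervisor), and the component names are illustrative.

```python
# Hypothetical dependency map: each component lists the tiers it relies on.
dependencies = {
    "logon": ["delivery_controller", "vda"],
    "delivery_controller": ["sql_server"],
    "vda": ["hypervisor"],
    "sql_server": [],
    "hypervisor": [],
}


def probable_root_causes(dependencies, unhealthy):
    """Return unhealthy components that have no unhealthy direct dependency.

    An unhealthy component whose dependencies are all healthy is a likely
    cause; one that sits on top of another unhealthy tier is more likely
    a symptom.
    """
    return sorted(
        component
        for component in unhealthy
        if not any(dep in unhealthy for dep in dependencies.get(component, []))
    )


# The user only reports "logon is slow", but three tiers show problems.
unhealthy = {"logon", "vda", "hypervisor"}
print(probable_root_causes(dependencies, unhealthy))
```

This sketch only looks at direct dependencies, so it assumes that every tier between a symptom and its cause is also flagged as unhealthy. That simplification is fine for illustration, but a real correlation engine would need to handle gaps in the chain.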
Proactive Citrix Monitoring Can Eliminate Troubleshooting
Organizations that implement proactive performance monitoring for their Citrix infrastructure can greatly reduce the time, effort and cost involved in troubleshooting Citrix issues after they arise.
In order to be effective, performance monitoring for a Citrix infrastructure must:
Watch our award-winning video highlighting how eBay successfully moved from reactive troubleshooting to proactive monitoring of their Citrix workspace services using eG Enterprise.