1. Monitoring the User Experience of VMware Horizon Users
The success of any digital workspace technology depends on the user experience and VMware Horizon is no exception. There are many aspects of user experience. When a user first connects, the logon must happen quickly. When the user launches a client application (e.g., Microsoft Outlook, Google Chrome, etc.), the application must be available for user interactions within seconds.
When using the application, the lag between moving the mouse and seeing the movement on the screen should be no more than a few hundred milliseconds. Audio/video applications should have sufficient bandwidth to operate, and the client applications on user desktops must run without crashing. For a VMware Horizon deployment to deliver the expected value, user experience must stay within these expected levels.
As the deployment’s effectiveness is based on the performance that users experience, VMware Horizon administrators must proactively monitor user experience. There are two common ways to monitor user experience:
- Synthetic monitoring: This type of monitoring simulates user accesses to the VMware Horizon service and reports service availability and response time.
One approach to synthetic monitoring is logon simulation, which periodically checks whether users can log in to the VMware Horizon service from different external and internal locations. Logon simulation is purpose-built for simulating VMware Horizon logons and hence works in a plug-and-play manner. You can set up simulations to run from multiple locations and compare the logon time measured at each. With logon simulation, you can:
- Simulate user logons and catch issues before users are affected
- Measure time for every step of the logon process: browser access, authentication, session establishment, and enumeration to application launch
- Detect which step of the logon process is causing slowness
- Test application/desktop availability and logon performance 24×7
- Establish baselines and compare user experience across locations
The limitation of the logon simulation is that it doesn’t go beyond the logon process. If you need to provide SLAs that cover application launch times and screen refresh latencies, you cannot rely on the logon simulation alone. This is where a full-session simulation comes in. With this capability, you can record a full session including application launch and access within a VDI session, and then have the session replayed periodically to report different user experience metrics.
Neither of these synthetic monitoring approaches requires agents on your VMware Horizon servers or desktops.
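To illustrate how per-step logon timing can point to the slow step, here is a minimal Python sketch. It is not how any particular logon simulator is implemented: the step names come from the list above, while `time_steps`, `slowest_step`, and the placeholder actions (plain sleeps standing in for real interactions with the Horizon service) are hypothetical names introduced only for this example.

```python
import time
from typing import Callable, Dict, List, Tuple


def time_steps(steps: List[Tuple[str, Callable[[], None]]]) -> Dict[str, float]:
    """Run each step in order and record its wall-clock duration in seconds."""
    durations: Dict[str, float] = {}
    for name, action in steps:
        start = time.perf_counter()
        action()
        durations[name] = time.perf_counter() - start
    return durations


def slowest_step(durations: Dict[str, float]) -> str:
    """Return the name of the step that took the longest."""
    return max(durations, key=durations.get)


def _placeholder(seconds: float) -> Callable[[], None]:
    # Assumption: a sleep stands in for a real action against the service.
    return lambda: time.sleep(seconds)


# Hypothetical step list; a real simulator would drive a client or browser.
LOGON_STEPS = [
    ("browser access", _placeholder(0.02)),
    ("authentication", _placeholder(0.10)),
    ("session establishment", _placeholder(0.03)),
    ("enumeration to application launch", _placeholder(0.02)),
]

if __name__ == "__main__":
    timings = time_steps(LOGON_STEPS)
    for step, seconds in timings.items():
        print(f"{step}: {seconds * 1000:.0f} ms")
    print("slowest step:", slowest_step(timings))
```

Running the same step list from several locations and storing the results would give the per-location baselines mentioned above.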
- Real user experience monitoring: While simulation helps in tracking whether your Horizon service is working 24×7, it does not track the performance seen by real users. To monitor real user experience, you need metrics such as the actual logon times of users, PCoIP or Blast latency and available bandwidth, user input delay, and application launch time on the RDSH servers or virtual desktops. In almost all cases, these metrics have to be collected from the user desktops/RDSH servers themselves, so some form of agent software is required on these systems. While some of these metrics are available from VMware Horizon APIs, others have to be obtained through tight integration with Microsoft Windows APIs.
What is VMware RDSH?
An RDS host can be a virtual machine or a physical server. The Remote Desktop Session Host service allows a server to host applications and remote desktop sessions. With Horizon Agent installed on an RDS host, users can connect to applications and desktop sessions by using the display protocol PCoIP or Blast Extreme.
2. Providing Deep Insights into Virtual Desktops/RDSH Sessions
At the core of any VMware Horizon deployment are the session hosts – the virtual desktops or the RDSH hosts. As applications are executed on these hosts and accessed via remote protocols from user terminals, a wealth of performance metrics can be gleaned from these systems. Hence, in-depth monitoring of the virtual desktops and RDSH hosts is a must in any VMware Horizon deployment.
As each virtual desktop or RDSH host runs as a VM, tools such as VMware vROps and Nutanix Prism can provide indicators of the resource usage levels of these VMs. However, as protocol handling and application access are handled by the guest OS within the VMs, an “outside view of a VM” alone is not sufficient. For in-depth performance visibility into a VMware Horizon deployment, administrators need an “inside view of each VM” as well.
Performance questions that can be answered using an inside view of a VM include:
- Which user is logged into the virtual desktop/RDSH host?
- How long has the user been logged in, and how much of that time has the session been idle?
- Have there been any session disconnects and when?
- How long did it take to process the user’s logon and what is the breakdown of logon time: how much time was spent in profile loading, in GPO processing, in logon script execution, etc.?
- What is the user input delay for a session? Is the guest OS taking too long to respond to user requests?
- What applications are running in a user’s session and what resources are they taking up?
- If any browser instance is in use, what resources is it taking up and what are the active URLs?
- At what times were different applications launched during the user’s session and how long did each launch take?
- What is the latency over the PCoIP or Blast channel and how much bandwidth is available during a session?
- What bandwidth is used by each session and which virtual channel (audio, video, printer, etc.) is responsible for the bandwidth usage?
- What system resources (CPU, memory, disk IOPS, etc.) are used by a session? Which processes are responsible for the resource usage?
- Which client applications have crashed most often, and when? (Application crashes can be captured only on the virtual desktop/RDSH host.)
This level of insight is necessary to effectively respond to user complaints that their virtual desktop or application access is slow.
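As a rough illustration of the "inside view", the following Python sketch attributes resource usage to sessions and to the processes within a session. The `ProcessSample` record, the function names, and the sample values are all hypothetical; a real agent would collect such samples from the guest OS.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ProcessSample:
    """One process observed inside a guest OS (hypothetical record layout)."""
    user: str
    process: str
    cpu_percent: float
    memory_mb: float


def usage_by_session(samples: List[ProcessSample]) -> Dict[str, Dict[str, float]]:
    """Sum CPU and memory per user session."""
    totals: Dict[str, Dict[str, float]] = defaultdict(
        lambda: {"cpu_percent": 0.0, "memory_mb": 0.0}
    )
    for s in samples:
        totals[s.user]["cpu_percent"] += s.cpu_percent
        totals[s.user]["memory_mb"] += s.memory_mb
    return dict(totals)


def top_processes(samples: List[ProcessSample], user: str, n: int = 3) -> List[ProcessSample]:
    """Return the n processes using the most CPU in one user's session."""
    mine = [s for s in samples if s.user == user]
    return sorted(mine, key=lambda s: s.cpu_percent, reverse=True)[:n]


# Invented sample data, for illustration only.
SAMPLES = [
    ProcessSample("alice", "chrome.exe", 35.0, 900.0),
    ProcessSample("alice", "outlook.exe", 5.0, 400.0),
    ProcessSample("bob", "excel.exe", 10.0, 300.0),
]

if __name__ == "__main__":
    for user, totals in usage_by_session(SAMPLES).items():
        print(user, totals)
    print([p.process for p in top_processes(SAMPLES, "alice", n=1)])
```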
3. Delivering End-to-End Insights Covering Every Tier of the Horizon Deployment
Unusual activity or resource usage inside a virtual desktop or RDSH host can slow user access. Slowness can also result from abnormalities in any of the other VMware Horizon tiers. A typical VMware Horizon deployment involves many tiers that must work together to support the VDI service: Unified Access Gateways (UAGs), load balancers, Horizon Connection Servers, vCenter servers, ESX servers, profile servers, storage, network, Active Directory, etc., must all be working well. A problem in any of these tiers can manifest as slow user experience. Hence, monitoring each of these tiers is a must.
As the function of each tier is different, the VMware Horizon monitoring tool must be aware of the respective key performance indicators (KPIs) for each tier and track these KPIs.
As the Horizon Connection Servers play a central role in brokering and controlling sessions, monitoring all aspects of their functioning is especially important. Any connectivity issues between the Connection Servers and Active Directory, the vCenter servers, the event database, or the licensing servers must be attended to immediately.
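A basic form of such a connectivity check can be sketched in Python using only the standard library. This is an assumption-laden example rather than a description of any monitoring product: the host names are placeholders, and the ports shown are common defaults (LDAP 389, HTTPS 443, Microsoft SQL Server 1433) that may differ in your environment. A TCP connect only proves that the port is reachable, not that the service behind it is healthy.

```python
import socket
from typing import Dict, Tuple


def tcp_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


# Hypothetical endpoints; replace with the dependencies of your Connection Servers.
DEPENDENCIES: Dict[str, Tuple[str, int]] = {
    "Active Directory (LDAP)": ("ad.example.internal", 389),
    "vCenter (HTTPS)": ("vcenter.example.internal", 443),
    "event database (SQL Server)": ("eventdb.example.internal", 1433),
}


def check_all(deps: Dict[str, Tuple[str, int]], timeout: float = 3.0) -> Dict[str, bool]:
    """Check every dependency and map its name to a reachable/unreachable flag."""
    return {
        name: tcp_reachable(host, port, timeout)
        for name, (host, port) in deps.items()
    }


if __name__ == "__main__":
    for name, ok in check_all(DEPENDENCIES).items():
        print(f"{name}: {'reachable' if ok else 'UNREACHABLE'}")
```

Such a script would need to run from the Connection Server itself (or from a host on the same network path) for the result to be meaningful.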
4. Pinpointing the Root Cause of Problems with Topology Views
When you collect thousands of KPIs from different tiers, manually analyzing all these metrics is a tedious and time-consuming task. If your monitoring tool has machine learning capabilities, it can analyze all the metrics against historical baselines and automatically determine which of the several thousand metrics collected is indicative of a potential problem.
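The idea of comparing metrics against historical baselines can be illustrated with a deliberately simple statistical stand-in. The Python sketch below is not machine learning and is not taken from any monitoring product: it flags a metric when its latest value is more than `k` standard deviations from the historical mean. The function names, the metric names, and the threshold `k = 3` are assumptions chosen for the example.

```python
from statistics import mean, stdev
from typing import Dict, List


def deviates_from_baseline(history: List[float], current: float, k: float = 3.0) -> bool:
    """Flag `current` if it lies more than k standard deviations from the mean of `history`."""
    if len(history) < 2:
        return False  # not enough data to form a baseline
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return current != mu
    return abs(current - mu) > k * sigma


def flag_anomalies(
    histories: Dict[str, List[float]], latest: Dict[str, float], k: float = 3.0
) -> List[str]:
    """Return the names of metrics whose latest value deviates from its own baseline."""
    return [
        name
        for name, value in latest.items()
        if name in histories and deviates_from_baseline(histories[name], value, k)
    ]


if __name__ == "__main__":
    # Invented sample data, for illustration only.
    histories = {
        "logon_time_s": [10, 11, 9, 10, 10, 11, 9, 10],
        "blast_latency_ms": [40, 42, 41, 39, 40],
    }
    latest = {"logon_time_s": 25, "blast_latency_ms": 41}
    print(flag_anomalies(histories, latest))
```

A production tool would typically also account for time-of-day and day-of-week patterns, which this sketch ignores.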
When problems manifest in many different VMware Horizon tiers, administrators are left to wonder where the real issue lies and which alerts are merely its effects. If your monitoring tool has automatic correlation capabilities, it can auto-discover the VMware Horizon landscape and the dependencies between its tiers. Color-coded topology views then enable administrators to see exactly where the root cause of a problem lies.
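One simple way to reason about cause versus effect is to walk a dependency map: an alerting tier is a probable root cause only if none of the tiers it depends on is also alerting. The Python sketch below illustrates that idea. The dependency map is a hypothetical, simplified example (real topologies are discovered rather than hard-coded), the function names are invented, and the approach assumes the dependencies contain no cycles.

```python
from typing import Dict, List, Set

# Hypothetical map: each tier lists the tiers it depends on.
DEPENDS_ON: Dict[str, List[str]] = {
    "user session": ["UAG", "virtual desktop"],
    "UAG": ["Connection Server"],
    "Connection Server": ["Active Directory", "vCenter"],
    "virtual desktop": ["ESX host", "profile server"],
    "ESX host": ["storage"],
}


def all_dependencies(depends_on: Dict[str, List[str]], tier: str) -> Set[str]:
    """Collect every tier that `tier` depends on, directly or indirectly."""
    seen: Set[str] = set()
    stack = list(depends_on.get(tier, []))
    while stack:
        dep = stack.pop()
        if dep not in seen:
            seen.add(dep)
            stack.extend(depends_on.get(dep, []))
    return seen


def probable_root_causes(depends_on: Dict[str, List[str]], alerting: Set[str]) -> List[str]:
    """Return alerting tiers that have no alerting tier beneath them."""
    return sorted(
        tier for tier in alerting
        if not (all_dependencies(depends_on, tier) & alerting)
    )


if __name__ == "__main__":
    alerts = {"user session", "virtual desktop", "ESX host", "storage"}
    print(probable_root_causes(DEPENDS_ON, alerts))
```

In this example, alerts on the user session, the virtual desktop, and the ESX host are treated as effects because the storage tier beneath them is also alerting.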
Accurate root-cause diagnosis will help reduce mean time to repair (MTTR) and save your organization thousands of dollars in user productivity.
Another key advantage is that helpdesk staff can easily use topology views to triage problems. By determining which tier has the most severe issue, they can direct the problem to the right administration team. This can increase operational efficiency: finger-pointing is reduced, and your key administrators spend less time firefighting routine issues.
5. Reporting and Analytics
While real-time performance views are important, VMware Horizon admins are often called upon to perform post-mortem analysis on problems that may have happened in the past. For instance, your CIO may have had a slow logon a few hours ago and you have been asked to investigate why. Having access to detailed historical insights is therefore a must. As most VMware Horizon environments have non-persistent desktops, you cannot rely on access to the user’s desktop to investigate further. The monitoring tool must have sufficient metrics to assist with post-mortem diagnosis.
As many organizations in healthcare, finance, and other sectors have strict compliance requirements, you will need your monitoring tool to assist with usage reporting on sessions, applications, users, and desktops.
At the same time, you need assistance with capacity planning. How can you accommodate additional users? Where are the resource bottlenecks? Do you need to add CPU or memory to your VMware ESX cluster? The monitoring tool should use the empirical data it collects to provide this guidance.
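As a back-of-the-envelope illustration of such guidance, the Python sketch below estimates how many more users a cluster could host and which resource runs out first. All names and numbers are hypothetical, and the model is intentionally crude: it assumes a uniform per-user footprint and keeps a fixed fraction of capacity in reserve, ignoring storage, network, and peak-versus-average usage.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class ClusterCapacity:
    """Total and currently used capacity of a cluster (hypothetical record)."""
    cpu_ghz_total: float
    cpu_ghz_used: float
    memory_gb_total: float
    memory_gb_used: float


def additional_users(
    cluster: ClusterCapacity,
    cpu_ghz_per_user: float,
    memory_gb_per_user: float,
    reserve: float = 0.2,
) -> Tuple[int, str]:
    """Estimate how many more users fit, and name the limiting resource.

    `reserve` is the fraction of total capacity kept free as headroom.
    """
    free_cpu = cluster.cpu_ghz_total * (1 - reserve) - cluster.cpu_ghz_used
    free_mem = cluster.memory_gb_total * (1 - reserve) - cluster.memory_gb_used
    by_cpu = max(0, int(free_cpu // cpu_ghz_per_user))
    by_mem = max(0, int(free_mem // memory_gb_per_user))
    bottleneck = "CPU" if by_cpu <= by_mem else "memory"
    return min(by_cpu, by_mem), bottleneck


if __name__ == "__main__":
    # Invented figures, for illustration only.
    cluster = ClusterCapacity(
        cpu_ghz_total=400, cpu_ghz_used=200,
        memory_gb_total=2048, memory_gb_used=1500,
    )
    users, limit = additional_users(cluster, cpu_ghz_per_user=0.5, memory_gb_per_user=4)
    print(f"room for about {users} more users; limited by {limit}")
```

With these invented figures, memory rather than CPU is the constraint, which is the kind of answer the questions above are looking for.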
A VMware Horizon administrator has a thankless job. In this blog, we have reviewed the top 5 capabilities that VMware Horizon administrators need in their monitoring tool to enable them to perform their job well.