User Analytics Test

Enterprises typically use SharePoint to create web sites and web applications. The success of the SharePoint platform therefore hinges on the level of user satisfaction with the web sites and applications created on that platform. The key to ensuring high user satisfaction lies in closely tracking user requests to the web sites/web applications on SharePoint, measuring the responsiveness of the web sites/web applications to those requests, instantly detecting poor responsiveness, and accurately isolating which user’s experience is being impacted by the slowness, well before that user notices. This can be achieved using the User Analytics test.

This test queries the SharePoint usage database at configured intervals and collects the usage metrics stored therein. These include the web sites/web applications accessed, the count and names of users of each web site/web application, the browsers that were used for web site/web application access, the web pages requested, the time taken for the requested pages to load, where page views spent time and how much, the error responses returned, the resources consumed, and many more. Using the query results, the test then auto-discovers the users accessing each of the web sites/web applications that are configured for monitoring. Then, for each such user, this test reports the average time taken by the corresponding web site/web application to load pages. In the process, the test points administrators to slow web sites/web applications, reveals the exact user who has suffered the most owing to this slowness, and also leads them to the probable source of the slowness: is it owing to a slow web front end? Is it because of slow service calls? Or is it due to inefficient queries to the backend database?

Sometimes, poor user experience can be attributed to HTTP errors. This is why this test instantly alerts administrators to HTTP error responses, thus ensuring their timely intervention and rapid resolution of the error conditions.

This way, the User Analytics test enables administrators to proactively detect users who are experiencing or who will potentially experience performance issues with a web site/web application, helps them promptly and accurately diagnose the source of the poor user experience, and thus ensures that they initiate measures to enhance user experience and pre-empt the damage that may be caused to revenue and reputation.

Note:

This test will run only if a SharePoint Usage and Health Service application is created and is configured to collect usage and health data. To know how to create and configure this application, follow the steps detailed in Configuring the eG Agent to Collect Usage Analytics.

Target of the test : A Microsoft SharePoint Server

Agent deploying the test : An internal/remote agent

Outputs of the test : One set of results for each user accessing every SharePoint site configured for monitoring

First-level descriptor: Site URL

Second-level descriptor: User name

Configurable parameters for the test
Parameters Description

Test period

Indicates how often the test should be executed.

Host

The host for which the test is to be configured.

Port

The port at which the host server listens.

SQL Port Number

Specify the port number of the Microsoft SQL server that is hosting the usage database.

Instance

If the SQL server hosting the usage database is instance-based, then provide the instance name here. If not, then set this to none.

SSL

If the SQL server hosting the usage database is SSL-enabled, then set this flag to Yes. If not, set it to No.

Isntlmv2

In some Windows networks, NTLM (NT LAN Manager) may be enabled. NTLM is a suite of Microsoft security protocols that provides authentication, integrity, and confidentiality to users. NTLM version 2 (“NTLMv2”) was introduced to address the security issues present in NTLM. By default, the Isntlmv2 flag is set to No, indicating that NTLMv2 is not enabled by default on the SQL server hosting the usage database. Set this flag to Yes if NTLMv2 is enabled on that SQL server.

Database Domain

Specify the fully qualified name of the domain in which the Microsoft SQL server hosting the usage database operates. For instance, your specification can be: SharePoint.eginnovations.com

Database Server Name

Specify the name of the Microsoft SQL server that hosts the usage database to be accessed by this test.

Database Name

Specify the name of the usage database that this test should access.

Database User Name, Database Password, Confirm Password

Specify the credentials of a user who has read-only access to the configured usage database, in the Database User Name and Database Password text boxes. Then, confirm the password by retyping it in the Confirm Password text box.

Slow Transaction Cutoff (ms)

This test reports the count of slow page views and also pinpoints the pages that are slow. To determine whether/not a page is slow, this test uses the Slow Transaction Cutoff parameter. By default, this parameter is set to 4000 millisecs (i.e., 4 seconds). This means that, if a page takes more than 4 seconds to load, this test will consider that page as a slow page by default. You can increase or decrease this slow transaction cutoff according to what is ‘slow’ and what is ‘normal’ in your environment.

Note:

The default value of this parameter is the same as the default Maximum threshold setting of the Avg page load time measure, i.e., both are set to 4000 millisecs by default. While the former helps eG to distinguish between slow and healthy page views for the purpose of providing detailed diagnosis, the latter tells eG when to generate an alarm on Avg page load time. For best results, it is recommended that both these settings be configured with the same value at all times. Therefore, if you change the value of one of these configurations, make sure you update the value of the other as well. For instance, if the Slow Transaction Cutoff is changed to 6000 millisecs, change the Maximum threshold of the Avg page load time measure to 6000 millisecs as well.

Site

Configure a comma-separated list of web site URLs that you want this test to monitor. For example: http://www.msproject28rk2:11982,http://www.mydocs.com

URL patterns to be ignored from monitoring

By default, this test does not track requests to the following URL patterns: *.js,*.css,*.jpeg,*.jpg,*.png,*.asmx,*.ashx,*.svc,*.dlll. If required, you can remove one/more patterns from this default list, so that such patterns are monitored, or can append more patterns to this list in order to exclude them from monitoring. For instance, to additionally ignore URLs that end with .gif and .bmp when monitoring, you need to alter the default specification as follows: *.js,*.css,*.jpeg,*.jpg,*.png,*.asmx,*.ashx,*.svc,*.dlll,*.gif,*.bmp 
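The effect of such an ignore list can be sketched in a few lines of Python. This is only an illustration of wildcard matching against the pattern list shown above, not the test's actual implementation; the function names are hypothetical, and matching case-insensitively against the URL path alone (ignoring any query string) is an assumption.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlsplit

# Pattern list copied from the example above (defaults plus *.gif and *.bmp).
IGNORE_PATTERNS = "*.js,*.css,*.jpeg,*.jpg,*.png,*.asmx,*.ashx,*.svc,*.dlll,*.gif,*.bmp"


def parse_patterns(spec: str) -> list[str]:
    """Split a comma-separated pattern list, dropping blanks and surrounding spaces."""
    return [p.strip().lower() for p in spec.split(",") if p.strip()]


def is_ignored(url: str, patterns: list[str]) -> bool:
    """Return True if the URL path matches any ignore pattern.

    Assumption: only the path is compared, case-insensitively.
    """
    path = urlsplit(url).path.lower()
    return any(fnmatchcase(path, pattern) for pattern in patterns)


patterns = parse_patterns(IGNORE_PATTERNS)
print(is_ignored("http://www.mydocs.com/images/logo.gif", patterns))
print(is_ignored("http://www.mydocs.com/Pages/Home.aspx", patterns))
```

Removing a pattern from the list makes matching URLs eligible for monitoring again; appending one excludes them.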

Ignore Ajaxdelta Pages

By default, this test ignores all requests to AjaxDelta pages. This is why the Ignore Ajaxdelta Pages flag is set to Yes by default. If you want the test to track requests to the AjaxDelta pages as well, set this flag to No.

Max Acceptable Duration

By default, this parameter is set to 3 (seconds). This implies that this test, by default, will report metrics for only those web sites containing one/more web parts that process requests for longer than 3 seconds. You can increase or decrease the value of this parameter, depending upon what you think is ‘slow’ in your environment. This way, you can configure the test to focus on only those web sites that contain slow or critical web parts.

Fetch Farm Measures

Typically, farm-level metrics – eg., metrics on farm status, site collections, usage analytics – will not vary from one SharePoint server in the farm to another. If these metrics are collected and stored in the eG database for each monitored server in the SharePoint farm, it is bound to unnecessarily consume space in the database and increase processing overheads. To avoid this, farm-level metrics collection is by default switched off for the member servers in the SharePoint farm, and enabled only if the server being monitored is provisioned as the Central Administration site. Accordingly, this parameter is set to If Central Administration by default. This default setting ensures that farm-level metrics are collected from and stored in the database for only a single SharePoint server in the farm.  

If you want to completely switch-off farm-level metrics collection for a SharePoint farm, then set this parameter to No.

Some high-security environments may not allow an eG agent to be deployed on the Central Administration site. Administrators of such environments may however require farm-level insights into status and performance. To provide these insights for such environments, you can optionally enable farm-level metrics collection from any monitored member server in the farm, even if that server is not provisioned as the Central Administration site. For this, set this parameter to Yes when configuring this test for that member server.   
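The three settings of this parameter amount to a small decision rule, sketched below in Python. The function name and the boolean is_central_admin input are hypothetical, introduced only for illustration; how the eG agent actually determines whether a server hosts Central Administration is not described here.

```python
def collects_farm_metrics(fetch_farm_measures: str, is_central_admin: bool) -> bool:
    """Decide whether farm-level metrics are collected from a given server.

    fetch_farm_measures: one of "If Central Administration" (default), "Yes", "No".
    is_central_admin: True if the monitored server hosts Central Administration.
    """
    setting = fetch_farm_measures.strip().lower()
    if setting == "yes":
        return True  # collect from this server regardless of its role
    if setting == "no":
        return False  # farm-level collection switched off entirely
    if setting == "if central administration":
        return is_central_admin  # default: only the Central Administration server
    raise ValueError(f"Unknown Fetch Farm Measures setting: {fetch_farm_measures!r}")


# Default setting on an ordinary member server: nothing is collected.
print(collects_farm_metrics("If Central Administration", is_central_admin=False))
```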

Domain, Domain User, Password, and Confirm Password

When monitoring a SharePoint 2010 server, this test has to be configured with the credentials of a domain user with the following privileges:

To know how to add a user to one of these groups, refer to Adding a User to Local Groups on the eG Agent Host.

It is recommended that you create a special user for this purpose and assign the aforesaid privileges to that user. Once such a user is created, specify the domain to which that user belongs in the Domain text box, and then enter the credentials of the user in the Domain User and Password text boxes. To confirm the password, retype it in the Confirm Password text box.

DD Frequency

Refers to the frequency with which detailed diagnosis measures are to be generated for this test. The default is 1:1. This indicates that, by default, detailed measures will be generated every time this test runs, and also every time the test detects a problem. You can modify this frequency, if you so desire. Also, if you intend to disable the detailed diagnosis capability for this test, you can do so by specifying none against DD frequency.
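As a rough illustration, a DD Frequency value could be read as follows. This sketch assumes the value has the form normal:abnormal (the first number applying to normal test runs and the second to runs where a problem is detected), or the literal none; the function name is hypothetical and this is not the eG agent's own parser.

```python
from typing import Optional, Tuple


def parse_dd_frequency(value: str) -> Optional[Tuple[int, int]]:
    """Parse a DD Frequency setting.

    Returns (normal, abnormal) frequencies, or None when detailed
    diagnosis is disabled with the value "none".
    Assumption: the first number is the normal frequency, the second
    the abnormal (problem-time) frequency.
    """
    text = value.strip().lower()
    if text == "none":
        return None
    # Raises ValueError if the value is not two integers separated by ":".
    normal, abnormal = (int(part) for part in text.split(":"))
    return normal, abnormal


print(parse_dd_frequency("1:1"))
print(parse_dd_frequency("none"))
```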

Detailed Diagnosis

To make diagnosis more efficient and accurate, the eG Enterprise suite embeds an optional detailed diagnostic capability. With this capability, the eG agents can be configured to run detailed, more elaborate tests as and when specific problems are detected. To enable the detailed diagnosis capability of this test for a particular server, choose the On option. To disable the capability, click on the Off option.

The option to selectively enable/disable the detailed diagnosis capability will be available only if the following conditions are fulfilled:

  • The eG manager license should allow the detailed diagnosis capability
  • Both the normal and abnormal frequencies configured for the detailed diagnosis measures should not be 0.
Measurements made by the test
Measurement Description Measurement Unit Interpretation

Unique sessions

Indicates the number of unique sessions for this user on this web site. 

Number

Compare the value of this measure across users to identify the user who has the maximum number of open sessions on the site and is hence probably overloading the site.

The detailed diagnosis of this measure reveals the unique client IP addresses from which the user launched his/her sessions and the number of requests received from each IP address.

Unique destinations

Indicates the number of unique destinations for this site for this user.

Number

To know the most popular destination URLs for a user, use the detailed diagnosis of this measure. Here, you will find the top-10 destinations in terms of the number of hits.

Unique referrers

Indicates the number of unique URLs external to this site (the parent web application is treated as external as well) from where this user navigated to this site.

Number

To know which referrer URL was responsible for the maximum hits, use the detailed diagnosis of this measure. The top-10 unique referrer URLs in terms of the number of hits they generated will be displayed as part of the detailed diagnostics. 

Apdex score

Indicates the Apdex score of this user for this site.

Number

Apdex (Application Performance Index) is an open standard developed by an alliance of companies. It defines a standard method for reporting and comparing the performance of software applications in computing. Its purpose is to convert measurements into insights about user satisfaction, by specifying a uniform way to analyze and report on the degree to which measured performance meets user expectations.

The Apdex method converts many measurements into one number on a uniform scale of 0-to-1 (0 = no users satisfied, 1 = all users satisfied). The resulting Apdex score is a numerical measure of user satisfaction with the performance of enterprise applications. This metric can be used to report on any source of end-user performance measurements for which a performance objective has been defined.

The Apdex formula is:

Apdex = (Satisfied Count + Tolerating Count / 2) / Total Samples

This is nothing but the number of satisfied samples plus half of the tolerating samples plus none of the frustrated samples, divided by all the samples.

A score of 1.0 means all responses were satisfactory. A score of 0.0 means none of the responses were satisfactory. Tolerating responses half satisfy a user. For example, if all responses are tolerating, then the Apdex score would be 0.50.

Ideally therefore, the value of this measure should be 1.0. A value less than 1.0 indicates that this user’s experience with the corresponding web site has been less than satisfactory.  
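The Apdex formula above can be checked with a short Python sketch. The function name and the sample counts are illustrative only; the arithmetic follows the formula exactly as stated.

```python
def apdex(satisfied: int, tolerating: int, frustrated: int) -> float:
    """Apdex = (Satisfied Count + Tolerating Count / 2) / Total Samples."""
    total = satisfied + tolerating + frustrated
    if total == 0:
        raise ValueError("Apdex is undefined when there are no samples")
    # Frustrated samples count toward the total but add nothing to the numerator.
    return (satisfied + tolerating / 2) / total


print(apdex(0, 10, 0))    # all responses tolerating -> 0.5
print(apdex(60, 30, 10))  # (60 + 15) / 100 -> 0.75
```

The first call reproduces the case mentioned above: when every response is tolerating, the score is 0.50.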

Total page views

Indicates the number of times the pages of this web site were viewed by this user.

Number

This is a good measure of the traffic to a web site from a particular user.

A high number of page views from a single user typically indicates how frequently that user is accessing the web site. Sudden, but significant spikes in the page view count could be a cause for concern, as it could be owing to a malicious virus attack or an unscrupulous attempt to hack your web site/web application.

Satisfied page views

Indicates the number of times pages were viewed in this web site by this user without any slowness.

Number

A page view is considered to be slow when the average time taken to load that page exceeds the slow transaction cutoff configured for this test. If this slow transaction cutoff is not exceeded, then the page view is deemed to be ‘satisfactory’.

Ideally, the value of this measure should be high.

If the value of this measure is much lower than the value of the Tolerating page views and the Frustrated page views measures, it is a clear indicator that the experience of the user is below par. In such a case, use the detailed diagnosis of the Tolerating page views and Frustrated page views measures to know which pages are slow.

Tolerating page views

Indicates the number of tolerating page views for this user in this web site.

 

Number

If the Average page load time of a page exceeds the Slow Transaction Cutoff configuration of this test, but is less than 4 times the slow transaction cutoff (i.e., < 4 * slow transaction cutoff), then such a page view is considered to be a Tolerating page view.

Ideally, the value of this measure should be 0. A value higher than that of the Satisfied page views measure is a cause for concern, as it implies that the experience of the user has been less than satisfactory. To know which pages are contributing to this sub-par experience, use the detailed diagnosis of this measure.

Frustrated page views

Indicates the number of frustrated page views for this user in this web site.

Number

If the Average page load time of a page is over 4 times the Slow Transaction Cutoff configuration of this test (i.e., > 4 * slow transaction cutoff), then such a page view is considered to be a Frustrated page view.

Ideally, the value of this measure should be 0. A value higher than that of the Satisfied page views measure is a cause for concern, as it implies that the experience of the user has been less than satisfactory. To know which pages are contributing to this sub-par experience, use the detailed diagnosis of this measure.
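The way page views fall into the Satisfied, Tolerating, and Frustrated categories can be sketched in Python as below. The function name and the sample load times are made up for illustration. The descriptions above do not say how a load time of exactly 4 times the cutoff is treated; this sketch assumes it counts as tolerating.

```python
from collections import Counter


def classify_page_view(load_time_ms: float, cutoff_ms: float = 4000) -> str:
    """Classify one page view against the Slow Transaction Cutoff (default 4000 ms)."""
    if load_time_ms <= cutoff_ms:
        return "satisfied"
    if load_time_ms <= 4 * cutoff_ms:
        # Assumption: exactly 4 x cutoff is still treated as tolerating.
        return "tolerating"
    return "frustrated"


# Hypothetical page load times (ms) for one user on one site.
load_times_ms = [1200, 3900, 4500, 15000, 16001, 20000]
counts = Counter(classify_page_view(t) for t in load_times_ms)
print(counts["satisfied"], counts["tolerating"], counts["frustrated"])
```

With the default cutoff of 4000 ms, anything up to 4 seconds is satisfied, anything above 16 seconds is frustrated, and the rest is tolerating.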

Average page load time

Indicates the average time taken by the pages in this site that are requested by this user to load completely.

Msecs

This is the average interval between the time that a user initiates a request and the completion of the page load of the response in the user's browser.

If the value of this measure is consistently high for a user, it implies a degraded user experience. You may want to check the Apdex score in such circumstances to determine whether user experience has already been affected. Regardless, you should investigate the anomaly and quickly determine where the bottleneck lies (is it with the web front-end? Is it owing to slow service calls? Or is it because of inefficient queries to the backend?), so that the problem can be fixed before users even notice any slowness. For that, you may want to compare the values of the Average service calls duration, Average CPU duration, Average IIS latency, and Average query duration measures of this test.
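One simple way to perform this comparison is to look for the component with the largest average duration. The Python sketch below does just that; the function name and the sample values are hypothetical, and the largest value is only a starting hint for investigation, not a definitive diagnosis.

```python
def likely_bottleneck(durations_secs: dict[str, float]) -> str:
    """Return the name of the measure with the largest average duration."""
    if not durations_secs:
        raise ValueError("no measures supplied")
    return max(durations_secs, key=durations_secs.get)


# Hypothetical values (in seconds) reported for one user on one site.
sample = {
    "Average service calls duration": 0.4,
    "Average CPU duration": 0.3,
    "Average IIS latency": 0.2,
    "Average query duration": 2.1,
}
print(likely_bottleneck(sample))  # Average query duration
```

Here the query duration dominates, which would point towards inefficient queries to the backend database rather than the web front end or service calls.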

Average service calls duration

Indicates the time taken by the requests of this user to this web site to generate service calls.

Secs

If the Avg page load time of a user is abnormally high, then you can compare the value of this measure with that of the Average CPU duration, Average IIS latency, and Average query duration measures of this test to know what exactly is delaying page loading – a slow front-end web server? inefficient queries to the backend database? or slow service calls?

Average IIS latency

Indicates the average time requests from this user took in the frontend web server after the requests were received by the frontend web server but before the browser began processing the requests.

Secs

If the Avg page load time of a user is abnormally high, then you can compare the value of this measure with that of the Average service calls duration, Average CPU duration, and Average query duration measures of this test to know what exactly is delaying page loading – a slow browser? A slow front-end web server? inefficient queries to the backend database? or slow service calls?

Average CPU duration

Indicates the average time for which requests from this user to this site used the CPU.

Secs

If the Avg page load time of a user is abnormally high, then you can compare the value of this measure with that of the Average service calls duration, Average IIS latency, and Average query duration measures of this test to know what exactly is delaying page loading – a slow browser? a slow front-end web server? inefficient queries to the backend database? or slow service calls?

SQL logical reads

Indicates the total number of 8 kilobyte blocks that requests from this user read from storage on the back-end database server.

Number

 

Average CPU megacycles

Indicates the average number of CPU megacycles spent processing the requests from this user in the client application on the front-end web server.

Number

 

Total queries

Indicates the total number of database queries generated by requests from this user.

Number

 

Average query duration

Indicates the average time taken for all backend database queries generated by requests from this user to this web site.

Secs

If the Avg page load time of a user is abnormally high, then you can compare the value of this measure with that of the Average service calls duration, Average IIS latency, and Average CPU duration measures of this test to know what exactly is delaying page loading – a slow browser? a slow front-end web server? inefficient queries to the backend database? or slow service calls?

Average data consumed

Indicates the average amount of data downloaded by the requests of this user.

KB

 

GET requests

Indicates the number of GET requests from this user to this site.

Number

 

POST requests

Indicates the number of POST requests from this user to this site.

Number

 

OPTION requests

Indicates the number of OPTION requests from this user to this site.

Number

 

300 responses

Indicates the number of responses for requests from this user that had a status code in the 300-399 range.

Number

300 responses could indicate page caching on the client browsers. Alternatively, 300 responses could also indicate redirection of requests. A sudden change in this value could indicate a problem condition.

400 errors

Indicates the number of responses for requests from this user that had a status code in the range 400-499.

Number

A high value indicates a large number of missing/error pages.

Use the detailed diagnosis of this measure to know when each of the 400 errors occurred, which user experienced the error, when using what browser, from which machine. This information will greatly aid troubleshooting.

500 errors

Indicates the number of responses for requests from this user that had a status code in the range 500-599.

Number

Since responses with a status code in the range 500-599 indicate server-side processing errors, a high value reflects an error condition.

Use the detailed diagnosis of this measure to know when each of the 500 errors occurred, which user experienced the error, when using what browser, from which machine. This information will greatly aid troubleshooting.
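The grouping of HTTP status codes into the 300 responses, 400 errors, and 500 errors measures can be sketched in Python as follows. The function name and the sample status codes are illustrative only; the ranges are the ones given in the measure descriptions above.

```python
from collections import Counter
from typing import Optional


def status_bucket(status_code: int) -> Optional[str]:
    """Map an HTTP status code to the measure that counts it, if any."""
    if 300 <= status_code <= 399:
        return "300 responses"
    if 400 <= status_code <= 499:
        return "400 errors"
    if 500 <= status_code <= 599:
        return "500 errors"
    return None  # e.g. 2xx responses are not counted by these measures


# Hypothetical status codes returned to one user's requests.
codes = [200, 304, 302, 404, 403, 500, 503, 200]
counts = Counter(b for b in map(status_bucket, codes) if b is not None)
print(dict(counts))
```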