Distributed Cache Usage Analytics Test

The Distributed Cache service, which is built on Windows Server AppFabric Cache, is set to run in a collocated mode on all SharePoint 2013 Servers by default. It's essential for maintaining the large amounts of information on your SharePoint Server, ensuring that the information is fresh and readily available for the end user.

Caching functionalities, provided by the Distributed Cache service, enable web applications deployed on SharePoint to quickly retrieve data without any dependency on databases stored in SQL Server, as everything is stored in memory.

Any SharePoint server in the farm running the Distributed Cache service is known as a cache host. Cache size is the memory allocated to the Distributed Cache service on the cache host.

At any given point in time, sufficient memory resources should be available to the Distributed Cache service to ensure optimum cache usage and to assure SharePoint users of a satisfactory experience with their web applications. In the absence of adequate memory, cache lookups will be delayed or even missed, thus affecting overall SharePoint performance and degrading user interactions with the web applications.

It is hence imperative that administrators keep an eye on the usage of the cache service by each dependent web application, rapidly detect unexpected slowness in cache reads and writes, capture cache misses, and figure out whether such anomalies are owing to improper sizing of the Distributed Cache service. This is what the Distributed Cache Usage Analytics test helps administrators do.

This test queries the SharePoint Logging database and retrieves from it metrics that show how each web application uses the distributed cache. The metrics so collected answer the following questions:

Is any web application reading from and/or writing to the cache slowly? If so, which host is slow?
Is any web application overloading the cache with read/write requests?
Which web application is experiencing many cache misses?
Are there any cache failures? If so, which web application failed to read from or write to the cache?

This way, the test brings cache usage and sizing irregularities to light, pinpoints the exact web application that is being impacted by these abnormalities, and thus prompts administrators to right-size the cache to ensure peak application performance.

Note:

This test will run only if a SharePoint Usage and Health Service application is created and is configured to collect usage and health data. To know how to create and configure this application, follow the steps detailed in Configuring the eG Agent to Collect Usage Analytics.
This test is not applicable when the target server is a Microsoft SharePoint 2010 server.

Target of the test : A Microsoft SharePoint Server

Agent deploying the test : An internal/remote agent

Outputs of the test : One set of results for each web application in the monitored SharePoint server

Configurable parameters for the test
Parameters	Description
Test period	This indicates how often the test should be executed.
Host	The host for which the test is to be configured.
Port	The port at which the host server listens.
SQL Port Number	Specify the port number of the SQL server that hosts the SharePoint Logging database.
Instance	If the SQL server that hosts the SharePoint Logging database is instance-based, then provide the instance name here. If not, then set this to none.
SSL	If the SQL server hosting the SharePoint Logging database is SSL-enabled, then set this flag to Yes. If not, set it to No.
Isntlmv2	In some Windows networks, NTLM (NT LAN Manager) may be enabled. NTLM is a suite of Microsoft security protocols that provides authentication, integrity, and confidentiality to users. NTLM version 2 (“NTLMv2”) was introduced to address the security issues present in NTLM. By default, the Isntlmv2 flag is set to No, indicating that NTLMv2 is not enabled by default on the SQL server that hosts the SharePoint Logging database. Set this flag to Yes if NTLMv2 is enabled on that SQL server.
Database Domain	Specify the fully qualified name of the domain in which the Microsoft SQL server hosting the SharePoint Logging database operates. For instance, your specification can be: SharePoint.eginnovations.com
Database server Name	Specify the name of the Microsoft SQL server that hosts the SharePoint Logging database to be accessed by this test.
Database Name	Specify the name of the SharePoint Logging database that this test should access.
Database User Name, Database Password, Confirm Password	In the Database User Name and Database Password text boxes, specify the credentials of a user who has db_datareader access to the configured SharePoint Logging database. Then, confirm the password by retyping it in the Confirm Password text box.
URL patterns to be ignored from monitoring	By default, this test does not track requests to the following URL patterns: .js,.css,.jpeg,.jpg,.png,.asmx,.ashx,.svc,.dlll. If required, you can remove one/more patterns from this default list, so that such patterns are monitored, or can append more patterns to this list in order to exclude them from monitoring. For instance, to additionally ignore URLs that end with .gif and .bmp when monitoring, you need to alter the default specification as follows: .js,.css,.jpeg,.jpg,.png,.asmx,.ashx,.svc,.dlll,.gif,.bmp
Ignore AjaxDelta Pages	By default, this test ignores all requests to AjaxDelta pages. This is why the Ignore AjaxDelta Pages flag is set to Yes by default. If you want the test to track requests to the AjaxDelta pages as well, set this flag to No.
Fetch Farm Measures	Typically, farm-level metrics (e.g., metrics on farm status, site collections, usage analytics) will not vary from one SharePoint server in the farm to another. If these metrics are collected and stored in the eG database for each monitored server in the SharePoint farm, they will unnecessarily consume space in the database and increase processing overheads. To avoid this, farm-level metrics collection is by default switched off for the member servers in the SharePoint farm, and enabled only if the server being monitored is provisioned as the Central Administration site. Accordingly, this parameter is set to If Central Administration by default. This default setting ensures that farm-level metrics are collected from and stored in the database for only a single SharePoint server in the farm. If you want to completely switch off farm-level metrics collection for a SharePoint farm, then set this parameter to No. Some high-security environments may not allow an eG agent to be deployed on the Central Administration site. Administrators of such environments may however require farm-level insights into status and performance. To provide these insights for such environments, you can optionally enable farm-level metrics collection from any monitored member server in the farm, even if that server is not provisioned as the Central Administration site. For this, set this parameter to Yes when configuring this test for that member server.
Domain, Domain User, Password, and Confirm Password	If the Fetch Farm Measures flag of this test is set to No or to If Central Administration, then this test should be configured with the credentials of a user with the following privileges: (1) The user should be part of the SharePoint Farm Administrators group. To know how to add a user to this group, refer to Adding a User to a Farm Administrators Group. (2) The user should have shell admin access to all databases in SharePoint. To know how to grant such access to a user, refer to Granting a User Shell Admin Access to All SharePoint Databases. (3) The user should have full control access to each web application that needs to be monitored on the SharePoint server. To know how to grant such access to a user, refer to Granting a User Full Control Access to Web Applications on Microsoft SharePoint. (4) The user should have read and execute access to the eG agent install directory. To know how to grant this access to a user, refer to Granting a User Read and Execute Permissions to the eG Agent Install Directory. On the other hand, if the Fetch Farm Measures flag of this test is set to Yes, then the user configured for the test not only requires the four privileges discussed above, but should also be part of the following groups on the eG agent host: Administrators, WSS_ADMIN_WPG, IIS_IUSRS, Performance Monitor Users, WSS_WPG, and Users. It is recommended that you create a special user for this purpose and assign the aforesaid privileges to that user. Once such a user is created, specify the domain to which that user belongs in the Domain text box, and then enter the credentials of the user in the Domain User and Password text boxes. To confirm the password, retype it in the Confirm Password text box.
DD Frequency	Refers to the frequency with which detailed diagnosis measures are to be generated for this test. The default is 1:1. This indicates that, by default, detailed measures will be generated every time this test runs, and also every time the test detects a problem. You can modify this frequency, if you so desire. Also, if you intend to disable the detailed diagnosis capability for this test, you can do so by specifying none against DD frequency.
Detailed Diagnosis	To make diagnosis more efficient and accurate, eG Enterprise embeds an optional detailed diagnostic capability. With this capability, the eG agents can be configured to run detailed, more elaborate tests as and when specific problems are detected. To enable the detailed diagnosis capability of this test for a particular server, choose the On option. To disable the capability, click on the Off option. The option to selectively enable/disable the detailed diagnosis capability will be available only if the following conditions are fulfilled: (1) The eG manager license should allow the detailed diagnosis capability. (2) Both the normal and abnormal frequencies configured for the detailed diagnosis measures should not be 0.
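
The combined effect of the URL patterns to be ignored from monitoring and Ignore AjaxDelta Pages parameters can be pictured with a minimal Python sketch. It is illustrative only and is not the eG agent's implementation. It assumes that a pattern matches when the URL path ends with it, and that an AjaxDelta request is one that carries an AjaxDelta query parameter; the names parse_patterns and should_ignore are hypothetical.

```python
from urllib.parse import urlsplit, parse_qs

# Default list as documented for the test parameter.
DEFAULT_IGNORED = ".js,.css,.jpeg,.jpg,.png,.asmx,.ashx,.svc,.dlll"


def parse_patterns(spec):
    """Split a comma-separated pattern specification into a lowercase tuple."""
    return tuple(p.strip().lower() for p in spec.split(",") if p.strip())


def should_ignore(url, patterns, ignore_ajax_delta=True):
    """Return True if the request URL would be excluded from monitoring.

    Assumptions (not confirmed by the product documentation): a pattern
    matches when the URL path ends with it, and an AjaxDelta request is
    one that carries an 'AjaxDelta' query parameter.
    """
    parts = urlsplit(url)
    # str.endswith accepts a tuple and checks every suffix in it.
    if parts.path.lower().endswith(patterns):
        return True
    if ignore_ajax_delta:
        query_keys = {key.lower() for key in parse_qs(parts.query)}
        if "ajaxdelta" in query_keys:
            return True
    return False
```

For instance, appending .gif and .bmp to the default specification, as described above, corresponds to calling parse_patterns(DEFAULT_IGNORED + ",.gif,.bmp") and passing the result to should_ignore.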

Measurements made by the test
Measurement	Description	Measurement Unit	Interpretation
Reads	Indicates the number of cache reads performed by this web application.	Number
Average read duration	Indicates the average time taken by this web application to read from the cache.	Msecs	A low value is desired for this measure. A consistent increase in the value of this measure could indicate a reading bottleneck. One of the reasons for reading delays is insufficient memory for the cache on the host. You may want to right-size the cache to make sure that your requests are serviced quickly and efficiently. Reading delays can also occur if the cache is overloaded with read requests or too much data is to be read. Whenever this measure registers an abnormally high value for a web application, look up the value reported by the Reads and Average read size measures for the same web application to determine whether/not the slowness can be attributed to the count and size of the reads.
Average read size	Indicates how many kilobytes of data, on an average, are read from the cache by this web application.	KB
Writes	Indicates the number of writes to the cache by this web application.	Number
Average write duration	Indicates the average time taken by this web application to write to the cache.	Msecs	A low value is desired for this measure. A consistent increase in the value of this measure could indicate a writing bottleneck. One of the reasons for writing delays is insufficient memory for the cache. You may want to right-size the cache to make sure that your requests are serviced quickly and efficiently. Writing delays can also occur if the web application is overloading the cache with write requests or too much data is to be written. Whenever this measure registers an abnormally high value for a web application, look up the values reported by the Writes and Average writes size measures for the same application to determine whether the slowness can be attributed to an unusually high number and size of writes.
Average writes size	Indicates how many kilobytes of data, on an average, are written to the cache by this web application.	KB
Misses	Indicates the number of requests from this web application that were not serviced by the cache.	Number	Ideally, the value of this measure should be 0. If on the other hand, this measure value is close to the value of the Objects requested measure, it is a cause for serious concern, as it implies that almost all objects requested were not found in the cache. Under such circumstances, use the detailed diagnosis of this measure to know which web site addresses could not be found in the cache. One of the reasons for a high number of misses could be insufficient memory allocation to the cache service. In such a situation, you may want to increase the cache size by adding more memory.
Hits	Indicates the number of requests from this web application that were successfully served by this cache.	Number	Ideally, the value of this measure should be the same as the value of the Objects requested measure. If not, check whether the cache has enough memory, and if required, add more memory to it.
Failures	Indicates the number of cache failures experienced by this web application.	Number	Ideally, the value of this measure should be 0. If this measure reports a non-zero value, then use the detailed diagnosis of this measure to know which web site addresses were being looked up in the cache when the failures occurred.
Objects requested	Indicates the number of objects requested by this web application.	Number
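
The interpretation guidance for the Misses, Hits, and Failures measures can be turned into a quick check. The Python sketch below is illustrative and not part of eG Enterprise: the CacheSample type, its field names, the miss_ratio and assess helpers, and the 0.9 cut-off used for "close to the value of the Objects requested measure" are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class CacheSample:
    """One set of results for a web application (field names are hypothetical)."""
    web_app: str
    objects_requested: int
    hits: int
    misses: int
    failures: int


def miss_ratio(sample):
    """Fraction of requested objects not found in the cache (0.0 if none requested)."""
    if sample.objects_requested == 0:
        return 0.0
    return sample.misses / sample.objects_requested


def assess(sample, near_total=0.9):
    """Return a list of findings that follow the Interpretation column.

    near_total is an assumed cut-off for 'Misses close to Objects requested'.
    """
    findings = []
    if sample.failures > 0:
        findings.append("failures: check detailed diagnosis of Failures")
    if miss_ratio(sample) >= near_total:
        findings.append("misses near total: check cache size")
    elif sample.misses > 0:
        findings.append("some misses: compare Hits with Objects requested")
    return findings
```

A web application with 200 objects requested and 190 misses would be flagged as "misses near total", which is the situation in which the table suggests reviewing the memory allocated to the cache.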

Use the detailed diagnosis of the Misses measure to know which web site addresses could not be found in the cache.

Figure 1 : The detailed diagnosis of the Misses measure

Use the detailed diagnosis of the Failures measure to know which web site addresses were being looked up in the cache when the failures occurred.

Figure 2 : The detailed diagnosis of the Failures measure