Office 365 Performance Monitoring and Troubleshooting: What Microsoft Can’t, But eG Enterprise Can

Microsoft Office 365 is the Most Popular Cloud Service Today

A recent analysis by Skyhigh Networks covering 27 million employees rated Microsoft Office 365 as the most popular enterprise cloud service by user count. Office 365 offers a wide range of Microsoft products from the cloud on a subscription basis, and one of its most popular offerings is SharePoint Online. A recent survey by Hyperfish, Sharegate, Nintex and LiveTiles estimates that adoption of SharePoint Online grew to 50% in the last year.

At eG Innovations, we have been a Microsoft Office 365 user for several years. Our employees rely on Office 365 for email (Exchange Online), collaboration (SharePoint Online and Teams), cloud storage (OneDrive for Business) and many other services. Many of our day-to-day business operations require Office 365 services to be available and performing well.

Microsoft Office 365 Problem Detection: A Real-World Example

A couple of days ago, our users complained of slowness when accessing SharePoint Online. To let administrators check the availability of Office 365 services, Microsoft offers a portal (https://status.office365.com/) that lists any known issues detected by Microsoft’s systems. While our users were complaining, the portal’s dashboard showed that all services were up and running.

Figure: Microsoft 365 Admin Center Console. The system status it provides may not always be current.

At the same time, our own eG Enterprise performance monitoring solution was alerting us to slowness with Microsoft SharePoint Online. As you can see from the figure below, eG Enterprise had detected that many file operations on SharePoint Online were slow or not working. File deletion was several times slower than normal, and file checkout did not succeed at all.

Figure: eG Enterprise detecting the SharePoint Online file upload failure.

Since no known Office 365 issues were reported on the Office 365 admin center portal, our IT team started to look for potential issues at our end. Network delays, bandwidth congestion, packet drops, client issues, etc. were all investigated, and no problem was found. All other applications in our network were working fine, and the problem seemed to be isolated to the SharePoint Online service, which was extremely slow and affecting our users.

Searching on the Internet, we found that there was indeed a problem with SharePoint Online services and it wasn’t just eG Innovations that was affected. Many others across the globe were facing the same issue and complaining about it.

Figure: Users complaining that the SharePoint issue was not displayed on the Microsoft service health dashboard.

The very next day, there was an additional issue with SharePoint Online that lasted for a few hours: users were unable to log in to SharePoint Online. eG Enterprise proactively detected and alerted us to this problem as well.

Figure: eG Enterprise alerting administrators to Office 365 performance issues.

Lessons Learned: What is Needed for SaaS/Cloud Monitoring

Our real-world experience above highlights the challenges that enterprises face when accessing SaaS and cloud services.

  • SaaS/cloud service providers cannot be relied upon to be proactive. Their service portals may not reflect every outage, and their responses are often reactive. In our case, Microsoft’s own admin console showed no issues while our users were already affected.
  • Having independent monitoring capabilities in place allows you to monitor the availability and performance of your business-critical services and to initiate remedial action before users complain.
  • Performance monitoring that provides end-to-end visibility allows you to differentiate between problems that are in your network/domain vs. ones that are in the SaaS/cloud provider’s data center.
  • Performance monitoring tools cannot rely only on the APIs and metrics exposed by SaaS/cloud service providers. In the above example, Microsoft’s Office 365 APIs and metrics did not reveal the problem. It was the synthetic monitoring implemented in eG Enterprise, which periodically measures the performance of SharePoint Online, that detected it.
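
To make the idea of synthetic monitoring concrete, here is a minimal sketch in Python. It is not eG Enterprise's implementation: the names `run_probe` and `ProbeResult`, the `slow_factor` threshold of 3, and the baseline values are all assumptions chosen for illustration. The sketch times one scripted operation (for example, a file upload or checkout performed by your own client code) and classifies it as ok, slow, or failed relative to its normal duration.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class ProbeResult:
    name: str
    status: str      # "ok", "slow", or "failed"
    seconds: float   # elapsed time as measured by the clock


def run_probe(name: str,
              operation: Callable[[], None],
              baseline_seconds: float,
              slow_factor: float = 3.0,
              clock: Callable[[], float] = time.perf_counter) -> ProbeResult:
    """Time one synthetic operation and compare it with its normal duration.

    `operation` is a hypothetical callable that performs the transaction
    (upload, delete, checkout, ...) and raises an exception on failure.
    """
    start = clock()
    try:
        operation()
    except Exception:
        # Any exception counts as a failed transaction.
        return ProbeResult(name, "failed", clock() - start)
    elapsed = clock() - start
    # Flag the operation as slow when it exceeds the baseline by slow_factor.
    status = "slow" if elapsed > slow_factor * baseline_seconds else "ok"
    return ProbeResult(name, status, elapsed)
```

Run on a schedule from where your users sit, a probe like this reports what users would experience regardless of what the provider's status page says. The baseline would normally come from the probe's own history rather than a fixed number.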

What eG Enterprise Offers for Microsoft Office 365 Monitoring

eG Enterprise provides a complete solution for monitoring Microsoft Office 365:

  • Using APIs exposed by Microsoft, eG Enterprise collects health, availability, performance and usage metrics about Office 365 services. Exchange Online, SharePoint Online, Skype for Business, Teams and OneDrive can be monitored this way. Overall tenant health and license usage are also tracked.
  • User experience monitoring is critical for detecting issues in advance. eG Enterprise includes:
    • A wide range of synthetic monitoring capabilities. A custom-built logon simulator for Exchange Online checks periodically for user logon and access issues with Exchange Online. File upload/download operations are performed to ensure that SharePoint Online services are working well. A full session simulation capability is also available to simulate entire user sessions and periodically check for availability and response time degradations.
    • Real user monitoring (RUM) is supported for SharePoint Online. Using JavaScript injection, RUM monitors web page response time and any errors, in order to identify slow or error-prone transactions.
  • Network monitoring capabilities track the performance of an enterprise’s local network to detect issues with packet drops, excessive bandwidth utilization, etc.
Figure: eG Enterprise Office 365 performance monitoring dashboard.
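
The signal sources described above (provider-reported status, synthetic checks, and local network monitoring) can be combined to judge where a problem most likely lies. The following Python sketch shows one simple way to do that. It is an illustration only, not eG Enterprise's actual logic: the `Signals` fields and the `likely_fault_domain` function are hypothetical names, and a real tool would weigh many more inputs.

```python
from dataclasses import dataclass


@dataclass
class Signals:
    provider_reports_issue: bool   # what the provider's status portal or API says
    synthetic_probe_ok: bool       # result of an independent synthetic check
    local_network_ok: bool         # packet drops / bandwidth checks on your side
    other_apps_ok: bool            # whether unrelated applications are healthy


def likely_fault_domain(s: Signals) -> str:
    """Return a rough guess at where the problem lies."""
    if s.synthetic_probe_ok:
        # The independent check succeeded, so no user-facing impact was seen.
        return "none detected"
    if not s.local_network_ok or not s.other_apps_ok:
        # Local evidence of trouble: start on your own side.
        return "local network or client"
    if s.provider_reports_issue:
        return "provider (acknowledged)"
    # Probes fail, the local side is clean, and the provider shows all green.
    return "provider (not yet acknowledged)"
```

The SharePoint Online incident described earlier corresponds to `Signals(False, False, True, True)`: the status portal showed no issue, the synthetic checks failed, and the local network and other applications were fine, which points to a provider-side problem that had not yet been acknowledged.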