Monitoring the BlackBerry Enterprise Server v5 (or its variants)

BlackBerry Enterprise Server v5.0 differs from its predecessors in its built-in high availability architecture, which enables fast recovery from unplanned downtime of core BlackBerry Enterprise Server components. The high availability option is designed to automatically use the standby core components if the need arises, and it can be deployed on physical and virtual servers (specifically VMware).

With the component-level architecture, the BlackBerry Enterprise Server continually monitors health metrics. BlackBerry administrators can set failover thresholds which, when exceeded, trigger the BlackBerry Enterprise Server to switch over automatically to the standby server. For example, if the primary server loses its connection to the mail server, automatic failover to the standby server occurs, avoiding the delay of a manual switchover. The administrator then acknowledges that an automatic failover has occurred, fixes the problem on the originating server, and manually switches the systems back. Because the switch back is manual, failover loops are avoided.
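The threshold-driven failover described above can be illustrated with a minimal sketch. It is not the BlackBerry Enterprise Server's actual implementation or API: the class name FailoverController, the metric name mail_server_connection_failures, and the threshold values are all hypothetical, chosen only to show why a manual switch back prevents failover loops.

```python
from dataclasses import dataclass


@dataclass
class FailoverController:
    """Hypothetical model of threshold-based failover with manual failback."""

    # Metric name -> maximum tolerated value (illustrative names and values).
    thresholds: dict
    active: str = "primary"
    awaiting_failback: bool = False

    def evaluate(self, health: dict) -> str:
        """Fail over to the standby if any monitored metric exceeds its threshold."""
        if self.active == "primary" and not self.awaiting_failback:
            breached = [
                name
                for name, maximum in self.thresholds.items()
                if health.get(name, 0) > maximum
            ]
            if breached:
                self.active = "standby"
                # No automatic return to the primary: this is what
                # prevents the two servers from flapping back and forth.
                self.awaiting_failback = True
        return self.active

    def manual_failback(self) -> None:
        """Administrator action taken after fixing the originating server."""
        self.active = "primary"
        self.awaiting_failback = False


controller = FailoverController(thresholds={"mail_server_connection_failures": 0})
controller.evaluate({"mail_server_connection_failures": 1})  # fails over to standby
controller.evaluate({"mail_server_connection_failures": 0})  # stays on standby
controller.manual_failback()                                 # administrator restores primary
```

In this sketch the controller stays on the standby server even after the metric recovers, mirroring the requirement that the administrator fix the problem and switch the systems back by hand.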

Figure 1 : The high availability architecture of the BES v5

To monitor BES v5, eG Enterprise provides a dedicated BlackBerry 5x model.

Figure 2 : The layer model of the BlackBerry Enterprise Server v5

Each layer of Figure 2 is mapped to a wide variety of tests which report a wealth of performance metrics related to the BlackBerry Enterprise Server v5. Using the metrics reported by these tests, administrators can find quick and accurate answers to the following performance queries:

  • How many users have MDS enabled on their devices?
  • Are data transmissions on MDS connections heavy?
  • Has the MDS Connection Service refused any data packets?
  • Were invalid packets received by the MDS Connection Service?
  • Have too many SRP connections to the BlackBerry Infrastructure failed?
  • Is any user connected to the MDS for an unreasonably long time? Which user is this?
  • Is too much data transmitted from the MDS to any user's handheld device? Which user is this?
  • Is the router configured with adequate device connections?
  • How is the load on the router?
  • Are there too many undelivered messages on the BlackBerry Enterprise Server?
  • Have too many messages expired?
  • Are there too many pending requests to the BlackBerry Policy Service?
  • Did any request to the Policy Service fail?
  • Were any hung threads detected on the BlackBerry Enterprise Server?
  • Is the BlackBerry Dispatcher connected to the handheld device? If so, how quickly was the connection established?
  • Does the BES have adequate licenses?
  • What is the user load on the BlackBerry Messaging Agent? Which users are currently connected to the agent?
  • Did any connection attempt to the messaging agent fail in the last 10 minutes?
  • Are devices taking too long to connect to the messaging agent?
  • Is the messaging agent able to process messages quickly or are too many messages pending on the agent?
  • Has any user failed to initialize with BES?

The sections that follow discuss the top 6 layers of Figure 2. For the remaining layers, refer to Monitoring Unix and Windows Servers.