Monitoring Cisco SD-WAN

eG Enterprise offers a special-purpose monitoring model for the Cisco SD-WAN appliance. The model monitors the components of the Cisco SD-WAN, the resource utilization of each component, and the status of the tunnels and the traffic flowing through them. Abnormal traffic flows can be detected quickly using this monitoring model. This way, administrators are proactively alerted to issues so that they can initiate remedial actions well before users complain.

By periodically running REST API commands, eG agents collect various metrics of interest from the target Cisco SD-WAN appliance. Figure 1 depicts the layer model of a Cisco SD-WAN.

Figure 1 : Layer model for Cisco SD-WAN
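The periodic collection described above can be pictured as a simple polling loop. The following is only a minimal sketch under stated assumptions, not the eG agent's actual implementation: the names `poll` and `fake_fetch`, the metric keys, and the 300-second interval are all hypothetical, and the stand-in fetcher returns made-up values instead of calling a real REST endpoint.

```python
import time
from typing import Callable, Dict, List

# Hypothetical type: a fetcher takes no arguments and returns one set of metrics.
Fetcher = Callable[[], Dict[str, float]]


def poll(fetch: Fetcher, interval_seconds: float, cycles: int,
         sleep: Callable[[float], None] = time.sleep) -> List[Dict[str, float]]:
    """Call fetch() once per cycle, pausing interval_seconds between cycles."""
    samples: List[Dict[str, float]] = []
    for cycle in range(cycles):
        samples.append(fetch())
        # No pause is needed after the final cycle.
        if cycle < cycles - 1:
            sleep(interval_seconds)
    return samples


def fake_fetch() -> Dict[str, float]:
    # Assumption: stand-in for a real REST call to the appliance.
    # The keys and values below are invented for illustration only.
    return {"cpu_user_percent": 12.5, "cpu_idle_percent": 80.0}


if __name__ == "__main__":
    # Record the requested pauses instead of actually sleeping.
    pauses: List[float] = []
    collected = poll(fake_fetch, interval_seconds=300, cycles=3, sleep=pauses.append)
    print(len(collected), pauses)
```

Passing the `sleep` function as a parameter keeps the loop easy to exercise without waiting for real intervals; a real collector would replace `fake_fetch` with an authenticated REST request.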

Each layer in Figure 1 is mapped to various tests that report metrics on the performance of the target Cisco SD-WAN appliance. Using the metrics reported by these tests, administrators can find accurate answers to the following questions:

  • What percentage of CPU is consumed by the user processes of each component?

  • What percentage of CPU is consumed by the system processes of each component?

  • For what percentage of time was the CPU of each component idle?

  • What is the status of the disk allocated to each component?

  • What percentage of disk space is utilized by each component?

  • What is the amount of memory resources used by each component?

  • What percentage of memory is already utilized by each component?

  • What is the admin and operational status of each network interface?

  • What is the uptime of each network interface?

  • How well is traffic handled by each network interface?

  • How many errors were encountered during data/packet transmission/reception?

  • What is the current state of each control connection?

  • What is the uptime of each control connection?

  • How well is traffic handled by each tunnel? Which tunnel is handling the maximum amount of data/packet transmission/reception?

  • Are the components of the target Cisco SD-WAN appliance reachable?

  • What is the current status and uptime of each component?

  • How many times did each component crash?

  • How many reboots were noticed for each component?