Azure Backup Jobs Status Test

The Azure Backup service triggers jobs that run in the background in various scenarios, such as taking a backup, performing a restore operation, or disabling backup.

If backup jobs fail, no copies of your critical data will be available. Without backups, data cannot be recovered when disaster strikes, and data loss becomes inevitable. This is why administrators should track the progress of these backup jobs, quickly detect job failures, and take appropriate action. Likewise, administrators should be able to rapidly identify jobs that have been running for an abnormally long time, so that the reasons for the delay can be quickly ascertained. The Azure Backup Jobs Status test helps with all of the above.

This test tracks the status of the backup jobs that have been triggered for an Azure Subscription and reports the count of jobs in different states. In the process, the test alerts administrators to failed jobs and jobs that have been running for a duration beyond a configured time limit. Detailed diagnostics throw light on the exact jobs that failed or are long-running, thus enabling administrators to easily troubleshoot the failure/abnormal run time (as the case may be).

Target of the Test: A Microsoft Azure Subscription

Agent deploying the test: A remote agent

Output of the test: One set of results for the Azure Subscription being monitored

Configurable parameters for the test
Parameter | Description

Test Period

How often the test should be executed.

Host

The host for which the test is to be configured.

Subscription ID

Specify the GUID which uniquely identifies the Microsoft Azure Subscription to be monitored. To know the ID that maps to the target subscription, do the following:

  1. Log in to the Microsoft Azure Portal.

  2. When the portal opens, click on the Subscriptions option (as indicated by Figure 1).

    Figure 1 : Clicking on the Subscriptions option

  3. Figure 2, which appears next, lists all the subscriptions that have been configured for the target Azure AD tenant. Locate the subscription being monitored in the list, and note the value displayed for it in the Subscription ID column.

    Figure 2 : Determining the Subscription ID

  4. Copy the Subscription ID shown in Figure 2 into the text box corresponding to the Subscription ID parameter in the test configuration page.
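
Because the Subscription ID is a GUID, a quick format check can catch copy-and-paste mistakes before the value is entered. The following Python sketch is only an illustration and is not part of the eG agent; the function name and the sample values are hypothetical.

```python
import uuid


def is_valid_subscription_id(value: str) -> bool:
    """Return True if value is a GUID in the hyphenated 8-4-4-4-12 form."""
    candidate = value.strip()
    try:
        parsed = uuid.UUID(candidate)
    except ValueError:
        return False
    # uuid.UUID also accepts braces and un-hyphenated hex strings,
    # so compare against the canonical rendering to reject those.
    return str(parsed) == candidate.lower()


# Hypothetical sample values, for illustration only.
print(is_valid_subscription_id("12345678-1234-1234-1234-123456789abc"))
print(is_valid_subscription_id("not-a-guid"))
```

A value that fails this check was probably truncated or copied from the wrong column.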

Tenant ID

Specify the Directory ID of the Azure AD tenant to which the target subscription belongs. To know how to determine the Directory ID, refer to Configuring the eG Agent to Monitor the Microsoft Azure App Service.

Client ID and Client Password

The eG agent communicates with the target Microsoft Azure Subscription using Java API calls. To collect the required metrics, the eG agent requires an access token, which it obtains using an Application ID and a client secret value. To know how to determine the Application ID and the client secret value, refer to Configuring the eG Agent to Monitor the Microsoft Azure App Service. Specify the Application ID of the created Application in the Client ID text box and the client secret value in the Client Password text box.
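
The internals of the eG agent are not described here. Purely to illustrate how the Tenant ID, Client ID, and Client Password typically fit together, the Python sketch below assembles an OAuth 2.0 client-credentials token request for the Microsoft identity platform. The function name and sample values are hypothetical, the Azure Resource Manager scope is an assumption about what such a client would request, and nothing is sent over the network.

```python
from urllib.parse import urlencode


def build_token_request(tenant_id: str, client_id: str, client_secret: str):
    """Return the (url, form_body) pair for a client-credentials token request."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    body = urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,          # the Application ID
        "client_secret": client_secret,  # the client secret value
        # Assumed scope: the Azure Resource Manager endpoint.
        "scope": "https://management.azure.com/.default",
    })
    return url, body


# Hypothetical placeholder values, for illustration only.
url, body = build_token_request("my-tenant-id", "my-application-id", "my-secret")
print(url)
```

The body would be sent as an application/x-www-form-urlencoded POST; a successful response carries the access token used in subsequent API calls.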

Proxy Host

In some environments, all communication with the Azure cloud may be routed through a proxy server. In such environments, you should make sure that the eG agent connects to the cloud via the proxy server and collects metrics. To enable metrics collection via a proxy, specify the IP address of the proxy server and the port at which the server listens against the Proxy Host and Proxy Port parameters. By default, these parameters are set to none, indicating that the eG agent is not configured to communicate via a proxy.

Proxy Username, Proxy Password and Confirm Password

If the proxy server requires authentication, specify a valid proxy user name and password in the Proxy Username and Proxy Password parameters, respectively. Then, confirm the password by retyping it in the Confirm Password text box.
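
As a rough illustration of how the proxy parameters relate to one another, the Python sketch below turns them into a single proxy URL and treats none as "no proxy configured". The function name, the http scheme, and the sample values are assumptions made for the example; this is not how the eG agent is implemented.

```python
from urllib.parse import quote


def _unset(value) -> bool:
    """True when a parameter is empty or left at its 'none' default."""
    return value is None or str(value).strip().lower() in ("", "none")


def build_proxy_url(host, port, username="none", password="none"):
    """Return a proxy URL, or None when no proxy host/port is configured."""
    if _unset(host) or _unset(port):
        return None
    auth = ""
    if not _unset(username):
        # Percent-encode credentials so characters such as '@' stay unambiguous.
        auth = quote(str(username), safe="")
        if not _unset(password):
            auth += ":" + quote(str(password), safe="")
        auth += "@"
    return f"http://{auth}{str(host).strip()}:{int(port)}"


# Hypothetical sample values, for illustration only.
print(build_proxy_url("none", "none"))
print(build_proxy_url("192.168.10.5", 3128, "proxyuser", "p@ss"))
```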

Show DD for In Progress Jobs

By default, this test does not provide detailed diagnostics for the In progress jobs measure. Accordingly, this flag is set to No by default.

If you want to know which backup jobs are in progress currently, then you need to enable detailed diagnosis for the In progress jobs measure by setting this flag to Yes. Before doing so, make sure that your eG database is well-tuned and adequately sized to handle the additional load imposed by these detailed metrics.

Show DD for Completed Jobs

By default, this test does not provide detailed diagnostics for the Completed jobs measure. Accordingly, this flag is set to No by default.

If you want to know which backup jobs have completed, then you need to enable detailed diagnosis for the Completed jobs measure by setting this flag to Yes. Before doing so, make sure that your eG database is well-tuned and adequately sized to handle the additional load imposed by these detailed metrics.

Datasource Location Names

In the Azure Backup context, a datasource represents the data to be backed up. By default, this test monitors all backup jobs that have been triggered for the target Azure Subscription, regardless of the location of the datasources that are being backed up. Accordingly, this parameter is set to ALL by default. However, if you want the test to monitor only those backup jobs that are taking backups of datasources in a specific location (e.g., Central US, Brazil South, etc.), then specify that location name here. You can even configure multiple datasource location names, as a comma-separated list. For instance, your specification can be: Central US, Brazil South, East Asia

To know what location names you can provide, do the following:

  1. Log in to the Azure portal.

  2. When Figure 3 appears, select the Backup center option as indicated.

    Figure 3 : Clicking on the Backup center option

  3. Figure 4 then appears, with the Overview option in the left panel selected by default. By default, the right panel of Figure 4 will provide an overview of all backup jobs that have been triggered for all subscriptions. To view only those backup jobs that are related to the subscription being monitored, select that subscription from the Datasource subscription drop-down in the left panel.

    Figure 4 : Selecting the Datasource subscription being monitored

  4. Next, focus on the Datasource location drop-down in the right panel. This is set to All by default. To know what datasource locations are available for selection, click on the Datasource location drop-down. A list of locations will appear (see Figure 5). From this list, make note of the location names that you want this test to monitor, and specify those names as a comma-separated list against the Datasource Location Names parameter.

    Figure 5 : Viewing the Datasource location names that you can configure for this test

Datasource Types

By default, this test monitors all backup jobs that have been triggered for the target Azure Subscription, regardless of the type of datasources that are being backed up. Accordingly, this parameter is set to ALL by default. However, if you want the test to monitor only those backup jobs that are taking backups of a specific type of datasource (e.g., Azure Files, Azure Virtual machines, Azure Disks, etc.), then specify that datasource type here. You can even configure multiple datasource types, as a comma-separated list. For instance, your specification can be: Azure Virtual machines,Azure Blobs,Azure Disks

To know what datasource types you can provide, do the following:

  1. Log in to the Azure portal.

  2. When Figure 3 appears, select the Backup center option as indicated.

  3. Next, focus on the Datasource type drop-down in the right panel. This is set to All by default. To know what datasource types are available for selection, click on the Datasource type drop-down. A list of datasource types will appear (see Figure 6). From this list, make note of the types that you want this test to monitor, and specify those types as a comma-separated list against the Datasource Types parameter.

    Figure 6 : Viewing the Datasource types that you can configure for this test

Vault Names

A vault is an online-storage entity in Azure that's used to hold data, such as backup copies, recovery points, and backup policies. Azure Backup stores backed-up data in Recovery Services Vaults or in Backup Vaults.

  • Backup Vault: This is a storage entity in Azure that houses backup data for various Azure services, such as Azure Database for PostgreSQL servers and newer workloads that Azure Backup will support. Backup vaults are based on the Azure Resource Manager model of Azure.

  • Recovery Services Vault: This is a storage entity in Azure that houses copies of data and configuration information for Azure Virtual machines, workloads, servers, or workstations, and for various Azure services such as IaaS VMs (Linux or Windows) and Azure SQL databases. Recovery Services vaults are based on the Azure Resource Manager model of Azure.

By default, this test monitors all backup jobs that have been triggered for the target Azure Subscription, regardless of the vaults in which the backed up data is stored. Accordingly, this parameter is set to ALL by default. However, if you want the test to monitor only those backup jobs that are storing backups in a specific vault, then specify the name of that vault here. You can even configure multiple vault names, as a comma-separated list. For instance, your specification can be: SMTest2Vault,TempVault,AzrVMVault. Note that your specification can include the names of both Backup vaults and Recovery Services vaults.

To know what vault names you can provide, do the following:

  1. Log in to the Azure portal.

  2. When Figure 3 appears, select the Backup center option as indicated.

  3. Figure 7 then appears, with the Overview option in the left panel selected by default. Now, click on the Vaults option in the left panel, as indicated by Figure 7.

    Figure 7 : Clicking on the Vaults option in the left panel

  4. This will invoke Figure 8. By default, the right panel of Figure 8 will display all vaults that exist across subscriptions. To view only those vaults that have been created for the subscription you are monitoring, select that subscription from the Vault subscription drop-down in the right panel. This will list those vaults that have been configured for the chosen subscription. From this list, make a note of those vault names that you want to monitor, and then specify those names as a comma-separated list against the Vault Names parameter of this test.

    Figure 8 : Viewing the vault names that you can configure for this test
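
The Datasource Location Names, Datasource Types, and Vault Names parameters all take either ALL or a comma-separated list. The Python sketch below shows one way such specifications could be parsed and applied to a set of jobs. It is a hedged illustration only: the BackupJob record, the function names, case-insensitive matching, and the assumption that a job must satisfy all three filters are not taken from the eG agent.

```python
from dataclasses import dataclass


@dataclass
class BackupJob:
    """Hypothetical job record used only for this example."""
    name: str
    location: str
    datasource_type: str
    vault: str


def parse_filter(spec: str):
    """Return None for ALL (no filtering), else a set of lower-cased names."""
    names = {part.strip().lower() for part in spec.split(",") if part.strip()}
    return None if not names or "all" in names else names


def select_jobs(jobs, locations="ALL", types="ALL", vaults="ALL"):
    """Keep the jobs that match every configured filter (assumed AND logic)."""
    loc, typ, vlt = parse_filter(locations), parse_filter(types), parse_filter(vaults)

    def keep(job):
        return ((loc is None or job.location.lower() in loc)
                and (typ is None or job.datasource_type.lower() in typ)
                and (vlt is None or job.vault.lower() in vlt))

    return [job for job in jobs if keep(job)]


# Hypothetical sample data, for illustration only.
jobs = [
    BackupJob("job-1", "Central US", "Azure Virtual machines", "SMTest2Vault"),
    BackupJob("job-2", "East Asia", "Azure Disks", "TempVault"),
    BackupJob("job-3", "Brazil South", "Azure Blobs", "AzrVMVault"),
]
selected = select_jobs(jobs, locations="Central US, Brazil South", vaults="AzrVMVault")
print([job.name for job in selected])
```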

Long Running Jobs Limit in Min

Specify the duration (in minutes) beyond which a running backup job is to be considered a long-running job. By default, this parameter is set to 180 minutes. This means that, by default, this test will count the backup jobs that have been running for over 180 minutes, and report that count as the value of the Number of long running jobs measure. You can increase or decrease this time limit as needed.
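
The threshold logic can be sketched as follows. This Python example is an illustration under stated assumptions, not the agent's code: the function name and the sample start times are hypothetical, and a job is counted only when its run time strictly exceeds the limit, in line with the "over 180 minutes" wording above.

```python
from datetime import datetime, timedelta, timezone


def count_long_running(start_times, now, limit_minutes=180):
    """Count in-progress jobs whose elapsed run time exceeds the limit."""
    limit = timedelta(minutes=limit_minutes)
    return sum(1 for started in start_times if now - started > limit)


# Hypothetical sample data: three in-progress jobs observed at 12:00 UTC.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
start_times = [
    datetime(2024, 1, 1, 8, 0, tzinfo=timezone.utc),   # running for 240 minutes
    datetime(2024, 1, 1, 9, 0, tzinfo=timezone.utc),   # running for exactly 180 minutes
    datetime(2024, 1, 1, 11, 0, tzinfo=timezone.utc),  # running for 60 minutes
]
print(count_long_running(start_times, now))
```

With the default limit, only the first job counts as long-running; lowering the limit to 60 minutes would also include the second.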

DD Frequency

Refers to the frequency with which detailed diagnosis measures are to be generated for this test. The default is 1:1. This indicates that, by default, detailed measures will be generated every time this test runs, and also every time the test detects a problem. You can modify this frequency, if you so desire. Also, if you intend to disable the detailed diagnosis capability for this test, you can do so by specifying none against DD frequency.
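
One way to read a specification such as 1:1 is as a pair of intervals: the first applies while the test reports normal conditions and the second while it detects a problem. The Python sketch below works under that assumption, and additionally assumes that a value of N means "every Nth run". The function names are hypothetical, and the actual scheduling inside eG Enterprise may differ.

```python
def parse_dd_frequency(spec: str):
    """Return (normal, abnormal) intervals, or None when set to 'none'."""
    text = spec.strip().lower()
    if text == "none":
        return None
    normal, abnormal = (int(part) for part in text.split(":"))
    return normal, abnormal


def should_generate_dd(run_number: int, problem_detected: bool, spec: str = "1:1") -> bool:
    """Decide whether detailed diagnosis is generated on this test run."""
    frequency = parse_dd_frequency(spec)
    if frequency is None:
        return False  # detailed diagnosis disabled
    normal, abnormal = frequency
    every = abnormal if problem_detected else normal
    # An interval of 0 is treated as "never" (assumption).
    return every > 0 and run_number % every == 0


print(should_generate_dd(1, problem_detected=False))
print(should_generate_dd(3, problem_detected=False, spec="2:1"))
```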

Detailed Diagnosis

To make diagnosis more efficient and accurate, eG Enterprise embeds an optional detailed diagnostic capability. With this capability, the eG agents can be configured to run detailed, more elaborate tests as and when specific problems are detected. To enable the detailed diagnosis capability of this test for a particular server, choose the On option. To disable the capability, choose the Off option.

The option to selectively enable/disable the detailed diagnosis capability will be available only if the following conditions are fulfilled:

  • The eG manager license should allow the detailed diagnosis capability.
  • Neither the normal nor the abnormal frequency configured for the detailed diagnosis measures should be 0.

Measures made by the test:
Measurement | Description | Measurement Unit | Interpretation

Number of failed jobs

Indicates the number of backup jobs that have failed.

Number

Ideally, the value of this measure should be 0. If a non-zero value is reported, it implies that one/more backup jobs have failed. In this case, you can use the detailed diagnosis of this measure to know which jobs failed.

Number of in progress jobs

Indicates the number of backup jobs that are in progress.

Number

The detailed diagnosis of this measure, if enabled, lists the jobs in progress.

Number of completed jobs

Indicates the number of backup jobs that have completed.

Number

The detailed diagnosis of this measure, if enabled, lists the jobs that have completed.

Number of long running jobs

Indicates the number of backup jobs that have been running for a duration beyond the limit configured against the Long Running Jobs Limit in Min parameter.

Number

A high value is a cause for concern. In this case, use the detailed diagnosis of this measure to know which jobs have been running for a long time.
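
To illustrate how the state-based measures above relate to a list of jobs, the Python sketch below tallies job states into the corresponding measure names. The state strings (Failed, InProgress, Completed) and the function name are assumptions made for this example; the values actually returned by Azure or used by the eG agent may differ.

```python
from collections import Counter

# Assumed mapping from a job state string to the measure it feeds.
MEASURES = {
    "Failed": "Number of failed jobs",
    "InProgress": "Number of in progress jobs",
    "Completed": "Number of completed jobs",
}


def count_by_state(statuses):
    """Return each measure with its job count, reporting 0 for absent states."""
    counts = Counter(statuses)
    return {measure: counts.get(state, 0) for state, measure in MEASURES.items()}


# Hypothetical sample data, for illustration only.
sample = ["Completed", "Completed", "InProgress", "Failed", "Completed"]
for measure, value in count_by_state(sample).items():
    print(f"{measure}: {value}")
```

A non-zero count for the failed state is the condition that warrants drilling down into the detailed diagnosis.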

Use the detailed diagnosis of the Number of failed jobs measure to know the details of the failed backup jobs. These details include the name of the resource group where the job ran, the location and type of the datasource that was backed up, the name and type of vault in which the backup was stored, and the friendly name of the backup instance.

Figure 9 : Detailed diagnosis of the Number of failed jobs measure