AWS Storage Gateway Test

AWS Storage Gateway connects an on-premises software appliance with cloud-based storage to provide seamless integration with data security features between your on-premises IT environment and the Amazon Web Services (AWS) storage infrastructure.

AWS Storage Gateway offers file-based, volume-based and tape-based storage solutions:

  • File Gateway – File gateway is a type of AWS Storage Gateway that supports a file interface into Amazon S3 and that adds to the current block-based volume and VTL storage. File gateway combines a service and virtual software appliance, enabling you to store and retrieve objects in Amazon S3 using industry-standard file protocols such as Network File System (NFS). The software appliance, or gateway, is deployed into your on-premises environment as a virtual machine (VM) running on VMware ESXi. The gateway provides access to objects in S3 as files on an NFS mount point.
  • File gateway also provides low-latency access to data through transparent local caching. File gateway manages data transfer to and from AWS, buffers applications from network congestion, optimizes and streams data in parallel, and manages bandwidth consumption.

  • Volume Gateway – Volume gateway provides cloud-backed storage volumes that you can mount as Internet Small Computer System Interface (iSCSI) devices from your on-premises application servers. The gateway supports the following volume configurations:

    • Cached Volumes – You store your data in Amazon Simple Storage Service (Amazon S3) and retain a copy of frequently accessed data subsets locally.
    • Stored Volumes – If you need low-latency access to your entire data set (and not just the frequently accessed data set), you can configure your on-premises gateway to store all your data locally and then asynchronously back up point-in-time snapshots of this data to Amazon S3.
  • Tape Gateway – Tape Gateway provides a virtual tape infrastructure that scales seamlessly with your business needs and eliminates the operational burden of provisioning, scaling, and maintaining a physical tape infrastructure.

To ensure the peak performance of their mission-critical applications, administrators must make sure that the storage gateway used by the on-premises applications, and the volumes provisioned for them, can process I/O requests quickly and are sized commensurate with the current and anticipated load.

The AWS Storage Gateway Test helps administrators with this analysis. This test auto-discovers the storage gateways configured on AWS and reports the I/O throughput, cache usage, and I/O latency of each storage gateway. In the process, the test pinpoints overloaded gateways and those that are experiencing slowness when processing I/O requests. With the help of the test, you can also judge how effectively the cache is being used, and determine how the cache can be tuned to improve performance.

You can also optionally configure the test to report metrics for each volume, instead of each storage gateway. Using the per-volume metrics that the test reports, you can quickly identify volumes under duress, and can rapidly initiate performance improvement measures.

Target of the test: Amazon EC2 Cloud

Agent deploying the test: A remote agent

Outputs of the test: One set of results for each storage gateway / volume (as the case may be)

Configurable parameters for the test
Parameter Description

Test Period

How often should the test be executed.

Host

The host for which the test is to be configured.

AWS Access Key, AWS Secret Key, Confirm AWS Access Key, Confirm AWS Secret Key

To monitor an Amazon EC2 instance, the eG agent has to be configured with the access key and secret key of a user with a valid AWS account. For this purpose, we recommend that you create a special user on the AWS cloud, obtain the access and secret keys of this user, and configure this test with these keys. The procedure for this has been detailed in the Obtaining an Access key and Secret key topic. Make sure you reconfirm the access and secret keys you provide here by retyping them in the corresponding Confirm text boxes.

Proxy Host and Proxy Port

In some environments, all communication with the AWS EC2 cloud and its regions could be routed through a proxy server. In such environments, you should make sure that the eG agent connects to the cloud via the proxy server and collects metrics. To enable metrics collection via a proxy, specify the IP address of the proxy server and the port at which the server listens against the Proxy Host and Proxy Port parameters. By default, these parameters are set to none, indicating that the eG agent is not configured to communicate via a proxy.

Proxy User Name, Proxy Password, and Confirm Password

If the proxy server requires authentication, specify a valid proxy user name and password in the Proxy User Name and Proxy Password parameters, respectively. Then, confirm the password by retyping it in the Confirm Password text box. By default, these parameters are set to none, indicating that the proxy server does not require authentication.
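The way the none default interacts with the proxy parameters can be illustrated with a minimal sketch. The function name build_proxy_url, the http:// scheme, and the sample address are assumptions made for illustration; they do not describe how the eG agent builds its connection internally, and NTLM settings are not covered.

```python
from typing import Optional
from urllib.parse import quote


def _is_unset(value: str) -> bool:
    # "none" is the documented default and means "not configured".
    return value.strip().lower() == "none"


def build_proxy_url(host: str, port: str, user: str = "none",
                    password: str = "none") -> Optional[str]:
    """Return a proxy URL, or None when no proxy is configured (hypothetical helper)."""
    if _is_unset(host) or _is_unset(port):
        return None
    credentials = ""
    if not _is_unset(user):
        # Percent-encode so characters such as '@' or ':' cannot break the URL.
        credentials = quote(user, safe="")
        if not _is_unset(password):
            credentials += ":" + quote(password, safe="")
        credentials += "@"
    # The http:// scheme is an assumption for this sketch.
    return f"http://{credentials}{host.strip()}:{port.strip()}"
```

With both Proxy Host and Proxy Port left at none, the helper returns None, which corresponds to the agent connecting directly.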

Proxy Domain and Proxy Workstation

If a Windows NTLM proxy is to be configured for use, then additionally, you will have to configure the Windows domain name and the Windows workstation name required for the same against the Proxy Domain and Proxy Workstation parameters. If the environment does not support a Windows NTLM proxy, set these parameters to none.

Exclude Region

Here, you can provide a comma-separated list of region names or patterns of region names that you do not want to monitor. For instance, to exclude regions with names that contain 'east' and 'west' from monitoring, your specification should be: *east*,*west*
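To see which regions a given specification would exclude, the pattern matching can be approximated with shell-style wildcards. This is a rough sketch only: the function name regions_to_monitor is hypothetical, and case-insensitive matching is an assumption rather than documented behavior.

```python
from fnmatch import fnmatchcase
from typing import List


def regions_to_monitor(all_regions: List[str], exclude_spec: str) -> List[str]:
    """Drop every region whose name matches any pattern in the comma-separated spec."""
    patterns = [p.strip().lower() for p in exclude_spec.split(",") if p.strip()]
    return [
        region for region in all_regions
        if not any(fnmatchcase(region.lower(), pattern) for pattern in patterns)
    ]


regions = ["us-east-1", "us-west-2", "eu-central-1", "ap-southeast-1"]
print(regions_to_monitor(regions, "*east*,*west*"))
```

Note that a pattern such as *east* also matches ap-southeast-1, so broad patterns can exclude more regions than intended.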

Gateway Filter Name

By default, this test reports metrics for each storage gateway configured. Accordingly, this parameter is set to GatewayID by default. In this case, the measures of this test will be aggregated across all storage volumes of a storage gateway. If you want, you can configure this test to report usage metrics per storage volume. For this, set this parameter to VolumeID.
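The difference between the two settings can be pictured as a change of grouping key. The sketch below is illustrative only: the function name, the tuple layout, and the gateway and volume identifiers are all hypothetical, and summation is assumed as the aggregation for a data-volume measure such as Read data.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Each sample is (gateway_id, volume_id, read_kb); the layout is an assumption.
Sample = Tuple[str, str, float]


def group_read_data(samples: Iterable[Sample],
                    filter_name: str = "GatewayID") -> Dict[str, float]:
    """Sum read data per gateway (GatewayID) or per volume (VolumeID)."""
    if filter_name not in ("GatewayID", "VolumeID"):
        raise ValueError("filter_name must be GatewayID or VolumeID")
    totals: Dict[str, float] = defaultdict(float)
    for gateway_id, volume_id, read_kb in samples:
        key = gateway_id if filter_name == "GatewayID" else volume_id
        totals[key] += read_kb
    return dict(totals)


samples = [("sgw-1", "vol-a", 100.0), ("sgw-1", "vol-b", 50.0), ("sgw-2", "vol-c", 25.0)]
print(group_read_data(samples))              # one descriptor per gateway
print(group_read_data(samples, "VolumeID"))  # one descriptor per volume
```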

Measurements made by the test
Measurement Description Measurement Unit Interpretation

Read data

If the Gateway Filter Name parameter is set to GatewayID, then this measure reports the amount of data that on-premises applications read from this storage gateway for all volumes in the gateway.

If the Gateway Filter Name is set to VolumeID, then this measure will report the amount of data that was read from this volume by on-premises applications.

KB

If the value of these measures is consistently low for any gateway/volume, it indicates low throughput.

Here are some recommended best practices for optimizing gateway performance:

  • Add high-performance disks such as solid-state drives (SSDs) and an NVMe controller.
  • Attach virtual disks to your VM directly from a storage area network (SAN) instead of the Microsoft Hyper-V NTFS.
  • Confirm that the virtual processors that are assigned to the gateway VM are backed by an equal number of cores and that you are not oversubscribing the CPUs of the host server.
  • You can add additional CPUs to the gateway host server.
  • When you provision disks in a gateway setup, we strongly recommend that you do not provision local disks for the upload buffer and cache storage that use the same underlying physical storage disk.
  • For volume gateways, if you find that adding more volumes to a gateway reduces the throughput to the gateway, consider adding the volumes to a separate gateway. In particular, if a volume is used for a high-throughput application, consider creating a separate gateway for the high-throughput application. However, as a general rule, you should not use one gateway for all of your high-throughput applications and another gateway for all of your low-throughput applications.
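Whether the Read data or Write data values are "consistently low" depends on the workload, so any threshold is site-specific. As a minimal sketch, assuming you have exported the recent values of one measure for a gateway or volume, the check below flags the case where every one of the last few samples falls under a threshold you choose. The function name, threshold, and window size are hypothetical.

```python
from typing import Sequence


def is_consistently_low(samples_kb: Sequence[float], threshold_kb: float,
                        window: int = 5) -> bool:
    """True when each of the last `window` samples is below the threshold."""
    recent = samples_kb[-window:]
    # Too little history is treated as "not enough evidence", not as low.
    return len(recent) >= window and all(value < threshold_kb for value in recent)


print(is_consistently_low([900, 40, 30, 20, 35, 25], threshold_kb=50))
```

A single low sample does not trigger the check; only a sustained run does, which mirrors the wording of the interpretation above.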

Write data

If the Gateway Filter Name parameter is set to GatewayID, then this measure reports the amount of data that on-premises applications wrote into this storage gateway for all volumes in the gateway.

If the Gateway Filter Name is set to VolumeID, then this measure will report the amount of data that was written into this volume by on-premises applications.

KB

Read time

If the Gateway Filter Name parameter is set to GatewayID, then this measure reports the time taken by on-premises applications to read from all storage volumes in this gateway.

If the Gateway Filter Name is set to VolumeID, then this measure will report the time taken by on-premises applications to read from this volume.

Secs

 

An abnormally high value for these measures indicates an I/O processing bottleneck. You may want to investigate the slowdown further and isolate its root cause. The best practices discussed in the Interpretation of the Read data and Write data measures can be employed to optimize gateway performance and avert such anomalies.

 

Write time

If the Gateway Filter Name parameter is set to GatewayID, then this measure reports the time taken by on-premises applications to write into all storage volumes in this gateway.

If the Gateway Filter Name is set to VolumeID, then this measure will report the time taken by on-premises applications to write into this volume.

Secs

Queued writes

If the Gateway Filter Name parameter is set to GatewayID, then this measure reports the amount of data waiting to be written to all volumes of this gateway.

If the Gateway Filter Name is set to VolumeID, then this measure will report the amount of data waiting to be written to this volume.

KB

A high value of this measure or a steady increase in the value of this measure for a storage gateway/volume could indicate an I/O processing bottleneck.
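The two conditions described here, a high value or a steady increase, can be told apart with a small check over recent samples. This is a sketch under assumptions: the function name, the window of four samples, and the notion of "high" as a fixed limit are all illustrative choices, not part of the test.

```python
from typing import Optional, Sequence


def queued_writes_alert(samples_kb: Sequence[float], high_kb: float,
                        window: int = 4) -> Optional[str]:
    """Return "high", "rising", or None for a series of Queued writes samples."""
    if not samples_kb:
        return None
    if samples_kb[-1] >= high_kb:
        return "high"
    recent = list(samples_kb[-window:])
    # "Steady increase" is taken here to mean every sample exceeds the previous one.
    if len(recent) == window and all(b > a for a, b in zip(recent, recent[1:])):
        return "rising"
    return None


print(queued_writes_alert([100, 150, 220, 400], high_kb=4096))
```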

Compressed data downloaded

Indicates the amount of compressed data that all volumes of this gateway downloaded from AWS.

KB

This measure is reported only for each storage gateway, and not for each volume.

Compressed data uploaded

Indicates the amount of compressed data that all volumes of this gateway uploaded to AWS.

KB

This measure is reported only for each storage gateway, and not for each volume.

Compressed data read time

Indicates the amount of time taken to read compressed data from the gateway.

Secs

These measures are reported only for each storage gateway, and not for each volume.

A steady increase in the value of either measure could indicate an I/O processing bottleneck.

 

Compressed data write time

Indicates the amount of time taken to write compressed data into the gateway.

Secs

Data usage of upload buffer

Indicates the percent usage of this gateway's upload buffer.

Percent

This measure is reported only for cached volume gateways and tape gateways.

To prepare for upload to Amazon S3, a cached volume gateway and/or a tape gateway stores incoming data in a staging area, referred to as an upload buffer. Your gateway uploads this buffer data over an encrypted Secure Sockets Layer (SSL) connection to AWS, where it is stored encrypted in Amazon S3.

A value close to 100% for this measure indicates that the disk used by the storage gateway as the upload buffer is running out of space. This can happen if the gateway is unable to write data to Amazon S3 at the same pace at which it writes to the buffer. This in turn implies a bottleneck when uploading.

This can also happen if the disk is not sized right. The minimum disk space recommendation for the upload buffer is 150 GiB and the maximum is 2 TiB.
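If you want to reproduce the percentage from the Data used in upload buffer and Free data in upload buffer measures, the arithmetic can be sketched as follows. It is an assumption that the percentage is simply used space divided by used plus free space; the function names and the 90% warning level are hypothetical.

```python
def upload_buffer_percent_used(used_kb: float, free_kb: float) -> float:
    """Percent of the upload buffer in use, derived from used and free space."""
    total_kb = used_kb + free_kb
    if total_kb <= 0:
        raise ValueError("upload buffer has no allocated space")
    return 100.0 * used_kb / total_kb


def is_nearly_full(percent_used: float, warn_at: float = 90.0) -> bool:
    # The warning level is an illustrative choice, not a documented limit.
    return percent_used >= warn_at


percent = upload_buffer_percent_used(used_kb=120.0, free_kb=30.0)
print(percent, is_nearly_full(percent))
```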

Data used in upload buffer

Indicates the total amount of data being used in this gateway's upload buffer.

KB

This measure is reported only for cached volume gateways and tape gateways.

Free data in upload buffer

Indicates the total amount of unused space in this gateway's upload buffer.

KB

This measure is reported only for cached volume gateways and tape gateways.

A high value is desired for this measure.

Free space in gateway's working storage

Indicates the total amount of unused space in this stored volume gateway's working storage.

KB

This measure is reported only for stored volume gateways.

To prepare for upload to Amazon S3, a stored volume gateway stores incoming data in a staging area, referred to as an upload buffer/working storage. Your gateway uploads this buffer data over an encrypted Secure Sockets Layer (SSL) connection to AWS, where it is stored encrypted in Amazon S3.

Adequate free space should be available in the working storage to enable the gateway to store all the incoming data before upload. A high value is hence desired for this measure. The minimum disk space recommendation for the working storage is 150 GiB and the maximum is 2 TiB.

Data usage of gateway's working storage

Indicates the percent usage of this stored volume gateway's working storage.

Percent

This measure is reported only for stored volume gateways.

A value close to 100% for this measure indicates that the disk used by the stored volume gateway as the working storage is running out of space. This can happen if the gateway is unable to write data to Amazon S3 at the same pace at which it writes to the working storage. This in turn implies a bottleneck when uploading.

This can also happen if the disk is not sized right. The minimum disk space recommendation for the working storage is 150 GiB and the maximum is 2 TiB.

Data used in gateway's working storage

Indicates the total amount of data being used in the stored volume gateway's working storage.

KB

This measure is reported only for stored volume gateways.

Application reads served from cache

If the Gateway Filter Name parameter is set to GatewayID, then this measure reports the percentage of application reads served from this gateway's cache.

If the Gateway Filter Name is set to VolumeID, then this measure will report the percentage of read operations from this volume that are served from the cache.

Percent

Ideally, the value of this measure should be above 80%. If not, then it means that many read requests are being serviced by directly accessing the data in AWS. This can increase I/O overheads and adversely impact application performance.
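The 80% guideline can be applied to raw read counts with a short calculation. The sketch below assumes you know how many reads were served from the cache and how many reads were issued in total; both inputs and the function names are hypothetical.

```python
from typing import Optional


def cache_hit_percent(reads_from_cache: int, total_reads: int) -> Optional[float]:
    """Percentage of application reads served from the cache, or None with no reads."""
    if total_reads == 0:
        return None
    return 100.0 * reads_from_cache / total_reads


def below_target(hit_percent: float, target: float = 80.0) -> bool:
    # 80% is the guideline stated in the interpretation above.
    return hit_percent < target


hit = cache_hit_percent(850, 1000)
print(hit, below_target(hit))
```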

Usage of gateway's cache storage

If the Gateway Filter Name parameter is set to GatewayID, then this measure reports the percent usage of this gateway's cache.

If the Gateway Filter Name parameter is set to VolumeID, then this measure reports what percentage of the gateway's cache storage is used by this volume.

Percent

If the value of this measure grows steadily close to 100%, it denotes the excessive usage of that gateway's cache storage.

If the value of this measure is close to 100% for a volume, it implies that a particular volume is taking up too much cache space.

If the gateway's cache storage runs out of space, then the cache will no longer be able to hold frequently-accessed objects; this in turn will increase cache misses and related overheads. This is why the cache storage has to be sized correctly. The recommended minimum cache size is 150 GiB and the maximum is 16 TiB.
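A quick way to compare a gateway's cache against the sizing guidance above is to add the used and free space and convert the total to GiB. This sketch assumes that the KB values reported by the test are multiples of 1024 bytes and that used plus free space equals the allocated cache size; the function name is hypothetical.

```python
KB_PER_GIB = 1024 * 1024        # assumes 1 KB = 1024 bytes
CACHE_MIN_GIB = 150             # recommended minimum from the text above
CACHE_MAX_GIB = 16 * 1024       # 16 TiB maximum from the text above


def cache_size_verdict(used_kb: float, free_kb: float) -> str:
    """Classify the total cache size against the 150 GiB to 16 TiB guidance."""
    total_gib = (used_kb + free_kb) / KB_PER_GIB
    if total_gib < CACHE_MIN_GIB:
        return "below recommended minimum"
    if total_gib > CACHE_MAX_GIB:
        return "above maximum"
    return "within range"


print(cache_size_verdict(used_kb=100 * KB_PER_GIB, free_kb=20 * KB_PER_GIB))
```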

Gateway's cache has not been persisted to AWS

If the Gateway Filter Name parameter is set to GatewayID, then this measure reports the percentage of this gateway's cache that has not been persisted to AWS.

If the Gateway Filter Name parameter is set to VolumeID, then this measure reports what percentage of the gateway's cache storage holds data for this volume that has not been persisted to AWS.

Percent

As your applications write data to the storage volumes in AWS, the gateway initially stores the data on the cache storage before uploading the data to Amazon S3.

The value of this measure represents the proportion of cached data that is yet to be uploaded to Amazon S3. If this value is very high, it could indicate that the gateway is having trouble uploading data to AWS. You may want to investigate the reasons for the same. In the process, you may also want to configure this test to report metrics per volume, and identify the exact volume for which the most data is yet to be uploaded.
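Once per-volume values are available, picking out the volume with the most unpersisted data is a simple maximum. The sketch below assumes a mapping from volume ID to the value of this measure; the function name and the volume identifiers are hypothetical.

```python
from typing import Dict, Optional, Tuple


def most_unpersisted_volume(
        percent_by_volume: Dict[str, float]) -> Optional[Tuple[str, float]]:
    """Return (volume_id, percent) for the volume with the highest unpersisted share."""
    if not percent_by_volume:
        return None
    return max(percent_by_volume.items(), key=lambda item: item[1])


readings = {"vol-a": 3.5, "vol-b": 41.0, "vol-c": 12.0}
print(most_unpersisted_volume(readings))
```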

Total cache size

Indicates the amount of data stored in this gateway's cache.

KB

This measure is reported only for each storage gateway, and not for each volume.

Time since last available recovery point

Indicates the time since the last available recovery point of this gateway's cache storage.

Secs

This measure is reported only for each storage gateway, and not for each volume.

A volume recovery point is a point in time at which all data of the volume is consistent. You can clone a volume or create a snapshot of it from its recovery point.
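The value of this measure is an elapsed time, so it can be reproduced from the timestamp of the last recovery point. This is a minimal sketch that assumes the timestamp is available as a timezone-aware datetime; the function names and the one-hour staleness limit are hypothetical.

```python
from datetime import datetime, timezone
from typing import Optional


def seconds_since_recovery_point(recovery_point: datetime,
                                 now: Optional[datetime] = None) -> float:
    """Elapsed seconds between the last recovery point and now (both timezone-aware)."""
    if now is None:
        now = datetime.now(timezone.utc)
    return (now - recovery_point).total_seconds()


def is_stale(elapsed_secs: float, max_age_secs: float = 3600.0) -> bool:
    # The one-hour limit is illustrative; choose one that suits your recovery objectives.
    return elapsed_secs > max_age_secs


last_point = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
check_time = datetime(2024, 1, 1, 12, 5, 30, tzinfo=timezone.utc)
print(seconds_since_recovery_point(last_point, now=check_time))
```

A value that keeps growing means that no newer consistent point exists, so a snapshot or clone taken now would reflect increasingly old data.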

Data used in gateway's cache storage

Indicates the amount of data being used in this gateway's cache storage.

KB

This measure is reported only for each storage gateway, and not for each volume.

Free space in gateway's cache storage

Indicates the total amount of unused space in this gateway's cache storage.

KB

This measure is reported only for each storage gateway, and not for each volume.

Ideally, the value of this measure should be high.