ZFS Pools Test

ZFS is a combined file system and logical volume manager designed by Sun Microsystems. Its features include protection against data corruption, support for high storage capacities, integration of file system and volume management, snapshots and copy-on-write clones, continuous integrity checking and automatic repair, RAID-Z, and native NFSv4 ACLs.

ZFS uses the concept of storage pools to manage physical storage. Historically, file systems were constructed on top of a single physical device. To address multiple devices and provide for data redundancy, the concept of a volume manager was introduced to provide the image of a single device so that file systems would not have to be modified to take advantage of multiple devices. This design added another layer of complexity and ultimately prevented certain file system advances, because the file system had no control over the physical placement of data on the virtualized volumes.

ZFS eliminates volume management altogether. Instead of forcing you to create virtualized volumes, ZFS aggregates devices into a storage pool. The storage pool describes the physical characteristics of the storage (device layout, data redundancy, and so on), and acts as an arbitrary data store from which file systems can be created. File systems are no longer constrained to individual devices, so they can share space with all other file systems in the pool. You no longer need to predetermine the size of a file system, as file systems grow automatically within the space allocated to the storage pool. When new storage is added, all file systems within the pool can immediately use the additional space without any further work.

High disk space usage in a pool can cause severe contention for disk resources among the file systems sharing that pool; this in turn slows down users who attempt to access data in these file systems. A high level of I/O activity on a storage pool, or high bandwidth usage by it, can also slow down disk accesses. To prevent such problems, administrators need to constantly monitor the space usage and I/O operations of the storage pools. The ZFS Pools test facilitates this. Using this test, administrators can closely track the space usage of each storage pool and the read-write operations on it, be proactively alerted to a potential space crisis in a pool, and accurately isolate those pools that are experiencing abnormal levels of bandwidth usage and I/O.

This test is disabled by default. To enable the test, go to the Enable/Disable tests page using the menu sequence: Agents -> Tests -> Enable/Disable, pick the desired Component type, set Performance as the Test type, choose the test from the DISABLED TESTS list, and click the << button to move the test to the ENABLED TESTS list. Finally, click the Update button.

Target of the test : A Solaris host

Agent deploying the test : An internal agent

Outputs of the test : One set of results for each storage pool configured

Configurable parameters for the test
Parameter	Description
Test Period	How often the test should be executed.
Host	The host for which the test is to be configured.

Measurements made by the test

Measurement	Description	Measurement Unit	Interpretation

Pool size:

Indicates the total size of this pool.

The value of this measure is equal to the sum of the sizes of all top-level virtual devices.

Allocated space:

Indicates the amount of physical space allocated to all datasets and internal metadata in this pool.

Note that this amount differs from the amount of disk space as reported at the file system level.

Free space:

Indicates the amount of unallocated space in this pool.

Capacity in use:

Indicates the amount of disk space used, expressed as a percentage of the total disk space in this pool.

Percent

Ideally, the value of this measure should not exceed 80%. If space usage exceeds this threshold, consider using ZFS quotas and reservations to keep it in check.
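The 80% guideline above can be checked with a few lines of Python. This is a minimal sketch, not the test's own implementation: the pool names and byte counts in POOLS are hypothetical sample data, and the percentage is assumed to be allocated space divided by total pool size.

```python
# Hypothetical sample figures in bytes; real values would come from the pools themselves.
POOLS = {
    "tank": {"size": 1000 * 2**30, "allocated": 850 * 2**30},
    "backup": {"size": 2000 * 2**30, "allocated": 500 * 2**30},
}

CAPACITY_LIMIT_PCT = 80.0  # guideline stated in the text above


def capacity_in_use(size, allocated):
    """Return allocated space as a percentage of total pool size."""
    if size <= 0:
        raise ValueError("pool size must be positive")
    return 100.0 * allocated / size


def pools_over_limit(pools, limit=CAPACITY_LIMIT_PCT):
    """Return the sorted names of pools whose capacity in use exceeds the limit."""
    return sorted(
        name
        for name, p in pools.items()
        if capacity_in_use(p["size"], p["allocated"]) > limit
    )


if __name__ == "__main__":
    for name, p in POOLS.items():
        print(f"{name}: {capacity_in_use(p['size'], p['allocated']):.1f}% in use")
    print("Over limit:", pools_over_limit(POOLS))
```

With the sample data, only the hypothetical pool tank (85% in use) is reported as over the limit.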

You can use the quota property to set a limit on the amount of space a file system can use. In addition, you can use the reservation property to guarantee that some amount of space is available to a file system.

You can also dynamically add space to a pool by adding a new top-level virtual device.
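The three remedies described above map to the zfs set and zpool add commands. The sketch below only builds the argument lists and prints them; it does not run anything. The function names, the file system tank/home, the sizes, and the device names are all hypothetical and chosen purely for illustration.

```python
import shlex


def quota_command(filesystem, limit):
    """Build a `zfs set quota=...` command that caps the space a file system can use."""
    return ["zfs", "set", f"quota={limit}", filesystem]


def reservation_command(filesystem, amount):
    """Build a `zfs set reservation=...` command that guarantees space to a file system."""
    return ["zfs", "set", f"reservation={amount}", filesystem]


def add_vdev_command(pool, devices, vdev_type=None):
    """Build a `zpool add` command that adds a new top-level virtual device to a pool.

    vdev_type is optional (for example "mirror"); when omitted, the devices are
    added as plain top-level devices.
    """
    if not devices:
        raise ValueError("at least one device is required")
    command = ["zpool", "add", pool]
    if vdev_type:
        command.append(vdev_type)
    return command + list(devices)


if __name__ == "__main__":
    # Hypothetical names; review each command before running it on a real host.
    for cmd in (
        quota_command("tank/home", "10G"),
        reservation_command("tank/home", "5G"),
        add_vdev_command("tank", ["c2t0d0", "c3t0d0"], vdev_type="mirror"),
    ):
        print(shlex.join(cmd))
```

Printing the commands instead of executing them keeps the sketch safe to run anywhere; adding a top-level device to a pool in particular deserves a manual check first.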

Health:

Indicates the current health status of this pool.

The values that this measure can report, their numeric equivalents, and their descriptions are listed in the table below:

Measure Value	Numeric Value	Description
Offline	0	The device has been explicitly taken offline by the administrator.
Online	1	The device or virtual device is in normal working order.
Degraded	2	The virtual device has experienced a failure but can still function.
Unavail	3	The device or virtual device cannot be opened.
Faulted	4	The device or virtual device is completely inaccessible.
Removed	5	The device was physically removed while the system was running.

Note:

By default, this measure reports one of the Measure Values listed in the table above. The graph of this measure, however, represents the health status using the numeric equivalents only.
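When reading the graph, it can help to translate the numeric equivalents back into the Measure Values from the table above. The following Python sketch encodes that table as a dictionary. The helper names are hypothetical, and treating every state other than Online as needing attention is an assumption, not something the test itself defines.

```python
# Measure Value -> Numeric Value, copied from the Health table above.
HEALTH_VALUES = {
    "Offline": 0,
    "Online": 1,
    "Degraded": 2,
    "Unavail": 3,
    "Faulted": 4,
    "Removed": 5,
}

# Reverse lookup for interpreting values read off the graph.
HEALTH_LABELS = {number: label for label, number in HEALTH_VALUES.items()}


def health_label(numeric_value):
    """Return the Measure Value for a numeric value, or "Unknown" if unmapped."""
    return HEALTH_LABELS.get(numeric_value, "Unknown")


def needs_attention(numeric_value):
    """Assumption: any state other than Online deserves a closer look."""
    return numeric_value != HEALTH_VALUES["Online"]


if __name__ == "__main__":
    for value in (1, 2, 4, 9):
        print(value, health_label(value), needs_attention(value))
```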

Operations read:

Indicates the rate at which read I/O operations were sent to the pool or device, including metadata requests.

Kilobytes/Sec

High values of these measures are indicative of high levels of I/O activity on a pool. Compare the values of these measures across pools to identify the I/O-intensive pools.

Operations write:

Indicates the rate at which write I/O operations were sent to the pool or device.

Kilobytes/Sec

Read bandwidth:

Indicates the bandwidth of all read operations (including metadata).

Kilobytes/Sec

High values for these measures indicate high bandwidth usage by a pool. By comparing the values of these measures across pools, you can isolate those pools that consume bandwidth excessively, and also determine whether the bandwidth is being consumed by reads or by writes.

Write bandwidth:

Indicates the bandwidth of all write operations.

Kilobytes/Sec
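The cross-pool comparison described for the bandwidth measures can be sketched in Python as follows. The pool names and the Kilobytes/Sec figures in SAMPLES are hypothetical sample data, and the function name is chosen for illustration only; it ranks pools by total bandwidth and notes whether reads or writes dominate.

```python
# Hypothetical read and write bandwidth per pool, in Kilobytes/Sec.
SAMPLES = {
    "tank": {"read_kbps": 1200.0, "write_kbps": 300.0},
    "backup": {"read_kbps": 50.0, "write_kbps": 4000.0},
    "scratch": {"read_kbps": 10.0, "write_kbps": 10.0},
}


def rank_by_bandwidth(samples):
    """Return (pool, total bandwidth, dominant direction) tuples, busiest pool first."""
    rows = []
    for name, sample in samples.items():
        read, write = sample["read_kbps"], sample["write_kbps"]
        if read > write:
            direction = "read"
        elif write > read:
            direction = "write"
        else:
            direction = "balanced"
        rows.append((name, read + write, direction))
    return sorted(rows, key=lambda row: row[1], reverse=True)


if __name__ == "__main__":
    for name, total, direction in rank_by_bandwidth(SAMPLES):
        print(f"{name}: {total:.0f} KB/s total, mostly {direction}")
```

With the sample data, backup ranks first and is write-dominated, followed by the read-dominated tank.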

Scrub status:

Indicates the status of ZFS scrubs that may have been performed on this pool during the last 8 days.

Scrubs can be scheduled and managed for a ZFS volume. Performing a ZFS scrub on a regular basis helps to identify data integrity problems, detect silent data corruption caused by transient hardware issues, and provide early warning of disk failures. If you have consumer-quality drives, consider a weekly scrubbing schedule. If you have datacenter-quality drives, consider a monthly scrubbing schedule.

Depending upon the amount of data, a scrub can take a long time. Scrubs are I/O intensive and can negatively impact performance. They should be scheduled for evenings or weekends to minimize the impact on users.
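The weekly and monthly scrubbing guidance above can be expressed as a simple due-date check. This is a minimal sketch under stated assumptions: "weekly" is taken as 7 days and "monthly" is approximated as 30 days, and the drive class labels and function name are hypothetical.

```python
from datetime import date, timedelta

# Assumed intervals: weekly for consumer-quality drives, roughly monthly for datacenter-quality drives.
SCRUB_INTERVAL_DAYS = {"consumer": 7, "datacenter": 30}


def scrub_due(last_scrub, drive_class, today):
    """Return True if the interval for this drive class has elapsed since the last scrub."""
    interval = timedelta(days=SCRUB_INTERVAL_DAYS[drive_class])
    return today - last_scrub >= interval


if __name__ == "__main__":
    last = date(2024, 1, 1)
    print(scrub_due(last, "consumer", date(2024, 1, 8)))     # 7 days elapsed
    print(scrub_due(last, "datacenter", date(2024, 1, 20)))  # 19 days elapsed
```

A check like this only says when a scrub is due; choosing an evening or weekend slot for it remains a scheduling decision.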

The values that this measure can take and their corresponding numeric values are listed below:

Measure Value	Numeric Value
Scrub completed	1
Scrub in progress | resilver	2
Scrub in progress	3
Scrub repaired	4
None requested	5
Expired	6

Note:

By default, this measure reports one of the Measure Values listed in the table above. The graph of this measure, however, represents the scrub status using the numeric equivalents only.