Monitoring Distributed File Systems

With Distributed File System (DFS), administrators can make it easy for users to access and manage files that are physically distributed across a network. They can make files distributed across multiple servers appear to users as if they reside in one place on the network. Users no longer need to know and specify the actual physical location of files in order to access them.

For example, if marketing material is scattered across multiple servers in a domain, administrators can use DFS to make it appear as though all of the material resides on a single server. This eliminates the need for users to go to multiple locations on the network to find the information they need. Using DFS, administrators can group shared folders located on different servers by transparently connecting them to one or more DFS namespaces.

Using the DFS tools, an administrator selects which shared folders to present in the namespace, designs the hierarchy in which those folders appear, and determines the names that the shared folders show in the namespace. When a user views the namespace, the folders appear to reside on a single, high-capacity hard disk. Users can navigate the namespace without needing to know the server names or shared folders hosting the data.
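
The mapping between what users see and where the data actually lives can be illustrated with a minimal sketch. It is not how DFS itself is implemented; the namespace root, folder names, and server names below are hypothetical and chosen only to mirror the marketing example above.

```python
# Hypothetical namespace root and folder targets (illustrative names only).
NAMESPACE_ROOT = r"\\example.local\Public"

# Each namespace folder maps to one or more shared folders (folder targets).
FOLDER_TARGETS = {
    r"Marketing\Brochures": [r"\\server1\Brochures", r"\\server2\Brochures"],
    r"Marketing\Pricing": [r"\\server3\Pricing"],
}


def resolve(path):
    """Return the candidate physical paths for a path inside the namespace."""
    prefix = NAMESPACE_ROOT + "\\"
    if not path.lower().startswith(prefix.lower()):
        raise ValueError("path is outside the namespace: " + path)
    relative = path[len(prefix):]

    # Try the longest namespace folder first so nested folders win.
    for folder in sorted(FOLDER_TARGETS, key=len, reverse=True):
        rel, fol = relative.lower(), folder.lower()
        if rel == fol or rel.startswith(fol + "\\"):
            remainder = relative[len(folder):]
            return [target + remainder for target in FOLDER_TARGETS[folder]]
    raise LookupError("no folder target for: " + path)


if __name__ == "__main__":
    for candidate in resolve(r"\\example.local\Public\Marketing\Brochures\flyer.pdf"):
        print(candidate)
```

The user only ever types the namespace path; the list returned by `resolve` stands in for the referral that tells the client which servers actually host the folder.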

DFS also includes the DFS Replication service, which creates multiple copies of the same data and keeps them on different servers. This provides fault tolerance and load sharing.

Continuous availability, rapid and reliable access, and easy management of shared files and folders are therefore the cornerstones of the DFS architecture. If DFS fails to deliver on these promises, users will be unable to access the files they want when they want them. This, in turn, will hurt user productivity and user confidence in the technology. Administrators therefore need to run periodic health checks on DFS and ensure that DFS-managed files and folders are available and accessible at all times. To help administrators achieve this, eG Enterprise provides a specialized Microsoft DFS monitoring model.

Figure 7: Layer model of the Microsoft DFS

Each layer of this model is mapped to tests that check the following at configured intervals:

  • Are referral requests being processed quickly? Which DFS namespace is the slowest in responding to referral requests?

  • How effective is the compression algorithm used by the DFS Replication service? Is it saving bandwidth during replication? Which replication folder saw the lowest bandwidth savings? Are any replication connections consuming excessive bandwidth?

  • Are the staging and Conflict and Deleted folders sized adequately for all replication folders?

  • Are the quotas of the staging and Conflict and Deleted folders configured appropriately?

  • Is any replication folder experiencing replication bottlenecks? Which one?

  • How heavy is the API request load on the namespace server? Is the server able to handle the load?

These metrics shed light on the following:

  • A potential slowdown when accessing a namespace on the namespace server;
  • Sizing inadequacies of the namespace server;
  • Bottlenecks in replication and the replication folders they affect;
  • Impact of replication on bandwidth usage;
  • How quota configurations affect the speed and efficiency of replication.
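
Two of the replication questions above reduce to simple arithmetic on raw readings. The following is a minimal sketch under stated assumptions, not the eG Enterprise implementation: the field names, the sample values, and the 90% alert threshold are all hypothetical and serve only to illustrate the calculations.

```python
from dataclasses import dataclass


@dataclass
class FolderSample:
    """Hypothetical per-folder reading; field names are illustrative only."""
    name: str
    original_bytes: int       # size of replicated files before compression
    compressed_bytes: int     # bytes actually transferred after compression
    staging_used_mb: float    # current size of the staging folder
    staging_quota_mb: float   # configured staging quota


def bandwidth_savings_pct(sample):
    """Percentage of bytes that compression kept off the network."""
    if sample.original_bytes <= 0:
        return 0.0
    saved = sample.original_bytes - sample.compressed_bytes
    return saved * 100 / sample.original_bytes


def staging_usage_pct(sample):
    """Staging folder size as a percentage of its quota."""
    if sample.staging_quota_mb <= 0:
        return 0.0
    return sample.staging_used_mb * 100 / sample.staging_quota_mb


def lowest_savings(samples):
    """Folder for which compression saved the least bandwidth."""
    return min(samples, key=bandwidth_savings_pct)


def staging_pressure(samples, threshold_pct=90.0):
    """Names of folders whose staging usage is at or above the threshold."""
    return [s.name for s in samples if staging_usage_pct(s) >= threshold_pct]


# Made-up sample values for illustration.
SAMPLES = [
    FolderSample("Brochures", 1000, 400, 3900.0, 4096.0),
    FolderSample("Pricing", 2000, 1800, 512.0, 4096.0),
]

if __name__ == "__main__":
    worst = lowest_savings(SAMPLES)
    print(worst.name, round(bandwidth_savings_pct(worst), 1))
    print(staging_pressure(SAMPLES))
```

A folder that shows low bandwidth savings together with a staging folder near its quota is a reasonable first place to look when replication slows down.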

The sections that follow elaborate on the DFS Replication Service layer alone, as the other layers have already been discussed in the Monitoring Unix and Windows Servers document.