At Apriorit, we have developed several custom virtual file system implementations for Windows and Linux, so we decided to share our knowledge on the topic in this series of articles. This article will be useful for any developer who wants to create a Windows virtual file system that processes file operations in its own way.
File system virtualization is a great technique for shielding users from the complexities of storage management, especially when files are stored at various points across networks. A virtual file system lets you present the data storage to the user in whatever way you want, completely separating that representation from the way files are actually stored. However, implementing a virtual file system can be fairly complex and requires a knowledgeable and experienced development team.
The file system virtualization solution described in this article has become popular due to the rapid development of services such as Dropbox and Google Drive for accessing files remotely. All the popular cloud storage providers offer file APIs for working with files in the cloud from your applications. Using such an API, a developer can implement a logical drive that works directly with files in cloud storage.
Here’s the basic structure of such a solution:
- Kernel mode driver
- User mode service
- Mounting utility (can be a simple console app)
The kernel mode driver redirects file operation requests to the user mode service, which provides the interface for processing these file operations in user mode. This approach allows a developer to abstract kernel mode development and file entities and work only with file operations.
Let’s consider each part of this file system virtualization solution in detail.
The driver implements the file system. It includes the logic for redirecting file operations and managing disks.
For our purposes, during installation this driver will create a few devices.
The first device the driver will create is a control device that’s used for disk management. It provides a developer with capabilities to mount and unmount drives.
The second device that will be created is a communication device. It implements synchronization between the driver and the user mode service. This service sends certain codes to the device to indicate its state: whether it’s going to start or stop or is ready to receive requests.
The last device that will be created is a redirector device. It catches file operation requests sent to the mounted drive and redirects them to the user mode service that must implement handlers for these operations.
When these three devices are created, the driver is configured with a handler function for request processing. For that purpose, a single function is used: FSDispatchRequest. This is the most crucial part of the driver and should be implemented carefully.
Since the driver is a file system and is not a file system filter, it must handle every file operation code itself – codes cannot be forwarded to other file systems, but can be forwarded only to the driver for the disk storage that is formatted for this file system.
Once the virtual disk driver is fully initialized, the Windows I/O manager can ask it to recognize a new volume. If the volume is not recognized by any file system, the RAW file system is assigned and the user is asked to format the disk the first time they access that drive. However, this does not happen for our virtual disk! When the disk mounting request arrives, our file system is assigned to the new volume and, from this moment on, the I/O manager sends file operation requests for the mounted disk to our driver.
When these file operation requests arrive, they’re all sent to the service for processing in user mode.
All major function codes handled by the driver (except IRP_MJ_DEVICE_CONTROL) correspond to Win32 file operations. The IRP_MJ_DEVICE_CONTROL code is designed for tasks that are not directly related to file operations. In the solution described in this article, this code is used by all devices for disk management, service synchronization, and handling of file system control codes. To determine which device a request was sent to, the driver checks the request’s header.
Requests that are sent to the communication device perform service synchronization tasks. The service sends these requests to indicate its readiness to receive file operations for further processing (IOCTL_FS_SEND_REQUEST) and to send responses regarding certain file operations (IOCTL_FS_RECEIVE_RESPONSE). It also sends signals when it is going to start (IOCTL_FS_START) or stop (IOCTL_FS_STOP and IOCTL_FS_FILE_CACHE_CONTROL).
If the request is not for the communication device, the driver checks whether it’s for the control device. The control device processes disk management requests: mounting a disk (IOCTL_FS_ADD_DISK), unmounting a disk (IOCTL_FS_DELETE_DISK), or getting a list of existing drives (IOCTL_FS_GET_DISKS).
Finally, if the request is addressed to neither the communication device nor the control device, it’s forwarded to the redirector device, which implements handling of requests for removable storage, such as refreshing directories, querying volume names, and so on.
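The routing order described above (communication device first, then control device, then redirector) can be illustrated with a small user-mode model. This is only a sketch under stated assumptions, not the driver’s actual FSDispatchRequest: the DeviceRole and Ioctl enumerations, the RouteDeviceControl function, and the returned handler labels are all hypothetical, and the enumerators are plain values rather than real Windows control codes.

```cpp
#include <string>

// Hypothetical roles for the three devices the driver creates.
enum class DeviceRole { Control, Communication, Redirector };

// Hypothetical request codes. The names mirror the IOCTLs mentioned in the
// article, but these are plain enumerators, not real Windows control codes.
enum class Ioctl {
    SendRequest, ReceiveResponse, Start, Stop, FileCacheControl,  // communication
    AddDisk, DeleteDisk, GetDisks,                                // control
    Other                                                         // anything else
};

// Returns a label naming the handler that a device-control request would be
// routed to. The check order follows the article: communication device first,
// then control device, and everything else goes to the redirector.
std::string RouteDeviceControl(DeviceRole target, Ioctl code) {
    if (target == DeviceRole::Communication) {
        switch (code) {
            case Ioctl::SendRequest:      return "service-ready-for-request";
            case Ioctl::ReceiveResponse:  return "service-response";
            case Ioctl::Start:            return "service-start";
            case Ioctl::Stop:             return "service-stop";
            case Ioctl::FileCacheControl: return "file-cache-control";
            default:                      return "invalid-request";
        }
    }
    if (target == DeviceRole::Control) {
        switch (code) {
            case Ioctl::AddDisk:    return "mount-disk";
            case Ioctl::DeleteDisk: return "unmount-disk";
            case Ioctl::GetDisks:   return "list-disks";
            default:                return "invalid-request";
        }
    }
    // Neither communication nor control: hand the request to the redirector,
    // which passes it on to the user mode service.
    return "redirect-to-service";
}
```

For example, RouteDeviceControl(DeviceRole::Control, Ioctl::AddDisk) yields "mount-disk", while any request aimed at the redirector device yields "redirect-to-service". A real driver would inspect the target device object and the control code carried by the request instead of taking them as arguments.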
The mounting tool is an application that works with a control device and allows a user to mount or unmount a drive.
The simplest mounting tool uses DiskInfo, a simple structure with parameters for mounting or unmounting a drive.
In the simplest case, it’s enough to specify a command (map or unmap) and a disk letter. Optionally, a disk label or different cache path (other than C:\cache) can be specified. If a disk label is not specified, one will be generated by the driver. The ParseArguments function parses arguments from CLI and fills in the DiskInfo structure. Then the ExecuteMapCommand/ExecuteUnmapCommand function sends the corresponding control code to the control device.
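The argument handling described above might be sketched as follows. The DiskInfo and ParseArguments names come from the article, but the field names, the Command enumeration, the argument order, and the function signature are assumptions made for illustration, and the step that sends the control code to the control device is omitted.

```cpp
#include <cctype>
#include <string>
#include <vector>

enum class Command { Map, Unmap };  // hypothetical: the two supported commands

// Hypothetical layout; the article only says DiskInfo holds mount parameters.
struct DiskInfo {
    Command command = Command::Map;
    char letter = '\0';                   // drive letter, stored upper-case
    std::string label;                    // empty: the driver generates one
    std::string cachePath = "C:\\cache";  // default cache path from the article
};

// Assumed argument order: <map|unmap> <letter> [label] [cachePath].
// Returns false and leaves `out` untouched if the arguments are malformed.
bool ParseArguments(const std::vector<std::string>& args, DiskInfo& out) {
    if (args.size() < 2 || args.size() > 4) return false;

    DiskInfo info;
    if (args[0] == "map") {
        info.command = Command::Map;
    } else if (args[0] == "unmap") {
        info.command = Command::Unmap;
    } else {
        return false;
    }

    // Accept either "X" or "X:" as the drive letter.
    const std::string& letter = args[1];
    const bool shapeOk =
        letter.size() == 1 || (letter.size() == 2 && letter[1] == ':');
    if (!shapeOk || !std::isalpha(static_cast<unsigned char>(letter[0]))) {
        return false;
    }
    info.letter = static_cast<char>(
        std::toupper(static_cast<unsigned char>(letter[0])));

    if (args.size() > 2) info.label = args[2];
    if (args.size() > 3) info.cachePath = args[3];

    out = info;
    return true;
}
```

A console entry point would build the vector from argv, call ParseArguments, and then pass the filled structure to ExecuteMapCommand or ExecuteUnmapCommand depending on the command field.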
If the driver receives an IOCTL_FS_ADD_DISK request, the FSAddDisk function is called by the driver.
This function checks whether a drive with the specified letter already exists.
It also generates a name for the volume if one is not provided by the mounting tool.
When a disk letter is checked and the volume name is prepared, a volume device and a disk device are created.
Now the disk and volume devices have been created, and the request is sent to the user mode service. The virtual disk is ready for use.
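The checks performed when adding a disk can be modeled in user mode as shown below. This is a minimal sketch, not the driver’s FSAddDisk: the DiskRegistry class, the AddDiskStatus codes, and the "VirtualDisk_" naming scheme for generated labels are all assumptions, and the creation of the volume and disk devices is reduced to recording an entry in a map.

```cpp
#include <map>
#include <string>

enum class AddDiskStatus { Ok, LetterInUse, InvalidLetter };  // hypothetical

// Hypothetical bookkeeping for mounted virtual disks, keyed by drive letter.
class DiskRegistry {
public:
    // Mirrors the described order: validate the letter, reject duplicates,
    // generate a label if none was supplied, then record the new disk.
    AddDiskStatus AddDisk(char letter, std::string label) {
        if (letter < 'A' || letter > 'Z') return AddDiskStatus::InvalidLetter;
        if (disks_.count(letter) != 0) return AddDiskStatus::LetterInUse;
        if (label.empty()) {
            // Assumed naming scheme; the article does not specify one.
            label = std::string("VirtualDisk_") + letter;
        }
        disks_[letter] = label;
        return AddDiskStatus::Ok;
    }

    // Removes a disk; returns false if the letter was not mounted.
    bool DeleteDisk(char letter) { return disks_.erase(letter) == 1; }

    // Returns the label for a mounted letter, or an empty string otherwise.
    std::string LabelOf(char letter) const {
        std::map<char, std::string>::const_iterator it = disks_.find(letter);
        return it == disks_.end() ? std::string() : it->second;
    }

private:
    std::map<char, std::string> disks_;
};
```

In the actual driver, a successful check would be followed by creating the volume and disk devices and notifying the user mode service, as described above.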
The service implements the logic for handling disk requests in user mode. It starts the DeviceRequestThreadProc thread for request processing using the QueueUserWorkItem Win32 function.
In the DeviceRequestThreadProc thread, the service indicates that it’s ready to receive a request.
Once a request has been received from the driver, it’s processed in a separate thread depending on the type of operation.
In order to process file system requests, the service provides an interface for custom implementation of all file operations. (Note that in the example below, most parameters are replaced with “…” for the sake of simplicity.)
When this interface is implemented, users are able to create a drive to work with files as desired.
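To give a feel for what implementing such an interface involves, here is a minimal sketch with a toy in-memory backend. The IFileOperations interface, its method names and parameters, and the InMemoryFiles class are all hypothetical; the service’s real interface covers many more operations and uses its own signatures.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>

// Hypothetical callback interface: one method per file operation.
class IFileOperations {
public:
    virtual ~IFileOperations() {}
    virtual bool OnCreate(const std::string& path) = 0;
    virtual bool OnRead(const std::string& path, std::size_t offset,
                        std::size_t length, std::string& out) const = 0;
    virtual bool OnWrite(const std::string& path, std::size_t offset,
                         const std::string& data) = 0;
    virtual bool OnDelete(const std::string& path) = 0;
};

// Toy backend that keeps file contents in memory. A cloud-backed drive would
// call the storage provider's API in these methods instead.
class InMemoryFiles : public IFileOperations {
public:
    // Fails if the file already exists.
    bool OnCreate(const std::string& path) override {
        return files_.insert(std::make_pair(path, std::string())).second;
    }

    // Copies up to `length` bytes starting at `offset` into `out`.
    bool OnRead(const std::string& path, std::size_t offset,
                std::size_t length, std::string& out) const override {
        std::map<std::string, std::string>::const_iterator it = files_.find(path);
        if (it == files_.end() || offset > it->second.size()) return false;
        out = it->second.substr(offset, length);
        return true;
    }

    // Overwrites bytes at `offset`, growing the file with zero bytes if needed.
    bool OnWrite(const std::string& path, std::size_t offset,
                 const std::string& data) override {
        std::map<std::string, std::string>::iterator it = files_.find(path);
        if (it == files_.end()) return false;
        std::string& content = it->second;
        if (content.size() < offset + data.size()) {
            content.resize(offset + data.size(), '\0');
        }
        content.replace(offset, data.size(), data);
        return true;
    }

    bool OnDelete(const std::string& path) override {
        return files_.erase(path) == 1;
    }

private:
    std::map<std::string, std::string> files_;
};
```

Because the service works only against the interface, swapping the in-memory backend for one that talks to cloud storage would not require any change on the driver side.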
Let’s consider the following system with only one drive.
The mounting tool in this example takes additional parameters that are required for working with the Box file API.
The result of running the FSDiskControl mounting tool is a new drive that works with files in your box.com file storage:
In this article, we’ve described a solution that allows users to implement a virtual file system in the operating system at the cost of implementing a single interface. Furthermore, no kernel mode implementation or advanced file system knowledge is required, and users can rely on any high-level libraries and solutions they like.
In the second part of this article, we’ll provide an example of a cloud service plugin (like that shown in the example in this article) and describe its implementation.