Backing up information in a secure and timely manner is the number one rule of data protection. In Windows, you can back up data with a minifilter driver. It allows you to restore data after any changes and protects backups from malicious user-mode processes, undesired encryption, and ransomware.
In this article, we discuss Windows restrictions on file management in the user space and the roles of the ntdll library, I/O request packets, and minifilters. In the second part, we provide a tutorial on how to develop a minifilter driver to back up files. You’ll also find a GitHub link to our example driver.
This article will be useful for Windows driver developers who want insights into Windows file management and want to back up data without interference from user space processes.
In our tutorial, we show you how to securely save critical data. By “securely,” we mean that you can recover this data even if:
- the files are deleted or damaged
- the file system is damaged (e.g. a directory is formatted or encrypted)
To achieve this, we need to isolate data from user space processes such as those launched by encryption ransomware or backup blocking software.
We also need to decide when we want to back up data and choose the format and storage location before we start backing it up.
The right time to back up data
The first option for planning backups is using a timer. This solution has several drawbacks:
- The ABA problem. Between two backups, a file can be created and deleted, or it can be replaced by another file with the same name. In this case, we’ll have no record of what happened to the file between the control points set by the timer.
- Interprocess synchronization. It’s hard to guarantee that a file wasn’t changed while being saved.
Therefore, it’s best to detect modifications and save altered data in response to changes. It’s also important to minimize the interference of third-party code before data is backed up in its present state.
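The difference between the two approaches can be shown with a small user-mode model. This is only a sketch: the FileEvent type and both functions are hypothetical names invented for the illustration, and no real file system is involved. The timer-based function compares periodic snapshots, while the change-driven one counts every modification.

```c
#include <assert.h>
#include <string.h>

/* Toy model: one file whose content changes over time. */
typedef struct {
    int tick;             /* when the change happened */
    const char *content;  /* content after the change; NULL means "deleted" */
} FileEvent;

/* Content visible at a given tick: the last event at or before it.
   Events must be sorted by tick. */
static const char *content_at(const FileEvent *ev, size_t n, int tick)
{
    const char *current = NULL;
    for (size_t i = 0; i < n && ev[i].tick <= tick; i++)
        current = ev[i].content;
    return current;
}

/* Timer-based backup: count snapshots that differ from the previous one. */
static size_t timer_backup_changes(const FileEvent *ev, size_t n,
                                   int period, int end_tick)
{
    assert(period > 0);
    size_t changes = 0;
    const char *prev = content_at(ev, n, 0);
    for (int t = period; t <= end_tick; t += period) {
        const char *now = content_at(ev, n, t);
        int same = (prev == NULL && now == NULL) ||
                   (prev != NULL && now != NULL && strcmp(prev, now) == 0);
        if (!same)
            changes++;
        prev = now;
    }
    return changes;
}

/* Change-driven backup: every modification after the initial state is recorded. */
static size_t change_driven_records(const FileEvent *ev, size_t n)
{
    size_t records = 0;
    for (size_t i = 0; i < n; i++)
        if (ev[i].tick > 0)
            records++;
    return records;
}
```

If a file is deleted, replaced, and restored to its original content between two timer ticks, the timer-based function reports no change at all, while the change-driven one records each step.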
Backup data format
In our example, we’ll save the contents of a file after each modification. You can develop a more complex solution by saving the original file once and then storing only the changes in diff files. This will save disk space, but the diff files will be useless if the original baseline copy is lost.
Storage for backups
Since encryption ransomware is a threat, we need to isolate copied data from its original version by storing copies outside the original file system. This will prevent access by random processes. Therefore, we won’t store backups on the same logical disk as the source data.
We can also create an SDK for user programs that require access to saved data. In this tutorial, we’ll create a basic SDK to illustrate how it works with the driver, but we’ll only use it to create a partition to store backups.
These challenges and solutions require us to operate in the Windows kernel space. But before developing a backup driver, we need to investigate some non-obvious limitations and special features of Windows file management.
Before starting to develop our minifilter driver, let’s discuss the basic principles of Windows file management. This theoretical background will be useful for driver development rookies, as it teaches you how:
- Windows manages user processes
- the core library of the Windows API operates
- a function call is executed in the kernel space
- input/output (I/O) request packets (IRPs) help to manage drivers and driver stacks
If you’ve already developed Windows drivers, you can skip this part of the article and move on to our driver development tutorial.
User process limitations
The most obvious way to protect data from user space processes that may include ransomware is to isolate that data from the Windows operating system. To do this, you need to develop an application that doesn’t communicate with Windows while running. How can you do that? It’s impossible for several reasons.
Any process you run has to inform the operating system about its termination. Moreover, some virtual memory management processes are executed by the kernel and hidden from the user.
If we examine how our code works, we’ll find that we can work only with the contents of the process’s address space. Moreover, we’ll be limited to the stack of the main thread, as anything else requires dynamic memory allocation, which is impossible without calling the operating system. What we can do is call pure functions that don’t require the Windows API, evaluate conditions, and carry out calculations.
An application developed with such severe limitations turns into a black box. A user can’t get any results from it. The only possible application for such code is increasing the load on the processor and collecting metrics.
Therefore, we have no choice but to call the Windows API to back up data. Let’s see how this API works and manages calls.
All roads lead to the ntdll library
The Windows API is distributed across lots of dynamic libraries, but in fact, most calls lead to code executing in the ntdll library. It’s loaded in every running process and is the only way to access the operating system kernel from the user space.
At the same time, Microsoft documentation recommends avoiding direct calls to subprograms of this library and doesn’t provide a description of most ntdll features. Instead, they recommend using wrapper libraries such as kernel32.dll for accessing the kernel.
However, if we take a closer look at the core code of ntdll subprograms, we’ll see that most of them contain only a few instructions similar to these:
Here’s a concise description of how ntdll subprograms operate:
- We load the number of the required system call into the AX register and execute the syscall instruction, which interrupts user-mode execution and switches the processor to kernel mode.
- The processor finds the corresponding handler (in this case, the system call dispatcher) and passes control to it.
The handler’s job is to:
- get the system call code from the AX register
- save the current state of the calling thread (this is required to return control to it after the system call is handled)
- delegate control to the handler of the specific call (in our example, 0x55 code reflects the CreateFile call)
The rest of the call execution happens in the kernel space.
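The handler's three steps can be modeled in ordinary user-mode C. This is a sketch of the idea only, not the real kernel dispatcher: CpuState, service_table, and every MODEL_ name are hypothetical, and only the number 0x55 comes from the example above.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_SYSCALL_CREATE_FILE 0x55u   /* number taken from the example above */
#define MODEL_TABLE_SIZE          0x100u
#define MODEL_STATUS_SUCCESS      0
#define MODEL_STATUS_INVALID_CALL (-1)
#define MODEL_STATUS_BAD_ARGUMENT (-2)

typedef int (*ServiceRoutine)(const char *arg);

/* Stands in for the caller's registers at the moment of the system call. */
typedef struct {
    unsigned ax;      /* system call number */
    const char *arg;  /* single simplified argument */
} CpuState;

static int model_create_file(const char *path)
{
    return path != NULL ? MODEL_STATUS_SUCCESS : MODEL_STATUS_BAD_ARGUMENT;
}

/* One handler per system call number; unused slots stay NULL. */
static ServiceRoutine service_table[MODEL_TABLE_SIZE] = {
    [MODEL_SYSCALL_CREATE_FILE] = model_create_file,
};

static CpuState saved_state;  /* where the dispatcher keeps the caller's state */

static int dispatch_system_call(const CpuState *caller)
{
    assert(caller != NULL);
    unsigned number = caller->ax;   /* 1. read the call number */
    saved_state = *caller;          /* 2. save caller state for the return */
    if (number >= MODEL_TABLE_SIZE || service_table[number] == NULL)
        return MODEL_STATUS_INVALID_CALL;
    return service_table[number](caller->arg);  /* 3. delegate to the handler */
}
```

A call with ax set to 0x55 reaches model_create_file, while a number with no registered handler is rejected before any handler runs.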
Moving to the kernel space
For our example, we’ll call CreateFile. This function is described in detail in Windows documentation and tutorials. In general, the CreateFile execution thread looks like this:
The interrupt handler (ntkrnlmp.exe) is located in the operating system kernel. When a system call is recognized, the object manager starts handling it. The object manager is responsible for parsing the path name passed to the call and resolving it to the device object in the kernel whose drivers need to handle the request.
After that, the I/O manager takes over. It interacts with device drivers by creating and passing them a data structure called the IRP.
IRP, or there and back again
In Windows, anything can be a device object, from real devices to operating system–level abstractions such as file systems. Device object logic is implemented by drivers associated with the device. Each driver registers processing functions for any actions with the device (create, read, write, etc).
To start working with a device, the I/O manager creates an IRP and calls the first driver with the IoCallDriver function. An IRP is simply a data structure that contains information accessible to drivers. IoCallDriver finds the required function in the driver’s dispatch table and calls it. After the driver executes the function, it performs one of three actions:
- Calls IoCallDriver and passes a request to the next driver in the stack. File system filters built on the scheme above behave this way.
- Calls IoCompleteRequest and finishes the call execution. The call returns through the driver stack to the I/O manager.
- Returns the call with the STATUS_MORE_PROCESSING_REQUIRED status. In this case, the I/O manager does nothing with this IRP. We’ll take a closer look at this scenario later.
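The first two outcomes can be pictured with a user-mode model of a driver stack. This sketch does not use the real kernel structures: Device, Irp, and call_driver are simplified, hypothetical stand-ins for a device object, an IRP, and IoCallDriver, and the third outcome is left out.

```c
#include <assert.h>
#include <stddef.h>

enum { MJ_CREATE, MJ_READ, MJ_WRITE, MJ_COUNT };  /* model request types */

typedef struct Device Device;

typedef struct {
    int major;      /* which operation is requested */
    int completed;  /* set once a driver finishes the request */
    int hops;       /* how many drivers have seen the request */
} Irp;

typedef int (*Dispatch)(Device *dev, Irp *irp);

struct Device {
    Dispatch dispatch[MJ_COUNT];  /* per-driver table of handlers */
    Device *lower;                /* next device down the stack */
};

/* Model of IoCallDriver: pick the handler for this request type and call it. */
static int call_driver(Device *dev, Irp *irp)
{
    assert(dev != NULL && irp != NULL);
    assert(irp->major >= 0 && irp->major < MJ_COUNT);
    irp->hops++;
    return dev->dispatch[irp->major](dev, irp);
}

/* Outcome 1: a filter inspects the request and passes it down the stack. */
static int filter_pass_down(Device *dev, Irp *irp)
{
    return call_driver(dev->lower, irp);
}

/* Outcome 2: the bottom driver finishes the request. */
static int fs_complete(Device *dev, Irp *irp)
{
    (void)dev;
    irp->completed = 1;
    return 0;
}
```

With a filter device stacked on top of a file system device, a write request sent to the filter is seen by two drivers and comes back completed.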
Minifilters instead of legacy file system filters
You can filter traffic associated with a device using the legacy file system filter execution shown in Figure 1. This relies on a simple driver. We can also perform the same filtering with an operating system abstraction called a minifilter. Windows documentation recommends using minifilters instead of outdated filter drivers and provides guidelines on how to port old filters to minifilters.
A minifilter is a model of a file system filter driver. Using minifilters, we can both process common operations such as create, read, and write and register pre- and post-operation callbacks for the operations we need.
Minifilters are implemented with a filter manager driver. It manages minifilters and passes calls to them. Here’s an example of filter manager operations:
With this knowledge in mind, we can start developing our own minifilter to back up data.
Our solution consists of two parts: the FileMon.sys driver and Manager.exe to manipulate the driver. The application is limited to assigning the logical disk where the driver stores backups, namely each file’s name and content.
Our solution will work this way:
- The driver stores data on a separate logical disk.
- When a file in C:\storage is changed, the name of the file and its new content are recorded on that separate disk.
- When a file is renamed, the driver notices it and saves new backups of this file using the new name.
- An application in the user space can assign the driver to a logical disk.
Important! Any mistake in driver development may result in a blue screen of death (BSOD) or unstable system operation.
Using the minifilter driver, we can filter all file system traffic and handle each IRP. We need to look out for packets such as these:
- IRP_MJ_CREATE and IRP_MJ_SET_INFORMATION for monitoring file access and renaming files
- IRP_MJ_WRITE for modifying the contents of each file that was accessed or renamed
When our driver receives the IRP_MJ_CREATE packet, it needs to remember that this file was accessed and be able to identify operations that modify it. To do that, the driver has to attach a user-defined context to the objects associated with the IRP. We can define the context structure this way:
Then, we can create the context with the following code:
After that, we get the information we need:
Since we also analyze changes to the file, we can simply save the reference to a file buffer while pre-processing the packet and write a log while post-processing it. Here’s how it looks:
Next, we save the reference to the buffer:
Finally, we use the recorded reference:
By now, we have developed a method for identifying files and recording their modification. Let’s take a look at the issues you can face.
Backing up and caching
Reading and writing data to a hard drive are time-consuming processes. Writing a structure to a hard drive is much slower than writing the same structure to random-access memory (RAM). Also, the writing process is slowed down because it’s only possible to write fixed-size blocks to a disk.
If we need to change several bytes of a file, we’ll have to read and rewrite a whole block. Moreover, several processes can work with the same file. To speed up these operations, modern operating systems cache files in RAM.
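The cost of block granularity is easy to compute. The sketch below assumes a hypothetical blocks_for_write helper and takes the block size as a parameter, since the real value depends on the disk and file system.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint64_t first_block;    /* index of the first block touched */
    uint64_t block_count;    /* number of blocks that must be rewritten */
    uint64_t bytes_on_disk;  /* block_count * block_size */
} BlockSpan;

/* Which fixed-size blocks does a write of `length` bytes at `offset` touch? */
static BlockSpan blocks_for_write(uint64_t offset, uint64_t length,
                                  uint64_t block_size)
{
    assert(block_size > 0 && length > 0);
    BlockSpan span;
    uint64_t last_byte = offset + length - 1;
    span.first_block = offset / block_size;
    span.block_count = last_byte / block_size - span.first_block + 1;
    span.bytes_on_disk = span.block_count * block_size;
    return span;
}
```

With 512-byte blocks, changing 4 bytes at offset 1000 rewrites one whole block, and the same 4 bytes at offset 510 straddle a boundary and rewrite two blocks.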
It’s worth noting that the Windows cache manager operates at the file level (an operating system-level abstraction). That’s why we don’t need to think about the hard drive: we need to think only about the files we want to back up. Modified files are transformed into blocks only when the cache is flushed to the hard drive. Flushing happens either on command from the manager or when the cache limit is reached.
In our solution, it’s easy to ignore caching for two reasons:
- Cache files mostly contain temporary changes we don’t need to back up.
- We need to keep backup files in line only with data recorded to the disk.
That’s why we’ll intercept only the packets responsible for writing data to the disk. We can implement this with the MiniFilter_PreWrite condition:
If this condition is met, we’ve intercepted a write to the file system cache, which we don’t need. Otherwise (in the else branch), we’ve intercepted a flush of cached file data to the disk, and that’s exactly the data we need to back up.
However, we can’t simply skip working with the cache because if we do, we’ll have no backups of the small files we need to save.
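One common way to tell these two cases apart is to look at the flags of the write request. The sketch below is an assumption about how such a check might look, not the condition used in the demo driver. The flag values match those in wdm.h and are repeated so the code builds without the WDK; ModelIoParameterBlock is a hypothetical stand-in for the parameter block that a minifilter reads through Data->Iopb->IrpFlags.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values as defined in wdm.h, repeated so the sketch builds without the WDK. */
#define IRP_NOCACHE               0x00000001u
#define IRP_PAGING_IO             0x00000002u
#define IRP_SYNCHRONOUS_PAGING_IO 0x00000040u

/* Hypothetical stand-in for the I/O parameter block of a write request. */
typedef struct {
    uint32_t IrpFlags;
} ModelIoParameterBlock;

typedef enum { WRITE_TO_CACHE, WRITE_TO_DISK } WriteKind;

/* A write with none of these flags only updates the cache; a non-cached or
   paging write carries data that is on its way to the disk. */
static WriteKind classify_write(const ModelIoParameterBlock *iopb)
{
    assert(iopb != NULL);
    const uint32_t disk_bound =
        IRP_NOCACHE | IRP_PAGING_IO | IRP_SYNCHRONOUS_PAGING_IO;
    return (iopb->IrpFlags & disk_bound) ? WRITE_TO_DISK : WRITE_TO_CACHE;
}
```

Under this assumption, a pre-write callback would skip requests classified as WRITE_TO_CACHE and back up only those classified as WRITE_TO_DISK.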
MFT and small files
The master file table (MFT) is the core file of NTFS. This file fully describes the file system and may be located anywhere on the disk. Each record in the MFT describes a file or a directory and has a fixed size of 1 KB. A record consists of file attributes (name and contents) and a list of disk addresses where the file is stored.
If the file attributes can be stored in the record, NTFS will store the file directly in the MFT instead of on the disk. We can fix this issue by changing the file size in such a way that the file system will have to allocate disk space for those files. This can be done with MiniFilter_PreWrite:
Now we’re sure that even the smallest files with valuable data will be backed up.
Let’s suppose we noticed that a file was successfully saved to the disk and we’re now operating in the body of MiniFilter_PostWrite. We need to back up the new data by writing it to another place on the disk before we allow a user process to manage the file. We need to send an IRP down the driver stack once again, but this time we’ll create the IRP ourselves.
Here’s what we need to do:
- Call the IoAllocateIrp function to create the IRP and set up the parameters we need. If needed, we can allocate additional memory in the memory descriptor list (MDL).
- Register a callback using the IoSetCompletionRoutine function. This callback will be called after a driver lower in the stack completes the packet with IoCompleteRequest.
- Send the IRP down by calling the IoCallDriver function. After that, handling of the IRP continues in the callback body.
- Free MDLs associated with the packet and delete the packet by calling the IoFreeIrp function.
- Return STATUS_MORE_PROCESSING_REQUIRED to the I/O manager. Despite the misleading name, this status stops any further activity with the packet. Also, by this point, the packet doesn’t exist.
Once this process is finished, new data has been successfully backed up.
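The ownership rule behind these steps can be modeled in user mode: whoever allocates the packet frees it in the completion callback and returns STATUS_MORE_PROCESSING_REQUIRED so that nobody touches the packet afterwards. This is a sketch only. The model_ functions are hypothetical stand-ins for IoAllocateIrp, IoSetCompletionRoutine, IoCompleteRequest, and IoFreeIrp, and the two status values are the real NTSTATUS codes repeated so the code builds without the WDK.

```c
#include <assert.h>
#include <stdlib.h>

#define STATUS_SUCCESS                  0x00000000u
#define STATUS_MORE_PROCESSING_REQUIRED 0xC0000016u

typedef struct Irp Irp;
typedef unsigned (*CompletionRoutine)(Irp *irp, void *context);

struct Irp {
    CompletionRoutine completion;
    void *context;
};

static int live_irps = 0;  /* lets the caller check that nothing leaked */

static Irp *model_allocate_irp(void)  /* stands in for IoAllocateIrp */
{
    Irp *irp = calloc(1, sizeof *irp);
    if (irp != NULL)
        live_irps++;
    return irp;
}

static void model_free_irp(Irp *irp)  /* stands in for IoFreeIrp */
{
    assert(irp != NULL);
    free(irp);
    live_irps--;
}

static void model_set_completion(Irp *irp, CompletionRoutine routine,
                                 void *context)
{
    assert(irp != NULL);
    irp->completion = routine;
    irp->context = context;
}

/* Stands in for a lower driver completing the request. Returns 1 if the
   completer may keep using the packet and 0 if it must not touch it. */
static int model_complete_request(Irp *irp)
{
    if (irp->completion != NULL &&
        irp->completion(irp, irp->context) == STATUS_MORE_PROCESSING_REQUIRED)
        return 0;  /* the owner took the packet back; it may already be freed */
    return 1;
}

/* Completion callback of the backup write: record the result, free the packet. */
static unsigned backup_write_done(Irp *irp, void *context)
{
    *(int *)context = 1;  /* signal "backup finished" */
    model_free_irp(irp);
    return STATUS_MORE_PROCESSING_REQUIRED;
}
```

After model_complete_request returns 0, the packet has been freed by its owner, which is why the status effectively means "stop processing" for everyone else.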
Despite having rather specific code for a minifilter, our solution is still a driver. We can register a handler to manage the driver with messages sent over the Device Input and Output Control (IOCTL) protocol. To do that, we need to define the IOCTL_SET_STORAGE_VOLUME command to install a logical disk. We’ll call this command from the user space.
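An IOCTL command is a 32-bit code that packs the device type, access rights, function number, and transfer method. The sketch below reproduces the bit layout of the SDK's CTL_CODE macro so that it builds anywhere. The function number 0x800 and the choice of FILE_DEVICE_UNKNOWN, METHOD_BUFFERED, and FILE_ANY_ACCESS are assumptions made for the illustration, and the demo driver may define the command differently.

```c
#include <assert.h>
#include <stdint.h>

/* Values as defined in the Windows SDK headers, repeated for portability. */
#define FILE_DEVICE_UNKNOWN 0x00000022u
#define METHOD_BUFFERED     0u
#define FILE_ANY_ACCESS     0u

/* Same bit layout as the SDK's CTL_CODE macro. */
#define CTL_CODE(DeviceType, Function, Method, Access) \
    (((DeviceType) << 16) | ((Access) << 14) | ((Function) << 2) | (Method))

/* Hypothetical definition: the function number 0x800 is an assumption. */
#define IOCTL_SET_STORAGE_VOLUME \
    CTL_CODE(FILE_DEVICE_UNKNOWN, 0x800u, METHOD_BUFFERED, FILE_ANY_ACCESS)

/* Build a vendor-defined code; function numbers below 0x800 are reserved
   for Microsoft, and the field is 12 bits wide. */
static uint32_t make_custom_ioctl(uint32_t function)
{
    assert(function >= 0x800u && function <= 0xFFFu);
    return CTL_CODE(FILE_DEVICE_UNKNOWN, function,
                    METHOD_BUFFERED, FILE_ANY_ACCESS);
}

/* Extract fields back out of a code, as a driver's handler might do. */
static uint32_t ioctl_function(uint32_t code)    { return (code >> 2) & 0xFFFu; }
static uint32_t ioctl_device_type(uint32_t code) { return code >> 16; }
```

With these assumptions, the command evaluates to 0x222000. Both the driver and the user-space application must be compiled with the same definition.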
In the user space, we simply need to use the API to open the object with which the driver is associated and send the command to it by calling the DeviceIoControl function.
The handler is described in the driver with the following code:
In the user space, creating the handler looks like this:
When the handler is ready, it means our minifilter driver is ready as well! Let’s launch it and see how it operates.
You’ll find the demo of our minifilter on GitHub.
Note: It’s better to test the driver on a virtual machine.
The driver is built from the FileMon.sln solution. To launch it, you have to:
- Enable test signing mode in Windows.
- Right-click on the .inf file and choose Install.
- Run sc start filemon as an administrator.
Then, you need to use Manager.sln to build Manager.exe and set up the backup volume:
- Prepare the environment and add the free volume that doesn't have a file system.
- Enumerate the volumes by running Manager.exe enum_volumes and find the number of this volume.
- Initialize the driver by running Manager.exe set_volume <put-your-number-here>.
- Create the folder C:\storage.
- Add files and save some data to this folder. Currently, the driver monitors only records and ignores empty files.
- Use the disk viewer to view the log (we used HxD).
If the driver works correctly, you’ll see records such as these:
In this article, we’ve discussed aspects of Windows file management components and minifilter drivers. We’ve also provided an example of how to use this knowledge to create a native backup implementation in a protected directory. The demo of our driver is here.
With this driver, you can back up valuable data after each modification and protect it from tampering, unwanted encryption, malicious user processes, and ransomware. Additionally, you can easily add functionality such as interactions between the driver and the manager by registering new operations. For example, it’s possible to change feature implementation to back up data without rewriting the driver.
At Apriorit, we have a dedicated Windows driver development team that has successfully implemented dozens of complex projects. If you have a challenging project in mind, feel free to contact us and start a discussion!