Understanding Win32 Reparse Points
Reparse points are an often used but seldom understood feature of Windows. At the time of writing, reparse points are only available on NTFS, but they could be made available on future file systems as well.
Reparse points allow us to do things such as mounting a disk volume to a folder, or making a folder point to some other folder (useful for managing disk space in some situations, or for creating a contiguous file tree out of non-contiguous directories).
However, using these features without understanding the subtleties of how they work can easily lead to mistakes that directly impact your ability to maintain the file system, perform backups, manage security, etc.
In this article we'll explore the basics of what a reparse point is, and then focus on understanding the reparse points used for Junctions and Volume Mount Points.
A Junction is a reparse point used to point one directory to another.
A Volume Mount Point is a reparse point used to point a directory to a disk volume, so the disk volume is mounted to that directory.
What is a Reparse Point
The MSDN reparse point page describes a reparse point as follows:
A file or directory can contain a reparse point, which is a collection of user-defined data. The format of this data is understood by the application which stores the data, and a file system filter, which you install to interpret the data and process the file. When an application sets a reparse point, it stores this data, plus a reparse tag, which uniquely identifies the data it is storing. When the file system opens a file with a reparse point, it attempts to find the file system filter associated with the data format identified by the reparse tag. If a file system filter is found, the filter processes the file as directed by the reparse data. If a file system filter is not found, the file open operation fails
To rephrase this, a reparse point is a special file or folder that, instead of containing the usual data, contains two special pieces of data:
- A reparse point tag
- The user-defined data
This data is interpreted by a special program called a “file system filter driver”.
When the Windows OS is walking down the file system (parsing the file system) and encounters a reparse point, it uses the reparse point tag to determine which file system filter to use, and then passes the user defined data to that filter to process it. This is where the term “reparse” comes from. The OS parses the data once, and determines it's a reparse point, then the file system filter re-parses the data to get the “real” data or the desired result.
In the case of a Junction or a Volume Mount Point, the file system filter is provided by Microsoft, and is part of the Windows operating system.
Reparse Point Tags
The reparse point tag is an identifier, registered with Microsoft, that allows the OS to determine which file system filter driver to use to interpret a given reparse point.
The MSDN page for reparse point tags lays out the structure of the tag in a straightforward manner, and we will not repeat it here.
Only two fields in the tag are of real interest to us: the first (most significant) bit indicates whether the tag is a Microsoft tag, and the last (low-order) 16 bits uniquely identify the tag type.
However, for our purposes, we really only need to consider the tag as a whole. When writing a program that examines reparse points, you will typically compare the entire tag value, not individual fields in the tag.
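Even though whole-tag comparison is the norm, it can help to see the two fields pulled apart once. The following is a minimal sketch in Python, not code from any Microsoft header. The function name `decode_reparse_tag` is made up for illustration. The Microsoft bit (bit 31) and the low 16-bit tag value are the fields described above; the "name surrogate" bit (bit 29) is an extra assumption taken from the `IsReparseTagNameSurrogate` macro in winnt.h.

```python
# Hypothetical helper, for illustration only.
MICROSOFT_BIT = 0x80000000       # most significant bit: tag is owned by Microsoft
NAME_SURROGATE_BIT = 0x20000000  # assumption: bit 29, per IsReparseTagNameSurrogate
TAG_VALUE_MASK = 0x0000FFFF      # low 16 bits: the tag type


def decode_reparse_tag(tag: int) -> dict:
    """Split a 32-bit reparse tag into the fields discussed in the text."""
    tag &= 0xFFFFFFFF  # treat the value as an unsigned 32-bit number
    return {
        "is_microsoft": bool(tag & MICROSOFT_BIT),
        "is_name_surrogate": bool(tag & NAME_SURROGATE_BIT),
        "tag_value": tag & TAG_VALUE_MASK,
    }


if __name__ == "__main__":
    print(decode_reparse_tag(0xA0000003))
```

In real code you would still compare the whole tag value; this breakdown is only useful for understanding what a tool such as fsutil reports.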
Microsoft defines these tag types in winnt.h:
- IO_REPARSE_TAG_DFS
- IO_REPARSE_TAG_DFSR
- IO_REPARSE_TAG_HSM
- IO_REPARSE_TAG_HSM2
- IO_REPARSE_TAG_MOUNT_POINT
- IO_REPARSE_TAG_SIS
- IO_REPARSE_TAG_SYMLINK
The tag IO_REPARSE_TAG_MOUNT_POINT (0xA0000003L) is the tag that identifies both Junctions and Volume Mount points, and is what we will focus on for the rest of the article.
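Comparing the entire tag value amounts to a simple lookup. The sketch below is a hedged illustration: only the value of IO_REPARSE_TAG_MOUNT_POINT comes from this article, while the other numeric values are quoted from memory of winnt.h and should be checked against your own SDK headers before you rely on them. The names `REPARSE_TAGS`, `tag_name`, and `is_mount_point` are hypothetical.

```python
# Tag values as recalled from winnt.h; verify against your SDK headers.
REPARSE_TAGS = {
    0x8000000A: "IO_REPARSE_TAG_DFS",
    0x80000012: "IO_REPARSE_TAG_DFSR",
    0xC0000004: "IO_REPARSE_TAG_HSM",
    0x80000006: "IO_REPARSE_TAG_HSM2",
    0xA0000003: "IO_REPARSE_TAG_MOUNT_POINT",  # value given in the article
    0x80000007: "IO_REPARSE_TAG_SIS",
    0xA000000C: "IO_REPARSE_TAG_SYMLINK",
}

IO_REPARSE_TAG_MOUNT_POINT = 0xA0000003


def tag_name(tag: int) -> str:
    """Return the symbolic name for a whole tag value, or 'unknown'."""
    return REPARSE_TAGS.get(tag & 0xFFFFFFFF, "unknown")


def is_mount_point(tag: int) -> bool:
    """True for the tag shared by Junctions and Volume Mount Points."""
    return (tag & 0xFFFFFFFF) == IO_REPARSE_TAG_MOUNT_POINT


if __name__ == "__main__":
    print(tag_name(0xA0000003), is_mount_point(0xA0000003))
```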
The “User Defined Data”
The data in the reparse point is whatever the filter driver needs to be able to find the desired data.
The data is returned in one of two different structs, depending on whether the reparse point is a Microsoft reparse point or not (remember that the first bit in the tag will be a 1 if it is a Microsoft reparse point).
If it is a Microsoft tag, the struct will be a REPARSE_DATA_BUFFER; otherwise it will be a REPARSE_GUID_DATA_BUFFER.
In the case of IO_REPARSE_TAG_MOUNT_POINT reparse points, the PathBuffer field on the REPARSE_DATA_BUFFER struct identifies the target pointed to by the junction or volume mount point.
If it is a junction, it will have a path such as:
\??\D:\MyTargetFolder
If it is a volume mount point, it will contain the volume ID such as:
\??\Volume{e124abc3-1234-5678-0ab1-e32f863291ab}\
One curiosity to note is the “\??\” prefix on the substitute name paths. We'll discuss this later on.
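Since both kinds of reparse point share one tag, the substitute name is what tells them apart. Here is a minimal sketch of that distinction, assuming the two path shapes shown above are the only ones you expect to meet. The function name `classify_mount_point_target` is hypothetical, and the check is a heuristic based on the examples, not an official rule.

```python
NT_PREFIX = "\\??\\"  # the "\??\" prefix seen on substitute names


def classify_mount_point_target(substitute_name: str) -> str:
    """Guess whether a substitute name belongs to a junction or a volume mount point.

    Heuristic only: a target of the form \\??\\Volume{...}\\ is treated as a
    volume mount point, any other \\??\\-prefixed path as a junction.
    """
    if not substitute_name.startswith(NT_PREFIX):
        return "unknown"
    rest = substitute_name[len(NT_PREFIX):]
    if rest.lower().startswith("volume{"):
        return "volume mount point"
    return "junction"


if __name__ == "__main__":
    print(classify_mount_point_target("\\??\\D:\\MyTargetFolder"))
```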
Viewing a Reparse Point
When working with a reparse point, you will want to be able to examine the tag and the reparse data. However, you can't just double-click the reparse point in Windows Explorer, as that will reparse it and open the target.
There are various ways to view the contents of a reparse point, but most are dependent on the type of reparse point you are looking at.
The simplest way to see the type and user-defined data of any reparse point is to use fsutil.
The exact command is:
fsutil reparsepoint query path
where path is the path of the reparse point.
For example, if I create a junction c:\testjunctions\source that points to c:\testjunctions\target, and query the reparse point with fsutil, I see the following output:
C:\testjunctions>fsutil reparsepoint query c:\testjunctions\source
Reparse Tag Value : 0xa0000003
Tag value: Microsoft
Tag value: Name Surrogate
Tag value: Mount Point
Substitue Name offset: 0
Substitue Name length: 54
Print Name offset: 56
Print Name Length: 0
Substitute Name: \??\c:\testjunctions\target
Reparse Data Length: 0x00000042
Reparse Data:
0000: 00 00 36 00 38 00 00 00 5c 00 3f 00 3f 00 5c 00 ..6.8...\.?.?.\.
0010: 63 00 3a 00 5c 00 74 00 65 00 73 00 74 00 6a 00 c.:.\.t.e.s.t.j.
0020: 75 00 6e 00 63 00 74 00 69 00 6f 00 6e 00 73 00 u.n.c.t.i.o.n.s.
0030: 5c 00 74 00 61 00 72 00 67 00 65 00 74 00 00 00 \.t.a.r.g.e.t...
0040: 63 00 c.
As you can see, fsutil nicely breaks down both the tag fields and the data in the REPARSE_DATA_BUFFER for us. It's an essential tool to use when testing any of your code for working with reparse points.
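To check our reading of the dump, we can decode the "Reparse Data" bytes by hand. The sketch below assumes the mount point layout implied by the fsutil output: four little-endian 16-bit fields (substitute name offset and length, print name offset and length), followed by a path buffer of UTF-16 text, with the offsets counted from the start of that buffer. The names `parse_mount_point_data` and `SAMPLE` are made up for illustration; the bytes are copied from the dump above.

```python
import struct

# Reparse data bytes copied from the fsutil hex dump above.
SAMPLE = bytes.fromhex(
    "00 00 36 00 38 00 00 00 5c 00 3f 00 3f 00 5c 00"
    "63 00 3a 00 5c 00 74 00 65 00 73 00 74 00 6a 00"
    "75 00 6e 00 63 00 74 00 69 00 6f 00 6e 00 73 00"
    "5c 00 74 00 61 00 72 00 67 00 65 00 74 00 00 00"
    "63 00"
)


def parse_mount_point_data(data: bytes) -> dict:
    """Decode the four header fields and the two names from mount point data."""
    # Four unsigned 16-bit little-endian values form the 8-byte header.
    sub_off, sub_len, print_off, print_len = struct.unpack_from("<4H", data, 0)
    path_buffer = data[8:]  # offsets are relative to the start of this buffer
    substitute = path_buffer[sub_off:sub_off + sub_len].decode("utf-16-le")
    print_name = path_buffer[print_off:print_off + print_len].decode("utf-16-le")
    return {
        "substitute_name_offset": sub_off,
        "substitute_name_length": sub_len,
        "print_name_offset": print_off,
        "print_name_length": print_len,
        "substitute_name": substitute,
        "print_name": print_name,
    }


if __name__ == "__main__":
    for key, value in parse_mount_point_data(SAMPLE).items():
        print(f"{key}: {value}")
```

The decoded fields should match what fsutil printed: a substitute name 54 bytes long at offset 0, and an empty print name at offset 56.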
(to be continued)
