To create new products, it’s important to have standardized and reliable components. For example, home builders build houses using bricks of a standard size. In programming, we have development standards. One of these standards is the Component Object Model (COM), created by Microsoft.
COM effectively solves the problem of code reuse, but the implementation of some of its functions isn’t clear. For example, when developing data leak protection systems, you may need to get a COM server process ID (PID) to check how processes handle sensitive data. The documentation from Microsoft doesn’t provide an explicit way to do this, so we decided to share our experience. In this article, we explain how COM servers work and show three different ways to get a COM server’s PID.
This guide will be useful both for those who want to learn more about the infrastructure of COM servers and for specialists who have trouble getting PIDs in practice.
Code reusability is a priority in software development. Usually, software is developed using a specific programming language and can only be used effectively if other components are developed using the same language. COM provides developers with constituent modules, which must work in a variety of environments.
As a platform-independent, object-oriented technology standard, the Component Object Model enables the development of binary components. Subsequently, these components can be used both locally and in a distributed network environment. The main purpose of COM is to provide means by which objects and components written in various programming languages can interoperate with no changes to the executable code. According to Microsoft’s documentation, objects that provide services to clients are called COM servers. Services are represented as implementations of COM interfaces, which can be called by any client that can get a pointer to one of the interfaces on the server object.
The ability to interact transparently with objects is built into COM by design. An object can run in the same process, on the same machine, or on another machine, but there will always be a single programming model for it, regardless of its type. This feature is called location transparency.
Location transparency means that for COM users, it doesn’t matter where the COM server is located. For a user, it all works the same. But if the COM server runs in another process and the user needs its PID, location transparency by design gives the user no way to get it. Let’s look at how we can deal with this problem.
Getting a process ID is a rather specific task. Nevertheless, there are some cases where it’s necessary. For example, a PID is needed for system monitoring tools, which should provide information about processes running within the system, including data on processes with parent–child relationships. COM servers don’t provide this information, although it would be useful to know which process initiated the server’s creation. Moreover, there’s no easy or obvious way to hook such processes, as there are no parent–child relationships between them and the original process. The way out is to track the creation of COM servers using the methods we’ll introduce below.
In addition, obtaining a PID is particularly important for all kinds of Data Loss Prevention (DLP) systems. In a DLP system, it’s necessary to control how processes handle sensitive documentation. Usually, this is done by hooking the process. Sometimes, processes generate child processes or COM processes, and they also need to be controlled to prevent data leakage. To hook a COM process, you need to know its PID. This is exactly the problem we encountered while working on one of our projects.
In one of our projects, we worked with an application in which WinAPI hooks for functions like ReadFile, WriteFile, and CreateFile handle encrypted files. Files are stored encrypted on the disk, and when the app interacts with a file, the hooks decrypt or encrypt the contents on the fly.
The application doesn’t know that the file is encrypted and uses a simple API to interact with it. Sometimes, the app launches child processes that also work with the encrypted files. This situation can be handled by hooking the CreateProcess or CreateProcessAsUser functions and reading the PID of the child process from the PROCESS_INFORMATION structure that the function fills in. The PID is enough to hook the process.
Sometimes, an application launches COM processes that also work with encrypted files. There’s no easy or obvious way to hook such processes, as there are no parent–child relationships between them and the original process. Let’s see how to get a COM server process ID from the very basics.
Before you get PIDs, let’s quickly review some basics of COM technology to better understand what the process ID of the COM server is and how COM servers work from the inside.
How to get a COM interface
In order to get the PID of a COM server, we must first get the COM interface. There are three main functions to get COM interfaces:
In general, the process of getting a COM interface looks like this:
- The client calls the CoCreateInstanceEx function, which causes COM to delegate the activation request to its local Service Control Manager (SCM).
- The local SCM looks in the local registry under [HKCR\CLSID\{CLSID}\InprocServer32] for an in-process server that implements this COM class. If it finds an in-process server, it returns the in-process server’s path to COM. COM then loads the in-process server.
- If the SCM can’t find an in-process server that implements the requested COM class, it looks in the SCM cache to see whether the requested COM class’s class factory has been registered by an already running local server.
- If the SCM can’t find a server in the cache, it looks for a local server path under [HKCR\CLSID\{CLSID}\LocalServer32] and spawns the local server, which registers the server’s supported class factories.
- If the SCM can’t find a local server, it looks for the RemoteServerName entry under [HKCR\AppID\{AppID}\RemoteServerName]. The local SCM then contacts the remote SCM and asks it to handle the activation request. The remote SCM will, following steps 3 through 5, try to spawn a remote process that supports the requested factory.
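The lookup order in the list above can be modeled in a few lines of C++. This is only a sketch of the fallback chain as described here, not the SCM’s real logic: ScmView, Activation, resolve, and is_out_of_process are hypothetical names, and plain maps stand in for the registry and the SCM cache.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Hypothetical stand-in for the data the SCM consults; not a Windows API.
struct ScmView {
    std::map<std::string, std::string> inprocServer32;    // CLSID -> DLL path
    std::set<std::string> runningClassFactories;          // CLSIDs found in the SCM cache
    std::map<std::string, std::string> localServer32;     // CLSID -> EXE path
    std::map<std::string, std::string> remoteServerName;  // CLSID -> host (via its AppID)
};

enum class Activation { InProcess, RunningLocalServer, SpawnedLocalServer, Remote, NotFound };

// Walks the fallback chain in the same order as the list above.
Activation resolve(const ScmView& scm, const std::string& clsid) {
    if (scm.inprocServer32.count(clsid)) return Activation::InProcess;
    if (scm.runningClassFactories.count(clsid)) return Activation::RunningLocalServer;
    if (scm.localServer32.count(clsid)) return Activation::SpawnedLocalServer;
    if (scm.remoteServerName.count(clsid)) return Activation::Remote;
    return Activation::NotFound;
}

// Only these outcomes put the object in a process other than the client's.
bool is_out_of_process(Activation a) {
    return a == Activation::RunningLocalServer ||
           a == Activation::SpawnedLocalServer ||
           a == Activation::Remote;
}
```

Whatever branch is taken, the client ends up with an interface pointer, and nothing in that pointer tells the client which branch it was.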
COM operates with objects through interface pointers.
However, there’s no way to know if the created object is in the same process or in another, and there’s also no API to get the PID of the process that hosts the COM object. To better understand how to get a COM server process ID, let’s explore how COM communicates with remote objects.
How COM servers communicate with remote objects
Clients access remote COM objects through special proxies that the COM runtime provides to achieve location transparency. Microsoft extends and utilizes its existing Remote Procedure Call (RPC) technology to allow remote objects to communicate. Proxies marshal all parameters and interfaces to the remote stubs that unmarshal them and call a remote object’s methods. You can read more about inter-object communication in the Microsoft documentation.
To get a PID, we need to understand how proxies marshal COM interfaces and what fields the marshaled representation has. COM uses a special OBJREF structure to represent a marshaled interface.
There are four different formats for an OBJREF; the structure’s flags field tells which variant of the u_objref union is present. The STDOBJREF structure is an important part of the marshaled interface representation.
STDOBJREF contains fields important for marshaling and finding the remote object and interface: object exporter identifier (OXID), object identifier (OID), and interface pointer identifier (IPID). We will use these fields in two of the three methods for getting a PID described below, so let’s take a closer look at them:
- OXID. This is a 64-bit value assigned to an apartment that exports an interface or marshals an interface for a remote client. It is unique within a given machine.
- OID. This is a 64-bit value assigned to a stub manager, which can be described as a fake client on the object side. It is unique within a particular apartment.
- IPID. This is a 128-bit value which represents a unique interface pointer ID to identify an interface stub.
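To make these fields concrete, here is a minimal, portable sketch that reads them from a marshaled standard OBJREF held in a byte buffer. The field order and sizes follow the published DCOM wire format (a 24-byte header with the "MEOW" signature, flags, and IID, followed by a 40-byte STDOBJREF); the names StdObjRef, read_le, and parse_standard_objref are our own and are not part of any Windows header.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Fixed-size mirror of STDOBJREF; field order follows the DCOM wire format.
struct StdObjRef {
    std::uint32_t flags;                // SORF_* flags
    std::uint32_t cPublicRefs;          // public reference count
    std::uint64_t oxid;                 // object exporter identifier
    std::uint64_t oid;                  // object identifier
    std::array<std::uint8_t, 16> ipid;  // interface pointer identifier (a GUID)
};
static_assert(sizeof(StdObjRef) == 40, "STDOBJREF occupies 40 bytes on the wire");

const std::uint32_t kObjRefSignature = 0x574F454D;  // the bytes "MEOW"
const std::uint32_t kObjRefStandard = 0x00000001;   // OBJREF_STANDARD
const std::size_t kObjRefHeaderSize = 24;           // signature + flags + IID

// Reads an n-byte little-endian unsigned integer (n <= 8).
std::uint64_t read_le(const std::uint8_t* p, std::size_t n) {
    std::uint64_t v = 0;
    for (std::size_t i = 0; i < n; ++i) {
        v |= static_cast<std::uint64_t>(p[i]) << (8 * i);
    }
    return v;
}

// Fills `out` from a marshaled OBJREF; returns false if the buffer is too
// short, has the wrong signature, or is not the OBJREF_STANDARD variant.
bool parse_standard_objref(const std::uint8_t* buf, std::size_t len, StdObjRef& out) {
    if (len < kObjRefHeaderSize + sizeof(StdObjRef)) return false;
    if (read_le(buf, 4) != kObjRefSignature) return false;
    if (read_le(buf + 4, 4) != kObjRefStandard) return false;
    const std::uint8_t* p = buf + kObjRefHeaderSize;
    out.flags = static_cast<std::uint32_t>(read_le(p, 4));
    out.cPublicRefs = static_cast<std::uint32_t>(read_le(p + 4, 4));
    out.oxid = read_le(p + 8, 8);
    out.oid = read_le(p + 16, 8);
    std::memcpy(out.ipid.data(), p + 24, 16);
    return true;
}
```

On Windows, the input buffer would typically be the bytes that CoMarshalInterface writes to a stream. The sketch reads each field byte by byte so that it doesn’t depend on the host’s endianness or struct packing.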
Now, let’s go directly to the methods of obtaining PIDs. To be concise, we have omitted the error handling step in our tutorials.
There are three basic methods of getting the PID of a COM server:
- Get the PID from the IPID
- Get the PID from the OXID Resolver
- Get the PID from the ALPC port
Let’s take a closer look at each of them.
Get a COM server PID from the IPID
The first method is getting the PID from the IPID. To illustrate this method, we’ll write a small app that instructs COM to launch a separate process that hosts a COM object.
But having only the IUnknown interface, we can’t say if an object is local or remote. Let’s marshal the interface (for example, with CoMarshalInterface into a memory stream) and look at the fields we need.
Inspecting the IPID in a debugger shows that its 16-bit Data2 field holds the PID of the remote process.
To extract the PID, we use the GetCOMServerPID function, which reads this field from the marshaled OBJREF.
Because the field is only 16 bits wide, this function returns the PID only if it is less than 65535. If the PID is greater, we need to use another method.
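The extraction step can be sketched in portable C++ over the raw marshaled bytes. This is not the GetCOMServerPID function mentioned above. It is an illustration built on two assumptions: the undocumented IPID layout described in this section (PID in Data2, i.e. bytes 4 and 5 of the IPID), and the convention that 0xFFFF means the real PID didn’t fit. The names pid_from_ipid and pid_from_marshaled_objref are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// The IPID follows the 24-byte OBJREF header and the first 24 bytes of STDOBJREF.
const std::size_t kIpidOffset = 48;
const std::size_t kIpidSize = 16;

// ASSUMPTION (undocumented layout): Data2 of the IPID, stored little-endian
// in bytes 4..5, holds the server PID; 0xFFFF is treated as "PID did not fit".
bool pid_from_ipid(const std::uint8_t* ipid, std::uint32_t& pid) {
    const std::uint16_t data2 = static_cast<std::uint16_t>(ipid[4] | (ipid[5] << 8));
    if (data2 == 0xFFFF) return false;
    pid = data2;
    return true;
}

// Validates the "MEOW" signature and the buffer length, then reads the PID.
bool pid_from_marshaled_objref(const std::uint8_t* buf, std::size_t len, std::uint32_t& pid) {
    static const std::uint8_t kMeow[4] = {0x4D, 0x45, 0x4F, 0x57};
    if (len < kIpidOffset + kIpidSize) return false;
    for (std::size_t i = 0; i < 4; ++i) {
        if (buf[i] != kMeow[i]) return false;
    }
    if (buf[4] == 0x04) return false;  // OBJREF_CUSTOM carries no STDOBJREF
    return pid_from_ipid(buf + kIpidOffset, pid);
}
```

A caller should treat a false result as a signal to fall back to one of the other two methods rather than as an error.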
Get the COM server PID from the OXID Resolver
Every machine that supports COM runs a special service called the Object Resolver, also known as the OXID Resolver. It performs two important functions:
- Stores the remote procedure call (RPC) string bindings that are necessary to connect with remote objects and provides RPC string bindings to local clients.
- Sends ping messages to remote objects for which the local machine has clients and receives ping messages for objects running on the local machine.
Basically, the OXID Resolver stores information about the COM server (addresses, ports, etc.). We can retrieve this information by calling the ResolveOxid method of the resolver’s IObjectExporter RPC interface.
First, we need an RPC binding to connect to the OXID Resolver. The resolver accepts connections on transmission control protocol (TCP) port 135.
With the binding in hand, we can ask the OXID Resolver for the server’s string bindings. String bindings are similar to logical addresses: each one names a protocol sequence, a network address, and an endpoint.
To get them, we call the ResolveOxid method with the server’s OXID and the protocol sequences we’re interested in.
We requested the TCP addresses, but the server may not support any networking. Microsoft Excel, for instance, doesn’t seem to use any TCP connections. In this case, the request should cause an error, but it doesn’t. To figure out why, we can open Process Explorer from the Sysinternals tools and check which TCP/IP connections Microsoft Excel uses after the ResolveOxid call.
Process Explorer shows that Microsoft Excel now has two open ports.
The Object Resolver makes Microsoft Excel open a TCP port and returns that binding to us. Now it’s easy to find the process that uses that port with the GetTcpTable2 function or with a PowerShell command.
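The matching step can be sketched without any Windows headers. The code below assumes the binding has the form "ncacn_ip_tcp:address[port]" and uses a hypothetical TcpRow structure that mirrors only two fields of the MIB_TCPROW2 rows returned by GetTcpTable2. As in the real rows, dwLocalPort is assumed to carry the port in network byte order in its low 16 bits, so the sketch swaps those bytes the way ntohs would on a little-endian Windows host.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Extracts the TCP port from a binding such as "ncacn_ip_tcp:192.168.1.10[49701]".
bool port_from_string_binding(const std::string& binding, std::uint16_t& port) {
    const std::string prefix = "ncacn_ip_tcp:";
    if (binding.compare(0, prefix.size(), prefix) != 0) return false;
    const std::size_t open = binding.find('[', prefix.size());
    if (open == std::string::npos) return false;
    const std::size_t close = binding.find(']', open + 1);
    if (close == std::string::npos || close == open + 1) return false;
    unsigned long value = 0;
    for (std::size_t i = open + 1; i < close; ++i) {
        const char c = binding[i];
        if (c < '0' || c > '9') return false;  // endpoint options are not handled
        value = value * 10 + static_cast<unsigned long>(c - '0');
        if (value > 65535) return false;
    }
    port = static_cast<std::uint16_t>(value);
    return true;
}

// Hypothetical mirror of two MIB_TCPROW2 fields.
struct TcpRow {
    std::uint32_t dwLocalPort;  // network byte order, low 16 bits only
    std::uint32_t dwOwningPid;
};

// Returns the PID that owns `port` (given in host byte order), if any.
bool find_owner_pid(const std::vector<TcpRow>& rows, std::uint16_t port, std::uint32_t& pid) {
    for (const TcpRow& row : rows) {
        const std::uint16_t low = static_cast<std::uint16_t>(row.dwLocalPort & 0xFFFF);
        const std::uint16_t host = static_cast<std::uint16_t>((low << 8) | (low >> 8));
        if (host == port) {
            pid = row.dwOwningPid;
            return true;
        }
    }
    return false;
}
```

In a real tool, the rows would come from GetTcpTable2, and the binding would come from the array that ResolveOxid returns.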
This method is the most correct way to get the PID of a COM server. By getting the PID from the OXID resolver, we use a documented API and follow the same steps as the COM runtime during the connection of remote objects.
Get a COM server PID from the ALPC port
The previous two methods used the OBJREF structure to find the PID. This method takes a rather different approach, as it relies on Advanced Local Procedure Call (ALPC). COM uses ALPC to communicate between objects on the same machine.
Let’s look at the open handles of the COM server using Process Explorer from the Sysinternals tools.
There we can see that the COM server uses the same ALPC port that the client passes to the RpcBindingFromStringBinding function. So, to find the COM server’s PID, we need to capture the ALPC port name passed to RpcBindingFromStringBinding and then find the process that holds a handle to the port with that name.
Note: RPC is used outside of COM too, so when you watch RpcBindingFromStringBinding in an API monitor, make sure the call you pick was made by the COM runtime and not by a plain RPC client.
The most challenging part of this method is finding the process that owns the handle, as there’s no documented way to do that. We have to explore the ntdll.dll functions and structures to get information about handles and all running processes. First, we need to declare some macros and definitions ourselves, as most ntdll.dll constants and structures are not exposed in public headers.
Next, we write a function that prints all open ALPC ports and the corresponding PIDs to the standard output.
The process we need is in that output. Now we just need to filter the output by the known ALPC port name.
This approach is the most time-consuming because it involves examining each process in the system and the name of each handle.
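The filtering step itself is simple once the handle list exists. The sketch below leaves out the ntdll.dll enumeration entirely and works on a hypothetical HandleRecord list that such an enumeration would produce. It assumes that the binding looks like "ncalrpc:[endpoint]", that the object type is reported as "ALPC Port", and that the named port lives under \RPC Control\; all three should be checked against the handle list on your own system.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical row of a system-wide handle dump.
struct HandleRecord {
    std::uint32_t pid;       // process that owns the handle
    std::string typeName;    // object type, e.g. "ALPC Port"
    std::string objectName;  // object name, e.g. "\\RPC Control\\OLE1A2B3C"
};

// "ncalrpc:[OLE1A2B3C]" -> "OLE1A2B3C"
bool alpc_endpoint_from_binding(const std::string& binding, std::string& endpoint) {
    const std::string prefix = "ncalrpc:[";
    if (binding.size() <= prefix.size() + 1) return false;
    if (binding.compare(0, prefix.size(), prefix) != 0) return false;
    if (binding[binding.size() - 1] != ']') return false;
    endpoint = binding.substr(prefix.size(), binding.size() - prefix.size() - 1);
    return true;
}

// Finds the first process that holds an ALPC port with the given endpoint name.
// ASSUMPTION: ncalrpc endpoints appear as named ports under "\RPC Control\".
bool find_alpc_server_pid(const std::vector<HandleRecord>& handles,
                          const std::string& endpoint, std::uint32_t& pid) {
    const std::string wanted = "\\RPC Control\\" + endpoint;
    for (const HandleRecord& h : handles) {
        if (h.typeName == "ALPC Port" && h.objectName == wanted) {
            pid = h.pid;
            return true;
        }
    }
    return false;
}
```

The linear scan reflects why this method is slow: every handle of every process has to be named and compared before the match is found.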
This article describes three different ways to get the process ID of a COM server. This value can be extremely important if you need to track parent–child process relationships in system monitoring tools or control processes in DLP systems.
- Get COM server PID from IPID. The fastest way of getting a COM server’s process ID is from the IPID structure, but it can provide only values less than 65535. Also, it relies on the undocumented structure of the IPID field in the marshaled form.
- Get COM server PID from OXID Resolver. Getting a PID from the OXID Resolver can be considered the most correct way of getting a COM server’s PID because it uses a documented API and performs the same steps as the COM runtime does when connecting to remote objects.
- Get COM server PID from ALPC port. Getting a COM server’s PID through the ALPC port requires using an undocumented API from ntdll.dll. Note that it may take quite a lot of time, as this approach involves looking into each process on the system and into each handle’s name.
Apriorit specialists have in-depth knowledge of engineering cybersecurity solutions and are always ready to help you with that. If you have a challenging cybersecurity project in mind, feel free to contact us.