Sizing Up the Void

Previously we discovered how to hold an unboxed instance of System.Void, and although circumventing the .NET framework always feels good, it left us with a burning question: what does the memory layout look like? Can a class contain infinite voids?

Answering this question is not a simple task. C# is a managed language, after all, and whilst some tools for working with pointers are well known and documented, they all seem to work only with array-like objects.

Problem: Garbage Collector

The Garbage Collector (GC) is one of the fundamental aspects of a managed runtime. In short, the GC is responsible for the allocation and de-allocation of managed objects, such as class instances, on the heap. As part of its routines, the garbage collector will happily defragment the memory, moving objects around in the process.

This shuffling makes any pointers to objects unreliable. There is a concept of pinning in .NET, which is an instruction to the GC to temporarily not move an object while there exists a pointer to it. Unfortunately, if you try to pin an object that is not in a small list of special cases (e.g. an array), you will receive error CS0208: Cannot take the address of, get the size of, or declare a pointer to a managed type.
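For contrast, here is a minimal sketch of pinning in one of the supported cases, an int array. It uses GCHandle rather than the fixed statement so that it compiles without an unsafe context. The class and method names are made up for illustration.

```csharp
using System;
using System.Runtime.InteropServices;

public static class PinningSketch
{
    // Pins an int array, reads one element through the raw address,
    // then releases the pin so the GC is free to move the array again.
    public static int ReadThroughPinnedAddress(int[] values, int index)
    {
        GCHandle handle = GCHandle.Alloc(values, GCHandleType.Pinned);
        try
        {
            // For a pinned array this is the address of the first element.
            IntPtr firstElement = handle.AddrOfPinnedObject();
            return Marshal.ReadInt32(firstElement, index * sizeof(int));
        }
        finally
        {
            handle.Free();
        }
    }

    public static void Main()
    {
        Console.WriteLine(ReadThroughPinnedAddress(new[] { 10, 20, 30 }, 1)); // prints 20
    }
}
```

The address is only trustworthy between Alloc and Free, which is exactly the guarantee we do not get for arbitrary class instances.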

The shuffling issue is easy to get around: create a program that does not allocate much. In this way, the GC will never be triggered and thus any pointers will be somewhat reliable. We can even encourage the GC to mind its own business by starting our program as follows.

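const int enoughMemoryForMe = 0xFFFFFF;
var stoppedGC = false;
try {
    stoppedGC = GC.TryStartNoGCRegion(enoughMemoryForMe);

    // Main Code

} finally {
    // EndNoGCRegion throws if no region is active,
    // so only end the region if we actually started one.
    if (stoppedGC) {
        GC.EndNoGCRegion();
    }
}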

This API would normally be used in extremely specialised cases where a programmer has run out of all options and the GC keeps misbehaving, pausing the process at the most inopportune moments. The runtime is also free to ignore this instruction. In reality, this API is never needed. Even when it's needed, it's not.
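Since the runtime may decline the request, or drop out of the region once the budget is exceeded, it can be worth checking whether the region is still in effect. A rough sketch, assuming GCSettings.LatencyMode reports GCLatencyMode.NoGCRegion while a region is active; the helper name RunInNoGcRegion is hypothetical.

```csharp
using System;
using System.Runtime;

public static class NoGcRegionSketch
{
    // Hypothetical helper: runs body inside a no-GC region and reports whether
    // the region was started and was still active once body had finished.
    public static bool RunInNoGcRegion(long budgetBytes, Action body)
    {
        bool started = GC.TryStartNoGCRegion(budgetBytes);
        try
        {
            body();
            return started && GCSettings.LatencyMode == GCLatencyMode.NoGCRegion;
        }
        finally
        {
            // Ending a region that is no longer active throws, so check first.
            if (GCSettings.LatencyMode == GCLatencyMode.NoGCRegion)
            {
                GC.EndNoGCRegion();
            }
        }
    }

    public static void Main()
    {
        bool undisturbed = RunInNoGcRegion(0xFFFFFF, () =>
        {
            var small = new byte[64];
            GC.KeepAlive(small);
        });
        Console.WriteLine($"No-GC region held for the whole body: {undisturbed}");
    }
}
```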

It's also horribly broken: what happens if a ThreadAbortException occurs right as the GC.TryStartNoGCRegion method returns true, but before stoppedGC is set? This is, after all, why the C# lock statement changed from using System.Threading.Monitor.Enter(object) to the System.Threading.Monitor.Enter(object, ref bool) overload. I digress... this API is good enough for our use case.
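For the curious, this is roughly the shape the compiler has produced for a lock block since C# 4, written out by hand. It is a sketch rather than the exact generated code, and the class and field names are invented for the example.

```csharp
using System;
using System.Threading;

public static class LockExpansionSketch
{
    private static readonly object Gate = new object();
    private static int counter;

    // Hand-written equivalent of: lock (Gate) { return ++counter; }
    public static int Increment()
    {
        bool lockTaken = false;
        try
        {
            // The flag is set as part of taking the lock, so there is no window
            // in which the lock is held but the flag still says otherwise.
            Monitor.Enter(Gate, ref lockTaken);
            return ++counter;
        }
        finally
        {
            if (lockTaken)
            {
                Monitor.Exit(Gate);
            }
        }
    }

    public static void Main()
    {
        Console.WriteLine(Increment());
    }
}
```

GC.TryStartNoGCRegion has no equivalent of the ref bool parameter, which is why the gap between the call returning and stoppedGC being assigned cannot be closed from our side.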

Next we need a way to get a pointer to a managed object without using the fixed statement. I remember when I first discovered reflection, I thought to myself that it was an all-powerful API. I was drunk with power; however, I soon discovered that with great power come great maintainability problems, and I tend to use reflection less nowadays. I bring this up because the code I am about to unveil makes even the most creative uses of reflection look clean.

Enter: the __makeref keyword. It's not really supported or documented by .NET, but it has been a valid C# keyword since initial release and works on newer runtimes such as .NET Core too. You won't find too much about it online, and the code you do find is never production-worthy. The __makeref keyword returns a TypedReference, which is a struct containing a pointer to a variable along with its type. When that variable holds an object reference, one more dereference takes us to the managed object itself. You may use it like this at your colleagues' expense:

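    private static unsafe IntPtr GetPointer(object o)
    {
        // On the CLR, a TypedReference starts with a pointer to the
        // variable it was made from (here, the parameter o).
        TypedReference typedReference = __makeref(o);
        // First dereference: the address of o.
        // Second dereference: the object reference stored in o.
        IntPtr ptr = **(IntPtr**)&typedReference;
        return ptr;
    }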

As I have hopefully convinced you, the unsafe keyword in the method is an understatement.
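One way to gain a little confidence that the returned value really is the object's address is to read the first word stored there and compare it with the type's handle. This is only a sketch: it assumes, as I believe is the case on the CLR, that the first word of an object is its method table pointer and that RuntimeTypeHandle.Value exposes that same address for ordinary classes. It also assumes the GC does not move the object between the two reads, and it needs unsafe code enabled to compile. FirstWordIsMethodTable is a made-up name.

```csharp
using System;
using System.Runtime.InteropServices;

public static class PointerSketch
{
    // The same __makeref trick as GetPointer above, repeated so the sketch is self-contained.
    private static unsafe IntPtr GetPointer(object o)
    {
        TypedReference typedReference = __makeref(o);
        return **(IntPtr**)&typedReference;
    }

    // Hypothetical check: does the word at the object's address match its type handle?
    public static bool FirstWordIsMethodTable(object o)
    {
        IntPtr objectAddress = GetPointer(o);
        IntPtr firstWord = Marshal.ReadIntPtr(objectAddress);
        return firstWord == o.GetType().TypeHandle.Value;
    }

    public static void Main()
    {
        Console.WriteLine(FirstWordIsMethodTable(new object()));
    }
}
```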

Plan A

The plan is as follows: create a class with sentinel values and voids, as shown below.

public class VoidHolder
{
    public int sentinel0;
    public System.Void v1;
    public int sentinel1;
    public System.Void v2;
    public System.Void v3;
    public int sentinel2;
}

We can then do the following:

  1. Set the sentinel values to different arbitrary constants.
  2. Unapologetically use __makeref to get a pointer to our object.
  3. Search for all 3 sentinels and check how many bytes lie in between them.
  4. The space between the fields should give us an indication of the size of System.Void.
  5. We should expect the gap between the second pair of sentinels (which holds two voids) to be twice the gap between the first pair (which holds one).

We run this and confirm that the 3 sentinels are contiguous in memory!

You know something is wrong because I have not shown you the code. Void is a struct of 0 bytes? It seemed too good and (spoiler warning) too unlikely to be true.

Problem: Data Structure Alignment

The .NET runtime has thrown another spanner in the works: the ordering of fields within a class is completely up to the runtime. This allows runtimes to optimise for the device they are running on. For example, servers might prefer layouts where fields are naturally aligned, whilst mobile devices might prefer more packed layouts due to memory concerns.
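The effect is easier to observe on structs, where the layout can be chosen explicitly with StructLayout. The sketch below declares the same three fields twice, once with the declared order preserved and once with the runtime free to rearrange them, and prints the resulting sizes. The type names are invented, and it assumes a runtime where System.Runtime.CompilerServices.Unsafe is available. I have left out the expected numbers since the Auto case is, by definition, the runtime's call.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

// Fields stay in declaration order, so padding is needed around the long.
[StructLayout(LayoutKind.Sequential)]
public struct DeclaredOrder
{
    public byte A;
    public long B;
    public byte C;
}

// The runtime may reorder the fields however it sees fit.
[StructLayout(LayoutKind.Auto)]
public struct RuntimeOrder
{
    public byte A;
    public long B;
    public byte C;
}

public static class LayoutSketch
{
    public static void Main()
    {
        Console.WriteLine($"Sequential: {Unsafe.SizeOf<DeclaredOrder>()} bytes");
        Console.WriteLine($"Auto:       {Unsafe.SizeOf<RuntimeOrder>()} bytes");
    }
}
```

Classes get the Auto treatment unless told otherwise, which is why searching for sentinels inside VoidHolder tells us less than we hoped.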

If we cannot reliably check the contents of a class, we can instead check the size of the class itself. To do so, we rely on the inner workings of the GC. Although no production system should rely on this behaviour, each thread allocates from its own stretch of the heap and the GC places objects contiguously in the order they were constructed (with the exception of "large" objects, which get their own heap).

If we create objects with varying numbers of Voids, and observe the size of these objects, we can hope to spot the solution within the pattern. Specifically, we can allocate objects with 0, 1, 2, and 3 voids. If Void is indeed 0 bytes in length, all these objects would be the same distance apart. If not, we should see some predictable pattern emerge. If we see neither of those options, hope begins to wane.

public class VoidHolder0
{
    // To each class we add a "notempty" field
    // to avoid any attempts by the runtime to pad
    // our classes without us knowing about it.
    public long notempty;
}

public class VoidHolder1
{
    public long notempty;
    public Void v1;
}

public class VoidHolder2
{
    public long notempty;
    public Void v1;
    public Void v2;
}

public class VoidHolder3
{
    public long notempty;
    public Void v1;
    public Void v2;
    public Void v3;
}

const int enoughMemoryForMe = 0xFFFFFF;
var stoppedGC = false;
try
{
    stoppedGC = GC.TryStartNoGCRegion(enoughMemoryForMe);

    // Allocate all the objects
    var vh0 = new VoidHolder0();
    var vh1 = new VoidHolder1();
    var vh2 = new VoidHolder2();
    var vh3 = new VoidHolder3();
    var f = new object();

    // Retrieve the pointer for each one, as a long
    var ptr0 = GetPointer(vh0).ToInt64();
    var ptr1 = GetPointer(vh1).ToInt64();
    var ptr2 = GetPointer(vh2).ToInt64();
    var ptr3 = GetPointer(vh3).ToInt64();
    var ptrf = GetPointer(f).ToInt64();

    // Print their pairwise differences to the console
    Console.WriteLine($"0 -> 1: {ptr1 - ptr0}");
    Console.WriteLine($"1 -> 2: {ptr2 - ptr1}");
    Console.WriteLine($"2 -> 3: {ptr3 - ptr2}");
    Console.WriteLine($"3 -> f: {ptrf - ptr3}");

    // Stops the GC from collecting them until now
    GC.KeepAlive(vh0);
    GC.KeepAlive(vh1);
    GC.KeepAlive(vh2);
    GC.KeepAlive(vh3);
    GC.KeepAlive(f);
}
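finally
{
    // EndNoGCRegion throws if no region is active,
    // so only end the region if we actually started one.
    if (stoppedGC)
    {
        GC.EndNoGCRegion();
    }
}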

When we run the program, we get the following results.

0 -> 1: 24  // 16 in 32-bit mode
1 -> 2: 32  // 20 in 32-bit mode
2 -> 3: 40  // 24 in 32-bit mode
3 -> f: 48  // 28 in 32-bit mode

There we have it: proof that an unboxed instance of System.Void takes up at least 1 byte. The size of the voidless object is 24 bytes (16 bytes in 32-bit mode). This is explained by the two pointer-sized headers in .NET objects. The first, at offset 0, points to the method table, while the other, at offset (-1 * pointer_size), is GC data including flags such as whether the object is finalized or not. Add 8 bytes for the "notempty" field and we arrive at the 24 (or 16) bytes observed. Each additional Void then grows the object by exactly one pointer size (8 bytes, or 4 in 32-bit mode), which is pretty hard evidence that System.Void is at least 1 byte long, with the CLR padding each field out to an aligned slot.
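The arithmetic can be written down as a tiny model. This is only a sketch of the explanation above, not anything the runtime exposes: Predict is a hypothetical helper, and it bakes in the assumption that every Void ends up occupying one pointer-sized slot.

```csharp
using System;

public static class ExpectedSizes
{
    // Hypothetical model of the object size, assuming each Void field
    // is padded out to a full pointer-sized slot.
    public static int Predict(int pointerSize, int voidCount)
    {
        int headers = 2 * pointerSize;        // header word + method table pointer
        int notEmpty = sizeof(long);          // the 8-byte "notempty" field
        int voids = voidCount * pointerSize;  // one padded slot per Void
        return headers + notEmpty + voids;
    }

    public static void Main()
    {
        foreach (int pointerSize in new[] { 8, 4 })
        {
            for (int voidCount = 0; voidCount <= 3; voidCount++)
            {
                Console.WriteLine(
                    $"{pointerSize * 8}-bit, {voidCount} voids: {Predict(pointerSize, voidCount)} bytes");
            }
        }
    }
}
```

The model gives 24, 32, 40 and 48 bytes for 64-bit and 16, 20, 24 and 28 bytes for 32-bit, matching the measured distances.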

Conclusion

In the end, it turns out that System.Void is a massive red herring. Despite the .NET team making significant efforts to stop us from creating instances of System.Void, the type seems to really be nothing more than a placeholder for reflection, and behaves just like any other struct without any fields.

This is the most predictable outcome. There is no reason for System.Void to be special-cased in the lowest levels of the runtime. However, despite its unsurprising results, the effort to investigate System.Void provided ample learning opportunities. I hope you, the reader, enjoyed it as much as I did.

