Overhead of a .NET Array

Overhead of a .NET array?

Here's a slightly neater (IMO) short but complete program to demonstrate the same thing:

using System;

class Test
{
    const int Size = 100000;

    static void Main()
    {
        object[] array = new object[Size];
        long initialMemory = GC.GetTotalMemory(true);
        for (int i = 0; i < Size; i++)
        {
            array[i] = new string[0];
        }
        long finalMemory = GC.GetTotalMemory(true);
        GC.KeepAlive(array);

        long total = finalMemory - initialMemory;

        Console.WriteLine("Size of each element: {0:0.000} bytes",
                          ((double)total) / Size);
    }
}

But I get the same results - the overhead for any reference type array is 16 bytes, whereas the overhead for any value type array is 12 bytes. I'm still trying to work out why that is, with the help of the CLI spec. Don't forget that reference type arrays are covariant, which may be relevant...

EDIT: With the help of cordbg, I can confirm Brian's answer - the type pointer of a reference-type array is the same regardless of the actual element type. Presumably there's some funkiness in object.GetType() (which is non-virtual, remember) to account for this.
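
As a quick sanity check (just a sketch - the exact type-pointer details can vary between CLR versions), GetType() still reports the precise array type even though the layout machinery is shared:

using System;

class GetTypeCheck
{
    static void Main()
    {
        object[] x = new object[1];
        object y = new string[1];   // held in a plain object variable

        // Both are reference-type arrays, but GetType() still distinguishes them.
        Console.WriteLine(x.GetType());                 // System.Object[]
        Console.WriteLine(y.GetType());                 // System.String[]
        Console.WriteLine(x.GetType() == y.GetType());  // False
    }
}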

So, with code of:

object[] x = new object[1];
string[] y = new string[1];
int[] z = new int[1];
z[0] = 0x12345678;
lock(z) {}

We end up with something like the following:

Variables:
x=(0x1f228c8) <System.Object[]>
y=(0x1f228dc) <System.String[]>
z=(0x1f228f0) <System.Int32[]>

Memory:
0x1f228c4: 00000000 003284dc 00000001 00326d54 00000000 // Data for x
0x1f228d8: 00000000 003284dc 00000001 00329134 00000000 // Data for y
0x1f228ec: 00000000 00d443fc 00000001 12345678 // Data for z

Note that I've dumped the memory 1 word before the value of the variable itself.

For x and y, the values are:

  • The sync block, used for locking or storing the hash code (or a thin lock - see Brian's comment)
  • Type pointer
  • Size of array
  • Element type pointer
  • Null reference (first element)

For z, the values are:

  • Sync block
  • Type pointer
  • Size of array
  • 0x12345678 (first element)

Different value type arrays (byte[], int[] etc) end up with different type pointers, whereas all reference type arrays use the same type pointer, but have a different element type pointer. The element type pointer is the same value as you'd find as the type pointer for an object of that type. So if we looked at a string object's memory in the above run, it would have a type pointer of 0x00329134.

The word before the type pointer certainly has something to do with either the monitor or the hash code: calling GetHashCode() populates that bit of memory, and I believe the default object.GetHashCode() obtains a sync block to ensure hash code uniqueness for the lifetime of the object. However, just doing lock(x){} didn't do anything, which surprised me...

All of this is only valid for "vector" types, by the way - in the CLR, a "vector" type is a single-dimensional array with a lower-bound of 0. Other arrays will have a different layout - for one thing, they'd need the lower bound stored...
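
For example (a small sketch), you can create a single-dimensional array with a non-zero lower bound via Array.CreateInstance and see that the CLR gives it a distinct type from the vector form:

using System;

class NonVectorDemo
{
    static void Main()
    {
        // A normal vector: single-dimensional, lower bound 0.
        int[] vector = new int[10];

        // A single-dimensional array with a lower bound of 5 - not a "vector",
        // so it gets the more general array layout.
        Array nonVector = Array.CreateInstance(typeof(int), new[] { 10 }, new[] { 5 });

        Console.WriteLine(vector.GetType());            // System.Int32[]
        Console.WriteLine(nonVector.GetType());         // System.Int32[*]
        Console.WriteLine(nonVector.GetLowerBound(0));  // 5
    }
}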

So far this has been experimentation; here's the guesswork - the reason the system is implemented the way it is. From here on, I really am just guessing.

  • All object[] arrays can share the same JIT code. They're going to behave the same way in terms of memory allocation, array access, Length property and (importantly) the layout of references for the GC. Compare that with value type arrays, where different value types may have different GC "footprints" (e.g. one might have a byte and then a reference, others will have no references at all, etc).
  • Every time you assign a value within an object[] the runtime needs to check that it's valid. It needs to check that the type of the object whose reference you're using for the new element value is compatible with the element type of the array. For instance:

    object[] x = new object[1];
    object[] y = new string[1];
    x[0] = new object(); // Valid
    y[0] = new object(); // Invalid - will throw an exception

This is the covariance I mentioned earlier. Now given that this is going to happen for every single assignment, it makes sense to reduce the number of indirections. In particular, I suspect you don't really want to blow the cache by having to go to the type object for each assignment to get the element type. I suspect (and my x86 assembly isn't good enough to verify this) that the test is something like:

  • Is the value to be copied a null reference? If so, that's fine. (Done.)
  • Fetch the type pointer of the object the reference points at.
  • Is that type pointer the same as the element type pointer (simple binary equality check)? If so, that's fine. (Done.)
  • Is that type pointer assignment-compatible with the element type pointer? (Much more complicated check, with inheritance and interfaces involved.) If so, that's fine - otherwise, throw an exception.

If we can terminate the search in the first three steps, there's not a lot of indirection - which is good for something that's going to happen as often as array assignments. None of this needs to happen for value type assignments, because that's statically verifiable.
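
You can watch the check fire with a small example - an invalid assignment through a covariant array reference throws ArrayTypeMismatchException at runtime:

using System;

class CovarianceCheck
{
    static void Main()
    {
        object[] y = new string[1];

        y[0] = null;       // fine - a null reference passes the first check
        y[0] = "hello";    // fine - exact match with the element type

        try
        {
            y[0] = new object();   // fails the assignment-compatibility check
        }
        catch (ArrayTypeMismatchException e)
        {
            Console.WriteLine(e.GetType().Name);
        }
    }
}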

So, that's why I believe reference type arrays are slightly bigger than value type arrays.

Great question - really interesting to delve into it :)

C# length 1 array versus single value, performance and memory overhead

Just a short note. In the .NET world the big performance issue is the GC, which may easily lock the app for 50-100ms. You will likely not see a big difference in reading data from an object versus a single-value array. But you will likely get a penalty if you need to create such objects a lot. You certainly should avoid creating objects in getter code.
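
For example (a hypothetical type, just to illustrate the point), a getter that allocates on every call produces garbage on every read, while returning an existing field allocates nothing:

class Measurements
{
    private readonly double[] samples = new double[1000];

    // Bad: allocates a new array on every property read,
    // creating GC pressure if the property is read in a loop.
    public double[] SamplesCopy
    {
        get { return (double[])samples.Clone(); }
    }

    // Better: return the existing field (or a cached value),
    // so reading the property does not allocate.
    public double[] Samples
    {
        get { return samples; }
    }
}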

I think that this article may be helpful: https://michaelscodingspot.com/avoid-gc-pressure/. Also consider using a profiling tool to check how much time it actually takes. I prefer PerfView, provided by Microsoft. It may take some time to get started with it, but you'll certainly see the benefit in the results.

Which size does an array have in memory?

The overhead of arrays, in bytes, is:

Architecture | Value Type Array | Reference Type Array
x86          | 12               | 16
x64          | 24               | 32

You can calculate these values with:

using System;

class Test
{
    const int Size = 100000;

    static void Main()
    {
        Console.WriteLine("Running at {0} bits", IntPtr.Size * 8);

        Tester<string>();
        Tester<double>();

        Console.ReadKey();
    }

    static void Tester<T>()
    {
        var array = new object[Size];
        long initialMemory = GC.GetTotalMemory(true);

        for (int i = 0; i < Size; i++)
        {
            array[i] = new T[0];
        }

        long finalMemory = GC.GetTotalMemory(true);

        GC.KeepAlive(array);

        long total = finalMemory - initialMemory;

        Console.WriteLine("Size of each {0}[]: {1:0.000} bytes", typeof(T).Name,
                          ((double)total) / Size);
    }
}

This code is a modified version of the one from Overhead of a .NET array?

Clearly you have to execute it as both a 32-bit and a 64-bit process.

To this overhead you have to add the elements of the array themselves (so size * sizeof(element)), plus at least one reference to the array that you'll need to keep around (so IntPtr.Size).

Note that there are some inconsistencies I've noticed. If I create double[1] arrays, i.e. arrays of a single double, each one of them is perfectly aligned on the 8-byte boundary, but the space used seems to be only 20 bytes/array (at 32 bits, so 12 + sizeof(double)). This is clearly impossible, because 20 isn't divisible by 8. I think GC.GetTotalMemory is "ignoring" the hole between objects. This could be an additional overhead of some bytes/array (depending on the element type of the array). For byte[1] the average size is 16 bytes/array (at 32 bits, so 12 + sizeof(byte) + 3). This seems more correct.
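
As a rough sketch (assuming the overhead figures from the table above and pointer-size alignment - real layouts may differ), you can estimate an array's total size like this:

using System;

class ArraySizeEstimate
{
    // Rough estimate only: fixed overhead plus the element data,
    // rounded up to the pointer size of the process.
    static long EstimateArrayBytes(int length, int elementSize, bool referenceTypeElements)
    {
        int overhead = IntPtr.Size == 8
            ? (referenceTypeElements ? 32 : 24)
            : (referenceTypeElements ? 16 : 12);

        long raw = overhead + (long)length * elementSize;
        long alignment = IntPtr.Size;
        return (raw + alignment - 1) / alignment * alignment;
    }

    static void Main()
    {
        Console.WriteLine(EstimateArrayBytes(1, sizeof(double), false));  // double[1]
        Console.WriteLine(EstimateArrayBytes(1, sizeof(byte), false));    // byte[1]
    }
}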

Memory layout of a .NET array

Great question. I found this article which contains block diagrams for both value types and reference types. Also see this article in which Richter states:

[snip] each array has some additional overhead information associated with it. This information contains the rank of the array (number of dimensions), the lower bounds for each dimension of the array (almost always 0), and the length of each dimension. The overhead also contains the type of each element in the array.

The limitation on the size of a .NET array

That is correct. No single object can be larger than 2 GB.

As with 32-bit Windows operating systems, there is a 2 GB limit on the size of an object you can create while running a 64-bit managed application on a 64-bit Windows operating system.

This question has additional details and some useful links: Single objects still limited to 2 GB in size in CLR 4.0?
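
For example, trying to allocate a single array bigger than the limit fails at runtime with an OutOfMemoryException (on .NET 4.5+ you can opt out of the 2 GB object limit with the gcAllowVeryLargeObjects setting, though per-array element limits still apply):

using System;

class BigAllocation
{
    static void Main()
    {
        try
        {
            // int.MaxValue bytes plus the array overhead exceeds the 2 GB object limit.
            byte[] huge = new byte[int.MaxValue];
            Console.WriteLine(huge.Length);
        }
        catch (OutOfMemoryException)
        {
            Console.WriteLine("A single object larger than 2 GB cannot be allocated by default.");
        }
    }
}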

Performance of Arrays vs. Lists

Very easy to measure...

In a small amount of tight-loop processing code where I know the length is fixed, I use arrays for that extra tiny bit of micro-optimisation; arrays can be marginally faster if you use the indexer / for form - but IIRC it depends on the type of data in the array. But unless you need to micro-optimise, keep it simple and use List<T> etc.

Of course, this only applies if you are reading all of the data; a dictionary would be quicker for key-based lookups.

Here's my results using "int" (the second number is a checksum to verify they all did the same work):

(edited to fix bug)

List/for: 1971ms (589725196)
Array/for: 1864ms (589725196)
List/foreach: 3054ms (589725196)
Array/foreach: 1860ms (589725196)

based on the test rig:

using System;
using System.Collections.Generic;
using System.Diagnostics;

static class Program
{
    static void Main()
    {
        List<int> list = new List<int>(6000000);
        Random rand = new Random(12345);
        for (int i = 0; i < 6000000; i++)
        {
            list.Add(rand.Next(5000));
        }
        int[] arr = list.ToArray();

        int chk = 0;
        Stopwatch watch = Stopwatch.StartNew();
        for (int rpt = 0; rpt < 100; rpt++)
        {
            int len = list.Count;
            for (int i = 0; i < len; i++)
            {
                chk += list[i];
            }
        }
        watch.Stop();
        Console.WriteLine("List/for: {0}ms ({1})", watch.ElapsedMilliseconds, chk);

        chk = 0;
        watch = Stopwatch.StartNew();
        for (int rpt = 0; rpt < 100; rpt++)
        {
            for (int i = 0; i < arr.Length; i++)
            {
                chk += arr[i];
            }
        }
        watch.Stop();
        Console.WriteLine("Array/for: {0}ms ({1})", watch.ElapsedMilliseconds, chk);

        chk = 0;
        watch = Stopwatch.StartNew();
        for (int rpt = 0; rpt < 100; rpt++)
        {
            foreach (int i in list)
            {
                chk += i;
            }
        }
        watch.Stop();
        Console.WriteLine("List/foreach: {0}ms ({1})", watch.ElapsedMilliseconds, chk);

        chk = 0;
        watch = Stopwatch.StartNew();
        for (int rpt = 0; rpt < 100; rpt++)
        {
            foreach (int i in arr)
            {
                chk += i;
            }
        }
        watch.Stop();
        Console.WriteLine("Array/foreach: {0}ms ({1})", watch.ElapsedMilliseconds, chk);

        Console.ReadLine();
    }
}

C# Where does the memory overhead come from

Memory overhead

Your cVector class adds a lot of memory overhead. On a 32-bit system, any reference object has a memory overhead of 12 bytes (although 4 of those are free to be used by fields if possible), if I recall correctly. Let's go with an overhead of 8 bytes. So in your case with 10,000,000 triangles, each containing 4 vectors, that adds up to:

10,000,000 * 4 * 8 = 305 MB of overhead

If you're running on a 64-bit system it's twice that:

10,000,000 * 4 * 16 = 610 MB of overhead

On top of this, you also have the overhead of the four references each cSTLTriangle will have to the vectors, giving you:

10,000,000 * 4 * 4 = 152 MB (32-bit)

10,000,000 * 4 * 8 = 305 MB (64-bit)

As you can see this all builds up to quite a hefty bit of overhead.

So, in this case, I would suggest you make cVector a struct. As discussed in the comments, a struct can implement interfaces (as well as properties and methods). Just be aware of the caveats that @Jcl mentioned.

You have the same issue with your cSTLTriangle class (about 76/152 MB overhead for 32-bit and 64-bit, respectively), although at its size I'm not sure I want to recommend going with struct on that. Others here might have useful insights on that matter.

Additionally, due to padding and object layout, the overhead might actually be even larger, but I haven't taken that into account here.
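
If it helps, here is the same arithmetic as a small sketch (the per-object overhead and reference sizes are the assumptions stated above):

using System;

class OverheadEstimate
{
    static void Main()
    {
        const long triangles = 10000000;
        const long vectorsPerTriangle = 4;
        long vectors = triangles * vectorsPerTriangle;

        // Assumed sizes: 8/16 bytes of object overhead and 4/8-byte references.
        Console.WriteLine("Vector overhead, 32-bit:   {0} MB", vectors * 8 / (1024 * 1024));   // ~305 MB
        Console.WriteLine("Vector overhead, 64-bit:   {0} MB", vectors * 16 / (1024 * 1024));  // ~610 MB
        Console.WriteLine("Vector references, 32-bit: {0} MB", vectors * 4 / (1024 * 1024));   // ~152 MB
        Console.WriteLine("Vector references, 64-bit: {0} MB", vectors * 8 / (1024 * 1024));   // ~305 MB
    }
}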

List capacity

Using the List<T> class with that amount of objects can cause some wasted memory. As @Matthew Watson mentions, when the list's internal array has no more room, it will be expanded. In fact, it will double its capacity every time that happens. In a test with your number of 10533050 entries, the capacity of the list ended up at 16777216 entries, giving an overhead of:

( 16777216 - 10533050 ) * 4 byte reference = 23 MB (32-bit)

( 16777216 - 10533050 ) * 8 byte reference = 47 MB (64-bit)

So since you know the number of triangles in advance, I would recommend just going with a simple array. Manually setting the Capacity of a list works too.
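
For example (a sketch reusing the cSTLTriangle type from the question; the count is illustrative):

// Known element count up front, so no doubling/copying while filling it.
int triangleCount = 10533050;

// Option 1: a plain array of exactly the right size.
var triangleArray = new cSTLTriangle[triangleCount];

// Option 2: a List<T> whose capacity is set once.
var triangleList = new List<cSTLTriangle>(triangleCount);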

Other issues

The other issues that have been discussed in the comments should not give you any memory overhead, but they sure will put a lot of unnecessary pressure on the GC. Especially the SubArray method which, while very practical, will create many millions of garbage arrays for the GC to handle. I suggest skipping that and indexing into the array manually, even if it's more work.
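
For instance (a sketch - buffer and offset stand in for the raw bytes you've read and the position of a vertex), you can read the floats straight out of the big buffer instead of allocating a slice for every read:

using System;

class DirectRead
{
    // Reads one vertex without any intermediate SubArray allocation.
    static void ReadVertex(byte[] buffer, int offset, out float x, out float y, out float z)
    {
        x = BitConverter.ToSingle(buffer, offset);
        y = BitConverter.ToSingle(buffer, offset + 4);
        z = BitConverter.ToSingle(buffer, offset + 8);
    }
}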

Another issue is reading the entire file at once. This will be both slower and use more memory than reading it piece by piece. Directly using a BinaryReader as others have suggested might not be possible due to the endianness issues you need to deal with. One complicated option could be to use memory-mapped files; that would let you access the data without having to care about whether it has been read yet, leaving the details to the OS.
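
A minimal sketch of the memory-mapped approach (the file name is a placeholder, and the offsets assume the standard binary STL layout of an 80-byte header followed by a 32-bit triangle count):

using System;
using System.IO.MemoryMappedFiles;

class MappedFileSketch
{
    static void Main()
    {
        // "model.stl" is a placeholder path.
        using (var mmf = MemoryMappedFile.CreateFromFile("model.stl"))
        using (var accessor = mmf.CreateViewAccessor())
        {
            uint triangleCount = accessor.ReadUInt32(80);
            Console.WriteLine("Triangles: {0}", triangleCount);

            // First float of the first triangle record (starts at byte 84).
            float firstValue = accessor.ReadSingle(84);
            Console.WriteLine(firstValue);
        }
    }
}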

(man I hope I got all these numbers right)

Is there any memory or CPU overhead in using a struct in C#

A struct may or may not be allocated on the stack. Reference types can never be allocated on the stack; they are always allocated on the heap.

From the Standard (ISO 23270), § 8.8:

8.8 Structs
The list of similarities between classes and structs is long - structs can implement interfaces, and can have the same kinds of members as classes. Structs differ from classes in several important ways, however: structs are value types rather than reference types, and inheritance is not supported for structs. Struct values are stored "on the stack" or "in-line". Careful programmers can sometimes enhance performance through judicious use of structs.

For example, the use of a struct rather than a class for a Point can make a large difference in the number of memory allocations performed at run time. The program below creates and initializes an array of 100 points.

With Point implemented as a class, 101 separate objects are instantiated - one for the array and one each for the 100 elements.

class Point
{
    public int x, y;

    public Point(int x, int y)
    {
        this.x = x;
        this.y = y;
    }
}

class Test
{
    static void Main()
    {
        Point[] points = new Point[100];
        for (int i = 0; i < 100; i++)
        {
            points[i] = new Point(i, i*i);
        }
    }
}

If Point is instead implemented as a struct, as in

struct Point
{
    public int x, y;

    public Point(int x, int y)
    {
        this.x = x;
        this.y = y;
    }
}

only one object is instantiated - the one for the array. The Point instances are allocated in-line within the array. This optimization can be misused. Using structs instead of classes can also make an application run slower or take up more memory, as passing a struct instance by value causes a copy of that struct to be created.

So the answer is "Maybe".

For your example, wrapping an array (a reference type) within a struct (a value type), doesn't mean anything: that array is still allocated on the heap.

If however, you change your class EdgeData to a struct, it can be (but may not be) allocated in-line within the array. So if your EdgeData class is, for instance, 16 bytes in size, and you create and populate an EdgeData[] of 100 entries, you are actually allocating 1 array instance (with a backing store sized to hold 100 object references) and 100 individual instances of your EdgeData class.

If EdgeData is a struct, you allocate 1 array with a backing store sized to hold 100 EdgeData instances (in this case, 1600 bytes, since our hypothetical EdgeData struct is 16 bytes in size.)

Iterating over the class version of the array, especially if the array is very large, may cause paging, since you are probably losing locality of reference as you jump all over the heap to hit the individual EdgeData instances.

Iteration over the struct version of the array preserves locality of reference, since the EdgeData instances are in-line.
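
If you want to see the difference yourself, here's a rough sketch (the EdgeData layouts below are hypothetical stand-ins with two int fields each):

using System;

class StructVsClassArray
{
    class EdgeDataClass   { public int A, B; }
    struct EdgeDataStruct { public int A, B; }

    static void Main()
    {
        const int count = 100000;

        long before = GC.GetTotalMemory(true);

        // Class version: one object per element, plus an array of references.
        var classes = new EdgeDataClass[count];
        for (int i = 0; i < count; i++) classes[i] = new EdgeDataClass();
        long afterClasses = GC.GetTotalMemory(true);

        // Struct version: the elements live in-line inside the array itself.
        var structs = new EdgeDataStruct[count];
        long afterStructs = GC.GetTotalMemory(true);

        Console.WriteLine("class array:  {0} bytes", afterClasses - before);
        Console.WriteLine("struct array: {0} bytes", afterStructs - afterClasses);

        GC.KeepAlive(classes);
        GC.KeepAlive(structs);
    }
}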

C#: Which uses more memory overhead? A string or a char holding a sequence of a word?

It would depend on how many times you used the thing in question. The way .NET handles strings is that string literals with the same value are interned, so the value is stored in memory only once and each string variable points to that value. However, each char array used would have its own memory allocation even if the contents of the char arrays were identical.

eg

// .NET Framework allocates memory for "word" only once; all three variables refer to this chunk of memory
String s1, s2, s3;
s1 = "word";
s2 = "word";
s3 = "word";

// .NET Framework allocates memory for each array
char[] c1 = new char[] { 'w','o','r','d'};
char[] c2 = new char[] { 'w','o','r','d'};
char[] c3 = new char[] { 'w','o','r','d'};

What is the overhead of the C# 'fixed' statement on a managed unsafe struct containing fixed arrays?

Empirically, the overhead appears to be, in the best case, ~270% on 32 bit JIT and ~200% on 64 bit (and the overhead gets worse the more times you "call" fixed). So I'd try to minimise your fixed blocks if performance is really critical.

Sorry, I'm not familiar enough with fixed / unsafe code to know why that's the case.


Details

I also added some TestMore methods which call your two test methods 10 times instead of 2, to give a more real-world scenario of multiple methods being called on your fixed struct.

The code I used:

using System;
using System.Diagnostics;

class Program
{
    static void Main(string[] args)
    {
        var someData = new ByteArray();
        int iterations = 1000000000;
        var multiple = new MultipleFixed();
        var single = new SingleFixed();

        // Warmup.
        for (int i = 0; i < 100; i++)
        {
            multiple.Test(ref someData);
            single.Test(ref someData);
            multiple.TestMore(ref someData);
            single.TestMore(ref someData);
        }

        // Environment.
        if (Debugger.IsAttached)
            Console.WriteLine("Debugger is attached!!!!!!!!!! This run is invalid!");
        Console.WriteLine("CLR Version: " + Environment.Version);
        Console.WriteLine("Pointer size: {0} bytes", IntPtr.Size);
        Console.WriteLine("Iterations: " + iterations);

        Console.Write("Starting run for Single... ");
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            single.Test(ref someData);
        }
        sw.Stop();
        Console.WriteLine("Completed in {0:N3}ms - {1:N2}/sec", sw.Elapsed.TotalMilliseconds, iterations / sw.Elapsed.TotalSeconds);

        Console.Write("Starting run for More Single... ");
        sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            single.TestMore(ref someData);
        }
        sw.Stop();
        Console.WriteLine("Completed in {0:N3}ms - {1:N2}/sec", sw.Elapsed.TotalMilliseconds, iterations / sw.Elapsed.TotalSeconds);

        Console.Write("Starting run for Multiple... ");
        sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            multiple.Test(ref someData);
        }
        sw.Stop();
        Console.WriteLine("Completed in {0:N3}ms - {1:N2}/sec", sw.Elapsed.TotalMilliseconds, iterations / sw.Elapsed.TotalSeconds);

        Console.Write("Starting run for More Multiple... ");
        sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            multiple.TestMore(ref someData);
        }
        sw.Stop();
        Console.WriteLine("Completed in {0:N3}ms - {1:N2}/sec", sw.Elapsed.TotalMilliseconds, iterations / sw.Elapsed.TotalSeconds);

        Console.ReadLine();
    }
}

unsafe struct ByteArray
{
    public fixed byte Data[1024];
}

class MultipleFixed
{
    unsafe void SetValue(ref ByteArray bytes, int index, byte value)
    {
        fixed (byte* data = bytes.Data)
        {
            data[index] = value;
        }
    }

    unsafe bool Validate(ref ByteArray bytes, int index, byte expectedValue)
    {
        fixed (byte* data = bytes.Data)
        {
            return data[index] == expectedValue;
        }
    }

    public void Test(ref ByteArray bytes)
    {
        SetValue(ref bytes, 0, 1);
        Validate(ref bytes, 0, 1);
    }

    public void TestMore(ref ByteArray bytes)
    {
        SetValue(ref bytes, 0, 1);
        Validate(ref bytes, 0, 1);
        SetValue(ref bytes, 0, 2);
        Validate(ref bytes, 0, 2);
        SetValue(ref bytes, 0, 3);
        Validate(ref bytes, 0, 3);
        SetValue(ref bytes, 0, 4);
        Validate(ref bytes, 0, 4);
        SetValue(ref bytes, 0, 5);
        Validate(ref bytes, 0, 5);
    }
}

class SingleFixed
{
    unsafe void SetValue(byte* data, int index, byte value)
    {
        data[index] = value;
    }

    unsafe bool Validate(byte* data, int index, byte expectedValue)
    {
        return data[index] == expectedValue;
    }

    public unsafe void Test(ref ByteArray bytes)
    {
        fixed (byte* data = bytes.Data)
        {
            SetValue(data, 0, 1);
            Validate(data, 0, 1);
        }
    }

    public unsafe void TestMore(ref ByteArray bytes)
    {
        fixed (byte* data = bytes.Data)
        {
            SetValue(data, 0, 1);
            Validate(data, 0, 1);
            SetValue(data, 0, 2);
            Validate(data, 0, 2);
            SetValue(data, 0, 3);
            Validate(data, 0, 3);
            SetValue(data, 0, 4);
            Validate(data, 0, 4);
            SetValue(data, 0, 5);
            Validate(data, 0, 5);
        }
    }
}

And the results in .NET 4.0, 32 bit JIT:

CLR Version: 4.0.30319.239
Pointer size: 4 bytes
Iterations: 1000000000
Starting run for Single... Completed in 2,092.350ms - 477,931,580.94/sec
Starting run for More Single... Completed in 2,236.767ms - 447,073,934.63/sec
Starting run for Multiple... Completed in 5,775.922ms - 173,132,528.92/sec
Starting run for More Multiple... Completed in 26,637.862ms - 37,540,550.36/sec

And in .NET 4.0, 64 bit JIT:

CLR Version: 4.0.30319.239
Pointer size: 8 bytes
Iterations: 1000000000
Starting run for Single... Completed in 2,907.946ms - 343,885,316.72/sec
Starting run for More Single... Completed in 2,904.903ms - 344,245,585.63/sec
Starting run for Multiple... Completed in 5,754.893ms - 173,765,185.93/sec
Starting run for More Multiple... Completed in 18,679.593ms - 53,534,358.13/sec

