What Does the Keyword "New" Do to a Struct in C#

What does the keyword new do to a struct in C#?

From struct (C# Reference) on MSDN:

When you create a struct object using the new operator, it gets created and the appropriate constructor is called. Unlike classes, structs can be instantiated without using the new operator. If you do not use new, the fields will remain unassigned and the object cannot be used until all of the fields are initialized.

To my understanding, you won't actually be able to use a struct properly without using new unless you make sure you initialise all the fields manually. If you use the new operator, then a properly-written constructor has the opportunity to do this for you.

Hope that clears it up. If you need clarification on this let me know.


Edit

There's quite a long comment thread, so I thought I'd add a bit more here. I think the best way to understand it is to give it a go. Make a console project in Visual Studio called "StructTest" and copy the following code into it.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace struct_test
{
    class Program
    {
        public struct Point
        {
            public int x, y;

            public Point(int x)
            {
                this.x = x;
                this.y = 5;
            }

            public Point(int x, int y)
            {
                this.x = x;
                this.y = y;
            }

            // It will break with this constructor. If uncommenting this one
            // comment out the other one with only one integer, otherwise it
            // will fail because you are overloading with duplicate parameter
            // types, rather than what I'm trying to demonstrate.
            /*public Point(int y)
            {
                this.y = y;
            }*/
        }

        static void Main(string[] args)
        {
            // Declare an object:
            Point myPoint;
            //Point myPoint = new Point(10, 20);
            //Point myPoint = new Point(15);
            //Point myPoint = new Point();

            // Initialize:
            // Try not using any constructor but comment out one of these
            // and see what happens. (It should fail when you compile it)
            myPoint.x = 10;
            myPoint.y = 20;

            // Display results:
            Console.WriteLine("My Point:");
            Console.WriteLine("x = {0}, y = {1}", myPoint.x, myPoint.y);

            Console.ReadKey(true);
        }
    }
}

Play around with it. Remove the constructors and see what happens. Try using a constructor that only initialises one variable (I've commented one out... it won't compile). Try with and without the new keyword (I've commented out some examples; uncomment them and give them a try).

What does happen when you make a struct without using the new keyword?

To quote the struct documentation:

A struct type is a value type that is typically used to encapsulate small groups of related variables, such as the coordinates of a rectangle or the characteristics of an item in an inventory.

Fields of value types get a default value when you declare them:

int i; // as a field, defaults to 0
bool b; // as a field, defaults to false

So do fields of reference types:

string s; // as a field, defaults to null

Local variables are different: the compiler treats them as unassigned until you give them a value. The struct you have created is a value type, and a local variable of that type isn't initialized by default, so you can't access its members. Therefore, you can't declare it as:

Person p;

Then use it directly.

Hence the error you got:

"Use of possibly unassigned field 'Age'"

Because p is still not initialized.
This also explains your second part of the question:

I won't be able to use it. The compiler gives an error "Use of an unassigned local variable p" but If I make instance of the struct with the default (parameterless) constructor there isn't such an error.

For the same reason, you couldn't directly assign p.Name = "something": p is still not initialized.

You must create a new instance of the struct as

Person p = new Person(); // or Person p = default(Person);

Now, what happens when you create a new instance of your struct without giving values to its members? Each one of them will hold its default value. For example, Age will be 0 because it's an int.
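
As a minimal sketch of that behaviour, assuming a hypothetical Person struct with public Name and Age fields (the asker's actual definition isn't shown here), both new Person() and default(Person) produce the same zeroed value:

```csharp
using System;

// Hypothetical struct for illustration; the asker's real Person may differ.
public struct Person
{
    public string Name; // reference type: defaults to null
    public int Age;     // value type: defaults to 0
}

public static class DefaultValuesDemo
{
    // Creates a Person through the implicit parameterless constructor.
    public static Person CreateWithNew()
    {
        return new Person();
    }

    // Returns the default value of Person, which is the same zeroed value.
    public static Person CreateWithDefault()
    {
        return default(Person);
    }

    public static void Main()
    {
        Person p = CreateWithNew();
        Console.WriteLine("Age = {0}, Name is null: {1}", p.Age, p.Name == null);

        // p is definitely assigned, so setting members is now allowed.
        p.Name = "something";
        p.Age = 30;
        Console.WriteLine("{0} is {1}", p.Name, p.Age);
    }
}
```

Once p has been created either way, the compiler no longer reports it as unassigned.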

Why is it possible to instantiate a struct without the new keyword?

The why is simply - because the spec says so. The how is a matter of ensuring that the entire block of memory is "definitely assigned", which means: assigning a value to each field of the struct. However, this requires 2 nasty things:

  • public fields (almost always bad)
  • mutable fields (generally bad in a struct)

so in most best-practice cases, you do need to use the new(...) syntax to invoke the constructor (or to zero the memory, for the parameterless constructor) for the type correctly.
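
To illustrate the best-practice case, here is a small sketch of an immutable struct whose state can only be set through its constructor, so new is the only way to populate it. The Money type and its members are hypothetical names chosen for this example, and the readonly struct modifier assumes C# 7.2 or later.

```csharp
using System;

// Hypothetical immutable value type: no public fields, no setters.
public readonly struct Money
{
    public decimal Amount { get; }
    public string Currency { get; }

    public Money(decimal amount, string currency)
    {
        // The constructor assigns every member, so the value is definitely assigned.
        Amount = amount;
        Currency = currency;
    }
}

public static class ImmutableStructDemo
{
    public static void Main()
    {
        // Money m; m.Amount = 5m;  // would not compile: Amount has no setter,
        //                          // and m is not definitely assigned.

        Money price = new Money(9.99m, "EUR"); // runs the declared constructor
        Money zero = new Money();              // zeroes the memory: 0 and null

        Console.WriteLine("{0} {1}", price.Amount, price.Currency);
        Console.WriteLine("{0} (currency is null: {1})", zero.Amount, zero.Currency == null);
    }
}
```

With no public mutable fields, field-by-field assignment is simply not available, which is why the constructor route is the usual one.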

Should I use new operator in C# Structures while declaring member functions in it?

Although you don't need the new to instantiate a structure, the compiler will not let you use any method or property unless all of its fields are initialized.

You can either create a constructor instead of the setvalue method, or assign it one after the other:

StructureMethods book1;
book1.author = "ds";
book1.name = "";
book1.page = 1;
book1.id = 3;
book1.printinfo();

StructureMethods book2 = new StructureMethods("test","bbb",0,1);
book2.printinfo();

struct StructureMethods
{
    public string name;
    public string author;
    public int page;
    public int id;

    public StructureMethods(string a, string b, int c, int d)
    {
        name = a;
        author = b;
        page = c;
        id = d;
    }

    public void printinfo()
    {
        Console.WriteLine("Name : " + name);
        Console.WriteLine("Author : " + author);
        Console.WriteLine("Total Page No : " + page);
        Console.WriteLine("ID : " + id);
    }
}

Side-Note1: there is no need to use return; at the end of the function unless you are returning some value in a non-void method.

Side-Note2: Since you mentioned you are new to C#, be sure you understand when to use structs and when classes. You can start here

Does using new on a struct allocate it on the heap or stack?

Okay, let's see if I can make this any clearer.

Firstly, Ash is right: the question is not about where value type variables are allocated. That's a different question - and one to which the answer isn't just "on the stack". It's more complicated than that (and made even more complicated by C# 2). I have an article on the topic and will expand on it if requested, but let's deal with just the new operator.

Secondly, all of this really depends on what level you're talking about. I'm looking at what the compiler does with the source code, in terms of the IL it creates. It's more than possible that the JIT compiler will do clever things in terms of optimising away quite a lot of "logical" allocation.

Thirdly, I'm ignoring generics, mostly because I don't actually know the answer, and partly because it would complicate things too much.

Finally, all of this is just with the current implementation. The C# spec doesn't specify much of this - it's effectively an implementation detail. There are those who believe that managed code developers really shouldn't care. I'm not sure I'd go that far, but it's worth imagining a world where in fact all local variables live on the heap - which would still conform with the spec.


There are two different situations with the new operator on value types: you can either call a parameterless constructor (e.g. new Guid()) or a parameterful constructor (e.g. new Guid(someString)). These generate significantly different IL. To understand why, you need to compare the C# and CLI specs: according to C#, all value types have a parameterless constructor. According to the CLI spec, value types get no implicit parameterless constructor, so a struct that doesn't declare one has none in its metadata. (Fetch the constructors of such a value type with reflection some time - you won't find a parameterless one.)

It makes sense for C# to treat the "initialize a value with zeroes" as a constructor, because it keeps the language consistent - you can think of new(...) as always calling a constructor. It makes sense for the CLI to think of it differently, as there's no real code to call - and certainly no type-specific code.
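
The reflection check mentioned above can be sketched as follows. HasParameterlessConstructor is a hypothetical helper name; Type.GetConstructor and Type.EmptyTypes are the standard reflection members.

```csharp
using System;

public static class ParameterlessCtorDemo
{
    // Looks for a public instance constructor with no parameters on the given type.
    public static bool HasParameterlessConstructor(Type type)
    {
        return type.GetConstructor(Type.EmptyTypes) != null;
    }

    public static void Main()
    {
        // Guid has no parameterless constructor in its metadata...
        Console.WriteLine(HasParameterlessConstructor(typeof(Guid)));   // False
        // ...but C# still accepts new Guid(), which yields the zeroed value.
        Console.WriteLine(new Guid() == Guid.Empty);                    // True
        // A class with a declared parameterless constructor does show up.
        Console.WriteLine(HasParameterlessConstructor(typeof(object))); // True
    }
}
```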

It also makes a difference what you're going to do with the value after you've initialized it. The IL used for

Guid localVariable = new Guid(someString);

is different to the IL used for:

myInstanceOrStaticVariable = new Guid(someString);

In addition, if the value is used as an intermediate value, e.g. an argument to a method call, things are slightly different again. To show all these differences, here's a short test program. It doesn't show the difference between static variables and instance variables: the IL would differ between stfld and stsfld, but that's all.

using System;

public class Test
{
    static Guid field;

    static void Main() {}
    static void MethodTakingGuid(Guid guid) {}

    static void ParameterisedCtorAssignToField()
    {
        field = new Guid("");
    }

    static void ParameterisedCtorAssignToLocal()
    {
        Guid local = new Guid("");
        // Force the value to be used
        local.ToString();
    }

    static void ParameterisedCtorCallMethod()
    {
        MethodTakingGuid(new Guid(""));
    }

    static void ParameterlessCtorAssignToField()
    {
        field = new Guid();
    }

    static void ParameterlessCtorAssignToLocal()
    {
        Guid local = new Guid();
        // Force the value to be used
        local.ToString();
    }

    static void ParameterlessCtorCallMethod()
    {
        MethodTakingGuid(new Guid());
    }
}

Here's the IL for the class, excluding irrelevant bits (such as nops):

.class public auto ansi beforefieldinit Test extends [mscorlib]System.Object    
{
// Removed Test's constructor, Main, and MethodTakingGuid.

.method private hidebysig static void ParameterisedCtorAssignToField() cil managed
{
.maxstack 8
L_0001: ldstr ""
L_0006: newobj instance void [mscorlib]System.Guid::.ctor(string)
L_000b: stsfld valuetype [mscorlib]System.Guid Test::field
L_0010: ret
}

.method private hidebysig static void ParameterisedCtorAssignToLocal() cil managed
{
.maxstack 2
.locals init ([0] valuetype [mscorlib]System.Guid guid)
L_0001: ldloca.s guid
L_0003: ldstr ""
L_0008: call instance void [mscorlib]System.Guid::.ctor(string)
// Removed ToString() call
L_001c: ret
}

.method private hidebysig static void ParameterisedCtorCallMethod() cil managed
{
.maxstack 8
L_0001: ldstr ""
L_0006: newobj instance void [mscorlib]System.Guid::.ctor(string)
L_000b: call void Test::MethodTakingGuid(valuetype [mscorlib]System.Guid)
L_0011: ret
}

.method private hidebysig static void ParameterlessCtorAssignToField() cil managed
{
.maxstack 8
L_0001: ldsflda valuetype [mscorlib]System.Guid Test::field
L_0006: initobj [mscorlib]System.Guid
L_000c: ret
}

.method private hidebysig static void ParameterlessCtorAssignToLocal() cil managed
{
.maxstack 1
.locals init ([0] valuetype [mscorlib]System.Guid guid)
L_0001: ldloca.s guid
L_0003: initobj [mscorlib]System.Guid
// Removed ToString() call
L_0017: ret
}

.method private hidebysig static void ParameterlessCtorCallMethod() cil managed
{
.maxstack 1
.locals init ([0] valuetype [mscorlib]System.Guid guid)
L_0001: ldloca.s guid
L_0003: initobj [mscorlib]System.Guid
L_0009: ldloc.0
L_000a: call void Test::MethodTakingGuid(valuetype [mscorlib]System.Guid)
L_0010: ret
}

.field private static valuetype [mscorlib]System.Guid field
}

As you can see, there are lots of different instructions used for calling the constructor:

  • newobj: Allocates the value on the stack, calls a parameterised constructor. Used for intermediate values, e.g. for assignment to a field or use as a method argument.
  • call instance: Uses an already-allocated storage location (whether on the stack or not). This is used in the code above for assigning to a local variable. If the same local variable is assigned a value several times using several new calls, it just initializes the data over the top of the old value - it doesn't allocate more stack space each time.
  • initobj: Uses an already-allocated storage location and just wipes the data. This is used for all our parameterless constructor calls, including those which assign to a local variable. For the method call, an intermediate local variable is effectively introduced, and its value wiped by initobj.

I hope this shows how complicated the topic is, while shining a bit of light on it at the same time. In some conceptual senses, every call to new allocates space on the stack - but as we've seen, that isn't what really happens even at the IL level. I'd like to highlight one particular case. Take this method:

void HowManyStackAllocations()
{
    Guid guid = new Guid();
    // [...] Use guid
    guid = new Guid(someBytes);
    // [...] Use guid
    guid = new Guid(someString);
    // [...] Use guid
}

That "logically" has 4 stack allocations - one for the variable, and one for each of the three new calls - but in fact (for that specific code) the stack is only allocated once, and then the same storage location is reused.

EDIT: Just to be clear, this is only true in some cases... in particular, the value of guid won't be visible if the Guid constructor throws an exception, which is why the C# compiler is able to reuse the same stack slot. See Eric Lippert's blog post on value type construction for more details and a case where it doesn't apply.

I've learned a lot in writing this answer - please ask for clarification if any of it is unclear!

Struct initialization and new operator

The reason one is valid while the other is not is that you cannot call methods on uninitialised objects. Property setters are methods too.

public struct X
{
    public int a;
    public void setA(int value)
    { this.a = value; }
}

public struct Y
{
    public int a { get; set; }
}

class Program
{
    static void Main(string[] args)
    {
        X x;
        x.setA(1); // A: error
        x.a = 2; // B: okay

        Y y;
        y.a = 3; // C: equivalent to A
    }
}

The reason that is not allowed is that the property setter could observe the uninitialised state of the object. The caller does not know whether the property setter merely sets a field, or does more than that.
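
A minimal sketch of the usual fix, reusing the shape of Y from the example above: create the value with new first, so that every field (including the hidden backing field of the auto-property) is assigned before the setter runs. SetterDemo and SetAfterNew are hypothetical names used only for this illustration.

```csharp
using System;

public struct Y
{
    public int a { get; set; }
}

public static class SetterDemo
{
    // Initialises the struct with new, then calls the property setter.
    public static int SetAfterNew(int value)
    {
        Y y = new Y(); // zeroes the whole struct, so y is definitely assigned
        y.a = value;   // calling the setter is now allowed
        return y.a;
    }

    public static void Main()
    {
        Console.WriteLine(SetAfterNew(3));
    }
}
```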

If a struct is a value type why can I new it?

Because they have constructors.

The new operator doesn't mean "this is a reference type"; it means "this type has a constructor". When you new something you create an instance, and in doing so you invoke a constructor.

For that matter, all value and reference types have constructors (at the very least a default constructor taking no args if the type itself doesn't define any).

Why does adding a primitive struct to a List not require the new keyword, whereas adding a non-primitive struct does?

What exactly does the new keyword do in C#?

It's all listed here. The most relevant one to this question is "constructor invocation". Structs and classes have constructors, and constructors create instances of structs and classes.

When you do:

new KeyValuePair<int,int>(10,20)

you are calling this constructor.

int, which is an alias for the Int32 struct, does not have a constructor that accepts a parameter of type int. This is the reason why you can't do:

new int(10)

Note that calling a constructor isn't the only way to create an instance of a struct. You can also do something like:

var defaultKVP = default(KeyValuePair<int, int>); // gets the default value of the type KeyValuePair<int, int>
// defaultKVP is an instance of KeyValuePair<int, int>! It's not null! Structs can't be null :)

The default value of a struct is defined by setting all its value-typed fields to their default values, and reference-typed fields to null.

The reason why an integer literal like 10 is an instance of the struct Int32, is, well, compiler magic. The spec says so, so it is implemented this way.
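
Putting these pieces together, here is a short sketch of both kinds of struct being added to a List. ListOfStructsDemo and BuildPairs are hypothetical names for this example; the collection types come from System.Collections.Generic.

```csharp
using System;
using System.Collections.Generic;

public static class ListOfStructsDemo
{
    // Builds a list using a constructor call and the default value.
    public static List<KeyValuePair<int, int>> BuildPairs()
    {
        var pairs = new List<KeyValuePair<int, int>>();
        pairs.Add(new KeyValuePair<int, int>(10, 20)); // constructor invocation
        pairs.Add(default(KeyValuePair<int, int>));    // default value: key 0, value 0
        return pairs;
    }

    public static void Main()
    {
        var numbers = new List<int>();
        numbers.Add(10);        // a literal is already an Int32 value
        numbers.Add(new int()); // the parameterless form is allowed and yields 0

        Console.WriteLine(string.Join(", ", numbers));
        foreach (var pair in BuildPairs())
        {
            Console.WriteLine("{0} -> {1}", pair.Key, pair.Value);
        }
    }
}
```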


