When to Use "New" and When Not To, in C++

When to use new and when not to, in C++?

You should use new when you wish an object to remain in existence until you delete it. If you do not use new then the object will be destroyed when it goes out of scope. Some examples of this are:

void foo()
{
Point p = Point(0,0);
} // p is now destroyed.

for (...)
{
Point p = Point(0,0);
} // p is destroyed after each loop

Some people will say that the use of new decides whether your object is on the heap or the stack, but that is only true of variables declared within functions.

In the example below the location of 'p' will be where its containing object, Foo, is allocated. I prefer to call this 'in-place' allocation.

class Foo
{

Point p;
}; // p will be automatically destroyed when foo is.

Allocating (and freeing) objects with the use of new is far more expensive than if they are allocated in-place so its use should be restricted to where necessary.

A second example of when to allocate via new is for arrays. You cannot* change the size of an in-place or stack array at run-time so where you need an array of undetermined size it must be allocated via new.

E.g.

void foo(int size)
{
Point* pointArray = new Point[size];
...
delete [] pointArray;
}

(*pre-emptive nitpicking - yes, there are extensions that allow variable sized stack allocations).

When should I use the new keyword in C++?

Method 1 (using new)

  • Allocates memory for the object on the free store (This is frequently the same thing as the heap)
  • Requires you to explicitly delete your object later. (If you don't delete it, you could create a memory leak)
  • Memory stays allocated until you delete it. (i.e. you could return an object that you created using new)
  • The example in the question will leak memory unless the pointer is deleted; and it should always be deleted, regardless of which control path is taken, or if exceptions are thrown.

Method 2 (not using new)

  • Allocates memory for the object on the stack (where all local variables go) There is generally less memory available for the stack; if you allocate too many objects, you risk stack overflow.
  • You won't need to delete it later.
  • Memory is no longer allocated when it goes out of scope. (i.e. you shouldn't return a pointer to an object on the stack)

As far as which one to use; you choose the method that works best for you, given the above constraints.

Some easy cases:

  • If you don't want to worry about calling delete, (and the potential to cause memory leaks) you shouldn't use new.
  • If you'd like to return a pointer to your object from a function, you must use new

Why should C++ programmers minimize use of 'new'?

There are two widely-used memory allocation techniques: automatic allocation and dynamic allocation. Commonly, there is a corresponding region of memory for each: the stack and the heap.

Stack

The stack always allocates memory in a sequential fashion. It can do so because it requires you to release the memory in the reverse order (First-In, Last-Out: FILO). This is the memory allocation technique for local variables in many programming languages. It is very, very fast because it requires minimal bookkeeping and the next address to allocate is implicit.

In C++, this is called automatic storage because the storage is claimed automatically at the end of scope. As soon as execution of current code block (delimited using {}) is completed, memory for all variables in that block is automatically collected. This is also the moment where destructors are invoked to clean up resources.

Heap

The heap allows for a more flexible memory allocation mode. Bookkeeping is more complex and allocation is slower. Because there is no implicit release point, you must release the memory manually, using delete or delete[] (free in C). However, the absence of an implicit release point is the key to the heap's flexibility.

Reasons to use dynamic allocation

Even if using the heap is slower and potentially leads to memory leaks or memory fragmentation, there are perfectly good use cases for dynamic allocation, as it's less limited.

Two key reasons to use dynamic allocation:

  • You don't know how much memory you need at compile time. For instance, when reading a text file into a string, you usually don't know what size the file has, so you can't decide how much memory to allocate until you run the program.

  • You want to allocate memory which will persist after leaving the current block. For instance, you may want to write a function string readfile(string path) that returns the contents of a file. In this case, even if the stack could hold the entire file contents, you could not return from a function and keep the allocated memory block.

Why dynamic allocation is often unnecessary

In C++ there's a neat construct called a destructor. This mechanism allows you to manage resources by aligning the lifetime of the resource with the lifetime of a variable. This technique is called RAII and is the distinguishing point of C++. It "wraps" resources into objects. std::string is a perfect example. This snippet:

int main ( int argc, char* argv[] )
{
std::string program(argv[0]);
}

actually allocates a variable amount of memory. The std::string object allocates memory using the heap and releases it in its destructor. In this case, you did not need to manually manage any resources and still got the benefits of dynamic memory allocation.

In particular, it implies that in this snippet:

int main ( int argc, char* argv[] )
{
std::string * program = new std::string(argv[0]); // Bad!
delete program;
}

there is unneeded dynamic memory allocation. The program requires more typing (!) and introduces the risk of forgetting to deallocate the memory. It does this with no apparent benefit.

Why you should use automatic storage as often as possible

Basically, the last paragraph sums it up. Using automatic storage as often as possible makes your programs:

  • faster to type;
  • faster when run;
  • less prone to memory/resource leaks.

Bonus points

In the referenced question, there are additional concerns. In particular, the following class:

class Line {
public:
Line();
~Line();
std::string* mString;
};

Line::Line() {
mString = new std::string("foo_bar");
}

Line::~Line() {
delete mString;
}

Is actually a lot more risky to use than the following one:

class Line {
public:
Line();
std::string mString;
};

Line::Line() {
mString = "foo_bar";
// note: there is a cleaner way to write this.
}

The reason is that std::string properly defines a copy constructor. Consider the following program:

int main ()
{
Line l1;
Line l2 = l1;
}

Using the original version, this program will likely crash, as it uses delete on the same string twice. Using the modified version, each Line instance will own its own string instance, each with its own memory and both will be released at the end of the program.

Other notes

Extensive use of RAII is considered a best practice in C++ because of all the reasons above. However, there is an additional benefit which is not immediately obvious. Basically, it's better than the sum of its parts. The whole mechanism composes. It scales.

If you use the Line class as a building block:

 class Table
{
Line borders[4];
};

Then

 int main ()
{
Table table;
}

allocates four std::string instances, four Line instances, one Table instance and all the string's contents and everything is freed automagically.

Why is it a bad idea to use 'new'?

The point is that new, much like a pregnancy, creates a resource that is managed manually (i.e. by you), and as such it comes with responsibility.

C++ is a language for library writing, and any time you see a responsibility, the "C++" approach is to write a library element that handles this, and only this, responsibility. For dynamic memory allocation, those library components already exist and are summarily referred to as "smart pointers"; you'll want to look at std::unique_ptr and std::shared_ptr (or their TR1 or Boost equivalents).

While writing those single-responsibility building blocks, you will indeed need to say new and delete. But you do that once, and you think about it carefully and make sure you provide the correct copy, assignment and destruction semantics. (From the point of exception safety, single responsibility is crucial, since dealing with more than one single resource at a time is horribly unscalable.)

Once you have everything factored into suitable building blocks, you compose those blocks into bigger and bigger systems of code, but at that point you don't need to exercise any manual responsibility any more, since the building blocks already do this for you.

Since the standard library offers resource managing classes for the vast majority of use cases (dynamic arrays, smart pointers, file handles, strings), the point is that a well-factored and crafted C++ project should have very little need for any sort of manual resource management, which includes the use of new. All your handler objects are either automatic (scoped), or members of other classes whose instances are in turn scoped or managed by someone.

With this in mind, the only time you should be saying new is when you create a new resource-managing object; although even then that's not always necessary:

std::unique_ptr<Foo> p1(new Foo(1, 'a', -2.5));   // unique pointer
std::shared_ptr<Foo> p2(new Foo(1, 'a', -2.5)); // shared pointer
auto p3 = std::make_shared<Foo>(1, 'a', -2.5); // equivalent to p2, but better

Update: I think I may have addressed only half the OP's concerns. Many people coming from other languages seem to be under the impression that any object must be instantiated with a new-type expression. This is in itself a very unhelpful mindset when approaching C++:

The crucial distinction in C++ is that of object lifetime, or "storage class". This can be one of: automatic (scoped), static (permanent), or dynamic (manual). Global variables have static lifetime. The vast majority of all variables (which are declared as Foo x; inside a local scope) have automatic lifetime. It is only for dynamic storage that we use a new expression. The most important thing to realize when coming to C++ from another OO language is that most objects only ever need to have automatic lifetime, and thus there is never anything to worry about.

So the first realization should be that "C++ rarely needs dynamic storage". I feel that this may have been part of the OP's question. The question may have been better phrased as "is it a really bad idea to allocate objects dynamically?". It is only after you decide that you really need dynamic storage that we get to the discussion proper of whether you should be saying new and delete a lot, or if there are preferable alternatives, which is the point of my original answer.

when should I use the new operator in C++

The first one creates a Money object on the stack, its lifespan is within the scope of when it was created. Meaning when you hit a } it goes out of scope and the memory is returned. Use this when you want to create an object within one function.

The second one creates a Money object on the heap, its lifespan is as long as you want it to be, namely until you delete it. Use this when you want your object to be passed around to different functions

In C++, why should I use new if I can directly assign an integer to the pointer without using new?

int *ptr_a = 1;doesn't create a new int, this creates a pointer ptr_a and assigns a value of 1 to it, meaning that this pointer will point to the address 0x00000001. It's not an integer. If you try to use the pointer later with *ptr_a = 2, you will get a segmentation fault because the pointer doesn't point to an allocated memory space (in this precise case, it points to kernel memory, which is a no-no).

A good principle nowadays is to used std::unique_ptr<int> ptr_a = std::make_unique<int>(1)which will allocate a new int with a value of 1 and deallocate the memory once ptr_a gets out of scope.

When do I need to use the new operator when declaring nodes in C++?

You would use new only when dynamically-allocating new memory for storing a new node. In your case, NewNode is just pointing to the existing memory pointed to by List, so this is not the right place for using the new operator.



Related Topics



Leave a reply



Submit