Should Accessors Return Values or Constant References

Should accessors return values or constant references?

The short answer is: it depends :-)

From the performance point of view returning a reference is (usually) better: you save the creation of a new std::string object. (In this case, the creation is costly enough and the size of the object is high enough to justify make this choice at least worth considering - but this is not always the case. With a smaller or built-in type the performance difference may be negligible, or returning by value may even be cheaper).

From the security point of view returning a copy of the original value may be better, as constness can be cast away by malicious clients. This is especially to be taken into consideration if the method is part of a public API, i.e. you(r team) have no full control over how the returned value is (mis)used.

C++ const accessors and references best practice

Where my accessors return a const reference variable, ie const std::string, this means that it (my class's member variable) cannot be changed. Is that correct?

Correct. A variable cannot be changed through a const reference.

The last const before a method's body indicates that no members of the class for which that method is a part of can be altered, unless they are designated mutable. Is this also correct?

Correct. It also allows the function to be called on a const object.

Where I'm passing in const method parameters, this means that I ensure these parameters are always stored as they were passed in, thus protecting any original variables being passed in from being altered by the method body. Is this also correct?

Correct. Same can be achieved with accepting the argument by value.

With regards to the mutable keyword, under what circumstances would I actually want to use this?

See When have you used C++ 'mutable' keyword?

Accessors seem like a good idea, even for data that will never be publicly exposed, because it ensures a single-point of entry, allowing for easier debugging and so on. Am I thinking along the right lines here

I don't buy this argument. Watchpoints allow for easy debugging of member variables regardless of where they're accessed from.

From a purely syntactical perspective, should I be writing my references like const int& intInput or const int &intInput.

Both are syntactically equivalent and the choice between them is purely aesthetic.

Finally, is what I'm doing in the example above good practice?

There is no general answer. Accessors are sometimes useful. Often they're redundant. If you provide a function that allows setting the value directly, such as you do here, then you might as well get rid of the accessors and make the member public.

C++ getter temporary vs const reference

Your method makes a copy of points. Modifications to the vector itself will not be propagated to the points inside your class. You could return a const reference instead to avoid copying. The caller would be able to copy if he wants, but he would be in control.

Note that the copy that is made when returning a vector is shallow. This means that the individual points are exposed: if the caller makes modifications to points (as opposed to making modifications to the vector) these modifications will propagate to the points pointed to by the vector inside your class.

Protecting the content of Points is a lot harder - it would require you to do manual copying. It would also require your caller to make manual freeing. In general, though, you would be better off using smart pointers (std::unique_ptr or std::shared_ptr) for easier resource management.

Which are the implications of return a value as constant, reference and constant reference in C++?

Here's the lowdown on all your cases:

• Return by reference: The function call can be used as the left hand side of an assignment. e.g. using operator overloading, if you have operator[] overloaded, you can say something like

a[i] = 7;

(when returning by reference you need to ensure that the object you return is available after the return: you should not return a reference to a local or a temporary)

• Return as constant value: Prevents the function from being used on the left side of an assignment expression. Consider the overloaded operator+. One could write something like:

a + b = c; // This isn't right

Having the return type of operator+ as "const SomeType" allows the return by value and at the same time prevents the expression from being used on the left side of an assignment.

Return as constant value also allows one to prevent typos like these:

if (someFunction() = 2)

when you meant

if (someFunction() == 2)

If someFunction() is declared as

const int someFunction()

then the if() typo above would be caught by the compiler.

• Return as constant reference: This function call cannot appear on the left hand side of an assignment, and you want to avoid making a copy (returning by value). E.g. let's say we have a class Student and we'd like to provide an accessor id() to get the ID of the student:

class Student
{
std::string id_;

public:

const std::string& id() const;
};

const std::string& Student::id()
{
return id_;
}

Consider the id() accessor. This should be declared const to guarantee that the id() member function will not modify the state of the object. Now, consider the return type. If the return type were string& then one could write something like:

Student s;
s.id() = "newId";

which isn't what we want.

We could have returned by value, but in this case returning by reference is more efficient. Making the return type a const string& additionally prevents the id from being modified.

Comparison between constant accessors of private members

There are semantic/behavioral differences that are far more significant than your (broken) benchmarks.

Copy semantics are broken

A live example:

#include <iostream>

class Broken {
public:
Broken(int i): read_only(read_write), read_write(i) {}

int const& read_only;

void set(int i) { read_write = i; }

private:
int read_write;
};

int main() {
Broken original(5);
Broken copy(original);

std::cout << copy.read_only << "\n";

original.set(42);

std::cout << copy.read_only << "\n";
return 0;
}

Yields:

5
42

The problem is that when doing a copy, copy.read_only points to original.read_write. This may lead to dangling references (and crashes).

This can be fixed by writing your own copy constructor, but it is painful.

Assignment is broken

A reference cannot be reseated (you can alter the content of its referee but not switch it to another referee), leading to:

int main() {
Broken original(5);
Broken copy(4);
copy = original;

std::cout << copy.read_only << "\n";

original.set(42);

std::cout << copy.read_only << "\n";
return 0;
}

generating an error:

prog.cpp: In function 'int main()':
prog.cpp:18:7: error: use of deleted function 'Broken& Broken::operator=(const Broken&)'
copy = original;
^
prog.cpp:3:7: note: 'Broken& Broken::operator=(const Broken&)' is implicitly deleted because the default definition would be ill-formed:
class Broken {
^
prog.cpp:3:7: error: non-static reference member 'const int& Broken::read_only', can't use default assignment operator

This can be fixed by writing your own copy constructor, but it is painful.

Unless you fix it, Broken can only be used in very restricted ways; you may never manage to put it inside a std::vector for example.

Increased coupling

Giving away a reference to your internals increases coupling. You leak an implementation detail (the fact that you are using an int and not a short, long or long long).

With a getter returning a value, you can switch the internal representation to another type, or even elide the member and compute it on the fly.

This is only significant if the interface is exposed to clients expecting binary/source-level compatibility; if the class is only used internally and you can afford to change all users if it changes, then this is not an issue.


Now that semantics are out of the way, we can speak about performance differences.

Increased object size

While references can sometimes be elided, it is unlikely to ever happen here. This means that each reference member will increase the size of an object by at least sizeof(void*), plus potentially some padding for alignment.

The original class MyClassA has a size of 4 on x86 or x86-64 platforms with mainstream compilers.

The Broken class has a size of 8 on x86 and 16 on x86-64 platforms (the latter because of padding, as pointers are aligned on 8-bytes boundaries).

An increased size can bust up CPU caches, with a large number of items you may quickly experience slow downs due to it (well, not that it'll be easy to have vectors of Broken due to its broken assignment operator).

Better performance in debug

As long as the implementation of the getter is inline in the class definition, then the compiler will strip the getter whenever you compile with a sufficient level of optimizations (-O2 or -O3 generally, -O1 may not enable inlining to preserve stack traces).

Thus, the performance of access should only vary in debug code, where performance is least necessary (and otherwise so crippled by plenty of other factors that it matters little).


In the end, use a getter. It's established convention for a good number of reasons :)

Why use a function rather than a reference to member?

There are a few reasons why (inline) functions that return an appropriate reference are better:

  1. A reference will require memory (typically the same amount as a pointer) in each object

  2. References typically have the same alignment as pointers, thus causing the surrounding object to potentially require a higher alignment and thus wasting even more memory

  3. Initializing a reference requires (a minuscule amount of) time

  4. Having a member field of reference type will disable the default copy and move assignment operators since references are not reseatable

  5. Having a member field of reference type will cause the automatically generated default copy and move constructors to be incorrect, since they will now contain references to the members of other objects

  6. Functions can do additional checking like verifying invariants in debug builds

Be aware that due to inlining, the function will usually not incur any extra costs beyond a potentially slightly larger binary.

Any problems with this C++ const reference accessor interface idiom?

There is an aliasing issue in that because you expose a reference to the foo_t's internal data, it's possible for code external to a foo_t object to hold on to references into its data beyond the object's lifetime. Consider:

foo_t* f = new foo_t();
const int& x2 = f->x;
delete f;
std::cout << x2; // Undefined behavior; x2 refers into a foo_t object that was deleted

Or, even simpler:

const int& x2 = foo_t().x;
std::cout << x2; // Undefined behvior; x2 refers into a foo_t object that no longer exists

These aren't particularly realistic examples, but this is a potential issue whenever an object exposes or returns a reference to its data (public or private). Of course, it's just as possible to hold on to a reference to the foo_t object itself beyond its lifetime, but that might be harder to miss or to do by accident.

Not that this is an argument against what you're doing. In fact I've used this pattern before (for a different reason) and I don't think there's anything inherently wrong with it, aside from the lack of encapsulation, which you seem to recognize. The above issue is just something to be aware of.



Related Topics



Leave a reply



Submit