What Exactly Is a 'Side-Effect' in C++

What is a side effect in C?

My question is: does the C standard explicitly describe the meaning of side effects?

The sentence in the C standard you quote (C 1999 5.1.2.3 2, and the same in C 2018) explicitly describes the meaning of side effects. These are further explained below.

Modifying An Object

Modifying an object is understood to include the things that update the stored bytes that represent the object. I believe a complete list of them is:

  • Simple assignment (=) and compound assignment (*=, /=, %=, +=, -=, <<=, >>=, &=, ^=, and |=).
  • The increment and decrement operators (++ and --), both prefix and postfix.
  • Initialization of an object included in its definition.
  • Library routines that are specified to change objects, such as memcpy.
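As a minimal sketch of the list above, the following function touches each category once. The function names modify_examples and demo_modify are made up for illustration; every commented statement modifies an object and is therefore a side effect in the standard's sense, even though all the objects are local.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical demo: each commented statement modifies an object. */
int modify_examples(void)
{
    int a = 1;                  /* initialization in the definition */
    int b[2] = {0, 0};          /* initialization in the definition */

    a = 5;                      /* simple assignment: a is 5 */
    a += 2;                     /* compound assignment: a is 7 */
    a++;                        /* postfix increment: a is 8 */
    --a;                        /* prefix decrement: a is 7 */

    int src[2] = {3, 4};
    memcpy(b, src, sizeof b);   /* library routine specified to change b */

    return a + b[0] + b[1];     /* 7 + 3 + 4 */
}

void demo_modify(void)
{
    assert(modify_examples() == 14);
}
```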

Accessing a Volatile Object

“Access” is defined in C 2018 3.1 as “⟨execution-time action⟩ to read or modify the value of an object”. If x is a volatile int, then using the value of x in an expression accesses it (when the expression is evaluated), because it reads the value of x. You can follow this more specifically in that 6.3.2.1 2 tells us that the use of x in an expression results in the value of x being taken:

Except when it is the operand of the sizeof operator, the unary & operator, the ++ operator, the -- operator, or the left operand of the . operator or an assignment operator, an lvalue that does not have array type is converted to the value stored in the designated object (and is no longer an lvalue); this is called lvalue conversion.

So the x in the expression which is, by itself, just a designation of the object x, is converted to the value stored in x, which means that stored value is read from memory. That is an access of x.
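A small sketch of this, assuming a file-scope volatile int named x as in the text (the function names are invented for illustration). Each use of x that undergoes lvalue conversion is an access; the operand of sizeof is not converted, so it is not an access.

```c
#include <assert.h>
#include <stddef.h>

volatile int x = 42;    /* hypothetical volatile object */

int read_twice(void)
{
    (void)x;            /* lvalue conversion reads x; the value is discarded */
    int first = x;      /* one access */
    int second = x;     /* a second, separate access */
    return first + second;
}

size_t size_only(void)
{
    return sizeof x;    /* operand of sizeof: no lvalue conversion, no access */
}

void demo_volatile_access(void)
{
    assert(read_twice() == 84);
    assert(size_only() == sizeof(int));
}
```

Nothing else changes x in this sketch, so both reads see 42. The point is only that the abstract machine performs three reads in read_twice and none in size_only.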

Modifying a volatile object is the same as modifying any object, described above.

Modifying a File

Files are modified by way of the routines defined in clause 7.21 (“Input/output <stdio.h>”).
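For illustration only, here is a sketch that modifies a temporary file through <stdio.h> and reads the contents back. The helper name write_and_count is an assumption made for this example, and it returns -1 if the environment does not allow a temporary file to be created.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes msg to a temporary file, reads it back,
   and returns the number of characters read, or -1 on any I/O failure.
   Assumes msg is shorter than 64 characters and contains no newline. */
int write_and_count(const char *msg)
{
    assert(msg != NULL);

    FILE *fp = tmpfile();               /* temporary file, removed on close */
    if (fp == NULL)
        return -1;

    char buf[64] = {0};
    int result = -1;

    if (fputs(msg, fp) != EOF) {        /* side effect: the file is modified */
        rewind(fp);
        if (fgets(buf, sizeof buf, fp) != NULL)
            result = (int)strlen(buf);
    }
    fclose(fp);
    return result;
}
```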

What is side effect in C?

A function is (should be) a black box, in which the return value, or the value of a variable passed by reference, should be the only thing that may change depending upon the input parameters.

Any other observable change that the function produces outside these cases, is a side-effect. The most well-known example may be the printf() function which, besides returning the number of written characters, changes the contents of the standard output, which means altering some memory buffer associated with a pipe, a file, or the screen, for instance, and which doesn't belong to the local environment of the function.
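To make the black-box idea concrete, here is a sketch contrasting two functions that return the same value. The names add, add_logged, and call_count are invented for this example. In the colloquial sense used above, only add_logged has side effects: it changes a counter that lives outside the call and writes to standard output.

```c
#include <assert.h>
#include <stdio.h>

static int call_count = 0;      /* state that outlives any single call */

/* The result depends only on the arguments; nothing else changes. */
int add(int a, int b)
{
    return a + b;
}

/* Same return value, but with two observable changes besides it. */
int add_logged(int a, int b)
{
    call_count++;                           /* modifies an outside object */
    printf("add_logged(%d, %d)\n", a, b);   /* modifies standard output */
    return a + b;
}

void demo_black_box(void)
{
    assert(add(2, 3) == 5);
    assert(call_count == 0);        /* add left the counter alone */
    assert(add_logged(2, 3) == 5);
    assert(call_count == 1);        /* add_logged did not */
}
```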

Does a local variable assignment constitute a side effect?

So, who is right?

When it comes to the definitions in the standard, it's the standard.

Is x = 1 really a side effect, even though it does not change anything outside its scope?

Yes, the standard paragraph you quoted said as much.

Or have I wrongly interpreted the standard?

You understood and applied the standard paragraph correctly to x = 1. But you were wrong to try and apply an outside colloquial definition onto the standard text. The C standard is not meant to teach anyone about C. It is a formal document whose sole purpose is to define how the C abstract machine executes a translated program. To that end it defines a bunch of concepts and terms. That's it. When referring to those terms in order to divine the intended behavior of a C program, only the standard's definition applies.

The book, on the other hand, does aim to teach you C. Its purpose is to give you a "feel" for how a C program behaves. To that end it may very well use colloquialisms and imprecise language; that's to be expected. You should not disregard the book if it has good reviews, but bear in mind that it is not a normative reference, unlike the standard.

When exactly does a method have side effects?

Your instructor is mistaken. With apologies to the SO editors for not pasting the entire article here, this is what Wikipedia has to say:

http://en.wikipedia.org/wiki/Side_effect_(computer_science)

Money Quote #1:

In computer science, a function or expression is said to have a side effect if, in addition to producing a value, it also modifies some state or has an observable interaction with calling functions or the outside world.

Money Quote #2:

In the presence of side effects, a program's behavior depends on past history; that is, the order of evaluation matters.

Non-NOP setters always satisfy that criterion.
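The original question was about methods, but the same point can be sketched in C with a struct and a pair of functions. The names account, set_balance, and get_balance are hypothetical. The setter modifies state, so what the getter returns depends on whether it runs before or after the setter.

```c
#include <assert.h>

struct account { int balance; };    /* hypothetical type for illustration */

/* Setter: modifies the object it is given, which is a side effect. */
void set_balance(struct account *acc, int value)
{
    acc->balance = value;
}

/* Getter: only reads, so calling it changes nothing. */
int get_balance(const struct account *acc)
{
    return acc->balance;
}

void demo_order(void)
{
    struct account acc = {0};
    int before = get_balance(&acc);     /* read before the setter runs */
    set_balance(&acc, 100);
    int after = get_balance(&acc);      /* read after the setter runs */
    assert(before == 0 && after == 100);
}
```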

Avoiding Side Effects in C

Changing the contents of an array is always a "side-effect" in C, as the formal definition goes. If you are rather looking for a way to make an array etc immutable, as in read-only and always creating a new object upon manipulation, there are ways to do that too.

You have to be aware that this typically involves a "hard copy" of the data contents, so it comes with execution overhead. C gives you the option not to be that inefficient if you don't want to. But if you want it, then the more flexible option is dynamic allocation. Something like this:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

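/* Returns a newly allocated copy of myarray with every element incremented,
   or NULL if the allocation fails. The caller must free the result. */
int* increment(const int* myarray, int size)
{
    int* new_obj = malloc(sizeof(int[size]));
    if(new_obj == NULL)
    {
        return NULL;
    }
    for(int i=0; i<size; i++)
    {
        new_obj[i] = myarray[i] + 1;
    }
    return new_obj;
}

int main (void)
{
    int* myarray = malloc(sizeof(int[4]));
    if(myarray == NULL)
    {
        return EXIT_FAILURE;
    }
    memcpy(myarray, (int[]){0,1,2,3}, sizeof(int[4]));
    for(int i=0; i<4; i++)
    {
        printf("%d ", myarray[i]);
    }
    puts("");

    int* another_array = increment(myarray, 4);
    free(myarray); /* the original is no longer needed */
    if(another_array == NULL)
    {
        return EXIT_FAILURE;
    }

    for(int i=0; i<4; i++)
    {
        printf("%d ", another_array[i]);
    }
    puts("");

    free(another_array);
    return 0;
}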

Note that this is significantly slower than modifying the original array in place. The heap allocation and the data copy are both relatively slow.

You could create "bad API" functions in C though, such as

int* increment(int *myarray, int size) {
    for(int i = 0; i < size; i++) {
        myarray[i] += 1;
    }
    return myarray;
}

This returns a pointer to the same array that was passed along. It's bad API because it's confusing, though some C standard functions were designed just like this (strcpy etc). And in order to use this function you need a pointer to the first element of the array, rather than the array itself.
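A short sketch of why this API is confusing, reusing the in-place increment from above (demo_alias is a made-up name). The returned pointer looks like a new result, but it aliases the caller's array, which has already been changed.

```c
#include <assert.h>

/* In-place version from the text: modifies the caller's array. */
int *increment(int *myarray, int size)
{
    for (int i = 0; i < size; i++) {
        myarray[i] += 1;
    }
    return myarray;
}

void demo_alias(void)
{
    int data[3] = {1, 2, 3};
    int *result = increment(data, 3);   /* data decays to &data[0] */

    assert(result == data);             /* same storage, not a copy */
    assert(data[0] == 2 && data[1] == 3 && data[2] == 4);
}
```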

Why an access to a volatile glvalue is considered a side effect by [intro.execution]/12?

The entire purpose of volatile is to indicate to the compiler that "you don't really know exactly what the result of accessing this variable is, so don't mess about with it".

Say for example we have:

int x = 7;
...
int func1()
{
    return x;
}
...
int func2()
{
    return func1() + func1();
}

The compiler could (some would argue should) convert this to return 2 * func1();, which is trivially calculated by a single add.

However, if x is a hardware register that changes with each read (e.g. a counter register, so that return x; actually behaves like return x++;), then func1() + func1() cannot, and should not, be optimised to 2 * func1();. Declaring x as volatile int x; prevents the compiler from doing so. [Unfortunately, there is no way to cause this behaviour in plain C++ code; some real hardware is required.]

Hardware registers are the normal use-case for volatile (typically in conjunction with pointers, but it doesn't have to be). Reading such a register can have actual side-effects on the hardware: for example, reading a FIFO register on a serial port [or network card, hard disk, or whatever] affects the state of the hardware, since the FIFO has now "moved on" one step. Skipping, duplicating, or caching the read, or applying some other such optimisation, would definitely cause a piece of driver code and hardware to behave differently from what the programmer wanted, which is what would happen if a volatile access weren't considered a side-effect.
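Real hardware cannot be reproduced in portable code, so the following sketch only simulates a counter register with an ordinary function, read_counter, whose every call advances the counter. All names here are assumptions made for illustration. It shows why folding two reads into one changes the result when a read has an effect.

```c
#include <assert.h>

static int counter_reg = 7;     /* stands in for a hardware counter */

/* Simulated register read: returns the value, then the counter moves on. */
static int read_counter(void)
{
    return counter_reg++;
}

int sum_of_two_reads(void)
{
    return read_counter() + read_counter();     /* two reads */
}

int twice_one_read(void)
{
    return 2 * read_counter();                  /* one read, doubled */
}

void demo_counter(void)
{
    counter_reg = 7;
    assert(sum_of_two_reads() == 15);   /* 7 + 8 */
    counter_reg = 7;
    assert(twice_one_read() == 14);     /* 2 * 7 */
}
```

The two functions disagree, which is exactly the transformation that treating a volatile access as a side effect forbids for a real register.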

Read volatile variable has persistent effect? Misra C

This has nothing to do with MISRA-C as such, but the C language itself. The definition is found in the C standard (C11 5.1.2.3):

Accessing a volatile object, modifying an object, modifying a file, or calling a function that does any of those operations are all side effects, which are changes in the state of the execution environment.

It is somewhat common to have hardware peripheral flag registers that change their value upon read. For example: reading a status register followed by reading a data register is a common way to clear flags in UART or SPI hardware.

For this reason, MISRA-C wants you to isolate volatile accesses into simple expressions, so that you don't get unexpected results such as in x && my_volatile, where my_volatile may or may not be evaluated and updated. The correct approach is to isolate the volatile access to a line of its own:

if(x)
{
    int tmp = my_volatile;
    // do stuff with tmp
}
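The effect of the short-circuit can be sketched with a simulated read-to-clear flag. The function read_status below stands in for the volatile read and is purely hypothetical, as are the other names. With x false, the combined expression never performs the read, so the flag is never cleared; the isolated form makes that conditional read explicit in the code.

```c
#include <assert.h>
#include <stdbool.h>

static int status_flag = 1;     /* simulated flag that clears when read */
static int read_count  = 0;     /* number of simulated reads so far */

/* Simulated register read: returns the flag and clears it. */
static int read_status(void)
{
    int value = status_flag;
    status_flag = 0;
    read_count++;
    return value;
}

/* The read is hidden in the right operand of && and may be skipped. */
bool combined(bool x)
{
    return x && read_status();
}

/* The read is on a line of its own, visibly guarded by the condition. */
int isolated(bool x)
{
    if (x)
    {
        int tmp = read_status();
        return tmp;
    }
    return 0;
}

void demo_short_circuit(void)
{
    assert(combined(false) == false);
    assert(read_count == 0 && status_flag == 1);    /* read was skipped */
    assert(isolated(true) == 1);
    assert(read_count == 1 && status_flag == 0);    /* read happened once */
}
```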

These rules also prevent you from writing dysfunctional code such as:

REGISTER |= FLAG1;
REGISTER |= FLAG2;
...

resulting in multiple writes to the register which causes execution overhead and possibly unwanted side effects. The above bad code scenario is sadly far more common in embedded firmware than you'd expect. Again, it is easiest to store the results in a temporary variable and isolate the write to a single line:

uint32_t reg = FLAG1 | FLAG2 | ...;
REGISTER = reg;
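If the other bits in the register must be preserved, as the |= version does, one possible variant reads the register once into the temporary, sets the flags there, and writes back once. This is a sketch under assumptions: REGISTER is simulated here with an ordinary volatile variable, and the flag values and initial contents are invented.

```c
#include <assert.h>
#include <stdint.h>

#define FLAG1 0x01u     /* hypothetical bit positions */
#define FLAG2 0x02u

/* Simulated register with an unrelated bit already set. */
static volatile uint32_t REGISTER = 0x80u;

void set_flags(void)
{
    uint32_t reg = REGISTER;    /* single read of the register */
    reg |= FLAG1;               /* work on the temporary only */
    reg |= FLAG2;
    REGISTER = reg;             /* single write to the register */
}

void demo_set_flags(void)
{
    set_flags();
    assert(REGISTER == 0x83u);  /* existing bit kept, both flags set */
}
```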

