Foreach Loop and Reference of &$Value

foreach loop and reference of &$value

At the end of the first loop, $value is pointing to the same place as $variable[3] (they are pointing to the same location in memory):

$variable = [1,2,3,4];
foreach ($variable as $key => &$value)
    $value++;

Even after this loop has finished, $value is still a reference that points to the same location in memory as $variable[3], so each time you store a value in $value, this also overwrites the value stored for $variable[3]:

foreach ($variable as $key => $value);
var_dump($variable);

On each iteration of this foreach, both $value and $variable[3] become equal to the current item of $variable.

So in the 3rd iteration of the second loop, $value and $variable[3] become equal to 4 by reference, then during the 4th and final iteration of the second loop, nothing changes because you're passing the value of $variable[3] (which is still &$value) to $value (which is still &$value).

It's very confusing, but it's not even slightly idiosyncratic; it's the code executing exactly as it should.
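To make the sequence concrete, here is a minimal sketch that records the array after each iteration of the second loop. The $trace variable is added purely for illustration and is not part of the original snippet.

```php
<?php
$variable = [1, 2, 3, 4];

// First loop: increment every element through the reference.
foreach ($variable as $key => &$value) {
    $value++;
}
// $variable is now [2, 3, 4, 5] and $value still aliases $variable[3].

// Second loop: each by-value assignment to $value also writes to $variable[3].
$trace = []; // illustrative helper: array state after each iteration
foreach ($variable as $key => $value) {
    $trace[] = implode(',', $variable);
}

echo implode(PHP_EOL, $trace), PHP_EOL;
// 2,3,4,2
// 2,3,4,3
// 2,3,4,4
// 2,3,4,4
```

The last two lines are identical because the fourth iteration copies $variable[3] onto itself.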

More info here: PHP: Passing by Reference


To prevent this behavior, it is sufficient to add an unset($value); statement after each loop where a reference is used. An alternative to unset is to enclose the foreach loop in a self-calling closure, which forces $value to be local; the closure must take the array by reference, otherwise it only modifies a copy. This takes more characters than simply unsetting the variable:

(function(&$variable){
    foreach ($variable as $key => &$value) $value++;
})($variable);
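As a rough sketch of both options, the snippet below first uses unset() and then a named function that receives the array by reference, so the changes survive the call while the loop variable stays local. The name incrementAll is hypothetical, chosen only for this example.

```php
<?php
// Option 1: break the reference right after the loop.
$variable = [1, 2, 3, 4];
foreach ($variable as $key => &$value) {
    $value++;
}
unset($value); // $value no longer aliases $variable[3]

foreach ($variable as $key => $value) {
    // reusing $value here is now harmless
}
echo json_encode($variable), PHP_EOL; // [2,3,4,5]

// Option 2: keep the reference local to a function (hypothetical helper).
function incrementAll(array &$items): void
{
    foreach ($items as &$item) {
        $item++;
    }
    // $item goes out of scope when the function returns
}

$other = [1, 2, 3, 4];
incrementAll($other);
foreach ($other as $item) {
    // no dangling reference exists in this scope
}
echo json_encode($other), PHP_EOL; // [2,3,4,5]
```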

PHP Pass by reference in foreach

Because on the second loop, $v is still a reference to the last array item, so it's overwritten each time.

You can see it like that:

$a = array ('zero','one','two', 'three');

foreach ($a as &$v) {

}

foreach ($a as $v) {
echo $v.'-'.$a[3].PHP_EOL;
}

As you can see, the last array item takes the current loop value: 'zero', 'one', 'two', and then it's just 'two'... : )
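For reference, here is the same snippet with the expected output annotated. This is only a sketch: the $lines array is a hypothetical addition that collects what the original code echoes, so the result can be inspected afterwards.

```php
<?php
$a = array('zero', 'one', 'two', 'three');

foreach ($a as &$v) {
    // nothing to do: the loop only leaves $v bound to $a[3]
}

$lines = []; // illustrative: collects what the original snippet echoes
foreach ($a as $v) {
    $lines[] = $v . '-' . $a[3];
}
echo implode(PHP_EOL, $lines), PHP_EOL;
// zero-zero
// one-one
// two-two
// two-two
```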

What is better in a foreach loop... using the & symbol or reassigning based on key?

Since the highest scoring answer states that the second method is better in every way, I feel compelled to post an answer here. True, looping by reference is more performant, but it isn't without risks/pitfalls.

Bottom line, as always: "Which is better X or Y", the only real answers you can get are:

  • It depends on what you're after/what you're doing
  • Oh, both are OK, if you know what you're doing
  • X is good for Such, Y is better for So
  • Don't forget about Z, and even then ...("which is better X, Y or Z" is the same question, so the same answers apply: it depends, both are ok if...)

Be that as it may, as Orangepill showed, the reference approach offers better performance. In this case, the tradeoff is one of performance versus code that is less error-prone and easier to read/maintain. In general, it's considered better to go for safer, more reliable, and more maintainable code:

'Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it.' — Brian Kernighan

I guess that means the first method has to be considered best practice. But that doesn't mean the second approach should be avoided at all times, so what follows here are the downsides, pitfalls and quirks that you'll have to take into account when using a reference in a foreach loop:

Scope:

For a start, PHP isn't truly block-scoped like C(++), C#, Java, Perl or (with a bit of luck) ECMAScript6... That means that the $value variable will not be unset once the loop has finished. When looping by reference, this means a reference to the last value of whatever object/array you were iterating is floating around. The phrase "an accident waiting to happen" should spring to mind.

Consider what happens to $value, and subsequently $array, in the following code:

$array = range(1,10);
foreach($array as &$value)
{
$value++;
}
echo json_encode($array);
$value++;
echo json_encode($array);
$value = 'Some random value';
echo json_encode($array);

The output of this snippet will be:

[2,3,4,5,6,7,8,9,10,11]
[2,3,4,5,6,7,8,9,10,12]
[2,3,4,5,6,7,8,9,10,"Some random value"]

In other words, by reusing the $value variable (which references the last element in the array), you're actually manipulating the array itself. This makes for error-prone code, and difficult debugging. As opposed to:

$array = range(1,10);
$array[] = 'foobar';
foreach($array as $k => $v)
{
    $array[$k]++;//integers go up by one; 'foobar' becomes 'foobas'
    if ($array[$k] === ($v +1))//true for the integers; for 'foobar', $v + 1 is 1 on PHP 7 (PHP 8 throws a TypeError)
    {//'foobas' === 1 => false
        $array[$k] = $v;//restore the initial value
    }
}

Maintainability/idiot-proofness:

Of course, you might say that the dangling reference is an easy fix, and you'd be right:

foreach($array as &$value)
{
$value++;
}
unset($value);

But after you've written your first 100 loops with references, do you honestly believe you won't have forgotten to unset a single reference? Of course not! It's so uncommon to unset variables that have been used in a loop (we assume the GC will take care of it for us), so most of the time, you don't bother. When references are involved, this is a source of frustration, mysterious bug-reports, or traveling values, where you're using complex nested loops, possibly with multiple references... The horror, the horror.

Besides, as time passes, who's to say that the next person working on your code won't forget about unset? Who knows, they might not even know about references, or see your numerous unset calls and deem them redundant, a sign of your being paranoid, and delete them altogether. Comments alone won't help you: they need to be read, and everyone working with your code should be thoroughly briefed, perhaps by having them read a full article on the subject. The examples listed in the linked article are bad, but I've seen worse, still:

foreach($nestedArr as &$array)
{
if (count($array)%2 === 0)
{
foreach($array as &$value)
{//pointless, but you get the idea...
$value = array($value, 'Part of even-length array');
}
//$value now references the last index of $array
}
else
{
$value = array_pop($array);//assigns new value to var that might be a reference!
$value = is_numeric($value) ? $value/2 : null;
array_push($array, $value);//congrats, X-references ==> traveling value!
}
}

This is a simple example of a traveling value problem. I did not make this up, BTW, I've come across code that boils down to this... honestly. Quite apart from spotting the bug, and understanding the code (which has been made more difficult by the references), it's still quite obvious in this example, mainly because it's a mere 15 lines long, even using the spacious Allman coding style... Now imagine this basic construct being used in code that actually does something even slightly more complex, and meaningful. Good luck debugging that.

Side-effects:

It's often said that functions shouldn't have side-effects, because side-effects are (rightfully) considered to be code-smell. Though foreach is a language construct, and not a function, in your example, the same mindset should apply. When using too many references, you're being too clever for your own good, and might find yourself having to step through a loop, just to know what is being referenced by what variable, and when.

The first method hasn't got this problem: you have the key, so you know where you are in the array. What's more, with the first method, you can perform any number of operations on the value, without changing the original value in the array (no side-effects):

function recursiveFunc($n, $max = 10)
{
if (--$max)
{
return $n === 1 ? 10-$max : recursiveFunc($n%2 ? ($n*3)+1 : $n/2, $max);
}
return null;
}
$array = range(10,20);
foreach($array as $k => $v)
{
$v = recursiveFunc($v);//reassigning $v here
if ($v !== null)
{
$array[$k] = $v;//only now, will the actual array change
}
}
echo json_encode($array);

This generates the output:

[7,11,12,13,14,15,5,17,18,19,8]

As you can see, the first, seventh and eleventh (last) elements have been altered; the others haven't. If we were to rewrite this code using a loop by reference, the loop looks a lot smaller, but the output will be different (we have a side-effect):

$array = range(10,20);
foreach($array as &$v)
{
    $v = recursiveFunc($v);//Changes the original array...
    //granted, if your version permits it, you'd probably do this instead:
    //$v = recursiveFunc($v) ?: $v;
}
echo json_encode($array);
//[7,null,null,null,null,null,5,null,null,null,8]

To counter this, we'll either have to create a temporary variable, or call the function twice, or add a key, and recalculate the initial value of $v, but that's just plain stupid (that's adding complexity to fix what shouldn't be broken):

foreach($array as &$v)
{
$temp = recursiveFunc($v);//creating copy here, anyway
$v = $temp ? $temp : $v;//assignment doesn't require the lookup, though
}
//or:
foreach($array as &$v)
{
$v = recursiveFunc($v) ? recursiveFunc($v) : $v;//2 calls === twice the overhead!
}
//or
$base = reset($array);//get the base value
foreach($array as $k => &$v)
{//silly: combine both methods to fix what needn't be a problem to begin with
    $v = recursiveFunc($v);
    if ($v === null)//recursiveFunc returns null when it gives up
    {
        $v = $base + $k;
    }
}

Anyway, adding branches, temp variables and what have you, rather defeats the point. For one, it introduces extra overhead which will eat away at the performance benefits references gave you in the first place.

If you have to add logic to a loop, to fix something that shouldn't need fixing, you should step back, and think about what tools you're using. 9/10 times, you chose the wrong tool for the job.

The last thing that, to me at least, is a compelling argument for the first method is simple: readability. The reference-operator (&) is easily overlooked if you're doing some quick fixes, or try to add functionality. You could be creating bugs in the code that was working just fine. What's more: because it was working fine, you might not test the existing functionality as thoroughly because there were no known issues.

Discovering a bug that went into production, because of your overlooking an operator might sound silly, but you wouldn't be the first to have encountered this.

Note:

Passing by reference at call-time was removed in PHP 5.4. Be wary of features/functionality that are subject to change. A standard iteration of an array hasn't changed in years. I guess it's what you could call "proven technology". It does what it says on the tin, and is the safer way of doing things. So what if it's slower? If speed is an issue, you can optimize your code, and introduce references to your loops then.

When writing new code, go for the easy-to-read, most failsafe option. Optimization can (and indeed should) wait until everything's tried and tested.

And as always: premature optimization is the root of all evil. And choose the right tool for the job, not the one that is new and shiny.

PHP - foreach loops - $arr and &$value as same variable

It will work, ONCE, but only because of a quirk in PHP:

php > $x = array(1,2,3);
php > foreach($x as $x) { echo $x; }
123
php > var_dump($x);
int(3)

Note that the loop actually ran for all 3 values of the original $x array, but after the loop exits, $x is now a mere int; it's no longer an array.

This holds true whether the loop variable in "as $x" is a plain $x variable or a &$x reference.

What is & in the php foreach statement?

When you first learn about passing by reference, it isn't obvious what it means....

Here's an example that I hope will help you get a clearer understanding of the difference between passing by value and passing by reference...

<?php
$money = array(1, 2, 3, 4); //Example array
moneyMaker($money); //Execute function MoneyMaker and pass $money-array as REFERENCE
//Array $money is now 2,3,4,5 (BECAUSE $money is passed by reference).

eatMyMoney($money); //Execute function eatMyMoney and pass $money-array as a VALUE
//Array $money is NOT AFFECTED (BECAUSE $money is just SENT to the function eatMyMoney and nothing is returned).
//So array $money is still 2,3,4,5

echo print_r($money,true); //Array ( [0] => 2 [1] => 3 [2] => 4 [3] => 5 )

//$item passed by VALUE
foreach($money as $item) {
$item = 4; //would just set the value 4 to the VARIABLE $item
}
echo print_r($money,true); //Array ( [0] => 2 [1] => 3 [2] => 4 [3] => 5 )

//$item passed by REFERENCE
foreach($money as &$item) {
$item = 4; //Would give $item (current element in array)value 4 (because item is passed by reference in the foreach-loop)
}

echo print_r($money,true); //Array ( [0] => 4 [1] => 4 [2] => 4 [3] => 4 )

function moneyMaker(&$money) {
//$money-array is passed to this function as a reference.
//Any changes to $money-array is affected even outside of this function
foreach ($money as $key=>$item) {
$money[$key]++; //Add each array element in money array with 1
}
}

function eatMyMoney($money) { //NOT passed by reference. ONLY the values of the array is SENT to this function
foreach ($money as $key=>$item) {
$money[$key]--; //Delete 1 from each element in array $money
}
//The $money-array INSIDE of this function now holds 1,2,3,4
//Function isn't returning ANYTHING
}
?>

Retrieving Previous Values in For Each Loop Reference &$value

You can do something like:

$arr = array(
array
(
"id" => 1,
"SKU" => 'SKU_1',
"ProductIDRef" => '45645-12'
),
array
(
"id" => 2,
"SKU" => 'SKU_2',
"ProductIDRef" => '43445-45'
),
array
(
"id" => 3,
"SKU" => 'SKU_2',
"ProductIDRef" => null
)
);

foreach( $arr as $key => $value ) {
//Check if current item's ProductIDRef is empty or null
//Check if prev entry exist AND SKU is the same
if ( ( $value[ "ProductIDRef" ] == "" || $value[ "ProductIDRef" ] == null ) && ( isset( $arr[ $key - 1 ] ) && $arr[ $key - 1 ][ "SKU" ] == $value[ "SKU" ] ) ) {
$arr[ $key ][ "ProductIDRef" ] = $arr[ $key - 1 ][ "ProductIDRef" ];
}
}

echo "<pre>";
print_r( $arr );
echo "</pre>";

This will result in:

Array
(
[0] => Array
(
[id] => 1
[SKU] => SKU_1
[ProductIDRef] => 45645-12
)

[1] => Array
(
[id] => 2
[SKU] => SKU_2
[ProductIDRef] => 43445-45
)

[2] => Array
(
[id] => 3
[SKU] => SKU_2
[ProductIDRef] => 43445-45
)

)

When is foreach with a parameter by reference dangerous?

About foreach

First of all, some (maybe obvious) clarifications about two behaviors of PHP:

  1. foreach($array as $item) will leave the variable $item untouched after the loop. If the variable is a reference, as in foreach($array as &$item), it will "point" to the last element of the array even after the loop.

  2. When a variable is a reference, then an assignment, e.g. $item = 'foo';, will change whatever the reference is pointing to, not the variable ($item) itself. This is also true for a subsequent foreach($array2 as $item), which will treat $item as a reference if it has been created as such and therefore will modify whatever the reference is pointing to (the last element of the array used in the previous foreach in this case).

Obviously this is very error prone and that is why you should always unset the reference used in a foreach to ensure following writes do not modify the last element (as in example #10 of the doc for the type array).
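The two behaviors above can be seen together in a short sketch. The arrays $first and $second are made-up data for illustration only.

```php
<?php
$first  = [1, 2, 3];
$second = [10, 20, 30];

foreach ($first as &$item) {
    // no changes; $item ends up referencing $first[2]
}

foreach ($second as $item) {
    // each assignment to $item is written into $first[2]
}
echo json_encode($first), PHP_EOL; // [1,2,30]

unset($item);   // break the reference
$item = 'foo';  // now an ordinary variable
echo json_encode($first), PHP_EOL; // [1,2,30]
```

Note that the second loop never mentions $first, yet it changes its last element; after unset($item) further writes leave the array alone.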

About the function that modifies the array

It's worth noting that - as pointed out in a comment by @iainn - the behavior in your example has nothing to do with foreach. The mere existence of a reference to an element of the array will allow this element to be modified. Example:

function should_not_modify($array){
$array[0] = 'modified';
$array[1] = 'modified2';
}
$array = ['test', 'test2'];
$item = & $array[0];

should_not_modify($array);
var_dump($array);

Will output:

array(2) {
[0] =>
string(8) "modified"
[1] =>
string(5) "test2"
}

This is admittedly very surprising, but it is explained in the PHP documentation "What References Do"

Note, however, that references inside arrays are potentially dangerous. Doing a normal (not by reference) assignment with a reference on the right side does not turn the left side into a reference, but references inside arrays are preserved in these normal assignments. This also applies to function calls where the array is passed by value. [...] In other words, the reference behavior of arrays is defined in an element-by-element basis; the reference behavior of individual elements is dissociated from the reference status of the array container.

With the following example (copy/pasted):

/* Assignment of array variables */
$arr = array(1);
$a =& $arr[0]; //$a and $arr[0] are in the same reference set
$arr2 = $arr; //not an assignment-by-reference!
$arr2[0]++;
/* $a == 2, $arr == array(2) */
/* The contents of $arr are changed even though it's not a reference! */

It's important to understand that when creating a reference, for example $a = &$b, both $a and $b are equal. $a is not pointing to $b or vice versa. $a and $b are pointing to the same place.

So when you do $item = & $array[0]; you actually make $array[0] point to the same place as $item. Since $item is a global variable, and references inside arrays are preserved, modifying $array[0] from anywhere (even from within the function) modifies it globally.
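A small sketch of the consequence, under the assumption that you are running PHP 7 or later: while the extra name for the element exists, the by-value call still modifies the caller's array, and once that name is unset before the call, the array behaves like a plain value again. The function tryToModify is a hypothetical stand-in for should_not_modify() from the example.

```php
<?php
// Hypothetical helper mirroring should_not_modify() from the example.
function tryToModify(array $array): void
{
    $array[0] = 'modified';
    $array[1] = 'modified2';
}

// Case 1: a second name for element 0 is still alive during the call.
$kept  = ['test', 'test2'];
$alias = &$kept[0];
tryToModify($kept);
echo json_encode($kept), PHP_EOL; // ["modified","test2"]

// Case 2: the extra name is removed before the call.
$array = ['test', 'test2'];
$item  = &$array[0];
unset($item); // element 0 is no longer shared with another variable
tryToModify($array);
echo json_encode($array), PHP_EOL; // ["test","test2"]
```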

Conclusion

"Are there other cases that are dangerous, and can we build an exhaustive list of what is dangerous? Or the other way round: is it possible to describe when it is not dangerous?"

I'm going to repeat the quote from the PHP doc again: "references inside arrays are potentially dangerous".

So no, it's not possible to describe when it is not dangerous, because it is never not dangerous. It's too easy to forget that $item has been created as a reference (or that a global reference has been created and not destroyed), reuse it elsewhere in your code, and corrupt the array. This has long been a topic of debate (in this bug for example), and people call it either a bug or a feature...

Why do I need unset $value after foreach loop

There's no need to use unset in the current context that you are using it. unset will simply destroy the variable and its content.

In the example you are giving, this is looping through an array creating $value, then you are unsetting that variable. Which means it no longer exists in that code. So that does absolutely nothing.

To visualize what I am talking about, look at this example:

$value = 'Hello World';
echo $value;
unset($value);
echo $value;

The output will be:

Hello World<br /><b>NOTICE</b> Undefined variable: value on line number 6<br />

So you will first see Hello World, but trying to use the variable after unsetting it just raises an undefined-variable notice (a warning as of PHP 8).

To answer your question, you really don't have to unset $value here; there's no need for it, as the foreach loop is just setting $value to each array element + 10.

Unsetting it will cause the work to be removed, and forgotten.
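For a by-value loop this can be checked directly. In the sketch below (the $numbers array is made up for illustration), writing to $value after a by-value loop leaves the array alone, which is why unset() changes nothing in that case; the second half shows that the picture differs once the loop variable is declared as &$value.

```php
<?php
$numbers = [1, 2, 3]; // illustrative data, not from the question

foreach ($numbers as $value) {
    // by-value: $value is a copy of each element
}
$value = 99;                          // only changes the leftover copy
echo json_encode($numbers), PHP_EOL;  // [1,2,3]

foreach ($numbers as &$value) {
    // by-reference: $value aliases each element in turn
}
$value = 99;                          // writes into $numbers[2]
echo json_encode($numbers), PHP_EOL;  // [1,2,99]
unset($value);                        // needed only in the by-reference case
```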


