Is There Garbage Collection in PHP

Is there garbage collection in PHP?

Yes, there is; here's a nice article describing its pitfalls. Since PHP 5.3.0 there is also the gc_enable function.
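As a minimal sketch of the functions mentioned above, the following toggles the cycle collector and checks its state. The variable names are only illustrative.

```php
<?php
// Sketch: inspect and toggle the cycle collector (functions available since PHP 5.3).
$wasEnabled = gc_enabled();    // reflects the zend.enable_gc setting

gc_disable();                  // stop automatic cycle collection
$afterDisable = gc_enabled();  // false

gc_enable();                   // turn it back on
$afterEnable = gc_enabled();   // true

var_dump($wasEnabled, $afterDisable, $afterEnable);
```

Disabling the collector only affects cycle collection; plain reference counting keeps working either way.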

Php garbage collection

Yes, it does. Local variables are disposed of at the end of the function call. If one of these local variables held an object and nothing else references it, the object's reference count drops to zero and PHP frees it.
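One way to see this is to log from a destructor. The sketch below assumes a hypothetical Tracked class and work() function that exist only for this illustration.

```php
<?php
// Hypothetical class used only to record when an instance is destroyed.
class Tracked {
    public static $log = [];
    private $name;

    public function __construct($name) {
        $this->name = $name;
    }

    public function __destruct() {
        self::$log[] = $this->name . ' destroyed';
    }
}

function work() {
    $local = new Tracked('local');
    Tracked::$log[] = 'inside work()';
} // $local goes out of scope here, so the object loses its only reference

work();
Tracked::$log[] = 'after work()';

print_r(Tracked::$log);
```

The destructor entry lands between the other two, which suggests the object is released when the function returns rather than at some later point.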

When does PHP run garbage collection in long running scripts?

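It comes down to this: in a case like the one below, you have to trigger garbage collection manually by calling gc_collect_cycles(). PHP only starts a cycle collection on its own when its buffer of possible cycle roots fills up (10,000 entries in PHP 5.3), and a script that creates a few very large objects can hit the memory limit long before that.

I have written a bunch of code to try to track this down and narrowed it down to two scripts: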

This one doesn't crash:

for ($i = 0; $i < 100; $i++) {
    useMemory();
    gc_collect_cycles();
}

And this one crashes:

for ($i = 0; $i < 100; $i++) {
    useMemory();
}

Here is a link to compare these scripts on Blackfire

As you can see, when you don't call gc_collect_cycles(), collection never happens: the script reaches the memory limit and PHP terminates it with a fatal error.

PHP doesn't even use this opportunity to GC itself. The reasoning behind this is discussed on the PHP-DEV mailing list, but basically comes down to complications of how to run __destruct methods, that require memory, when the memory limit has been reached. (Also on the bug tracker #60982).

Memory usage func:

This is the code I used to 'waste' memory. It purposefully creates cyclic references that can only be cleaned up by the garbage collector. Note that without these cycles the objects would be cleaned up by reference counting as soon as they fall out of scope.

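class Big {
    private $data; // declared but never used below
    public function __construct($d = 0) {
        // Fill the object with 10240 dynamically created properties
        // (dynamic properties are deprecated as of PHP 8.2).
        for ($i = 0; $i < 1024 * 10; $i++) {
            $this->$i = chr(rand(97, 122));
        }
    }
}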

function useMemory() {
    $a = new Big();
    $b = new Big();

    $a->b = $b;
    $b->a = $a;
}
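To check the effect without running into the memory limit, a smaller variant can measure memory before and after a manual collection. This is only a sketch: Node and makeCycle() are hypothetical stand-ins for Big and useMemory(), sized so the script finishes quickly.

```php
<?php
// Hypothetical stand-in for Big, with declared properties and a smaller payload.
class Node {
    public $other;
    public $payload;

    public function __construct() {
        $this->payload = str_repeat('x', 100000); // roughly 100 KB per object
    }
}

function makeCycle() {
    $a = new Node();
    $b = new Node();
    $a->other = $b;
    $b->other = $a; // cycle: neither refcount reaches zero when the function returns
}

$before = memory_get_usage();
for ($i = 0; $i < 10; $i++) {
    makeCycle();
}
$leaked = memory_get_usage();      // the cycles are still held in memory

$collected = gc_collect_cycles();  // manual collection, returns the number of collected cycles
$after = memory_get_usage();

printf("before: %d, with cycles: %d, after collection: %d, collected: %d\n",
    $before, $leaked, $after, $collected);
```

The exact numbers depend on the PHP version and build, so treat them as indicative only.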

php garbage collection while script running

The key is that you unset your global variables as soon as you don't need them.

You needn't call unset explicitly for local variables and object properties because these are destroyed when the function returns or the object is destroyed.

PHP keeps a reference count for all variables and destroys them (in most conditions) as soon as this reference count goes to zero. Objects have one internal reference count and the variables themselves (the object references) each have one reference count. When all the object references have been destroyed because their reference counts have hit 0, the object itself will be destroyed. Example:

$a = new stdclass; //$a zval refcount 1, object refcount 1
$b = $a; //$a/$b zval refcount 2, object refcount 1
//this forces the zval separation because $b isn't part of the reference set:
$c = &$a; //$a/$c zval refcount 2 (isref), $b 1, object refcount 2
unset($c); //$a zval refcount 1, $b 1, object refcount 2
unset($a); //$b refcount 1, object refcount 1
unset($b); //everything is destroyed

But consider the following scenario:

class A {
    public $b;
}
class B {
    public $a;
}

$a = new A;
$b = new B;
$a->b = $b;
$b->a = $a;
unset($a); //cannot destroy object $a because $b still references it
unset($b); //cannot destroy object $b because $a still references it

These cyclic references are where PHP 5.3's garbage collector kicks in. You can explicitly invoke the garbage collector with gc_collect_cycles.
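The collector's bookkeeping can be inspected with gc_status(), which is available from PHP 7.3 onward (an assumption about your PHP version). The sketch below uses plain stdClass objects instead of the A and B classes, purely for brevity.

```php
<?php
// Sketch (PHP 7.3+): build a two-object cycle, drop both variables,
// then look at the collector's state before and after a manual run.
$a = new stdClass;
$b = new stdClass;
$a->b = $b;
$b->a = $a;
unset($a, $b); // the objects now only reference each other

$before = gc_status();        // array with keys such as runs, collected, threshold, roots
$freed  = gc_collect_cycles();
$after  = gc_status();

printf("possible roots before: %d\n", $before['roots']);
printf("freed by this run: %d\n", $freed);
printf("total collected so far: %d\n", $after['collected']);
```

The threshold entry shows how many possible roots must accumulate before PHP starts a collection by itself.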

See also Reference Counting Basics and Collecting Cycles in the manual.

PHP garbage collection: why is this object still referenced

TL;DR

The short answer is that references are merely a way for two variables to share the same value and unset() only deletes a variable, not a value. The key thing to remember here is that variables have values, while references link values, not variables.

The Long and Long of It...

First, understanding how PHP removes objects from memory...

Objects are only removed from memory when the last reference to that object is deleted. When I say reference I don't mean the same thing as references in PHP, like the one you're describing here in the pass-by-reference example. Instead, any variable that is assigned the object, is considered something that references this object. As such PHP will not remove the object from memory as long as this is true.

Because you create the object outside of the function, then call the function, there are now two places that reference the same object. One is the global variable that instantiated the object. The second is the local variable, in your function, that's using the object. Pass-by-reference or assign-by-reference has no bearing on this behavior, whatsoever, because it's a totally different thing.

When you create an object in PHP and assign it to a variable, the variable does not store the object itself. Instead, it stores a unique handle that points to the object in memory. The object is stored in an object store that only PHP has direct control over. This is by design, because PHP manages memory for you. It does not expect you to understand, or have to care about, how memory is allocated or freed. It tries to manage memory for you as efficiently as possible through these abstractions.

So in the following code the object Foo is not really deleted until after we get to the last line of this example.

class Foo { }

$foo = new Foo; // Object is initialized and stored in memory

$fooCopy = $foo; // The same object handle is copied to $fooCopy

bar($foo);

function bar($foo) {
    unset($foo); // object is still in memory
}

baz($foo);

function baz(&$foo) {
    unset($foo); // object is still in memory
}

$foo->quix = 1; // object is still in memory

unset($foo); // object is still in memory because $fooCopy is still a reference

$fooCopy->quix++;

var_dump($fooCopy->quix); // int(2)

unset($fooCopy); // object is now deleted because last reference is gone

As you can see from this example there is a very good reason why PHP won't delete the object in these functions or even when we do unset($foo), because otherwise, by the time we get to the last few lines of this script, this code would not work as we expect it. The object would be freed prematurely. PHP just assumes that since you still have at least one variable pointing to the object, that you might still need to use it somewhere down the line. So it does not free it until it reaches a point where nothing points to that object (i.e. nothing can use it).

Ref Counted GC

This is called ref-counted GC. In principle, each time another variable points to the same object, the ref-count is incremented, and each time such a variable is deleted, the ref-count is decremented. Once the ref-count reaches 0, and only then, PHP destroys the object and releases its memory.

So in the example above, the variable $foo creates a ref-count of 1 to the Foo object that's stored in memory. The variable $fooCopy increments that ref-count to 2. At the point we call the function bar() the ref-count is at 3. By the time we unset() or return from bar() the ref-count goes back down to 2. Same thing with baz(), up to 3 and then down to 2 again. At the point we unset($foo) the ref-count drops to 1, so PHP will not delete the object. Finally, we reach unset($fooCopy) and the ref-count is now 0. The same thing would happen if the script just ended: PHP would implicitly clean up all memory.
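If you are on PHP 7.4 or later (an assumption, since WeakReference does not exist in earlier versions), a weak reference lets you observe the moment the object goes away without keeping it alive yourself. A rough sketch:

```php
<?php
// Sketch (PHP 7.4+): a WeakReference does not add to the object's ref-count,
// so get() returns null once the last ordinary variable is gone.
$foo = new stdClass;
$weak = WeakReference::create($foo);

$fooCopy = $foo;                                // second variable holding the same handle

unset($foo);
$aliveAfterFirstUnset = $weak->get() !== null;  // true: $fooCopy still holds the object

unset($fooCopy);
$aliveAfterLastUnset = $weak->get() !== null;   // false: the last reference is gone

var_dump($aliveAfterFirstUnset, $aliveAfterLastUnset);
```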

Why unset() on references doesn't work with Objects

To answer your specific question about why passing by reference and calling unset() on the object doesn't remove the last reference to it, we have to explain a bit more about how references work in PHP, in contrast to objects.

$obj = new stdClass;

foo($obj);

function foo(&$obj) {
    unset($obj);
}

var_dump($obj); // this is still here

A reference, in PHP, is a way of having two variables share the same value. But an object is not stored inside a variable in PHP. Instead, all that is stored is the handle that points to that object (i.e. another level of indirection). So by using pass-by-reference, all you've managed to accomplish is to have two different variables share the same object handle. By deleting one of these variables you're still left with the other variable pointing to that handle. So the ref-count is still at 1 regardless of which variable you delete.

Now, if you were to re-assign either variable the value null, then the one and only handle pointing to that object is now lost, and as such PHP will remove the object.

$obj = new stdClass;

foo($obj);

function foo(&$obj) {
    $obj = null;
}

var_dump($obj); // this is now null and the object is gone

Try to think of it like this. The thing that's actually assigned to $obj, both in the local variable, and global variable, is just the object's handle, which is the thing that points to the object itself. So all unset($obj) inside this function does is delete the local variable. Deleting one does not delete both.

[Image: PHP object memory]

However, when you assign a value to a variable that is a reference to another variable, you get the same value in both places.

[Image: PHP object memory references]

Remember, unset($obj) inside the function, only deletes the local variable and breaks the reference. It does not delete the object, because the variable outside of the function still continues to reference the handle.

Check if Session is deleted by Garbage Collector

The overall mechanism is not as sophisticated as you probably think.

Sessions can have several storage back-ends, the default of which is the builtin file handler, that merely creates, well, files:

[Screenshot: session files in Windows Explorer]

The only way to link a given file with a given session is the session ID which, as you can see, is part of the file name.

Garbage collection is a file removal based on last modification time. Once it happens, files are gone forever. There's just no trace or record that the file ever existed.

In general, you don't need to worry about this case. Just make sure you define a lifetime that's long enough for your application. The default value in many systems often ranges from 20 to 30 minutes, which is fairly small. Also, make sure your app has its own session directory, so other apps with a shorter lifetime won't remove your files:

session_save_path('/home/foo/app/sessions');
ini_set('session.gc_maxlifetime', 86400); // 1 day (in seconds)

P.S. Some Linux systems disable PHP garbage collection and replace it with a custom cron script, which prevents custom locations from being cleaned up. For that reason I normally set these other directives just in case:

// Restore the default values
ini_set('session.gc_probability', 1);
ini_set('session.gc_divisor', 100);
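The two directives work as a ratio: on each session start, session garbage collection runs with a chance of gc_probability divided by gc_divisor. The helper below, sessionGcChance(), is a hypothetical name used only to illustrate that arithmetic with the current settings.

```php
<?php
// Hypothetical helper: the chance that a given session start triggers
// session garbage collection is gc_probability / gc_divisor.
function sessionGcChance(int $probability, int $divisor): float {
    return $divisor > 0 ? $probability / $divisor : 0.0;
}

// Read the current settings (ini_get returns false if the session extension is missing).
$probability = (int) ini_get('session.gc_probability');
$divisor     = (int) ini_get('session.gc_divisor');
$maxLifetime = (int) ini_get('session.gc_maxlifetime');

printf("Chance of a GC run per session start: %.2f%%\n",
    sessionGcChance($probability, $divisor) * 100);
printf("Session files older than %d seconds are eligible for removal\n", $maxLifetime);
```

With the values set above (1 and 100), roughly one session start in a hundred would trigger a cleanup.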

memory handling in php vs java

PHP does have a garbage collector, but prior to PHP 5.3 it could not handle circular references and would be unable to GC certain constructs, e.g.

$a = &$a;

would cause a memory leak. PHP does not run the GC all the time, as a GC run is expensive and usually not needed because most PHP scripts are short-lived. Since PHP 5.3 the cycle collector kicks in when its buffer of possible cycle roots fills up rather than in response to memory pressure, so a script can still hit the memory limit and get an out-of-memory error while uncollected cycles remain.

How does garbage collection work in PHP? Namely, how do local function variables get cleaned up?

The variable will be unset when the function exits, unless it has external references to it which would keep it "alive". The memory a value occupied is released as soon as its reference count drops to zero; only values caught in reference cycles have to wait for the cycle collector. A collection run is an expensive operation, so PHP only starts one when needed (when its buffer of possible cycle roots fills up, or when you call gc_collect_cycles()).
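A quick way to check what happens to a local variable's memory is to measure usage around a function call with the cycle collector switched off. The function name buildLargeArray() and the array size are arbitrary choices for this sketch.

```php
<?php
gc_disable(); // rule out the cycle collector; only reference counting is at work

// Hypothetical function: allocates a large local array and reports
// the memory in use while that array is still alive.
function buildLargeArray() {
    $data = range(1, 200000);
    $inside = memory_get_usage();
    return array(count($data), $inside);
} // $data is released here

$before = memory_get_usage();
list($count, $inside) = buildLargeArray();
$after = memory_get_usage();

printf("elements: %d\n", $count);
printf("extra memory inside the function: %d bytes\n", $inside - $before);
printf("extra memory after it returned: %d bytes\n", $after - $before);
```

The second figure should be close to zero, which suggests the local array's memory is handed back on return without any collector run.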
