PHP's garbage collection mechanism and its underlying principles

Keywords: PHP C

Garbage collection is something most PHP developers have heard of, but few know it in depth. So how does PHP reclaim memory that is no longer needed?

Internal storage structure of PHP variables

Before looking at garbage collection itself, we need some background. PHP is written in C, so the internal storage of a PHP variable is a C structure, the zval:

struct _zval_struct {
    union {
        long lval;
        double dval;
        struct {
            char *val;
            int len;
        } str;
        HashTable *ht;
        zend_object_value obj;
        zend_ast *ast;
    } value;                    // The variable's value
    zend_uint refcount__gc;     // Reference count; the zval is freed when it drops to 0
    zend_uchar type;            // The variable's type
    zend_uchar is_ref__gc;      // Whether the variable is a reference
};

As the structure shows, each PHP variable consists of four parts: its value, its reference count, its type, and a flag indicating whether it is a reference.

Note: The zval structure above is the one used from PHP 5.3 onward. Before PHP 5.3 there was no cycle-collecting garbage collector (GC), so the fields were named refcount and is_ref, without the __gc suffix. In PHP 7 the zval structure was redesigned for performance; that layout is not covered here, and the xdebug output shown in this article reflects PHP 5.x behavior.

Reference Counting Principle

With the internal storage structure in mind, we can look at how PHP assigns variables and how the early garbage collection mechanism worked.

Variable container

Scalar variables (neither arrays nor objects)

Each time a literal value is assigned to a variable, a variable container (zval) is created.

For example:

$a = "Xu Zheng's Way of Technological Growth"; // the string literal gets its own variable container
xdebug_debug_zval('a');

Result:

a: (refcount=1, is_ref=0)='Xu Zheng's Way of Technological Growth'

Array and object variables

An array or object creates one variable container per element, plus one for the array or object itself.

For example:

$b = [
    'name' => "Xu Zheng's Way of Technological Growth",
    'number' => 3
]; // two elements, so three variable containers in total
xdebug_debug_zval('b');

Result:

b: (refcount=1, is_ref=0)=array ('name' => (refcount=1, is_ref=0)='Xu Zheng's Way of Technological Growth', 'number' => (refcount=1, is_ref=0)=3)
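xdebug_debug_zval() is only available when the Xdebug extension is installed. As a rough alternative, PHP's built-in debug_zval_dump() also prints a refcount. The sketch below is only illustrative: the variable names are made up, and the exact number printed differs between PHP versions (it can include the temporary reference held by the call itself), so no expected output is shown.

```php
<?php
// Inspect an object's reference count without Xdebug.
$obj = new stdClass();
$alias = $obj; // a second variable holding a handle to the same object

// debug_zval_dump() writes to standard output, so capture it in a buffer.
ob_start();
debug_zval_dump($obj);
$dump = ob_get_clean();

echo $dump; // the line contains "refcount(N)"; N depends on the PHP version
```

An object is used here because, in recent PHP versions, literal strings and literal arrays may be reported as interned or immutable rather than with a count.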

Assignment principle (copy-on-write)

Now that we know what happens when a literal is assigned, let's consider assignment between variables from a memory perspective.

For example:

$a = [
    'name' => "Xu Zheng's Way of Technological Growth",
    'number' => 3
]; // Creates a variable container; $a points to it and its refcount is 1
$b = $a; // $b points to the same container as $a; its refcount is now 2
xdebug_debug_zval('a', 'b');
$b['name'] = "Xu Zheng's Way of Technological Growth 1"; // Writing to $b copies the container; $b points to the copy, and each container's refcount is 1
xdebug_debug_zval('a', 'b');

Result:

a: (refcount=2, is_ref=0)=array ('name' => (refcount=1, is_ref=0)='Xu Zheng's Way of Technological Growth', 'number' => (refcount=1, is_ref=0)=3)
b: (refcount=2, is_ref=0)=array ('name' => (refcount=1, is_ref=0)='Xu Zheng's Way of Technological Growth', 'number' => (refcount=1, is_ref=0)=3)
a: (refcount=1, is_ref=0)=array ('name' => (refcount=1, is_ref=0)='Xu Zheng's Way of Technological Growth', 'number' => (refcount=1, is_ref=0)=3)
b: (refcount=1, is_ref=0)=array ('name' => (refcount=1, is_ref=0)='Xu Zheng's Way of Technological Growth 1', 'number' => (refcount=1, is_ref=0)=3)

Therefore, assigning variable a to variable b does not immediately create a new variable container. Instead, b is pointed at the container that a already points to, so the two variables share memory. Only when b is modified is the container actually copied. This technique is called copy-on-write.
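Copy-on-write can also be observed without Xdebug by measuring memory. The following is a minimal sketch, assuming a script where nothing else allocates memory between the measurements; the variable names and the array size are arbitrary choices for illustration.

```php
<?php
// Observe copy-on-write through memory usage.
$a = range(1, 100000);              // a large array, so a copy is clearly visible

$start = memory_get_usage();
$b = $a;                            // plain assignment: $b shares $a's container
$afterAssign = memory_get_usage();
$b[] = 0;                           // first write to $b: the array is copied now
$afterWrite = memory_get_usage();

$assignCost = $afterAssign - $start;      // expected to be close to zero
$writeCost  = $afterWrite - $afterAssign; // expected to be roughly the array's size

printf("assignment: %d bytes, first write: %d bytes\n", $assignCost, $writeCost);
```

The exact byte counts depend on the PHP version, so only their relative size is meaningful.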

When the reference count reaches zero

When the refcount of a variable container drops to zero, the container is destroyed and its memory is reclaimed. This was the entire garbage collection mechanism before PHP 5.3.

For example:

$a = "Xu Zheng's Way of Technological Growth";
$b = $a;       // the shared container's refcount is now 2
xdebug_debug_zval('a');
unset($b);     // removes one reference; the refcount drops back to 1
xdebug_debug_zval('a');

Result:

a: (refcount=2, is_ref=0)='Xu Zheng's Way of Technological Growth'
a: (refcount=1, is_ref=0)='Xu Zheng's Way of Technological Growth'
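The moment a count reaches zero can be made visible with an object destructor, which PHP runs when the last reference to an object goes away. This is only a sketch: the Tracked class and its static log are hypothetical helpers invented for the illustration.

```php
<?php
// Show that an object is destroyed as soon as its last reference is removed.
final class Tracked
{
    /** @var string[] messages recorded by destructors */
    public static $log = [];

    private $name;

    public function __construct(string $name)
    {
        $this->name = $name;
    }

    public function __destruct()
    {
        self::$log[] = "destroyed {$this->name}";
    }
}

$a = new Tracked('x');
$b = $a;                      // two variables now refer to the same object

unset($a);                    // one reference left: the object stays alive
$aliveAfterFirstUnset = (count(Tracked::$log) === 0);

unset($b);                    // no references left: the destructor runs right here
$logAfterSecondUnset = Tracked::$log;

var_dump($aliveAfterFirstUnset, $logAfterSecondUnset);
```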

Memory leaks caused by circular references

However, the mechanism used before PHP 5.3 has a loophole. When an element of an array or object refers back to its parent, and the parent variable is then unset, the variable container is not freed, because its own element still points to it and the refcount never reaches zero. At the same time, no symbol in any scope points to the container any more, so user code cannot clean it up. The result is a memory leak that lasts until the script finishes executing.

For example:

$a = array( 'one' );
$a[] = &$a; // the second element is a reference to the array itself
xdebug_debug_zval( 'a' );

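The textual output of this example is hard to read, so the original illustrated it with a figure (not reproduced here): the symbol a and the array's element 1 both point to the same variable container, which has refcount=2 and is_ref=1.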

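Now unset the variable:

unset($a);
xdebug_debug_zval('a');

The original figure (not reproduced here) showed the result: the symbol a is gone, but the variable container still has refcount=1, because its own element 1 still points to it.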
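On PHP 5.3 or later, the stranded container can be observed and then reclaimed by calling gc_collect_cycles() by hand. The sketch below assumes the collector is available and that a single cycle is far too few roots to trigger an automatic run; the 'payload' and 'self' keys and the 1 MB string are made up so that the leak is easy to measure.

```php
<?php
// Show that a self-referencing array survives unset() until a collection runs.
gc_enable();                 // make sure the cycle collector is switched on
gc_collect_cycles();         // start from an empty root buffer

$before = memory_get_usage();

$a = ['payload' => str_repeat('x', 1000000)]; // about 1 MB, easy to spot
$a['self'] = &$a;            // the array now contains a reference to itself
unset($a);                   // the symbol is gone, but the refcount is still 1

$afterUnset = memory_get_usage();    // the payload is still allocated
$collected = gc_collect_cycles();    // force a collection cycle
$afterCollect = memory_get_usage();  // the payload has been released

printf("held after unset: %d bytes\n", $afterUnset - $before);
printf("collected %d, freed %d bytes\n", $collected, $afterUnset - $afterCollect);
```

Before PHP 5.3 there is no gc_collect_cycles(), and the memory in this example would stay allocated until the script ends.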

New garbage collection mechanism

PHP 5.3 introduced the root buffer. Whenever a zval's refcount is decremented but does not reach zero, the zval might be part of a circular reference, so PHP records it in the root buffer as a possible root. The buffer holds 10,000 possible roots by default (in PHP 5.3 this size is the compile-time constant GC_ROOT_BUFFER_MAX_ENTRIES; the php.ini setting zend.enable_gc only switches the collector on or off). When the buffer is full, PHP runs a collection cycle over the recorded roots and frees any circular garbage it finds, which solves the memory leak caused by circular references.
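The automatic trigger can be observed by producing more garbage cycles than the root buffer holds. The sketch below assumes the default threshold of roughly 10,000 roots (newer PHP versions may adjust the threshold at run time); the SelfRef class and its counter are hypothetical, and gc_collect_cycles() is deliberately never called.

```php
<?php
// Create more garbage cycles than the root buffer holds and watch PHP collect.
final class SelfRef
{
    /** @var int number of destructors that have run */
    public static $destroyed = 0;

    public $self;

    public function __construct()
    {
        $this->self = $this; // the object refers to itself: a one-element cycle
    }

    public function __destruct()
    {
        self::$destroyed++;
    }
}

gc_enable(); // make sure the cycle collector is switched on

for ($i = 0; $i < 25000; $i++) {
    $o = new SelfRef();
    unset($o); // refcount drops from 2 to 1, so the object becomes a possible root
}

// No manual collection was requested, yet destructors have already run,
// because the full root buffer triggered collection cycles on its own.
$destroyedAutomatically = SelfRef::$destroyed;
printf("destroyed without a manual collection: %d\n", $destroyedAutomatically);
```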

Criteria for recognition as garbage

1. If a reference count is decremented to zero, the variable container is freed immediately; it is not garbage.
2. If a zval's reference count is decremented and is still greater than zero, the zval may be part of a garbage cycle and is recorded as a possible root. During a collection cycle, the collector simulates decrementing by 1 the reference counts reachable from each root and checks which variable containers end up at zero; those containers are garbage.
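The second criterion can be sketched as a toy simulation of trial deletion. This is not the Zend engine's code: the TrialDeletion class, the node names, and the array layout are all invented for illustration, and each node only carries a refcount and a list of the nodes it points to.

```php
<?php
// Toy model of cycle detection by trial deletion (not the engine's implementation).
final class TrialDeletion
{
    /** @var array node id => ['rc' => int, 'children' => string[], 'color' => string] */
    private $nodes = [];

    public function addNode(string $id, int $refcount, array $children): void
    {
        $this->nodes[$id] = ['rc' => $refcount, 'children' => $children, 'color' => 'black'];
    }

    public function refcount(string $id): int
    {
        return $this->nodes[$id]['rc'];
    }

    /** Runs one collection over the possible roots and returns the ids found to be garbage. */
    public function collect(array $roots): array
    {
        foreach ($roots as $id) { $this->markGray($id); }
        foreach ($roots as $id) { $this->scan($id); }
        $garbage = [];
        foreach ($roots as $id) { $this->collectWhite($id, $garbage); }
        return $garbage;
    }

    // Step 1: simulate removing every reference that comes from inside the graph.
    private function markGray(string $id): void
    {
        if ($this->nodes[$id]['color'] === 'gray') { return; }
        $this->nodes[$id]['color'] = 'gray';
        foreach ($this->nodes[$id]['children'] as $child) {
            $this->nodes[$child]['rc']--;
            $this->markGray($child);
        }
    }

    // Step 2: a count above zero means an outside reference exists, so restore;
    // a count of zero means the node is only kept alive by the cycle (white).
    private function scan(string $id): void
    {
        if ($this->nodes[$id]['color'] !== 'gray') { return; }
        if ($this->nodes[$id]['rc'] > 0) { $this->scanBlack($id); return; }
        $this->nodes[$id]['color'] = 'white';
        foreach ($this->nodes[$id]['children'] as $child) { $this->scan($child); }
    }

    // Undo the simulated decrements for everything reachable from a live node.
    private function scanBlack(string $id): void
    {
        $this->nodes[$id]['color'] = 'black';
        foreach ($this->nodes[$id]['children'] as $child) {
            $this->nodes[$child]['rc']++;
            if ($this->nodes[$child]['color'] !== 'black') { $this->scanBlack($child); }
        }
    }

    // Step 3: every white node is garbage and is removed.
    private function collectWhite(string $id, array &$garbage): void
    {
        if (!isset($this->nodes[$id]) || $this->nodes[$id]['color'] !== 'white') { return; }
        $this->nodes[$id]['color'] = 'black';
        foreach ($this->nodes[$id]['children'] as $child) { $this->collectWhite($child, $garbage); }
        $garbage[] = $id;
        unset($this->nodes[$id]);
    }
}

$gc = new TrialDeletion();
$gc->addNode('A', 1, ['B']); // A and B only reference each other: a garbage cycle
$gc->addNode('B', 1, ['A']);
$gc->addNode('C', 1, ['D']); // C is still held by a live variable and points to D
$gc->addNode('D', 1, []);

$garbage = $gc->collect(['A', 'C']); // A and C are the recorded possible roots
sort($garbage);
print_r($garbage); // lists A and B; C and D keep their original counts
```

In this model, A and B end up with a simulated count of zero and are reported as garbage, while C keeps a count of 1 from its live variable, so the decrement applied to D is undone.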

Summary

Garbage collection mechanism:
1. Reference counting: a variable container is freed as soon as its reference count reaches zero (this was the only mechanism before PHP 5.3).
2. The root buffer: when a zval's reference count is decremented but stays above zero, PHP records it in the root buffer as a possible root of a circular reference. When the root buffer is full (10,000 roots by default), a collection cycle runs and frees the circular garbage, which solves the memory leak caused by circular references (introduced in PHP 5.3).

Posted by aubeasty on Sun, 12 May 2019 03:32:09 -0700