Use PHP's array_search function with caution

Keywords: PHP

array_search is an array function that PHP developers use frequently, but it is also often misused. For example, suppose you need to collect the elements that appear in both of two large arrays (yes, array_intersect can do this, but that does not stop us from using it as an example).

With array_search, it would be written like this:

$arr1 = ['suppose this array has a million elements'];
$arr2 = ['suppose this array has a million elements'];
$arr3 = [];
foreach ($arr1 as $v) {
    // Linear scan of $arr2 for every element of $arr1
    $k = array_search($v, $arr2);
    if ($k !== false) {
        // Found in both arrays
        $arr3[] = $v;
    }
}

Here is the implementation of array_search, from ext/standard/array.c in the PHP source. Only part of the code is excerpted and the macro implementations are not pasted here, but you can guess from their names how they work: they loop over the array.

static inline void php_search_array(INTERNAL_FUNCTION_PARAMETERS, int behavior) /* {{{ */
{
    if (strict) {
        ZEND_HASH_FOREACH_KEY_VAL(Z_ARRVAL_P(array), num_idx, str_idx, entry) {
            ......
        } ZEND_HASH_FOREACH_END();
    } else {
        if (Z_TYPE_P(value) == IS_LONG) {
            ZEND_HASH_FOREACH_KEY_VAL(Z_ARRVAL_P(array), num_idx, str_idx, entry) {
                ......
            } ZEND_HASH_FOREACH_END();
        } else if (Z_TYPE_P(value) == IS_STRING) {
            ZEND_HASH_FOREACH_KEY_VAL(Z_ARRVAL_P(array), num_idx, str_idx, entry) {
                ......
            } ZEND_HASH_FOREACH_END();
        } else {
            ZEND_HASH_FOREACH_KEY_VAL(Z_ARRVAL_P(array), num_idx, str_idx, entry) {
                ......
            } ZEND_HASH_FOREACH_END();
        }
    }
    RETURN_FALSE;
}

So the array_search version above has a time complexity of O(n^2), which is quite terrible: each of the n elements of $arr1 triggers a linear scan of $arr2.
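To make the cost concrete, here is a minimal sketch that counts comparisons in a hand-written linear search. The function linearSearch and its counter are hypothetical helpers written for illustration; they only mimic the loop traversal described above and are not the actual C implementation.

```php
<?php
// Hypothetical helper: a userland linear search that also counts comparisons.
// It mimics the loop traversal described above, not the real C code.
function linearSearch($needle, array $haystack, int &$comparisons)
{
    foreach ($haystack as $key => $value) {
        $comparisons++;
        if ($value === $needle) {
            return $key;
        }
    }
    return false;
}

$arr1 = range(1, 1000);
$arr2 = range(1001, 2000); // no common elements: worst case for every lookup

$comparisons = 0;
foreach ($arr1 as $v) {
    linearSearch($v, $arr2, $comparisons);
}

echo $comparisons, PHP_EOL; // 1000 * 1000 = 1000000
```

With two arrays of 1,000 elements and no matches, every lookup walks the whole second array, so the work grows with the product of the two sizes.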

Now let's look at another implementation. First flip the keys and values of $arr2 with array_flip, so that membership can be tested with a key lookup. Because a hash table lookup takes O(1) on average (O(n) only in the worst case), this implementation runs in O(n) overall. This is a qualitative improvement: the time complexity drops from quadratic to linear.

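$arr1 = ['suppose this array has a million elements'];
// Flip once so the values of the second array become keys
$arr2 = array_flip(['suppose this array has a million elements']);
$arr3 = [];
foreach ($arr1 as $v) {
    // Hash lookup instead of a linear scan
    if (isset($arr2[$v])) {
        $arr3[] = $v;
    }
}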
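As a rough sanity check, the same idea can be wrapped in a function and compared against array_intersect on a small input. The name commonElements is a hypothetical helper, and the sketch assumes both arrays hold only integer or string values.

```php
<?php
// Hypothetical helper name; assumes both arrays hold only int or string values.
function commonElements(array $arr1, array $arr2): array
{
    $lookup = array_flip($arr2); // values of $arr2 become keys
    $common = [];
    foreach ($arr1 as $v) {
        if (isset($lookup[$v])) { // hash lookup instead of a linear scan
            $common[] = $v;
        }
    }
    return $common;
}

$a = ['apple', 'banana', 'cherry', 'date'];
$b = ['cherry', 'apple', 'fig'];

$result = commonElements($a, $b);
print_r($result);

// array_intersect() preserves the keys of $a, so reindex before comparing.
var_dump($result === array_values(array_intersect($a, $b)));
```

Note that isset still works for the element flipped to index 0, because isset only returns false for missing keys and null values.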

Similar scenarios come up often when compiling data statistics, and this technique can greatly improve a program's running performance. It has its limitations, though: for example, when the values contain duplicates, array_flip keeps only the last key for each value, and it only accepts integer and string values. There is no silver bullet.
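The duplicate-value limitation can be seen in a small sketch like the following, where the sample array is made up for illustration. If you need the key of a match rather than a simple yes/no answer, the two approaches disagree.

```php
<?php
$arr = ['a', 'b', 'a'];

// array_search() returns the key of the first match.
$firstKey = array_search('a', $arr);
var_dump($firstKey); // int(0)

// array_flip() keeps only the last key for a repeated value.
$flipped = array_flip($arr);
var_dump($flipped['a']);   // int(2)
var_dump(count($flipped)); // int(2)
```

For a pure membership test the collapsed duplicates do no harm, but any logic that depends on which key matched, or on how many times a value occurs, needs a different structure.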

Posted by grayscale2005. on Sun, 05 Jan 2020 23:01:49 -0800