Python shallow and deep copies

1. Difference between == and is

The == operator compares the values of two objects for equality.
The is operator compares whether the identities of objects are equal, that is, whether they are the same object and point to the same memory address.

In Python, the identity of an object can be obtained with the built-in function id(object). The is operator is therefore equivalent to comparing whether the IDs of two objects are equal.

a = 10
b = 10

print(a == b)  # True

id(a)  # 4427562448

id(b)  # 4427562448

print(a is b)  # True

First, Python allocates a piece of memory for the value 10, and then the variables a and b both point to this memory area. In other words, a and b refer to the same object 10, so their values are equal and their IDs are equal, and both a == b and a is b return True.
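The same distinction can be seen with lists, where every list display creates a new object. This is a minimal sketch; the names xs, ys and alias are illustrative and not part of the original example.

```python
xs = [1, 2, 3]
ys = [1, 2, 3]   # equal value, but a separate object
alias = xs       # a second name for the same object

print(xs == ys)      # True: the values are equal
print(xs is ys)      # False: two different objects
print(xs is alias)   # True: one object, two names

# "is" gives the same answer as comparing the results of id()
print((xs is alias) == (id(xs) == id(alias)))  # True
```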

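Note that for integers, a is b is True in this way only for numbers in the range [-5, 256]. This is an implementation detail of CPython rather than a guarantee of the language. The output below comes from an interactive session in which each line is entered separately; when the same lines run together as a script, CPython may reuse a single constant object and print True.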

a = 257
b = 257

print(a == b)  # True

id(a)  # 4473417552

id(b)  # 4473417584

print(a is b)  # False

In fact, as a performance optimization, Python internally maintains an array of the integers in the range [-5, 256] to serve as a cache. Every time you create an integer in that range, Python returns a reference to the corresponding cached object instead of allocating new memory. If an integer is outside this range, however, Python allocates two separate memory areas for the two integers.
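The cache can be observed without depending on how literals are compiled by building the integers at run time. This is a sketch that assumes CPython; the variable names are illustrative, and other interpreters may behave differently for the identity checks.

```python
# int("...") creates the values at run time, so the compiler cannot merge them.
small_a, small_b = int("256"), int("256")
large_a, large_b = int("257"), int("257")

# Value equality holds in both cases.
print(small_a == small_b, large_a == large_b)  # True True

small_same = small_a is small_b   # True on CPython: 256 comes from the cache
large_same = large_a is large_b   # typically False on CPython: two new objects
print(small_same)
print(large_same)
```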

Generally speaking, in practice we use == far more often than is when comparing variables, because we usually care about the values of two variables rather than where they are stored. However, when comparing a variable with a singleton, we usually use is, for example:

if a is None:
    pass

if a is not None:
    pass

The is operator cannot be overloaded, so Python does not need to check whether the comparison has been overloaded elsewhere in the program. When is is executed, it simply compares whether the IDs of the two variables are equal.

The == operator is different. Executing a == b is equivalent to calling a.__eq__(b), and most data types in Python override __eq__. The internal processing of this method is usually a little more complex. For example, the __eq__ method of a list iterates over the elements of both lists and checks whether their order and values correspond.
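To illustrate how == can be customized while is cannot, here is a small sketch. The classes Point and AlwaysEqual are hypothetical and exist only for this example; the second one also shows why comparisons with None are usually written with is.

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        # Compare by value; defer to the other operand for unknown types.
        if not isinstance(other, Point):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)


p = Point(1, 2)
q = Point(1, 2)
print(p == q)  # True: __eq__ compares the coordinates
print(p is q)  # False: two distinct objects


class AlwaysEqual:
    def __eq__(self, other):
        return True  # deliberately misleading, to show that == can be fooled


obj = AlwaysEqual()
print(obj == None)  # True, although obj is clearly not None
print(obj is None)  # False: identity cannot be overridden
```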

Although tuples are immutable, they can be nested, and their elements can be mutable objects such as lists. Therefore, if we modify a mutable element inside a tuple, the value of the tuple changes as well, even though the tuple still refers to the same objects.
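A short sketch of this behavior follows; the names t, before_id and snapshot are illustrative. The tuple keeps its identity and still rejects item assignment, yet its value changes because the list inside it is modified in place.

```python
t = (1, [2, 3])
before_id = id(t)
snapshot = (1, [2, 3])   # an independent tuple with the same initial value

t[1].append(4)           # mutate the list stored inside the tuple

print(t)                    # (1, [2, 3, 4])
print(id(t) == before_id)   # True: still the same tuple object
print(t == snapshot)        # False: its value has changed

try:
    t[1] = [9]              # rebinding a slot of the tuple is not allowed
except TypeError as exc:
    print("TypeError:", exc)
```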

Shallow and deep copy

1. Common shallow copy methods

Use the constructor of the data type itself

l1 = [1, 2, 3]
l2 = list(l1)

print(l2)  # [1, 2, 3]
print(l1 == l2)  # True
print(l1 is l2)  # False

l2 is a shallow copy of l1.

For mutable sequences, you can use the slice operator [:] to make a shallow copy

l1 = [1, 2, 3]
l2 = l1[:]

print(l1 == l2)  # True
print(l1 is l2)  # False

Note: for tuples, tuple() and the slice operator [:] do not create a shallow copy; they return a reference to the same tuple. For example:

t1 = (1, 2, 3)
t2 = tuple(t1)

print(t1 == t2)  # True
print(t1 is t2)  # True
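The same check can be made for slicing and for copy.copy(). This is a sketch assuming CPython, where a full slice of a tuple returns the original object; the variable names are illustrative.

```python
import copy

t1 = (1, 2, 3)

same_by_slice = t1[:] is t1          # True on CPython: no new tuple is built
same_by_copy = copy.copy(t1) is t1   # True: copy.copy returns tuples unchanged
partial = t1[:2]                     # a partial slice does build a new tuple

print(same_by_slice)
print(same_by_copy)
print(partial)  # (1, 2)
```

Returning the same object is safe here because a tuple cannot be modified through either reference.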

Use copy.copy() for any data type

import copy

l1 = [1, 2, 3]
l2 = copy.copy(l1)

print(l1 == l2)  # True
print(l1 is l2)  # False

Shallow copy: this means allocating a new piece of memory and creating a new object whose elements are references to the child objects of the original object. If some of the child objects in the original object are mutable, a shallow copy usually brings side effects, such as:

l1 = [[1, 2], (3, 4)]
l2 = list(l1)
l1.append(100)
l1[0].append(3)

print(l1)  # [[1, 2, 3], (3, 4), 100]
print(l2)  # [[1, 2, 3], (3, 4)]

l1[1] += (5, 6)
print(l1)  # [[1, 2, 3], (3, 4, 5, 6), 100]

print(l2)  # [[1, 2, 3], (3, 4)]

Here, l1.append(100) appends the element 100 to the list l1. This operation has no effect on l2, because l2 and l1 are two different objects that do not share a memory address. After the operation l1 has changed, but l2 has not.

l1[0].append(3) appends 3 to the first element of l1, which is a list. Because l2 is a shallow copy of l1, the first element of l2 and the first element of l1 point to the same list, so 3 appears in the first list of l2 as well.

l1[1] += (5, 6) concatenates to the second element of l1, which is a tuple. Because tuples are immutable, a new tuple is created and assigned to the second position of l1. l2 does not reference the new tuple, so the second element of l2 does not change.
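The sharing described above can be checked directly with is. This is a minimal sketch that repeats the example's setup and inspects the identity of the elements.

```python
l1 = [[1, 2], (3, 4)]
l2 = list(l1)   # shallow copy

print(l1 is l2)        # False: the outer lists are different objects
print(l1[0] is l2[0])  # True: the inner list is shared
print(l1[1] is l2[1])  # True: the tuple is shared as well

l1[1] += (5, 6)        # builds a new tuple and rebinds l1[1] only
print(l1[1] is l2[1])  # False: l2 still holds the original tuple
print(l2[1])           # (3, 4)
```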

Deep copy

Deep copy: this means allocating a new piece of memory, creating a new object, and recursively copying the elements of the original object into it by creating new child objects. The new object therefore has no association with the original object.

import copy

l1 = [[1, 2], (3, 4)]
l2 = copy.deepcopy(l1)
l1.append(10)
l1[0].append(3)

print(l1)  # [[1, 2, 3], (3, 4), 10]
print(l2)  # [[1, 2], (3, 4)]
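For comparison, the following sketch puts a shallow copy and a deep copy of the same list side by side. The names shallow and deep are illustrative.

```python
import copy

l1 = [[1, 2], (3, 4)]
shallow = copy.copy(l1)
deep = copy.deepcopy(l1)

print(shallow[0] is l1[0])  # True: the inner list is shared
print(deep[0] is l1[0])     # False: the inner list was copied

l1[0].append(3)
print(shallow[0])  # [1, 2, 3]: the change is visible through the shallow copy
print(deep[0])     # [1, 2]: the deep copy is unaffected
```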

Note: deep copy is not without pitfalls. If the object being copied contains a reference to itself, a naive recursive copy could easily fall into infinite recursion:

import copy

x = [1]
x.append(x)

print(x)   #[1, [...]]

y = copy.deepcopy(x)
print(y)  # [1, [...]]

In the above example, the list x contains a reference to itself, so x is an infinitely nested list. Yet no stack overflow error occurred while deep copying x to y. Why?

This is because the deepcopy function maintains a dictionary that records the objects already copied, keyed by their IDs. During copying, if the object to be copied is already in the dictionary, the stored copy is returned directly, for example:

def deepcopy(x, memo=None, _nil=[]):
    if memo is None:
        memo = {}
    d = id(x)  # Look up the ID of the object x being copied
    y = memo.get(d, _nil)  # Check whether a copy of x is already stored in the dictionary
    if y is not _nil:
        return y   # x has already been copied, so return the stored copy directly
    ...  # Remainder omitted in this excerpt: copy x and record the copy in memo
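To make the role of the dictionary concrete, here is a toy version of the idea. The function toy_deepcopy is hypothetical, handles nested lists only, and is not the standard library's code; it just shows why registering a copy before recursing stops the cycle.

```python
def toy_deepcopy(obj, memo=None):
    """Toy deep copy for nested lists; any other object is returned as-is."""
    if memo is None:
        memo = {}
    key = id(obj)
    if key in memo:
        return memo[key]      # already copied (or being copied): reuse it
    if not isinstance(obj, list):
        return obj            # assumption: non-list values are treated as immutable
    result = []
    memo[key] = result        # register before recursing so cycles resolve
    for item in obj:
        result.append(toy_deepcopy(item, memo))
    return result


x = [1]
x.append(x)                   # x now contains itself

y = toy_deepcopy(x)
print(y is x)      # False: y is a new list
print(y[1] is y)   # True: the cycle points at the copy, not at x
```

Because the copy is stored in memo before its elements are visited, the second encounter with x returns the half-built copy instead of recursing again.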

Exercise:

import copy

x = [1]
x.append(x)

y = copy.deepcopy(x)

# What does the following command output?
print(x == y)

Answer: the program raises an error. x is an infinitely nested list and y is a deep copy of x, so in principle x == y should be True. However, the == operator recursively traverses the values of the objects and compares them one by one. To prevent the stack from crashing, Python limits the recursion depth, so running the code above raises: RecursionError: maximum recursion depth exceeded in comparison
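The error can be caught and inspected rather than letting the program stop. This is a small sketch; the flag name raised is illustrative.

```python
import copy

x = [1]
x.append(x)
y = copy.deepcopy(x)

raised = False
try:
    x == y   # compares x[1] with y[1], which is x == y again, without end
except RecursionError as exc:
    raised = True
    print("RecursionError:", exc)

print(raised)  # True
```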

Posted by russellbcv on Thu, 18 Nov 2021 18:40:08 -0800

Programmer Group
