Keywords: Python | List Copying | Deep Copy | Shallow Copy | Object References
Abstract: This article provides a comprehensive examination of list copying mechanisms in Python, focusing on the critical distinctions between shallow and deep copying. Through detailed code examples and memory structure analysis, it explains why the list() function fails to achieve true deep copying and demonstrates the correct implementation using copy.deepcopy(). The discussion also covers reference relationship preservation during copying operations, offering complete guidance for Python developers.
Overview of Python List Copying Mechanisms
In Python programming, list copying is a common but frequently misunderstood operation. Many developers mistakenly believe that using the list() function or slice operation [:] can create completely independent copies of lists, but these methods only achieve shallow copying. Shallow copying duplicates only the outermost container while maintaining reference relationships to nested internal objects.
Limitations of Shallow Copying
Let us understand the working mechanism of shallow copying through a concrete example. Consider the following code:
E0 = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
for k in range(3):
E0_copy = list(E0)
E0_copy[k][k] = 0
print(E0) # Output: [[0, 2, 3], [4, 0, 6], [7, 8, 0]]
In this example, although E0_copy is a new list object (verifiable through id(E0) != id(E0_copy)), when modifying nested list elements within E0_copy, the original list E0 also changes accordingly. This occurs because list(E0) only creates a new instance of the outer list while the internally nested list objects still share the same references as the original list.
Correct Implementation of Deep Copying
To achieve true deep copying, the copy.deepcopy() function from Python's standard library must be used. This function recursively copies all nested objects, creating completely independent duplicates:
import copy
# Original list
a = [[1, 2, 3], [4, 5, 6]]
# Shallow copy example
b = list(a)
a[0][1] = 10
print("a:", a) # Output: [[1, 10, 3], [4, 5, 6]]
print("b:", b) # Output: [[1, 10, 3], [4, 5, 6]] - b also changes
# Deep copy example
c = copy.deepcopy(a)
a[0][1] = 9
print("a:", a) # Output: [[1, 9, 3], [4, 5, 6]]
print("c:", c) # Output: [[1, 10, 3], [4, 5, 6]] - c remains unchanged
Memory Reference Relationship Verification
Examining object identifiers provides clearer understanding of copying mechanisms:
a = [[1, 2, 3], [4, 5, 6]]
b = list(a) # Shallow copy
c = copy.deepcopy(a) # Deep copy
print("Outer list identifier comparison:")
print(f"id(a) == id(b): {id(a) == id(b)}") # False
print(f"id(a) == id(c): {id(a) == id(c)}") # False
print("Inner list identifier comparison:")
print(f"id(a[0]) == id(b[0]): {id(a[0]) == id(b[0])}") # True - shallow copy shares references
print(f"id(a[0]) == id(c[0]): {id(a[0]) == id(c[0])}") # False - deep copy creates new objects
Graph Structure Preservation in Deep Copying
An important characteristic of copy.deepcopy() is its ability to preserve the graphical structure of original data. Consider this complex structure containing shared references:
import copy
# Create list with shared references
a = [1, 2]
b = [a, a] # Both elements in b reference the same list a
# Perform deep copy
c = copy.deepcopy(b)
# Verify structure preservation
print(f"c[0] is a: {c[0] is a}") # False - new objects created
print(f"c[0] is c[1]: {c[0] is c[1]}") # True - shared reference structure maintained
The deep copy algorithm employs a hash table to map original objects to newly created objects, ensuring no unnecessary duplicate objects are created during the copying process while preserving the reference relationship structure of the original data.
Copying Method Comparison Summary
Python provides multiple copying approaches, each suitable for different scenarios:
- Assignment Operation:
b = a- Creates only reference aliases, copies no data - Shallow Copy:
list(a),a[:],a.copy()- Copies only the outermost container - Deep Copy:
copy.deepcopy(a)- Recursively copies all nested objects
Practical Application Recommendations
When selecting copying methods, consider data structure complexity and performance requirements. For simple flat lists, shallow copying is usually sufficient. However, for data structures containing nested mutable objects, deep copying is necessary to ensure data independence. Note that deep copy operations may incur performance overhead, particularly when handling large or complex data structures.
Similar copying mechanism discussions exist in other programming languages like Kotlin. As mentioned in the reference article, defining the depth of "deep copy" is a complex issue requiring consideration of special cases like recursive data structures and circular references. Python's copy.deepcopy() provides a relatively comprehensive solution capable of handling most common deep copying requirements.