Python memory management Help

Lists

An empty list

from pympler import asizeof import sys a=[] print(sys.getsizeof(a)) # 56 print(asizeof.asizeof(a)) # 56

The empty list takes 56 bytes of memory due to the fact that it is an object and has a number of attributes and methods which are also stored

Manually created lists

from pympler import asizeof import sys a=[] print(sys.getsizeof(a)) # 56 print(asizeof.asizeof(a)) # 56 a=[0] print(sys.getsizeof(a)) # 64 print(asizeof.asizeof(a)) # 96 a=[0, 1] print(sys.getsizeof(a)) # 72 print(asizeof.asizeof(a)) # 136

A list with one integer The .asizeof() function returns the empty list (56 bytes) plus the integer pointer (8 bytes) plus the integer itself in memory (32 bytes) for a total of 96 bytes

A list with two integers The .asizeof() function returns the empty list (56 bytes) plus the two integer pointers (16 bytes) plus the two integer themselves in memory (64 bytes) for a total of 136 bytes

Beyond this the resizing functions start to affect the list size. Besides, we are interested in dynamically resised lists.

Dynamic lists

Consider the following code:

import sys from pympler import asizeof a = [] print(f'List alone requires {sys.getsizeof(a)} bytes. Total memory requirement is {asizeof.asizeof(a)} bytes.') for i in range(20): a.append(i) print(f'List of {len(a)} elements requires {sys.getsizeof(a)} bytes. Total memory requirement is {asizeof.asizeof(a)} bytes.')

The output is:

List alone requires 56 bytes. Total memory requirement is 56 bytes. List of 1 elements requires 88 bytes. Total memory requirement is 120 bytes. List of 2 elements requires 88 bytes. Total memory requirement is 152 bytes. List of 3 elements requires 88 bytes. Total memory requirement is 184 bytes. List of 4 elements requires 88 bytes. Total memory requirement is 216 bytes. List of 5 elements requires 120 bytes. Total memory requirement is 280 bytes. List of 6 elements requires 120 bytes. Total memory requirement is 312 bytes. List of 7 elements requires 120 bytes. Total memory requirement is 344 bytes. List of 8 elements requires 120 bytes. Total memory requirement is 376 bytes. List of 9 elements requires 184 bytes. Total memory requirement is 472 bytes. List of 10 elements requires 184 bytes. Total memory requirement is 504 bytes. List of 11 elements requires 184 bytes. Total memory requirement is 536 bytes. List of 12 elements requires 184 bytes. Total memory requirement is 568 bytes. List of 13 elements requires 184 bytes. Total memory requirement is 600 bytes. List of 14 elements requires 184 bytes. Total memory requirement is 632 bytes. List of 15 elements requires 184 bytes. Total memory requirement is 664 bytes. List of 16 elements requires 184 bytes. Total memory requirement is 696 bytes. List of 17 elements requires 248 bytes. Total memory requirement is 792 bytes. List of 18 elements requires 248 bytes. Total memory requirement is 824 bytes. List of 19 elements requires 248 bytes. Total memory requirement is 856 bytes. List of 20 elements requires 248 bytes. Total memory requirement is 888 bytes.

The list sizes

Note that the empty list takes 56 bytes.

If a single element is appended the list size increases to 88 bytes. This is an increase of 32 bytes which is equivalent to four (4), eight byte pointers.

Resizing the list is very CPU intensive so the list size is increased in increments.

static int list_resize(PyListObject *self, Py_ssize_t newsize) { size_t new_allocated, target_bytes; Py_ssize_t allocated = self->allocated; /* Bypass realloc() when a previous overallocation is large enough to accommodate the newsize. If the newsize falls lower than half the allocated size, then proceed with the realloc() to shrink the list. */ if (allocated >= newsize && newsize >= (allocated >> 1)) { assert(self->ob_item != NULL || newsize == 0); Py_SET_SIZE(self, newsize); return 0; } /* This over-allocates proportional to the list size, making room * for additional growth. The over-allocation is mild, but is * enough to give linear-time amortized behavior over a long * sequence of appends() in the presence of a poorly-performing * system realloc(). * Add padding to make the allocated size multiple of 4. * The growth pattern is: 0, 4, 8, 16, 24, 32, 40, 52, 64, 76, ... * Note: new_allocated won't overflow because the largest possible value * is PY_SSIZE_T_MAX * (9 / 8) + 6 which always fits in a size_t. */ new_allocated = ((size_t)newsize + (newsize >> 3) + 6) & ~(size_t)3; /* Do not overallocate if the new size is closer to overallocated size * than to the old size. */ if (newsize - Py_SIZE(self) > (Py_ssize_t)(new_allocated - newsize)) new_allocated = ((size_t)newsize + 3) & ~(size_t)3; if (newsize == 0) new_allocated = 0;

Need to play with this: new_allocated = ((size_t)newsize + (newsize >> 3) + 6) & ~(size_t)3

Resizing explained

As soon as an element is appended to the empty list the resizing algorithm is triggered. The resizing algorithm calculates elements, not bytes, so the result has to be multiplied by eight to get the memory requirement in bytes

Eg 1. Append one integer to empty list

Recall that the list already requires 56 bytes.

The list is currently zero elements long. Appending an integer takes the new size to one. The calculation becomes:

new_alloc = [1 + (1 >> 3) + 6] & ~3

Now (1 >> 3) = 0 so the calculation simplifies to:

new_alloc = 7 & ~3

The final stage needs to be done in binary. 7 & ~3 is:

0000 0111 1111 1100 --------- 0000 0100 = 4

The list increases its length to four elements (of which one contains an eight byte pointer) with each element reserving eight bytes. 4 x 8 bytes = 32 bytes so the size of the list in memory is 56 + 32 = 88 bytes. Include the 32 bytes for the integer itself that was appended and the total memory requirement is 88 + 32 = 120 bytes.

Eg 2. Append one integer to a list of four integers

If the list contains four integers then the list itself requires 88 bytes (from previous example) plus the four 32 byte integers in memory. 4 x 32 = 128 bytes so the total memory required is 88 + 128 = 216 bytes. Adding another integer will trigger another resize.

new_alloc = [5 + (5 >> 3) + 6] & ~3

Now (5 >> 3) = 0 so the calculation simplifies to:

new_alloc = 11 & ~3

The final stage needs to be done in binary. 11 & ~3 is:

0000 1011 1111 1100 --------- 0000 1000 = 8

The list increases its length to eight elements (of which five contain eight byte pointers) with each element reserving eight bytes. 8 x 8 bytes = 64 bytes so the size of the list in memory is 56 + 64 = 120 bytes. Include the (5 x 32) bytes for the integers themselves that have been appended and the total memory requirement is 120 + (5 x 32) = 280 bytes.

Repeated values

Consider the following code:

import sys from pympler import asizeof a = [] print(f'List alone requires {sys.getsizeof(a)} bytes. Total memory requirement is {asizeof.asizeof(a)} bytes.') for i in range(6): a.append(i) print(f'List: {a} requires {sys.getsizeof(a)} bytes. Total memory requirement is {asizeof.asizeof(a)} bytes.')

The output is:

List alone requires 56 bytes. Total memory requirement is 56 bytes. List: [0] requires 88 bytes. Total memory requirement is 120 bytes. List: [0, 1] requires 88 bytes. Total memory requirement is 152 bytes. List: [0, 1, 2] requires 88 bytes. Total memory requirement is 184 bytes. List: [0, 1, 2, 3] requires 88 bytes. Total memory requirement is 216 bytes. List: [0, 1, 2, 3, 4] requires 120 bytes. Total memory requirement is 280 bytes. List: [0, 1, 2, 3, 4, 5] requires 120 bytes. Total memory requirement is 312 bytes.

Exactly as described above. Change this line:

a.append(i)

to

a.append(1)

and run the code again. The output is different:

List alone requires 56 bytes. Total memory requirement is 56 bytes. List: [1] requires 88 bytes. Total memory requirement is 120 bytes. List: [1, 1] requires 88 bytes. Total memory requirement is 120 bytes. List: [1, 1, 1] requires 88 bytes. Total memory requirement is 120 bytes. List: [1, 1, 1, 1] requires 88 bytes. Total memory requirement is 120 bytes. List: [1, 1, 1, 1, 1] requires 120 bytes. Total memory requirement is 152 bytes. List: [1, 1, 1, 1, 1, 1] requires 120 bytes. Total memory requirement is 152 bytes.

The total memory requirement is only 32 bytes above the list requirement in all cases. This is because the pointers in the list all point to the same integer in memory so there is a single "1" stored in 32 bytes and each pointer points to it.

If a list contains many repeated values it will be a lot smaller overall compared to a list that contains unique values.

Last modified: 03 July 2024