
Why Is NumPy Much Faster At Creating A Zero Array Compared To Replacing The Values Of An Existing Array With Zeros?

I have an array which is used to track various values. The array is 2500x1700 in size, so it is not very large. At the end of a session I need to reset all of the values within that array to zero.
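
A minimal sketch of the comparison in question (assuming the in-place reset is done with fill(0); the question's exact reset code is not shown):

import numpy as np
import timeit

shape = (2500, 1700)
tracker = np.zeros(shape)   # stand-in for the existing tracking array

# Option A: allocate a brand-new zero array.
t_new = timeit.timeit(lambda: np.zeros(shape), number=1000)

# Option B: reset the existing array's values to zero in place.
t_reset = timeit.timeit(lambda: tracker.fill(0), number=1000)

print(f"np.zeros(shape): {t_new / 1000 * 1e6:8.1f} µs per call")
print(f"tracker.fill(0): {t_reset / 1000 * 1e6:8.1f} µs per call")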

Solution 1:

The reason is that the array is not actually filled in memory on mainstream operating systems (Windows, Linux and macOS). NumPy allocates a zero-filled array by requesting a zero-filled area of virtual memory from the operating system (OS). This area is not directly mapped to physical RAM; the mapping and zero-initialization are generally done lazily by the OS when you first read or write the pages. The cost is paid later, for example when you set the array to 1. Here is a proof:

In [19]: %timeit res = np.zeros(shape=(2500, 1700))
10.8 µs ± 118 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

In [20]: %timeit res = np.ones(shape=(2500, 1700))
7.54 ms ± 151 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

The former would imply a RAM throughput of roughly 2930 GiB/s (the array holds 2500x1700 float64 values, about 32.4 MiB, allocated in 10.8 µs), which is stupidly high since my machine (as well as any standard desktop/server machine) is barely able to reach 36 GiB/s (using a carefully-optimized benchmark). The latter would imply a RAM throughput of about 4.2 GiB/s, which is not high but fair.
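
One way to see this deferred cost directly is to time the allocation alone against allocation followed by a first full write. A minimal sketch with timeit (exact numbers will vary by machine and OS):

import numpy as np
import timeit

shape = (2500, 1700)

def alloc_only():
    # The OS hands back lazily-mapped, zero-filled virtual pages.
    np.zeros(shape)

def alloc_and_touch():
    # The first full write forces every page to actually be mapped
    # into physical RAM (and zeroed) before being overwritten with ones.
    a = np.zeros(shape)
    a.fill(1.0)

t_alloc = timeit.timeit(alloc_only, number=100)
t_touch = timeit.timeit(alloc_and_touch, number=100)

print(f"np.zeros alone:        {t_alloc / 100 * 1e6:10.1f} µs per loop")
print(f"np.zeros + full write: {t_touch / 100 * 1e6:10.1f} µs per loop")

The second timing should land close to the np.ones measurement above, confirming that the page-touching cost is merely postponed, not avoided.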

