Changing Something From Iterating Over A Numpy Array To Vectorization
Solution 1:
You can significantly improve your performance using np.where
to get the indices where your conditions happen:
ind = np.where( flow_direction_np==32 )
you will see that ind
is a tuple with two elements, the first is the indices of the first axis and the second of the second axis of your flow_direction_np
array.
You can work out with this indices to apply the shifts: i-1
, j-1
and so on...
ind_32 = (ind[0]-1, ind[1]-1)
Then you use fancy indexing to update the arrays:
elevation_gain[ ind_32 ] += sediment_transport_np[ ind ]
EDIT: applying this concept to your case would give something like this:
lookup = {32: (-1, -1),
16: ( 0, -1),
8: (+1, -1),
4: (+1, 0),
64: (-1, 0),
128: (-1, +1),
1: ( 0, +1),
2: (+1, +1)}
for num, shift in lookup.iteritems():
ind = np.where( flow_direction_np==num )
ind_num = ind[0] + shift[0], ind[1] + shift[1]
elevation_gain[ ind_num] += sediment_transport_np[ ind ]
Solution 2:
The reason that you're getting different results is due to the way python handles negative indexing.
For other folks reading, this question (and answer) are a follow up from here: Iterating through a numpy array and then indexing a value in another array
First off, I apologize that the "vectorized" code is as dense as it is. There's a through explanation in my earlier answer, so I won't repeat it here.
Your original code (in the original question) is actually subtly different than the version you posted here.
Basically, before you had
for [i, j], flow in np.ndenumerate(flow_direction_np):
try:
if flow == 32:
...
elif ...
...
and you were getting an index error when i+1
or j+1
was greater than the size of the grid.
Just doing:
for [i, j], flow in np.ndenumerate(flow_direction_np):
try:
if flow == 32:
...
elif ...
...
except IndexError:
elevation_change[i, j] = 0
is actually incorrect because it defines different boundary conditions on different sides of the grid.
In the second case, when j-1
or i-1
is negative, the value from the opposite side of the grid will be returned. However, when j+1
or i+1
is greater than the size of the grid, 0
will be returned. (Thus the "different boundary conditions".)
In the vectorized version of the code, 0
is returned both when the indices are negative and when they're beyond the bounds of the grid.
As a quick example, notice what happens with the following:
In [1]: x = [1, 2, 3]
In [2]: x[0]
Out[2]: 1
In [3]: x[1]
Out[3]: 2
In [4]: x[2]
Out[4]: 3
In [5]: x[3]
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
<ipython-input-5-ed224ad0520d> in <module>()
----> 1 x[3]
IndexError: list index out of range
In [6]: x[-1]
Out[6]: 3
In [7]: x[-2]
Out[7]: 2
In [8]: x[-3]
Out[8]: 1
In [9]: x[-4]
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
<ipython-input-9-f9c639f21256> in <module>()
----> 1 x[-4]
IndexError: list index out of range
In [10]:
Notice that negative indices up to the size of the sequence are valid and return the "opposite end" of the sequence. So, x[3]
raises an error, while x[-1]
just returns the other end.
Hopefully that's a bit clearer.
Post a Comment for "Changing Something From Iterating Over A Numpy Array To Vectorization"