python - Numpy nditer for memory saving?
I'm getting lost iterating over an ndarray with nditer.
Background
I'm trying to compute the eigenvalues of 3x3 symmetric matrices at each point of a 3D array. The data is a 4D array of shape [6, x, y, z], the 6 values being the entries of the matrix at point (x, y, z), over a ~500x500x500 cube of float32. I first used NumPy's eigvalsh, but it's optimized for large matrices, whereas I can use an analytical simplification for 3x3 symmetric matrices.
I implemented Wikipedia's simplification, both as a function that takes a single matrix and computes its eigenvalues (then iterating naively with nested loops), and as a vectorized version using NumPy.
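For reference, a minimal per-matrix sketch of that closed-form computation (following the trigonometric formula on Wikipedia's "Eigenvalue algorithm" page; the function name and the packing order of the six components are assumptions, not my actual code):

    import math

    def eigvals_sym3(a11, a22, a33, a12, a13, a23):
        # Closed-form eigenvalues of the real symmetric 3x3 matrix
        # [[a11, a12, a13], [a12, a22, a23], [a13, a23, a33]].
        p1 = a12**2 + a13**2 + a23**2
        if p1 == 0.0:
            return sorted((a11, a22, a33), reverse=True)  # diagonal matrix
        q = (a11 + a22 + a33) / 3.0
        p2 = (a11 - q)**2 + (a22 - q)**2 + (a33 - q)**2 + 2.0 * p1
        p = math.sqrt(p2 / 6.0)
        # r = det(B) / 2 with B = (A - q*I) / p, expanded for a symmetric matrix
        b11, b22, b33 = (a11 - q) / p, (a22 - q) / p, (a33 - q) / p
        b12, b13, b23 = a12 / p, a13 / p, a23 / p
        r = (b11 * (b22 * b33 - b23 * b23)
             - b12 * (b12 * b33 - b23 * b13)
             + b13 * (b12 * b23 - b22 * b13)) / 2.0
        r = max(-1.0, min(1.0, r))  # clamp against rounding error
        phi = math.acos(r) / 3.0
        lam1 = q + 2.0 * p * math.cos(phi)
        lam3 = q + 2.0 * p * math.cos(phi + 2.0 * math.pi / 3.0)
        lam2 = 3.0 * q - lam1 - lam3  # trace(A) = lam1 + lam2 + lam3
        return lam1, lam2, lam3       # lam1 >= lam2 >= lam3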
The problem is that inside the vectorized version, each operation creates an internal array of the data's size, culminating in all my RAM being used and my PC freezing.
I tried using numexpr etc., but it's still around 10 GB of usage.
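To give an idea of the scale (a back-of-the-envelope sketch, the counts are illustrative): each full-size float32 temporary already costs about half a gigabyte, so a handful of elementwise steps adds up quickly:

    import numpy as np

    n = 500 ** 3                                    # ~1.25e8 grid points
    temp_bytes = n * np.dtype(np.float32).itemsize  # one full-size temporary
    print(temp_bytes / 1024 ** 3)                   # ~0.47 GiB per intermediate
    # A formula with a dozen elementwise operations can therefore peak at
    # several gigabytes before the final result is even stored.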
What I'm trying to do
I want to iterate (using NumPy's nditer) through the array so that for each matrix I compute the eigenvalues. This removes the need to allocate huge intermediary arrays, because I only calculate ~10 floats at a time. I'm trying to substitute the nested for loops with one iterator.
I'm looking for something like:
    for a, b, c, d, e, f in np.nditer([symmatrix, eigenout]):  # for each matrix at x, y, z
        # compute the output matrix
        eigenout[...] = mylovelyeigenvalue(a, b, c, d, e, f)
The best I have so far is:
    for i in np.nditer([derived], [], [['readonly']], op_axes=[[1, 2, 3]]):
But this means i takes the values of the 4D array one by one instead of being a tuple of length 6. I can't seem to get the hang of the nditer documentation.
What am I doing wrong? Do you have any tips and tricks for iterating over "all but one" axis?
The point is to have nditer outperform regular nested loops on iteration (once it works I'll change the function calls, buffer the iteration, etc.; for now I just want it to work ^^).
You don't need np.nditer to do this. A simpler way of iterating over the first axis is to reshape to a [6, 500 ** 3] array, transpose it to [500 ** 3, 6], and iterate over the rows:
    for (a, b, c, d, e, f) in symmatrix.reshape(6, -1).T:
        # do something involving a, b, c, d, e, f...
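As a sketch of how the output could be filled while iterating (not part of the original answer; eigvals_sym3 is a placeholder for whatever per-matrix routine you use, such as the hypothetical sketch above, and the (3, x, y, z) output layout is an assumption):

    # Write the three eigenvalues per point into an output array via a view.
    eigenout = np.empty((3,) + symmatrix.shape[1:], dtype=symmatrix.dtype)
    out_rows = eigenout.reshape(3, -1).T          # writable view onto eigenout
    for i, (a, b, c, d, e, f) in enumerate(symmatrix.reshape(6, -1).T):
        out_rows[i] = eigvals_sym3(a, b, c, d, e, f)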
If you do want to use np.nditer, you can do this:
    for (a, b, c, d, e, f) in np.nditer(x, flags=['external_loop'], order='F'):
        # do something involving a, b, c, d, e, f...
A potentially important thing to consider is that if symmatrix is C-order (row-major) rather than Fortran-order (column-major), then iterating over the first dimension may be faster than iterating over the last 3 dimensions, since you would then be accessing adjacent blocks of memory addresses. You might therefore want to consider switching to Fortran-order.
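One way to make that switch is with np.asfortranarray (a sketch, not from the original answer; note the conversion itself temporarily needs memory for a second full-size copy):

    import numpy as np

    # Fortran-ordered copy: the 6 matrix components at each (x, y, z) point
    # then sit next to each other in memory.
    symmatrix_f = np.asfortranarray(symmatrix)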
I wouldn't expect a massive performance gain from either of these, though, since at the end of the day you're still doing all of the looping in Python and operating on scalars rather than taking advantage of vectorization.