python - Many small matrices speed-up for loops


I have a large coordinate grid (vectors a and b), for which I generate and solve a matrix (10x10) equation. Is there a way for scipy.linalg.solve to accept vector input? So far my solution was to run for cycles over the coordinate arrays.

import time
import numpy as np
import scipy.linalg

n = 10
a = np.linspace(0, 1, 10**3)
b = np.linspace(0, 1, 2*10**3)
A = np.random.random((n, n))      # input matrix, not static

def f(x, y, n):     # array-filling function
    return x*y*n

def sol(x, y):      # matrix solver
    d = np.arange(0, n)
    b = f(x, y, d)**2 + f(x - 1, y + 1, d)      # source vector
    x = scipy.linalg.solve(A, b)
    return x        # output n-size vector

start = time.time()

answer = np.zeros(shape=(a.size, b.size))   # predefine output array

for egg in range(a.size):             # ugly double for cycle
    for ham in range(b.size):
        aa = a[egg]
        bb = b[ham]
        answer[egg, ham] = sol(aa, bb)[0]

print(time.time() - start)

@yevgeniy You are right, efficiently solving multiple independent linear systems A x = b with scipy is a bit tricky (assuming the A arrays change at every iteration).

For instance, here is a benchmark for solving 1000 systems of the form A x = b, where A is a 10x10 matrix and b a 10-element vector. Surprisingly, the approach of putting all of them into one block diagonal matrix and calling scipy.linalg.solve once is indeed slower, with both dense and sparse matrices.

import numpy as np
from scipy.linalg import block_diag, solve
from scipy.sparse import block_diag as sp_block_diag
from scipy.sparse.linalg import spsolve

N = 10
M = 1000  # number of coordinates

Ai = np.random.randn(N, N)
# we could compute the inverse here, but let's assume
# that the Ai are different matrices inside the for loop
bi = np.random.randn(N)

%timeit [solve(Ai, bi) for el in range(M)]
# 10 loops, best of 3: 32.1 ms per loop

Afull = sp_block_diag([Ai]*M, format='csr')
bfull = np.tile(bi, M)

%timeit Afull = sp_block_diag([Ai]*M, format='csr')
%timeit spsolve(Afull, bfull)

# 1 loops, best of 3: 303 ms per loop
# 100 loops, best of 3: 5.55 ms per loop

Afull = block_diag(*[Ai]*M)

%timeit Afull = block_diag(*[Ai]*M)
%timeit solve(Afull, bfull)

# 100 loops, best of 3: 14.1 ms per loop
# 1 loops, best of 3: 23.6 s per loop
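As an aside (my addition, not part of the original answer): numpy.linalg.solve broadcasts over stacks of matrices, so the M independent systems can be solved in one vectorized call without building any block diagonal matrix, provided the per-coordinate matrices can be stacked into a single (M, N, N) array. A minimal sketch:

```python
import numpy as np

M, N = 1000, 10                            # number of systems, system size
rng = np.random.default_rng(0)
A_stack = rng.standard_normal((M, N, N))   # one NxN matrix per coordinate
b_stack = rng.standard_normal((M, N))      # one right-hand side per coordinate

# solve all M systems in a single call; the trailing singleton axis makes the
# "stack of column vectors" interpretation of b unambiguous across numpy versions
x = np.linalg.solve(A_stack, b_stack[:, :, None])[:, :, 0]

# x[i] now satisfies A_stack[i] @ x[i] == b_stack[i]
```

Like the LAPACK trick below zero, this amortizes the Python-level overhead over all M systems, but it keeps the O(N^3) factorization per system.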

For the solution of the linear system, sparse arrays are faster, but the time needed to create the block diagonal array is slow. As to dense arrays, they are slower in this case (and take lots of RAM).
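One way around the slow block_diag construction (my addition, not in the original answer) is to build the block diagonal matrix directly in BSR format, whose internal data layout is exactly a stack of dense blocks, so no Python-level loop over blocks is needed:

```python
import numpy as np
from scipy.sparse import bsr_matrix
from scipy.sparse.linalg import spsolve

M, N = 1000, 10
rng = np.random.default_rng(0)
A_stack = rng.standard_normal((M, N, N))   # hypothetical stack of per-coordinate matrices
b_full = rng.standard_normal(M * N)        # all right-hand sides concatenated

# block i sits at block-row i, block-column i: indices = [0..M-1],
# and each block row holds exactly one block (indptr = [0, 1, ..., M])
A_sp = bsr_matrix((A_stack, np.arange(M), np.arange(M + 1)),
                  shape=(M * N, M * N))

x = spsolve(A_sp.tocsr(), b_full)          # solve all M systems at once
```

The (data, indices, indptr) constructor just wraps the existing A_stack array, so assembling A_sp is cheap compared to sp_block_diag.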

Maybe I'm missing something about how to make this work efficiently with sparse arrays, but if you are keeping the for loops, there are two things that you could do for optimizations.

From pure python, look at the source code of scipy.linalg.solve: remove unnecessary tests and factorize repeated operations outside of the loops. For instance, assuming your arrays are not symmetrical positive, we could do

from scipy.linalg import get_lapack_funcs, LinAlgError

gesv, = get_lapack_funcs(('gesv',), (Ai, bi))

def solve_opt(A, b, gesv=gesv):
    # not sure if copying A and b is necessary, but just in case
    # (faster if arrays are not copied)
    lu, piv, x, info = gesv(A.copy(), b.copy(), overwrite_a=False, overwrite_b=False)
    if info == 0:
        return x
    if info > 0:
        raise LinAlgError("singular matrix")
    raise ValueError('illegal value in %d-th argument of internal gesv|posv' % -info)

%timeit [solve(Ai, bi) for el in range(M)]
%timeit [solve_opt(Ai, bi) for el in range(M)]

# 10 loops, best of 3: 30.1 ms per loop
# 100 loops, best of 3: 3.77 ms per loop

which results in a 6.5x speed-up.

If you need even better performance, you would have to port this loop to Cython and interface the gesv BLAS functions directly in C, as discussed here, or better with the Cython API for BLAS/LAPACK in Scipy 0.16.

Edit: as @Eelco Hoogendoorn mentioned, if your A matrix is fixed, there is a much simpler and more efficient approach.
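A sketch of that fixed-A approach (my interpretation of the comment, not code from the original answer): factor A once with lu_factor and reuse the factorization for every right-hand side, or simply pass all right-hand sides to a single solve call:

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve, solve

N, M = 10, 1000
rng = np.random.default_rng(0)
A = rng.standard_normal((N, N))     # fixed matrix, shared by all systems
B = rng.standard_normal((N, M))     # all M right-hand sides as columns

lu, piv = lu_factor(A)              # O(N^3) LU factorization, done once
X = lu_solve((lu, piv), B)          # only O(N^2) back-substitution per column

# equivalently, one solve call accepts a whole matrix of right-hand sides
X2 = solve(A, B)
```

lu_factor/lu_solve is the more flexible form when the right-hand sides arrive one at a time rather than as a ready-made matrix.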

