Accelerating Your Python Code: Tips for Faster Performance

Python

Author: Kai Tan
Published: May 1, 2024

In this blog post, I explore several techniques for speeding up numerical Python code, each backed by a small benchmark.

Use broadcasting to avoid creating a diagonal matrix

  • Use A * v instead of A @ np.diag(v) to scale the columns of A by v.
  • Use v[:, np.newaxis] * A instead of np.diag(v) @ A to scale the rows of A by v.

Both forms avoid building an n x n diagonal matrix and running a full O(n^3) matrix product; broadcasting does the scaling directly in O(n^2).
import numpy as np
import timeit
A = np.random.randn(1000, 1000)
v = np.random.randn(1000)
# np.allclose gives a robust floating-point comparison of the two forms.
assert np.allclose(A @ np.diag(v), A * v)
assert np.allclose(np.diag(v) @ A, v[:, np.newaxis] * A)
print('time for method 1:', timeit.timeit(lambda: A @ np.diag(v), number=10))
print('time for method 2:', timeit.timeit(lambda: A * v, number=10))
time for method 1: 0.22306541702710092
time for method 2: 0.0215400829911232
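
The row-scaling form from the second bullet benefits just as much, although it is not timed above. Here is a minimal sketch of the analogous benchmark for np.diag(v) @ A versus v[:, np.newaxis] * A; exact timings will vary by machine and BLAS build.

import numpy as np
import timeit
A = np.random.randn(1000, 1000)
v = np.random.randn(1000)
# np.diag(v) @ A multiplies row i of A by v[i]; broadcasting v as a column vector
# does the same scaling without building the diagonal matrix or running a matmul.
print('time for diag version:     ', timeit.timeit(lambda: np.diag(v) @ A, number=10))
print('time for broadcast version:', timeit.timeit(lambda: v[:, np.newaxis] * A, number=10))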

Avoid unnecessary large matrix multiplications

  • Use np.sum(A * B.T) instead of np.trace(A @ B). The trace only needs the diagonal entries of A @ B, so computing the full O(n^3) product is wasted work; the elementwise product and sum give the same value in O(n^2).
import numpy as np
import timeit
A = np.random.randn(1000, 1000)
B = np.random.randn(1000, 1000)
assert np.isclose(np.trace(A @ B), np.sum(A * B.T))
print('time for method 1:', timeit.timeit(lambda: np.trace(A @ B), number=10))
print('time for method 2:', timeit.timeit(lambda: np.sum(A * B.T), number=10))
time for method 1: 0.19224495813250542
time for method 2: 0.023234750144183636
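
Since trace(A @ B) is the sum of A[i, j] * B[j, i] over all i and j, the same quantity can also be written as a single np.einsum contraction, which even avoids the transposed view of B. A minimal sketch (not part of the original benchmark), assuming the same 1000 x 1000 matrices:

import numpy as np
import timeit
A = np.random.randn(1000, 1000)
B = np.random.randn(1000, 1000)
# 'ij,ji->' sums A[i, j] * B[j, i] directly, i.e. the trace of A @ B, in O(n^2) work.
assert np.isclose(np.trace(A @ B), np.einsum('ij,ji->', A, B))
print('time for einsum:', timeit.timeit(lambda: np.einsum('ij,ji->', A, B), number=10))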

Choose the order of matrix multiplications carefully

  • Use A @ (B @ v) instead of A @ B @ v when v is a vector or a much smaller matrix. The @ operator associates left to right, so A @ B @ v computes the expensive matrix-matrix product A @ B first, while the parenthesized form performs only two cheap matrix-vector products.
import numpy as np
import timeit
A = np.random.randn(1000, 1000)
B = np.random.randn(1000, 1000)
v = np.random.randn(1000)
assert np.allclose(A @ B @ v, A @ (B @ v))
print('time for method 1:', timeit.timeit(lambda: A @ B @ v, number=10))
print('time for method 2:', timeit.timeit(lambda: A @ (B @ v), number=10))
time for method 1: 0.1941463330294937
time for method 2: 0.0012861250434070826
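
For longer chains, NumPy can pick the parenthesization for you: np.linalg.multi_dot chooses an efficient multiplication order for a list of arrays. A minimal sketch of how it could be used here (timings will differ from the numbers above):

import numpy as np
import timeit
A = np.random.randn(1000, 1000)
B = np.random.randn(1000, 1000)
v = np.random.randn(1000)
# multi_dot treats the trailing 1-D array as a column vector and picks the
# multiplication order with the lowest cost, here B @ v first.
assert np.allclose(A @ (B @ v), np.linalg.multi_dot([A, B, v]))
print('time for multi_dot:', timeit.timeit(lambda: np.linalg.multi_dot([A, B, v]), number=10))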