Accelerating Your Python Code: Tips for Faster Performance

Category: Python
Author: Kai Tan
Published: May 1, 2024

In this blog post, I explore several NumPy techniques that can make your Python code noticeably faster.

Use broadcasting to avoid creating a diagonal matrix

  • Use A * v instead of A @ np.diag(v)
  • Use v[:, np.newaxis] * A instead of np.diag(v) @ A

Multiplying by np.diag(v) merely scales the columns (or rows) of A, so broadcasting performs the same scaling without ever materializing the n × n diagonal matrix.
import numpy as np
import timeit
A = np.random.randn(1000, 1000)
v = np.random.randn(1000)
# A @ np.diag(v) scales column j of A by v[j]; broadcasting does the same.
assert np.allclose(A @ np.diag(v), A * v)
# np.diag(v) @ A scales row i of A by v[i].
assert np.allclose(np.diag(v) @ A, v[:, np.newaxis] * A)
print('time for method 1:', timeit.timeit(lambda: A @ np.diag(v), number=10))
print('time for method 2:', timeit.timeit(lambda: A * v, number=10))
time for method 1: 0.2450430829776451
time for method 2: 0.013035333948209882
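A quick way to see why building the diagonal matrix hurts: np.diag(v) allocates a dense n × n array whose off-diagonal entries are all zero. The short sketch below (illustrative, not from the original post) simply compares the memory footprints:

```python
import numpy as np

v = np.random.randn(1000)
D = np.diag(v)  # dense 1000 x 1000 matrix; all but 1000 entries are zero

# The dense diagonal costs n**2 floats; broadcasting with v needs only n.
print('diag matrix:', D.nbytes, 'bytes')  # 8,000,000 bytes
print('vector:', v.nbytes, 'bytes')       # 8,000 bytes
```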

Avoid unnecessary large matrix products

  • Use np.einsum('ij,ji->', A, B) instead of np.trace(A @ B)

np.trace(A @ B) computes all n² entries of the product only to sum its n diagonal entries; the einsum expression computes that diagonal sum directly and skips the O(n³) multiplication.
import numpy as np
import timeit
A = np.random.randn(1000, 1000)
B = np.random.randn(1000, 1000)
# trace(A @ B) = sum over i, j of A[i, j] * B[j, i]
assert np.isclose(np.trace(A @ B), np.einsum('ij,ji->', A, B))
print('time for method 1:', timeit.timeit(lambda: np.trace(A @ B), number=10))
print('time for method 2:', timeit.timeit(lambda: np.einsum('ij,ji->', A, B), number=10))
time for method 1: 0.2524136659922078
time for method 2: 0.010554542066529393
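Because trace(A @ B) is the sum of A[i, j] * B[j, i] over all i, j, the same quantity can also be written with plain broadcasting as (A * B.T).sum(). A sketch checking that the three forms agree (using smaller 500 × 500 matrices, just for the check):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((500, 500))
B = rng.standard_normal((500, 500))

t_trace = np.trace(A @ B)              # O(n^3): forms the full product first
t_einsum = np.einsum('ij,ji->', A, B)  # O(n^2): sums A[i, j] * B[j, i] directly
t_bcast = (A * B.T).sum()              # O(n^2), but allocates an n x n temporary

assert np.isclose(t_trace, t_einsum)
assert np.isclose(t_trace, t_bcast)
```

The einsum form is usually preferable to (A * B.T).sum(), since it avoids allocating the n × n temporary.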

Prioritize the order of matrix multiplication

  • Use A @ (B @ v) instead of A @ B @ v when v is a vector (or a much smaller array).

Matrix multiplication is associative, so both expressions give the same result, but A @ B @ v evaluates left to right and forms the full O(n³) product A @ B first, while A @ (B @ v) performs only two O(n²) matrix-vector products.
import numpy as np
import timeit
A = np.random.randn(1000, 1000)
B = np.random.randn(1000, 1000)
v = np.random.randn(1000)
assert np.allclose(A @ B @ v, A @ (B @ v))
print('time for method 1:', timeit.timeit(lambda: A @ B @ v, number=10))
print('time for method 2:', timeit.timeit(lambda: A @ (B @ v), number=10))
time for method 1: 0.29409229196608067
time for method 2: 0.002889833995141089
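For longer chains, NumPy can pick an efficient parenthesization for you: np.linalg.multi_dot chooses the multiplication order that minimizes the total cost. A minimal sketch with the same shapes as above:

```python
import numpy as np
from numpy.linalg import multi_dot

rng = np.random.default_rng(0)
A = rng.standard_normal((1000, 1000))
B = rng.standard_normal((1000, 1000))
v = rng.standard_normal(1000)

# multi_dot picks the cheapest order; here that is A @ (B @ v).
result = multi_dot([A, B, v])
assert np.allclose(result, A @ (B @ v))
```

A 1-D final argument is treated as a column vector, and the extra dimension is dropped from the result, so result has shape (1000,) just like A @ (B @ v).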