Accelerating Your Python Code: Tips for Faster Performance
Python
Author
Kai Tan
Published
May 1, 2024
In this blog post, I explore several techniques for speeding up numerical Python code, focusing on common NumPy patterns.
Use broadcasting to avoid creating a diagonal matrix
Use A * v instead of A @ np.diag(v)
Use v[:, np.newaxis] * A instead of np.diag(v) @ A
Both pairs compute the same result, but broadcasting avoids materializing the n × n diagonal matrix and replaces an O(n³) matrix product with an O(n²) elementwise multiplication.
import numpy as np
import timeit

A = np.random.randn(1000, 1000)
v = np.random.randn(1000)

# Broadcasting gives the same result as multiplying by a diagonal matrix.
assert np.all(A @ np.diag(v) == A * v)
assert np.all(np.diag(v) @ A == v[:, np.newaxis] * A)

print('time for method 1:', timeit.timeit(lambda: A @ np.diag(v), number=10))
print('time for method 2:', timeit.timeit(lambda: A * v, number=10))
time for method 1: 0.2450430829776451
time for method 2: 0.013035333948209882
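The same broadcasting rule works for rectangular matrices as well. A minimal sketch (the small shapes here are chosen only for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 5))   # rectangular matrix
v = rng.standard_normal(5)        # scales the 5 columns
w = rng.standard_normal(3)        # scales the 3 rows

# Column scaling: v broadcasts along the last axis of A.
assert np.allclose(A @ np.diag(v), A * v)

# Row scaling: w needs an explicit axis to broadcast down the rows.
assert np.allclose(np.diag(w) @ A, w[:, np.newaxis] * A)
```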
Avoid unnecessary large matrix multiplications
Use np.einsum('ij,ji->', A, B) instead of np.trace(A @ B)
The trace only needs the diagonal entries of A @ B, so forming the full product wastes almost all of the O(n³) work; the einsum pattern accumulates the sum of A[i, j] * B[j, i] directly in O(n²) time.
import numpy as np
import timeit

A = np.random.randn(1000, 1000)
B = np.random.randn(1000, 1000)

# einsum computes the trace without forming the full product A @ B.
assert np.isclose(np.trace(A @ B), np.einsum('ij,ji->', A, B))

print('time for method 1:', timeit.timeit(lambda: np.trace(A @ B), number=10))
print('time for method 2:', timeit.timeit(lambda: np.einsum('ij,ji->', A, B), number=10))
time for method 1: 0.2524136659922078
time for method 2: 0.010554542066529393
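The same einsum trick adapts to related traces. For example, trace(A @ B.T) is just the sum of elementwise products, which the pattern 'ij,ij->' expresses directly; a sketch:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((500, 500))
B = rng.standard_normal((500, 500))

# trace(A @ B.T) = sum over i, j of A[i, j] * B[i, j],
# so no 500 x 500 matrix product is needed.
t_full = np.trace(A @ B.T)
t_einsum = np.einsum('ij,ij->', A, B)
assert np.isclose(t_full, t_einsum)
```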
Prioritize the order of matrix multiplication
Use A @ (B @ v) instead of A @ B @ v when v is a vector or a much smaller array. Python evaluates @ left to right, so A @ B @ v first forms the full O(n³) product A @ B; the explicit parentheses replace it with two O(n²) matrix-vector products.
import numpy as np
import timeit

A = np.random.randn(1000, 1000)
B = np.random.randn(1000, 1000)
v = np.random.randn(1000)

# Both orderings give the same result, up to floating-point rounding.
assert np.allclose(A @ B @ v, A @ (B @ v))

print('time for method 1:', timeit.timeit(lambda: A @ B @ v, number=10))
print('time for method 2:', timeit.timeit(lambda: A @ (B @ v), number=10))
time for method 1: 0.29409229196608067
time for method 2: 0.002889833995141089
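For chains of more than two matrices, NumPy can choose the multiplication order for you: np.linalg.multi_dot picks an efficient parenthesization based on the operand shapes. A minimal sketch:

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((1000, 1000))
B = rng.standard_normal((1000, 1000))
v = rng.standard_normal(1000)

# multi_dot evaluates the chain in a cost-minimizing order,
# here effectively A @ (B @ v).
result = np.linalg.multi_dot([A, B, v])
assert np.allclose(result, A @ (B @ v))
```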