NumPy Tutorial 04: Vectorization, UFuncs, and Broadcasting
import numpy as np
from time import perf_counter
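1. UFuncs are the core engine
Universal functions (ufuncs) such as np.exp, np.log, np.sqrt, and np.sin operate element-wise on whole arrays. The per-element loop runs in compiled C code rather than in the Python interpreter, which is where the speed comes from.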
x = np.linspace(1, 5, 5)
print('x:', x)
print('exp(x):', np.round(np.exp(x), 3))
print('sqrt(x):', np.round(np.sqrt(x), 3))
print('log(x):', np.round(np.log(x), 3))
x: [1. 2. 3. 4. 5.]
exp(x): [  2.718   7.389  20.086  54.598 148.413]
sqrt(x): [1.    1.414 1.732 2.    2.236]
log(x): [0.    0.693 1.099 1.386 1.609]
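Beyond plain element-wise calls, every ufunc is an np.ufunc object that carries a few methods of its own. The following is a minimal sketch, not part of the original notebook; the variable names (is_ufunc, total, running, table, buf) are placeholders chosen for illustration.

```python
import numpy as np

x = np.linspace(1, 5, 5)

# Every ufunc is an instance of np.ufunc
is_ufunc = isinstance(np.exp, np.ufunc)

# Binary ufuncs carry methods such as reduce, accumulate, and outer
total = np.add.reduce(x)          # same result as x.sum()
running = np.add.accumulate(x)    # same result as np.cumsum(x)
table = np.multiply.outer(x, x)   # 5x5 multiplication table

# out= writes the result into an existing array instead of allocating a new one
buf = np.empty_like(x)
np.sqrt(x, out=buf)

print('is ufunc:', is_ufunc)
print('sum:', total)
print('running sum:', running)
print('outer shape:', table.shape)
```

The out= argument is mainly useful inside hot loops, where reusing one buffer avoids repeated allocations.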
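2. Conditional vectorized logic
Use np.where(condition, a, b) for branch-like operations on arrays: it takes elements from a where the condition is true and from b elsewhere. Both a and b are computed in full before the selection happens, so np.where does not short-circuit the way an if statement does.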
x = np.array([-3, -1, 0, 2, 4])
y = np.where(x >= 0, x**2, -x)
print('x:', x)
print('piecewise y:', y)
x: [-3 -1  0  2  4]
piecewise y: [ 3  1  0  4 16]
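When there are more than two branches, nesting np.where calls gets hard to read. One alternative is np.select, sketched below with a three-way rule made up for illustration (negate negatives, keep values up to 2, square the rest); the rule itself is an assumption and not from the original notebook.

```python
import numpy as np

x = np.array([-3, -1, 0, 2, 4])

# Conditions are checked in order; the first one that holds wins
conditions = [x < 0, x <= 2, x > 2]
choices = [-x, x, x**2]

y = np.select(conditions, choices)
print('x:', x)
print('three-way y:', y)
```

As with np.where, every entry in choices is evaluated for the whole array before the selection is made.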
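3. Broadcasting rules
NumPy compares shapes dimension by dimension, starting from the trailing (rightmost) one. Two dimensions are compatible when they are equal or when one of them is 1; a shape with fewer dimensions is treated as if it were padded with 1s on the left. In the examples below, b with shape (4,) is stretched across the rows of A, and c with shape (3, 1) is stretched across its columns.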
A = np.arange(12).reshape(3, 4)
b = np.array([10, 20, 30, 40])
print('A shape:', A.shape)
print('b shape:', b.shape)
print('A + b:\n', A + b)
A shape: (3, 4)
b shape: (4,)
A + b:
 [[10 21 32 43]
 [14 25 36 47]
 [18 29 40 51]]
A = np.arange(12).reshape(3, 4)
c = np.array([[1], [2], [3]])
print('A shape:', A.shape)
print('c shape:', c.shape)
print('A + c:\n', A + c)
A shape: (3, 4)
c shape: (3, 1)
A + c:
 [[ 1  2  3  4]
 [ 6  7  8  9]
 [11 12 13 14]]
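The compatibility rule can be checked without building any arrays. The sketch below uses np.broadcast_shapes, which is available in NumPy 1.20 and later; the names s1, s2, s3, and failed are placeholders introduced here.

```python
import numpy as np

# Trailing dimensions match (4 == 4); the missing leading one is treated as 1
s1 = np.broadcast_shapes((3, 4), (4,))
# A length-1 dimension stretches to match the other operand
s2 = np.broadcast_shapes((3, 4), (3, 1))
# Both operands can stretch at once: (3, 1) with (4,) gives (3, 4)
s3 = np.broadcast_shapes((3, 1), (4,))
print(s1, s2, s3)

# Trailing dimensions 4 and 3 differ and neither is 1, so this fails
failed = False
try:
    np.broadcast_shapes((3, 4), (3,))
except ValueError as e:
    failed = True
    print('Incompatible:', e)
```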
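4. Broadcasting pitfall and fix
Incompatible shapes raise a ValueError. Here v has shape (3,), so its only dimension is aligned with the last dimension of A (length 4), and 3 does not match 4. Reshaping v to (3, 1) makes the intent explicit: one value per row.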
A = np.arange(12).reshape(3, 4)
v = np.array([1, 2, 3])
try:
    A + v
except ValueError as e:
    print('Broadcast error:', e)
print('Fix with reshape to column vector:\n', A + v.reshape(3, 1))
Broadcast error: operands could not be broadcast together with shapes (3,4) (3,)
Fix with reshape to column vector:
 [[ 1  2  3  4]
 [ 6  7  8  9]
 [11 12 13 14]]
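Reshaping is not the only way to turn a 1-D vector into a column. The sketch below shows three spellings that should produce the same (3, 1) array; the names col_reshape, col_newaxis, and col_expand are chosen here for illustration.

```python
import numpy as np

A = np.arange(12).reshape(3, 4)
v = np.array([1, 2, 3])

col_reshape = v.reshape(-1, 1)           # -1 lets NumPy infer the length
col_newaxis = v[:, np.newaxis]           # insert a length-1 axis at the end
col_expand = np.expand_dims(v, axis=1)   # same idea as a function call

print(col_reshape.shape, col_newaxis.shape, col_expand.shape)
print('Same result?', np.array_equal(A + col_reshape, A + col_newaxis))
```

Which one to use is largely a matter of taste; the indexing form with np.newaxis tends to be common in code that already slices arrays heavily.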
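5. Vectorization benchmark (loop vs ufunc)
The loop version calls np.sin and np.cos once per element from Python, while the vectorized version makes a handful of calls that each process the whole array. The timings shown come from a single run and will vary by machine.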
n = 800_000
x = np.linspace(-3, 3, n)

# Python-level loop: one np.sin/np.cos call per element
t0 = perf_counter()
y_loop = np.array([np.sin(v) + np.cos(v**2) for v in x])
t1 = perf_counter()

# Vectorized: each ufunc processes the whole array in one call
t2 = perf_counter()
y_vec = np.sin(x) + np.cos(x**2)
t3 = perf_counter()

print(f'Loop: {t1 - t0:.4f}s')
print(f'Vectorized: {t3 - t2:.4f}s')
print('Close?', np.allclose(y_loop, y_vec))
Loop: 1.3580s
Vectorized: 0.0209s
Close? True
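A single perf_counter measurement can be noisy. One way to get steadier numbers is to repeat each version and keep the fastest run, sketched here with the standard-library timeit module. The function names and the smaller n are assumptions made for this sketch, and no expected timings are shown because they depend on the machine.

```python
import timeit
import numpy as np

n = 20_000  # smaller than the 800_000 above so repeated runs stay quick
x = np.linspace(-3, 3, n)

def loop_version():
    # One np.sin/np.cos call per element, driven from Python
    return np.array([np.sin(v) + np.cos(v**2) for v in x])

def vectorized_version():
    # Each ufunc processes the whole array in one call
    return np.sin(x) + np.cos(x**2)

# Run each version three times and keep the fastest run
t_loop = min(timeit.repeat(loop_version, number=1, repeat=3))
t_vec = min(timeit.repeat(vectorized_version, number=1, repeat=3))

print(f'Loop (best of 3): {t_loop:.4f}s')
print(f'Vectorized (best of 3): {t_vec:.4f}s')
print('Close?', np.allclose(loop_version(), vectorized_version()))
```

Taking the minimum rather than the mean is a common convention for microbenchmarks, since slower runs usually reflect interference from other processes rather than the code under test.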