NumPy Tutorial 04: Vectorization, UFuncs, and Broadcasting

import numpy as np
from time import perf_counter

1. UFuncs are the core engine

Universal functions (ufuncs) such as np.exp, np.log, np.sqrt, and np.sin operate element-wise on whole arrays. Their inner loops run in compiled C code, so they avoid the per-element overhead of a Python loop.

x = np.linspace(1, 5, 5)
print('x:', x)
print('exp(x):', np.round(np.exp(x), 3))
print('sqrt(x):', np.round(np.sqrt(x), 3))
print('log(x):', np.round(np.log(x), 3))
x: [1. 2. 3. 4. 5.]
exp(x): [  2.718   7.389  20.086  54.598 148.413]
sqrt(x): [1.    1.414 1.732 2.    2.236]
log(x): [0.    0.693 1.099 1.386 1.609]
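Beyond being called directly, ufuncs share a few common features. The sketch below is a minimal illustration rather than a full tour: it checks that np.sqrt is an np.ufunc instance, writes a result into a preallocated array with out=, and uses the reduce and accumulate methods of np.add. The names buf, total, and running are chosen here for illustration only.

```python
import numpy as np

x = np.linspace(1, 5, 5)

# Every ufunc is an instance of np.ufunc.
print(isinstance(np.sqrt, np.ufunc))   # True

# out= writes the result into an existing array instead of allocating a new one.
buf = np.empty_like(x)
np.sqrt(x, out=buf)
print(np.round(buf, 3))                # [1.    1.414 1.732 2.    2.236]

# Binary ufuncs such as np.add also expose reduce and accumulate methods.
total = np.add.reduce(x)               # same value as x.sum()
running = np.add.accumulate(x)         # same values as np.cumsum(x)
print(total)                           # 15.0
print(running)                         # [ 1.  3.  6. 10. 15.]
```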

2. Conditional vectorized logic

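Use np.where(condition, a, b) for branch-like operations on arrays: it takes elements from a where the condition is True and from b elsewhere. Both branch expressions are evaluated for every element before the selection happens.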

x = np.array([-3, -1, 0, 2, 4])
y = np.where(x >= 0, x**2, -x)
print('x:', x)
print('piecewise y:', y)
x: [-3 -1  0  2  4]
piecewise y: [ 3  1  0  4 16]
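When there are more than two branches, nesting np.where calls gets hard to read. One possible alternative is np.select, sketched below on the same x. The three-branch rule used here (negate negatives, keep values up to 2, square the rest) is an assumption made up for this example and is not part of the original piecewise function.

```python
import numpy as np

x = np.array([-3, -1, 0, 2, 4])

# Conditions are checked in order; for each element the first True one wins.
conditions = [x < 0, x <= 2, x > 2]
choices = [-x, x, x**2]

y = np.select(conditions, choices)
print('three-branch y:', y)   # three-branch y: [ 3  1  0  2 16]
```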

3. Broadcasting rules

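NumPy compares shapes dimension by dimension, starting from the trailing (rightmost) one. Two dimensions are compatible when they are equal or one of them is 1. If one array has fewer dimensions, its shape is treated as if it were padded with 1s on the left.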

A = np.arange(12).reshape(3, 4)
b = np.array([10, 20, 30, 40])

print('A shape:', A.shape)
print('b shape:', b.shape)
print('A + b:\n', A + b)
A shape: (3, 4)
b shape: (4,)
A + b:
 [[10 21 32 43]
 [14 25 36 47]
 [18 29 40 51]]
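
A column of shape (3, 1) is compatible as well: its single column is stretched across all four columns of A.

A = np.arange(12).reshape(3, 4)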
c = np.array([[1], [2], [3]])

print('A shape:', A.shape)
print('c shape:', c.shape)
print('A + c:\n', A + c)
A shape: (3, 4)
c shape: (3, 1)
A + c:
 [[ 1  2  3  4]
 [ 6  7  8  9]
 [11 12 13 14]]
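
To make the right-to-left compatibility rule concrete, the sketch below applies it by hand. The function broadcast_shape is a hypothetical helper written for this tutorial, not a NumPy API; its result is cross-checked against the shape reported by np.broadcast.

```python
from itertools import zip_longest

import numpy as np


def broadcast_shape(shape_a, shape_b):
    """Hypothetical helper: apply the right-to-left compatibility rule by hand."""
    result = []
    # Walk both shapes from the trailing dimension; pad the shorter one with 1s.
    for da, db in zip_longest(reversed(shape_a), reversed(shape_b), fillvalue=1):
        if da == db or da == 1 or db == 1:
            # A dimension of size 1 stretches to match the other one.
            result.append(db if da == 1 else da)
        else:
            raise ValueError(f'incompatible dimensions: {da} vs {db}')
    return tuple(reversed(result))


pairs = [((3, 4), (4,)), ((3, 4), (3, 1)), ((3, 1), (1, 4))]
for a, b in pairs:
    by_hand = broadcast_shape(a, b)
    by_numpy = np.broadcast(np.empty(a), np.empty(b)).shape
    print(a, b, '->', by_hand, by_hand == by_numpy)

# Shapes (3, 4) and (3,) fail: the trailing dimensions are 4 and 3.
try:
    broadcast_shape((3, 4), (3,))
except ValueError as e:
    print('Error:', e)
```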

4. Broadcasting pitfall and fix

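Incompatible shapes raise a ValueError. Here v has shape (3,), so its only dimension is compared against the last axis of A (length 4) and does not match. Reshaping v to (3, 1) lines it up with the rows of A instead.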

A = np.arange(12).reshape(3, 4)
v = np.array([1, 2, 3])

try:
    A + v
except ValueError as e:
    print('Broadcast error:', e)

print('Fix with reshape to column vector:\n', A + v.reshape(3, 1))
Broadcast error: operands could not be broadcast together with shapes (3,4) (3,) 
Fix with reshape to column vector:
 [[ 1  2  3  4]
 [ 6  7  8  9]
 [11 12 13 14]]
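
reshape(3, 1) is not the only way to build the column vector. The sketch below shows three equivalent spellings, using the same A and v as above. Which one reads best is a matter of taste; the names col_a, col_b, and col_c are illustrative.

```python
import numpy as np

A = np.arange(12).reshape(3, 4)
v = np.array([1, 2, 3])

col_a = v[:, np.newaxis]            # add a new trailing axis by indexing
col_b = np.expand_dims(v, axis=1)   # same result via a function call
col_c = v.reshape(-1, 1)            # -1 lets NumPy infer the row count

print(col_a.shape, col_b.shape, col_c.shape)             # (3, 1) (3, 1) (3, 1)
print(np.array_equal(A + col_a, A + v.reshape(3, 1)))    # True
```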

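5. Vectorization benchmark (loop vs ufunc)

The loop version calls np.sin and np.cos once per element from Python, while the vectorized version makes one call per operation on the whole array. The timings shown come from a single run and will vary by machine.
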
n = 800_000
x = np.linspace(-3, 3, n)

t0 = perf_counter()
y_loop = np.array([np.sin(v) + np.cos(v**2) for v in x])
t1 = perf_counter()

t2 = perf_counter()
y_vec = np.sin(x) + np.cos(x**2)
t3 = perf_counter()

print(f'Loop: {t1 - t0:.4f}s')
print(f'Vectorized: {t3 - t2:.4f}s')
print('Close?', np.allclose(y_loop, y_vec))
Loop: 1.3580s
Vectorized: 0.0209s
Close? True
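
A single perf_counter measurement can be noisy. One possible refinement, sketched below, is to use the standard-library timeit module and keep the best of several runs. The array size is reduced to 50_000 here (an arbitrary choice) so the repeated loop runs stay short, and the function names are illustrative. No timings are shown because they depend on the machine.

```python
import timeit

import numpy as np

n = 50_000  # smaller than the 800_000 used above so repeats stay quick
x = np.linspace(-3, 3, n)


def loop_version():
    # One Python-level call to np.sin and np.cos per element.
    return np.array([np.sin(v) + np.cos(v**2) for v in x])


def vectorized_version():
    # A few ufunc calls over the whole array.
    return np.sin(x) + np.cos(x**2)


# Run each version three times and keep the fastest run.
loop_best = min(timeit.repeat(loop_version, number=1, repeat=3))
vec_best = min(timeit.repeat(vectorized_version, number=1, repeat=3))

print(f'Loop (best of 3): {loop_best:.4f}s')
print(f'Vectorized (best of 3): {vec_best:.4f}s')
print('Close?', np.allclose(loop_version(), vectorized_version()))
```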