Random Fourier Feature Expansions

📐 Definition

Random Fourier features map inputs through randomized sinusoids whose frequencies are drawn from a kernel-dependent distribution, enabling finite-dimensional approximations of shift-invariant kernels.

\phi(x) = \sqrt{\tfrac{2}{m}}[\cos(\omega_i^\top x + b_i)]_{i=1}^m

Domain and Codomain

Inputs are vectors $x \in \mathbb{R}^d$ in the kernel's domain; outputs are feature vectors $\phi(x) \in \mathbb{R}^m$ that serve as inputs to linear models.

⚙️ Key Properties

\omega_i \sim p(\omega), \qquad k(x,x') \approx \phi(x)^\top \phi(x')

Under the Fourier convention $\hat{k}(\omega)=\int_{\mathbb{R}^d} k(\delta)\,e^{-i\omega^\top\delta}\,d\delta$, the normalized sampling density is

p(\omega)=\frac{1}{(2\pi)^d}\frac{\hat{k}(\omega)}{k(0)}

(so that $\int p(\omega)\,d\omega=1$), and expectations over $\omega_i$ and $b_i$ recover the target shift-invariant kernel normalized by $k(0)$, which is the kernel itself when $k(0)=1$. The estimator’s mean-square error scales like $\mathcal{O}(1/m)$, so the typical error scales like $\mathcal{O}(1/\sqrt{m})$.

In the common cosine-with-random-phase construction above, one takes $b_i\sim U(0,2\pi)$ independently of $\omega_i$.
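As a short check of why this construction is unbiased, the following sketch works through the expectation. It writes $u=\omega^\top x$, $v=\omega^\top x'$, and $\delta=x-x'$ (shorthand introduced here) and assumes a real-valued kernel, so that $p(\omega)$ is symmetric. The product-to-sum identity gives

2\cos(u+b)\cos(v+b) = \cos(u-v) + \cos(u+v+2b)

Averaging over $b\sim U(0,2\pi)$ removes the second term, and averaging over $\omega\sim p(\omega)$ turns the first into an inverse Fourier transform:

\mathbb{E}_{\omega,b}\big[2\cos(\omega^\top x+b)\cos(\omega^\top x'+b)\big] = \int p(\omega)\cos(\omega^\top\delta)\,d\omega = \frac{1}{(2\pi)^d\,k(0)}\int \hat{k}(\omega)\,e^{i\omega^\top\delta}\,d\omega = \frac{k(\delta)}{k(0)}

Each of the $m$ terms in $\phi(x)^\top\phi(x')$ is $\tfrac{1}{m}$ times such a product, so the sum has the same expectation.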

For the Gaussian/RBF kernel $k(\delta)=\exp(-\|\delta\|^2/(2\ell^2))$ (so $k(0)=1$), one has $\omega_i \sim \mathcal{N}(0, \ell^{-2} I)$. As $m \to \infty$, the Monte Carlo error at any fixed pair of inputs vanishes almost surely; for $m=1$, the feature collapses to a single random sinusoid capturing one spectral component.
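A minimal NumPy sketch of the Gaussian/RBF case may help make the construction concrete. The function names `rff_features` and `rbf_kernel`, the sample sizes, and the seed are illustrative choices, not part of any particular library.

```python
import numpy as np

def rff_features(X, m, lengthscale, rng):
    """Map rows of X (shape (n, d)) to m random Fourier features for an RBF kernel."""
    d = X.shape[1]
    # Frequencies drawn from the RBF spectral density N(0, lengthscale^-2 I).
    omega = rng.normal(0.0, 1.0 / lengthscale, size=(d, m))
    # Phases uniform on [0, 2*pi), independent of the frequencies.
    b = rng.uniform(0.0, 2.0 * np.pi, size=m)
    return np.sqrt(2.0 / m) * np.cos(X @ omega + b)

def rbf_kernel(X, lengthscale):
    """Exact RBF Gram matrix exp(-||x - x'||^2 / (2 * lengthscale^2))."""
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dists / (2.0 * lengthscale ** 2))

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))

Phi = rff_features(X, m=20000, lengthscale=1.5, rng=rng)
K_approx = Phi @ Phi.T                      # inner products of features
K_exact = rbf_kernel(X, lengthscale=1.5)    # target kernel values
max_abs_err = np.max(np.abs(K_approx - K_exact))
print(f"max |K_approx - K_exact| = {max_abs_err:.4f}")
```

With a large feature count the approximate Gram matrix should track the exact one closely; a downstream linear model would consume `Phi` directly instead of the full kernel matrix.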

🎯 Special Cases and Limits

Gaussian/RBF kernels use Gaussian spectral sampling for $\omega_i$ .
Increasing $m$ reduces Monte Carlo error at rate $\mathcal{O}(1/\sqrt{m})$ .
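The rate can be checked empirically. The sketch below, which assumes an RBF kernel with unit lengthscale and an arbitrary fixed pair of inputs, estimates the root-mean-square error of the kernel estimate over repeated draws for several feature counts; `rff_estimate` is a hypothetical helper name.

```python
import numpy as np

def rff_estimate(x, y, m, lengthscale, rng):
    """One random-feature estimate of the RBF kernel value k(x, y)."""
    omega = rng.normal(0.0, 1.0 / lengthscale, size=(m, x.size))
    b = rng.uniform(0.0, 2.0 * np.pi, size=m)
    phi_x = np.sqrt(2.0 / m) * np.cos(omega @ x + b)
    phi_y = np.sqrt(2.0 / m) * np.cos(omega @ y + b)
    return phi_x @ phi_y

rng = np.random.default_rng(1)
x = np.array([0.3, -0.2])
y = np.array([-0.5, 0.4])
lengthscale = 1.0
exact = np.exp(-np.sum((x - y) ** 2) / (2.0 * lengthscale ** 2))

feature_counts = [100, 400, 1600]
rmse = []
for m in feature_counts:
    # Repeat the estimate with fresh frequencies and phases each time.
    errs = [rff_estimate(x, y, m, lengthscale, rng) - exact for _ in range(500)]
    rmse.append(float(np.sqrt(np.mean(np.square(errs)))))

for m, e in zip(feature_counts, rmse):
    print(f"m = {m:5d}  RMSE = {e:.4f}")
```

Each fourfold increase in $m$ should roughly halve the RMSE, consistent with the $\mathcal{O}(1/\sqrt{m})$ rate.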

Fourier transforms connect $p(\omega)$ to the target kernel; Gaussian functions set the spectral density for RBF kernels; chirps and complex exponentials provide the underlying sinusoidal components.

Usage in Oakfield

Oakfield uses random Fourier feature expansions as a stimulus family:

The stimulus_random_fourier operator synthesizes structured fields from a sum of random Fourier features with configurable band limits (k_min/k_max), feature count, spectral slope, and seed.
It is used to generate controlled random spatial structure (often as a “textured” forcing/initial condition) rather than as a learned-kernel approximation layer.
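To illustrate the general idea of band-limited field synthesis, here is a hypothetical NumPy sketch. It is not Oakfield's implementation: the function `random_fourier_field`, its signature, the uniform sampling of wavenumber magnitudes, and the reading of "spectral slope" as a power-law exponent on the amplitudes are all assumptions made for illustration.

```python
import numpy as np

def random_fourier_field(xs, ys, k_min, k_max, n_features, slope, seed):
    """Hypothetical sketch: 2D field as a sum of random cosines (not Oakfield's code)."""
    rng = np.random.default_rng(seed)
    # Assumption: wavenumber magnitudes uniform in the band [k_min, k_max], k_min > 0.
    k_mag = rng.uniform(k_min, k_max, size=n_features)
    # Random propagation directions and phases.
    theta = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    phase = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    kx, ky = k_mag * np.cos(theta), k_mag * np.sin(theta)
    # Assumption: spectral slope acts as a power-law falloff of the amplitudes.
    amp = k_mag ** (-slope)
    # Scale so that the phase-averaged variance, sum(amp^2) / 2, equals 1.
    amp = amp / np.sqrt(np.sum(amp ** 2) / 2.0)
    X, Y = np.meshgrid(xs, ys, indexing="ij")
    arg = X[..., None] * kx + Y[..., None] * ky + phase
    return np.sum(amp * np.cos(arg), axis=-1)

xs = np.linspace(0.0, 1.0, 64)
ys = np.linspace(0.0, 1.0, 48)
field = random_fourier_field(xs, ys, k_min=2.0, k_max=40.0,
                             n_features=128, slope=1.5, seed=7)
field_again = random_fourier_field(xs, ys, k_min=2.0, k_max=40.0,
                                   n_features=128, slope=1.5, seed=7)
```

Fixing the seed makes the field reproducible, which is what allows the random structure to be "controlled" across runs in a sketch like this.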

Historical Foundations

📜 Bochner’s Theorem and Random Features

Random Fourier features are motivated by Bochner’s theorem, which represents shift-invariant positive-definite kernels via spectral measures.

🌍 Modern Perspective

They provide practical finite-dimensional surrogates for kernel methods, trading deterministic accuracy for randomized scalability.

📚 References

Rahimi & Recht, “Random Features for Large-Scale Kernel Machines” (2007)
Stein & Shakarchi, Fourier Analysis