Log-Sum-Exp
📐 Definition
For $x \in \mathbb{R}^n$,

$$\operatorname{LSE}(x) = \log \sum_{i=1}^{n} e^{x_i}.$$

To avoid overflow or underflow, use the shift $m = \max_i x_i$ and compute

$$\operatorname{LSE}(x) = m + \log \sum_{i=1}^{n} e^{x_i - m}.$$
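As a concrete illustration of the shift trick (a minimal sketch, not an Oakfield API), a stable vector log-sum-exp might look like this:

```cpp
// Minimal sketch of the max-shift described above; illustrative only.
#include <algorithm>
#include <cmath>
#include <limits>
#include <vector>

double logsumexp(const std::vector<double>& x) {
    if (x.empty()) return -std::numeric_limits<double>::infinity();
    const double m = *std::max_element(x.begin(), x.end());  // shift m = max_i x_i
    double sum = 0.0;
    for (double xi : x) sum += std::exp(xi - m);  // every term lies in (0, 1], so no overflow
    return m + std::log(sum);
}
```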
Domain and Codomain
Domain: real vectors or arrays. Codomain: real values; extends to complex arguments via the principal logarithm.
⚙️ Key Properties
Invariance to uniform shifts: $\operatorname{LSE}(x + c\mathbf{1}) = \operatorname{LSE}(x) + c$. Gradients are softmax weights:

$$\frac{\partial \operatorname{LSE}(x)}{\partial x_i} = \frac{e^{x_i}}{\sum_j e^{x_j}} = \operatorname{softmax}(x)_i.$$

The function is convex and provides a smooth approximation to $\max_i x_i$.
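Because the gradient is exactly the softmax vector, a stable softmax can be computed by reusing the `logsumexp` sketch above (a hypothetical helper, not an Oakfield function) and exponentiating the shifted values:

```cpp
// Sketch: the gradient of log-sum-exp is the softmax vector; computing it as
// exp(x_i - LSE(x)) reuses the same stabilization and cannot overflow.
#include <cmath>
#include <cstddef>
#include <vector>

std::vector<double> lse_gradient(const std::vector<double>& x) {
    const double lse = logsumexp(x);  // stable helper from the sketch above
    std::vector<double> g(x.size());
    for (std::size_t i = 0; i < x.size(); ++i)
        g[i] = std::exp(x[i] - lse);  // softmax(x)_i; the components sum to 1
    return g;
}
```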
🎯 Special Cases and Limits
- Dominant entry: if one $x_k$ is much larger than the rest, $\operatorname{LSE}(x) \approx x_k$.
- Equal entries: if $x_i = a$ for all $i$, then $\operatorname{LSE}(x) = a + \log n$.
- Two values: $\operatorname{LSE}(a, b) = \log(e^a + e^b)$ is a smooth max with a soft transition region.
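These identities are easy to sanity-check numerically with the `logsumexp` sketch from the Definition section:

```cpp
// Numeric sanity checks for the special cases (uses the logsumexp sketch above).
#include <cassert>
#include <cmath>

void check_special_cases() {
    // Equal entries: LSE(2, 2, 2) = 2 + log 3.
    assert(std::abs(logsumexp({2.0, 2.0, 2.0}) - (2.0 + std::log(3.0))) < 1e-12);
    // Dominant entry: LSE(100, 0) = 100 + log(1 + e^-100), indistinguishable from 100.
    assert(std::abs(logsumexp({100.0, 0.0}) - 100.0) < 1e-12);
    // Two equal values: the smooth max gives log(e^0 + e^0) = log 2.
    assert(std::abs(logsumexp({0.0, 0.0}) - std::log(2.0)) < 1e-12);
}
```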
🔗 Related Functions
Log-sum-exp is built from the exponential and logarithm and is tightly coupled to softmax (its gradient). It is also related to log-absolute-value as a stabilized log-domain primitive.
Usage in Oakfield
Oakfield does not currently expose a dedicated “log-sum-exp” operator, but the same stabilization pattern shows up in a few places:
- Math utilities: `core/math_utils.h` provides `sim_logsumexp2_double` and `sim_logsumexp2_complex` as reusable helpers for numerically stable log-domain accumulation (a sketch of the two-argument pattern follows this list).
- Softplus-style clamps: the `thermostat` operator uses `log1p(exp(kx))`, which is a special case of log-sum-exp (`log(exp(0) + exp(kx))`) used for smooth, stable clamping.
- Future-facing: these primitives are intended for operator kernels that need stable “smooth max / free energy” style reductions without overflow.
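To make the pattern concrete, a typical stable two-argument log-sum-exp is sketched below. This is illustrative only; the actual `sim_logsumexp2_double` in `core/math_utils.h` may differ in signature and edge-case handling.

```cpp
// Illustrative sketch of a stable two-argument log-sum-exp and a softplus-style
// clamp built on it. Not the actual Oakfield implementation.
#include <algorithm>
#include <cmath>

double logsumexp2(double a, double b) {
    const double hi = std::max(a, b);
    const double lo = std::min(a, b);
    // log(e^a + e^b) = hi + log1p(e^(lo - hi)); the exponent is <= 0, so exp
    // cannot overflow, and log1p keeps precision when the added term is tiny.
    return hi + std::log1p(std::exp(lo - hi));
}

// Thermostat-style clamp: log(exp(0) + exp(k*x)) = logsumexp2(0, k*x),
// a smooth, overflow-free surrogate for max(0, k*x).
double softplus_clamp(double k, double x) {
    return logsumexp2(0.0, k * x);
}
```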
Historical Foundations
📜 Free Energy and Partition Functions
Expressions of the form $\log \sum_i e^{-\beta E_i}$ arise naturally in statistical mechanics (log partition functions) and large-deviation/Laplace principles, where they summarize ensembles in a stable log domain.
🌍 Modern Perspective
Log-sum-exp is a standard numerical stabilization primitive in optimization and machine learning, avoiding overflow/underflow while retaining differentiability.
📚 References
- Boyd & Vandenberghe, Convex Optimization
- Cover & Thomas, Elements of Information Theory