A Proof of Jensen's Inequality and its Applications

November 4, 2019

This post gives a general proof of Jensen’s inequality and several useful applications.

Definition 1. A function $f : \mathbb{R} \rightarrow \mathbb{R}$ is convex if

\begin{aligned} f(\lambda x + (1 - \lambda) y) \leq \lambda f(x) + (1 - \lambda) f(y) \end{aligned}

whenever $x < y \in \mathbb{R}$ and $\lambda \in [0,1]$.
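As a quick numerical illustration of Definition 1, the sketch below evaluates the gap between the two sides of the convexity inequality on a grid of $\lambda$ values. The helper name `convexity_gap` and the sample points are assumptions chosen for illustration; a finite check like this can refute convexity but cannot prove it.

```python
import math

def convexity_gap(f, x, y, lam):
    """Return lam*f(x) + (1-lam)*f(y) - f(lam*x + (1-lam)*y).

    The value is nonnegative exactly when the convexity inequality
    holds at the triple (x, y, lam).
    """
    return lam * f(x) + (1 - lam) * f(y) - f(lam * x + (1 - lam) * y)

# Grid of lambda values in [0, 1].
lambdas = [k / 10 for k in range(11)]

# exp is convex on the real line: every gap should be nonnegative.
exp_gaps = [convexity_gap(math.exp, -1.0, 2.0, lam) for lam in lambdas]

# sin is concave on [0, pi], so the inequality fails there.
sin_gaps = [convexity_gap(math.sin, 0.0, math.pi, lam) for lam in lambdas]

print(min(exp_gaps) >= -1e-12)  # small tolerance for rounding
print(min(sin_gaps) < 0)
```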

Definition 2. A subderivative of a convex function $f : \mathbb{R} \rightarrow \mathbb{R}$ at a point $x_0$ is a real number $c$ such that \begin{align} f(x) \geq f(x_0) + c(x - x_0) \end{align} for all $x \in \mathbb{R}$.
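To make Definition 2 concrete, here is a minimal sketch that spot-checks Inequality (1) on a finite sample of points for $f(x) = |x|$ at $x_0 = 0$. The function name `is_subderivative` and the sample grid are illustrative assumptions; passing the check on a grid is evidence, not a proof.

```python
def is_subderivative(f, x0, c, xs, tol=1e-12):
    """Spot-check f(x) >= f(x0) + c*(x - x0) for every x in the sample xs."""
    return all(f(x) >= f(x0) + c * (x - x0) - tol for x in xs)

# Sample points in [-5, 5].
xs = [k / 10 for k in range(-50, 51)]

# For f(x) = |x| at x0 = 0, any slope in [-1, 1] gives a supporting line,
# while a slope outside that range cuts through the graph.
print(is_subderivative(abs, 0.0, 0.5, xs))
print(is_subderivative(abs, 0.0, -1.0, xs))
print(is_subderivative(abs, 0.0, 1.5, xs))
```

The example also shows that a subderivative need not be unique: at a kink, a whole interval of slopes satisfies the definition.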

Proposition 3. A subderivative of a convex function $f : \mathbb{R} \rightarrow \mathbb{R}$ exists at any point $x_0$.

Proof. Let $x < y < z \in \mathbb{R}$ be given. First notice that $y = \lambda x + (1 - \lambda) z$ where

\begin{aligned} \lambda = \frac{y - z}{x - z} \in (0,1). \end{aligned}

Then since $f$ is convex,

\begin{aligned} f(y) \leq \left( \frac{y - z}{x - z} \right) f(x) + \left( \frac{x - y}{x - z} \right) f(z) \end{aligned}

which implies

\begin{aligned} \frac{f(x) - f(y)}{x - y} \leq \frac{f(z) - f(y)}{z - y}. \end{aligned}

Hence the interval

\begin{aligned} \partial f(x_0) \mathrel{:=} \left[ \lim_{h \rightarrow 0^-} \frac{f(x_0 + h) - f(x_0)}{h}, \ \lim_{h \rightarrow 0^+} \frac{f(x_0 + h) - f(x_0)}{h} \right] \end{aligned}

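is nonempty: taking $y = x_0$ in the inequality above shows that every left difference quotient at $x_0$ is at most every right difference quotient, so the left limit is at most the right limit. Both limits exist because a rearrangement of the same convexity inequality shows that the difference quotient $h \mapsto \frac{f(x_0 + h) - f(x_0)}{h}$ is nondecreasing in $h$, so each one-sided limit is the limit of a bounded monotone function (the monotone convergence theorem). Choose any $c \in \partial f(x_0)$ and let an arbitrary $x \in \mathbb{R}$ be given. If $x = x_0$, Inequality (1) is trivially satisfied. If $x < x_0$,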

\begin{aligned} \frac{f(x) - f(x_0)}{x - x_0} \leq \lim_{h \rightarrow 0^-} \frac{f(x_0 + h) - f(x_0)}{h} \leq c \end{aligned}

and thus Inequality (1) is satisfied. If $x > x_0$,

\begin{aligned} \frac{f(x) - f(x_0)}{x - x_0} \geq \lim_{h \rightarrow 0^+} \frac{f(x_0 + h) - f(x_0)}{h} \geq c \end{aligned}

and again Inequality (1) is satisfied. This completes the proof. $\blacksquare$
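The endpoints of the interval $\partial f(x_0)$ from the proof can be approximated numerically by one-sided difference quotients with a small step. The sketch below is only an approximation under an assumed step size `h`; the helper names `one_sided_slopes` and `square` are hypothetical.

```python
def one_sided_slopes(f, x0, h=1e-6):
    """Approximate the left and right limits of the difference quotient at x0."""
    left = (f(x0 - h) - f(x0)) / (-h)   # quotient with a small negative step
    right = (f(x0 + h) - f(x0)) / h     # quotient with a small positive step
    return left, right

def square(x):
    return x * x

# |x| has a kink at 0, so the interval of subderivatives is nondegenerate.
kink = one_sided_slopes(abs, 0.0)

# x^2 is differentiable at 1, so both endpoints are close to the derivative 2.
smooth = one_sided_slopes(square, 1.0)

print(kink)
print(smooth)
```

In both cases the left value does not exceed the right value, which is the nonemptiness claim in the proof.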

Theorem 4. (Jensen’s Inequality) Let $(X,\mathcal{A},\mu)$ be a measure space, suppose $\mu(X) = 1$, and let $f : \mathbb{R} \rightarrow \mathbb{R}$ be convex. Let $g : X \rightarrow \mathbb{R}$ be integrable. Then the following inequality holds, where the right-hand side may be $+\infty$:

\begin{aligned} f \left( \int g \, d\mu \right) \leq \int f \circ g \, d\mu. \end{aligned}

Proof. Let $x_0 = \int g \, d\mu$, which is finite because $g$ is integrable. By Proposition 3, there is $c \in \mathbb{R}$ such that $f(y) \geq f(x_0) + c(y - x_0)$ for all $y \in \mathbb{R}$. Taking $y = g(x)$ and integrating over $X$ gives

\begin{aligned} \int f \circ g \, d\mu &\geq \int (f(x_0) + c(g(x) - x_0)) \, d\mu \\ &= \mu(X) f(x_0) + c \int g \, d\mu - c x_0 \mu(X) \\ &= f(x_0) = f \left( \int g \, d\mu \right). \end{aligned}

$\blacksquare$
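The proof can be traced step by step on a small probability space. The sketch below assumes a hypothetical example that is not part of the original argument: a fair six-sided die with $g(x) = x$ and $f(t) = t^2$, using exact rational arithmetic. It integrates the supporting line from Proposition 3 and compares both sides of the inequality.

```python
from fractions import Fraction

# Hypothetical probability space: a fair die, mu({x}) = 1/6 for x = 1..6.
X = range(1, 7)
mu = {x: Fraction(1, 6) for x in X}

def integrate(h):
    """Integral of h with respect to mu (a finite weighted sum here)."""
    return sum(h(x) * mu[x] for x in X)

def g(x):
    return Fraction(x)

def f(t):
    return t * t  # convex

x0 = integrate(g)   # the mean of g
c = 2 * x0          # f'(x0), which is a subderivative of t^2 at x0

lhs = f(x0)                                              # f(integral of g)
rhs = integrate(lambda x: f(g(x)))                       # integral of f o g
support = integrate(lambda x: f(x0) + c * (g(x) - x0))   # integral of the supporting line

print(lhs, support, rhs)
```

The integral of the supporting line collapses to $f(x_0)$ because $\mu(X) = 1$, which is exactly the middle step of the proof.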

Corollary 5. (Jensen’s Inequality, Discrete Form) Suppose $\lambda_n \geq 0$, $\sum_{n = 1}^\infty \lambda_n = 1$, and let $f : \mathbb{R} \rightarrow \mathbb{R}$ be convex. Let $\{a_n\}$ be a sequence of real numbers such that

\begin{aligned} \sum_{n = 1}^\infty \lambda_n a_n \end{aligned}

converges absolutely. Then the following inequality holds:

\begin{aligned} f \left( \sum_{n = 1}^\infty \lambda_n a_n \right) \leq \sum_{n = 1}^\infty \lambda_n f(a_n). \end{aligned}

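Proof. Apply Theorem 4 with the measure space $(\mathbb{N},\mathcal{P}(\mathbb{N}), \mu)$, where $\mu$ is the measure determined by $\mu(\{n\}) = \lambda_n$, and with $g(n) \mathrel{:=} a_n$. Then $\mu(\mathbb{N}) = \sum_{n = 1}^\infty \lambda_n = 1$, and $g$ is integrable because $\sum_{n = 1}^\infty \lambda_n a_n$ converges absolutely. $\blacksquare$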
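As an illustration of the discrete form, the sketch below evaluates both sides of Corollary 5 for finitely many nonzero weights (all remaining $\lambda_n$ are taken to be zero). The choice $f = \exp$ with $a_n = \log b_n$ is an assumption made for the example; with it, the inequality reads as the weighted arithmetic-geometric mean inequality. The helper name `jensen_sides` and the sample numbers are hypothetical.

```python
import math

def jensen_sides(f, weights, values):
    """Return (f(sum w*a), sum w*f(a)) for nonnegative weights summing to 1."""
    assert all(w >= 0 for w in weights)
    assert math.isclose(sum(weights), 1.0)
    mean = sum(w * a for w, a in zip(weights, values))
    return f(mean), sum(w * f(a) for w, a in zip(weights, values))

weights = [0.2, 0.3, 0.5]
b = [1.0, 4.0, 9.0]                # positive numbers to average
a = [math.log(x) for x in b]       # a_n = log b_n

# Left side: weighted geometric mean of b. Right side: weighted arithmetic mean of b.
geometric, arithmetic = jensen_sides(math.exp, weights, a)

print(geometric <= arithmetic)
```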