First Principles

A.3 Proof of Euler-Lagrange Equation

The Euler-Lagrange equation provides the solution to a classic problem in the calculus of variations. The goal of our problem is to find

\[\begin{equation} \arg\min_{x(t)} \mathcal{S} \qquad \mathcal{S} = \mathcal{I}[\mathcal{L}] = \int\limits_{t_0}^{t}{\mathcal{L}(x,\dot{x},t)\,\mathrm{d}t} \tag{1.5} \end{equation}\]

The usual method of finding critical points by finding the roots of a derivative works only with a function \(\mathbb{R} \to \mathbb{R}\). \(\mathcal{I}\), however, is a functional that takes in a function \(x(t)\) which represents a path and outputs a real number, the value of the integral from \(t_0\) to \(t\).
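To make the distinction concrete, here is a minimal Python sketch of a functional: a routine that accepts a whole path and returns a single number. Everything named here (`action`, `arc_length`, `straight`, `parabola`) is a hypothetical illustration, and the arc-length integrand, the interval \([0, 1]\), and the midpoint-rule quadrature are assumptions chosen for simplicity.

```python
import math

def action(lagrangian, path, t0, t1, n=1000):
    """Approximate S = integral of L(x, xdot, t) dt with the midpoint rule."""
    h = (t1 - t0) / n
    total = 0.0
    for i in range(n):
        a = t0 + i * h                    # left edge of the i-th subinterval
        b = a + h                         # right edge
        tm = 0.5 * (a + b)                # midpoint
        x = path(tm)
        xdot = (path(b) - path(a)) / h    # central difference around tm
        total += lagrangian(x, xdot, tm) * h
    return total

def arc_length(x, xdot, t):
    """Assumed integrand for illustration: L = sqrt(1 + xdot^2)."""
    return math.sqrt(1.0 + xdot * xdot)

def straight(t):    # path from (0, 0) to (1, 1)
    return t

def parabola(t):    # another path with the same endpoints
    return t * t

# Same functional, two different input paths, two different real numbers.
s_line = action(arc_length, straight, 0.0, 1.0)
s_parab = action(arc_length, parabola, 0.0, 1.0)
print(s_line, s_parab)
```

The input to `action` is the function `path` itself, not a number, which is why setting an ordinary derivative to zero does not apply directly.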

Let \(x^\star(t)\) be the optimal path, that is, the one that meets our objective of minimizing \(\mathcal{S}\). Then, any arbitrary path \(x(t)\) must either be the optimal path or a path that deviates from the optimal path. Let \(\epsilon\eta(t)\) be this deviation (\(\epsilon\) is just an arbitrary constant which will come into play later). Then,

\[\begin{align*} x(\epsilon) &= x^\star(t) + \epsilon\eta(t) \\ \dot{x}(\epsilon) &= \dot{x}^\star(t) + \epsilon\dot{\eta}(t) \end{align*}\]
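The construction above turns the problem into one about an ordinary function of \(\epsilon\). The sketch below illustrates this under stated assumptions: the integrand is arc length, the optimal path is taken to be the straight line \(x^\star(t) = t\) on \([0, 1]\), and \(\eta(t) = \sin(\pi t)\) is one arbitrary deviation that vanishes at both endpoints. The helper names are hypothetical.

```python
import math

def action(path, t0=0.0, t1=1.0, n=2000):
    """Midpoint-rule estimate of S for the assumed L = sqrt(1 + xdot^2)."""
    h = (t1 - t0) / n
    total = 0.0
    for i in range(n):
        a = t0 + i * h
        b = a + h
        xdot = (path(b) - path(a)) / h    # central difference at the midpoint
        total += math.sqrt(1.0 + xdot * xdot) * h
    return total

def x_star(t):
    """Assumed optimal path: straight line from (0, 0) to (1, 1)."""
    return t

def eta(t):
    """Deviation that vanishes at t = 0 and t = 1."""
    return math.sin(math.pi * t)

def perturbed(eps):
    """Return the path x(t) = x_star(t) + eps * eta(t)."""
    return lambda t: x_star(t) + eps * eta(t)

# S is now an ordinary function of the single real number eps.
for eps in (-0.2, -0.1, 0.0, 0.1, 0.2):
    print(f"eps = {eps:+.1f}  S = {action(perturbed(eps)):.6f}")
```

For this choice of path and deviation, the printed values should be smallest at \(\epsilon = 0\).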

Now we’re ready to begin the proof. If \(x^\star(t)\) is optimal, then \(\mathcal{S}\), viewed as a function of \(\epsilon\), must have a critical point at \(\epsilon = 0\) for every choice of \(\eta(t)\); that is, \(\frac{\mathrm{d}\mathcal S}{\mathrm{d}\epsilon}\) must be zero there. The analogy for a single-variable function is finding the points where \(y'(x)=0\). Expanding \(\mathcal{S}\),

\[\begin{align*} 0 &= \frac{\mathrm{d}}{\mathrm{d}\epsilon}\int\limits_{t_0}^{t}{\mathcal{L}(x,\dot{x},t)\,\mathrm{d}t} \\ &= \int\limits_{t_0}^{t}{\frac{\mathrm{d}}{\mathrm{d}\epsilon}{\Big[\mathcal{L}(x,\dot{x},t)\Big]}\,\mathrm{d}t} \\ &= \int\limits_{t_0}^{t}{ \left[\frac{\partial \mathcal{L}}{\partial x}\cancelto{\eta(t)}{\frac{\partial x}{\partial \epsilon}}\; + \frac{\partial \mathcal{L}}{\partial \dot{x}}\cancelto{\dot\eta(t)}{\frac{\partial \dot{x}}{\partial \epsilon}}\qquad \right]\,\mathrm{d}t} \\ \end{align*}\]

\[ 0 = \int\limits_{t_0}^{t}{ \frac{\partial \mathcal{L}}{\partial x} \eta(t) \,\mathrm{d}t} + {\int\limits_{t_0}^{t}{ \frac{\partial \mathcal{L}}{\partial \dot x} \dot\eta(t) \,\mathrm{d}t}} \]

We’ll set aside the first integral in this equation and integrate the second one by parts. We choose \(u=\shpderiv{\mathcal{L}}{\dot{x}}\), which leaves \(\diff{v} = \dot\eta(t)\,\diff{t}\). Then \(v=\eta(t)\) and

\[ \diff{u} = \deriv{}{t}\pderiv{\mathcal{L}}{\dot{x}}\diff{t} \]

\[\begin{align*} {\int\limits_{t_0}^{t}{ \frac{\partial \mathcal{L}}{\partial \dot x} \dot\eta(t) }\diff{t}} &= uv\Big\rvert_{t_0}^{t} - \int\limits_{t_0}^{t}{v\,\diff{u}}\\ &= \cancelto{0}{\left[\frac{\partial \mathcal{L}}{\partial \dot{x}}\eta(t)\right]_{t_0}^{t}} - \int\limits_{t_0}^{t}{\eta(t)\deriv{}{t}\frac{\partial \mathcal{L}}{\partial \dot{x}}}\diff{t} \end{align*}\]

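Every candidate path must start and end at the same points as \(x^\star(t)\), so the deviation vanishes at both limits of integration: \(\eta(t_0)=\eta(t)=0\). We can therefore cancel the first term without knowing anything about the partial derivative. Now we can substitute what’s left back into the equation with the two integrals.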

\[\begin{align*} 0 &= \int\limits_{t_0}^{t}{ \pderiv{\mathcal{L}}{x} \eta(t) }\diff{t} - \int\limits_{t_0}^{t}{\eta(t)\deriv{}{t}\frac{\partial \mathcal{L}}{\partial \dot{x}}}\diff{t} \\ &= \int\limits_{t_0}^{t}{\left[\pderiv{\mathcal{L}}{x} \eta(t) - \eta(t)\deriv{}{t}\frac{\partial \mathcal{L}}{\partial \dot{x}}\right]}\diff{t} \\ &= \int\limits_{t_0}^{t}{\left[\eta(t)\left(\pderiv{\mathcal{L}}{x} - \deriv{}{t}\frac{\partial \mathcal{L}}{\partial \dot{x}}\right)\right]}\diff{t} \end{align*}\]

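Remember, \(\eta(t)\) is arbitrary. The integral can vanish for every choice of \(\eta(t)\) only if the expression in parentheses is zero everywhere on the interval (this is the fundamental lemma of the calculus of variations). So what we’ve actually found is a constraint: the optimal path has to obey the differential equation in parentheses. This gives the Euler-Lagrange equation: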

\[\begin{equation} \pderiv{\mathcal L}{x} - \deriv{}{t}\frac{\partial \mathcal{L}}{\partial \dot{x}} = 0 \tag{1.6} \end{equation}\]
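The link between the stationarity condition \(\frac{\mathrm{d}\mathcal S}{\mathrm{d}\epsilon} = 0\) and equation (1.6) can be checked numerically. The following is only a sketch under assumptions not in the text: the integrand is taken to be \(\sqrt{1+\dot{x}^2}\) on \([0, 1]\), the deviation is \(\eta(t) = \sin(\pi t)\), and the derivative with respect to \(\epsilon\) is estimated with a central difference. All function names are hypothetical.

```python
import math

def action(path, n=2000):
    """Midpoint-rule estimate of S for the assumed L = sqrt(1 + xdot^2) on [0, 1]."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        a = i * h
        b = a + h
        xdot = (path(b) - path(a)) / h    # central difference at the midpoint
        total += math.sqrt(1.0 + xdot * xdot) * h
    return total

def eta(t):
    """Deviation that vanishes at t = 0 and t = 1."""
    return math.sin(math.pi * t)

def dS_deps(base, step=1e-4):
    """Central-difference estimate of dS/d(eps) at eps = 0 around `base`."""
    plus = action(lambda t: base(t) + step * eta(t))
    minus = action(lambda t: base(t) - step * eta(t))
    return (plus - minus) / (2.0 * step)

def line(t):        # xdot is constant, so (1.6) holds for this integrand
    return t

def parabola(t):    # same endpoints, but (1.6) does not hold
    return t * t

slope_line = dS_deps(line)
slope_parabola = dS_deps(parabola)
print(slope_line, slope_parabola)
```

The first value should be zero up to rounding error, while the second should be clearly non-zero: a path that violates (1.6) can be improved by moving along \(\eta(t)\).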

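Example 1.2 (Optimizing Distance) Let’s pick the integrand whose integral is arc length or distance. Here \(y(x)\) plays the role of the path and \(x\) plays the role of \(t\), so \(\dot{y}\) means \(\mathrm{d}y/\mathrm{d}x\):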

\[ \mathcal{L}(y, \dot{y}, x) = \sqrt{1+\dot{y}^2} \]

Now we apply Euler-Lagrange and see what it yields:

\[ 0 = \cancelto{0}{\frac{\partial \mathcal{L}}{\partial y}} - \deriv{}{x}\pderiv{\mathcal{L}}{\dot{y}} = - \deriv{}{x} \frac{\dot{y}}{\sqrt{1+\dot{y}^2}} \]

\[ \frac{\dot{y}}{\sqrt{1+\dot{y}^2}} = C \implies \dot{y} = \frac{C}{\sqrt{1-C^2}} \]

The right-hand side is a constant, so the slope \(\dot{y}\) is the same everywhere and \(y\) is linear in \(x\). This is as expected, since a straight line is the shortest path between two points in Euclidean space.
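As a sanity check on this example, the left-hand side of (1.6) can be estimated with finite differences for candidate curves. This is a rough sketch, not a general-purpose solver: the helper `el_residual`, the step size `h`, and the two test curves are assumptions made for illustration.

```python
import math

def lagrangian(y, ydot, x):
    """Arc-length integrand from Example 1.2."""
    return math.sqrt(1.0 + ydot * ydot)

def el_residual(L, y, x, h=1e-3):
    """Finite-difference estimate of dL/dy - d/dx(dL/dydot) at x along y."""
    def ydot(s):
        return (y(s + h) - y(s - h)) / (2.0 * h)

    def dL_dy(s):
        return (L(y(s) + h, ydot(s), s) - L(y(s) - h, ydot(s), s)) / (2.0 * h)

    def dL_dydot(s):
        return (L(y(s), ydot(s) + h, s) - L(y(s), ydot(s) - h, s)) / (2.0 * h)

    # Total derivative along the curve, again by central difference.
    return dL_dy(x) - (dL_dydot(x + h) - dL_dydot(x - h)) / (2.0 * h)

def line(x):        # y = 2x + 1, an arbitrary straight line
    return 2.0 * x + 1.0

def parabola(x):    # y = x^2, a curved candidate
    return x * x

res_line = el_residual(lagrangian, line, 0.5)
res_parabola = el_residual(lagrangian, parabola, 0.5)
print(res_line, res_parabola)
```

The residual for the straight line should vanish up to rounding error, while for the parabola it should be close to \(-2/(1+4x^2)^{3/2}\) evaluated at \(x = 0.5\), which is non-zero.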