Calculus of Variations: Optimizing Over Functions
The calculus of variations extends optimization from numbers to functions. Instead of finding the x that minimizes f(x), we find the function y(x) that minimizes a functional J[y] — a real-valued map on a space of functions. This framework underlies classical mechanics, optics, geometry, and modern optimal control theory.
Learning Objectives
After completing this guide, you will be able to:
- ✓ Define functionals and identify admissible function spaces for variational problems
- ✓ Derive the Euler-Lagrange equation from the first variation and apply it to classical problems
- ✓ Solve the brachistochrone, geodesic, minimal surface, and isoperimetric problems
- ✓ Analyze the second variation using the Legendre and Jacobi conditions
- ✓ Formulate mechanics using Hamilton's principle and derive Lagrange's equations
- ✓ Apply the Legendre transform to pass from Lagrangian to Hamiltonian mechanics
- ✓ State and apply Noether's theorem to derive conservation laws from symmetries
- ✓ Understand the direct methods and Sobolev space framework for existence theory
1. Functionals and Admissible Functions
Ordinary calculus optimizes real-valued functions of finitely many variables. The calculus of variations generalizes this to functionals: real-valued maps defined on infinite-dimensional spaces of functions.
Definition of a Functional
A functional J assigns a real number to each function in some admissible class. The archetypal variational functional takes the integral form:
J[y] = integral from a to b of F(x, y(x), y'(x)) dx
- x is the independent variable on [a, b]
- y(x) is the unknown function (the curve we seek)
- y'(x) = dy/dx is the derivative of y
- F(x, y, p) is the Lagrangian density, assumed smooth
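Concretely, a functional can be evaluated numerically by discretizing the integral. The sketch below (NumPy; the grid size and the trapezoidal rule are illustrative choices, not from the text) applies the arc-length functional to two different curves, showing that J assigns one number to each whole function:

```python
import numpy as np

def J(F, y, a, b, n=4001):
    """Approximate J[y] = integral_a^b F(x, y, y') dx: central differences
    for y', trapezoidal rule for the integral."""
    x = np.linspace(a, b, n)
    yv = y(x)
    yp = np.gradient(yv, x)               # numerical derivative y'(x)
    f = F(x, yv, yp)
    return 0.5 * np.sum((f[1:] + f[:-1]) * np.diff(x))

arc = lambda x, y, yp: np.sqrt(1.0 + yp**2)   # arc-length density

# One number per function: the line is shorter than the parabola.
line = J(arc, lambda x: x, 0.0, 1.0)          # sqrt(2) ~ 1.4142
parabola = J(arc, lambda x: x**2, 0.0, 1.0)   # ~ 1.4789
print(line, parabola)
```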
Admissible Functions
The admissible class specifies which functions y are permitted competitors. For fixed-endpoint problems, the standard admissible class is:
A = { y in C^1[a,b] : y(a) = alpha, y(b) = beta }
This class requires y to be continuously differentiable on [a, b] and to satisfy fixed endpoint conditions. Natural (Neumann) boundary conditions relax the endpoint requirements and lead to additional transversality conditions.
Canonical Examples of Functionals
Arc Length
J[y] = integral of sqrt(1 + y'^2) dx
Measures the length of the curve y(x) between two points. Minimizing gives straight lines (in the plane).
Surface of Revolution
J[y] = 2 pi * integral of y * sqrt(1 + y'^2) dx
Area of the surface generated by rotating y(x) around the x-axis. Minimizing gives catenoids (soap film surfaces).
Action (Mechanics)
S[q] = integral of L(q, q-dot, t) dt
The action of a mechanical system with Lagrangian L = T - V. Stationary paths satisfy Newton's second law.
Time of Descent
T[y] = integral of sqrt((1 + y'^2) / (2g|y|)) dx
Time for a bead to slide down curve y(x) under gravity g. The brachistochrone: minimized by a cycloid.
Multiple Dependent Variables
Many physical problems involve several dependent variables y_1(x), ..., y_n(x). The functional generalizes to:
J[y_1, ..., y_n] = integral of F(x, y_1, ..., y_n, y_1', ..., y_n') dx
This leads to a system of Euler-Lagrange equations, one for each y_i. In Lagrangian mechanics, the y_i are generalized coordinates q_1, ..., q_n describing the configuration space of the system.
2. The Euler-Lagrange Equation
The Euler-Lagrange equation is the central result of the calculus of variations. It is the necessary condition that any smooth minimizer (or maximizer) of the functional J[y] must satisfy.
Derivation via the First Variation
Let y* be an extremum of J in the admissible class A. For any admissible variation eta(x) — a smooth function with eta(a) = eta(b) = 0 — define the perturbed function:
y_epsilon(x) = y*(x) + epsilon * eta(x)
Then phi(epsilon) = J[y_epsilon] is a function of the real parameter epsilon, and y* is an extremum only if:
phi'(0) = 0 for all admissible eta
Computing phi'(0) by differentiating under the integral sign and integrating by parts yields:
First Variation:
delta J[y; eta] = integral of (F_y - d/dx F_y') * eta dx = 0
where F_y = partial F / partial y and F_y' = partial F / partial y'. Since eta is arbitrary, the fundamental lemma of the calculus of variations forces the integrand to vanish.
The Euler-Lagrange Equation
F_y - d/dx(F_y') = 0
Equivalently: partial F / partial y - d/dx (partial F / partial y') = 0. This is an ODE for y(x), generally of second order. Combined with the boundary conditions y(a) = alpha, y(b) = beta, it defines a boundary value problem whose solutions are the extremals of J.
The Fundamental Lemma of Variational Calculus
Fundamental Lemma
If g is continuous on [a, b] and the integral of g(x) * eta(x) dx = 0 for every smooth function eta with eta(a) = eta(b) = 0, then g(x) = 0 for all x in [a, b]. This lemma is what allows us to extract a pointwise equation from the integral condition delta J = 0.
Beltrami Identity (F Independent of x)
When the Lagrangian F does not depend explicitly on x (autonomous problems), the Euler-Lagrange equation has a first integral called the Beltrami identity:
F - y' * F_y' = C (constant)
This reduces the Euler-Lagrange ODE from second order to first order — a major simplification. The Beltrami identity applies directly to the brachistochrone and geodesic problems.
Natural Boundary Conditions
If one or both endpoints are free (not prescribed), the integration by parts in the first variation produces boundary terms. Setting these to zero gives natural boundary conditions:
F_y'(a, y(a), y'(a)) = 0 (if left endpoint is free)
F_y'(b, y(b), y'(b)) = 0 (if right endpoint is free)
More generally, if an endpoint lies on a curve g(x, y) = 0, the transversality condition imposes an orthogonality-type relation between the extremal and the endpoint curve.
Higher-Order Lagrangians
If J[y] depends on higher derivatives F(x, y, y', y''), the Euler-Lagrange equation generalizes via repeated integration by parts:
F_y - d/dx(F_y') + d^2/dx^2(F_y'') = 0
This is a fourth-order ODE for y. Higher-order problems appear in beam bending (Euler-Bernoulli beam theory uses the curvature squared as the Lagrangian) and thin plate theory.
3. Classical Variational Problems
Several famous problems from the history of mathematics and physics reduce to applying the Euler-Lagrange equation to specific functionals. These classical problems are the standard examples tested on graduate qualifying exams.
The Brachistochrone Problem
Find the curve connecting (0, 0) to (a, -b) (with b greater than 0) along which a frictionless bead slides under gravity in minimum time.
Setup:
By energy conservation: v = sqrt(2gy) (taking y downward). Time = integral of ds/v = integral of sqrt((1 + y'^2) / (2gy)) dx
Solution via Beltrami Identity:
F = sqrt((1 + y'^2) / y) is independent of x, so F - y' F_y' = C. This simplifies to: y(1 + y'^2) = 1/C^2 = 2R (constant).
Parametric Solution (Cycloid):
x = R(theta - sin theta), y = R(1 - cos theta)
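The parametric solution can be checked directly: along the cycloid, the first integral y(1 + y'^2) should equal the constant 2R at every point. A quick NumPy check (the value of R and the parameter range are illustrative choices):

```python
import numpy as np

# Cycloid through the origin: x = R(t - sin t), y = R(1 - cos t).
# The Beltrami first integral for the brachistochrone reduces to
# y * (1 + y'^2) = 2R along the curve (y measured downward).
R = 1.5
t = np.linspace(0.1, np.pi - 0.1, 500)    # avoid the cusp at t = 0
y = R * (1 - np.cos(t))
yp = np.sin(t) / (1 - np.cos(t))          # dy/dx = (dy/dt)/(dx/dt)
first_integral = y * (1 + yp**2)
print(first_integral.min(), first_integral.max())   # both equal 2R = 3.0
```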
Historical Note
Johann Bernoulli posed the brachistochrone challenge in 1696. Solutions were submitted by Leibniz, Newton (anonymously, in one night), Jakob Bernoulli, and L'Hôpital. The problem launched the systematic development of the calculus of variations by Euler and Lagrange in the following decades.
Geodesics
A geodesic is the shortest path between two points on a surface. In the plane, geodesics are straight lines. On a sphere, they are great circle arcs.
Plane Geodesic:
Minimize J[y] = integral of sqrt(1 + y'^2) dx. Euler-Lagrange: d/dx(y' / sqrt(1 + y'^2)) = 0, so y' = const. Solution: y = mx + b (straight lines, as expected).
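The extremal property can also be verified numerically: perturbing the straight line by any admissible variation vanishing at the endpoints should only increase arc length. A sketch (the perturbation family eps * sin(pi x) is an illustrative choice):

```python
import numpy as np

def arc_length(yv, x):
    """Discrete arc length: central differences plus trapezoidal rule."""
    yp = np.gradient(yv, x)
    f = np.sqrt(1.0 + yp**2)
    return 0.5 * np.sum((f[1:] + f[:-1]) * np.diff(x))

# Extremal from (0,0) to (1,1): the straight line y = x.
x = np.linspace(0.0, 1.0, 4001)
lengths = {eps: arc_length(x + eps * np.sin(np.pi * x), x)
           for eps in (0.0, 0.05, 0.2, 0.5)}
print(lengths)   # eps = 0 (the line) gives the smallest value, sqrt(2)
```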
Sphere Geodesic:
On the unit sphere with metric ds^2 = d(theta)^2 + sin^2(theta) d(phi)^2, the geodesic equations from the Euler-Lagrange system yield great circles: intersections of the sphere with planes through the origin.
In Riemannian geometry, geodesics satisfy the geodesic equation:
d^2 x^k/ds^2 + Gamma^k_ij (dx^i/ds)(dx^j/ds) = 0
where Gamma^k_ij are Christoffel symbols of the metric. In general relativity, freely falling particles follow geodesics of spacetime — this is the variational content of Einstein's equivalence principle.
Minimal Surfaces and Soap Films
A minimal surface is one with zero mean curvature — the variational condition for a surface of least area spanning a given boundary curve. Physical soap films realize minimal surfaces because surface tension drives them to minimize area.
Plateau's Problem:
Given a closed curve in space, find the surface of least area spanning it. This was solved rigorously by Jesse Douglas and Tibor Radó in 1930-31, earning Douglas the first Fields Medal.
Catenoid (Surface of Revolution):
Minimize 2 pi * integral of y * sqrt(1 + y'^2) dx. Beltrami identity: y / sqrt(1 + y'^2) = C. Solution: y = C * cosh((x - x_0)/C) (catenary).
Minimal Surface Equation:
(1 + z_y^2)z_xx - 2z_x z_y z_xy + (1 + z_x^2)z_yy = 0
This nonlinear PDE for z(x,y) is the Euler-Lagrange equation of the area functional for graphs. Famous solutions include Scherk's surface, Enneper's surface, and the helicoid.
The Isoperimetric Problem
Among all closed curves of fixed perimeter L, which encloses the greatest area? The answer — known since antiquity — is the circle. This is the classical isoperimetric problem, and its rigorous proof requires the calculus of variations with a constraint.
Formulation:
Maximize: A[y] = integral of y dx (area under curve)
Subject to: L[y] = integral of sqrt(1 + y'^2) dx = L (fixed length)
Solution via Lagrange Multiplier:
Form H[y] = A[y] + lambda * L[y] and apply Euler-Lagrange. The extremals are circles of radius R = L / (2 pi).
Isoperimetric Inequality
4 pi A ≤ L^2
with equality if and only if the curve is a circle. This inequality has deep generalizations in Riemannian geometry and geometric measure theory.
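The inequality is easy to test on discretized curves. In the sketch below (NumPy; shoelace area and polygonal length, with illustrative test curves), the deficit L^2 - 4 pi A is essentially zero for a circle and clearly positive for an ellipse:

```python
import numpy as np

def deficit(xs, ys):
    """L^2 - 4*pi*A for a closed polygonal curve:
    segment lengths for L, the shoelace formula for A."""
    L = np.sum(np.hypot(np.diff(xs), np.diff(ys)))
    A = 0.5 * abs(np.sum(xs[:-1] * ys[1:] - xs[1:] * ys[:-1]))
    return L**2 - 4 * np.pi * A

t = np.linspace(0, 2 * np.pi, 20001)       # closed parameterization
circle = deficit(np.cos(t), np.sin(t))
ellipse = deficit(2 * np.cos(t), np.sin(t))
print(circle, ellipse)   # circle: ~0 (equality case); ellipse: strictly positive
```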
4. The Second Variation
The first variation gives necessary conditions (the Euler-Lagrange equation) analogous to f'(x) = 0. To distinguish minima from maxima and saddle points, we need the second variation — the variational analogue of the second derivative test.
Definition of the Second Variation
Let y* be an extremal (solution of Euler-Lagrange). The second variation is the quadratic form in eta:
delta^2 J[y*; eta] = integral of (F_yy eta^2 + 2 F_yy' eta eta' + F_y'y' eta'^2) dx
For y* to be a weak local minimum, we need delta^2 J ≥ 0 for all admissible eta. This is a necessary condition for a minimum.
The Legendre Condition
A necessary condition for a weak local minimum is the Legendre condition:
F_y'y'(x, y*, y*') ≥ 0 for all x in [a, b]
If F_y'y' is strictly positive everywhere (strengthened Legendre condition), the functional is locally convex in the y' direction, which is a necessary step toward confirming a minimum. For maximization problems the inequality reverses: F_y'y' ≤ 0.
The Jacobi Condition and Conjugate Points
Even when the Legendre condition holds, an extremal may fail to be a minimum if it is "too long." The Jacobi condition addresses this via conjugate points.
Jacobi Equation:
d/dx(F_y'y' u') - (F_yy - d/dx F_yy') u = 0
Conjugate Point:
The point x = c is conjugate to a with respect to the extremal y* if the Jacobi equation has a nontrivial solution u with u(a) = 0 and u(c) = 0.
Jacobi Condition for a Minimum:
The interval (a, b] contains no conjugate point to a. If the open interval (a, b) contains a conjugate point, y* is not even a weak local minimum.
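A concrete illustration (a sketch using SciPy; the Lagrangian F = y'^2 - y^2 is an illustrative choice, not from the text): here F_y'y' = 2, F_yy = -2, F_yy' = 0, so the Jacobi equation is u'' + u = 0 with u(0) = 0, giving u = sin x and a first conjugate point at x = pi. Locating it numerically:

```python
import numpy as np
from scipy.integrate import solve_ivp

# Jacobi equation for F = y'^2 - y^2:  d/dx(2u') + 2u = 0, i.e. u'' + u = 0,
# with u(0) = 0, u'(0) = 1. State s = [u, u'].
sol = solve_ivp(lambda x, s: [s[1], -s[0]], (0.0, 4.0), [0.0, 1.0],
                dense_output=True, rtol=1e-10, atol=1e-12)

# The first zero of u beyond x = 0 is the conjugate point.
xs = np.linspace(0.5, 4.0, 4000)
u = sol.sol(xs)[0]
conjugate = xs[np.argmax(u < 0)]   # first sign change of u = sin(x)
print(conjugate)                   # ~ pi: longer extremals lose minimality
```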
Weak vs. Strong Local Minima
Weak Local Minimum
y* minimizes J over functions close to y* in the C^1 norm (both y and y' close). Conditions: Euler-Lagrange + Legendre + Jacobi (no conjugate points).
Strong Local Minimum
y* minimizes J over functions close to y* in the C^0 norm (only y close, y' may differ greatly). Requires additionally the Weierstrass E-function condition: E(x, y, p, q) ≥ 0.
Summary: Necessary and Sufficient Conditions
- Necessary: Euler-Lagrange equation holds (first variation = 0)
- Necessary: Legendre condition F_y'y' ≥ 0
- Necessary: Jacobi condition (no conjugate points in (a,b))
- Sufficient for weak min: Legendre (strict) + Jacobi (strict) conditions
- Sufficient for strong min: Add Weierstrass E ≥ 0
5. Constrained Variational Problems
Many variational problems include constraints — integral constraints (isoperimetric), pointwise constraints (obstacles), or differential constraints (holonomic and non-holonomic). The method of Lagrange multipliers extends naturally from finite dimensions.
Isoperimetric Constraints
The general isoperimetric problem is: minimize J[y] subject to K[y] = constant, where both J and K are integral functionals.
Method:
Introduce a constant Lagrange multiplier lambda and extremize the unconstrained functional:
H[y] = J[y] + lambda * K[y] = integral of (F + lambda G) dx
Apply the Euler-Lagrange equation to the combined Lagrangian F + lambda G. The constant lambda is determined by the constraint K[y] = c.
Holonomic Constraints
Holonomic constraints restrict the admissible functions by algebraic equations involving x and y only (not y'). In mechanics, a bead constrained to move on a wire y = f(x) provides an example.
For several dependent variables y_1, ..., y_n with holonomic constraints g_k(x, y_1, ..., y_n) = 0, use Lagrange multiplier functions lambda_k(x) (not constants) and extremize:
integral of (F + sum lambda_k g_k) dx
This yields n + m Euler-Lagrange equations for n unknowns y_i and m multiplier functions lambda_k, along with the m constraint equations.
Non-Holonomic Constraints
Non-holonomic constraints involve derivatives: g(x, y, y') = 0. These cannot in general be integrated to yield holonomic constraints, and the variational treatment is more subtle. Examples include rolling without slipping in mechanics.
The D'Alembert-Lagrange principle handles non-holonomic constraints in mechanics by working with virtual displacements compatible with the constraints. For a single non-holonomic constraint g(x, y_1, ..., y_n, y_1', ..., y_n') = 0, the Lagrange-D'Alembert equations take the form F_{y_i} - d/dx(F_{y_i'}) + lambda(x) * partial g / partial y_i' = 0 for each i, together with the constraint equation.
6. Hamilton's Principle and Lagrangian Mechanics
Hamilton's principle (principle of stationary action) reformulates Newtonian mechanics entirely in variational terms. This formulation is coordinate-independent, handles constraints naturally, and directly reveals conservation laws through symmetry.
The Action Functional
Hamilton's Principle
The true trajectory q(t) of a mechanical system is the one that makes the action stationary:
S[q] = integral from t1 to t2 of L(q, q-dot, t) dt
where L = T - V is the Lagrangian: kinetic energy minus potential energy. The condition delta S = 0 with fixed endpoints q(t1) and q(t2) yields the Euler-Lagrange equations, which are precisely Lagrange's equations of motion.
Lagrange's Equations of Motion
d/dt(partial L / partial q-dot_i) - partial L / partial q_i = 0
for each generalized coordinate q_i, i = 1, ..., n. These n second-order ODEs completely determine the motion of the system.
Example: Particle in Potential V(x):
L = (1/2)m x-dot^2 - V(x)
d/dt(m x-dot) + V'(x) = 0
m x-ddot = -V'(x) = F(x) (Newton's second law)
Advantages of the Lagrangian Formulation
Coordinate Freedom
Works in any generalized coordinates — polar, spherical, curvilinear. Constraint forces disappear automatically.
Constraint Handling
Holonomic constraints reduce the number of degrees of freedom. No need to compute normal forces.
Symmetry and Conservation
Cyclic coordinates (L independent of q_i) immediately yield conserved momenta p_i = partial L / partial q-dot_i.
Worked Example: Spherical Pendulum
A mass m on a rod of length l, constrained to a sphere. Generalized coordinates: theta (polar angle), phi (azimuthal angle).
T = (1/2) m l^2 (theta-dot^2 + sin^2(theta) phi-dot^2)
V = -mgl cos(theta)
L = T - V
Since L does not depend on phi (phi is cyclic), the azimuthal momentum p_phi = m l^2 sin^2(theta) phi-dot is conserved (angular momentum).
The theta equation: m l^2 theta-ddot - m l^2 sin(theta)cos(theta) phi-dot^2 + mgl sin(theta) = 0.
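These equations can be integrated numerically to confirm the conservation laws. The sketch below (SciPy; parameter values and initial conditions are illustrative choices) checks that both p_phi and the total energy stay constant along the trajectory:

```python
import numpy as np
from scipy.integrate import solve_ivp

m, l, g = 1.0, 1.0, 9.81

def rhs(t, s):
    """Spherical pendulum: state s = [theta, theta-dot, phi, phi-dot]."""
    th, thd, ph, phd = s
    thdd = np.sin(th) * np.cos(th) * phd**2 - (g / l) * np.sin(th)
    phdd = -2.0 * (np.cos(th) / np.sin(th)) * thd * phd   # from p_phi conservation
    return [thd, thdd, phd, phdd]

s0 = [1.0, 0.0, 0.0, 2.0]
sol = solve_ivp(rhs, (0.0, 10.0), s0, rtol=1e-10, atol=1e-12)
th, thd, ph, phd = sol.y

p_phi = m * l**2 * np.sin(th)**2 * phd                    # cyclic-coordinate momentum
E = 0.5 * m * l**2 * (thd**2 + np.sin(th)**2 * phd**2) - m * g * l * np.cos(th)
print(np.ptp(p_phi), np.ptp(E))   # both drifts ~ 0
```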
7. Hamiltonian Mechanics
Hamiltonian mechanics reformulates Lagrangian mechanics by replacing generalized velocities q-dot with conjugate momenta p via the Legendre transform. The result is an equivalent but geometrically richer description on phase space.
The Legendre Transform
Conjugate Momentum:
p_i = partial L / partial q-dot_i
Hamiltonian:
H(q, p, t) = sum_i p_i q-dot_i - L(q, q-dot, t)
The Legendre transform converts L (a function of (q, q-dot, t)) to H (a function of (q, p, t)). For natural mechanical systems (T quadratic in q-dot, V independent of q-dot): H = T + V = total energy.
Hamilton's Equations
Hamilton's Canonical Equations
q-dot_i = partial H / partial p_i
p-dot_i = -partial H / partial q_i
These 2n first-order ODEs replace Lagrange's n second-order ODEs. The state is a point (q, p) in 2n-dimensional phase space, and Hamilton's equations define a flow on phase space.
Phase Space and Liouville's Theorem
Phase space is the 2n-dimensional space with coordinates (q_1, ..., q_n, p_1, ..., p_n). Hamiltonian flow defines a vector field on phase space. Liouville's theorem states that this flow preserves phase space volume:
div(q-dot, p-dot) = sum_i (partial^2 H / partial q_i partial p_i - partial^2 H / partial p_i partial q_i) = 0
By the divergence theorem, any volume in phase space is preserved under Hamiltonian flow. This has profound consequences: the system is incompressible in phase space, preventing trajectories from converging to attractors (a Hamiltonian system cannot have asymptotically stable fixed points with a basin of attraction).
Poisson Brackets
Definition:
{f, g} = sum_i (partial f/partial q_i * partial g/partial p_i - partial f/partial p_i * partial g/partial q_i)
Time Evolution:
df/dt = {f, H} + partial f/partial t
A quantity f is conserved if and only if {f, H} = 0 and f has no explicit time dependence. Canonical coordinates satisfy the fundamental Poisson bracket relations: {q_i, p_j} = delta_ij, {q_i, q_j} = 0, {p_i, p_j} = 0.
Worked Example: Simple Harmonic Oscillator
L = (1/2)m q-dot^2 - (1/2)k q^2
p = m q-dot, so q-dot = p/m
H = p^2/(2m) + (1/2)k q^2
q-dot = partial H/partial p = p/m
p-dot = -partial H/partial q = -kq
Phase space trajectories are ellipses: p^2/(2mE) + kq^2/(2E) = 1. The trajectory of energy E encloses phase-space area 2 pi E / omega, where omega = sqrt(k/m); Hamiltonian flow carries any region of phase space along these ellipses without changing its area (Liouville).
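Hamilton's equations for this system integrate easily, and conservation of H confirms that each trajectory stays on its energy ellipse. A minimal sketch (SciPy; m, k, and the initial condition are illustrative choices):

```python
import numpy as np
from scipy.integrate import solve_ivp

m, k = 1.0, 4.0
H = lambda q, p: p**2 / (2 * m) + 0.5 * k * q**2

# Hamilton's equations: q-dot = dH/dp = p/m, p-dot = -dH/dq = -k q
sol = solve_ivp(lambda t, s: [s[1] / m, -k * s[0]], (0.0, 20.0), [1.0, 0.0],
                rtol=1e-10, atol=1e-12)
q, p = sol.y
E = H(q, p)
print(E[0], np.ptp(E))   # E = 2.0 throughout: the trajectory stays on one ellipse
```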
8. Noether's Theorem
Emmy Noether proved in 1915 (published 1918) one of the most beautiful results in mathematical physics: every continuous symmetry of the action functional corresponds to a conserved quantity. This theorem unifies the conservation laws of classical mechanics and extends to field theory and general relativity.
Symmetries and Conservation Laws
Noether's Theorem (Classical Mechanics)
If the action S[q] is invariant under a one-parameter family of transformations q_i(t) to Q_i(q, t, epsilon) (with Q_i = q_i at epsilon = 0), then the quantity:
I = sum_i p_i * (dQ_i/d epsilon)|_(epsilon=0)
is conserved along every solution of the equations of motion (dI/dt = 0). Here p_i = partial L / partial q-dot_i is the conjugate momentum.
The Three Fundamental Conservation Laws
Time Translation
If L has no explicit time dependence (t to t + epsilon), then:
E = sum_i p_i q-dot_i - L = H
is conserved. This is energy conservation.
Spatial Translation
If L is invariant under q to q + epsilon e (shift in direction e), then:
p * e = sum_i m_i q-dot_i * e
is conserved. This is linear momentum conservation.
Rotation
If L is invariant under rotations about axis n, then:
L * n = sum_i (r_i x m_i v_i) * n
is conserved. This is angular momentum conservation.
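As a numerical illustration of the rotation case (a sketch; the central potential V = -1/r and the initial conditions are illustrative choices, not from the text): since a central potential is rotationally invariant, Noether's theorem predicts that L_z = m(x v_y - y v_x) is constant along every trajectory.

```python
import numpy as np
from scipy.integrate import solve_ivp

m = 1.0

def rhs(t, s):
    """Planar particle in V(r) = -1/r; state s = [x, y, vx, vy]."""
    x, y, vx, vy = s
    r3 = (x * x + y * y) ** 1.5
    return [vx, vy, -x / r3, -y / r3]   # acceleration = -grad V / m

sol = solve_ivp(rhs, (0.0, 20.0), [1.0, 0.0, 0.0, 1.2],
                rtol=1e-10, atol=1e-12)
x, y, vx, vy = sol.y
L_z = m * (x * vy - y * vx)             # Noether charge for rotations
print(np.ptp(L_z))                      # ~ 0: angular momentum conserved
```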
Gauge Symmetries and Field Theory
In field theory, Noether's theorem applies to both global and local (gauge) symmetries. For a field phi(x, t) with Lagrangian density L:
Each continuous symmetry yields a conserved current j^mu (density and flux) satisfying the continuity equation:
partial_mu j^mu = 0 (Einstein summation)
The conserved charge Q = integral of j^0 d^3x is the spatial integral of the charge density. In electromagnetism, U(1) gauge symmetry yields charge conservation. In the Standard Model, SU(3) x SU(2) x U(1) gauge symmetries yield the conservation laws governing the strong, weak, and electromagnetic forces.
Exam Tip: Identifying Symmetries
To apply Noether's theorem: (1) identify a one-parameter transformation under which L changes by at most a total time derivative; (2) compute dQ_i/d epsilon at epsilon = 0; (3) write I = sum p_i (dQ_i/d epsilon). If L is unchanged (delta L = 0), then dI/dt = 0 directly. If L changes by a total derivative d(F)/dt, include a correction term -F in the conserved quantity.
9. Direct Methods and Existence Theory
The Euler-Lagrange equation is a necessary condition for an extremum, but it does not guarantee that a minimizer exists. The direct methods of the calculus of variations provide a general framework for proving existence without solving the Euler-Lagrange equation.
Sobolev Spaces
Classical function spaces (C^k) are too small and too weak for existence theory. Sobolev spaces provide the right framework.
Definition of W^(1,2)(Omega) = H^1(Omega):
H^1(Omega) = { u in L^2(Omega) : weak gradient Du in L^2(Omega) }
with inner product: (u, v)_H1 = integral of (u*v + Du * Dv) dx. This is a Hilbert space — complete, with an inner product.
Sobolev Embedding:
In dimension n, H^1(Omega) embeds continuously into L^(2n/(n-2))(Omega) for n ≥ 3 (Sobolev embedding), and compactly into L^2(Omega) by the Rellich-Kondrachov theorem. These compact embeddings are crucial for the direct method.
The Direct Method of the Calculus of Variations
Direct Method Strategy
- Take a minimizing sequence: J[u_k] to inf J as k to infinity
- Show the sequence is bounded in H^1 (using coercivity: J[u] ≥ c_1 ||u||_H1^2 - c_2 for constants c_1 greater than 0)
- Extract a weakly convergent subsequence: u_k converges weakly to u* in H^1
- Use weak lower semicontinuity: J[u*] ≤ lim inf J[u_k] = inf J
- Conclude: u* achieves the infimum, hence is a minimizer
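In finite dimensions the same strategy becomes a practical algorithm: discretize the functional and minimize over nodal values (the Ritz idea). The sketch below (SciPy; grid size and optimizer are illustrative choices) minimizes J[y] = integral of (y'^2 + 2xy) dx with y(0) = 0, y(1) = 1, whose exact extremal is y = x^3/6 + 5x/6:

```python
import numpy as np
from scipy.optimize import minimize

n = 101
x = np.linspace(0.0, 1.0, n)
h = x[1] - x[0]

def J(interior):
    """Discretized J[y] = integral of (y'^2 + 2xy) dx with y(0)=0, y(1)=1:
    forward differences for y', midpoint quadrature for the integral."""
    y = np.concatenate(([0.0], interior, [1.0]))
    yp = np.diff(y) / h
    xm = 0.5 * (x[:-1] + x[1:])
    ym = 0.5 * (y[:-1] + y[1:])
    return np.sum((yp**2 + 2 * xm * ym) * h)

res = minimize(J, x[1:-1], method="L-BFGS-B")   # minimize over nodal values
y_num = np.concatenate(([0.0], res.x, [1.0]))
y_exact = x**3 / 6 + 5 * x / 6
print(np.max(np.abs(y_num - y_exact)))          # small discretization error
```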
Weak Lower Semicontinuity
The critical property needed in Step 4 is weak lower semicontinuity of J. For integral functionals, this follows from convexity:
Tonelli's Theorem:
If F(x, u, p) is convex in p (the gradient variable) and satisfies appropriate growth conditions (coercivity: F ≥ alpha |p|^r - beta for some alpha greater than 0, r greater than 1), then the functional J[u] = integral of F(x, u, Du) dx is weakly lower semicontinuous on W^(1,r)(Omega) and attains its minimum on closed convex subsets.
Quasiconvexity (Morrey):
For vector-valued problems (elasticity, geometric problems), convexity in p must be replaced by quasiconvexity, a weaker condition introduced by Morrey. Quasiconvexity is necessary and (with growth conditions) sufficient for weak lower semicontinuity of the functional.
Regularity Theory
The direct method produces a weak minimizer in a Sobolev space. Elliptic regularity theory then upgrades this to a classical solution:
If u* in H^1 minimizes J and the Lagrangian F is smooth with F_pp uniformly positive definite, then u* is smooth; for scalar problems this is De Giorgi-Nash-Moser regularity for uniformly elliptic equations.
For systems (vector-valued u), only partial regularity holds in general: u* is smooth outside a closed set of measure zero (Giusti-Miranda for nonlinear elliptic systems, Evans for strongly quasiconvex integrands).
10. Applications Across Science
The calculus of variations appears throughout physics, engineering, and mathematics. The following applications illustrate the breadth of the subject.
Fermat's Principle in Optics
Light travels between two points along the path that minimizes (or more precisely, makes stationary) the optical path length, which is the travel time times the speed of light:
T[y] = (1/c) integral of n(x, y) * sqrt(1 + y'^2) dx
where n(x, y) is the refractive index. The Euler-Lagrange equation yields the ray equation, which in a homogeneous medium (n = constant) gives straight lines. At an interface between two media, Snell's law follows as a consequence:
n_1 sin(theta_1) = n_2 sin(theta_2)
General Relativity: Geodesics in Spacetime
In general relativity, freely falling particles and light rays follow geodesics of the spacetime metric g_mu nu. Massive particles extremize proper time:
tau[x] = integral of sqrt(-g_mu nu (dx^mu/dlambda)(dx^nu/dlambda)) dlambda
The Euler-Lagrange equations yield the geodesic equation with Christoffel symbols encoding the curvature of spacetime. In the Schwarzschild metric, geodesics account for the perihelion precession of Mercury and the bending of light by the sun.
Optimal Control Theory: Pontryagin's Maximum Principle
Optimal control extends the calculus of variations to problems with control variables and state dynamics constraints:
Problem Formulation:
Minimize: J[u] = integral of f_0(x, u, t) dt + phi(x(T))
Subject to: x-dot = f(x, u, t), x(0) = x_0, u(t) in U
Pontryagin's Maximum Principle:
Introduce costate variables p(t) (analogous to momenta) and the control Hamiltonian H(x, p, u) = p * f(x, u, t) - f_0(x, u, t). Optimal control u* must maximize H at every time t:
H(x*, p*, u*) = max over u in U of H(x*, p*, u)
Bang-Bang Control:
When the Hamiltonian is linear in u, the optimal control often takes values only at the boundary of the constraint set U (the bang-bang principle). This is why minimum-time problems for rockets and missiles involve switching between full thrust and zero thrust at optimal switching times.
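A minimal bang-bang sketch (illustrative, not from the text): for the double integrator x'' = u with |u| <= 1, the time-optimal rest-to-rest transfer over distance d applies full thrust for half the time and full braking for the other half, with optimal time T = 2 sqrt(d). Simulating the switching control:

```python
import numpy as np

d = 9.0                       # target displacement (illustrative)
T = 2.0 * np.sqrt(d)          # known optimal time: accelerate half, brake half

dt = 1e-4
t = np.arange(0.0, T, dt)
u = np.where(t < T / 2, 1.0, -1.0)   # bang-bang: single switch at t = T/2
v = np.cumsum(u) * dt                # velocity by Euler integration
xpos = np.cumsum(v) * dt             # position
print(xpos[-1], v[-1])               # ~ (9, 0): reaches the target at rest
```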
Elasticity Theory
The equilibrium configuration of an elastic body minimizes the total elastic energy:
E[u] = integral over Omega of W(Du) dx - integral over partial Omega of g * u dS
where u is the displacement vector, W(F) is the stored energy density (a function of the deformation gradient F = Du), and g is the surface traction. For linear elasticity, W(F) is quadratic in the symmetric strain tensor epsilon = (F + F^T)/2.
The Euler-Lagrange equations yield the Lamé-Navier equations of linear elasticity. For nonlinear elasticity (rubber, biological tissue), quasiconvexity of W is the condition for well-posedness of the variational problem.
Quantum Mechanics: The Rayleigh-Ritz Method
Variational Principle for Ground State Energy:
E_0 ≤ (psi, H psi) / (psi, psi) for any trial function psi
The ground state energy E_0 is the minimum of the Rayleigh quotient over all nonzero functions psi in the domain of H. Minimizing over a parametric family of trial functions (Ritz method) gives upper bounds for E_0 and approximate ground states.
The Rayleigh-Ritz method is the foundation of finite element methods and density functional theory (DFT) in computational chemistry.
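A numerical sketch of the Ritz bound (units hbar = m = omega = 1, so H = -(1/2) d^2/dx^2 + (1/2) x^2 with exact ground energy 1/2; the grid and the Gaussian trial family are illustrative choices): every trial function gives a Rayleigh quotient at or above 1/2, with equality at the exact ground state a = 1/2.

```python
import numpy as np

x = np.linspace(-8.0, 8.0, 1601)
h = x[1] - x[0]

def rayleigh(psi):
    """Rayleigh quotient <psi|H|psi>/<psi|psi> for H = -1/2 d^2/dx^2 + 1/2 x^2,
    using integration by parts: <T> = (1/2) integral of |psi'|^2 dx."""
    kinetic = 0.5 * np.sum(np.gradient(psi, x) ** 2) * h
    potential = 0.5 * np.sum(x**2 * psi**2) * h
    return (kinetic + potential) / (np.sum(psi**2) * h)

# Gaussian trial family psi_a = exp(-a x^2); the minimum sits at a = 1/2.
bounds = {a: rayleigh(np.exp(-a * x**2)) for a in (0.3, 0.5, 0.8)}
print(bounds)   # every value is an upper bound on the exact E_0 = 0.5
```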
Image Processing: Total Variation
Rudin-Osher-Fatemi Model:
Minimize: TV[u] + (lambda/2)||u - f||_L2^2
where TV[u] = integral of |Du| dx is the total variation (a variational analogue of arc length), f is a noisy image, and u is the denoised output. The total variation functional preserves sharp edges while smoothing noise — a crucial property for image processing.
The Euler-Lagrange equation: -div(Du/|Du|) + lambda(u - f) = 0, a nonlinear PDE solved numerically via iterative splitting methods.
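A 1D sketch of the ROF model (illustrative choices throughout: |Du| is smoothed to sqrt(Du^2 + eps^2) so the Euler-Lagrange term is well defined, and plain gradient descent stands in for the iterative splitting methods used in practice):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
clean = np.where(np.arange(n) < n // 2, 0.0, 1.0)   # step edge
f = clean + 0.1 * rng.standard_normal(n)             # noisy observation

lam, eps, step = 20.0, 1e-2, 1e-3
u = f.copy()
for _ in range(5000):
    du = np.diff(u)
    flux = du / np.sqrt(du**2 + eps**2)                   # smoothed Du/|Du|
    div = np.diff(np.concatenate(([0.0], flux, [0.0])))   # discrete divergence
    u -= step * (-div + lam * (u - f))                    # descend the EL gradient

print(np.mean((f - clean)**2), np.mean((u - clean)**2))   # MSE drops after denoising
```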
Practice Problems
Find the extremals of the functional: J[y] = integral from 0 to 1 of (y'^2 + 2xy) dx with boundary conditions y(0) = 0 and y(1) = 1.
Solution:
Here F(x, y, y') = y'^2 + 2xy. Compute: F_y = 2x and F_y' = 2y'.
Euler-Lagrange: 2x - d/dx(2y') = 0, so y'' = x.
Integrating twice: y' = x^2/2 + C_1, then y = x^3/6 + C_1 x + C_2.
Apply BCs: y(0) = 0 gives C_2 = 0. y(1) = 1 gives 1/6 + C_1 = 1, so C_1 = 5/6.
y(x) = x^3/6 + (5/6)x
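The solution can be checked symbolically (a sketch using SymPy; it verifies the Euler-Lagrange equation and both boundary conditions):

```python
import sympy as sp

x = sp.symbols('x')
y = sp.Function('y')
yp = sp.diff(y(x), x)
F = yp**2 + 2 * x * y(x)

# Euler-Lagrange expression: F_y - d/dx F_{y'}  (should vanish on extremals)
EL = sp.diff(F, y(x)) - sp.diff(sp.diff(F, yp), x)

candidate = x**3 / 6 + sp.Rational(5, 6) * x
check = sp.simplify(EL.subs(y(x), candidate).doit())
print(check)   # 0: the candidate satisfies the Euler-Lagrange equation
```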
Find the curve y(x) of minimum arc length between (0, 1) and (1, 2) using the Beltrami identity. Verify that the answer is a straight line.
Solution:
F = sqrt(1 + y'^2) is independent of x. Beltrami identity: F - y' F_y' = C.
sqrt(1 + y'^2) - y' * y'/sqrt(1 + y'^2) = C
(1 + y'^2 - y'^2) / sqrt(1 + y'^2) = 1/sqrt(1 + y'^2) = C
So sqrt(1 + y'^2) = 1/C, meaning y' = sqrt(1/C^2 - 1) = constant. Hence y = mx + b (straight line).
BCs: y(0) = 1 gives b = 1. y(1) = 2 gives m + 1 = 2, so m = 1.
y(x) = x + 1
A mass m slides on a frictionless inclined plane of angle alpha. Using the Lagrangian formulation with coordinate x along the plane, find the equation of motion and solve for x(t).
Solution:
Height above reference: h = -x sin(alpha) (x increases down the slope).
T = (1/2) m x-dot^2
V = mgh = -mgx sin(alpha)
L = T - V = (1/2)m x-dot^2 + mgx sin(alpha)
Lagrange's equation: d/dt(m x-dot) - mg sin(alpha) = 0
x-ddot = g sin(alpha)
Integrating with x(0) = 0, x-dot(0) = 0:
x(t) = (1/2) g sin(alpha) t^2
The Lagrangian for a free particle in 2D is L = (1/2)m(x-dot^2 + y-dot^2). Identify the symmetry corresponding to translations in the x-direction and use Noether's theorem to find the associated conserved quantity.
Solution:
The transformation is: x to x + epsilon, y to y (with epsilon small). Under this transformation:
L = (1/2)m((x + epsilon)-dot^2 + y-dot^2) = (1/2)m(x-dot^2 + y-dot^2), since epsilon is constant in time, so (x + epsilon)-dot = x-dot.
L is unchanged (delta L = 0), confirming translation symmetry.
dQ_x / d epsilon at epsilon = 0: d(x + epsilon)/d epsilon = 1 (for x-component), 0 for y.
Noether charge: I = p_x * 1 + p_y * 0 = p_x = m x-dot
I = m x-dot (x-momentum) is conserved
Find the curve y(x) of length L from (-a, 0) to (a, 0) that encloses the maximum area with the x-axis. Set up the functional and identify the Lagrange multiplier equation.
Solution:
Maximize A[y] = integral of y dx subject to L[y] = integral of sqrt(1 + y'^2) dx = L.
Form the augmented functional with Lagrange multiplier lambda:
H[y] = integral of (y + lambda sqrt(1 + y'^2)) dx
Euler-Lagrange for H: F_y - d/dx(F_y') = 0 where F = y + lambda sqrt(1 + y'^2).
1 - d/dx(lambda y'/sqrt(1 + y'^2)) = 0
This is the equation for a circle of radius R = |lambda|. Parametrically: x = R sin(theta) + x_0, y = -R cos(theta) + y_0.
Answer: a circular arc whose radius R is fixed by the endpoint and length constraints; if the arc subtends central angle 2 theta_0, then R = L/(2 theta_0).
Exam Tips and Common Mistakes
1. Check for the Beltrami Identity First
Before computing full Euler-Lagrange equations, check if F is independent of x. If so, use F - y' F_y' = C to reduce to a first-order ODE. This applies to brachistochrone, geodesics, and minimal surface problems.
2. Don't Forget Boundary Terms
When endpoints are free or on a curve, the integration-by-parts step produces boundary terms. Setting F_y' = 0 at free endpoints is a necessary condition often missed in exams. For endpoints on a prescribed curve, write out the transversality condition explicitly.
3. Euler-Lagrange vs. Sufficient Conditions
Solving the Euler-Lagrange equation gives candidates for extrema, not confirmed minima. For exam problems asking for a "minimum," verify the Legendre condition (F_y'y' ≥ 0) and note that many textbook problems on bounded domains have unique extremals that must be the minimum by comparison.
4. Cyclic Coordinates in Lagrangian Mechanics
If the Lagrangian L is independent of a generalized coordinate q_i (q_i is cyclic), then p_i = partial L / partial q-dot_i is an immediate first integral. Always identify cyclic coordinates before writing Lagrange's equations — this reduces the order of the system.
5. Hamiltonian = Energy (When It Applies)
H = T + V (total energy) only when: (a) constraints are time-independent, (b) kinetic energy is quadratic in velocities, and (c) potential energy is independent of velocities. If the Lagrangian has explicit time dependence or velocity-dependent potentials (e.g., electromagnetic), H does not equal T + V.
6. Noether's Theorem: Total Derivative vs. Zero
The action is invariant if L changes by at most a total time derivative dF/dt under the transformation. This is more general than L being unchanged. If delta L = dF/dt, the conserved quantity is I = sum p_i (dQ_i/d epsilon) - F (subtract the function F that appears in the total derivative).
7. Sign Convention for Brachistochrone
In the brachistochrone, be careful with the orientation of y. If y is measured downward (positive direction of descent), then v = sqrt(2gy) and the functional is T = integral sqrt((1 + y'^2) / (2gy)) dx. With y measured upward, the signs change. Clarify the convention at the start of any brachistochrone problem.
Quick Reference: Key Formulas
Euler-Lagrange Equation
F_y - d/dx(F_y') = 0
Beltrami Identity (F = F(y, y'))
F - y' F_y' = constant
Hamilton's Equations
q-dot_i = H_p_i, p-dot_i = -H_q_i
Legendre Transform
H = sum p_i q-dot_i - L
Noether Conserved Quantity
I = sum p_i (dQ_i/d eps)|_(eps=0)
Isoperimetric Inequality
4 pi A ≤ L^2 (equality iff circle)
Legendre Condition
F_y'y' ≥ 0 (necessary for min)
Poisson Bracket
{f, g} = sum(f_q g_p - f_p g_q)
Related Topics
Differential Equations
The Euler-Lagrange equation is an ODE. Master ODEs, phase plane analysis, and boundary value problems — the backbone of variational problem-solving.
Real Analysis
Functional analysis, weak convergence, and semicontinuity underpin the direct methods. Real analysis provides the rigorous foundations for existence theory.
Partial Differential Equations
Many variational problems lead to elliptic PDEs. Sobolev spaces, weak solutions, and elliptic regularity connect the calculus of variations to modern PDE theory.
Frequently Asked Questions
How is the calculus of variations different from ordinary calculus?
Ordinary calculus optimizes a function f(x) over real numbers — finding x* where f'(x*) = 0. The calculus of variations optimizes a functional J[y] over a space of functions — finding y*(x) where the first variation delta J = 0. The Euler-Lagrange equation is the functional analogue of f'(x) = 0. The main new difficulty is that function spaces are infinite-dimensional, requiring more sophisticated analysis for existence and regularity.
Why does the cycloid solve the brachistochrone problem?
The cycloid is the solution because it optimally trades off two competing effects: steeper descent accelerates the bead faster, but longer paths take more time. The cycloid is steep near the start (to quickly build speed) and nearly flat near the end (where high speed makes distance cheap in time). The Beltrami identity — the first integral of the Euler-Lagrange equation for autonomous problems — reduces the problem to a first-order ODE whose general solution is exactly the cycloid family.
What is the relationship between the Lagrangian and Hamiltonian formulations?
Lagrangian mechanics works in configuration space (q, q-dot, t) and uses generalized velocities as the state description. Hamiltonian mechanics works in phase space (q, p, t) and uses conjugate momenta. The Legendre transform converts between the two: p_i = partial L / partial q-dot_i and H = sum p_i q-dot_i - L. Both formulations are equivalent for mechanical systems, but the Hamiltonian framework is more natural for perturbation theory, quantum mechanics (where p becomes -i hbar d/dq), and geometric mechanics (symplectic geometry).
Where does the calculus of variations appear in modern mathematics?
The calculus of variations has evolved into several modern fields: (1) Optimal control theory (Pontryagin, Bellman dynamic programming); (2) Geometric measure theory (area-minimizing surfaces, Plateau's problem); (3) Variational PDE methods (weak solutions, Sobolev spaces, mountain pass theorem); (4) Geometric analysis (Ricci flow, harmonic maps, Einstein metrics); (5) Mathematical physics (quantum field theory via path integrals, string theory action functionals); (6) Numerical methods (finite element methods are variational in nature); (7) Machine learning (training neural networks minimizes a loss functional over weight functions).