Functions of two variables, partial derivatives, gradient vectors, directional derivatives, critical points, the second derivative test, and Lagrange multipliers — the complete guide.
A function f(x, y) maps each point (x, y) in the domain to a single output value z. The graph of z = f(x, y) is a surface in three-dimensional space.
D ⊆ ℝ²
The set of all (x, y) pairs where the function is defined. Exclude points that cause division by zero or put a negative number under a square root.
z = f(x, y)
The set of all output values z. For z = √(1 − x² − y²) the range is [0, 1], restricted by the domain disk x² + y² ≤ 1.
S: z = f(x, y)
A surface in ℝ³. Each point (x, y) in the domain lifts to a point (x, y, f(x, y)) on the surface.
| Function | Domain | Surface Shape |
|---|---|---|
| f(x,y) = x² + y² | All of ℝ² | Circular paraboloid |
| f(x,y) = x² − y² | All of ℝ² | Hyperbolic paraboloid (saddle) |
| f(x,y) = √(4 − x² − y²) | x² + y² ≤ 4 | Upper hemisphere, radius 2 |
| f(x,y) = 1/(x² + y²) | (x,y) ≠ (0,0) | Funnel rising without bound toward the origin |
| f(x,y) = sin(x)cos(y) | All of ℝ² | Undulating wave surface |
A level curve of f(x, y) at height c is the set of all (x, y) where f(x, y) = c. A contour map is a collection of level curves for several values of c, showing the shape of the surface from above.
Set f(x, y) = c and identify the curve type:
x² + y² = c → circle (c > 0)
x² − y² = c → hyperbola
y = x² − c → parabola
ax + by = c → line
Example: Level Curves of f(x, y) = x² + 4y²
Level curve at c = 4: x² + 4y² = 4, or x²/4 + y²/1 = 1 — an ellipse with semi-axes 2 and 1.
At c = 16: x²/16 + y²/4 = 1 — a larger ellipse with semi-axes 4 and 2.
The surface is an elliptic paraboloid; each level curve is an ellipse scaled by √c.
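The scaling claim can be checked with a short Python sketch. The helper name `level_curve_semi_axes` is made up for this illustration. It rewrites x² + 4y² = c as x²/c + y²/(c/4) = 1 and reads off the semi-axes √c and √c/2.

```python
import math

def f(x, y):
    """The example surface f(x, y) = x^2 + 4y^2."""
    return x**2 + 4 * y**2

def level_curve_semi_axes(c):
    """Semi-axes (along x, along y) of the ellipse x^2 + 4y^2 = c, for c > 0."""
    if c <= 0:
        raise ValueError("c must be positive for the level curve to be an ellipse")
    return math.sqrt(c), math.sqrt(c) / 2

for c in (4, 16):
    a, b = level_curve_semi_axes(c)
    # The axis endpoints (a, 0) and (0, b) should both lie on the level curve f = c.
    print(c, (a, b), f(a, 0), f(0, b))
```

Both semi-axes grow in proportion to √c, so quadrupling c doubles the size of the ellipse.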
Partial derivatives measure the rate of change of f with respect to one variable while holding all others fixed. They are the multivariable generalization of the ordinary derivative.
∂f/∂x = fₓ = f₁ = ∂z/∂x
∂f/∂y = fᵧ = f₂ = ∂z/∂y
∂²f/∂x² = fₓₓ (second partial)
∂²f/∂y² = fᵧᵧ (second partial)
∂²f/∂y∂x = fₓᵧ (mixed partial)
fₓ(a, b) is the slope of the surface in the x-direction at the point (a, b, f(a,b)) — the slope of the curve cut by the plane y = b. Similarly, fᵧ(a, b) is the slope in the y-direction at that point, cut by the plane x = a.
Rule
To find ∂f/∂x: treat every occurrence of y as a constant, then differentiate with respect to x using ordinary rules (power, product, chain, etc.).
Power functions
f = xⁿyᵐ → fₓ = nxⁿ⁻¹yᵐ (y is a constant coefficient), fᵧ = mxⁿyᵐ⁻¹
Trig / exponential
f = sin(xy) → fₓ = y·cos(xy) (chain rule, y is the inner derivative), fᵧ = x·cos(xy)
Chain rule applies
f = e^(x²y) → fₓ = 2xy·e^(x²y), fᵧ = x²·e^(x²y)
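One way to sanity-check a hand-computed partial derivative is to compare it with a finite-difference estimate. The sketch below uses central differences. The helper names `partial_x` and `partial_y`, the step size `h`, and the test point are choices made for this illustration only.

```python
import math

def partial_x(f, x, y, h=1e-6):
    """Central-difference estimate of df/dx at (x, y), holding y fixed."""
    return (f(x + h, y) - f(x - h, y)) / (2 * h)

def partial_y(f, x, y, h=1e-6):
    """Central-difference estimate of df/dy at (x, y), holding x fixed."""
    return (f(x, y + h) - f(x, y - h)) / (2 * h)

def f(x, y):
    return math.sin(x * y)

x0, y0 = 0.7, 1.3  # arbitrary test point

approx_fx = partial_x(f, x0, y0)
exact_fx = y0 * math.cos(x0 * y0)   # f_x = y cos(xy) from the rule above
approx_fy = partial_y(f, x0, y0)
exact_fy = x0 * math.cos(x0 * y0)   # f_y = x cos(xy)

print(approx_fx, exact_fx)
print(approx_fy, exact_fy)
```

The estimates should agree with the formulas to several decimal places. A large mismatch usually means a missed chain-rule factor.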
If f and its partial derivatives fₓ, fᵧ, fₓᵧ, fᵧₓ are all continuous on a region, then the mixed partials are equal there: fₓᵧ = fᵧₓ.
Order of differentiation does not matter for well-behaved functions.
fₓₓ = ∂²f/∂x² → diff. fₓ w.r.t. x
fᵧᵧ = ∂²f/∂y² → diff. fᵧ w.r.t. y
fₓᵧ = ∂²f/∂y∂x → diff. fₓ w.r.t. y
fᵧₓ = ∂²f/∂x∂y → diff. fᵧ w.r.t. x
fₓᵧ = fᵧₓ by Clairaut's theorem
Example: All Second Partials of f(x, y) = x³y − 2xy²
fₓ = 3x²y − 2y²
fᵧ = x³ − 4xy
fₓₓ = 6xy
fᵧᵧ = −4x
fₓᵧ = 3x² − 4y (diff. fₓ w.r.t. y)
fᵧₓ = 3x² − 4y (diff. fᵧ w.r.t. x) — equal, as expected ✓
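As a numerical cross-check of this example, the sketch below differentiates the hand-computed fₓ with respect to y and fᵧ with respect to x using central differences. The helpers `d_dx` and `d_dy`, the step size, and the test point are assumptions of this illustration.

```python
def fx(x, y):
    """f_x for f = x^3 y - 2 x y^2, as computed in the example."""
    return 3 * x**2 * y - 2 * y**2

def fy(x, y):
    """f_y for the same f."""
    return x**3 - 4 * x * y

def d_dx(g, x, y, h=1e-5):
    """Central-difference derivative of g with respect to x."""
    return (g(x + h, y) - g(x - h, y)) / (2 * h)

def d_dy(g, x, y, h=1e-5):
    """Central-difference derivative of g with respect to y."""
    return (g(x, y + h) - g(x, y - h)) / (2 * h)

x0, y0 = 1.5, -0.5             # arbitrary test point
fxy = d_dy(fx, x0, y0)         # differentiate f_x with respect to y
fyx = d_dx(fy, x0, y0)         # differentiate f_y with respect to x
expected = 3 * x0**2 - 4 * y0  # the closed form 3x^2 - 4y

print(fxy, fyx, expected)
```

Both estimates should match 3x² − 4y at the test point, up to rounding error.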
The gradient of f(x, y) is the vector formed by its partial derivatives, ∇f = (fₓ, fᵧ). It encodes all first-order rate-of-change information into a single geometric object.
∇f points in the direction of greatest increase of f. Moving in direction ∇f climbs the surface as steeply as possible.
−∇f points in the direction of greatest decrease. Gradient descent algorithms move in −∇f to minimize a function.
∇f at a point (a, b) is perpendicular to the level curve f(x, y) = c passing through that point. This is fundamental to implicit differentiation and tangent lines.
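To make the role of −∇f concrete, here is a minimal gradient descent sketch. The function f(x, y) = (x − 1)² + (y − 2)², the fixed step size 0.1, and the iteration count are all chosen for illustration. Practical implementations usually adapt the step size.

```python
def grad_f(x, y):
    """Gradient of the illustrative function f(x, y) = (x - 1)**2 + (y - 2)**2."""
    return 2 * (x - 1), 2 * (y - 2)

def gradient_descent(grad, start, step=0.1, iterations=100):
    """Repeatedly step along -grad, the direction of steepest decrease."""
    x, y = start
    for _ in range(iterations):
        gx, gy = grad(x, y)
        x, y = x - step * gx, y - step * gy
    return x, y

x_min, y_min = gradient_descent(grad_f, start=(0.0, 0.0))
print(x_min, y_min)  # close to the minimizer (1, 2)
```

With this step size, each iteration shrinks the distance to (1, 2) by a factor of 0.8, so the iterates converge quickly for this particular bowl-shaped function.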
The directional derivative gives the rate of change of f at a point in any specified direction, not just along the coordinate axes.
û = (u₁, u₂) must be a unit vector: |û| = 1
Compute the gradient
Find fₓ and fᵧ, then evaluate at the given point (a, b).
Normalize the direction vector
If given v = (v₁, v₂), compute û = v/|v| = (v₁, v₂)/√(v₁² + v₂²).
Take the dot product
Dᵤf = ∇f · û = fₓu₁ + fᵧu₂.
Example: f(x,y) = x²y, at (2, 3), direction v = (1, 2)
∇f = (2xy, x²) → at (2,3): ∇f = (12, 4)
|v| = √(1² + 2²) = √5 → û = (1/√5, 2/√5)
Dᵤf = 12·(1/√5) + 4·(2/√5) = (12 + 8)/√5 = 20/√5 = 4√5 ≈ 8.94
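The three steps translate directly into code. This sketch reproduces the example. The function name `directional_derivative` is a hypothetical helper, and the gradient is entered by hand rather than derived symbolically.

```python
import math

def directional_derivative(grad, v):
    """Rate of change in the direction of v: normalize v, then dot with the gradient."""
    norm = math.hypot(v[0], v[1])
    if norm == 0:
        raise ValueError("direction vector must be nonzero")
    u = (v[0] / norm, v[1] / norm)  # unit vector in the direction of v
    return grad[0] * u[0] + grad[1] * u[1]

def grad_f(x, y):
    """Gradient of f(x, y) = x^2 y, computed by hand: (2xy, x^2)."""
    return 2 * x * y, x**2

result = directional_derivative(grad_f(2, 3), (1, 2))
print(result, 4 * math.sqrt(5))  # both values should agree
```

Normalizing inside the function means callers can pass any nonzero direction vector without getting a result scaled by its length.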
Maximum Directional Derivative
Max Dᵤf = |∇f|, achieved when û = ∇f/|∇f| (direction of gradient).
Minimum Directional Derivative
Min Dᵤf = −|∇f|, achieved when û = −∇f/|∇f| (opposite to gradient).
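These two bounds can be checked by brute force: sample many unit vectors and compare the largest and smallest rates with ±|∇f|. The sketch uses the gradient (12, 4) from the example f = x²y at (2, 3). The sample count of 3600 is arbitrary.

```python
import math

grad = (12.0, 4.0)  # gradient of f = x^2 y at (2, 3)

samples = 3600
rates = []
for i in range(samples):
    theta = 2 * math.pi * i / samples
    u = (math.cos(theta), math.sin(theta))  # unit vector at angle theta
    rates.append(grad[0] * u[0] + grad[1] * u[1])

grad_norm = math.hypot(grad[0], grad[1])
print(max(rates), min(rates), grad_norm)
```

The sampled maximum and minimum approach |∇f| and −|∇f| as the number of sampled directions grows.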
A critical point of f(x, y) is a point (a, b) where both partial derivatives are zero, or where f is not differentiable. Critical points are candidates for local maxima, local minima, and saddle points.
Compute both partial derivatives
Find fₓ(x, y) and fᵧ(x, y) as expressions.
Set both equal to zero
Solve the system: fₓ = 0 AND fᵧ = 0 simultaneously.
Solve the system
Use substitution or elimination. There may be zero, one, or many critical points.
Classify with the second derivative test
Compute the discriminant D at each critical point to determine max, min, or saddle.
Example: Find Critical Points of f(x,y) = x³ + y³ − 3xy
fₓ = 3x² − 3y = 0 → y = x²
fᵧ = 3y² − 3x = 0 → x = y²
Substitute: x = (x²)² = x⁴ → x⁴ − x = 0 → x(x³ − 1) = 0 → x = 0 or x = 1
x = 0: y = 0² = 0 → critical point (0, 0)
x = 1: y = 1² = 1 → critical point (1, 1)
At each critical point (a, b), compute the discriminant D (also called the Hessian determinant) to classify the point.
D = fₓₓ(a,b) · fᵧᵧ(a,b) − [fₓᵧ(a,b)]²
Hessian determinant
D > 0 and fₓₓ > 0: local minimum
The surface curves upward in all directions, like the bottom of a bowl.
D > 0 and fₓₓ < 0: local maximum
The surface curves downward in all directions, like the top of a hill.
D < 0: saddle point
The surface curves up in some directions and down in others, like a mountain pass.
D = 0: inconclusive
The second derivative test gives no information. Use other methods or inspect the function directly.
Example (continued): for f(x,y) = x³ + y³ − 3xy, fₓₓ = 6x, fᵧᵧ = 6y, fₓᵧ = −3
At (0, 0):
D = (6·0)(6·0) − (−3)² = 0 − 9 = −9 < 0 → Saddle point
At (1, 1):
D = (6·1)(6·1) − (−3)² = 36 − 9 = 27 > 0
fₓₓ(1,1) = 6 > 0 → Local minimum
f(1,1) = 1 + 1 − 3 = −1 is the local minimum value.
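The classification rules are mechanical enough to put in a small function. In this sketch, `classify` and `second_partials` are hypothetical names, and the second partials of f = x³ + y³ − 3xy are entered by hand.

```python
def classify(fxx, fyy, fxy):
    """Second derivative test at a critical point, given the second partials there."""
    D = fxx * fyy - fxy**2
    if D > 0:
        # D > 0 forces fxx and fyy to share a sign, so checking fxx is enough.
        return "local minimum" if fxx > 0 else "local maximum"
    if D < 0:
        return "saddle point"
    return "inconclusive"

def second_partials(x, y):
    """(f_xx, f_yy, f_xy) for f(x, y) = x^3 + y^3 - 3xy."""
    return 6 * x, 6 * y, -3

for point in [(0, 0), (1, 1)]:
    print(point, classify(*second_partials(*point)))
```

The function assumes the point it is given is already a critical point. It does not check that fₓ = fᵧ = 0 there.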
Lagrange multipliers find the extrema of f(x, y) subject to a constraint g(x, y) = k. At a constrained extremum, the gradient of f must be a scalar multiple of the gradient of g.
The vector equation ∇f = λ∇g expands into a system of three scalar equations:
fₓ(x, y) = λ · gₓ(x, y)
fᵧ(x, y) = λ · gᵧ(x, y)
g(x, y) = k
Three equations, three unknowns: x, y, and λ. Solve for all solution points, then evaluate f at each to determine which is the maximum and which is the minimum.
Example: Maximize f(x, y) = xy subject to x + 2y = 8, so g(x, y) = x + 2y and k = 8
Gradients:
∇f = (y, x), ∇g = (1, 2)
System ∇f = λ∇g:
y = λ·1 → λ = y
x = λ·2 → x = 2λ = 2y
Constraint:
x + 2y = 8 → 2y + 2y = 8 → 4y = 8 → y = 2
x = 2y = 4
Maximum f(4, 2) = 4·2 = 8, achieved at (4, 2).
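This answer can be cross-checked without Lagrange multipliers. The sketch below takes f(x, y) = xy, which is what ∇f = (y, x) and f(4, 2) = 4·2 imply. It walks along the constraint x = 8 − 2y on a grid of y values and keeps the best point. The grid range and spacing are arbitrary choices.

```python
def f(x, y):
    return x * y

def point_on_constraint(y):
    """Solve x + 2y = 8 for x, giving a point on the constraint line."""
    return 8 - 2 * y, y

# Sample y from -10 to 10 in steps of 0.01 and keep the point with the largest f.
candidates = (point_on_constraint(i / 100) for i in range(-1000, 1001))
best = max(candidates, key=lambda p: f(*p))
print(best, f(*best))

# At the solution, grad f = (y, x) and grad g = (1, 2) should be parallel,
# so the 2D cross product f_x * g_y - f_y * g_x should vanish.
x, y = best
cross = y * 2 - x * 1
print(cross)
```

A grid search like this only works because the constraint is easy to solve for x. The multiplier method does not need that.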
When to Use Lagrange Multipliers
Use them when direct substitution from the constraint equation into f is algebraically messy or impossible. They are also useful when the constraint is an implicit curve, such as a circle or ellipse, rather than a simple linear equation.
Geometric Meaning
∇f = λ∇g means the level curves of f and the constraint curve g = k are tangent at the solution — they share the same tangent line. Moving along the constraint no longer increases or decreases f.
For f(x, y) = 3x²y − y³, find ∇f and Dᵤf at (1, 2) in direction v = (3, 4).
fₓ = 6xy → fₓ(1,2) = 12
fᵧ = 3x² − 3y² → fᵧ(1,2) = 3 − 12 = −9
∇f(1,2) = (12, −9)
|v| = √(9 + 16) = 5 → û = (3/5, 4/5)
Dᵤf = 12·(3/5) + (−9)·(4/5) = 36/5 − 36/5 = 0
The directional derivative is 0 — v is tangent to the level curve at (1, 2).
Find and classify the critical points of f(x, y) = x² + y² − 2x − 4y + 1.
fₓ = 2x − 2 = 0 → x = 1
fᵧ = 2y − 4 = 0 → y = 2
Critical point: (1, 2)
fₓₓ = 2, fᵧᵧ = 2, fₓᵧ = 0
D = (2)(2) − 0² = 4 > 0 and fₓₓ = 2 > 0
Local minimum at (1, 2). f(1,2) = 1 + 4 − 2 − 8 + 1 = −4.
This is a paraboloid opening upward; (1, 2) is its vertex (the global minimum).
Minimize f(x,y) = x² + y² subject to g(x,y) = x + y = 1
∇f = (2x, 2y), ∇g = (1, 1)
System: 2x = λ, 2y = λ → x = y
Constraint: x + y = 1, x = y → 2x = 1 → x = 1/2
y = 1/2
Minimum f(1/2, 1/2) = 1/4 + 1/4 = 1/2 at (1/2, 1/2).
Geometric meaning: (1/2, 1/2) is the point on the line x + y = 1 closest to the origin.
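The closest-point interpretation gives an independent check. For a line ax + by = k, the nearest point to the origin is the orthogonal projection k·(a, b)/(a² + b²). The sketch below applies that standard formula, which is a different technique from Lagrange multipliers. The function name is hypothetical.

```python
def closest_point_on_line(a, b, k):
    """Point on a*x + b*y = k nearest the origin; requires (a, b) != (0, 0)."""
    n2 = a * a + b * b
    if n2 == 0:
        raise ValueError("(a, b) must be nonzero")
    return a * k / n2, b * k / n2

x, y = closest_point_on_line(1, 1, 1)  # the line x + y = 1
print((x, y), x**2 + y**2)
```

The projected point lies along the normal (1, 1) of the line, which is the same parallel-gradient condition ∇f = λ∇g written geometrically.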
Find the tangent line to x² + 2y² = 6 at the point (2, 1).
Define F(x,y) = x² + 2y² − 6. The curve is the level set F = 0.
∇F = (2x, 4y) → at (2,1): ∇F = (4, 4)
∇F is normal to the curve at (2,1).
Tangent line: 4(x − 2) + 4(y − 1) = 0
→ x − 2 + y − 1 = 0 → x + y = 3
Tangent line: x + y = 3
Verify: (2,1) lies on x + y = 3 ✓. Direction (4,4) is perpendicular to tangent direction (1,−1) since 4·1 + 4·(−1) = 0 ✓.
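The same recipe works for any level curve F(x, y) = 0 with a nonzero gradient at the point. In this sketch, the hypothetical helper `tangent_line` returns the coefficients (A, B, C) of the line Ax + By = C, using the hand-computed gradient of F = x² + 2y² − 6.

```python
def grad_F(x, y):
    """Gradient of F(x, y) = x^2 + 2y^2 - 6, computed by hand."""
    return 2 * x, 4 * y

def tangent_line(grad, point):
    """Coefficients (A, B, C) of A*x + B*y = C, the line through point with normal grad."""
    a, b = grad
    x0, y0 = point
    return a, b, a * x0 + b * y0

A, B, C = tangent_line(grad_F(2, 1), (2, 1))
print(A, B, C)  # 4x + 4y = 12, which simplifies to x + y = 3
```

Dividing through by the common factor 4 recovers the simplified form x + y = 3.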
A function of two variables f(x, y) assigns a single real number to each ordered pair (x, y) in its domain. You can think of it as a surface in 3D space: for every (x, y) point on the xy-plane, f(x, y) gives the height z above that point. Example: f(x, y) = x² + y² is a paraboloid. Its domain is all of ℝ² (any real x and y), and its range is [0, ∞) because x² + y² ≥ 0 always.
Level curves (also called contour lines) of f(x, y) are the curves in the xy-plane where f has a constant value c: f(x, y) = c. They show the 'shape' of the surface viewed from above. To find level curves: set f(x, y) = c for various values of c and solve for the relationship between x and y. Example: for f(x, y) = x² + y², the level curve at c = 4 is x² + y² = 4, which is a circle of radius 2. Level curves that are close together indicate a steep surface; widely spaced level curves indicate a gentle slope.
The partial derivative of f(x, y) with respect to x, written ∂f/∂x or fₓ, measures how f changes as x changes while y is held constant. To compute it, treat y as a constant and differentiate with respect to x using ordinary single-variable rules. Similarly, ∂f/∂y treats x as a constant and differentiates with respect to y. Example: if f(x, y) = 3x²y + sin(y), then ∂f/∂x = 6xy (y is constant, so sin(y) contributes 0) and ∂f/∂y = 3x² + cos(y) (x² is constant with respect to y).
The gradient of f(x, y) is the vector ∇f = (∂f/∂x, ∂f/∂y). It points in the direction of steepest ascent on the surface: the direction you would travel in the xy-plane to climb the surface as steeply as possible. Its magnitude |∇f| equals the rate of that maximum increase. Geometrically, a nonzero ∇f at a point (a, b) is always perpendicular (normal) to the level curve f(x, y) = f(a, b) passing through that point. A zero gradient, ∇f = (0, 0), means you are at a critical point, which may be a peak, a valley, or a saddle.
The directional derivative of f in the direction of a unit vector û = (u₁, u₂) is Dᵤf = ∇f · û = fₓu₁ + fᵧu₂. It gives the rate of change of f as you move from the point in the direction û. Important steps: (1) if the direction vector is not a unit vector, normalize it first by dividing by its magnitude. (2) Compute the gradient ∇f. (3) Take the dot product. The maximum directional derivative equals |∇f| and occurs in the direction of ∇f itself. The minimum equals −|∇f| and occurs in the direction −∇f.
Critical points of f(x, y) occur where both partial derivatives are zero simultaneously, or where one or both partial derivatives do not exist. To find them: (1) compute fₓ and fᵧ, (2) set both equal to zero: fₓ = 0 and fᵧ = 0, (3) solve the system of equations for (x, y). Example: for f(x, y) = x³ − 3x + y² − 4y, set fₓ = 3x² − 3 = 0 → x = ±1, and fᵧ = 2y − 4 = 0 → y = 2. Critical points are (1, 2) and (−1, 2). Then use the second derivative test to classify each one.
At a critical point (a, b), compute the discriminant D = fₓₓ(a,b)·fᵧᵧ(a,b) − [fₓᵧ(a,b)]². Then: if D > 0 and fₓₓ > 0, the point is a local minimum. If D > 0 and fₓₓ < 0, it is a local maximum. If D < 0, the point is a saddle point (neither max nor min). If D = 0, the test is inconclusive and other methods are needed. The sign of D tells you whether the surface curves the same way in every direction (D > 0: upward like a bowl when fₓₓ > 0, downward like a dome when fₓₓ < 0) or curves up in some directions and down in others (D < 0, like a saddle).
Lagrange multipliers solve constrained optimization problems: find the extrema of f(x, y) subject to a constraint g(x, y) = k. The key insight is that at a constrained extremum, the gradients of f and g must be parallel: ∇f = λ∇g for some scalar λ (the Lagrange multiplier). This gives the system: fₓ = λgₓ, fᵧ = λgᵧ, and g(x, y) = k — three equations for three unknowns (x, y, λ). Solve the system, then evaluate f at each solution to find the maximum or minimum. Use Lagrange multipliers when direct substitution from the constraint is algebraically difficult.
Mixed partial derivatives differentiate with respect to two different variables in sequence. For example, fₓᵧ means differentiate first with respect to x, then with respect to y: fₓᵧ = ∂/∂y(∂f/∂x). By Clairaut's theorem (also called Schwarz's theorem), if the mixed partials fₓᵧ and fᵧₓ are continuous near a point, then fₓᵧ = fᵧₓ there, so the order of differentiation does not matter. This equality holds for virtually all functions encountered in calculus courses. The mixed partials appear in the discriminant D of the second derivative test, reflecting how the surface twists.
| Concept | Formula |
|---|---|
| Gradient | ∇f = (fₓ, fᵧ) |
| Directional derivative | Dᵤf = ∇f · û (û unit vector) |
| Max directional derivative | \|∇f\|, in direction ∇f/\|∇f\| |
| Critical point condition | fₓ = 0 and fᵧ = 0 |
| Discriminant (Hessian) | D = fₓₓfᵧᵧ − (fₓᵧ)² |
| D > 0, fₓₓ > 0 | Local minimum |
| D > 0, fₓₓ < 0 | Local maximum |
| D < 0 | Saddle point |
| Lagrange condition | ∇f = λ∇g, g = k |
| Clairaut's theorem | fₓᵧ = fᵧₓ (when continuous) |
| Level curve | f(x, y) = c (set) |
| Gradient ⊥ level curve | ∇f normal to f = c at every point |