Set Theory & Foundations of Mathematics
Set theory is the bedrock on which all of modern mathematics is built. From naive comprehension to the ZFC axioms, from Russell's paradox to Godel's incompleteness theorems, this subject asks the deepest questions: What is a mathematical object? What can be proved? What cannot?
Learning Objectives
After working through this page you will be able to:
- ✓Identify the failures of naive set theory and explain why axiomatic foundations are necessary
- ✓State each ZFC axiom and describe its role in preventing paradox
- ✓Work with ordinal and cardinal numbers, including transfinite induction and arithmetic
- ✓State the Axiom of Choice and its equivalent forms (Zorn's lemma, well-ordering theorem)
- ✓Explain Godel's incompleteness theorems and the concept of Godel numbering
- ✓Describe the forcing method and its role in independence results
- ✓Compare alternative foundations: NBG, ETCS, and homotopy type theory
1. Naive Set Theory
Naive set theory, as developed informally by Georg Cantor in the late nineteenth century, starts from an intuitive notion of a "collection of objects." A set is any well-defined collection, and membership is the primitive relation: we write x in A to mean x is an element of the set A.
Basic Set Operations
Subset and Equality
A is a subset of B (A subset-of B) means: for all x, if x in A then x in B. Two sets are equal iff each is a subset of the other: A = B iff A subset-of B and B subset-of A.
Union, Intersection, Complement
A union B = (x : x in A or x in B). A intersect B = (x : x in A and x in B). A complement = (x in U : x not-in A) for universe U.
Power Set
P(A) = (S : S subset-of A). The power set of a set with n elements has 2^n elements. For infinite A, P(A) has strictly greater cardinality than A (Cantor's theorem).
Cartesian Product
A x B = ((a, b) : a in A, b in B). An ordered pair (a, b) is formally defined as the set ((a), (a, b)) (Kuratowski's definition), reducing pairs to pure sets.
Russell's Paradox
Bertrand Russell discovered the most famous paradox in mathematics in 1901. In naive set theory, the comprehension schema allows forming the set of all sets satisfying any property. Consider the property "does not contain itself." Define:
Russell's Paradox
Let R = (x : x not-in x). Is R in R?
- If R in R, then R satisfies "x not-in x," so R not-in R. Contradiction.
- If R not-in R, then R satisfies "x not-in x," so R in R. Contradiction.
Either case yields a contradiction. The unrestricted comprehension schema is therefore inconsistent. This forced mathematicians to develop axiomatic set theory with restricted comprehension.
Other Paradoxes
Russell's paradox was not alone. The Burali-Forti paradox (1897) showed that the collection of all ordinal numbers cannot be a set, for it would itself be an ordinal larger than all ordinals. Cantor's paradox showed the collection of all sets cannot be a set. These paradoxes share a common structure: they arise from collections that are "too large" to be sets — now called proper classes.
Functions as Sets
In axiomatic set theory, a function f from A to B is defined as a set of ordered pairs — a subset of A x B — such that for every a in A there exists exactly one b in B with (a, b) in f. This reduces functions entirely to sets, eliminating functions as a separate primitive concept.
2. Axiomatic Set Theory: The ZFC Axioms
Zermelo-Fraenkel set theory with the Axiom of Choice (ZFC) is the standard axiomatic foundation for mathematics. It resolves the paradoxes of naive set theory by restricting how sets can be formed. Every mathematical object studied in analysis, algebra, and topology can be encoded within ZFC.
Axiom of Extensionality
For all A, B: (for all x, x in A iff x in B) implies A = B
Two sets are identical if and only if they have exactly the same elements. This makes set equality entirely determined by membership, not by how the set was defined or named.
Axiom of Empty Set
There exists a set with no elements: exists E, for all x, x not-in E
The empty set exists. By extensionality it is unique, written as the empty set symbol. Every set is a superset of the empty set. This is the starting point from which all other sets are built.
Axiom of Pairing
For all a, b: exists C, for all x, x in C iff (x = a or x = b)
For any two sets a and b, the set (a, b) exists. Combined with the empty set axiom, this lets us form finite sets. It is also used in the Kuratowski definition of ordered pairs.
Axiom of Union
For all A: exists U, for all x, x in U iff exists B in A, x in B
The union of a family of sets exists. The union of A collects all elements of all members of A. This allows forming arbitrary unions, including the union of infinitely many sets.
Axiom of Power Set
For all A: exists P, for all S, S in P iff S subset-of A
The power set P(A) — the collection of all subsets of A — exists as a set. This is essential for constructing the real numbers and function spaces. The power set always has strictly greater cardinality than A.
Axiom of Infinity
There exists a set I such that: empty-set in I, and for all x in I, (x union (x)) in I
At least one infinite set exists. The canonical example is the set of von Neumann ordinals: 0 = empty, 1 = (0), 2 = (0, 1), and so on. Without this axiom, we could only reason about finite sets.
Axiom Schema of Separation (Aussonderung)
For any set A and property P(x): exists B, for all x, x in B iff (x in A and P(x))
We can form the subset of a given set satisfying any first-order property. Crucially, we separate from an existing set rather than forming an arbitrary collection. This avoids Russell's paradox: the "set of all sets not containing themselves" must be separated from some ambient set, but no set of all sets exists in ZFC.
Axiom Schema of Replacement
If F is a function (class), then for any set A, the image F(A) = (F(x) : x in A) is a set
The image of any set under a definable function is again a set. This is needed to construct certain large sets and ordinals that separation alone cannot reach. It is required to prove transfinite recursion in full generality.
Axiom of Regularity (Foundation)
Every non-empty set A contains an element x such that x and A are disjoint
No set can be an element of itself, and there are no infinite descending membership chains (x1 contains x2 contains x3 contains ...). This axiom rules out pathological self-containing sets and ensures that the cumulative hierarchy of sets is well-founded.
Axiom of Choice (AC)
For any family F of non-empty sets, there exists a choice function f such that f(A) in A for all A in F
A choice function exists for any collection of non-empty sets. This allows simultaneous selection from infinitely many sets without constructing the selection explicitly. It is independent of ZF — both ZF plus AC and ZF plus not-AC are consistent (assuming ZF is consistent).
Why ZF Alone Is Called ZF, not ZFC
The nine axioms of Zermelo-Fraenkel set theory (without Choice) are called ZF. Adding the Axiom of Choice gives ZFC. Some theorems require AC while others do not; distinguishing which is which is an active research area. Most working mathematicians accept ZFC without reservation.
3. Ordinal Numbers
Ordinal numbers generalize the natural numbers to capture the notion of "position in a well-ordered sequence," including infinite positions. They are the backbone of transfinite induction and recursion.
Well-Ordering
A well-order on a set S is a total order in which every non-empty subset has a least element. The natural numbers under their usual order are well-ordered. The integers are not: the set of all negative integers has no least element. Well-ordering is what makes induction work: every descending chain must eventually terminate.
Von Neumann Ordinals
Von Neumann defined each ordinal as the set of all smaller ordinals:
- 0 = empty set
- 1 = (0) = (empty set)
- 2 = (0, 1) = (empty set, (empty set))
- 3 = (0, 1, 2)
- omega = (0, 1, 2, 3, ...) = the set of all finite ordinals
- omega + 1 = omega union (omega) = (0, 1, 2, ..., omega)
With this definition, alpha is less than beta iff alpha is in beta iff alpha is a proper subset of beta. The membership relation on ordinals is a well-order.
Successor and Limit Ordinals
Successor Ordinals
The successor of an ordinal alpha is alpha + 1 = alpha union (alpha). Every finite ordinal is a successor (except 0). The successor operation moves to the "next" position in the well-order, just as n+1 follows n for natural numbers.
Limit Ordinals
A non-zero ordinal lambda is a limit ordinal if it has no predecessor: lambda = union of all alpha less than lambda. The first limit ordinal is omega. Then come omega*2, omega*3, ..., omega^2, omega^omega, epsilon_0, and so on — a vast well-ordered hierarchy.
Transfinite Induction
Transfinite induction proves properties P(alpha) for all ordinals alpha. The principle has three cases, matching the three types of ordinals:
- Base case: Prove P(0).
- Successor step: Assuming P(alpha), prove P(alpha + 1).
- Limit step: If P(alpha) holds for all alpha less than lambda (where lambda is a limit ordinal), prove P(lambda).
Transfinite recursion allows defining functions by the same three-case structure. For example, ordinal addition, multiplication, and exponentiation are each defined by transfinite recursion.
Ordinal Arithmetic
Ordinal arithmetic is not commutative. Key examples:
- 1 + omega = omega (not omega + 1)
- 2 * omega = omega (adding omega copies of 2)
- omega * 2 = omega + omega (two copies of omega)
- omega^omega = (sup of omega, omega^2, omega^3, ...)
Addition alpha + beta is defined as: order a copy of alpha, then a copy of beta behind it. Thus 1 + omega places 1 before omega, which does not add a new maximum element, giving just omega. But omega + 1 places 1 after omega, adding a new maximum element strictly greater than all natural numbers.
4. Cardinal Numbers and Cantor's Theorem
While ordinals capture "order type," cardinal numbers capture "size." Two sets have the same cardinality if there exists a bijection between them. Cantor showed there are infinitely many distinct infinite cardinalities.
Equipotence and Countability
Sets A and B are equipotent (have the same cardinality, written |A| = |B|) if there is a bijection from A to B. A set is countable if it is finite or in bijection with the natural numbers N. The rationals Q are countable (via the Cantor diagonal enumeration of fractions). The real numbers R are uncountable.
Cantor's Diagonal Argument
Cantor proved that the real numbers in (0, 1) cannot be listed as a sequence r_1, r_2, r_3, ....
The Diagonal Argument
Suppose r_1, r_2, r_3, ... lists all reals in (0, 1) in decimal expansion. Construct a real number d by setting the n-th decimal digit of d to differ from the n-th decimal digit of r_n (e.g., use 4 if the digit is 5, and 5 otherwise). Then d differs from every r_n in the n-th digit, so d is not on the list. Contradiction. Thus no such listing exists and (0, 1) is uncountable.
Cantor's Theorem
For any set A, the power set P(A) has strictly greater cardinality than A. That is, there is no surjection from A onto P(A).
Proof Sketch
Suppose f: A to P(A) is any function. Define D = (x in A : x not-in f(x)). Then D is a subset of A, so D in P(A). If D = f(a) for some a in A, then: a in D iff a not-in f(a) = D. Contradiction. So f is not surjective. Since f was arbitrary, no surjection A to P(A) exists, so |P(A)| is strictly greater than |A|.
Aleph Numbers and Beth Numbers
Aleph Numbers
aleph-0 = |N| = cardinality of natural numbers
aleph-1 = smallest uncountable cardinal
aleph-2 = next cardinal after aleph-1
aleph-alpha for each ordinal alpha
Beth Numbers
beth-0 = aleph-0 = |N|
beth-1 = |P(N)| = |R| = 2^aleph-0
beth-2 = |P(R)| = 2^beth-1
beth-(alpha+1) = 2^(beth-alpha)
The Continuum Hypothesis
The Continuum Hypothesis (CH), stated by Cantor in 1878, asks: is there a cardinal strictly between aleph-0 and beth-1 = |R|? Equivalently, is |R| = aleph-1? Godel (1940) showed CH is consistent with ZFC. Cohen (1963) showed the negation of CH is also consistent with ZFC. Therefore CH is independent of ZFC — it can neither be proved nor disproved from the standard axioms.
The Generalized Continuum Hypothesis (GCH) asserts that for every ordinal alpha, 2^(aleph-alpha) = aleph-(alpha+1). GCH implies AC and is also independent of ZFC. Under GCH, beth-alpha = aleph-alpha for all alpha.
5. The Axiom of Choice: Equivalent Forms and Consequences
The Axiom of Choice (AC) is one of the most discussed axioms in mathematics. It is provably independent of ZF, meaning it can neither be proved nor disproved from the other nine axioms. Mathematicians who accept AC work in ZFC; those who reject it work in constructive or intuitionistic frameworks.
Three Equivalent Forms
The following three statements are equivalent in ZF (each can be proved from the others, all require AC):
Axiom of Choice
For any collection F of non-empty sets, there exists a function f (a choice function) such that f(A) in A for every A in F. The function simultaneously chooses one element from each set, even when F is infinite or uncountable.
Zorn's Lemma
If P is a partially ordered set in which every chain (totally ordered subset) has an upper bound in P, then P has at least one maximal element (an element not strictly below any other). Zorn's lemma is the workhorse form of AC in algebra and analysis: it is used to prove every vector space has a basis, every ring has a maximal ideal, and Tychonoff's theorem.
Well-Ordering Theorem
Every set can be well-ordered: there exists a total order on any set S such that every non-empty subset of S has a least element. In particular, the real numbers can be well-ordered (though no explicit well-order on R is constructible). This was proved by Zermelo in 1904 and was initially considered very counterintuitive.
Consequences of AC
- →Every vector space has a Hamel basis (a basis as a set, not just a Schauder basis)
- →Tychonoff's theorem: any product of compact topological spaces is compact
- →Every field has an algebraic closure
- →The Hahn-Banach theorem in functional analysis
- →The Banach-Tarski paradox: a solid ball in 3D can be decomposed into five pieces and reassembled into two balls identical to the original
- →Existence of non-measurable sets (Vitali sets), showing that not all sets of reals can be assigned a consistent length
Weaker Forms of AC
Several weaker axioms have been studied: the Axiom of Countable Choice (ACC) allows choice from countably many sets; the Axiom of Dependent Choice (DC) allows sequential choices where each choice depends on the previous. Many classical results in analysis only need DC or ACC, not full AC.
6. Constructive vs. Classical Mathematics
Classical mathematics, as encoded in ZFC, freely uses the law of excluded middle (every statement is either true or false) and proof by contradiction. Constructive mathematics imposes stricter standards: a proof of existence must exhibit the object, and a proof of a disjunction must specify which disjunct holds.
Intuitionistic Logic
Brouwer founded intuitionism, rejecting the law of excluded middle (LEM) as a logical axiom. In intuitionistic logic, the logical connectives have computational content:
Brouwer-Heyting-Kolmogorov (BHK) Interpretation
- A proof of P and Q = a proof of P together with a proof of Q
- A proof of P or Q = either a proof of P or a proof of Q (with specification)
- A proof of P implies Q = a procedure converting any proof of P into a proof of Q
- A proof of not-P = a proof that P implies absurdity (false)
- A proof of (exists x) P(x) = a specific object a with a proof of P(a)
- A proof of (for all x) P(x) = a procedure giving a proof of P(a) for any a
Under BHK, the law of excluded middle P or not-P would require a procedure that either proves P or proves not-P for any statement P — clearly impossible in general. Thus LEM is rejected, and classical double-negation elimination (not-not-P implies P) fails.
Type Theory
Martin-Lof type theory (MLTT) provides an alternative foundation that is both constructive and formal. Types play the role of sets and propositions simultaneously (the Curry-Howard correspondence). A proof of a proposition P is a term of type P. The type theory directly encodes the BHK interpretation.
Curry-Howard Correspondence
Propositions correspond to types. Proofs correspond to terms (programs). A proof of P and Q is a pair (p, q). A proof of P implies Q is a function from P-terms to Q-terms. This identifies logic with computation.
Dependent Types
In MLTT, types can depend on terms: the type of vectors of length n depends on the natural number n. This allows expressing "there exists n such that P(n)" as the sigma type (n: N) x P(n), a dependent sum.
7. Model Theory Basics
Model theory studies the relationship between formal languages and the mathematical structures that satisfy them. A key question: which structures satisfy which theories?
Structures and Satisfaction
A structure M for a language L consists of: a non-empty domain (universe) |M|, an interpretation of each constant symbol as an element of |M|, each function symbol as a function on |M|, and each relation symbol as a subset of |M|^n for appropriate n.
A formula phi is satisfied in M under assignment s (written M satisfies phi[s]) by induction on the structure of phi. For atomic formulas this checks whether a relation holds; for logical connectives it follows truth tables (or their intuitionistic analogues); for quantifiers it ranges over the domain.
A structure M is a model of a theory T (a set of sentences) if M satisfies every sentence in T. A theory is satisfiable if it has at least one model.
Compactness Theorem
A theory T has a model if and only if every finite subset of T has a model. Equivalently, if T is finitely satisfiable, then T is satisfiable. The compactness theorem (proved using ultraproducts or Henkin construction) has surprising consequences:
Non-Standard Models
By compactness, there exist non-standard models of arithmetic containing "infinite" natural numbers larger than all standard natural numbers. Add constants c_n for each standard n and axioms c_(n+1) greater than c_n and c_1 greater than n for each standard n. Every finite subset has a model (interpret c_i as a sufficiently large standard number). By compactness, the whole theory has a model — containing elements larger than all standard naturals.
Lowenheim-Skolem Theorems
Downward Lowenheim-Skolem
If a countable theory T has an infinite model, it has a countable model. More generally, if T has a model of cardinality kappa greater than |T|, it has models of all infinite cardinalities between |T| and kappa.
Upward Lowenheim-Skolem
If T has a model of any infinite cardinality, it has models of all greater cardinalities. Combined with the downward theorem: a consistent first-order theory with an infinite model has models of every infinite cardinality.
Skolem's paradox is an apparent contradiction arising from the downward theorem: ZFC (if consistent) has a countable model, yet ZFC proves the existence of uncountable sets. The resolution is that "uncountable" is not absolute — a set may be uncountable from the model's internal perspective while being countable from the outside.
8. Godel's Incompleteness Theorems
Kurt Godel's 1931 incompleteness theorems are among the most profound results in the history of mathematics. They establish fundamental limits on what any formal axiomatic system can prove about arithmetic.
Godel Numbering
Godel's key technical device is an encoding of syntax as arithmetic. Assign a unique natural number (a Godel number) to each symbol, formula, and proof in the formal system. Then statements about provability become statements about natural numbers, and arithmetic can "talk about" its own proofs.
The Provability Predicate
Using Godel numbering, we can write an arithmetic formula Provable(n) that is true iff n is the Godel number of a provable sentence. Similarly, Proves(m, n) means m is (the code of) a proof of (the sentence coded by) n. These predicates are representable in arithmetic — that is the essential content of the representation theorem.
First Incompleteness Theorem
Statement
Any consistent formal system F capable of expressing elementary arithmetic contains a sentence G_F such that neither G_F nor its negation is provable in F. That is, F is incomplete.
The Godel Sentence
G_F is constructed to say "I am not provable in F." More precisely, via Godel numbering, G_F is an arithmetic sentence that asserts its own Godel number is not in the extension of Provable. If F is consistent, G_F is true (since F cannot prove a false statement) but not provable in F. If F were omega-consistent (a stronger assumption), the negation of G_F is also not provable. Thus F is incomplete.
Second Incompleteness Theorem
Statement
Any consistent formal system F capable of expressing elementary arithmetic cannot prove its own consistency. That is, the sentence Con(F) (expressing "F is consistent" in arithmetic) is not provable in F (assuming F is indeed consistent).
This shattered Hilbert's program — the goal of finding a complete, consistent, finitely axiomatizable foundation for all of mathematics and proving its consistency using only elementary methods. No such program can succeed.
Scope and Limits
The incompleteness theorems apply to systems strong enough to encode Robinson arithmetic (a very weak fragment of Peano arithmetic). They do not apply to weaker systems (like Presburger arithmetic, which admits a decision procedure) or to non-arithmetic theories. They also do not imply that mathematics is in any sense "broken" — they show that no single formal system can capture all mathematical truth.
9. Independence Results and Forcing
A sentence S is independent of a theory T if neither S nor its negation is provable from T. The most famous independent statements are the Continuum Hypothesis and the Axiom of Choice relative to ZF. Cohen's forcing method is the main tool for establishing independence.
Godel's Constructible Universe (L)
To show CH is consistent with ZFC, Godel (1938-1940) defined the constructible universe L: the class of all sets that can be built in a definable transfinite hierarchy. Starting with L_0 = empty set, each L_(alpha+1) consists of all subsets of L_alpha definable by a first-order formula with parameters from L_alpha. L is the union of all L_alpha.
Godel proved: (1) L satisfies all ZFC axioms, and (2) L satisfies GCH (the Generalized Continuum Hypothesis). Since GCH implies CH, CH is consistent with ZFC. L also satisfies AC, showing AC is consistent with ZF.
Cohen's Forcing
To show the negation of CH is consistent with ZFC, Cohen (1963) invented forcing — a method for constructing new models by adding "generic" sets to a ground model. The technique works as follows:
- Forcing poset: Start with a ground model M of ZFC and a partially ordered set P in M (the "forcing poset" or "notion of forcing"). Elements p in P are called "conditions" and represent partial information about the generic set to be added.
- Generic filter: A filter G in P (closed under finite intersections and upward closed) is M-generic if it meets every dense subset of P that is in M. By a theorem of Rasiowa and Sikorski, generic filters exist outside M.
- Generic extension: Form the generic extension M[G] by adding G to M and closing under set-theoretic operations. M[G] satisfies ZFC and contains new sets not in M.
- Forcing relation: The forcing relation p forces phi (written p forces phi) is defined in M and determines which sentences are true in M[G]. The key theorem: a sentence phi is true in M[G] iff some condition in G forces phi.
Cohen used forcing to add aleph-2 many new subsets of omega to a ground model satisfying GCH, creating a model where the cardinality of R (= 2^aleph-0) is aleph-2, strictly greater than aleph-1. This showed the negation of CH is consistent with ZFC.
Other Independence Results
- →Suslin's hypothesis (whether every dense complete linear order without endpoints satisfying the countable chain condition is isomorphic to R) is independent of ZFC
- →Whitehead's problem in group theory (whether every Whitehead group is free) is independent of ZFC (Shelah, 1974)
- →Many questions about the cardinal characteristics of the continuum (covering numbers, bounding numbers) are independent of ZFC
10. Alternative Foundations
ZFC is the dominant but not the only foundation for mathematics. Several alternatives have been proposed, each with different strengths, philosophical motivations, and computational properties.
NBG: Von Neumann-Bernays-Godel Class Theory
NBG is a conservative extension of ZFC that adds proper classes — collections too large to be sets (like the class of all sets or the class of all ordinals). NBG is finitely axiomatizable (unlike ZFC, which has axiom schemas), yet proves exactly the same theorems about sets as ZFC. Classes in NBG are second-class citizens: they cannot be elements of other classes, only of sets.
ETCS: Elementary Theory of the Category of Sets
Lawvere's ETCS (1964) axiomatizes the category of sets rather than sets directly. The primitives are sets and functions (morphisms), with composition as the basic operation. The axioms describe structural properties of the category: it has finite limits and colimits, exponentials, a natural numbers object, and satisfies an axiom of choice. ETCS is equiconsistent with ZFC and provides a structuralist foundation: mathematical objects are characterized up to isomorphism, not up to identity.
Homotopy Type Theory (HoTT)
HoTT (Voevodsky, 2009) is a foundational framework that unifies type theory with homotopy theory. It extends MLTT with the Univalence Axiom:
Univalence Axiom
(A = B) is equivalent to (A is equivalent to B), where equivalence means a bijection with coherent higher-dimensional witnesses. Informally: "equivalent things are equal." This makes isomorphic structures literally interchangeable in proofs.
Higher Inductive Types
HoTT adds higher inductive types (HITs) that can have constructors for paths (equalities) and higher paths. For example, the circle S^1 is defined as a type with one point and one path from the point to itself. HITs allow synthetic homotopy theory directly in type theory.
HoTT has several advantages: proofs are directly checkable by computer proof assistants, the univalence axiom makes structure-preserving definitions automatically correct, and it provides a genuinely new perspective on homotopy theory. The HoTT book (2013) presents a complete development of the foundations.
Practice Problems
Problem 1: Cantor's Theorem
Let A be any set. Prove that there is no surjection from A onto P(A). (Hint: given any function f from A to P(A), exhibit a subset of A not in the range of f.)
Show Solution
Let f: A to P(A) be any function. Define D = (x in A : x not-in f(x)). Then D is a well-defined subset of A, so D in P(A). Suppose D = f(a) for some a in A. Then: a in D iff a not-in f(a) = D. This is a contradiction. So D is not in the range of f. Since f was arbitrary, no function from A to P(A) is surjective, meaning |A| is strictly less than |P(A)|.
Problem 2: Ordinal Arithmetic
Explain why 1 + omega = omega but omega + 1 is strictly greater than omega. Define both ordinals explicitly as well-ordered sets.
Show Solution
Ordinal addition alpha + beta places a copy of alpha before a copy of beta. So 1 + omega is a copy of 1 followed by a copy of omega: the order type is (*, 0, 1, 2, ...) where * comes first. This well-ordered set has no maximum element and every element except * has a predecessor or is the immediate successor of *. The order type is exactly omega. In contrast, omega + 1 is (0, 1, 2, ..., *) where * comes last. This set has a maximum element * (the copy of 1 appended after omega), so its order type is omega + 1, which is strictly greater than omega.
Problem 3: Russell's Paradox and Separation
Show that Russell's paradox cannot be derived within ZF set theory. What goes wrong when we try to form the set of all sets not containing themselves?
Show Solution
In ZF, the Axiom of Separation allows forming (x in A : P(x)) for a given set A — we can only separate a subset of an existing set, not form arbitrary comprehensions. To form R = (x : x not-in x), we would need an ambient set A of all sets. But no such set exists in ZF: the Axiom of Regularity implies no set can be an element of itself, and the combination of axioms yields that the class of all sets (V) is a proper class, not a set. Without a set A to separate from, we cannot form R. The paradox is blocked because unrestricted comprehension is not an axiom of ZF.
Problem 4: Axiom of Choice
Prove that the Axiom of Choice implies the Well-Ordering Theorem: every set can be well-ordered. Outline the proof for an arbitrary set S.
Show Solution
Let S be any non-empty set. The family F of all non-empty subsets of S is a collection of non-empty sets. By AC, there exists a choice function f: F to S with f(A) in A for all A in F. Now define a well-ordering of S by transfinite recursion: let s_0 = f(S) be the first element. Given that s_alpha has been chosen for all alpha less than some ordinal beta, let R_beta = S minus (s_alpha : alpha less than beta). If R_beta is non-empty, set s_beta = f(R_beta). Continue until R_beta is empty. The order s_0, s_1, ..., s_alpha, ... (in the order of their indices) is a well-ordering of S, since every non-empty subset T of S has a first element (the s_alpha with smallest alpha such that s_alpha in T).
Problem 5: Transfinite Induction
Use transfinite induction to prove that for every ordinal alpha, alpha is not an element of itself (no ordinal is a member of itself).
Show Solution
We prove P(alpha): alpha not-in alpha for all ordinals alpha. Base case: P(0). The empty set has no elements, so 0 not-in 0. Successor step: Assume P(alpha) holds (i.e., alpha not-in alpha). Suppose alpha+1 in alpha+1 = alpha union (alpha). Then either alpha+1 in alpha or alpha+1 = alpha. If alpha+1 = alpha, then alpha in alpha, contradicting P(alpha). If alpha+1 in alpha, then since alpha is transitive (every element of alpha is a subset of alpha), alpha+1 is a subset of alpha, but alpha in alpha+1 would then give alpha in alpha, again contradicting P(alpha). So P(alpha+1) holds. Limit step: Let lambda be a limit ordinal and suppose P(alpha) holds for all alpha less than lambda. If lambda in lambda, then since lambda is an ordinal, lambda less than lambda, a contradiction. So P(lambda) holds. By transfinite induction, P(alpha) holds for all ordinals alpha.
Problem 6: Godel's Incompleteness
Explain informally why the Godel sentence G for a system F is true (in the standard model of arithmetic) but not provable in F. What would happen if F were inconsistent?
Show Solution
The Godel sentence G asserts "the formula with Godel number g (namely G itself) is not provable in F." Suppose G is provable in F. Then G is false (it says it is not provable, but it is). An inconsistent system proves everything, including false statements, so if F is consistent, it cannot prove a false statement. Therefore G is not provable in F. But then G's assertion is correct — G truly is not provable — so G is true in the standard model. We have a true, unprovable sentence. If F were inconsistent, F would prove both G and not-G; this causes no additional paradox within the system itself (inconsistency already allows proving anything), but it means F has no models and is useless as a foundation.
Exam Tips
Know the Paradoxes
Always be able to state Russell's paradox clearly and explain exactly which axiom of ZFC prevents it. Examiners frequently ask how ZFC resolves paradoxes — the answer is always the Separation axiom (restricted comprehension) combined with the absence of a universal set.
Ordinal Arithmetic Is Non-Commutative
Memorize the asymmetry: 1 + omega = omega but omega + 1 is strictly greater than omega. The definition of alpha + beta as "alpha before beta" makes the non-commutativity clear. Always verify ordinal computations by thinking about order types, not by blindly applying algebraic rules from N.
Distinguish Cardinals from Ordinals
Cardinals measure size (ignoring order); ordinals measure position. Every cardinal is an ordinal, but not vice versa. The cardinal aleph-0 equals the ordinal omega, but omega + 1 and omega + 2 have the same cardinality aleph-0 even though they are different ordinals.
Three Equivalent Forms of AC
Be able to state AC, Zorn's Lemma, and the Well-Ordering Theorem, and know they are all equivalent in ZF. In practice, Zorn's lemma is used most often in algebra and analysis. Know the standard application: to prove every vector space has a basis, apply Zorn's lemma to the poset of linearly independent sets ordered by inclusion.
Independence Means Neither Proves Nor Disproves
CH is independent of ZFC means: (1) ZFC plus CH is consistent (Godel's L shows this), and (2) ZFC plus not-CH is consistent (Cohen's forcing shows this). Neither result says which is "really true" — both are valid axiom choices with their own consistent mathematics.
Incompleteness Scope
Godel's theorems apply to systems that are consistent, recursively axiomatizable, and strong enough to encode arithmetic. They do not apply to Presburger arithmetic (addition only, no multiplication) or to Euclidean geometry — both are complete and decidable. The theorems are about provability in a fixed system, not about mathematical truth in the abstract.
Related Topics
Mathematical Logic
First-order logic, propositional logic, proof systems, soundness and completeness theorems. The formal language in which ZFC is written.
Discrete Mathematics
Combinatorics, graph theory, relations, and functions. Many foundational concepts from set theory are applied directly in discrete settings.
Real Analysis
The rigorous foundations of calculus, built on the completeness of the reals. Set theory provides the formal basis for sequences, limits, and measure theory.
Quick Reference: Key Results
| Result | Statement | Significance |
|---|---|---|
| Cantor's Theorem | |A| is strictly less than |P(A)| for any set A | Infinitely many distinct infinite cardinalities exist |
| Cantor-Bernstein | If |A| is at most |B| and |B| is at most |A|, then |A| = |B| | Cardinality comparison is well-defined (provable in ZF without AC) |
| Zermelo (1904) | AC implies every set can be well-ordered | Establishes equivalence of AC and Well-Ordering Theorem |
| Godel (1940) | Con(ZF) implies Con(ZFC + GCH) | AC and CH are consistent with ZF |
| Cohen (1963) | Con(ZF) implies Con(ZFC + not-CH) | CH is independent of ZFC; forcing method introduced |
| Godel (1931) | Any consistent system encoding arithmetic is incomplete | Hilbert's program cannot succeed; limits of formal proof |
| Lowenheim-Skolem | A consistent countable theory with an infinite model has models of all infinite cardinalities | First-order logic cannot pin down cardinality uniquely |
| Compactness | T has a model iff every finite subset of T has a model | Non-standard models exist; local properties propagate globally |