Classical First-Order Logic (Classical FOL) has an absolutely central place in traditional
logic, model theory, and set theory. It is the foundation upon which **ZF**(**C**), which is itself
often taken as the foundation of mathematics, is built. When classical FOL was being established
there was a lot of study and debate around alternative options. There are a variety of philosophical
and metatheoretic reasons supporting classical FOL as The Right Choice.

This all happened, however, well before category theory was even a twinkle in Mac Lane’s and Eilenberg’s eyes, and when type theory was taking its first stumbling steps.

My focus in this article is on what classical FOL looks like to a modern categorical logician.
This can be neatly summarized as “classical FOL is the internal logic of
a Boolean First-Order Hyperdoctrine.
Each of the three words in this term,”Boolean”, “First-Order”, and “Hyperdoctrine”, suggest
a distinct axis in which to vary the (class of categorical models of the) logic. *All* of them
have compelling categorical motivations to be varied.

The first and simplest is the term “Boolean”. This is what differentiates the categorical semantics of classical (first-order) logic from constructive (first-order) logic. Considering arbitrary first-order hyperdoctrines would give us a form of intuitionistic first-order logic.

It is fairly rare that the categories categorists are interested in are Boolean. For example, most toposes, all of which give rise to first-order hyperdoctrines, are not Boolean. The assumption that they are tends to correspond to a kind of “discreteness” that’s often at odds with the purpose of the topos. For example, a category of sheaves on a topological space is Boolean if and only if that space is a Stone space. These are certainly interesting spaces, but they are also totally disconnected unlike virtually every non-discrete topological space one would typically mention.

The next term is the term “first-order”. As the name suggests, a first-order hyperdoctrine has the necessary structure to interpret first-order logic. The question, then, is what kind of categories have this structure and only this structure. The answer, as far as I’m aware, is not many.

Many (classes of) categories have the structure to be first-order hyperdoctrines, but often they have
additional structure as well that it seems odd to ignore. The most notable and interesting example is
toposes. All elementary toposes (which includes all Grothendieck toposes) have the structure to give
rise to a first-order hyperdoctrine. But, famously, they also have the structure to give rise to a
higher order logic. Even more interesting, while Grothendieck toposes, being elementary toposes, technically
*do* support the necessary structure for first-order logic, the natural morphisms of Grothendieck
toposes, geometric morphisms, *do not preserve that structure*, unlike the logical functors between
elementary toposes.

The natural internal logic for Grothendieck toposes turns out to
be geometric logic. This is a logic that lacks
universal quantification and implication (and thus negation) but does have *infinitary* disjunction.
This leads to a logic that is, at least superficially, incomparable to first-order logic. Closely
related logics are regular logic and coherent logic which are sub-logics of both geometric
logic and first-order logic.

We see, then, just from the examples of the natural logics of toposes, none of them are first-order logic, and we get examples that are more powerful, less powerful, and incomparable to first-order logic. Other common classes of categories give other natural logics, such as the cartesian logic from left exact categories, and monoidal categories give rise to (ordered) linear logics. We get the simply typed lambda calculus from cartesian closed categories which leads to the next topic.

A (posetal) hyperdoctrine essentially takes a category and, for each object in that category, assigns to it
a poset of “predicates” on that object. In many cases, this takes the form of the **Sub** functor assigning
to each object its poset of subobjects. Various versions of hyperdoctrines will require additional structure
on the source category, these posets, and/or the functor itself to interpret various logical connectives.
For example, a regular hyperdoctrine requires the
source category to have finite limits, the posets to be meet-semilattices, and the functor to give rise
to monotonic functions with left adjoints satisfying certain properties. This notion of hyperdoctrines is
suitable for regular logic.

It’s very easy to recognize that these functors are essentially indexed |(0,1)|-categories. This immediately suggests that we should consider higher categorical versions or at the very least normal indexed categories.

What this means for the logic is that we move from proof-irrelevant logic to proof-relevant logic. We now have potentially multiple ways a “predicate” could “entail” another “predicate”. We can present the simply typed lambda calculus in this indexed category manner. This naturally leads/connects to the categorical semantics of type theories.

Pushing forward to |(\infty, 1)|-categories is also fairly natural, as it’s natural to want to talk about an entailment holding for distinct but “equivalent” reasons.

Moving in all three of these directions simultaneously leads pretty naturally to something like Homotopy Type Theory (HoTT). HoTT is a naturally constructive (but not anti-classical) type theory aimed at being an internal language for |(\infty, 1)|-toposes.

Okay, so why did people pick classical FOL in the first place? It’s not like the concept of, say, a higher-order logic wasn’t considered at the time.

Classical versus Intuitionistic was debated at the time, but at that time it was primarily a philosophical argument, and the defense of Intuitionism was not very compelling (to me and obviously people at the time). The focus would probably have been more on (classical) FOL versus second- (or higher-)order logic.

Oversimplifying, the issue with second-order logic is fairly evident from the semantics. There are two
main approaches: Henkin-semantics and full (or standard) semantics. Henkin-semantics keeps the nice
properties of (classical) FOL but fails to get the nice properties, namely categoricity properties,
of second-order logic. This isn’t surprising as Henkin-semantics can be encoded into first-order logic.
It’s essentially syntactic sugar. Full semantics, however, states that the interpretation of predicate
sorts is power sets of (cartesian products of) the domain^{1}. This leads to massive completeness problems as our metalogical set theory has many, many
ways of building subsets of the domain. There are metatheoretic results
that state that there is no computable set of *logical* axioms that would give us a sound and
complete theory for second-order logic with respect to full semantics. This aspect is also philosophically
problematic, because we don’t want to need set theory to understand the very formulation of set theory.
Thus Quine’s comment that “second-order logic [was] set theory in sheep’s clothing”.

On the more positive and (meta-)mathematical side, we have results like
Lindström’s theorem which states that
classical FOL is the strongest logic that simultaneously satisfies (downward) Löwenheim-Skolem
and compactness. There’s also a syntactic
result by Lindström which characterizes first-order logic as the only logic having a recursively
enumerable set of tautologies and satisfying Löwenheim-Skolem^{2}.

There’s one big caveat to the above. All of the above results are formulated in traditional model
theory which means there are various assumptions built in to their statements. In the language of
categorical logic, these assumptions can basically be summed up in the statement that the *only*
category of semantics that traditional model theory considers is **Set**.

This is an utterly bizarre thing to do from the standpoint of categorical logic.

The issues with full semantics follow directly from this choice. If, as categorical logic would
have us do, we considered *every* category with sufficient structure as a potential category of
semantics, then our theory would not be forced to follow every nook and cranny of **Set**’s
notion of subset to be complete. Valid formulas would need to be true not only in **Set** but
in wildly different categories, e.g. every (Boolean) topos.

These traditional results are also often very specific to *classical* FOL. Dropping this constraint
of classical logic would lead to an even broader class of models.

A Boolean category is just a coherent category
where every object has a complement. Since coherent functors
preserve complements, we have that the category of Boolean categories is a *full* subcategory of the
category of coherent categories.

One nice thing about, specifically, *classical* first-order logic from the perspective of category
theory is the following. First, coherent logic
is a sub-logic of geometric logic restricted to finitary disjunction. Via Morleyization,
we can encode *classical* first-order logic into coherent logic such that the
categories of models of each are equivalent. This implies that a classical FOL formula is
valid if and only if its encoding is. Morleyization allows us to analyze classical FOL using
the tools of classifying toposes. On the one hand, this once again suggests the importance
of *coherent* logic, but it also means that we *can* use categorical tools with *classical* FOL.

There are certain things that I and, I believe, most logicians take as table stakes for a
(foundational) logic^{3}.
For example, *checking* a proof should be computably decidable. For these reasons, I am in
complete accord with early (formal) logicians that classical second-order logic with full semantics
*is* an unacceptably worse alternative to classical first-order logic.

However, when it comes to statements about the specialness of FOL, a lot of them seem to be more statements about traditional model theory than FOL itself, and also statements about the philosophical predilections of the time. I feel that philosophical attitudes among logicians and mathematicians have shifted a decent amount since the beginning of the 20th century. We have different philosophical predilections today than then, but they are informed by another hundred years of thought, and they are more relevant to what is being done today.

Martin-Löf type theory (MLTT) and its progeny also present an alternative path with their own philosophical
and metalogical justifications. I mention this to point out actual cases of foundational frameworks
that a (*very*) superficial reading of traditional model theory results would seem to have been “ruled
out”. Even if one thinks the FOL+**ZFC** (or whatever) is the better foundations, I think it is unreasonable
to assert that MLTT derivatives are unworkable as a foundations.

It’s worth mentioning that this is exactly what categorical logic would suggest: our syntactic power objects should be mapped to semantic power objects.↩︎

While nice, it’s not clear that compactness and, especially, Löwenheim-Skolem are sacrosanct properties that we’d be unwilling to do without. Lindström’s first theorem is thus a nice abstract characterization theorem for classical FOL, but it doesn’t shut the door on considering alternatives even in the context of traditional model theory.↩︎

I’m totally fine thinking about logics that lack these properties, but I would never put any of them forward as an acceptable foundational logic.↩︎