The Pedagogy of Logic: A Rant

Introduction

Over the years I’ve seen a lot of confusion about formal logic online. Usually this is from students (but I’ve also seen it with experienced mathematicians) on platforms like the Math or CS StackExchange. Now there is clearly some selection bias where the people asking the questions are the people who are confused, but while the questions about the confusions are common, the confusions are often evident even in questions about other aspects, the confusions are in the (upvoted!) answers, and when you look at the study material they are using it is often completely unsurprising that they are confused. Again, these confusions are also not limited to just “students”, though I’ll use that word in a broad sense below. Finally, on at least one of the points below, the confusion seems to be entirely on the instructors’ side.

An indication that this is true more broadly is this quote from the nLab:

The distinction between object language and metalanguage exists even in classical mathematics, but it seems that most classical mathematicians are not used to remembering it, although it is not entirely clear why this should be so. One possibly relevant observation is that even if P is a statement which is neither provable nor disprovable (like the continuum hypothesis), in classical mathematics it is still provable that “P is either true or false.” Moreover, classical model theory often restricts itself to two-valued models in which the only truth values are “true” and “false,” although classical logic still holds in Boolean-valued models, and in such a case the truth value of P may be neither “true” nor “false,” although the truth value of “P or not P” is still “true.” Certainly when talking about classical truths which fail constructively, such as excluded middle, it is important to remember that “fails to be provable” is different from “is provably false.”

To give an indication of the problem, here is my strong impression of what would happen if I gave a student who had just passed a full year introduction to formal logic the following exercise: “Give me a formal proof of |(\lnot P \Rightarrow \lnot Q) \Rightarrow (Q \Rightarrow P)|”. I suspect most would draw up a truth table. When I responded, “that’s not a formal proof,” they would be confused. If you are confused by that response, then this is definitely for you. I’m sure they would completely agree with me that if I asked for the inverse matrix of some matrix |M|, showing that the determinant of |M| is non-zero does not constitute an answer to that question. The parallelism of these scenarios would likely be lost on them though. While I think courses that focus on a syntactic approach to logic will produce students that are much more likely to give an appropriate answer to my exercise, that doesn’t mean they lack the confusions, just that this exercise isn’t a good discriminator for them. For example, if they don’t see the parallelism of the two scenarios I described, or worse, have no idea what truth tables have to do with anything, then they have some gap.

As a disclaimer, I am not an educator nor even an academic in any professional capacity.

As a summary, the points I will touch on are:

If anyone reading this is aware of introductory textbooks or other resources (ideally freely available, but I’ll take a reasonably priced textbook too) that avoid most or all of the major issues I list and otherwise do a good job, I would be very interested. My own education on these topics has been piecemeal and eclectic and also strongly dependent on my background in programming. This leaves me with no reasonable recommendations. Please leave a comment or email me, if you have such a recommendation.

Syntax versus Semantics

The distinction between syntax and semantics is fundamental in logic. Many, many students, on the other hand, seem completely unaware of the distinction. The most blatant form of this is that for many students it seems quite clear that “proving a formula true” means plugging in truth values and seeing that the result is always “true”. Often they (seemingly) think a truth table is a formal proof. They find it almost impossible to state what the difference between provable and “true” is. The impression I get from some is that they think plugging truth values into formulas is what logic is about.

If the students get to classical predicate logic, most of these misunderstandings will be challenged, but students with this misunderstanding will struggle unnecessarily, and you can see this on sites like math.stackexchange.com where questioners present attempts to prove statements of predicate logic by reducing them to propositional statements. It’s not clear to me what typically happens to students with such misunderstandings as they gain proficiency with classical predicate logic. Certainly some clear up their misunderstanding, but since it’s clear that some still have a slippery grasp of the distinction between syntax and semantics, some misunderstanding remains for them. My guess would be that many just view classical propositional logic and classical predicate logic as more different than they are with unrelated approaches to proof. Any instructors reading this can probably describe much stranger rationalizations that students have come up with, and perhaps give some indications of what faulty rationalizations are common.

Speaking of instructors, the pedagogical problem I see here is not that the textbook authors and instructors don’t understand these distinctions, or even that they don’t present these distinctions, but that they don’t continually emphasize these distinctions. Maybe they do in the classroom, but going by the lecture notes or books, the notation and terminology that many use make it very easy to conflate syntax and semantics. For understandable reasons, proofs of the soundness and completeness theorems that are necessary to justify this conflation are pushed off until much later.

I googled lecture notes "classical propositional logic" and looked at the first hit I got. It clearly has a section on semantics and multiple sections on a (syntactic) deductive system “helpfully” named “semantic tableaux”. It bases all concepts on truth tables introduced in section 4. Later concepts are justified by corresponding to the truth table semantics. Here is how it describes the semantics for negation:

A negation |\lnot A| is true (in any situation) if and only if |A| is not true (in that situation).

This is followed by the sentence: “For the moment, we shall suppress the qualifier ‘in any situation’[…]”, and we get the rules for the other connectives starting with the one for conjunction:

A conjunction |A \wedge B| is true if and only if |A| is true and |B| is true.

It later explicates the “in any situation” by talking about truth-value assignments and giving “rules” like:

|A \wedge B| is true in an assignment iff |A| and |B| are true in that assignment.

Between these two renditions of the rule for conjunction, the notes tell us what a “truth-value” is and seemingly what “is true” means with the following text:

In classical propositional logic, formulas may have only one of two possible “truth values”. Each formula must be either true or false; and no formula may be both true and false. These values are written T and F. (Some authors use |1| and |0| instead. Nothing hinges on this.)

The description so far makes it sound like logical formula are “expressions” which “compute” to true or false, which will also be written T or F, like arithmetical expressions. This is made worse in section 11 where the connectives are identified with truth-functions (which is only mentioned in this early section). It would be better to have a notation to distinguish a formula from its truth value e.g. |v(A)| or, the notation I like, |[\! [A]\!]|. (As I’ll discuss in the next section, it would also be better to avoid using philosophically loaded terms like “true” and “false”. |1| and |0| are better choices, but if you really wanted to drive the point home, you could use arbitrary things like “dog” and “cat”. As the authors state, “nothing hinges on this”.) Rewording the above produces a result that I believe is much harder to misunderstand.

|v(A \wedge B)=1| if and only if |v(A)=1| and |v(B)=1|.

This is likely to be perceived as less “intuitive” to students, but I strongly suspect the seeming cognitive ease was masking actually grappling with what is being said in the earlier versions of the statement. If we actually want to introduce truth functions, I would highly recommend using different notation for them, for example:

|v(A \wedge B)=\mathsf{and}(v(A),v(B))|

Of course, we can interpret “|A| is true” as defining a unary predicate on formulas, “_ is true”, but I seriously doubt this is where most students’ minds jump to when being introduced to formal logic, and this view is undermined by the notion of “truth values”. Again, I’m not saying the authors are saying things that are wrong, I’m saying that they make it very easy to misunderstand what is going on.

Following this, most of the remainder of the notes discusses the method of semantic tableaux. It mentions soundness and completeness but also states: “In this introductory course, we do not expect you to study the proof of these two results[.]”) At times it suggests that the tableaux method can stand on its own: “However, we can show that it is inconsistent without using a truth table, by a form of deductive reasoning. That is, by following computer-programmable inference rules.” But the method is largely presented as summarizing semantic arguments: “This rule just summarizes the information in the truth table that…” In my ideal world, this more deductive approach would be clearly presented as an alternative approach to thinking about logic with the inference rules not explained in terms of semantic notions but related to them after the fact.

My point isn’t to pick on these notes which seem reasonable enough overall. They have some of the other issues I’ll cover but also include some of the things I’d like to see in such texts.

Unnecessary Philosophizing

As mentioned in the previous section, words like “true” and “false” have a huge philosophical weight attached to them. It is quite easy to define any particular logic either syntactically or semantically without even suggesting the identification of anything with the philosophical notion of “truth”. My point isn’t that philosophy should be banished from a mathematical logic course, though that is an option, but that we can separate the definition of a logical system with how it relates to the intuitive or philosophical notion of “truth”. This allows us to present logics as the mathematical constructions they are and not feel the need to justify why any given rules/semantics really are the “right” ones. Another way of saying it is that your philosophical understanding/stance doesn’t change the definition of classical propositional logic. It just changes whether that logic is a good reflection of your philosophical views.

(As an aside, while I did read introductions to logic early on, the beginnings of my in-depth understanding of logic arose via programming language theory and type systems [and then category theory]. These areas often cover the same concepts and systems but with little to no philosophizing. Papers discuss classical and constructive logics with no philosophical commitment for or against the law of excluded middle being remotely implied.)

Of course, we do want to leverage our intuitive understanding of logic to motivate either the inference rules or the chosen semantics, but we can phrase the definitions as being inspired by the intuition rather than codifying it. Things like the common misgivings students (and others) have about material implication are a lot easier to deal with when logic isn’t presented as “The One True Formulation of Correct Reasoning”. Of course another way to undermine this impression would be to present a variety of logics including ones where material implication doesn’t hold. This is also confounded by unnecessary philosphizing. It makes it seem like non-classical logics are only of interest to crazy contrarians who reject “self-evident” “truths” like every statement is either “true” or “false”. While there definitely is a philosophical discussion that could be had here, there are plenty of mathematical reasons to be interested in non-classical logics regardless of your philosophical stance on truth. Non-classical logics can also shed a lot of light on what’s going on in classical logics.

This assumes non-classical logics are even mentioned at all.

The Complete Absence of Non-Classical Logics

I’m highly confident you can go through many introductory courses on logic and never have any inkling that there are logics other than classical logics. I suspect you can easily go through an entire math major and also be completely unaware of non-classical logics. For an introductory course on logic, I completely understand that time constraints can make it difficult to include any substantial discussion of non-classical logics. I somewhat less forgiving when it comes to textbooks.

What bothers me isn’t that non-classical logics aren’t given equal time - I wouldn’t expect them to be - it’s that classical logic is presented as a comprehensive picture of logic. This is likely, to some degree, unintentional, but it is easily fixable. First, at the beginning of the course, simply mention that the course will cover classical logics and there are other logics like constructive or paraconsistent logics that have different rules and different notions of semantics. Use less encompassing language. Instead of saying “logic”, say “a logic” or “the logic we’re studying”. Similarly, talk about “a semantics” rather than “the semantics”, and “a set of rules” rather than “the rules”. Consistently say “classical propositional/first-order/predicate logic” rather than just “propositional/first-order/predicate logic” or worse just “logic”. Note points where non-classical logics make different choices when the topics come up. For example, that truth tables are a semantics for classical propositional logic while non-classical logics will use a different notion of semantics. The law of excluded middle, indirect proof, and material implication are other areas where a passing mention of how different logics make different decisions about these topics could be made. The principle of explosion, related to the issues with material implication, is an interesting point where classical and constructive logics typically agree while paraconsistent logics differ. Heck, I suspect some people find my use of the plural, “logics”, jarring, as if there could be more than one logic. Fancy that!

The goal with this advice is just to indicate to students that there is a wider world of logic out there. I don’t expect students to be quizzed on this. Some of this advice is relevant even purely for classical logics. It’s not uncommon for people to ask questions on the Math StackExchange asking for a proof of some formula. When asked what rules they are using, they seem confused: “The rules of predicate logic(?!)” The idea that the presentation of (classical) predicate logic that they were given is only one of many possibilities seems never to have occurred to them. Similarly for semantics. So generally when a choice is made, it would seem of long-term pedagogical value to at least mention that a choice has been made and that other choices are possible.

Silly Linguistics

Nearly any week on the Math StackExchange you can find someone asking about an exercise like the following:

Translate “Jan likes Sally but not Alice” to a logical formula.

Exercises like these are just a complete and utter waste of time. The only people who might do something like this are linguists. Even then the “linguistics” presented in introductions to formal logic is (understandably) ridiculously naive, and an introductory course on formal logic is simply not the place for it anyway. What the vast majority of consumers of formal logic do need is to 1) be able to understand what formal expressions mean, and 2) be able to present their own thoughts formally. Even when formalizing another person’s informal proof, the process is one of understanding what the person was saying and then writing down your understanding in formal language. Instead what’s presented is rules-of-thumb like “‘but’ means ‘and’”.

I could understand having an in-class discussion where the instructor presents natural language statements and asks students interactively to provide various interpretations of the statements. The ambiguity in natural language should quickly become apparent and this motivates the introduction of unambiguous formal notation. What makes no sense to me is giving students sheets of natural language statements about everyday situations that they need to “formalize” and grading them on this. Even if this was valuable, which I don’t think it is, the opportunity cost of spending time on this rather than some other topic would make it negatively valued to me.

What is fine is telling students how to read formal notation in natural language as long as it’s clear that these readings are not definitions. I have seen people claim that e.g. |A \Rightarrow B| is defined as “if |A|, then |B|”, but this is likely a misunderstanding on their part and not what their instructors or books said. On the other hand, it doesn’t seem too rare for a textbook to present a logical connective and then explicate it in terms of natural language examples shortly before or after providing a more formal definition (via rules or semantics). I don’t find it completely surprising that some students confuse these examples as definitions especially when the actual definitions, at times, don’t look terribly different from schematic phrases of natural languages.

I think natural language examples, and especially everyday as opposed to mathematical examples, should largely be avoided, or at the very least should be used with care.

The Complete Conflation of Negation Introduction with Double Negation Elimination

Negation introduction is the following rule: if by assuming |P| we can produce a contradiction, then we can conclude |\lnot P|. You could present this as an axiom: |(P \Rightarrow \bot) \Rightarrow \lnot P|.

Double negation elimination is the following rule: if by assuming |\lnot P| we can produce a contradiction, then we can conclude |P|. You could present this as an axiom: |(\lnot P \Rightarrow \bot) \Rightarrow P| or |\lnot\lnot P \Rightarrow P|.

It seems pretty evident that many experienced mathematicians, let alone instructors and students, don’t see any difference between these. “Proof by contradiction” or “reductio ad absurdum” are often used to ambiguously refer to either of the above rules. This has been discussed a few times. When even the likes of Timothy Gowers does this, I think it is safe to say many others do as well. Or just look at the comments to Andrej’s article which I highly recommend.

As discussed in the referenced articles, and evidenced in the comments, there seems to be two things going on here. First, it seems that many people believe that dropping double negations, i.e. treating |\lnot\lnot P| and |P| is just a thing that you can do. They incorrectly reason as follows: substitute |\lnot P| into |(P \Rightarrow \bot) \Rightarrow \lnot P| to derive |(\lnot P \Rightarrow \bot) \Rightarrow \lnot\lnot P| which is the same as |(\lnot P \Rightarrow \bot) \Rightarrow P|. Of course, double negation elimination, the thing they are trying to show follows from negation introduction, is what lets them do that last step in the first place! It’s clear that negation introduction only adds (introduces, if you will) negations and can’t eliminate them. The second is that any informal proof of the form “Suppose |P| … hence a contradiction” is called a “proof by contradiction”.

As alluded to in the articles, people develop and state false beliefs about constructive logics because they are unaware of this distinction. This is an excellent area where exploring non-classical logics can be hugely clarifying. The fact that negation introduction is not only derivable but often axiomatic in constructive logics, while adding double negation elimination to a constructive logic makes it classical, very clearly illustrates that there is a huge difference between these rules. But forget about constructive logics, what’s happening here is many mathematicians are deeply confused about the key thing that makes classical logics classical. Further, students are being taught this confusion. Clearly there is some pedagogical failure occurring here.

Overuse of Indirect Proof

A major cause of the above conflation of negation introduction and double negation elimination is the overuse and overemphasis of indirect proof. This causes more problems than just the aforementioned conflation. For example, it’s relatively common to see proofs by students that are of the form: “Assume |\lnot P|. We can show that |P| holds via [proof of |P| without using the assumption |\lnot P|]. Hence we have |P| and |\lnot P| and thus a contradiction. Therefore |P|.” This is, of course, silly. The indirect proof by contradiction is just a wrapper around a direct proof of |P|. To make this completely clear, the logical structure mirrors the logical structure of the following argument: “Assume for contradiction |1+1 \neq 2|. Since by calculation |1+1=2| we have a contradiction. Thus |1+1=2|.”

It’s not always this egregious. Often the use indirect proof isn’t completely superfluous like the previous examples, but there is nevertheless a direct proof which is clearer, more informative, more general, and shorter. If these proofs are clearer and shorter, why don’t students find them? I see a variety of reasons for this. First, indirect proof is conceptually more complicated so more effort is spent explaining it, more examples using it are given, and its use is encouraged for practice. Second, the proof system provided is often intimately based on classical propositional logic (e.g. the “semantic tableaux” above or resolution) and often essentially requires the use of indirect proof to prove anything. For example, if we take resolution as the primitive rule, i.e. if |P \vee Q| and |\lnot Q| then |P|, then doing a case analysis requires an indirect proof in general since clearly both |P| and |Q| could hold so we can only hypothetically assume |\lnot Q|. (If you want a more formal argument, see this question and its answers.) Relying on double negation elimination to get negation introduction is another example. It’s common in less formal approaches to effectively take all classical propositional tautologies as axioms (which blurs the distinction between syntax and semantics). Many equivalences or definitions also hide uses of indirect proof (if we were to demonstrate them with respect to a constructive logic). For example, |P \Rightarrow Q \equiv \lnot P \vee Q|, |\lnot\lnot P \equiv P|, |\exists x. P(x) \equiv \lnot \forall x.\lnot P(x)|, and |P \vee Q \equiv \lnot (\lnot P \wedge \lnot Q)|. Third, often definitions and theorems are given in a “negative” way so that their negations or contrapositives are actually more useful. For example, defining “injective” as |f(x)=f(y) \Rightarrow x = y| is a little more convenient than |x \neq y \Rightarrow f(x) \neq f(y)| e.g. when proving an injective function has a post-inverse or here. Last but not least, it certainly can be harder, up to and including impossible, to prove a statement directly (i.e. constructively). That extra information doesn’t come from nowhere. As an example, there is a simple non-constructive proof that either |\sqrt{2}^\sqrt{2}| or |(\sqrt{2}^\sqrt{2})^\sqrt{2}| is a rational number that’s of the form of an irrational number raised to an irrational power, but which it is is much harder to show.

What I’d recommend is for instructors and authors to strongly encourage direct/constructive proofs suggesting that a direct proof should be attempted before reaching for indirect proof. Perhaps even require direct/constructive proofs unless the exercise specifies otherwise. This, of course, requires a proof system that doesn’t force the use of indirect proofs. I’d recommend a natural deduction or sequent style presentation where each logical connective is specified in terms of rules that are independent of the other connectives. (I’m not saying that the formal sequent calculus notation needs to be used, just that there should be a rule, formal or informal, for each rule in the usual presentation of natural deduction, say.) As in the earlier section, I strongly recommend keeping syntax and semantics separate: the proof system should be able to stand on its own. Definitions and theorems should be given in forms that are constructively-friendly, which, admittedly, can take a good amount of care. To avoid students spending a lot of time looking for simple direct proofs that don’t exist, give hints where indirect proof is or is not required. While it can be helpful to motivate the significance of direct proofs and why certain choices were made, there’s no need to explicitly reference constructive logics for any of this.

Smaller Issues

Finally, here are some minor issues that either seem less common or less problematic as well as some potential missed opportunities.

The first is a failure of abstraction I’ve occasionally seen, and I think certain books encourage. I’ve seen students who talk about the “symbols” of their logic as “including …, '(', and ')'”. While certainly how to parse strings into syntax trees is a core topic of formal languages, it’s not that important for formal logic. The issues with parsing formulas I see have to do with conventions on omitting parentheses, not an inability to match them. There can be some benefit to driving home that syntax is just inert symbols, but the concept of a syntax tree seems more important and the process of producing a syntax tree from a string of symbols rather secondary. Working at the level of strings adds a lot of complexity and offers little insight into the logical concepts.

As a missed opportunity, there’s a lot of algorithmic content in formal logic. This should come as no surprise as the mechanization of reasoning has been a long time dream of mathematics and was an explicit goal during the birth of formal logic. Unfortunately, if the instructors and/or students don’t have a programming background it can be hard to really leverage this. Still, my impression is that the only indication of the mechanizability of much of logic that students get is in the mechanical drudgery of doing certain exercises. Even when it’s not possible to actually implement some of the relevant ideas, they can still be mentioned. For example, it is often taken as a requirement of a proof system that the proofs can be mechanically verified. That is, we can write a program that takes a formal proof and tells you whether it is a valid proof or not. This should be at least mentioned. Even better, for more than just this reason, it can be experienced by using proof assistants like Coq or MetaMath or simpler things like Logitext. There are other areas of mechanization. Clearly, we can mechanize the process of checking truth tables for classical propositional logic, though there are interesting efficiency aspects to mention here. The completeness theorem of classical propositional logic is also constructively provable producing an algorithm that will take a formula with a truth table showing its validity and will produce a formal proof of that formula. As a little more open-ended topic, much of proof search is mechanical. I’ve often seen students stumped by problems whose proofs can be found in a completely goal-directed manner without any guessing at all.

Another issue is the use of Hilbert-style presentations of logics. It does seem that many logic texts thankfully use natural deduction or some other humane proof system, but Hilbert-style presentations aren’t uncommon. The benefit of Hilbert-style presentations of logics is that they can simplify meta-logical reasoning. But many introductions to logic do little to no meta-logical reasoning. The downsides of Hilbert-style presentations is that they don’t match informal reasoning well, they don’t provide much insight into the logical connectives, and proofs in Hilbert-style proof systems are just painful where even simple results are puzzles. Via the lens of Curry-Howard, Hilbert-style proof systems correspond to combinatory algebras, and writing programs in combinator systems (such as unlambda) is similarly unpleasant. The fact that introducing logic using a Hilbert-style proof system is formally analogous to introducing programming using an esoteric programming language designed to be obfuscatory says something…