关于数学论文中的"局部"与"全局"错误及其检测方法

Terence Tao 2019-02-15

On “local” and “global” errors in mathematical papers, and how to detect them

Formally, a mathematical proof consists of a sequence of mathematical statements and deductions (e.g. “If $A$ , then $B$ ”), strung together in a logical fashion to create a conclusion. A simple example of this is a linear chain of deductions, such as " $A \implies B \implies C \implies D$ ", to create the conclusion " $D$ ". In practice, though, proofs tend to be more complicated than a linear chain, often acquiring a tree-like structure (or more generally, the structure of a directed acyclic graph), due to the need to branch into cases, or to reuse a hypothesis multiple times. Proof methods such as proof by contradiction, or proof by induction, can lead to even more intricate loops and reversals in a mathematical argument.

Unfortunately, not all proposed proofs of a statement in mathematics are actually correct, and so some effort needs to be put into verification of such a proposed proof. Broadly speaking, there are two ways that one can show that a proof can fail. Firstly, one can find a “local”, “low-level” or “direct” objection to the proof, by showing that one of the steps (or perhaps a cluster of steps, see below) in the proof is invalid. For instance, if the implication $B \implies C$ is false, then the above proposed proof " $A \implies B \implies C \implies D$ " of " $D$ " is invalid (though it is of course still conceivable that $D$ could be proven by some other route).

Sometimes, a low-level error cannot be localised to a single step, but rather to a cluster of steps. For instance, if one has a circular argument, in which a statement $A$ is claimed using $B$ as justification, and $B$ is then claimed using $A$ as justification, then it is possible for both implications $A \implies B$ and $B \implies A$ to be true, while the deduction that $A$ and $B$ are then both true remains invalid. (Note though that there are important and valid examples of near-circular arguments, such as proofs by induction, but this is not the topic of this current discussion.)

Another example of a low-level error that is not localisable to a single step arises from ambiguity. Suppose that one is claiming that $A \implies B$ and $B \implies C$ , and thus that $A \implies C$ . If all terms are unambiguously well-defined, this is a valid deduction. But suppose that the expression $B$ is ambiguous, and actually has at least two distinct interpretations, say $B_1$ and $B_2$ . Suppose further that the $A \implies B$ implication presumes the former interpretation $B_1$ , while the $B \implies C$ implication presumes the latter interpretation $B_2$ , thus we actually have $A \implies B_1$ and $B_2 \implies C$ . In such a case we can no longer validly deduce that $A \implies C$ (unless of course we can show in addition that $B_1 \implies B_2$ ). In such a case, one cannot localise the error to either " $A \implies B$ " or " $B \implies C$ " until $B$ is defined more unambiguously. This simple example illustrates the importance of getting key terms defined precisely in a mathematical argument.

The other way to find an error in a proof is to obtain a “high level” or “global” objection, showing that the proof, if valid, would necessarily imply a further consequence that is either known or strongly suspected to be false. The most well-known (and strongest) example of this is the counterexample. If one possesses a counterexample to the claim $D$ , then one instantly knows that the chain of deduction " $A \implies B \implies C \implies D$ " must be invalid, even if one cannot immediately pinpoint where the precise error is at the local level. Thus we see that global errors can be viewed as “non-constructive” guarantees that a local error must exist somewhere.

A bit more subtly, one can argue using the structure of the proof itself. If a claim such as $P$ could be proven by a chain $A \implies B \implies C \implies D \implies P$ , then this might mean that a parallel claim $Q$ could then also be proven by a parallel chain $A' \implies B' \implies C' \implies D' \implies Q$ of logical reasoning. But if one also possesses a counterexample to $Q$ , then this implies that there is a flaw somewhere in this parallel chain, and hence (presumably) also in the original chain. Other examples of this type include proofs of some conclusion that mysteriously never use in any essential way a crucial hypothesis (e.g. proofs of the non-existence of non-trivial integer solutions to $x^n + y^n = z^n$ that mysteriously never use the hypothesis that $n$ is strictly greater than $2$ , or which could be trivially adapted to cover the $n=2$ case).

While global errors are less constructive than local errors, and thus less satisfying as a “smoking gun”, they tend to be significantly more robust. A local error can often be patched or worked around, especially if the proof is designed in a fault-tolerant fashion (e.g. if the proof proceeds by factoring a difficult problem into several strictly easier pieces, which are in turn factored into even simpler pieces, and so forth). But a global error tends to invalidate not only the proposed proof as it stands, but also all reasonable perturbations of that proof. For instance, a counterexample to $D$ will automatically defeat any attempts to patch the invalid argument $A \implies B \implies C \implies D$ , whereas the more local objection that $B$ does not imply $C$ could conceivably be worked around.

(There is a mathematical joke in which a mathematician is giving a lecture expounding on a recent difficult result that he has just claimed to prove. At the end of the lecture, another mathematician stands up and asserts that she has found a counterexample to the claimed result. The speaker then rebuts, “This does not matter; I have two proofs of this result!”. Here one sees quite clearly the distinction of impact between a global error and a local one.)

It is also a lot quicker to find a global error than a local error, at least if the paper adheres to established standards of mathematical writing. To find a local error in an $n$ -page paper, one basically has to read a significant fraction of that paper line-by-line, whereas to find a global error it is often sufficient to skim the paper to extract the large-scale structure. This can sometimes lead to an awkward stage in the verification process when a global error has been found, but the local error predicted by the global error has not yet been located. Nevertheless, global errors are often the most serious errors of all.

It is generally good practice to try to structure a proof to be fault tolerant with respect to local errors, so that if, say, a key step in the proof of Lemma 17 fails, then the paper does not collapse completely, but contains at least some salvageable results of independent interest, or shows a reduction of the main problem to a simpler one. Global errors, by contrast, cannot really be defended against by a good choice of proof structure; instead, they require a good choice of proof strategy that anticipates global pitfalls and confronts them directly.

One last closing remark: as error-testing is the complementary exercise to proof-building, it is not surprising that the standards of rigour for the two activities are dual to each other. When one is building a proof, one is expected to adhere to the highest standards of rigour that are practical, since a single error could well collapse the entire effort. But when one is testing an argument for errors or other objections, then it is perfectly acceptable to use heuristics, hand-waving, intuition, or other non-rigorous means to locate and describe errors. This may mean that some objections to proofs are not watertight, but instead indicate that either the proof is invalid, or some accepted piece of mathematical intuition is in fact inaccurate. In some cases, it is the latter possibility that is the truth, in which case the result is deemed “paradoxical”, yet true. Such objections, even if they do not invalidate the paper, are often very important for improving one’s intuition about the subject.

关于数学论文中的”局部”与”全局”错误及其检测方法

从形式上说，数学证明由一系列数学陈述和推理（例如”若 $A$ ，则 $B$ ”）组成，以逻辑方式串联起来形成结论。一个简单的例子是线性推理链，如" $A \implies B \implies C \implies D$ "，从而得出" $D$ "这一结论。然而在实践中，证明往往比线性链更复杂，常常呈现出树状结构（或更一般地说，有向无环图的结构），这是因为需要分情况讨论，或者多次重复使用某个假设。诸如反证法或数学归纳法之类的证明方法，可能导致数学论证中出现更复杂的循环和反转。

不幸的是，并非所有对数学命题的拟议证明都是正确的，因此需要付出一些努力来验证这些拟议证明。广义上说，有两种方式可以表明证明可能失败。首先，可以通过证明证明中的某个步骤（或者可能是几个步骤的集合，见下文）无效来找到对证明的”局部”、“低层次”或”直接”反驳。例如，如果蕴含关系 $B \implies C$ 为假，那么上述对" $D$ "的拟议证明" $A \implies B \implies C \implies D$ "就是无效的（尽管当然仍有可能通过其他途径证明 $D$ ）。

有时，低层次错误无法定位到单个步骤，而是定位到几个步骤的集合。例如，如果存在循环论证，其中陈述 $A$ 以 $B$ 作为理由被声称成立，而 $B$ 又以 $A$ 作为理由被声称成立，那么即使两个蕴含关系 $A \implies B$ 和 $B \implies A$ 都为真，推导出 $A$ 和 $B$ 都成立的推理仍然是无效的。（但请注意，存在重要且有效的近似循环论证的例子，例如归纳证明，但这并非当前讨论的主题。）

另一个无法定位到单个步骤的低层次错误的例子源于歧义性。假设某人声称 $A \implies B$ 和 $B \implies C$ ，因此 $A \implies C$ 。如果所有术语都无歧义地明确定义，这是一个有效的推理。但假设表达式 $B$ 存在歧义，实际上至少有两种不同的解释，比如 $B_1$ 和 $B_2$ 。进一步假设 $A \implies B$ 这一蕴含关系假定前一种解释 $B_1$ ，而 $B \implies C$ 这一蕴含关系假定后一种解释 $B_2$ ，因此我们实际上有 $A \implies B_1$ 和 $B_2 \implies C$ 。在这种情况下，我们不能再有效地推导出 $A \implies C$ （除非我们能够额外证明 $B_1 \implies B_2$ ）。在这种情况下，在 $B$ 被更明确地定义之前，无法将错误定位到" $A \implies B$ "或" $B \implies C$ "中的任何一个。这个简单的例子说明了在数学论证中精确定义关键术语的重要性。

在证明中发现错误的另一种方法是获得”高层次”或”全局”的反驳，表明该证明如果有效，将必然推出一个已知或被强烈怀疑为假的进一步结论。这方面最著名（也最强有力）的例子是反例。如果某人拥有对命题 $D$ 的反例，那么他立即知道推理链" $A \implies B \implies C \implies D$ "必定无效，即使无法立即在局部层面 pinpoint 精确的错误所在。因此我们看到，全局错误可以被视为”非构造性”的保证，表明某处必定存在局部错误。

更微妙一点，可以利用证明本身的结构进行论证。如果像 $P$ 这样的命题可以通过链 $A \implies B \implies C \implies D \implies P$ 来证明，那么这可能意味着一个平行的命题 $Q$ 也可以通过平行的逻辑推理链 $A' \implies B' \implies C' \implies D' \implies Q$ 来证明。但如果某人还拥有对 $Q$ 的反例，那么这意味着在这个平行链中的某处存在缺陷，因此（大概）在原始链中也存在缺陷。这种类型的其他例子包括对某个结论的证明，这些证明神秘地从未以任何实质性的方式使用一个关键假设（例如，对 $x^n + y^n = z^n$ 不存在非平凡整数解的证明，神秘地从未使用 $n$ 严格大于 $2$ 这一假设，或者可以轻易修改以覆盖 $n=2$ 的情况）。

虽然全局错误比局部错误更缺乏构造性，因此作为”确凿证据”而言不那么令人满意，但它们往往显著地更稳健。局部错误通常可以被修补或绕过，特别是当证明是以容错方式设计时（例如，如果证明通过将困难问题分解为几个严格更简单的部分，这些部分又被分解为更简单的部分，依此类推）。但全局错误往往不仅使拟议的证明本身无效，而且使该证明的所有合理变体也无效。例如，对 $D$ 的反例将自动挫败任何修补无效论证 $A \implies B \implies C \implies D$ 的尝试，而对 $B$ 不蕴含 $C$ 这一更局部的反驳则有可能被绕过。

（有一个数学笑话：一位数学家正在做讲座，阐述他刚刚声称证明的一个近期困难结果。讲座结束时，另一位数学家站起来断言她找到了对该声称结果的反例。演讲者反驳道：“这没关系；我有两个证明这个结果的方法！“这里可以非常清楚地看到全局错误和局部错误在影响上的区别。）

发现全局错误也比发现局部错误快得多，至少当论文遵循既定的数学写作标准时是如此。要在 $n$ 页的论文中发现局部错误，基本上需要逐行阅读该论文的相当一部分内容，而要发现全局错误，通常只需浏览论文以提取其大尺度结构。这有时会导致验证过程中出现一个尴尬的阶段：已经发现了全局错误，但由全局错误预测的局部错误尚未定位。尽管如此，全局错误往往是最严重的错误。

通常，良好的做法是尝试构建对局部错误具有容错性的证明结构，这样，例如，如果引理17证明中的某个关键步骤失败，论文不会完全崩溃，而是至少包含一些可挽救的独立结果，或者显示出将主要问题简化为更简单问题的过程。相比之下，全局错误无法通过良好的证明结构选择来防御；相反，它们需要良好的证明策略选择，以预见全局陷阱并直接面对它们。

最后一点结束语：由于错误检测是证明构建的互补活动，这两种活动对严谨性的标准是相互对偶的，这并不奇怪。当构建证明时，人们被期望遵循实际可行的最高严谨性标准，因为单个错误很可能导致整个努力崩溃。但当检测论证中的错误或其他反驳时，使用启发式方法、粗略估计、直觉或其他非严谨手段来定位和描述错误是完全可接受的。这可能意味着对证明的一些反驳并非无懈可击，而是表明要么证明无效，要么某个被接受的数学直觉实际上是不准确的。在某些情况下，后一种可能性是事实，在这种情况下，结果被认为是”悖论性的”，但却是真实的。这样的反驳，即使不使论文无效，也常常对于改进对该主题的直觉非常重要。