Don’t prematurely obsess on a single “big problem” or “big theory”

Terence Tao 2007-05-06

Millions long for immortality who do not know what to do with themselves on a rainy Sunday afternoon. — Susan Ertz, “Anger in the Sky”

There is a particularly dangerous occupational hazard in this subject: one can become focused, to the exclusion of other mathematical activity (and in extreme cases, of non-mathematical activity also), on a single really difficult problem in a field (or on some grand unifying theory) before one is really ready (both in terms of mathematical preparation, and also in terms of one’s career) to devote so much of one’s research time to such a project. This is doubly true if one has not yet learnt the limitations of one’s tools or acquired a healthy scepticism of one’s own work, as this can lead to the humiliating spectacle of proudly announcing a major breakthrough on a well known problem, only to withdraw the preprint shortly afterwards after serious flaws (often arising from pushing a method well beyond its known limits, and encountering obstructions beyond those limits that were known to the experts) are pointed out in the manuscript.

When one begins to neglect other tasks (such as writing and publishing one’s “lesser” results), hoping to use the eventual “big payoff” of solving a major problem or establishing a revolutionary new theory to compensate for lack of progress in all other areas of one’s career, then this is a strong warning sign that one should rebalance one’s priorities. While it is true that several major problems have been solved, and several important theories introduced, by precisely such an obsessive approach, this has only worked out well when the mathematician involved:

1. had a proven track record of reliably producing significant papers in the area already; and
2. had a secure career (e.g. a tenured position).

If you do not yet have both (1) and (2), and if your ideas on how to solve a big problem still have a significant speculative component (or if your grand theory does not yet have a definite and striking application), I would strongly advocate a more balanced, patient, and flexible approach instead: you can certainly keep the big problems and theories in mind, and tinker with them occasionally, but spend most of your time on more feasible “low-hanging fruit”, which will build up your experience, mathematical power, and credibility for when you are ready to tackle the more ambitious projects.

See also “Don’t base career decisions on glamour or fame” and “Use the wastebasket”. Henry Cohn also has some related advice for amateur mathematicians. This MathOverflow answer by Minhyong Kim also makes the point that one should accrue some definite mathematical results (preferably as published papers) before one can afford spending this “reputational capital” on philosophising on some “big picture” vision of mathematics.

Addendum: on publishing proofs of famous open problems

If you do believe that you have managed to solve a major problem, I would advise you to be extraordinarily sceptical of your own work, and to exercise the utmost care and caution before releasing it to anyone; there have been too many examples in the past of mathematicians whose reputation has been damaged by claiming a proof of a well-known result to much fanfare, only to find serious errors in the proof shortly thereafter. I recommend asking yourself the following questions regarding the paper:

What is the key new idea or insight? How does it differ from what has been tried before? Is this idea emphasised in the introduction to the paper? (As a colleague of mine is fond of saying: “Where’s the beef?”)
How do the arguments in this paper relate to earlier partial results or attempts on the problem? Are there clear analogues between the steps here and steps in earlier papers? Does the new work shed some light as to why previous approaches did not fully succeed? Is this discussed in the paper?
What is the simplest, shortest, or clearest new application of that idea? A related question: what is the first non-trivial new statement made in the paper that could not have been shown before by earlier methods? Is this proof-of-concept given in the paper, or does it jump straight to the big conjecture with all its additional (and potentially error-prone) complications? In the event that there is a fatal error in the full proof, is there a good chance that a deep and non-trivial new partial result can at least be salvaged?
Any major problem comes with known counterexamples, obstructions, or philosophical objections to various classes of attack strategies (e.g. strategy X does not work because it does not distinguish between problem Y, which is the big conjecture, and problem Z, for which counterexamples are known). Do you know why your argument does not encounter these obstructions? Is this stated in the paper? Do you know any specific limitations of the argument? Are these stated in the paper also?
What was the high-level strategy you employed to attack the problem? Was it guided by some heuristic, philosophy, or intuition? If so, what is it? Is it stated in the paper? If the strategy was “continue blindly transforming the problem repeatedly until a miracle occurs”, this is a particularly bad sign. Can you state, in high-level terms (i.e. rising above all the technical details and computations), why the argument works?
Does the proof come with key milestones – such as a key proposition used in the proof which is already of independent interest, or a major reduction of the unsolved problem to one which looks significantly easier? Are these milestones clearly identified in the paper?
How robust is the argument – could a single sign error or illegal use of a lemma or formula destroy the entire argument? Good indicators of robustness include: alternate proofs (or heuristics, or supporting examples) of key steps, or analogies between key parts of the argument in this paper and in other papers in the literature.
How critically have you checked the paper and reworked the exposition? Have you tried to deliberately disprove or hunt for errors in the paper? One expects a certain amount of checking to have been done when a major paper is released; if this is not done, and errors are quickly found after the paper is made public, this can potentially be quite embarrassing. Note that there is usually no rush when solving a major problem that has already withstood all attempts at solution for many years; taking the few extra days to go through the paper one last time can save oneself a lot of trouble.
How much space in the paper is devoted to routine and standard theory and computations that already appear in previous literature, and how much is devoted to the new and exciting stuff which does not have any ready counterpart in previous literature? How soon in the paper does the new stuff appear? Are both parts of the paper being given appropriate amounts of detail?

Also, to reduce any potential negative reception to such a paper (especially if, as is all too likely, significant errors are detected in it), any bragging or otherwise self-promoting text with little informative mathematical content should be kept to a minimum in the title, abstract, and introduction of the paper. For instance:

Example of a bad title: “A proof of the Poincaré conjecture”
Example of a good title: “The entropy formula for the Ricci flow and its geometric applications”

More generally, given any major open problem, the importance of the problem and its standard history will be a given to any informed reader, and should only be given a perfunctory treatment in the paper, except for those portions of the history of the problem which are of relevance to the proof. Pointing out that countless great mathematicians had tried to solve the problem and failed before you came along is in particularly bad taste and should be avoided completely.

It should also be noted that due to the sheer volume of failed attempts at solving these problems, most professional mathematicians will refuse to read any further attempts unless there is substantial auxiliary evidence that there is a non-zero chance of correctness (e.g. a previous track record of recognised mathematical achievement in the area). See for instance my editorial policy on papers involving a famous problem, or Oded Goldreich’s page on solving famous problems.

See also Scott Aaronson’s “Ten signs a claimed mathematical proof is wrong” and Dick Lipton’s “On Mathematical Diseases”.
