如何变得流行

Paul Graham 2001-05-01

(本文是作为一种新语言的商业计划书而写的。因此它缺少了(因为被视作理所当然)一个好的编程语言最重要的特征:极其强大的抽象能力。)

我有一个朋友曾经告诉一位著名的操作系统专家,他想设计一种真正好的编程语言。专家告诉他,这将是浪费时间,编程语言不会因为其优点而变得流行或不流行,所以无论他的语言有多好,都没人会使用。至少,这是他设计的语言所发生的情况。

到底是什么让一种语言变得流行?流行的语言是否配得上它们的流行度?试图定义一个好的编程语言是否值得?你会怎么做?

我认为这些问题的答案可以通过观察黑客并了解他们想要什么来找到。编程语言是为黑客而设计的,而编程语言作为编程语言(而不是,比如说,作为指称语义练习或编译器设计练习)是好的,当且仅当黑客喜欢它。

1 流行的机制

当然,大多数人在选择编程语言时并不仅仅基于它们的优点。大多数程序员被告知要使用什么语言。然而我认为这些外部因素对编程语言流行度的影响并不像人们有时认为的那么大。我认为更大的问题是黑客对好的编程语言的想法与大多数语言设计者的想法不同。

在这两者之间,黑客的意见才是重要的。编程语言不是定理。它们是工具,为人们设计的,必须像鞋子必须为人类的脚设计一样,适合人类的优点和缺点。如果鞋子穿上时夹脚,那它就是一双坏鞋,无论它作为雕塑品多么优雅。

可能是大多数程序员无法区分好语言和坏语言。但这与其他工具没有什么不同。这并不意味着尝试设计好语言是浪费时间。专家黑客在看到好语言时能够识别出来,他们会使用它。诚然,专家黑客是极少数,但这个极少数群体编写了所有好的软件,他们的影响力使得其他程序员倾向于使用他们使用的任何语言。通常,这不仅仅是影响,而是命令:专家黑客通常就是那些作为老板或导师告诉其他程序员使用什么语言的人。

专家黑客的意见并不是决定编程语言相对流行度的唯一力量——遗留软件(Cobol)和炒作(Ada、Java)也起作用——但我认为它是长期来看最强大的力量。只要有初始的临界质量和足够的时间,一种编程语言大概会达到它应得的流行程度。而流行度会进一步把好语言和坏语言区分开,因为来自真实用户的反馈总会带来改进。看看任何流行语言在其生命周期中发生了多大变化。Perl和Fortran是极端的例子,但即使是Lisp也变了很多。例如,Lisp 1.5没有宏;宏是后来才演化出来的,那时麻省理工学院的黑客已经用Lisp写了几年真实的程序。[1]

所以无论一种语言是否必须好才能流行,我认为一种语言必须流行才能好。它必须保持流行才能保持好。编程语言的工艺水平不会停滞不前。然而我们今天拥有的Lisp几乎与麻省理工学院在1980年代中期的Lisp相同,因为那是Lisp最后一次拥有足够大和要求高的用户群的时候。

当然,黑客在使用一种语言之前必须先知道它。他们怎么听说?从其他黑客那里。但必须先有一个使用该语言的初始黑客群体,其他人才可能听说它。我不知道这个群体必须多大;多少用户才算临界质量?随口一说,我会说二十个。如果一种语言有二十个独立的用户,也就是二十个自己决定使用它的用户,我就会认为它是真实存在的。

做到这一点并不容易。如果说从零到二十比从二十到一千更难,我不会感到惊讶。获得最初二十个用户的最好办法,可能是使用特洛伊木马:给人们一个他们想要的应用程序,而它恰好是用这种新语言写成的。

2 外部因素

让我们首先承认一个确实影响编程语言流行度的外部因素。要变得流行,编程语言必须是流行系统的脚本语言。Fortran和Cobol是早期IBM大型机的脚本语言。C是Unix的脚本语言,后来Perl也是。Tcl是Tk的脚本语言。Java和Javascript旨在成为Web浏览器的脚本语言。

Lisp不是一个大规模流行的语言,因为它不是大规模流行系统的脚本语言。它保留的流行度可以追溯到1960年代和1970年代,当时它是麻省理工学院的脚本语言。当时的许多伟大程序员都在某个时期与麻省理工学院有关联。在1970年代初期,在C之前,麻省理工学院的Lisp方言MacLisp是严肃黑客想要使用的唯一编程语言之一。

今天,Lisp是两个适度流行系统的脚本语言,Emacs和Autocad,因此我怀疑今天大部分Lisp编程都是在Emacs Lisp或AutoLisp中完成的。

编程语言不是孤立存在的。Hack是一个及物动词——黑客通常在hack某物——在实践中,语言是相对于它们用来hack的任何东西来评判的。所以如果你想设计一种流行的语言,你要么必须提供比语言更多的东西,要么你必须设计你的语言来取代某个现有系统的脚本语言。

Common Lisp不受欢迎的部分原因是因为它是一个孤儿。它确实伴随着一个要hack的系统:Lisp Machine。但Lisp Machines(以及并行计算机)在1980年代被通用处理器日益增长的力量压垮了。如果Common Lisp是Unix的好脚本语言,它可能会保持流行。唉,它是一个非常糟糕的脚本语言。

描述这种情况的一种方法是说,语言并不是只凭自身的优点被评判的。另一种看法是:一种编程语言如果不同时是某个系统的脚本语言,就算不上真正的编程语言。这只有在出乎意料时才显得不公平。我认为这并不比要求一种编程语言必须有(比如说)一个实现更不公平。这本来就是编程语言的一部分。

当然,编程语言需要好的实现,而且这必须是免费的。公司会为软件付费,但个人黑客不会,而你需要吸引的是黑客。

语言还需要有一本讲它的书。这本书应该薄、写得好,并且充满好的例子。K&R是这方面的典范。眼下我几乎可以说,一种语言必须有一本O'Reilly出版的书。这正在成为一门语言是否受黑客重视的检验标准。

也应该有在线文档。事实上,书可以从在线文档起步。但我不认为纸质书已经过时。它们的版式便于使用,而出版商事实上施加的筛选,是一种有用但不完美的过滤器。书店是了解新语言最重要的场所之一。

3 简洁

假设你能提供任何语言需要的三个东西——一个免费的实现、一本书、以及要hack的东西——你如何制造一种黑客会喜欢的语言?

黑客喜欢的一件事是简洁。黑客是懒惰的,懒惰的方式和数学家与现代主义建筑师一样:他们憎恶任何多余的东西。如果说一个准备写程序的黑客会(至少在潜意识里)根据需要键入的字符总数来决定用哪种语言,这话离真相并不远。即使这不是黑客思考的确切方式,语言设计者最好也表现得好像就是这样。

试图用旨在模仿英语的冗长表达去迁就用户,是一个错误。Cobol就因这个缺陷而臭名昭著。黑客会认为,被要求写

add x to y giving z

而不是

z = x+y

是介于侮辱其智力和亵渎上帝之间的事情。

有时有人说,Lisp应该用first和rest来代替car和cdr,因为这会让程序更容易阅读。也许头几个小时是这样。但黑客很快就能学会car表示列表的第一个元素,cdr表示其余部分。用first和rest意味着多打50%的字符。而且两个词长度不同,这意味着调用它们时参数不会对齐,而car和cdr经常在连续的行里这样使用。我发现代码在页面上如何对齐非常重要。当Lisp代码用可变宽度字体排版时,我几乎没法阅读,朋友们说这对其他语言也是如此。
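
car与cdr的含义可以用Python列表粗略示意(这只是概念演示,并非原文给出的代码;函数名沿用Lisp的叫法):

```python
# car 返回列表的第一个元素,cdr 返回其余部分
def car(lst):
    return lst[0]

def cdr(lst):
    return lst[1:]

# first/rest 只是名字更长的同义词:同样的操作要多打约 50% 的字符
first = car
rest = cdr

xs = [1, 2, 3]
# car(xs) 得到 1,cdr(xs) 得到 [2, 3]
```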

简洁是强类型语言吃亏的一个地方。在其他条件相同的情况下,没有人想用一堆声明来开始一个程序。任何可以隐含的东西,都应该隐含。

单个标记也应该简短。Perl和Common Lisp在这个问题上占据相反的极端。Perl程序几乎可以神秘地密集,而内置Common Lisp操作符的名称则可笑地长。Common Lisp的设计者可能期望用户有文本编辑器为他们键入这些长名称。但长名称的成本不仅仅是键入它的成本。还有阅读它的成本,以及它在屏幕上占用空间的成本。

4 可Hack性

对黑客来说,有一件事比简洁更重要:能够做你想做的事。在编程语言的历史上,数量惊人的努力被花在阻止程序员做那些被认为不恰当的事情上。这是一个危险而自以为是的计划。语言设计者怎么可能知道程序员将需要做什么?我认为语言设计者与其把目标用户当成需要保护、以免伤到自己的笨蛋,不如把他们当成将要做设计者从未预料到的事情的天才。笨蛋反正会搬起石头砸自己的脚。你或许能拦住他引用另一个包里的变量,但你拦不住他写出一个设计糟糕的程序去解决错误的问题,还为此耗上无穷无尽的时间。

好的程序员经常想做危险而不体面的事情。我说的不体面,是指绕到语言试图呈现的语义门面背后:例如,拿到某个高级抽象的内部表示。黑客喜欢hack,而hack就意味着钻进事物内部,事后质疑原始设计者的决定。

让自己被质疑吧。当你制造任何工具时,人们都会以你未曾打算的方式使用它,对编程语言这种高度精巧的工具来说尤其如此。许多黑客会想以你从未想象过的方式调整你的语义模型。我说,随他们去吧;在不危及垃圾收集器之类运行时系统的前提下,给程序员尽可能多的内部访问权限。

在Common Lisp中,我经常想要遍历结构的字段,比如清理掉对已删除对象的引用,或者找出未初始化的字段。我知道结构在底层只是向量。然而我无法写出一个可以在任何结构上调用的通用函数。我只能按名称访问字段,因为按照定义,结构就"该"这么用。
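
作者抱怨Common Lisp不允许写一个遍历任意结构字段的通用函数。在反射更开放的语言里,这类函数是可以写的。下面用Python粗略示意这个想法(dataclass及其字段名都是为演示而虚构的):

```python
from dataclasses import dataclass, fields

@dataclass
class Point:            # 虚构的示例结构
    x: int = 0
    y: int = 0
    label: str = None   # 未初始化的字段用 None 表示

def uninitialized_fields(obj):
    """通用函数:找出任意 dataclass 实例中仍为 None 的字段名。"""
    return [f.name for f in fields(obj) if getattr(obj, f.name) is None]

p = Point(x=1, y=2)
# uninitialized_fields(p) 得到 ['label']
```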

黑客也许在一个大程序里只想颠覆一两次事物的预定模型。但能够这样做,差别可就大了。而且这可能不只是解决问题的事,这里面还有一种乐趣。黑客享有外科医生拨弄内脏时那种隐秘的乐趣,青少年挤痘痘时那种隐秘的乐趣。[2] 至少对男孩来说,某些类型的恐怖是迷人的。Maxim杂志每年出版一卷照片,混合了美女写真和可怕的事故。他们了解自己的读者。

从历史上看,Lisp在让黑客为所欲为方面一直很好。Common Lisp的政治正确性是一种异常。早期的Lisp让你可以接触到一切。幸运的是,这种精神的大部分在宏中得以保留。能够对源代码进行任意转换,这是一件多么美妙的事情。

经典的宏是真正的黑客工具——简单、强大而危险。理解它们做什么再容易不过:对宏的参数调用一个函数,它返回什么,什么就被插入到宏调用的位置。卫生宏体现了相反的原则:它们试图让你无法理解它们在做什么,以此来"保护"你。我从没听人用一句话把卫生宏解释清楚。它们是"替程序员决定允许想要什么"这种危险的经典例子。卫生宏旨在保护我免受变量捕获等问题,但变量捕获恰恰是我在某些宏里想要的东西。
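
"对宏的参数调用一个函数,返回什么就把什么插到宏调用的位置",这一点可以用嵌套列表表示的s-表达式来示意。以下是一个假设性的玩具实现,不是任何真实Lisp的代码:

```python
# 用 Python 嵌套列表充当 s-表达式。
# 经典宏 = 一个把源代码(列表)变换成源代码(列表)的普通函数。

def unless_macro(condition, body):
    """把 (unless c b) 展开为 (if (not c) b)。"""
    return ['if', ['not', condition], body]

def macroexpand(form, macros):
    """若表达式头部是宏名,就调用宏函数,用返回值替换整个调用。"""
    if isinstance(form, list) and form and form[0] in macros:
        return macros[form[0]](*form[1:])
    return form

macros = {'unless': unless_macro}
expanded = macroexpand(['unless', 'ready', ['launch']], macros)
# expanded 现在是 ['if', ['not', 'ready'], ['launch']]
```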

真正好的语言应该是既干净又肮脏的:干净地设计,具有少量良好理解和高度正交的操作符核心,但肮脏在它让黑客为所欲为的意义上。C就是这样。早期的Lisp也是。真正的黑客语言总是有一种稍微放荡不羁的特征。

好的编程语言应该具备一些让爱把"软件工程"挂在嘴边的人摇头皱眉的特征。连续体的另一端是像Ada和Pascal这样的语言:循规蹈矩的典范,适合教学,别的用处不大。

5 一次性程序

为了吸引黑客,语言必须擅长编写他们想要编写的程序类型。这意味着,也许令人惊讶的是,它必须擅长编写一次性程序。

一次性程序是你为某个有限任务快速编写的程序:自动化某项系统管理任务的程序,为模拟生成测试数据的程序,或者把数据从一种格式转换成另一种格式的程序。关于一次性程序,令人惊讶的是,就像二战期间许多美国大学修建的"临时"建筑一样,它们通常不会被扔掉。许多演变成了真实的程序,有真实的功能和真实的用户。

我有一种预感,最好的大程序是以这种方式开始的,而不是像胡佛水坝一样从一开始就设计成大的。从头开始建造大东西是可怕的。当人们承担一个太大的项目时,他们会不知所措。项目要么陷入困境,要么结果是贫乏和呆板的:购物中心而不是真正的市中心,巴西利亚而不是罗马,Ada而不是C。

获得大程序的另一种方法是从一次性程序开始并不断改进它。这种方法不那么令人生畏,程序的设计从进化中受益。我想,如果人们观察,会发现这是大多数大程序的开发方式。那些以这种方式演进的程序可能仍然是用它们最初编写的任何语言编写的,因为程序很少被移植,除非出于政治原因。所以,矛盾的是,如果你想制造一种用于大系统的语言,你必须让它对编写一次性程序有好处,因为大系统就是来自那里。

Perl是这个想法的一个显著例子。它不仅是为编写一次性程序而设计的,它本身差不多就是一个一次性程序。Perl最初是一组用来生成报告的实用工具,只是随着人们用它写的一次性程序越来越大,才演变成一种编程语言。直到Perl 5(如果算的话)这门语言才适合编写严肃的程序,然而那时它已经非常流行了。

什么让一种语言适合写一次性程序?首先,它必须随手可得。一次性程序是你打算在一小时内写完的程序,所以这门语言多半必须已经装在你用的计算机上。它不能是用之前还得先安装的东西。它必须就在那里。C就在那里,因为它随操作系统一起提供。Perl就在那里,因为它最初是系统管理员的工具,而你的系统管理员早已装好了它。

可用不仅仅意味着已安装。具有命令行界面的交互式语言比必须单独编译和运行的语言更可用。流行的编程语言应该是交互式的,并且启动快速。

在一次性程序中你想要的另一件事是简洁。简洁总是对黑客有吸引力,在他们期望在一小时内完成的程序中尤其如此。

6 库

当然,简洁的极致是让程序已经为你编写好,你只需要调用它。这引出了我认为将成为编程语言日益重要特征的东西:库函数。Perl获胜是因为它有用于操作字符串的大型库。这类库函数对一次性程序特别重要,这些程序通常最初是为转换或提取数据而编写的。许多Perl程序可能开始时只是几个粘在一起的库调用。
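
"许多Perl程序开始时只是几个粘在一起的库调用":一次性的数据提取程序大多就是这个样子。下面是一个Python的示意,文本和正则模式都是为演示虚构的:

```python
import re

# 一次性程序:从一段文本里抽取所有邮箱地址,
# 核心逻辑几乎只是一个库调用
text = "联系 alice@example.com 或 bob@example.org 获取详情"
emails = re.findall(r"[\w.+-]+@[\w-]+\.[\w.]+", text)
# emails 得到 ['alice@example.com', 'bob@example.org']
```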

我认为未来五十年编程语言中发生的许多进步将与库函数有关。我认为未来的编程语言将拥有与核心语言一样精心设计的库。编程语言设计将不是关于让你的语言是强类型还是弱类型,或面向对象,或函数式,或其他什么,而是关于如何设计伟大的库。那些喜欢思考如何设计类型系统的语言设计师可能会对此不寒而栗。这几乎就像编写应用程序!太糟糕了。语言是为程序员的,而库是程序员需要的。

设计好的库很困难。这不仅仅是编写大量代码的问题。一旦库变得太大,有时找到你需要的功能比自己编写代码花费的时间更长。库需要使用少量正交操作符来设计,就像核心语言一样。程序员应该能够猜测哪个库调用将做他需要的事情。

库是Common Lisp不足的一个地方。只有用于操作字符串的基本库,几乎没有用于与操作系统对话的库。由于历史原因,Common Lisp试图假装操作系统不存在。因为你无法与操作系统对话,你不太可能仅使用Common Lisp中的内置操作符编写严肃程序。你还必须使用一些特定于实现的技巧,实际上这些技巧往往不能给你想要的一切。如果Common Lisp有强大的字符串库和好的操作系统支持,黑客会对Lisp有更高的评价。

7 语法

具有Lisp语法,或更精确地说,缺乏语法的语言能变得流行吗?我不知道这个问题的答案。我确实认为语法不是Lisp目前不流行的主要原因。Common Lisp有比不熟悉的语法更糟糕的问题。我认识几个熟悉前缀语法的程序员,但他们默认使用Perl,因为它有强大的字符串库并且可以与操作系统对话。

前缀记号有两个可能的问题:它对程序员来说不熟悉,而且它不够密集。Lisp世界中的传统观点是第一个问题是真正的问题。我不太确定。是的,前缀记号让普通程序员恐慌。但我不认为普通程序员的意见重要。语言变得流行或不流行是基于专家黑客对它们的看法,我认为专家黑客可能能够处理前缀记号。Perl语法可能非常难以理解,但这并没有阻碍Perl的流行。如果有的话,它可能帮助培育了Perl崇拜。

更严重的问题是前缀记号不够紧凑。对专家黑客来说,这确实是个问题。没有人想在可以写a[x,y]的时候写(aref a x y)。

在这个特定情况下,有办法巧妙地绕过问题。如果我们把数据结构看作下标上的函数,就可以改写成(a x y),这甚至比Perl的形式更短。类似的技巧或许还能缩短其他类型的表达式。
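
"把数据结构看作下标上的函数"在支持调用协议的语言里可以直接模拟。下面用Python的`__call__`粗略示意(a x y)这种写法背后的想法(类名是虚构的):

```python
class Array2D:
    """把二维数组包装成"下标的函数":a(x, y) 等价于取第 x 行第 y 列。"""
    def __init__(self, rows):
        self.rows = rows

    def __call__(self, x, y):
        return self.rows[x][y]

a = Array2D([[1, 2], [3, 4]])
# a(1, 0) 相当于设想中的 (a 1 0),比 (aref a 1 0) 更短
```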

我们可以通过使缩进有意义来摆脱(或使可选)很多括号。这就是程序员无论如何阅读代码的方式:当缩进说一件事而分隔符说另一件事时,我们遵循缩进。将缩进视为有意义将消除这种常见的错误来源,同时也使程序更短。

有时中缀语法更容易阅读。对于数学表达式尤其如此。我整个编程生涯都在使用Lisp,但我仍然不觉得前缀数学表达式自然。然而,能够接受任意数量参数的操作符是方便的,尤其是在生成代码时。所以如果我们确实有中缀语法,它可能应该作为某种读取宏来实现。

我不认为我们应该教条地反对在Lisp中引入语法,只要它以一种被充分理解的方式转换为底层的s-表达式。Lisp中本来就有相当多的语法。只要没有人被强迫使用,引入更多语法并不一定是坏事。在Common Lisp中,一些分隔符被语言保留了下来,这表明至少一部分设计者打算将来引入更多语法。

Common Lisp中最不像Lisp的语法片段之一出现在格式字符串中;format本身就是一种语言,而那种语言不是Lisp。如果有在Lisp中引入更多语法的计划,格式说明符可能可以包括在内。如果宏能够像生成任何其他类型的代码一样生成格式说明符,那将是一件好事。

一位著名的Lisp黑客告诉我,他的CLTL副本打开在format部分。我的也是。这可能表明有改进的空间。这也可能意味着程序做大量的I/O。

8 效率

众所周知,好的语言应该生成快速的代码。但在实践中,我不认为快速的代码主要来自语言设计中的事情。正如Knuth很久以前指出的,速度只在某些关键瓶颈中重要。正如许多程序员从那时起观察到的,人们经常误判这些瓶颈在哪里。

所以在实践中,获得快速代码的方法是拥有非常好的分析器,而不是(比如说)把语言做成强类型。你不需要知道程序中每个调用的每个参数的类型。你需要的是能为瓶颈处的参数声明类型。更重要的是,你需要能找出瓶颈在哪里。
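
"拥有非常好的分析器"在实践中大致是这样用的。下面用Python自带的cProfile示意如何定位瓶颈(被测函数是虚构的热点):

```python
import cProfile
import io
import pstats

def bottleneck():
    # 虚构的热点:故意做一些开销较大的计算
    return sum(i * i for i in range(100_000))

profiler = cProfile.Profile()
profiler.enable()
bottleneck()
profiler.disable()

# 按累计耗时排序,把报告打印到字符串;真实场景中直接看输出即可
buf = io.StringIO()
pstats.Stats(profiler, stream=buf).sort_stats("cumulative").print_stats()
report = buf.getvalue()
# report 中会列出 bottleneck 的调用次数与耗时
```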

人们对Lisp的一个抱怨是很难判断什么是昂贵的。这可能是真的。如果你想要一个非常抽象的语言,这也可能是不可避免的。无论如何,我认为好的分析器会大大有助于解决这个问题:你会很快学到什么是昂贵的。

这里的问题部分是社会性的。语言设计师喜欢编写快速的编译器。这就是他们衡量自己技能的方式。他们认为分析器最多只是一个附加组件。但在实践中,好的分析器可能比生成快速代码的编译器更能提高用该语言编写的实际程序的速度。在这里,语言设计师再次与他们的用户有些脱节。他们在解决稍微错误的问题上做得非常好。

拥有主动分析器可能是个好主意——将性能数据推送给程序员,而不是等待他来询问。例如,当程序员编辑源代码时,编辑器可以用红色显示瓶颈。另一种方法是以某种方式表示正在运行的程序中发生的事情。这在基于服务器的应用程序中将是一个特别大的胜利,在那里你有很多正在运行的程序可以查看。主动分析器可以图形化地显示程序运行时内存中发生的事情,甚至发出告诉正在发生什么的声音。

声音是对问题的很好提示。我曾经工作过的一个地方,我们有一个大的仪表板显示我们的Web服务器正在发生什么。指针由小型伺服电机移动,它们在转动时会发出轻微的噪音。我从我的办公桌看不到仪表板,但我发现我能够立即通过声音判断服务器何时有问题。

甚至可能可以编写一个能自动检测低效算法的分析器。如果某些内存访问模式被证明是坏算法的可靠迹象,我不会感到惊讶。如果有一个小家伙在计算机内部执行我们的程序,他可能会像联邦政府雇员一样对他的工作有同样长而悲伤的故事。我经常有一种感觉,我正在发送处理器进行许多徒劳的追逐,但我从未有过好的方法来看看它在做什么。

许多Lisp现在编译成字节码,然后由解释器执行。这通常是为了使实现更容易移植,但它可能是一个有用的语言特性。将字节码作为语言的官方部分可能是个好主意,并允许程序员在瓶颈中使用内联字节码。那么这样的优化也将是可移植的。
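
Python同样先编译成字节码再由解释器执行,并通过标准库的dis模块把字节码暴露给程序员。这接近作者所说的"把字节码作为语言的官方部分",只是Python并不支持内联字节码,下面仅示意查看:

```python
import dis

def add(x, y):
    return x + y

# 列出 add 编译后的字节码指令名
opnames = [ins.opname for ins in dis.get_instructions(add)]
# 其中会包含一条 RETURN 开头的返回指令
```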

最终用户感知到的速度,其性质可能正在变化。随着基于服务器的应用程序兴起,越来越多的程序可能会是I/O密集型的。这时让I/O变快就是值得的。语言可以用简单直接的手段帮忙,比如简单、快速的格式化输出函数,也可以用深层的结构性变化帮忙,比如缓存和持久对象。

用户关心的是响应时间。但另一种效率将变得越来越重要:每个处理器能支撑的并发用户数。不久的将来要写的许多有趣应用程序都会是基于服务器的,对任何托管这类应用程序的人来说,每台服务器的用户数都是关键问题:在提供基于服务器应用程序的企业的资本成本中,它是除数。

多年来,在大多数最终用户应用程序中效率并不重要。开发人员能够假设每个用户将在他们的办公桌上拥有一个越来越强大的处理器。根据帕金森定律,软件已经扩展到使用可用资源。随着基于服务器的应用程序,这将改变。在那个世界里,硬件和软件将一起提供。对于提供基于服务器应用程序的公司来说,他们每台服务器能够支持多少用户将对底线产生非常大的影响。

在某些应用程序中,处理器将是限制因素,执行速度将是最重要的优化事项。但通常内存将是限制;并发用户数量将取决于每个用户数据所需的内存量。语言在这里也可以帮助。对线程的良好支持将使所有用户能够共享单个堆。拥有持久对象和/或语言级别的惰性加载支持也可能有帮助。
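
"对线程的良好支持将使所有用户能够共享单个堆",下面用Python线程粗略示意多个"用户会话"共享同一份进程内数据(用户名与数据结构均为虚构):

```python
import threading

# 所有用户会话共享堆上的同一个字典,而不是每个用户各占一个进程
sessions = {}
lock = threading.Lock()

def handle_user(name):
    # 共享可变状态需要加锁
    with lock:
        hits = sessions.get(name, {}).get("hits", 0)
        sessions[name] = {"hits": hits + 1}

threads = [threading.Thread(target=handle_user, args=(f"user{i}",))
           for i in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()
# sessions 现在包含 5 个用户的会话数据,全部位于同一个堆中
```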

9 时间

流行语言需要的最后一个成分是时间。没有人想用一种可能消失的语言编写程序,正如许多编程语言那样。所以大多数黑客倾向于等待一种语言存在几年后才考虑使用它。

奇妙新事物的发明者经常惊讶地发现这一点,但你需要时间让人们理解任何信息。我的朋友很少在第一次有人要求他做事时就做。他知道人们有时要求的东西他们最终并不想要。为了避免浪费时间,他等到第三次或第四次有人要求他做事时;到那时,要求他的人可能相当恼火,但至少他们可能真的想要他们所要求的东西。

大多数人对他们听到的新事物学会了做类似的过滤。直到他们听到某事十次后他们才开始注意。他们是完全合理的:大多数热门的新事物最终证明是浪费时间,最终会消失。通过延迟学习VRML,我完全避免了学习它。

所以任何发明新东西的人都必须准备好年复一年地重复自己的信息,人们才会开始理解。我们写出了(据我所知)第一个基于Web服务器的应用程序,我们花了好几年才让人们明白它不需要下载。不是因为他们蠢,他们只是没把我们的话听进去。

好消息是,简单的重复解决了问题。你所要做的就是继续讲述你的故事,最终人们会开始听到。重要的不是人们注意到你在那里,而是他们注意到你仍然在那里。

通常需要一段时间才能获得动力,这其实也是好事。大多数技术在刚推出后都会演化很多,编程语言尤其如此。对一项新技术来说,没有什么比头几年只被少数早期采用者使用更好的了。早期采用者见多识广、要求苛刻,会很快清除掉你技术中残存的缺陷。而且当你只有少数用户时,你可以和他们所有人保持密切联系。只要你在改进系统,早期采用者甚至会原谅你因此造成的一些破坏。

有两种引入新技术的方法:有机增长方法和大爆炸方法。有机增长方法以经典的、资金不足的车库创业公司为代表。几个人,在默默无闻中工作,开发一些新技术。他们没有营销就推出它,最初只有少数(狂热的)用户。他们继续改进技术,同时他们的用户群通过口碑增长。在他们意识到之前,他们已经变大了。

另一种方法,大爆炸方法,以风险投资支持的、大量营销的创业公司为代表。他们匆忙开发产品,以极大的宣传推出,并立即(他们希望)拥有大量用户群。

通常,车库里的人羡慕搞大爆炸的人。大爆炸的人圆滑、自信,受风险投资家敬重。他们买得起最好的一切,而围绕发布的宣传攻势还有让他们成为小名人的副作用。有机增长的人坐在车库里,觉得自己又穷又没人爱。然而我认为他们常常是在错误地自怜。有机增长似乎比大爆炸方法造就更好的技术和更富有的创始人。看看今天占主导地位的技术,你会发现其中大多数是有机增长出来的。

这种模式不仅适用于公司。你在赞助研究中也看到它。Multics和Common Lisp是大爆炸项目,Unix和MacLisp是有机增长项目。

10 重新设计

"最好的写作是重写,"E.B.怀特这样写道。每个好作家都知道这一点,它对软件同样成立。设计中最重要的部分就是重新设计。编程语言尤其如此,它们被重新设计得还不够。

要编写好的软件,你必须同时在头脑中保持两个相反的想法。你需要年轻黑客对自己能力的天真信念,同时需要老手的怀疑。你必须能够用一半的大脑思考这能有多难?而另一半思考这将永远不会成功。

诀窍是认识到这里没有真正的矛盾。你想要对两件不同的事情乐观和怀疑。你必须对解决问题的可能性乐观,但对你到目前为止的任何解决方案的价值持怀疑态度。

做好工作的人经常认为他们正在做的事情不好。其他人看到他们所做的,充满惊奇,但创造者充满担忧。这种模式并非巧合:正是担忧使工作变好。

如果你能够保持希望和担忧平衡,它们将像你的双腿驱动自行车前进一样驱动项目前进。在双循环创新引擎的第一阶段,你受到你能够解决问题的信心的鼓舞,疯狂地工作于某个问题。在第二阶段,你在清晨的冷光中查看你所做的,非常清楚地看到它的所有缺陷。但只要你的批判精神不超过你的希望,你就能够看着你承认不完整的系统,并思考,剩下的路能有多难?从而继续循环。

保持两种力量平衡是棘手的。在年轻黑客中,乐观主义占主导。他们生产出一些东西,相信它很棒,从不改进它。在老黑客中,怀疑主义占主导,他们甚至不敢承担雄心勃勃的项目。

你能做任何事情来保持重新设计循环都是好的。散文可以一遍又一遍地重写,直到你满意为止。但作为规则,软件没有足够地重新设计。散文有读者,但软件有用户。如果作家重写一篇文章,阅读旧版本的人不太可能抱怨他们的思想被某些新引入的不兼容性破坏。

用户是一把双刃剑。他们能帮你改进你的语言,也能阻止你改进它。所以要谨慎挑选你的用户,并慢慢扩大他们的数量。拥有用户就像优化:明智的做法是推迟它。另外,作为一般规则,你在任何时刻能改动的东西都比你以为的要多。引入变化就像撕创可贴:几乎在你感觉到疼的那一刻,疼痛就已经成了记忆。

每个人都知道,由委员会来设计语言不是个好主意。委员会会产出糟糕的设计。但我认为委员会最大的危险在于它们妨碍重新设计。引入变化太费事了,以至于没人愿意去张罗。委员会定下的东西往往就那样定住了,哪怕大多数成员并不喜欢它。

即使是两个人的委员会也会妨碍重新设计。这种情况尤其发生在由两个人分别编写的两段软件之间的接口上。要改变接口,双方必须同意同时改。于是接口往往根本不变,而这是个问题,因为接口往往是系统中最权宜拼凑的部分。

这里的一个解决办法,也许是把系统设计成接口是水平的而不是垂直的,也就是让模块总是垂直堆叠的抽象层。这样接口就会倾向于归其中一层所有。两层中较低的那一层,要么是较高层用来编写的语言,这种情况下接口归较低层所有;要么是从属的一方,这种情况下接口可以由较高层支配。

11 Lisp

所有这一切意味着新的Lisp有希望。任何给黑客想要的东西的语言都有希望,包括Lisp。我认为我们可能犯了一个错误,认为黑客被Lisp的奇怪所吓倒。这种令人安慰的幻想可能阻止我们看到Lisp的真正问题,或者至少是Common Lisp的真正问题,即它在做黑客想要做的事情方面很糟糕。黑客的语言需要强大的库和要hack的东西。Common Lisp两者都没有。黑客的语言是简洁和可hack的。Common Lisp不是。

好消息是,不是Lisp糟糕,而是Common Lisp糟糕。如果我们能够开发一种真正是黑客语言的新Lisp,我认为黑客会使用它。他们会使用任何能完成工作的语言。我们所要做的就是确保这个新Lisp在某些重要工作上比其他语言做得更好。

历史给了我们一些鼓励。随着时间推移,新的编程语言从Lisp那里拿走的特征越来越多。再抄下去,你做出的语言离Lisp也就不剩多少距离了。最新的热门语言Python就是一种稀释版的Lisp:有中缀语法,但没有宏。新的Lisp会是这条演进路线上顺理成章的一步。

我有时认为称它为Python的改进版本将是一个好的营销技巧。这听起来比Lisp更时髦。对许多人来说,Lisp是一种有很多括号的慢速AI语言。Fritz Kunze的官方传记小心翼翼地避免提及L-word。但我猜测我们不应该害怕称呼新的Lisp为Lisp。Lisp在最好的黑客中仍然有很多潜在的尊重——那些上过6.001并理解它的人,例如。而那些是你需要赢得的用户。

在《如何成为黑客》中,Eric Raymond将Lisp描述为类似拉丁语或希腊语的东西——你应该作为智力练习学习的语言,即使你不会实际使用它:

Lisp值得学习,因为当你最终理解它时你将获得的深刻启蒙体验;即使你实际上不会大量使用Lisp本身,这种体验也将使你在余下的日子里成为更好的程序员。

如果我不了解Lisp,阅读这个会让我提出问题。一种能让我成为更好程序员的语言,如果它意味着任何东西的话,意味着一种对编程更好的语言。而这实际上是Eric所说的话的含义。

只要那个想法仍然存在,我认为黑客会对新的Lisp足够接受,即使它被称为Lisp。但这个Lisp必须是黑客的语言,像1970年代的经典Lisp。它必须是简洁、简单和可hack的。它必须拥有强大的库来做黑客现在想要做的事情。

在库的问题上,我认为有空间在Perl和Python自己的游戏中击败它们。未来几年需要编写的许多新应用程序将是基于服务器的应用程序。新的Lisp没有理由不应该有与Perl一样好的字符串库,如果这个新Lisp也有用于基于服务器应用程序的强大库,它可能会非常流行。真正的黑客不会对能让他们用几个库调用解决困难问题的新工具嗤之以鼻。记住,黑客是懒惰的。

拥有对基于服务器应用程序的核心语言支持可能是一个更大的胜利。例如,对具有多用户程序的明确支持,或在类型标签级别的数据所有权。

基于服务器的应用程序也回答了另一个问题:这个新Lisp将用来hack什么。让Lisp成为Unix的更好的脚本语言不会有什么坏处。(想让它更糟都难。)但我认为在另一些领域,现有语言会更容易被击败。我想更好的做法也许是效仿Tcl的模式:把Lisp连同一个支持基于服务器应用程序的完整系统一起提供。Lisp天然适合基于服务器的应用程序。词法闭包提供了一种办法,在UI只是一系列网页的情况下获得子程序的效果。S-表达式能很好地映射到HTML,而宏擅长生成HTML。需要有更好的工具来编写基于服务器的应用程序,也需要一个新的Lisp,而两者会配合得很好。
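
"S-表达式很好地映射到HTML"这一点,可以用一个小函数把嵌套列表渲染成标签来示意(以下是假设性的玩具实现,未处理属性和转义):

```python
def render(node):
    """把 ['tag', 子节点…] 形式的嵌套列表渲染成 HTML 字符串。"""
    if isinstance(node, str):
        return node                       # 叶子节点就是文本
    tag, *children = node                 # 表头是标签名,其余是子节点
    inner = "".join(render(c) for c in children)
    return f"<{tag}>{inner}</{tag}>"

page = render(['html', ['body', ['p', 'hello']]])
# page == "<html><body><p>hello</p></body></html>"
```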

12 梦想语言

作为总结,让我们尝试描述黑客的梦想语言。梦想语言是美丽、干净和简洁的。它有一个快速启动的交互式顶层。你可以用非常少的代码编写解决常见问题的程序。在你编写的任何程序中,几乎所有代码都是特定于你的应用程序的。其他一切都已为你完成。

语言的语法简洁得过分。你永远不必键入不必要的字符,甚至不必经常使用shift键。

使用大的抽象,你可以非常快速地编写程序的第一个版本。后来,当你想要优化时,有一个非常好的分析器告诉你在哪里集中注意力。你可以使内循环变得极快,如果需要甚至编写内联字节码。

有很多好的例子可以学习,语言足够直观,你可以从例子中在几分钟内学会如何使用它。你不需要经常看手册。手册是薄的,很少有警告和限定。

语言有一个小的核心,以及强大的、高度正交的库,这些库与核心语言一样精心设计。所有库都很好地协同工作;语言中的一切都像精密相机中的零件一样配合。没有东西被弃用,或为兼容性而保留。所有库的源代码都容易获得。与操作系统和其他语言编写的应用程序对话很容易。

语言是分层构建的。更高级别的抽象以非常透明的方式从较低级别的抽象构建,如果你想要,你可以掌握它们。

没有任何东西对你隐藏,除非绝对必须。语言提供抽象只是为了节省你的工作,而不是作为告诉你做什么的方式。事实上,语言鼓励你成为其设计的平等参与者。你可以改变它的一切,甚至它的语法,而你编写的任何东西都具有与预定义内容尽可能相同的地位。

注释

[1] 与现代想法非常接近的宏由Timothy Hart在1964年提出,即在Lisp 1.5发布两年后。最初缺少的是避免变量捕获和多次求值的方法;Hart的例子受两者影响。

[2] 在When the Air Hits Your Brain中,神经外科医生Frank Vertosick讲述了一段对话,他的首席住院医师Gary谈论外科医生和内科医生(“跳蚤”)之间的区别:

Gary和我点了一个大披萨,找了个空卡座。Gary点燃一支香烟。"看看那些该死的跳蚤,叽叽喳喳地谈论他们一辈子只会见到一次的疾病。这就是跳蚤的毛病,他们只喜欢稀奇古怪的病例,讨厌那些家常便饭的病例。这就是我们和该死的跳蚤的区别。你看,我们喜欢大而多汁的腰椎间盘突出,而他们讨厌高血压……"

很难把腰椎间盘突出看作是多汁的(除了字面意义上)。不过我想我明白他们的意思。我自己也常有一个多汁的bug要追。不写程序的人大概很难想象bug里能有什么乐趣。当然,一切都正常工作会更好。从某种意义上说,是的。然而在追查某些类型的bug时,无疑有一种冷峻的满足感。



Being Popular

May 2001

(This article was written as a kind of business plan for a new language. So it is missing (because it takes for granted) the most important feature of a good programming language: very powerful abstractions.)

A friend of mine once told an eminent operating systems expert that he wanted to design a really good programming language. The expert told him that it would be a waste of time, that programming languages don’t become popular or unpopular based on their merits, and so no matter how good his language was, no one would use it. At least, that was what had happened to the language he had designed.

What does make a language popular? Do popular languages deserve their popularity? Is it worth trying to define a good programming language? How would you do it?

I think the answers to these questions can be found by looking at hackers, and learning what they want. Programming languages are for hackers, and a programming language is good as a programming language (rather than, say, an exercise in denotational semantics or compiler design) if and only if hackers like it.

1 The Mechanics of Popularity

It’s true, certainly, that most people don’t choose programming languages simply based on their merits. Most programmers are told what language to use by someone else. And yet I think the effect of such external factors on the popularity of programming languages is not as great as it’s sometimes thought to be. I think a bigger problem is that a hacker’s idea of a good programming language is not the same as most language designers’.

Between the two, the hacker’s opinion is the one that matters. Programming languages are not theorems. They’re tools, designed for people, and they have to be designed to suit human strengths and weaknesses as much as shoes have to be designed for human feet. If a shoe pinches when you put it on, it’s a bad shoe, however elegant it may be as a piece of sculpture.

It may be that the majority of programmers can’t tell a good language from a bad one. But that’s no different with any other tool. It doesn’t mean that it’s a waste of time to try designing a good language. Expert hackers can tell a good language when they see one, and they’ll use it. Expert hackers are a tiny minority, admittedly, but that tiny minority write all the good software, and their influence is such that the rest of the programmers will tend to use whatever language they use. Often, indeed, it is not merely influence but command: often the expert hackers are the very people who, as their bosses or faculty advisors, tell the other programmers what language to use.

The opinion of expert hackers is not the only force that determines the relative popularity of programming languages — legacy software (Cobol) and hype (Ada, Java) also play a role — but I think it is the most powerful force over the long term. Given an initial critical mass and enough time, a programming language probably becomes about as popular as it deserves to be. And popularity further separates good languages from bad ones, because feedback from real live users always leads to improvements. Look at how much any popular language has changed during its life. Perl and Fortran are extreme cases, but even Lisp has changed a lot. Lisp 1.5 didn’t have macros, for example; these evolved later, after hackers at MIT had spent a couple years using Lisp to write real programs. [1]

So whether or not a language has to be good to be popular, I think a language has to be popular to be good. And it has to stay popular to stay good. The state of the art in programming languages doesn’t stand still. And yet the Lisps we have today are still pretty much what they had at MIT in the mid-1980s, because that’s the last time Lisp had a sufficiently large and demanding user base.

Of course, hackers have to know about a language before they can use it. How are they to hear? From other hackers. But there has to be some initial group of hackers using the language for others even to hear about it. I wonder how large this group has to be; how many users make a critical mass? Off the top of my head, I’d say twenty. If a language had twenty separate users, meaning twenty users who decided on their own to use it, I’d consider it to be real.

Getting there can’t be easy. I would not be surprised if it is harder to get from zero to twenty than from twenty to a thousand. The best way to get those initial twenty users is probably to use a trojan horse: to give people an application they want, which happens to be written in the new language.

2 External Factors

Let’s start by acknowledging one external factor that does affect the popularity of a programming language. To become popular, a programming language has to be the scripting language of a popular system. Fortran and Cobol were the scripting languages of early IBM mainframes. C was the scripting language of Unix, and so, later, was Perl. Tcl is the scripting language of Tk. Java and Javascript are intended to be the scripting languages of web browsers.

Lisp is not a massively popular language because it is not the scripting language of a massively popular system. What popularity it retains dates back to the 1960s and 1970s, when it was the scripting language of MIT. A lot of the great programmers of the day were associated with MIT at some point. And in the early 1970s, before C, MIT’s dialect of Lisp, called MacLisp, was one of the only programming languages a serious hacker would want to use.

Today Lisp is the scripting language of two moderately popular systems, Emacs and Autocad, and for that reason I suspect that most of the Lisp programming done today is done in Emacs Lisp or AutoLisp.

Programming languages don’t exist in isolation. To hack is a transitive verb — hackers are usually hacking something — and in practice languages are judged relative to whatever they’re used to hack. So if you want to design a popular language, you either have to supply more than a language, or you have to design your language to replace the scripting language of some existing system.

Common Lisp is unpopular partly because it’s an orphan. It did originally come with a system to hack: the Lisp Machine. But Lisp Machines (along with parallel computers) were steamrollered by the increasing power of general purpose processors in the 1980s. Common Lisp might have remained popular if it had been a good scripting language for Unix. It is, alas, an atrociously bad one.

One way to describe this situation is to say that a language isn’t judged on its own merits. Another view is that a programming language really isn’t a programming language unless it’s also the scripting language of something. This only seems unfair if it comes as a surprise. I think it’s no more unfair than expecting a programming language to have, say, an implementation. It’s just part of what a programming language is.

A programming language does need a good implementation, of course, and this must be free. Companies will pay for software, but individual hackers won’t, and it’s the hackers you need to attract.

A language also needs to have a book about it. The book should be thin, well-written, and full of good examples. K&R is the ideal here. At the moment I’d almost say that a language has to have a book published by O’Reilly. That’s becoming the test of mattering to hackers.

There should be online documentation as well. In fact, the book can start as online documentation. But I don’t think that physical books are outmoded yet. Their format is convenient, and the de facto censorship imposed by publishers is a useful if imperfect filter. Bookstores are one of the most important places for learning about new languages.

3 Brevity

Given that you can supply the three things any language needs — a free implementation, a book, and something to hack — how do you make a language that hackers will like?

One thing hackers like is brevity. Hackers are lazy, in the same way that mathematicians and modernist architects are lazy: they hate anything extraneous. It would not be far from the truth to say that a hacker about to write a program decides what language to use, at least subconsciously, based on the total number of characters he’ll have to type. If this isn’t precisely how hackers think, a language designer would do well to act as if it were.

It is a mistake to try to baby the user with long-winded expressions that are meant to resemble English. Cobol is notorious for this flaw. A hacker would consider being asked to write

add x to y giving z

instead of

z = x+y

as something between an insult to his intelligence and a sin against God.

It has sometimes been said that Lisp should use first and rest instead of car and cdr, because it would make programs easier to read. Maybe for the first couple hours. But a hacker can learn quickly enough that car means the first element of a list and cdr means the rest. Using first and rest means 50% more typing. And they are also different lengths, meaning that the arguments won’t line up when they’re called, as car and cdr often are, in successive lines. I’ve found that it matters a lot how code lines up on the page. I can barely read Lisp code when it is set in a variable-width font, and friends say this is true for other languages too.

Brevity is one place where strongly typed languages lose. All other things being equal, no one wants to begin a program with a bunch of declarations. Anything that can be implicit, should be.

The individual tokens should be short as well. Perl and Common Lisp occupy opposite poles on this question. Perl programs can be almost cryptically dense, while the names of built-in Common Lisp operators are comically long. The designers of Common Lisp probably expected users to have text editors that would type these long names for them. But the cost of a long name is not just the cost of typing it. There is also the cost of reading it, and the cost of the space it takes up on your screen.

4 Hackability

There is one thing more important than brevity to a hacker: being able to do what you want. In the history of programming languages a surprising amount of effort has gone into preventing programmers from doing things considered to be improper. This is a dangerously presumptuous plan. How can the language designer know what the programmer is going to need to do? I think language designers would do better to consider their target user to be a genius who will need to do things they never anticipated, rather than a bumbler who needs to be protected from himself. The bumbler will shoot himself in the foot anyway. You may save him from referring to variables in another package, but you can’t save him from writing a badly designed program to solve the wrong problem, and taking forever to do it.

Good programmers often want to do dangerous and unsavory things. By unsavory I mean things that go behind whatever semantic facade the language is trying to present: getting hold of the internal representation of some high-level abstraction, for example. Hackers like to hack, and hacking means getting inside things and second guessing the original designer.

Let yourself be second guessed. When you make any tool, people use it in ways you didn’t intend, and this is especially true of a highly articulated tool like a programming language. Many a hacker will want to tweak your semantic model in a way that you never imagined. I say, let them; give the programmer access to as much internal stuff as you can without endangering runtime systems like the garbage collector.

In Common Lisp I have often wanted to iterate through the fields of a struct — to comb out references to a deleted object, for example, or find fields that are uninitialized. I know the structs are just vectors underneath. And yet I can’t write a general purpose function that I can call on any struct. I can only access the fields by name, because that’s what a struct is supposed to mean.

A hacker may only want to subvert the intended model of things once or twice in a big program. But what a difference it makes to be able to. And it may be more than a question of just solving a problem. There is a kind of pleasure here too. Hackers share the surgeon’s secret pleasure in poking about in gross innards, the teenager’s secret pleasure in popping zits. [2] For boys, at least, certain kinds of horrors are fascinating. Maxim magazine publishes an annual volume of photographs, containing a mix of pin-ups and grisly accidents. They know their audience.

Historically, Lisp has been good at letting hackers have their way. The political correctness of Common Lisp is an aberration. Early Lisps let you get your hands on everything. A good deal of that spirit is, fortunately, preserved in macros. What a wonderful thing, to be able to make arbitrary transformations on the source code.

Classic macros are a real hacker’s tool — simple, powerful, and dangerous. It’s so easy to understand what they do: you call a function on the macro’s arguments, and whatever it returns gets inserted in place of the macro call. Hygienic macros embody the opposite principle. They try to protect you from understanding what they’re doing. I have never heard hygienic macros explained in one sentence. And they are a classic example of the dangers of deciding what programmers are allowed to want. Hygienic macros are intended to protect me from variable capture, among other things, but variable capture is exactly what I want in some macros.

A really good language should be both clean and dirty: cleanly designed, with a small core of well understood and highly orthogonal operators, but dirty in the sense that it lets hackers have their way with it. C is like this. So were the early Lisps. A real hacker’s language will always have a slightly raffish character.

A good programming language should have features that make the kind of people who use the phrase “software engineering” shake their heads disapprovingly. At the other end of the continuum are languages like Ada and Pascal, models of propriety that are good for teaching and not much else.

5 Throwaway Programs

To be attractive to hackers, a language must be good for writing the kinds of programs they want to write. And that means, perhaps surprisingly, that it has to be good for writing throwaway programs.

A throwaway program is a program you write quickly for some limited task: a program to automate some system administration task, or generate test data for a simulation, or convert data from one format to another. The surprising thing about throwaway programs is that, like the “temporary” buildings built at so many American universities during World War II, they often don’t get thrown away. Many evolve into real programs, with real features and real users.

I have a hunch that the best big programs begin life this way, rather than being designed big from the start, like the Hoover Dam. It’s terrifying to build something big from scratch. When people take on a project that’s too big, they become overwhelmed. The project either gets bogged down, or the result is sterile and wooden: a shopping mall rather than a real downtown, Brasilia rather than Rome, Ada rather than C.

Another way to get a big program is to start with a throwaway program and keep improving it. This approach is less daunting, and the design of the program benefits from evolution. I think, if one looked, that this would turn out to be the way most big programs were developed. And those that did evolve this way are probably still written in whatever language they were first written in, because it’s rare for a program to be ported, except for political reasons. And so, paradoxically, if you want to make a language that is used for big systems, you have to make it good for writing throwaway programs, because that’s where big systems come from.

Perl is a striking example of this idea. It was not only designed for writing throwaway programs, but was pretty much a throwaway program itself. Perl began life as a collection of utilities for generating reports, and only evolved into a programming language as the throwaway programs people wrote in it grew larger. It was not until Perl 5 (if then) that the language was suitable for writing serious programs, and yet it was already massively popular.

What makes a language good for throwaway programs? To start with, it must be readily available. A throwaway program is something that you expect to write in an hour. So the language probably must already be installed on the computer you’re using. It can’t be something you have to install before you use it. It has to be there. C was there because it came with the operating system. Perl was there because it was originally a tool for system administrators, and yours had already installed it.

Being available means more than being installed, though. An interactive language, with a command-line interface, is more available than one that you have to compile and run separately. A popular programming language should be interactive, and start up fast.

Another thing you want in a throwaway program is brevity. Brevity is always attractive to hackers, and never more so than in a program they expect to turn out in an hour.

6 Libraries

Of course the ultimate in brevity is to have the program already written for you, and merely to call it. And this brings us to what I think will be an increasingly important feature of programming languages: library functions. Perl wins because it has large libraries for manipulating strings. This class of library functions are especially important for throwaway programs, which are often originally written for converting or extracting data. Many Perl programs probably begin as just a couple library calls stuck together.

I think a lot of the advances that happen in programming languages in the next fifty years will have to do with library functions. I think future programming languages will have libraries that are as carefully designed as the core language. Programming language design will not be about whether to make your language strongly or weakly typed, or object oriented, or functional, or whatever, but about how to design great libraries. The kind of language designers who like to think about how to design type systems may shudder at this. It’s almost like writing applications! Too bad. Languages are for programmers, and libraries are what programmers need.

It’s hard to design good libraries. It’s not simply a matter of writing a lot of code. Once the libraries get too big, it can sometimes take longer to find the function you need than to write the code yourself. Libraries need to be designed using a small set of orthogonal operators, just like the core language. It ought to be possible for the programmer to guess what library call will do what he needs.

Libraries are one place Common Lisp falls short. There are only rudimentary libraries for manipulating strings, and almost none for talking to the operating system. For historical reasons, Common Lisp tries to pretend that the OS doesn’t exist. And because you can’t talk to the OS, you’re unlikely to be able to write a serious program using only the built-in operators in Common Lisp. You have to use some implementation-specific hacks as well, and in practice these tend not to give you everything you want. Hackers would think a lot more highly of Lisp if Common Lisp had powerful string libraries and good OS support.

7 Syntax

Could a language with Lisp’s syntax, or more precisely, lack of syntax, ever become popular? I don’t know the answer to this question. I do think that syntax is not the main reason Lisp isn’t currently popular. Common Lisp has worse problems than unfamiliar syntax. I know several programmers who are comfortable with prefix syntax and yet use Perl by default, because it has powerful string libraries and can talk to the OS.

There are two possible problems with prefix notation: that it is unfamiliar to programmers, and that it is not dense enough. The conventional wisdom in the Lisp world is that the first problem is the real one. I’m not so sure. Yes, prefix notation makes ordinary programmers panic. But I don’t think ordinary programmers’ opinions matter. Languages become popular or unpopular based on what expert hackers think of them, and I think expert hackers might be able to deal with prefix notation. Perl syntax can be pretty incomprehensible, but that has not stood in the way of Perl’s popularity. If anything it may have helped foster a Perl cult.

A more serious problem is the diffuseness of prefix notation. For expert hackers, that really is a problem. No one wants to write (aref a x y) when they could write a[x,y].

In this particular case there is a way to finesse our way out of the problem. If we treat data structures as if they were functions on indexes, we could write (a x y) instead, which is even shorter than the Perl form. Similar tricks may shorten other types of expressions.
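The same trick can be sketched in Python, which lets an object be applied like a function. The `Table` class and its `set` method here are invented for illustration; the point is only that `a(x, y)` can replace an explicit accessor like `(aref a x y)`.

```python
class Table:
    """A 2-D array that is applied to its indexes like a function,
    so the essay's (a x y) becomes a(x, y) here."""

    def __init__(self, rows, cols, fill=0):
        self.cols = cols
        self.data = [fill] * (rows * cols)

    def __call__(self, x, y):
        # The data structure acts as a function on its indexes.
        return self.data[x * self.cols + y]

    def set(self, x, y, value):
        self.data[x * self.cols + y] = value

a = Table(3, 3)
a.set(1, 2, 42)
print(a(1, 2))  # 42
```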

We can get rid of (or make optional) a lot of parentheses by making indentation significant. That’s how programmers read code anyway: when indentation says one thing and delimiters say another, we go by the indentation. Treating indentation as significant would eliminate this common source of bugs as well as making programs shorter.

Sometimes infix syntax is easier to read. This is especially true for math expressions. I’ve used Lisp my whole programming life and I still don’t find prefix math expressions natural. And yet it is convenient, especially when you’re generating code, to have operators that take any number of arguments. So if we do have infix syntax, it should probably be implemented as some kind of read-macro.
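The convenience of operators that take any number of arguments shows up most clearly when code is being generated. A small Python sketch, with a hypothetical `plus` standing in for Lisp's variadic `+`:

```python
from functools import reduce
import operator

def plus(*args):
    """A Lisp-style variadic +. A code generator can emit
    plus(*terms) without caring whether there are zero, one,
    or many terms; with binary infix + it would have to
    special-case the short lists."""
    return reduce(operator.add, args, 0)

terms = [1, 2, 3, 4]   # imagine these were produced by a macro
print(plus(*terms))    # 10
print(plus())          # 0, the additive identity
```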

I don’t think we should be religiously opposed to introducing syntax into Lisp, as long as it translates in a well-understood way into underlying s-expressions. There is already a good deal of syntax in Lisp. It’s not necessarily bad to introduce more, as long as no one is forced to use it. In Common Lisp, some delimiters are reserved for the language, suggesting that at least some of the designers intended to have more syntax in the future.

One of the most egregiously unlispy pieces of syntax in Common Lisp occurs in format strings; format is a language in its own right, and that language is not Lisp. If there were a plan for introducing more syntax into Lisp, format specifiers might be able to be included in it. It would be a good thing if macros could generate format specifiers the way they generate any other kind of code.

An eminent Lisp hacker told me that his copy of CLTL falls open to the section on format. Mine too. This probably indicates room for improvement. It may also mean that programs do a lot of I/O.

8 Efficiency

A good language, as everyone knows, should generate fast code. But in practice I don’t think fast code comes primarily from things you do in the design of the language. As Knuth pointed out long ago, speed only matters in certain critical bottlenecks. And as many programmers have observed since, one is very often mistaken about where these bottlenecks are.

So, in practice, the way to get fast code is to have a very good profiler, rather than by, say, making the language strongly typed. You don’t need to know the type of every argument in every call in the program. You do need to be able to declare the types of arguments in the bottlenecks. And even more, you need to be able to find out where the bottlenecks are.
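As an illustration of letting the profiler, rather than the type system, do the work, here is a minimal sketch using Python's standard `cProfile` and `pstats` modules (the program itself is a made-up example with a deliberate bottleneck):

```python
import cProfile
import io
import pstats

def slow_sum(n):
    # A deliberately naive inner loop: this is the bottleneck.
    total = 0
    for i in range(n):
        total += i * i
    return total

def main():
    return [slow_sum(20000) for _ in range(50)]

# Run the program under the profiler, then ask where the time went.
profiler = cProfile.Profile()
profiler.enable()
main()
profiler.disable()

buf = io.StringIO()
pstats.Stats(profiler, stream=buf).sort_stats("cumulative").print_stats(5)
report = buf.getvalue()
print(report)  # slow_sum dominates the cumulative time
```

Only after the report names `slow_sum` as the hot spot is it worth declaring types or otherwise hand-tuning that one function.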

One complaint people have had with Lisp is that it’s hard to tell what’s expensive. This might be true. It might also be inevitable, if you want to have a very abstract language. And in any case I think good profiling would go a long way toward fixing the problem: you’d soon learn what was expensive.

Part of the problem here is social. Language designers like to write fast compilers. That’s how they measure their skill. They think of the profiler as an add-on, at best. But in practice a good profiler may do more to improve the speed of actual programs written in the language than a compiler that generates fast code. Here, again, language designers are somewhat out of touch with their users. They do a really good job of solving slightly the wrong problem.

It might be a good idea to have an active profiler — to push performance data to the programmer instead of waiting for him to come asking for it. For example, the editor could display bottlenecks in red when the programmer edits the source code. Another approach would be to somehow represent what’s happening in running programs. This would be an especially big win in server-based applications, where you have lots of running programs to look at. An active profiler could show graphically what’s happening in memory as a program’s running, or even make sounds that tell what’s happening.

Sound is a good cue to problems. In one place I worked, we had a big board of dials showing what was happening to our web servers. The hands were moved by little servomotors that made a slight noise when they turned. I couldn’t see the board from my desk, but I found that I could tell immediately, by the sound, when there was a problem with a server.

It might even be possible to write a profiler that would automatically detect inefficient algorithms. I would not be surprised if certain patterns of memory access turned out to be sure signs of bad algorithms. If there were a little guy running around inside the computer executing our programs, he would probably have as long and plaintive a tale to tell about his job as a federal government employee. I often have a feeling that I’m sending the processor on a lot of wild goose chases, but I’ve never had a good way to look at what it’s doing.

A number of Lisps now compile into byte code, which is then executed by an interpreter. This is usually done to make the implementation easier to port, but it could be a useful language feature. It might be a good idea to make the byte code an official part of the language, and to allow programmers to use inline byte code in bottlenecks. Then such optimizations would be portable too.
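CPython happens to work the same way, and its standard `dis` module exposes the byte code. Python doesn't let you write inline byte code, but inspecting it is a first step toward the kind of bottleneck tuning described above; a small sketch:

```python
import dis

def inner_loop(xs):
    return sum(x * x for x in xs)

# CPython compiles inner_loop to byte code; dis lists the
# instructions the interpreter will execute.
ops = [instr.opname for instr in dis.Bytecode(inner_loop)]
print(ops)
```

The exact opcode names vary between interpreter versions, which is precisely the portability problem that making byte code an official part of the language would solve.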

The nature of speed, as perceived by the end-user, may be changing. With the rise of server-based applications, more and more programs may turn out to be I/O-bound. It will be worth making I/O fast. The language can help with straightforward measures like simple, fast, formatted output functions, and also with deep structural changes like caching and persistent objects.

Users are interested in response time. But another kind of efficiency will be increasingly important: the number of simultaneous users you can support per processor. Many of the interesting applications written in the near future will be server-based, and the number of users per server is the critical question for anyone hosting such applications. In the capital cost of a business offering a server-based application, this is the divisor.

For years, efficiency hasn’t mattered much in most end-user applications. Developers have been able to assume that each user would have an increasingly powerful processor sitting on their desk. And by Parkinson’s Law, software has expanded to use the resources available. That will change with server-based applications. In that world, the hardware and software will be supplied together. For companies that offer server-based applications, it will make a very big difference to the bottom line how many users they can support per server.

In some applications, the processor will be the limiting factor, and execution speed will be the most important thing to optimize. But often memory will be the limit; the number of simultaneous users will be determined by the amount of memory you need for each user’s data. The language can help here too. Good support for threads will enable all the users to share a single heap. It may also help to have persistent objects and/or language level support for lazy loading.

9 Time

The last ingredient a popular language needs is time. No one wants to write programs in a language that might go away, as so many programming languages do. So most hackers will tend to wait until a language has been around for a couple years before even considering using it.

Inventors of wonderful new things are often surprised to discover this, but you need time to get any message through to people. A friend of mine rarely does anything the first time someone asks him. He knows that people sometimes ask for things that they turn out not to want. To avoid wasting his time, he waits till the third or fourth time he’s asked to do something; by then, whoever’s asking him may be fairly annoyed, but at least they probably really do want whatever they’re asking for.

Most people have learned to do a similar sort of filtering on new things they hear about. They don’t even start paying attention until they’ve heard about something ten times. They’re perfectly justified: the majority of hot new whatevers do turn out to be a waste of time, and eventually go away. By delaying learning VRML, I avoided having to learn it at all.

So anyone who invents something new has to expect to keep repeating their message for years before people will start to get it. We wrote what was, as far as I know, the first web-server based application, and it took us years to get it through to people that it didn’t have to be downloaded. It wasn’t that they were stupid. They just had us tuned out.

The good news is, simple repetition solves the problem. All you have to do is keep telling your story, and eventually people will start to hear. It’s not when people notice you’re there that they pay attention; it’s when they notice you’re still there.

It’s just as well that it usually takes a while to gain momentum. Most technologies evolve a good deal even after they’re first launched — programming languages especially. Nothing could be better, for a new technology, than a few years of being used only by a small number of early adopters. Early adopters are sophisticated and demanding, and quickly flush out whatever flaws remain in your technology. When you only have a few users you can be in close contact with all of them. And early adopters are forgiving when you improve your system, even if this causes some breakage.

There are two ways new technology gets introduced: the organic growth method, and the big bang method. The organic growth method is exemplified by the classic seat-of-the-pants underfunded garage startup. A couple guys, working in obscurity, develop some new technology. They launch it with no marketing and initially have only a few (fanatically devoted) users. They continue to improve the technology, and meanwhile their user base grows by word of mouth. Before they know it, they’re big.

The other approach, the big bang method, is exemplified by the VC-backed, heavily marketed startup. They rush to develop a product, launch it with great publicity, and immediately (they hope) have a large user base.

Generally, the garage guys envy the big bang guys. The big bang guys are smooth and confident and respected by the VCs. They can afford the best of everything, and the PR campaign surrounding the launch has the side effect of making them celebrities. The organic growth guys, sitting in their garage, feel poor and unloved. And yet I think they are often mistaken to feel sorry for themselves. Organic growth seems to yield better technology and richer founders than the big bang method. If you look at the dominant technologies today, you’ll find that most of them grew organically.

This pattern doesn’t only apply to companies. You see it in sponsored research too. Multics and Common Lisp were big-bang projects, and Unix and MacLisp were organic growth projects.

10 Redesign

“The best writing is rewriting,” wrote E. B. White. Every good writer knows this, and it’s true for software too. The most important part of design is redesign. Programming languages, especially, don’t get redesigned enough.

To write good software you must simultaneously keep two opposing ideas in your head. You need the young hacker’s naive faith in his abilities, and at the same time the veteran’s skepticism. You have to be able to think how hard can it be? with one half of your brain while thinking it will never work with the other.

The trick is to realize that there’s no real contradiction here. You want to be optimistic and skeptical about two different things. You have to be optimistic about the possibility of solving the problem, but skeptical about the value of whatever solution you’ve got so far.

People who do good work often think that whatever they’re working on is no good. Others see what they’ve done and are full of wonder, but the creator is full of worry. This pattern is no coincidence: it is the worry that made the work good.

If you can keep hope and worry balanced, they will drive a project forward the same way your two legs drive a bicycle forward. In the first phase of the two-cycle innovation engine, you work furiously on some problem, inspired by your confidence that you’ll be able to solve it. In the second phase, you look at what you’ve done in the cold light of morning, and see all its flaws very clearly. But as long as your critical spirit doesn’t outweigh your hope, you’ll be able to look at your admittedly incomplete system, and think, how hard can it be to get the rest of the way?, thereby continuing the cycle.

It’s tricky to keep the two forces balanced. In young hackers, optimism predominates. They produce something, are convinced it’s great, and never improve it. In old hackers, skepticism predominates, and they won’t even dare to take on ambitious projects.

Anything you can do to keep the redesign cycle going is good. Prose can be rewritten over and over until you’re happy with it. But software, as a rule, doesn’t get redesigned enough. Prose has readers, but software has users. If a writer rewrites an essay, people who read the old version are unlikely to complain that their thoughts have been broken by some newly introduced incompatibility.

Users are a double-edged sword. They can help you improve your language, but they can also deter you from improving it. So choose your users carefully, and be slow to grow their number. Having users is like optimization: the wise course is to delay it. Also, as a general rule, you can at any given time get away with changing more than you think. Introducing change is like pulling off a bandage: the pain is a memory almost as soon as you feel it.

Everyone knows that it’s not a good idea to have a language designed by a committee. Committees yield bad design. But I think the worst danger of committees is that they interfere with redesign. It is so much work to introduce changes that no one wants to bother. Whatever a committee decides tends to stay that way, even if most of the members don’t like it.

Even a committee of two gets in the way of redesign. This happens particularly in the interfaces between pieces of software written by two different people. To change the interface both have to agree to change it at once. And so interfaces tend not to change at all, which is a problem because they tend to be one of the most ad hoc parts of any system.

One solution here might be to design systems so that interfaces are horizontal instead of vertical — so that modules are always vertically stacked strata of abstraction. Then the interface will tend to be owned by one of them. The lower of two levels will either be a language in which the upper is written, in which case the lower level will own the interface, or it will be a slave, in which case the interface can be dictated by the upper level.

11 Lisp

What all this implies is that there is hope for a new Lisp. There is hope for any language that gives hackers what they want, including Lisp. I think we may have made a mistake in thinking that hackers are turned off by Lisp’s strangeness. This comforting illusion may have prevented us from seeing the real problem with Lisp, or at least Common Lisp, which is that it sucks for doing what hackers want to do. A hacker’s language needs powerful libraries and something to hack. Common Lisp has neither. A hacker’s language is terse and hackable. Common Lisp is not.

The good news is, it’s not Lisp that sucks, but Common Lisp. If we can develop a new Lisp that is a real hacker’s language, I think hackers will use it. They will use whatever language does the job. All we have to do is make sure this new Lisp does some important job better than other languages.

History offers some encouragement. Over time, successive new programming languages have taken more and more features from Lisp. There is no longer much left to copy before the language you’ve made is Lisp. The latest hot language, Python, is a watered-down Lisp with infix syntax and no macros. A new Lisp would be a natural step in this progression.

I sometimes think that it would be a good marketing trick to call it an improved version of Python. That sounds hipper than Lisp. To many people, Lisp is a slow AI language with a lot of parentheses. Fritz Kunze’s official biography carefully avoids mentioning the L-word. But my guess is that we shouldn’t be afraid to call the new Lisp Lisp. Lisp still has a lot of latent respect among the very best hackers — the ones who took 6.001 and understood it, for example. And those are the users you need to win.

In “How to Become a Hacker,” Eric Raymond describes Lisp as something like Latin or Greek — a language you should learn as an intellectual exercise, even though you won’t actually use it:

Lisp is worth learning for the profound enlightenment experience you will have when you finally get it; that experience will make you a better programmer for the rest of your days, even if you never actually use Lisp itself a lot.

If I didn’t know Lisp, reading this would set me asking questions. A language that would make me a better programmer, if it means anything at all, means a language that would be better for programming. And that is in fact the implication of what Eric is saying.

As long as that idea is still floating around, I think hackers will be receptive enough to a new Lisp, even if it is called Lisp. But this Lisp must be a hacker’s language, like the classic Lisps of the 1970s. It must be terse, simple, and hackable. And it must have powerful libraries for doing what hackers want to do now.

In the matter of libraries I think there is room to beat languages like Perl and Python at their own game. A lot of the new applications that will need to be written in the coming years will be server-based applications. There’s no reason a new Lisp shouldn’t have string libraries as good as Perl, and if this new Lisp also had powerful libraries for server-based applications, it could be very popular. Real hackers won’t turn up their noses at a new tool that will let them solve hard problems with a few library calls. Remember, hackers are lazy.

It could be an even bigger win to have core language support for server-based applications. For example, explicit support for programs with multiple users, or data ownership at the level of type tags.

Server-based applications also give us the answer to the question of what this new Lisp will be used to hack. It would not hurt to make Lisp better as a scripting language for Unix. (It would be hard to make it worse.) But I think there are areas where existing languages would be easier to beat. I think it might be better to follow the model of Tcl, and supply the Lisp together with a complete system for supporting server-based applications. Lisp is a natural fit for server-based applications. Lexical closures provide a way to get the effect of subroutines when the UI is just a series of web pages. S-expressions map nicely onto HTML, and macros are good at generating it. There need to be better tools for writing server-based applications, and there needs to be a new Lisp, and the two would work very well together.
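The closure idea can be sketched in Python. The page/handler protocol here is hypothetical (no real web framework is involved): each "page" returns some HTML plus a closure that will handle the next form submission, so state gathered on earlier pages is carried lexically rather than in a session table.

```python
def ask_name():
    """First page: render a form; return a closure for the reply."""
    def on_name(fields):
        name = fields["name"]  # captured lexically for later pages
        def on_color(fields2):
            # 'name' is still visible here, two pages later.
            return f"<p>{name} likes {fields2['color']}</p>", None
        return "<form>Favorite color?</form>", on_color
    return "<form>Your name?</form>", on_name

page, handler = ask_name()
page, handler = handler({"name": "Alice"})
page, handler = handler({"color": "blue"})
print(page)  # <p>Alice likes blue</p>
```

Each step of the interaction behaves like one call in an ordinary subroutine, even though it spans several HTTP round trips.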

12 The Dream Language

By way of summary, let’s try describing the hacker’s dream language. The dream language is beautiful, clean, and terse. It has an interactive toplevel that starts up fast. You can write programs to solve common problems with very little code. Nearly all the code in any program you write is code that’s specific to your application. Everything else has been done for you.

The syntax of the language is brief to a fault. You never have to type an unnecessary character, or even to use the shift key much.

Using big abstractions you can write the first version of a program very quickly. Later, when you want to optimize, there’s a really good profiler that tells you where to focus your attention. You can make inner loops blindingly fast, even writing inline byte code if you need to.

There are lots of good examples to learn from, and the language is intuitive enough that you can learn how to use it from examples in a couple minutes. You don’t need to look in the manual much. The manual is thin, and has few warnings and qualifications.

The language has a small core, and powerful, highly orthogonal libraries that are as carefully designed as the core language. The libraries all work well together; everything in the language fits together like the parts in a fine camera. Nothing is deprecated, or retained for compatibility. The source code of all the libraries is readily available. It’s easy to talk to the operating system and to applications written in other languages.

The language is built in layers. The higher-level abstractions are built in a very transparent way out of lower-level abstractions, which you can get hold of if you want.

Nothing is hidden from you that doesn’t absolutely have to be. The language offers abstractions only as a way of saving you work, rather than as a way of telling you what to do. In fact, the language encourages you to be an equal participant in its design. You can change everything about it, including even its syntax, and anything you write has, as much as possible, the same status as what comes predefined.

Notes

[1] Macros very close to the modern idea were proposed by Timothy Hart in 1964, two years after Lisp 1.5 was released. What was missing, initially, were ways to avoid variable capture and multiple evaluation; Hart’s examples are subject to both.

[2] In When the Air Hits Your Brain, neurosurgeon Frank Vertosick recounts a conversation in which his chief resident, Gary, talks about the difference between surgeons and internists (“fleas”):

Gary and I ordered a large pizza and found an open booth. The chief lit a cigarette. “Look at those goddamn fleas, jabbering about some disease they’ll see once in their lifetimes. That’s the trouble with fleas, they only like the bizarre stuff. They hate their bread and butter cases. That’s the difference between us and the fucking fleas. See, we love big juicy lumbar disc herniations, but they hate hypertension…”

It’s hard to think of a lumbar disc herniation as juicy (except literally). And yet I think I know what they mean. I’ve often had a juicy bug to track down. Someone who’s not a programmer would find it hard to imagine that there could be pleasure in a bug. Surely it’s better if everything just works. In one way, it is. And yet there is undeniably a grim satisfaction in hunting down certain sorts of bugs.

