Five Questions about Language Design

Paul Graham 2001-05-01

Five Questions about Language Design

May 2001

(These are some notes I made for a panel discussion on programming language design at MIT on May 10, 2001.)

1. Programming Languages Are for People.

Programming languages are how people talk to computers. The computer would be just as happy speaking any language that was unambiguous. We have high-level languages because people can’t deal with machine language. The point of programming languages is to prevent our poor frail human brains from being overwhelmed by a mass of detail.

Architects know that some kinds of design problems are more personal than others. One of the cleanest, most abstract design problems is designing bridges. There your job is largely a matter of spanning a given distance with the least material. The other end of the spectrum is designing chairs. Chair designers have to spend their time thinking about human butts.

Software varies in the same way. Designing algorithms for routing data through a network is a nice, abstract problem, like designing bridges. Whereas designing programming languages is like designing chairs: it’s all about dealing with human weaknesses.

Most of us hate to acknowledge this. Designing systems of great mathematical elegance sounds a lot more appealing to most of us than pandering to human weaknesses. And there is a role for mathematical elegance: some kinds of elegance make programs easier to understand. But elegance is not an end in itself.

And when I say languages have to be designed to suit human weaknesses, I don’t mean that languages have to be designed for bad programmers. In fact I think you ought to design for the best programmers, but even the best programmers have limitations. I don’t think anyone would like programming in a language where all the variables were the letter x with integer subscripts.

2. Design for Yourself and Your Friends.

If you look at the history of programming languages, a lot of the best ones were languages designed for their own authors to use, and a lot of the worst ones were designed for other people to use.

When languages are designed for other people, it’s always a specific group of other people: people not as smart as the language designer. So you get a language that talks down to you. Cobol is the most extreme case, but a lot of languages are pervaded by this spirit.

It has nothing to do with how abstract the language is. C is pretty low-level, but it was designed for its authors to use, and that’s why hackers like it.

The argument for designing languages for bad programmers is that there are more bad programmers than good programmers. That may be so. But those few good programmers write a disproportionately large percentage of the software.

I’m interested in the question, how do you design a language that the very best hackers will like? I happen to think this is identical to the question of how to design a good programming language, but even if it isn’t, it is at least an interesting question.

3. Give the Programmer as Much Control as Possible.

Many languages (especially the ones designed for other people) have the attitude of a governess: they try to prevent you from doing things that they think aren’t good for you. I like the opposite approach: give the programmer as much control as you can.

When I first learned Lisp, what I liked most about it was that it considered me an equal partner. In the other languages I had learned up till then, there was the language and there was my program, written in the language, and the two were very separate. But in Lisp the functions and macros I wrote were just like those that made up the language itself. I could rewrite the language if I wanted. It had the same appeal as open-source software.

4. Aim for Brevity.

Brevity is underestimated and even scorned. But if you look into the hearts of hackers, you’ll see that they really love it. How many times have you heard hackers speak fondly of how in, say, APL, they could do amazing things with just a couple lines of code? I think anything that really smart people really love is worth paying attention to.

I think almost anything you can do to make programs shorter is good. There should be lots of library functions; anything that can be implicit should be; the syntax should be terse to a fault; even the names of things should be short.

And it’s not only programs that should be short. The manual should be thin as well. A good part of manuals is taken up with clarifications and reservations and warnings and special cases. If you force yourself to shorten the manual, in the best case you do it by fixing the things in the language that required so much explanation.

5. Admit What Hacking Is.

A lot of people wish that hacking was mathematics, or at least something like a natural science. I think hacking is more like architecture. Architecture is related to physics, in the sense that architects have to design buildings that don’t fall down, but the actual goal of architects is to make great buildings, not to make discoveries about statics.

What hackers like to do is make great programs. And I think, at least in our own minds, we have to remember that it’s an admirable thing to write great programs, even when this work doesn’t translate easily into the conventional intellectual currency of research papers. Intellectually, it is just as worthwhile to design a language programmers will love as it is to design a horrible one that embodies some idea you can publish a paper about.

Open Questions

1. How to Organize Big Libraries?

Libraries are becoming an increasingly important component of programming languages. They’re also getting bigger, and this can be dangerous. If it takes longer to find the library function that will do what you want than it would take to write it yourself, then all that code is doing nothing but making your manual thick. (The Symbolics manuals were a case in point.) So I think we will have to work on ways to organize libraries. The ideal would be to design them so that the programmer could guess what library call would do the right thing.

2. Are People Really Scared of Prefix Syntax?

This is an open problem in the sense that I have wondered about it for years and still don’t know the answer. Prefix syntax seems perfectly natural to me, except possibly for math. But it could be that a lot of Lisp’s unpopularity is simply due to having an unfamiliar syntax. Whether to do anything about it, if it is true, is another question.

3. What Do You Need for Server-Based Software?

I think a lot of the most exciting new applications that get written in the next twenty years will be Web-based applications, meaning programs that sit on the server and talk to you through a Web browser. And to write these kinds of programs we may need some new things.

One thing we’ll need is support for the new way that server-based apps get released. Instead of having one or two big releases a year, like desktop software, server-based apps get released as a series of small changes. You may have as many as five or ten releases a day. And as a rule everyone will always use the latest version.

You know how you can design programs to be debuggable? Well, server-based software likewise has to be designed to be changeable. You have to be able to change it easily, or at least to know what is a small change and what is a momentous one.

Another thing that might turn out to be useful for server-based software, surprisingly, is continuations. In Web-based software you can use something like continuation-passing style to get the effect of subroutines in the inherently stateless world of a Web session. Maybe it would be worthwhile having actual continuations, if they were not too expensive.
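As a rough illustration of the continuation-passing idea, here is a minimal Python sketch. Everything in it (the `pending` table, `ask`, `handle_request`, and the two-step adding flow) is hypothetical and not taken from any real Web framework. It only shows how a server might store "what to do next" as a closure under a token, then resume it when the next stateless request arrives.

```python
import itertools

# Hypothetical in-memory table: token -> continuation
# (a closure waiting for the next submitted form value).
pending = {}
_tokens = itertools.count(1)

def ask(prompt, continuation):
    """Render a 'page' that asks for input, remembering what to do with the answer."""
    token = str(next(_tokens))
    pending[token] = continuation
    return {"page": prompt, "token": token}

def handle_request(token, value):
    """Stateless entry point: look up the stored continuation and resume it."""
    continuation = pending.pop(token)
    return continuation(value)

def add_two_numbers():
    # Continuation-passing style: each step hands "the rest of the program"
    # to ask() as a function instead of returning to a caller.
    return ask("First number?",
               lambda a: ask("Second number?",
                             lambda b: {"page": f"Sum: {int(a) + int(b)}",
                                        "token": None}))

# Simulate three separate HTTP requests in one session.
first = add_two_numbers()
second = handle_request(first["token"], "2")
final = handle_request(second["token"], "40")
print(final["page"])  # Sum: 42
```

The nested lambdas are what make this awkward to write by hand. A language with actual continuations could let `add_two_numbers` read like an ordinary subroutine that simply waits for each answer.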

4. What New Abstractions Are Left to Discover?

I’m not sure how reasonable a hope this is, but one thing I would really love to do, personally, is discover a new abstraction: something that would make as much of a difference as having first-class functions or recursion or even keyword parameters. This may be an impossible dream. These things don’t get discovered that often. But I am always looking.

Predictions

1. You Can Use Whatever Language You Want.

Writing application programs used to mean writing desktop software. And in desktop software there is a big bias toward writing the application in the same language as the operating system. And so ten years ago, writing software pretty much meant writing software in C. Eventually a tradition evolved: application programs must not be written in unusual languages. And this tradition had so long to develop that nontechnical people like managers and venture capitalists also learned it.

Server-based software blows away this whole model. With server-based software you can use any language you want. Almost nobody understands this yet (especially not managers and venture capitalists). A few hackers understand it, and that’s why we even hear about new, indy languages like Perl and Python. We’re not hearing about Perl and Python because people are using them to write Windows apps.

What this means for us, as people interested in designing programming languages, is that there is now potentially an actual audience for our work.

2. Speed Comes from Profilers.

Language designers, or at least language implementors, like to write compilers that generate fast code. But I don’t think this is what makes languages fast for users. Knuth pointed out long ago that speed only matters in a few critical bottlenecks. And anyone who’s tried it knows that you can’t guess where these bottlenecks are. Profilers are the answer.

Language designers are solving the wrong problem. Users don’t need benchmarks to run fast. What they need is a language that can show them what parts of their own programs need to be rewritten. That’s where speed comes from in practice. So maybe it would be a net win if language implementors took half the time they would have spent doing compiler optimizations and spent it writing a good profiler instead.
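To make the point concrete, here is a small sketch using Python's standard `cProfile` and `pstats` modules. The functions `slow_sum_of_squares`, `cheap_setup`, and `program` are made up for illustration. The idea is only that the profiler's report, not a guess, tells you which part deserves rewriting.

```python
import cProfile
import io
import pstats

def slow_sum_of_squares(n):
    # Deliberately wasteful: builds a whole list just to sum it.
    return sum([i * i for i in range(n)])

def cheap_setup():
    return list(range(10))

def program():
    cheap_setup()
    return slow_sum_of_squares(200_000)

# Collect timing data while the program runs.
profiler = cProfile.Profile()
profiler.enable()
result = program()
profiler.disable()

# Print the five entries with the highest cumulative time.
report = io.StringIO()
stats = pstats.Stats(profiler, stream=report)
stats.sort_stats("cumulative").print_stats(5)
print(report.getvalue())
```

In the report, `slow_sum_of_squares` should sit near the top by cumulative time while `cheap_setup` barely registers, which is the kind of answer that is hard to arrive at by intuition in a real program.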

3. You Need an Application to Drive the Design of a Language.

This may not be an absolute rule, but it seems like the best languages all evolved together with some application they were being used to write. C was written by people who needed it for systems programming. Lisp was developed partly to do symbolic differentiation, and McCarthy was so eager to get started that he was writing differentiation programs even in the first paper on Lisp, in 1960.
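For readers who have not seen one, a symbolic differentiator can be quite small. The sketch below is in Python rather than Lisp and is not McCarthy's program. It assumes a tiny made-up expression format (numbers, variable names as strings, and tuples like `("+", a, b)` or `("*", a, b)`) and makes no attempt to simplify the result.

```python
def diff(expr, var):
    """Differentiate expr with respect to var.

    expr is a number, a variable name (str), or a tuple
    ("+", a, b) or ("*", a, b). The result is not simplified.
    """
    if isinstance(expr, (int, float)):
        return 0                      # d/dx of a constant
    if isinstance(expr, str):
        return 1 if expr == var else 0
    op, a, b = expr
    if op == "+":
        # Sum rule: (a + b)' = a' + b'
        return ("+", diff(a, var), diff(b, var))
    if op == "*":
        # Product rule: (a * b)' = a' * b + a * b'
        return ("+", ("*", diff(a, var), b), ("*", a, diff(b, var)))
    raise ValueError(f"unknown operator: {op!r}")

# d/dx (x*x + 3)
derivative = diff(("+", ("*", "x", "x"), 3), "x")
print(derivative)
```

Because the expressions are ordinary nested data, the program that manipulates them stays short, which hints at why an application like this could shape a language built around lists.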

It’s especially good if your application solves some new problem. That will tend to drive your language to have new features that programmers need. I personally am interested in writing a language that will be good for writing server-based applications.

[During the panel, Guy Steele also made this point, with the additional suggestion that the application should not consist of writing the compiler for your language, unless your language happens to be intended for writing compilers.]

4. A Language Has to Be Good for Writing Throwaway Programs.

You know what a throwaway program is: something you write quickly for some limited task. I think if you looked around you’d find that a lot of big, serious programs started as throwaway programs. I would not be surprised if most programs started as throwaway programs. And so if you want to make a language that’s good for writing software in general, it has to be good for writing throwaway programs, because that is the larval stage of most software.

5. Syntax Is Connected to Semantics.

It’s traditional to think of syntax and semantics as being completely separate. This will sound shocking, but it may be that they aren’t. I think that what you want in your language may be related to how you express it.

I was talking recently to Robert Morris, and he pointed out that operator overloading is a bigger win in languages with infix syntax. In a language with prefix syntax, any function you define is effectively an operator. If you want to define a plus for a new type of number you’ve made up, you can just define a new function to add them. If you do that in a language with infix syntax, there’s a big difference in appearance between the use of an overloaded operator and a function call.
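A small Python sketch may help show the difference in appearance. `Mod7` is a made-up number type and `add` a hypothetical helper. Python has infix syntax, so the example shows what overloading buys in that setting.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Mod7:
    """A made-up number type: integers modulo 7."""
    value: int

    def __add__(self, other):
        # Overloading lets Mod7 use the same infix "+" as built-in numbers.
        return Mod7((self.value + other.value) % 7)

def add(a, b):
    """The same operation as a plain function, with no overloading involved."""
    return Mod7((a.value + b.value) % 7)

x, y, z = Mod7(3), Mod7(5), Mod7(6)

with_function = add(add(x, y), z)   # looks nothing like built-in arithmetic
with_operator = x + y + z           # looks just like 3 + 5 + 6
print(with_function, with_operator)
```

In a prefix language the gap mostly disappears: a call like `(add x y)` already has the same shape as `(+ 3 5)`, so a new function is not visually second-class.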

More Predictions

1. New Programming Languages.

Back in the 1970s it was fashionable to design new programming languages. Recently it hasn’t been. But I think server-based software will make new languages fashionable again. With server-based software, you can use any language you want, so if someone does design a language that actually seems better than others that are available, there will be people who take a risk and use it.

2. Time-Sharing.

In the last panel, Richard Kelsey gave this as an idea whose time has come again, and I completely agree with him. My guess (and Microsoft’s guess, it seems) is that much computing will move from the desktop onto remote servers. In other words, time-sharing is back. And I think there will need to be support for it at the language level. For example, I know that Richard and Jonathan Rees have done a lot of work implementing process scheduling within Scheme 48.

3. Efficiency.

Recently it was starting to seem that computers were finally fast enough. More and more we were starting to hear about byte code, which implies to me at least that we feel we have cycles to spare. But I don’t think we will, with server-based software. Someone is going to have to pay for the servers that the software runs on, and the number of users they can support per machine will be the divisor of their capital cost.

So I think efficiency will matter, at least in computational bottlenecks. It will be especially important to do I/O fast, because server-based applications do a lot of I/O.

It may turn out that byte code is not a win, in the end. Sun and Microsoft seem to be facing off in a kind of a battle of the byte codes at the moment. But they’re doing it because byte code is a convenient place to insert themselves into the process, not because byte code is in itself a good idea. It may turn out that this whole battleground gets bypassed. That would be kind of amusing.

Controversial Opinions

1. Clients.

This is just a guess, but my guess is that the winning model for most applications will be purely server-based. Designing software that works on the assumption that everyone will have your client is like designing a society on the assumption that everyone will just be honest. It would certainly be convenient, but you have to assume it will never happen.

I think there will be a proliferation of devices that have some kind of Web access, and all you’ll be able to assume about them is that they can support simple HTML and forms. Will you have a browser on your cell phone? Will there be a phone in your Palm Pilot? Will your BlackBerry get a bigger screen? Will you be able to browse the Web on your Game Boy? Your watch? I don’t know. And I don’t have to know if I bet on everything just being on the server. It’s just so much more robust to have all the brains on the server.

2. Object-Oriented Programming.

I realize this is a controversial one, but I don’t think object-oriented programming is such a big deal. I think it is a fine model for certain kinds of applications that need that specific kind of data structure, like window systems, simulations, and CAD programs. But I don’t see why it ought to be the model for all programming.

I think part of the reason people in big companies like object-oriented programming is that it yields a lot of what looks like work. Something that might naturally be represented as, say, a list of integers, can now be represented as a class with all kinds of scaffolding and hustle and bustle.

Another attraction of object-oriented programming is that methods give you some of the effect of first-class functions. But this is old news to Lisp programmers. When you have actual first-class functions, you can just use them in whatever way is appropriate to the task at hand, instead of forcing everything into a mold of classes and methods.
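Here is a minimal sketch of that contrast in Python. The `Scorer` hierarchy and the `best` helpers are hypothetical names invented for the example. Both versions choose a word by some scoring rule, but only one needs a class to carry the rule around.

```python
from abc import ABC, abstractmethod

# Class-and-method version: a small hierarchy whose only job
# is to let the caller vary one behavior.
class Scorer(ABC):
    @abstractmethod
    def score(self, word):
        ...

class LengthScorer(Scorer):
    def score(self, word):
        return len(word)

def best_with_class(words, scorer):
    return max(words, key=scorer.score)

# First-class function version: pass the behavior directly.
def best(words, score):
    return max(words, key=score)

words = ["lisp", "python", "c"]
print(best_with_class(words, LengthScorer()))  # python
print(best(words, len))                        # python
print(best(words, lambda w: -len(w)))          # c
```

With functions as values, a new scoring rule is one expression at the call site. The class version needs a new subclass for each rule.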

What this means for language design, I think, is that you shouldn’t build object-oriented programming in too deeply. Maybe the answer is to offer more general, underlying stuff, and let people design whatever object systems they want as libraries.

3. Design by Committee.

Having your language designed by a committee is a big pitfall, and not just for the reasons everyone knows about. Everyone knows that committees tend to yield lumpy, inconsistent designs. But I think a greater danger is that they won’t take risks. When one person is in charge he can take risks that a committee would never agree on.

Is it necessary to take risks to design a good language though? Many people might suspect that language design is something where you should stick fairly close to the conventional wisdom. I bet this isn’t true. In everything else people do, reward is proportionate to risk. Why should language design be any different?
