What I’ve Learned from Hacker News
February 2009
Hacker News was two years old last week. Initially it was supposed to be a side project—an application to sharpen Arc on, and a place for current and future Y Combinator founders to exchange news. It’s grown bigger and taken up more time than I expected, but I don’t regret that because I’ve learned so much from working on it.
Growth
When we launched in February 2007, weekday traffic was around 1600 daily uniques. It’s since grown to around 22,000. This growth rate is a bit higher than I’d like. I’d like the site to grow, since a site that isn’t growing at least slowly is probably dead. But I wouldn’t want it to grow as large as Digg or Reddit—mainly because that would dilute the character of the site, but also because I don’t want to spend all my time dealing with scaling.
I already have problems enough with that. Remember, the original motivation for HN was to test a new programming language, and moreover one that’s focused on experimenting with language design, not performance. Every time the site gets slow, I fortify myself by recalling McIlroy and Bentley’s famous quote, “The key to performance is elegance, not battalions of special cases,” and look for the bottleneck I can remove with least code. So far I’ve been able to keep up, in the sense that performance has remained consistently mediocre despite 14x growth. I don’t know what I’ll do next, but I’ll probably think of something.
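As a quick check on the figures above, the arithmetic can be worked through in a few lines of Python. The two traffic numbers come from the text; the assumption that growth compounded evenly over the two years is purely illustrative and is not something the essay claims.

```python
# Weekday daily uniques quoted in the essay.
launch_uniques = 1_600     # February 2007
current_uniques = 22_000   # February 2009
years_elapsed = 2

# Overall multiple: 22,000 / 1,600 = 13.75, which rounds to the "14x" in the text.
growth_multiple = current_uniques / launch_uniques

# Assumption (not from the essay): growth compounded evenly each year.
annual_multiple = growth_multiple ** (1 / years_elapsed)

print(f"total growth: {growth_multiple:.2f}x")
print(f"implied yearly growth: {annual_multiple:.2f}x")
```

Under that even-compounding assumption the site would have grown a little under fourfold per year.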
This is my attitude to the site generally. Hacker News is an experiment, and an experiment in a very young field. Sites of this type are only a few years old. Internet conversation generally is only a few decades old. So we’ve probably only discovered a fraction of what we eventually will.
That’s why I’m so optimistic about HN. When a technology is this young, the existing solutions are usually terrible; which means it must be possible to do much better; which means many problems that seem insoluble aren’t. Including, I hope, the problem that has afflicted so many previous communities: being ruined by growth.
Dilution
Users have worried about that since the site was a few months old. So far these alarms have been false, but they may not always be. Dilution is a hard problem. But probably soluble; it doesn’t mean much that open conversations have “always” been destroyed by growth when “always” equals 20 instances.
But it’s important to remember we’re trying to solve a new problem, because that means we’re going to have to try new things, most of which probably won’t work. A couple weeks ago I tried displaying the names of users with the highest average comment scores in orange. [1] That was a mistake. Suddenly a culture that had been more or less united was divided into haves and have-nots. I didn’t realize how united the culture had been till I saw it divided. It was painful to watch. [2]
So orange usernames won’t be back. (Sorry about that.) But there will be other equally broken-seeming ideas in the future, and the ones that turn out to work will probably seem just as broken as those that don’t.
Probably the most important thing I’ve learned about dilution is that it’s measured more in behavior than users. It’s bad behavior you want to keep out more than bad people. User behavior turns out to be surprisingly malleable. If people are expected to behave well, they tend to; and vice versa.
Though of course forbidding bad behavior does tend to keep away bad people, because they feel uncomfortably constrained in a place where they have to behave well. But this way of keeping them out is gentler and probably also more effective than overt barriers.
It’s pretty clear now that the broken windows theory applies to community sites as well. The theory is that minor forms of bad behavior encourage worse ones: that a neighborhood with lots of graffiti and broken windows becomes one where robberies occur. I was living in New York when Giuliani introduced the reforms that made the broken windows theory famous, and the transformation was miraculous. And I was a Reddit user when the opposite happened there, and the transformation was equally dramatic.
I’m not criticizing Steve and Alexis. What happened to Reddit didn’t happen out of neglect. From the start they had a policy of censoring nothing except spam. Plus Reddit had different goals from Hacker News. Reddit was a startup, not a side project; its goal was to grow as fast as possible. Combine rapid growth and zero censorship, and the result is a free-for-all. But I don’t think they’d do much differently if they were doing it again. Measured by traffic, Reddit is much more successful than Hacker News.
But what happened to Reddit won’t inevitably happen to HN. There are several local maxima. There can be places that are free-for-alls and places that are more thoughtful, just as there are in the real world; and people will behave differently depending on which they’re in, just as they do in the real world.
I’ve observed this in the wild. I’ve seen people cross-posting on Reddit and Hacker News who actually took the trouble to write two versions, a flame for Reddit and a more subdued version for HN.
Submissions
There are two major types of problems a site like Hacker News needs to avoid: bad stories and bad comments. So far the danger of bad stories seems smaller. The stories on the frontpage now are still roughly the ones that would have been there when HN started.
I once thought I’d have to weight votes to keep crap off the frontpage, but I haven’t had to yet. I wouldn’t have predicted the frontpage would hold up so well, and I’m not sure why it has. Perhaps only the more thoughtful users care enough to submit and upvote links, so the marginal cost of one random new user approaches zero. Or perhaps the frontpage protects itself, by advertising what type of submission is expected.
The most dangerous thing for the frontpage is stuff that’s too easy to upvote. If someone proves a new theorem, it takes some work by the reader to decide whether or not to upvote it. An amusing cartoon takes less. A rant with a rallying cry as the title takes zero, because people vote it up without even reading it.
Hence what I call the Fluff Principle: on a user-voted news site, the links that are easiest to judge will take over unless you take specific measures to prevent it.
Hacker News has two kinds of protections against fluff. The most common types of fluff links are banned as off-topic. Pictures of kittens, political diatribes, and so on are explicitly banned. This keeps out most fluff, but not all of it. Some links are both fluff, in the sense of being very short, and also on topic.
There’s no single solution to that. If a link is just an empty rant, editors will sometimes kill it even if it’s on topic in the sense of being about hacking, because it’s not on topic by the real standard, which is to engage one’s intellectual curiosity. If the posts on a site are characteristically of this type I sometimes ban it, which means new stuff at that url is auto-killed. If a post has a linkbait title, editors sometimes rephrase it to be more matter-of-fact. This is especially necessary with links whose titles are rallying cries, because otherwise they become implicit “vote up if you believe such-and-such” posts, which are the most extreme form of fluff.
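A minimal sketch of what auto-killing submissions from a banned site might look like. The essay does not describe how HN stores or matches banned sites, so the ban list, the function names, and the choice to match subdomains are all assumptions made for illustration.

```python
from urllib.parse import urlsplit

# Hypothetical ban list; the real storage format is not described in the essay.
BANNED_SITES = {"linkbait.example"}

def site_of(url: str) -> str:
    """Return the host of a URL, lowercased and without a leading 'www.'."""
    host = urlsplit(url).hostname or ""
    return host[4:] if host.startswith("www.") else host

def is_auto_killed(url: str, banned=BANNED_SITES) -> bool:
    """True if the URL's host is a banned site or one of its subdomains."""
    host = site_of(url)
    return any(host == site or host.endswith("." + site) for site in banned)

print(is_auto_killed("https://www.linkbait.example/ten-reasons"))  # banned site
print(is_auto_killed("https://notlinkbait.example/essay"))         # different site
```

Matching on the host rather than the full URL is one possible design; it means every new post from the site is killed without an editor having to look at it.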
The techniques for dealing with links have to evolve, because the links do. The existence of aggregators has already affected what they aggregate. Writers now deliberately write things to draw traffic from aggregators—sometimes even specific ones. (No, the irony of this statement is not lost on me.) Then there are the more sinister mutations, like linkjacking—posting a paraphrase of someone else’s article and submitting that instead of the original. These can get a lot of upvotes, because a lot of what’s good in an article often survives; indeed, the closer the paraphrase is to plagiarism, the more survives. [3]
I think it’s important that a site that kills submissions provide a way for users to see what got killed if they want to. That keeps editors honest, and just as importantly, makes users confident they’d know if the editors stopped being honest. HN users can do this by flipping a switch called showdead in their profile. [4]
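The showdead idea can be sketched as a filter that hides killed items by default but never deletes them. Only the name showdead comes from the text; the Story type and the function below are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Story:
    title: str
    dead: bool = False  # set when an editor or the software kills the story

def visible_stories(stories, showdead=False):
    """Hide killed stories unless the reader has switched showdead on."""
    return [s for s in stories if showdead or not s.dead]

stories = [Story("A new theorem"), Story("Empty rant", dead=True)]
print([s.title for s in visible_stories(stories)])
print([s.title for s in visible_stories(stories, showdead=True)])
```

The point of the design is that killing changes what is shown by default, not what exists, so any reader can audit the editors.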
Comments
Bad comments seem to be a harder problem than bad submissions. While the quality of links on the frontpage of HN hasn’t changed much, the quality of the median comment may have decreased somewhat.
There are two main kinds of badness in comments: meanness and stupidity. There is a lot of overlap between the two—mean comments are disproportionately likely also to be dumb—but the strategies for dealing with them are different. Meanness is easier to control. You can have rules saying one shouldn’t be mean, and if you enforce them it seems possible to keep a lid on meanness.
Keeping a lid on stupidity is harder, perhaps because stupidity is not so easily distinguishable. Mean people are more likely to know they’re being mean than stupid people are to know they’re being stupid.
The most dangerous form of stupid comment is not the long but mistaken argument, but the dumb joke. Long but mistaken arguments are actually quite rare. There is a strong correlation between comment quality and length; if you wanted to compare the quality of comments on community sites, average length would be a good predictor. Probably the cause is human nature rather than anything specific to comment threads. Probably it’s simply that stupidity more often takes the form of having few ideas than wrong ones.
Whatever the cause, stupid comments tend to be short. And since it’s hard to write a short comment that’s distinguished for the amount of information it conveys, people try to distinguish them instead by being funny. The most tempting format for stupid comments is the supposedly witty put-down, probably because put-downs are the easiest form of humor. [5] So one advantage of forbidding meanness is that it also cuts down on these.
Bad comments are like kudzu: they take over rapidly. Comments have much more effect on new comments than submissions have on new submissions. If someone submits a lame article, the other submissions don’t all become lame. But if someone posts a stupid comment on a thread, that sets the tone for the region around it. People reply to dumb jokes with dumb jokes.
Maybe the solution is to add a delay before people can respond to a comment, and make the length of the delay inversely proportional to some prediction of its quality. Then dumb threads would grow slower. [6]
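One way such a delay could be computed, as a sketch only. It assumes some predictor already yields a quality score between 0 and 1; the scale, floor, and cap parameters are invented here to keep the delay finite and are not from the essay.

```python
def reply_delay_seconds(predicted_quality, scale=60.0, floor=0.05, cap=3600.0):
    """Delay before replies are allowed, inversely proportional to quality.

    predicted_quality: assumed to lie in [0, 1], higher meaning better.
    floor: hypothetical lower bound that avoids dividing by zero.
    cap: hypothetical upper bound on the delay, in seconds.
    """
    quality = max(predicted_quality, floor)
    return min(scale / quality, cap)

for q in (1.0, 0.5, 0.1):
    print(q, reply_delay_seconds(q))
```

With these illustrative defaults a comment rated 1.0 can be answered after a minute, while one rated 0.1 makes repliers wait ten times as long, which is what would slow the growth of a dumb thread.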
People
I notice most of the techniques I’ve described are conservative: they’re aimed at preserving the character of the site rather than enhancing it. I don’t think that’s a bias of mine. It’s due to the shape of the problem. Hacker News had the good fortune to start out good, so in this case it’s literally a matter of preservation. But I think this principle would also apply to sites with different origins.
The good things in a community site come from people more than technology; it’s mainly in the prevention of bad things that technology comes into play. Technology certainly can enhance discussion. Nested comments do, for example. But I’d rather use a site with primitive features and smart, nice users than a more advanced one whose users were idiots or trolls.
So the most important thing a community site can do is attract the kind of people it wants. A site trying to be as big as possible wants to attract everyone. But a site aiming at a particular subset of users has to attract just those—and just as importantly, repel everyone else. I’ve made a conscious effort to do this on HN. The graphic design is as plain as possible, and the site rules discourage dramatic link titles. The goal is that the only thing to interest someone arriving at HN for the first time should be the ideas expressed there.
The downside of tuning a site to attract certain people is that, to those people, it can be too attractive. I’m all too aware how addictive Hacker News can be. For me, as for many users, it’s a kind of virtual town square. When I want to take a break from working, I walk into the square, just as I might into Harvard Square or University Ave in the physical world. [7] But an online square is more dangerous than a physical one. If I spent half the day loitering on University Ave, I’d notice. I have to walk a mile to get there, and sitting in a cafe feels different from working. But visiting an online forum takes just a click, and feels superficially very much like working. You may be wasting your time, but you’re not idle. Someone is wrong on the Internet, and you’re fixing the problem.
Hacker News is definitely useful. I’ve learned a lot from things I’ve read on HN. I’ve written several essays that began as comments there. So I wouldn’t want the site to go away. But I would like to be sure it’s not a net drag on productivity. What a disaster that would be, to attract thousands of smart people to a site that caused them to waste lots of time. I wish I could be 100% sure that’s not a description of HN.
I feel like the addictiveness of games and social applications is still a mostly unsolved problem. The situation now is like it was with crack in the 1980s: we’ve invented terribly addictive new things, and we haven’t yet evolved ways to protect ourselves from them. We will eventually, and that’s one of the problems I hope to focus on next.
Notes
[1] I tried ranking users by both average and median comment score, and average (with the high score thrown out) seemed the more accurate predictor of high quality. Median may be the more accurate predictor of low quality though.
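A small sketch of the two statistics being compared. The scores are made up, and the function names are hypothetical; the only detail taken from the note is that the average is computed with the single highest score thrown out.

```python
from statistics import median

def average_without_top(scores):
    """Mean of a user's comment scores after discarding the single highest."""
    if not scores:
        return 0.0
    if len(scores) == 1:
        return float(scores[0])
    trimmed = sorted(scores)[:-1]
    return sum(trimmed) / len(trimmed)

def rank_users(scores_by_user, statistic):
    """User names ordered from highest to lowest value of the statistic."""
    return sorted(scores_by_user,
                  key=lambda user: statistic(scores_by_user[user]),
                  reverse=True)

# Invented data: bob has one runaway hit and otherwise low scores.
scores_by_user = {
    "alice": [4, 5, 6, 5],
    "bob": [1, 1, 1, 40],
    "carol": [2, 3, 2, 3],
}
print(rank_users(scores_by_user, average_without_top))
print(rank_users(scores_by_user, median))
```

Dropping the top score keeps one lucky comment from carrying a user to the top, which a plain mean would allow.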
[2] Another thing I learned from this experiment is that if you’re going to distinguish between people, you’d better be sure you do it right. This is one problem where rapid prototyping doesn’t work.
Indeed, that’s the intellectually honest argument for not discriminating between various types of people. The reason not to do it is not that everyone’s the same, but that it’s bad to do wrong and hard to do right.
[3] When I catch egregiously linkjacked posts I replace the url with that of whatever they copied. Sites that habitually linkjack get banned.
[4] Digg is notorious for its lack of transparency. The root of the problem is not that the guys running Digg are especially sneaky, but that they use the wrong algorithm for generating their frontpage. Instead of bubbling up from the bottom as they get more votes, as on Reddit, stories start at the top and get pushed down by new arrivals.
The reason for the difference is that Digg is derived from Slashdot, while Reddit is derived from Delicious/popular. Digg is Slashdot with voting instead of editors, and Reddit is Delicious/popular with voting instead of bookmarking. (You can still see fossils of their origins in their graphic design.)
Digg’s algorithm is very vulnerable to gaming, because any story that makes it onto the frontpage is the new top story. Which in turn forces Digg to respond with extreme countermeasures. A lot of startups have some kind of secret about the subterfuges they had to resort to in the early days, and I suspect Digg’s is the extent to which the top stories were de facto chosen by human editors.
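The difference between the two frontpage orderings described in this note can be sketched as two sort keys. This is a deliberate simplification: neither site's actual algorithm is given in the essay, real ranking functions typically also account for a story's age, and the Story fields below are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Story:
    title: str
    votes: int
    promoted_at: int  # hypothetical tick at which the story reached the frontpage

def pushdown_frontpage(stories):
    """Digg-style as described in the note: the newest arrival sits on top."""
    return sorted(stories, key=lambda s: s.promoted_at, reverse=True)

def bubbleup_frontpage(stories):
    """Reddit-style as described in the note: more votes means a higher spot."""
    return sorted(stories, key=lambda s: s.votes, reverse=True)

stories = [
    Story("Well-liked essay", votes=120, promoted_at=1),
    Story("Solid tutorial", votes=45, promoted_at=2),
    Story("Gamed submission", votes=30, promoted_at=3),
]
print([s.title for s in pushdown_frontpage(stories)])
print([s.title for s in bubbleup_frontpage(stories)])
```

In the push-down ordering the gamed submission becomes the top story the moment it is promoted, whereas in the bubble-up ordering it starts at the bottom and has to keep earning votes to climb.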
[5] The dialog on Beavis and Butthead was composed largely of these, and when I read comments on really bad sites I can hear them in their voices.
[6] I suspect most of the techniques for discouraging stupid comments have yet to be discovered. Xkcd implemented a particularly clever one in its IRC channel: don’t allow the same thing twice. Once someone has said “fail,” no one can ever say it again. This would penalize short comments especially, because they have less room to avoid collisions in.
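A minimal sketch of a "never the same thing twice" rule. It is not the code used in the Xkcd channel; the normalization step (lowercasing and dropping punctuation) is an assumption added so that trivial variations count as repeats.

```python
import re

class NoRepeatFilter:
    """Rejects any message whose normalized text has been seen before."""

    def __init__(self):
        self._seen = set()

    @staticmethod
    def _normalize(text):
        # Assumption: case and punctuation do not make a message new.
        return " ".join(re.findall(r"[a-z0-9']+", text.lower()))

    def allow(self, text):
        key = self._normalize(text)
        if key in self._seen:
            return False
        self._seen.add(key)
        return True

chat = NoRepeatFilter()
print(chat.allow("fail"))    # first use is accepted
print(chat.allow("FAIL!"))   # normalizes to the same text, so it is rejected
print(chat.allow("This fails because the cache is never invalidated."))
```

A longer message has far more possible wordings, so it is much less likely to collide with something already said, which is why the rule falls hardest on short comments.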
Another promising idea is the stupid filter, which is just like a probabilistic spam filter, but trained on corpora of stupid and non-stupid comments instead.
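To make the analogy concrete, here is a toy naive Bayes classifier of the kind used in probabilistic spam filters, trained on two tiny invented corpora. It is a sketch under stated assumptions rather than a description of any existing stupid filter: the class name, labels, tokenizer, and add-one smoothing are all illustrative choices.

```python
import math
import re
from collections import Counter

def tokens(text):
    """Lowercased word tokens; a deliberately crude tokenizer."""
    return re.findall(r"[a-z']+", text.lower())

class StupidFilter:
    LABELS = ("stupid", "ok")

    def __init__(self):
        self.word_counts = {label: Counter() for label in self.LABELS}
        self.doc_counts = Counter()

    def train(self, text, label):
        self.word_counts[label].update(tokens(text))
        self.doc_counts[label] += 1

    def prob_stupid(self, text):
        """Estimated probability that a comment belongs to the stupid corpus."""
        vocab = set(self.word_counts["stupid"]) | set(self.word_counts["ok"])
        total_docs = sum(self.doc_counts.values())
        log_score = {}
        for label in self.LABELS:
            counts = self.word_counts[label]
            total_words = sum(counts.values())
            # Smoothed log prior plus add-one smoothed log likelihoods.
            score = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
            for word in tokens(text):
                score += math.log((counts[word] + 1) / (total_words + len(vocab) + 1))
            log_score[label] = score
        # Turn the two log scores into a probability for the "stupid" label.
        return 1 / (1 + math.exp(log_score["ok"] - log_score["stupid"]))

f = StupidFilter()
for text in ("lol fail", "epic fail lol", "first"):
    f.train(text, "stupid")
for text in ("the cache is never invalidated so the benchmark measures stale reads",
             "arc macros expand before evaluation which explains the error"):
    f.train(text, "ok")

print(f.prob_stupid("lol epic fail") > 0.5)
print(f.prob_stupid("the benchmark measures the cache") > 0.5)
```

A usable filter would need far larger corpora, and deciding which comments go in which corpus is the hard part that this sketch leaves out.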
You may not have to kill bad comments to solve the problem. Comments at the bottom of a long thread are rarely seen, so it may be enough to incorporate a prediction of quality in the comment sorting algorithm.
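A sketch of folding a quality prediction into comment ordering. The predictor here uses length as a stand-in for quality, following the correlation the essay mentions; the 50-word saturation point, the weight, and the dictionary layout are assumptions chosen for illustration.

```python
def predicted_quality(text, saturation_words=50):
    """Hypothetical predictor: longer comments score higher, capped at 1.0."""
    return min(len(text.split()) / saturation_words, 1.0)

def sort_comments(comments, weight=5.0):
    """Order comments by points plus a weighted quality prediction."""
    return sorted(
        comments,
        key=lambda c: c["points"] + weight * predicted_quality(c["text"]),
        reverse=True,
    )

comments = [
    {"author": "a", "points": 3, "text": "lol"},
    {"author": "b", "points": 1, "text": " ".join(["word"] * 30)},
]
print([c["author"] for c in sort_comments(comments)])
```

Here the one-word joke scores 3 + 5 × 0.02 = 3.1 and the thirty-word comment scores 1 + 5 × 0.6 = 4.0, so the joke sinks toward the bottom of the thread without being killed.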
[7] What makes most suburbs so demoralizing is that there’s no center to walk to.
Thanks to Justin Kan, Jessica Livingston, Robert Morris, Alexis Ohanian, Emmett Shear, and Fred Wilson for reading drafts of this.