OpenAI 的 CPO 谈 AI 如何改变必备技能、护城河、编程、创业方法论等 | Kevin Weil
OpenAI 的 CPO 谈 AI 如何改变必备技能、护城河、编程、创业方法论等 | Kevin Weil
访谈实录
AI 模型的进化速度
Kevin Weil: 你今天使用的 AI 模型,将是你余生所用过的最差的 AI 模型。当你真正意识到这一点时,会觉得相当震撼。在我之前工作过的每一个地方,你大致知道自己建立在什么技术之上,但 AI 领域完全不是这样。每隔两个月,计算机就能做到以前从未做过的事情,你需要彻底重新思考自己正在做的事。
Lenny Rachitsky: 你大概是当今世界上最重要公司的首席产品官。我想聊聊身处风暴中心到底是什么感觉。
Kevin Weil: 我们的总体心态是:两个月后会有一个更好的模型,它会彻底打破当前的所有局限。我们也对开发者这么说——如果你正在构建的产品刚好处于模型能力边缘,继续做下去,因为你做对了方向。再过几个月,模型会变得非常强大,而你那个勉强能跑的产品,突然就会大放异彩。
Libra:职业生涯最大的遗憾
Lenny Rachitsky: 你曾在 Facebook 领导过一个著名的项目叫 Libra。
Kevin Weil: Libra 可能是我职业生涯中最大的遗憾。这件事今天在世界上不存在,从根本上让我感到失望,因为如果我们当初能把那个产品做出来,世界会变得更好。我们试图推出一条新的区块链,最初是一篮子货币的设计,集成到 WhatsApp 和 Messenger 中。我可以在 WhatsApp 里免费给你发 50 美分。这东西应该存在的。说实话,现任政府对加密货币非常友好,Facebook 的声誉也已经完全不同了。也许他们现在应该去把它做出来。
Lenny Rachitsky: 今天的嘉宾是 Kevin Weil。Kevin 是 OpenAI 的首席产品官,这大概是当今世界上最重要、影响力最大的公司,处于 AI、AGI(通用人工智能)、也许未来还有超级智能的最前沿。他此前曾担任 Instagram 和 Twitter 的产品负责人,也是 Facebook Libra 加密货币的联合创建者,我们会聊到这段经历。他同时还是 Planet、Strava、Black Product Managers Network 和大自然保护协会(the Nature Conservancy)的董事会成员。他为人也非常好,有太多智慧可以分享。我们会聊到 OpenAI 如何运作,AI 对我们所有人工作和构建产品方式的影响,AI 生态系统中哪些市场是 OpenAI 这样的公司不太可能涉足、从而适合创业公司去占领的,为什么学习编写 evals 的技艺正在迅速成为产品构建者的核心技能,AI 时代哪些技能最重要,他在教自己的孩子关注什么,以及更多内容。
Lenny Rachitsky: Kevin,非常感谢你能来,欢迎来到播客。
Kevin Weil: 非常感谢邀请我。我们说要来做这期节目说了好久,终于成行了。
Lenny Rachitsky: 终于做到了。我无法想象你每天的生活有多疯狂,所以真的很感谢你抽出时间。我们录制这期节目的这一周正好赶上你们发布了新的图像模型,算是巧合。我整个社交媒体信息流都被吉卜力化的生活照、家庭照和各种图片刷屏了,干得漂亮。
Kevin Weil: 是的,我的也是。我妻子 Elizabeth 给我发了一张她的,我跟你们一样。
吉卜力风格图片的爆火
Lenny Rachitsky: 我就想问,你们预期到会有这种反应吗?感觉这是 AI 领域最病毒式传播的事件了,这个门槛可不低,毕竟自 ChatGPT 发布以来——你们预期到会这么成功吗?内部是什么感觉?
Kevin Weil: 在我的职业生涯中,有那么几次,你在内部做一个产品,内部使用量会突然爆发式增长。顺便说一句,当年我们在 Instagram 做 Stories 的时候也是如此。在我的职业生涯中,比任何其他经历都更能让我们感受到产品会成功——因为所有人都在内部使用它。周末我们会离开,上线之前我们都在用,周末回来后我们就知道发生了什么,会说”哦,我看到你去露营了,怎么样?“你会觉得,天哪,这东西真的有效。ImageGen 绝对也是这种情况之一。我们在内部大概玩了两三个月,当它首次在公司内部上线时,有一个小画廊,你可以自己生成图片,也可以看到其他所有人正在生成什么,那简直是持续不断的惊叹和兴奋。所以是的,我们有一种感觉,这会让大家玩得很开心。
Lenny Rachitsky: 这太酷了。这应该可以作为对即将发布的产品是否有信心的一个衡量标准——内部所有人都为之疯狂。
Kevin Weil: 是的,尤其是社交类产品,因为公司内部本身就是一个紧密的社交网络,大家彼此认识,而且你希望自己就是产品的专家。所以在某种意义上,如果你在做社交产品,而它在内部都没有火起来,你可能就要反思自己在做什么了。
Lenny Rachitsky: 对了,顺便问一下,吉卜力化那件事,是你们有意引导的,还是怎么开始的?是故意做的示例吗?
Kevin Weil: 我觉得就是大家喜欢那种风格,而且模型在模仿风格方面真的很有能力,能理解……它的指令遵循能力非常强。这一点我觉得大家正在逐渐发现——你可以让它做非常复杂的事情。你可以给它两张图片,一张是你的客厅,另一张是一堆照片或纪念品之类的东西,然后说:“告诉我你会怎么摆放这些东西。“或者你可以说:“我想看看如果把这个放在这里,这个东西放在它的右边,这个放在那个的左边但在那个的下面,效果会怎样。“模型真的能理解所有这些指令并执行。它强大得不可思议。所以我非常期待大家会想出各种不同的用法。
AI 的”常态化的奇迹”
Lenny Rachitsky: 好的。做得好。OpenAI 团队干得漂亮。好了,让我们进入正题,把视角拉远一些。我的看法是,你现在是全球可能最重要的公司的首席产品官。先不说把标准定得太高——但你们正在引领 AI 的发展,最终是通用人工智能(AGI),再之后是超级智能。没什么大不了的。我对你的问题比我对任何其他嘉宾都多。我实际上在 Twitter、LinkedIn 和我的社区里公开征集了大家想问 Kevin 什么,收到了三百多个成型的问题,我们会逐一过一遍。开个玩笑。
Kevin Weil: 好的。
Lenny Rachitsky: 我挑出了最好的那些,有很多我非常好奇的东西。
Kevin Weil: 嗯,我现在这边是下午一点,天黑还早,来吧。
Lenny Rachitsky: 好的,开始吧。首先,我来记一下笔记。通用人工智能(AGI)什么时候发布?十二月几号?
Kevin Weil: 这个嘛,我们刚发布了一个不错的 ImageGen 模型,那个算不算?
Lenny Rachitsky: 快了快了。
Kevin Weil: 有一句我很喜欢的名言,叫”AI 就是那些还没被做成的事情”。因为一旦做成了,当它开始能用的时候,你就叫它机器学习了;等它变得无处不在了,那它就只是一个算法。所以我一直很喜欢这个现象:当一样东西还不太能用的时侯,我们称之为 AI;等到它变成了推荐你关注的 AI 算法,哦那只是个算法,而这个新东西——比如自动驾驶汽车——那才是 AI。我觉得在某种程度上我们永远都会处于这种状态,下一个新东西永远是 AI,而那些我们现在每天都在用、已经成为生活一部分的东西,那就是算法。
Lenny Rachitsky: 这真的很有意思,因为在湾区你会看到自动驾驶汽车在街上跑,现在已经觉得太正常了,而三四年前你看到的话,一定会说”我的天,什么……我们活在未来了”。而现在我们就这么习以为常了。
Kevin Weil: 其实每件事都是这样。如果我给你展示……当 GPT-3 发布的时候,我还没加入 OpenAI,只是一个普通用户,但那种震撼是颠覆性的。而如果我今天把 GPT-3 接入 ChatGPT 给你用,你会说:“这是什么玩意儿?“简直一团糟。
Lenny Rachitsky: 惨不忍睹,惨不忍睹。
Kevin Weil: 我第一次坐 Waymo 的时候也是同样的体验。至少我的第一次,上车的前十秒钟,车开始自己开动,你就”天哪,小心那辆自行车”,紧紧抓住一切能抓的东西。然后五分钟过去,你平静下来了,意识到自己正在没有司机的情况下被载着在城市里穿行,而且一切正常。你就觉得:“天哪,我现在真的活在科幻未来里。“再过十分钟,你就觉得无聊了,开始在手机上处理邮件、回 Slack 消息,而这人类发明的奇迹,从此就成了你生活中一个理所当然的部分。我们在适应 AI 的过程中,确实也是如此。那些奇迹般的事情发生了——计算机做到了以前从未能做到的事情——我们集体为之震撼一周,然后就说,哦,是啊,现在这只是机器学习,正在变成一个算法。
Lenny Rachitsky: 你刚才说的这些里面最疯狂的一点是——ChatGPT 现在感觉已经很粗糙了,而 3.5 不过是几年前的事,想象一下几年后生活会变成什么样。我们会聊到这个的,聊技术走向哪里,你认为下一个大飞跃是什么。但我想先从你加入 OpenAI 的经历开始聊起。你之前在 Twitter 工作过,在 Facebook 工作过,在 Planet 工作过,在 Instagram 工作过。后来某个时候你被招募加入 OpenAI。我很好奇你加入 OpenAI 担任首席产品官的招聘过程是什么样的,有什么有趣的故事吗?
加入 OpenAI 的故事
Kevin Weil: 如果我没记错时间线的话,我在离开 Planet 时已经沟通过,我打算先休息一段时间。我不是不工作了,但也很乐意过完那个夏天。大概是四月份左右,我想,太好了,我可以陪孩子们过夏天,我们会去太浩湖之类的地方,我真的能好好陪陪他们,而不是像平时那样来回奔波。然后 Sam 和我之前几年有过一些浅层交往,他总是参与很多有意思的项目,比如搞核聚变公司之类的。所以每当我开始考虑下一步做什么的时候,他一直是那种我会打电话聊聊的人,因为我喜欢做那些面向大科技前沿的、下一波浪潮类型的事情。
于是我就给他打了电话,我觉得 Vinod 也帮我们重新牵了线。而这一次,他不是说”你应该去跟那些搞核聚变的人聊聊”。他说:“其实我们正在考虑做一件事,你应该来跟我们谈谈。“我说:“好,听起来太棒了,来吧。“然后一切进展得非常快,非常非常快。我在很短时间内——几天之内——就见了大部分管理团队。他们告诉我:“基本上我们会以自己想要的速度推进。如果你跟所有人都聊过了,大家都认可你,你就可以开始了。“Sam 来我家吃了一顿晚饭,我们聊了一个愉快的晚上,聊 OpenAI 的未来,也更好地了解了彼此。结束时我原本第二天要去参加更大一轮的面试,Sam 说:“进展非常好,我们非常兴奋。”
Kevin Weil: 我说:“太好了,那我明天该怎么准备?“他说:“哦,没问题,别担心。如果明天也顺利的话,基本上就定了。”
于是第二天我去了,见了一堆人,聊得很好。我真的很喜欢遇到的每一个人。任何面试过后你总会自我怀疑——哦,那个不该说,那道题答得不好希望能重来——但我出来后的感觉是:我觉得发挥得还不错。我本来预期那个周末就能收到消息,因为他们基本设定了这个预期——如果一切顺利,随时可以开始。结果什么都没等到。然后周一、周二、周三,还是没有任何消息。我主动联系了 OpenAI 那边的人好几次,依然石沉大海。
我当时就想:“天哪,我搞砸了。我不知道哪里搞砸了,但肯定是彻底搞砸了。简直不敢相信。“我不停地回去找 Elizabeth——我妻子——说:“我到底做错了什么?你觉得我是不是……”整个人都快疯了,但依然没有任何消息。终于,九天之后他们才联系我,原来是内部发生了一堆事情,这个那个的,千头万绪。他们终于说:“哦对,那次聊得很好,来吧。“我说:“好,太好了,那就开始吧。“但那九天简直是煎熬,而他们只是忙于一些内部事务,我却在每一天都坐立不安,把我们面试中的每一句话反复咀嚼。
风暴中心的工作日常
Lenny Rachitsky: 这让我想起谈恋爱的时候,你给对方发了消息却没收到回复,就觉得一定出了什么问题。
Kevin Weil: 对,完全一样。
Lenny Rachitsky: 人家可能只是忙而已。
Kevin Weil: 我到现在还是很难不往坏处想。
Lenny Rachitsky: 太疯狂了。很高兴最终结果是好的。我想这其中的教训就是——别急着下结论。
Kevin Weil: 对,稍微淡定一点。
Lenny Rachitsky: 说到淡定,我想聊聊身处风暴中心到底是什么感觉。你之前在很多公司工作过——虽然那些公司也不算传统——Twitter、Instagram、Facebook、Planet,现在你在 OpenAI。我很好奇,你在 OpenAI 的日常工作方式有什么最大的不同?
Kevin Weil: 我觉得可能是速度。可能有两点。第一是速度。第二是——在 OpenAI 之前我工作过的每一个地方,你基本知道自己是在什么技术之上构建产品的。所以你的时间花在思考:你在解决什么问题?你在为谁构建?你如何改善他们的生活?这是一个足够大的问题以至于能改变人们的习惯吗?人们是否真的在意这个问题的解决?所有那些好的产品问题。但你构建所依赖的技术基础基本是固定的。你谈论的是数据库之类的东西,我敢说你今年用的数据库大概比两年前的好 5%,但在 AI 领域完全不是这样。每两个月计算机就能做到以前从未做到过的事情,你需要彻底重新思考自己在做什么。
这其中有一些根本性的有趣之处,让在这里的工作充满乐趣。我们也许稍后会谈到评估(evals),但在这个世界里……我们对计算机的一切认知都是基于给计算机非常明确的输入。比如拿 Instagram 来说,有按钮做特定的事情,你知道它们会做什么。当你给计算机明确的输入时,你会得到明确的输出。你确信如果做三次同样的操作,会得到三次同样的结果。大语言模型(LLM)完全不同。它们擅长处理模糊的、微妙的输入。人类语言和交流中的所有细微差别,它们都处理得相当不错。而且它们不会给你完全相同的答案。对于同一个问题,你大概会得到”精神上”相同的答案,但每次的措辞肯定不一样。所以输入更模糊,输出也更模糊。当你构建产品时,如果你围绕某个特定用例来构建,这一点真的非常重要。
如果模型在某件事上的正确率是 60%,你构建的产品和正确率 95% 时截然不同,跟正确率 99.5% 时又完全不一样。因此你必须深入到用例和评估(evals)的细节中去,才能理解应该构建什么样的产品。这是根本性的不同。如果你的数据库能跑通一次,它每次都能跑通。但在我们这个领域,不是这样的。
评估(evals)为什么重要
Lenny Rachitsky: 那我们就顺着评估(evals)这条线聊下去。我一直想谈这个。在 Lenny & Friends Summit 上我们有一个传奇般的panel,是你和 Mike Krieger,Sarah Guo 主持。
Kevin Weil: 那次很有趣。
Lenny Rachitsky: 太有趣了。那场 panel 上有一句话让很多人印象深刻——你说编写评估(evals)将成为产品经理的核心技能,我觉得这个说法可能不仅适用于产品经理。很多人知道评估(evals)是什么,也有很多人完全不知道我在说什么。所以能不能简单解释一下什么是评估(evals),以及为什么你认为这对未来构建产品的人如此重要?
Kevin Weil: 当然。我觉得最简单的理解方式是把它当作模型的测验——一种测试,用来衡量它对某类知识掌握得有多好,或者它回答某类问题的能力有多强。就像你上了一门微积分课,然后有微积分考试来检验你是否学到了该学的东西。你有评估(evals)来测试模型在创意写作方面有多好?在研究生水平的科学方面有多好?在竞技编程方面有多强?所以你有这样一套评估(evals),基本上作为模型有多聪明、多能干的基准。
Lenny Rachitsky: 一个简单的理解方式是不是可以把它想成模型的单元测试?
Kevin Weil: 对,单元测试,就是针对模型的各类测试。完全正确。
Lenny Rachitsky: 好,好。那对于那些还不太明白评估(evals)到底是怎么回事的人来说,为什么它对构建 AI 产品如此关键?
Kevin Weil: 这就回到我刚才说的。你需要知道你的模型在某个任务上的表现——有些事情模型能做到 99.95% 的正确率,你可以完全放心。有些它们能做到 95%,有些只有 60%。如果模型在某件事上只有 60% 的正确率,你就得用完全不同的方式来构建产品。而且话说回来,这些能力也不是静态的。所以评估(evals)的一个重要部分是——如果你知道你在为某个用例构建产品,比如拿我们的深度研究(deep research)产品来说,这可能是我最喜欢的一个我们发布的产品。深度研究的理念是,对于没用过的人来说,你现在可以给 ChatGPT 一个任意复杂的查询。它不是返回一个搜索查询的结果——那个我们也能做。
Kevin Weil: 它的工作方式是这样的——如果你要自己回答这个问题,你可能会花两个小时在网上阅读资料,然后可能还需要读一些论文,接着你开始整理思路,写着写着发现还有知识盲区,于是又回去查更多资料。为这样一个问题写出一篇二十页的答案,可能要花你整整一周。而你可以让 ChatGPT 替你跑上二三十分钟。它不会给你你习惯的那种即时回答,但它可能会花二三十分钟,完成原本需要你一周才能做完的工作。所以我们构建这个产品的时候,在设计评估(evals)的同时,也在思考产品将如何运作,并且尝试梳理出那些核心用例。
比如,“这是一个你希望能提出的问题,而这是一个针对该问题的精彩回答。“然后把它们转化成评估(evals),再在这些评估上不断爬坡优化。所以,并不是说模型是静态的、我们只能寄望于它在某些事情上表现还行——你可以教会模型,让它成为一个持续学习的过程。因此,当我们为深度研究(deep research)微调模型以使其能够回答这些问题时,我们能够测试它在那些我们认为对衡量产品效果至关重要的评估(evals)上是否在持续进步。当你开始看到这些变化,当你看到评估(evals)上的表现不断攀升,你就会说:“好吧,我觉得我们有了一个真正的产品。”
Lenny Rachitsky: 你之前说过一句类似的话,说 AI 能有多惊艳,几乎取决于我们的评估(evals)做得有多好。这个说法你还认同吗?在这方面还有什么进一步的想法?
Kevin Weil: 我的意思是,这些模型是智能体,而智能本质上是多维度的。你可以说一个模型在竞赛编程上非常出色,但这和它在——
Kevin Weil: ——前端编码或后端编码上表现出色可能不是一回事,也和把一大堆 COBOL 代码转成 Python 不是一回事。而这还仅仅是在软件工程领域。所以我认为,你可以把这些模型看作是极其聪明、知识面非常广的智能体,但世界上大多数数据、知识和流程并不是公开的——它们隐藏在公司、政府或其他机构的围墙之内。就像你加入一家公司,前两周都在做入职培训一样,你需要学习公司特有的流程,获取公司特有的数据。这些模型已经足够聪明,你什么都可以教它们,但它们需要有原始数据来学习。
企业定制化与创业机会
所以我认为,未来的发展方向将是极其聪明的通用基础模型,再通过公司特有或用例特有的数据进行微调和定制,使其在公司特有或用例特有的任务上表现出色。而你会用自定义的评估(evals)来衡量这一点。所以我之前所说的就是,这些模型确实非常聪明,但如果数据不在它们的训练集中,你仍然需要教它们一些东西,而有大量的用例不会出现在训练集中,因为它们只与某个特定行业或某家公司相关。
Lenny Rachitsky: 我想继续沿着你引导我们的这条线往下走,不过我之后还会回来,因为围绕这些话题我还有更多问题。你谈到了一个很多 AI 创始人都在思考的问题——OpenAI 未来不会来碾压我的领域在哪里?或者说其他基础模型不会涉足的领域在哪里?对很多人来说,这很不清楚——“我到底应不应该在这个领域创业?“你有什么建议或指引吗?关于你认为 OpenAI 或者基础模型整体上可能不会涉足、创业者有机会建立公司的方向?
Kevin Weil: Ev Williams 以前在 Twitter 的时候说过一句话,一直让我印象深刻,他说:“无论你的公司变得多大,无论你的员工有多优秀,围墙之外的聪明人永远比围墙之内的多得多。“这就是为什么我们如此专注于打造一个出色的 API。我们有 300 万开发者在用我们的 API。无论我们有多大的野心、团队发展到多大——顺便说一下,我们也不想变得特别大——AI 能够从根本上改善我们生活的用例和场景实在太多了。我们不可能有足够的人力,也不可能在大多数领域具备相应的专业知识。
而且正如我刚才所说,数据是行业特有的、用例特有的,藏在特定公司的围墙之内。在世界上每一个行业、每一个垂直领域,都有巨大的机会去构建基于 AI 的产品来超越当前的技术水平。而我们根本不可能独自覆盖所有这些领域。我们也不想。如果我们想做的话——但我们真的很兴奋能为 300 万以上的开发者提供这层基础设施,并且未来还会服务更多。
快速交付的秘诀
Lenny Rachitsky: 回到你之前说的,技术不断变化、越来越快,在发布产品的时候你并不完全确定模型的能力会达到什么水平。我很好奇,是什么让你能够如此快速、持续地交付出这么好的东西?听起来其中一个答案是自下而上的赋能团队,而不是那种提前规划好一个季度的高度自上而下的路线图。还有哪些因素让你能够这么频繁、这么快速地交付出色的产品?
Kevin Weil: 对。我们的做法是,尽量对前进的方向有一个感知,让自己朝一个方向看,从而保持大致的 alignment。在主题层面上——我一点也不……我们确实会做季度路线图规划,也会制定年度战略,但我绝不相信我们写进文档里的东西就是三个月后、更不用说六个月或九个月后真正要交付的内容。但没关系。我觉得这就像艾森豪威尔说的那句话:“计划毫无用处,但规划过程非常有价值。“我完全认同这一点,尤其是在这个世界里。想想季度路线图规划,它真正的价值在于给了你一个停下来思考的时刻:“好的,我们做了什么?哪些做得好?哪些做得不好?我们学到了什么?接下来我们打算做什么?”
而且每个团队都有一些依赖关系——你需要基础设施团队做这些事,需要和研究团队在那边配合——所以你需要一个节点来确认依赖关系,确保一切就绪,然后开始执行。我们尽量把这个过程保持轻量,因为它不会完全准确。我们中途就会把它推翻,因为我们总会学到新的东西。所以规划的那个时刻是有价值的,哪怕它只有部分准确。
所以我觉得,就是要有一种预期——你会非常敏捷,写三个月的路线图都没有意义,更别说一年的了,因为底层技术在不断快速变化。我们确实是尽量自下而上地推进,同时保持整体方向上的一致。我们有很优秀的人——工程师、产品经理、设计师和研究员,他们对自己构建的产品充满热情,对产品有强烈的观点,而且他们就是实际构建产品的人,所以他们最清楚模型的能力边界在哪里,这一点至关重要。
自下而上与快速迭代
Kevin Weil: 所以我觉得在这种方式上你应该更加自下而上。我们就是这样运作的。我们不怕犯错。我们一直在犯错。这是我非常欣赏 Sam 的一点——他极力推动我们快速前进,但他也理解,快速前进就意味着,我们可能没把这个做对,或者我们发布了某个东西,但它没成功。我们会回滚。看看我们的命名就知道了。我们的命名糟透了。
Lenny Rachitsky: 这也是很多人向你提的问题。模型的名字,对。
Kevin Weil: 确实糟糕透顶,我们自己知道。我们迟早会去改的,但这不是最重要的事情,所以我们没在这上面花太多时间。
Lenny Rachitsky: 但这也恰恰说明它没那么重要。话说回来,ChatGPT 是有史以来最受欢迎、增长最快的产品,是排名第一的 AI、API 和模型。所以显然命名这事没那么要紧。
Kevin Weil: 我们的命名类似 o3 mini high 这种。
Lenny Rachitsky: 哈哈,我喜欢。好吧。你提到了路线图和自下而上的方式,我很好奇,与你或 Sam 对齐是否有一个固定的节奏或仪式?你们会审查所有即将发布的内容吗?有没有每周或每月的会议让大家了解进展?
Kevin Weil: 关键项目上会有。我们会做产品评审之类的,和你预想的差不多。但没有什么固定的仪式,因为……我绝对不希望团队因为等和我或 Sam 的评审而阻塞发布。如果我出差了或者 Sam 忙不过来,那不是我们不发布的正当理由。显然,对于最重要、最高优先级的事情,我们会密切跟进,但坦白说,我们尽量不这样做。我们希望赋能团队快速行动,我认为更重要的是先发布再迭代。
迭代部署与模型最大化
Kevin Weil: 所以我们有这样一个理念,我们称之为迭代部署。核心思想是,我们所有人都在共同学习这些模型的能力。因此,即使你还不知道模型的全部能力集,先发布出来、然后公开一起迭代,这种方式要好得多。我们在学习这些模型的过程中——了解它们哪里不同、哪里好、哪里差、哪里奇怪——与社会共同进化。我非常喜欢这个理念。
Kevin Weil: 我认为我们产品理念的另一个部分是一种模型最大化的思路。模型并不完美,它们会犯错。你可以花大量时间围绕模型构建各种脚手架。顺便说一句,有时我们确实会这样做,因为有些错误你绝对不想让它发生。但对于那些不匹配的部分,我们不会花太多时间搭建脚手架,因为我们的基本心态是——两个月后就会有更好的模型出现,它会彻底超越当前模型的那些局限。
Kevin Weil: 所以如果你在构建产品——我们也对开发者这样说——如果你构建的产品刚好处于模型能力边缘,继续做下去,因为你做对了,再过几个月模型会变得非常强大,而你那个勉强能跑的产品突然就会大放异彩。这就是确保你真正在推进前沿、构建新事物的方式。
Lenny Rachitsky: 我之前请过 Bolt 的创始人上播客,公司名叫 StackBlitz,他分享了一个故事:他们在幕后做这个产品做了七年,一直失败,没有任何起色。然后突然之间——不好意思提到竞品了——Claude 出来了,或者说 Sonnet 3.5 出来了,一切突然就跑通了。他们一直在构建,终于等到了能用的那一刻。我在 YC 那边也经常听到类似的故事——以前不可能的事情,随着模型的更新,每隔几个月就变成了可能。
Kevin Weil: 完全同意。
关于竞争与编码能力
Lenny Rachitsky: 我顺便问一下,我本来没打算问这个,但我很好奇你有没有什么简短的想法——为什么 Sonnet 在编码方面这么强?你们自己的产品在编码方面达到同样好甚至更好的水平,你有什么想法?
Kevin Weil: 必须向 Anthropic 致敬。他们构建了非常优秀的编码模型,毫无疑问。我们认为我们也能做到。也许等这期播客发布的时候,我们会有更多可说的,但无论如何,功劳归于他们。我认为智能本质上是一个多维度的东西,所以各个模型提供商……过去 OpenAI 在模型上有巨大的领先优势,领先其他所有人大概十二个月。现在不是这样了。我认为我们仍然有领先优势——我可以说我们确实有——但肯定不再是那种大幅领先了。这意味着 Google 的模型会在某些方面特别好,Anthropic 的模型会在某些方面特别好,或者我们在某些方面特别好,而竞争对手会说”我们必须在这方面变得更强”。而且,一旦有人证明了某件事是可能的,再去追赶那个方向,确实比自己在丛林中开辟一条全新的道路要容易得多。
Kevin Weil: 举个例子,就像以前没有人能跑进四分钟一英里,终于有一个人做到了,第二年就有十二个人做到了。我觉得这种情况到处都是,这意味着竞争非常激烈,而消费者会赢,开发者会赢,企业会以巨大的方式从中受益。这也是这个行业运转如此之快的原因之一。但我们尊重其他主要的模型提供商。模型正在变得非常好。我们会尽可能快地推进,而且我认为我们有一些好的东西即将推出。
Lenny Rachitsky: 令人期待。
ChatGPT 的消费者心智占领
Lenny Rachitsky: 这也让我想到,在很多方面,其他模型在某些事情上确实更好,但不知为何 ChatGPT 就是……如果你看所有的认知度数据和用户使用数据,不管你们在排行榜上排第几,人们似乎就是把 AI 和 ChatGPT 几乎等同起来。你觉得你们做对了什么,才能在消费者心智中——至少在目前的认知度和大众认知方面——赢得这样的地位?
Kevin Weil: 我觉得先发优势很重要,这也是我们如此专注于快速行动的原因之一。我们喜欢率先发布新能力。比如深度研究(deep research)。我们的模型可以做很多事情——它们可以接收实时视频输入,支持语音到语音的对话,可以做语音转文本和文本转语音,可以做深度研究(deep research),可以在画布上操作,可以编写代码。所以 ChatGPT 可以成为一个一站式的平台,你想做的所有事情都可以在这里完成。而且随着我们继续推进,我们有了更多智能体工具,比如 Operator,它可以替你浏览网页、在网络上帮你执行操作——你会越来越多地来到 ChatGPT 这个唯一的入口,给它指令,让它为你在现实世界中完成真正的事情。这其中有根本性的价值。所以我们在这方面思考很多。我们努力快速推进,始终让自己成为对人们最有用的去处。
Lenny Rachitsky: 你会说在构建 AI 产品或在 OpenAI 工作之后,你学到的最反直觉的事情是什么?就是那种”我完全没想到会是这样”的事情?
Kevin Weil: 我不知道,也许我本该预料到这一点,但让我觉得有趣的是,当你在思考 AI 产品应该怎么设计,或者在理解某个 AI 现象为什么成立时,你往往可以像推理另一个人类一样去推理它,而且这套推理居然真的管用。举几个例子吧。当我们最初推出推理模型的时候,我们是第一个构建出具备推理能力的模型的——它不会像过去那样对每个问题都立刻给你一个快速的系统一答案,“谁是神圣罗马帝国第三任皇帝,答案在这里”。
你可以问它很难的问题,它会进行推理。就像如果你让我做一道填字游戏,我不可能一上来就把所有格子都填满。我会想,“好吧,这个横向的,我觉得可能是这两个词之一,但那就意味着这里有个 A,所以那个肯定是这个词,不行,回退,一步一步地从当前位置往下推。“就像你解决任何复杂的物流问题、任何科学问题的方式一样。所以这个推理突破是很重大的,但同时也是模型第一次需要坐下来”想一想”。这对消费产品来说是一个很奇怪的范式。你通常不会遇到一种产品,问了一个问题之后还要等上 25 秒。
所以我们一直在想,这个交互界面该怎么设计?深度研究(deep research)那种情况下,模型有时候会去思考 25 分钟,其实反而没那么难处理——因为你不会坐在那里看它 25 分钟。你会去做别的事,切到别的标签页,或者去吃午饭什么的,回来的时候它已经完成了。但当它是 20 到 25 秒,或者 10 秒的时候,这段时间很长,但又不够长到你可以去做别的事情。
所以你可以这样想:如果你问了我一个需要想 20 秒才能回答的问题,我会怎么做?我不会直接沉默,一言不发地愣 20 秒然后再开口。所以我们不应该那样做。我们不应该只放一个进度条在那里,那很烦人。但我也不会把我脑子里的每一个念头都喋喋不休地说出来。所以我们可能也不应该把整个思维链直接暴露给用户。但我可能会说,“这是个好问题。让我想想。“然后开始思考。你可以给一些小小的更新——而这其实就是我们最终上线的方案。
类似的情况还有很多,比如你可以找到这样的场景:让一组模型各自尝试解决同一个问题,然后由另一个模型审视所有输出、进行整合,最后给你一个统一的答案。听起来有点像头脑风暴,对吧?我确实觉得当我和其他人一起坐下来头脑风暴的时候,会想出更好的主意,因为他们思考的方式和我不同。总之,你会不断地遇到这些场景——你可以像推理一群人类或一个单独的人类那样去推理模型的行为,而且这种推理方式是有效的。我不知道,也许我不应该感到惊讶,但我确实惊讶了。
对话式交互的魅力
Lenny Rachitsky: 这太有意思了。因为当我看到这些模型运作的时候,我从来没想到过你们还要专门设计那种体验。对我来说,它就感觉像是大语言模型(LLM)天然的行为——它就坐在那里告诉我它在想什么。我很喜欢你说的这个观点:让它感觉像是一个人类在运作。那人类怎么运作呢?他们会把想法说出来,思考”这个方向值得探索一下”。我很喜欢深度研究(deep research)把这一点推到极致——它会把正在做的和想的一切都展示出来。而且人们似乎也很喜欢那种体验,对吧?你有没有感到意外——“原来这样也行?人们好像什么形式都能接受?”
Kevin Weil: 是的,这件事我们确实有学到。因为我们最初上线的时候,只给了模型正在思考的子标题,没有更多内容。后来 DeepSeek 上线了,展示了很多内容,我们就想,不知道是不是所有人都想要那种体验。看到模型真正在想什么确实有新鲜感。我们内部看的时候也有同感——看到模型的思维链确实很有意思。但我觉得在四亿用户这个规模下,你不会想看到模型在那里喋喋不休。
所以我们最终的做法是用有趣的方式进行总结。你不再是只看到子标题,而是能看到一两个句子,描述模型是怎么思考这个问题的,你还可以从中学习。所以我们试图找到一个中间地带,让这种体验对大多数人来说是有意义的,但给每个人展示三大段文字大概不是正确答案。
Lenny Rachitsky: 这让我想起你在峰会上说的另一件事,一直让我印象深刻——就是关于”对话式界面”的这个观点。人们总是嘲笑对话不是我们与 AI 交互的未来界面,但你提出了一个非常有意思的反面论点:作为人类,我们本就是通过说话来交互的,而人类的智商可以从很低到很高跨度极大,但我们都可以通过对话来沟通。对话式交互也是一样的,它可以在各种智能水平上运作。也许我只是在复述你的观点,但关于为什么对话最终会成为大语言模型(LLM)如此有趣的交互界面,你还有什么想补充的吗?
Kevin Weil: 是的。我不知道,也许这是我相信但大多数人不太认同的一件事——我其实认为对话是一个极好的交互界面,因为它极其通用。人们往往倾向于说,“对话嘛……我们以后会找到更好的方式的。“但我认为它极其普适,因为它就是我们说话的方式。我可以像现在这样和你面对面口头交谈,我们能看到彼此、互动;我们也可以在 WhatsApp 上互发短信——但所有这些都是一种非结构化的沟通方式,而这就是我们运作的方式。
如果在我们交谈时只允许我使用某种更僵化的交互界面,我能和你谈论的事情就会少得多,它实际上反而会阻碍我们实现最大的沟通带宽。所以这里有某种神奇的东西。顺便说一句,过去这种方式从来不奏效,因为没有一个模型能够很好地理解人类语言中所有的复杂性和微妙之处——而这正是大语言模型(LLM)的魔力所在。所以对我来说,这是一种恰好与这些模型的能力完美匹配的交互界面。这并不意味着它永远都只能是我打字你回复——你可能不想一直打字,但你确实想要那种非常开放、灵活的沟通媒介。也许未来我们是用语音说话,模型也用语音回应我,但你仍然想要那种最低门槛、没有任何限制的交互方式。
Lenny Rachitsky: 这太有意思了。这一点真的改变了我对这些事情的思考方式——对话居然如此适合”和超级智能交流”这个非常特定的问题。
Kevin Weil: 顺便说一下,我也不是说只有对话这一种形式。如果你有高频使用的场景,而且那些场景比较规范、你实际上不需要完整的通用性,那在很多用例下,使用一种灵活性更低但更规范、针对特定任务更快的方案会更好。这些方案也很棒,你可以构建各种这样的产品。但你仍然需要对话作为基线,来兜住那些超出你所构建的垂直领域之外的所有需求。它就像一个万能的收容器,能接收你想要对模型表达的任何可能的想法。
Lenny Rachitsky: 我想回到你之前谈到的,关于研究员和产品团队的关系。我想象中很多创新来自于研究员——他们有了一个直觉,然后构建出令人惊叹的东西并发布出来;也有一些想法来自产品经理和工程师。这些团队之间是如何协作的?每个团队都有产品经理吗?是研究主导的情况更多吗?给我们讲讲想法和产品主要从哪里来。
研究与产品的协作模式
Kevin Weil: 这是一个我们正在快速演进的领域,坦白说我对这一点非常兴奋。我想如果你回溯几年,ChatGPT 刚起步的时候,我显然当时还不在 OpenAI……我们当时更像是一家纯粹的研究公司。ChatGPT,如果你还记得的话,是一个低调的研究预览。
Lenny Rachitsky: 很多年都是这样。
Kevin Weil: 对。团队发布它的时候并不认为它会成为这样一个巨大的产品。
Lenny Rachitsky: 哦,ChatGPT。是的。
Kevin Weil: 它只是我们让大家试用和迭代模型的一种方式。所以我们主要是一家研究公司,一家世界级的研究公司。随着 ChatGPT 的成长,以及我们构建了面向企业的产品和 API 等其他东西,现在我们的产品属性比以前强了很多。但我仍然认为我们不能……OpenAI 永远不应该成为一家纯粹的产品公司。我们需要同时是一家世界级的研究公司和一家世界级的产品公司,两者需要真正地协同工作,而这正是我觉得过去六个月我们做得越来越好的地方。如果你把这两件事分开——研究员去做了不起的事情、构建模型,达到某个状态后,产品和工程团队再拿去做点什么——那我们实际上只是自己模型的一个 API 消费者。
而最好的产品,就像我之前谈到的深度研究(deep research)那样,需要大量的迭代反馈。要理解你想要销售的产品或想要解决的问题,为它们构建评估(evals),用这些评估去收集数据、微调模型,让它们在你想要解决的用例上表现更好。要做好这件事需要大量的来回反复。我认为最好的产品会是工程、产品设计和研究作为一个团队一起构建全新事物的成果。所以这基本上就是我们构建任何东西时尝试的运作方式。这对我们来说是一种新的能力,因为我们作为产品公司还比较新,但大家都对此非常兴奋,因为每次这样做,我们都能构建出很棒的东西,所以现在每个产品都这样启动。
产品经理的数量与角色
Lenny Rachitsky: OpenAI 有多少产品经理?我不知道你会不会公布这个数字,如果可以的话。
Kevin Weil: 其实不多。我不知道,大概 25 个吧,也许多一点。我个人的理念是,作为一个组织,你总体上应该保持比较精简的产品经理配置。我说这话是带着爱的,因为我本人就是产品经理,但太多的产品经理会带来问题。我们会用演示文稿和想法填满整个世界,而不是去执行。所以我觉得,当一个产品经理带的工程师稍微多一些的时候,反而是一件好事,因为这意味着他们不会深入去微观管理。你会把大量的影响力和决策权留给工程师。这意味着你需要有产品意识非常强的工程师,而我们很幸运拥有这样的工程师。我们有一支非常有产品意识、高能动性的工程团队。当你拥有这样的组合时,你就有了一支感到非常有自主权的团队,而产品经理则致力于真正理解问题,适度引导团队,但因为手头事情太多而无法过度介入细节,最终你能以非常快的速度推进。所以这大概就是我们的理念。
我们希望从产品经理到工程主管再到产品工程师,整个链条都有产品意识。我们不想要太多的产品经理,但要真正优秀、高质量的。目前来看效果还不错。
OpenAI 产品经理的画像
Lenny Rachitsky: 我想成为 OpenAI 的产品经理对很多人来说是梦想成真。同时,我也能想象它并不适合所有人。那里有研究员参与,还有非常有产品意识的工程师。你在招聘产品经理时看重什么?给那些觉得”也许我不该去那里工作,想都不要想”的人一些建议。
Kevin Weil: 我说过几次了,高能动性是我们真正看重的——那些不会等着别人允许才去做事的人,他们看到问题就直接去做。这是我们工作方式的核心。还有能接受模糊性的人,因为这里有大量的模糊性。它不是那种地方——有时候我们在招聘较初级的产品经理时会遇到困难,就是因为它不是一个会有人走过来说”好的,这是整体格局,这是你的领域,我要你去做这件事”的地方。而那正是职业生涯早期的产品经理所想要的。这里没有人有时间,问题也太不成熟,我们都在边走边摸索。所以高能动性、对模糊性非常适应、准备好参与执行并以极快速度推进——这就是我们的配方。
而且我认为,也乐于通过影响力来领导,因为……作为产品经理本来就是这样,你的团队成员不向你汇报,等等,但你还有研究部门这个复杂的维度,它更加自我驱动,与研究团队建立良好的关系真的非常重要。我觉得情商方面对我们来说也至关重要。
产品经理如何赢得信任
Lenny Rachitsky: 我知道在大多数公司,产品经理进来时大家的反应就是,“我们为什么需要你?“作为产品经理你必须赢得信任,让大家看到你的价值。我觉得在 OpenAI,这种情况可能是极其极端的版本——他们会说,“我们为什么需要这个人?我们有研究员、工程师,你来这里要做什么?”
Kevin Weil: 是的,我觉得如果做得好,大家是会欣赏的,但你需要把人们带上路。我认为产品经理能做好的最重要的事情之一就是果断决策。所以这里有一条很微妙的界线。你不想做每一个……我的意思是,这有点像——我不太喜欢”产品经理是产品的 CEO”这种幻觉,但就像 Sam 在他的角色中,如果他在参加的每一个会议上都亲自做每一个决定,那会是一个错误。而如果他在所有会议上都不做任何决定,那同样也是一个错误,对吧?关键在于理解什么时候该听从团队、让人们去创新,什么时候有一个需要做的决定——要么是大家不太敢做,要么是大家觉得没有权力做,或者是一个有太多不同的利弊分散在一个大群体中、需要有人果断拍板的决定——这是 CEO 一个非常重要的特质。
这也是 Sam 做得很好的地方,同时在更微观的层面上,这也是产品经理一个非常重要的特质。因为模糊性太多了,很多情况下答案并不显而易见,所以你需要一个产品经理能够站出来……顺便说一下,这不一定要是产品经理,如果是其他人我也完全高兴,但我确实期望产品经理能做到的是:如果有模糊性、没有人做决定,你最好确保我们做出一个决定并继续前进。
AI 对产品团队的影响
Lenny Rachitsky: 这涉及到我之前写过的一些文章,关于 AI 到底是要接管我们做的工作,还是帮助我们做各种工作。让我换个角度来看这个问题——AI 如何影响产品团队和招聘等等。首先,现在到处都在谈论大语言模型要替我们写代码了,一年内 90% 的代码将由 AI 编写。Anthropic 的 Dario 就这么说过。与此同时,你们却在疯狂招聘工程师、疯狂招聘产品经理。每个职能都被宣判死刑,但你们每个都还在招。我想先问一下,你和团队——比如工程师、产品经理——是如何在日常工作中使用 AI 的?有没有什么特别有趣的做法,或者你觉得大家还没有注意到的东西?
Kevin Weil: 我们用得很多。我们每个人都在 ChatGPT 里,随时在总结文档、用它来帮助撰写文档,用 GPT 来写产品规格之类的,所有你能想象到的事情。比如说写评估,你其实可以用模型来帮你写评估,而且它们做得相当好。话虽如此,我对我们还是有些失望——真的说的是我自己——如果把我五年前在别的公司做产品的那个自己直接传送到我现在的工作中,我仍然能认出它。我认为我们应该处于这样一个状态——当然一年后肯定是这样,甚至现在可能就应该是了——我几乎认不出我的工作方式,因为工作流应该如此不同,AI 应该被如此重度地使用,但我今天还是能认出来。所以在某种意义上,我觉得自己在这方面做得还不够好。
举个例子,我们为什么不应该到处都在做感觉编程的 demo?与其在 Figma 里展示东西,我们应该展示人们在 30 分钟内感觉编程做出的原型,用来演示概念验证、探索想法。这在今天完全可行,但我们做得还不够。实际上,我们的首席人才官 Julia 前几天跟我说,她感觉编程做了一个内部工具——她在之前的公司有一个,特别想在这里也有,于是她打开——我忘了是 Windsurf 还是别的什么——就感觉编程做出来了。这多酷啊?如果我们的首席人才官都在这么做,我们没有任何借口不多做一点。
感觉编程(Vibe Coding)
Lenny Rachitsky: 这个故事太棒了。有些人可能没听过”感觉编程”这个词。你能描述一下这是什么意思吗?
Kevin Weil: 这个词是 Andrej 提出来的。
Lenny Rachitsky: Karpathy。对。
Kevin Weil: Andrej Karpathy。对。所以你有 Cursor、Windsurf、GitHub Copilot 这些工具,它们非常擅长建议你可能想写的代码。你可以给它们一个提示,它们就会写代码,然后当你去编辑的时候,它会建议你接下来可能要做什么。大家开始用这些东西的方式是:给一个提示,让它做事,你去编辑,再给一个提示,你一直在和模型来回交互。随着模型越来越好、人们越来越习惯,你可以稍微放开方向盘了。当模型在建议东西的时候,就是——点、点、点、点、点。继续。是、是、是、是、是。
当然模型会犯错,或者编译不过去,但当它编译不过去的时候,你把错误贴进去,然后说:继续、继续、继续、继续、继续。然后你测试一下,它做了一件你不想让它做的事,你就输入一条指令,说:继续、继续、继续、继续、继续,你就让模型自己去做。并不是说你会用这种方式来写需要非常严谨的生产代码——至少目前还不会——但对于很多场景,你想做一个概念验证、想做一个 demo,你真的可以放开双手,模型会做得非常出色。这就是感觉编程。
Lenny Rachitsky: 解释得太好了。我觉得进阶版本——也是 Andre 描述的方式——是你用说的,有一步是用 whisper 或 super whisper 之类的东西,你对着模型说话,甚至都不用打字。
Kevin Weil: 对,完全同意。
产品团队的未来
Lenny Rachitsky: 天哪。那我想问一下,当你展望未来的产品团队时,你说你们应该更多地这样做——用原型代替设计稿——你觉得产品团队在组织架构或构建方式上最大的变化会是什么?你觉得接下来几年会朝什么方向发展?
Kevin Weil: 我认为你肯定会进入一个每个产品团队都内置研究员的世界。我甚至不只是在说基础模型公司,因为我认为未来……实际上,坦率地说,我对我们整个行业有一点惊讶的是,微调模型的使用并没有更广泛。很多人……这些模型已经非常好了,所以我们的 API 在很多方面都做得很好,但当你有特定的用例时,你总是可以通过微调让模型在特定用例上表现得更好。这可能只是时间问题。大家在每种情况下对这样做还不太放心。但对我来说,毫无疑问这就是未来。模型会无处不在,就像晶体管无处不在一样,AI 会成为我们所做一切的基本构成部分。但我认为会有很多微调模型,因为如果你可以针对特定用例更具体地定制一个模型,为什么不呢?
所以我认为你会希望在每个团队中都有准研究员、机器学习工程师这类角色,因为微调模型将成为构建大多数产品的核心工作流的一部分。所以这是你可能已经在基础模型公司看到的一个变化,随着时间推移会扩散到更多团队。
微调模型的具体案例
Lenny Rachitsky: 我很好奇有没有一个具体的例子能让这变得真实。我先分享一个你说的时候我想到的例子:当你看 Cursor 和 Windsurf 的时候,我从那些创始人那里学到的是,他们使用 Sonnet,但同时他们也有一堆自定义的边缘模型,让那些不仅仅是生成代码的特定体验变得更好,比如自动补全、预判代码走向等。所以这是一个例子吗?或者你还有什么其他例子?什么是微调模型?你认为团队会带着研究员一起构建这些吗?
Kevin Weil: 是的。当你做微调的时候,基本上就是给模型大量你希望它更擅长的那类任务的示例。就是”这里有个问题,这里有个好的回答。这里有个问题,这里有个好的回答”,或者”这里有个问题,这里有个好的回答”,乘以一千或一万。然后模型就在这个特定任务上比开箱即用时强得多。我们在内部到处都在用。我们在内部使用模型集成(ensembles)比人们想象的要多得多。所以不是说”我有十个不同的问题,我就拿基线 GPT-4o 去问一堆”。如果我们有十个不同的问题,我们可能会用二十个不同的模型调用来解决它们,其中一些使用专门的微调模型,使用不同大小的模型,因为不同的问题可能有不同的延迟要求或成本要求。
每个调用可能都在使用自定义的 prompt。基本上你要教模型在……你想把问题拆解成更具体的任务,而不是一些更广泛的高层次任务。然后你可以让模型非常具体地在每个单独的子任务上变得非常好。然后你用一组集成来解决整个问题。我认为现在很多优秀的公司已经在这样做了。但我仍然看到很多公司在给模型单一的、泛化的、宽泛的问题,而不是把问题拆解开来。我认为未来会有更多的拆解,使用特定的模型处理特定的事情,包括微调。
Lenny Rachitsky: 那么在你们的情况下,因为这一点真的很意思,你们是在使用不同层级的 ChatGPT 吗?比如 1-0-3 之类更早的模型,因为更便宜?
客服系统的模型集成
Kevin Weil: 我们内部技术栈中会有这样的场景。我给你举个例子。客服方面,我们有超过四亿的周活跃用户,会收到大量的进线工单。我不确定我们有多少客服人员,但不会很多,三四十个吧,我不确定,但比任何同等规模的公司都要少得多。这是因为我们自动化了大量流程。我们用内部资源、知识库、回答问题的指导方针、什么样的人设风格等等来处理大部分问题。你可以把这些东西教给模型,然后让它自动完成大量回答,或者在它对某个特定问题没有十足把握的时候,它仍然可以建议一个答案,请求人工审核,而那个人工的回答本身又成为了模型的微调数据——你在告诉它在特定情况下什么是正确答案。
我们在不同环节使用的……某些环节你需要更多推理能力,对延迟不那么敏感,所以我们用 O 系列模型。另一些环节你只需要快速检查一下,那用 four oh mini 就够了,超级快,超级便宜。总的来说就是特定目的用特定模型,然后把它们组装集成来解决问题。顺便说一下,这其实和我们人类解决问题的方式并无不同——一家公司可以说就是一个由各种模型组成的集成,每个模型都根据我们在大学学的东西和职业生涯中积累的经验被微调过。我们每个人都已被微调出不同的技能组合,然后以不同的配置组合在一起,集成的输出比任何单一个体的输出都要好得多。
Lenny Rachitsky: Kevin,你让我大开眼界。这听起来完全正确。而且,不同的人,你付给他们的薪水更低,跟他们沟通的成本更低,有些人回答问题要花很长时间,有些人会胡说八道。这就是……
Kevin Weil: 我告诉你,这是一个心智模型,但在思考……方面确实非常管用。
Lenny Rachitsky: 哦,对,是的。这太好了。有些人是视觉型的,他们想把自己的想法画出来,有些人则喜欢用文字表达。哇,这真是一个非常好的比喻。所以,回到你之前的建议,我很高兴我们绕回来了,你找到了一个非常好的方式来思考如何设计出色的 AI 体验,特别是大语言模型(LLM)方面的——就是想想一个人会怎么做这件事。
Kevin Weil: 嗯,也许答案并不总是”想想一个人会怎么做”,但有时为了获得对如何解决某个问题的直觉,你可以想想在那些情况下一个同等的人类会怎么做,至少用它来获得看待问题的一个不同视角。
Lenny Rachitsky: 哇,这太好了。
Kevin Weil: 因为这其中很大一部分确实是在和模型对话。这方面有很多先例,因为我们一直在和其他人交流,在各种不同情境下与他们接触,所以有很多可以借鉴的地方。
为孩子的未来做准备
Lenny Rachitsky: 好,说到人,我想聊聊未来。你有三个孩子,社区里有人问了我一个很搞笑的问题,我觉得这也是很多人在想的事情。这位是 Patrick,我之前在 Airbnb 跟他共事过。他说,问问他鼓励自己的孩子学什么来为未来做准备。我担心我六岁的孩子到 2036 年的时候,要面临激烈的竞争才能进入顶尖的屋顶维修或管道工项目,所以需要一个备选方案。
Kevin Weil: 这很有意思。我们的孩子,一个十岁,还有一对八岁的双胞胎,所以还很小。他们对 AI 的原生适应程度令人惊叹。自动驾驶汽车对他们来说完全正常。他们可以整天跟 AI 聊天。他们会跟 ChatGPT、Alexa 和其他所有东西进行完整的对话。我不知道未来会怎样。我认为像编程技能这样的东西在很长一段时间内还是会相关的,谁知道呢?但我觉得如果你教会孩子好奇心、独立性、自信,教他们如何思考,我不知道未来会怎样,但我认为这些在任何形式的未来中都将是重要的技能。所以我们并没有所有的答案,但这就是 Elizabeth 和我对孩子们的看法。
Lenny Rachitsky: 你觉得 AI……现在有很多关于 AI 辅导的讨论。你们有在做这个吗?我知道他们在用 ChatGPT,我很喜欢你发的那些他们玩 prompt 的照片,但我想问你们有没有在实验什么,或者你觉得有什么会变得特别重要?
Kevin Weil: 这可能是我能想到的,AI 能做的最重要的事情之一。也许这话有点大,AI 能做的重要事情很多,包括加速基础科学研究和发现的步伐,那可能才是 AI 能做的最重要的事情。但其中一个最重要的方向,就是个性化辅导。我至今仍然感到震惊的是……我知道市面上已经有一些不错的产品。Khan Academy 做得很好,他们是我们非常好的合作伙伴。Vinod Khosla 有一个非营利组织在这个领域做了一些非常有趣的事情,并且正在产生影响。但我有点惊讶的是,目前还没有一个覆盖 20 亿儿童的 AI 个性化辅导平台,因为现在的模型已经足够好了,而且所有曾经做过的研究似乎都表明,当你拥有……教育当然仍然重要,但当教育与个性化辅导相结合时,学习速度会有数个标准差的提升。
这是毫无争议的,对孩子有好处,而且是免费的。ChatGPT 是免费的,你不需要付费,模型也足够好了。我仍然感到惊讶的是,居然还没有一个真正令人惊叹的产品,让我们的孩子、你未来的孩子,以及世界各地那些没有我们孩子这么幸运的人都能使用,让他们也能拥有这种内在的、扎实的教育资源。再说一次,ChatGPT 是免费的,Android 设备到处都是。我真的认为这可以改变世界,我很惊讶它还不存在,我希望它能存在。
AI 的未来与社会影响
Lenny Rachitsky: 这触及到我想花点时间聊的一个话题,很多人也对 AI 非常担忧——它的发展方向,担心它会取代工作,担心未来的超级智能会毁灭人类。你对此有什么看法?以及我想人们需要听到的那个乐观的理由是什么?
Kevin Weil: 我是坚定的技术乐观主义者。如果你回顾过去 200 年,甚至更久,技术推动了许多进步,造就了我们今天的世界和社会。它推动了经济增长、地缘政治进步、生活质量提升、寿命延长。技术几乎是一切进步的根源,所以我认为在几乎所有情况下,从长期来看,技术都是一件好事。但这并不意味着……
这不意味着不会出现短期的结构性调整,也不意味着不会有个人受到影响,这些同样重要。所以不能仅仅是平均值好看就行了,你还必须想办法尽可能照顾到每一个人。
这是我们一直在思考的问题。在与政府合作、与政策制定者合作的过程中,我们尽力提供帮助。我们在教育方面做了很多工作。其中一个好处是,ChatGPT 或许也是你能想到的最好的技能再培训应用。它知道很多东西,如果你有兴趣学习新东西,它可以教你很多东西。
这些问题都是非常现实的。我对长期前景非常乐观,但作为一个社会,我们需要尽一切努力确保这个转型过程尽可能平稳、得到充分的支持。
AI 辅助创造力的下一个飞跃
Lenny Rachitsky: 让大家对未来的方向有个概念。这是很多人心中的一个大问题。有人问了一个我很喜欢的问题,“AI 已经在改变创意工作了——写作、设计、编程,你认为下一个大的飞跃是什么?在 AI 辅助创造力方面,我们应该期待什么?更广泛地说,你认为未来几年会朝什么方向发展?”
Kevin Weil: 对。这也是我非常乐观的一个领域。你看看 Sora,比如我们之前聊过 ImageGen,以及人们 在 Twitter、Instagram 和其他平台上展现出的那种源源不断的创造力。我是世界上最差的艺术家——最差的那种。也许唯一比我艺术更差的就是唱歌了。给我一支笔和一叠纸,我画得还不如我们八岁的孩子。但给我 ImageGen,我就能想出一些有创意的点子,输入到模型里,然后突然得到我自己根本不可能做出来的作品。这真的很酷。
即使是那些真正有才华的人。我最近跟一位导演聊 Sora,他导演过的电影我们都看过。他说,比如他在做的一部电影,举个例子,某种科幻类的,想想《星球大战》,你有一个飞船飞向某个类似死星的场景。你先是从飞船视角看到整个星球,然后要切到一个飞船在地面高度的画面,突然间你看到了城市和一切。我们要怎么处理这个转场?那个过渡?
他说,“在两年前的世界里,我会付给一家 3D 特效公司十万块,他们会花一个月时间,给我做出两个版本的转场。我会评估它们,然后选一个,因为还能怎么办?再花五万块再等一个月?我们就只能将就了。效果还行。电影很棒,我很喜欢。显然……”
我们用已有的技术当然可以做出伟大的作品,但你现在看看用 Sora 能做什么。他的观点是,“现在,我可以用 Sora,我们的视频模型,我自己在提示词里头脑风暴,模型也跟我一起头脑风暴,我就能得到 50 个不同版本的转场。当然,我可以在此基础上迭代、优化、借鉴不同的想法。最终我还是会去找那家 3D 特效工作室制作最终版本,但我去的时候已经经过了充分的头脑风暴,拥有了一个更有创意的方案,最终效果也好得多。而这一切是在 AI 的辅助下完成的。”
我个人对创造力的看法是,没有人会……你不会对着 Sora 输入”给我做一部好电影”。它需要创造力、独创性和所有这些东西,但它可以帮你探索更多,帮你达到一个更好的最终结果。所以,我再次倾向于乐观,而且我认为这里确实有一个非常好的故事。
Lenny Rachitsky: 我知道 Sam Altman,我想是他最近发了条推文,关于你们正在做的创意写作部分……他非常不擅长写创意类的东西,他分享了一个例子,效果非常好。我想这也是你们投资的一个方向。
Kevin Weil: 对,内部在新的研究技术方面有一些令人兴奋的进展。我们之后会有更多消息分享。Sam 有时候喜欢提前展示一些正在做的东西,这很聪明。顺便说一句,这非常体现了我们的迭代部署理念。我们不会有了某种突破就永远自己藏着,然后某一天才恩赐给世界。我们只是谈论我们正在做的事情,能分享的时候就分享,尽早发布、频繁发布,然后在公开中迭代。我很喜欢这种理念。
未来展望
Lenny Rachitsky: 我很喜欢这些关于即将到来的东西的暗示。我知道你不能说太多。你提到不久的将来可能会有一个编码方面的飞跃,也许等这期节目上线的时候就已经发布了。还有什么其他人们应该关注的、近期可能会推出的东西吗?有什么有趣的、令人兴奋的可以透露的吗?
Kevin Weil: 天哪,这些还不够吗?
Lenny Rachitsky: 只希望一切每天都在变好。
Kevin Weil: 是啊,我想的是,天哪,我希望这期节目上线之前我们能把其中一些东西发布出来,这样就——
Lenny Rachitsky: 这就是你的新时间压力。
Kevin Weil: ——不会让大家不高兴。让我觉得神奇的是,我们之前谈到模型在过去短短几年里取得了多大进步。如果你回到 GPT-3 的时代,你会觉得它差得令人发指,尽管两年前的 Lenny 已经被它的出色程度震惊了。很长一段时间里,我们每六到九个月迭代一个新的 GPT 模型——GPT-3、GPT-3.5、4。现在随着 o 系列推理模型的出现,我们的节奏更快了。大概每隔三个月,也许是四个月,就有一个新的 o 系列模型,每一个在能力上都是一个台阶式的提升。
这些模型的能力正在以巨大的速度增长,同时随着规模扩大,成本也在下降。你看看哪怕几年前我们还在什么水平。我想最初的——我不确定,是 GPT-3.5 吧还是什么——大概是如今 GPT-4o mini 在 API 中成本的一百倍。短短几年,成本下降了两个数量级,而智能水平却大幅提升。我不知道世界上还有哪个领域有类似的趋势线。模型变得更聪明、更快速、更便宜,同时也更安全——每次迭代中幻觉都在减少。
这就像摩尔定律之于晶体管的普及,那个定律说的是每 18 个月芯片上的晶体管数量翻倍。而我们在这里说的是每年十倍的提升,那是一个远比之陡峭得多的指数曲线。它告诉我们,未来将和今天截然不同。我经常提醒自己的一点是:你今天使用的 AI 模型,是你余生将要使用的最差的 AI 模型。当你真正把这个观念刻进脑子里,会觉得相当震撼。
Lenny Rachitsky: 我其实正想说同样的话,这也是我观察这一切时始终萦绕在心头的想法。你在谈论 Sora,我想很多听到的人会想,“不不不,它还没有真正准备好,还不够好,不可能比得上我在电影院看的电影。“但关键正是你刚才说的——这是它最差的时候了。它只会越来越好。
Kevin Weil: 对,模型最大化主义。继续为那些即将到来的能力去构建产品,模型会迎头赶上并且表现惊艳的。
Lenny Rachitsky: 滑向冰球要去的地方。
Kevin Weil: 没错。
当魔法变成日常
Lenny Rachitsky: 这让我想起,我前几天正在把各种东西吉卜力化,然后就觉得,“怎么这么慢啊。”
Kevin Weil: 谁不是呢。
Lenny Rachitsky: 就像……你说什么?
Kevin Weil: 我说,谁不是呢。
Lenny Rachitsky: 现在谁不是呢。我就想,“生成一张我家庭的精美图片居然要等一分钟。“拜托,怎么这么慢。你就是会对眼前发生的魔法习以为常。
Kevin Weil: 完全是这样。
Libra 的故事
Lenny Rachitsky: 好了,最后一个问题。这个问题会完全转向另一个方向。很多人都问过这个。众所周知,你在 Facebook 领导过一个叫 Libra 的项目,现在叫 Novi。很多人一直很好奇,“那个项目后来怎么了?那真的是一个很酷的想法。“我知道有些人了解其中存在监管方面的挑战之类的事情。我不太确定你是否经常谈论这个。所以,你能不能简单给大家讲讲 Libra 到底是什么?你在做的这个项目,后来发生了什么,以及你现在怎么看?
Kevin Weil: 好的。其实 David Marcus 是这个项目的负责人,我很高兴能在他的领导下工作、与他合作。我认为他是一位有远见的人,同时也是我的导师和朋友。说实话,Libra 大概是我职业生涯中最大的遗憾。当我想到我们当时要解决的问题——那是非常真实的问题。比如看看汇款领域,人们向其他国家的家人汇款,这大概是……我的意思是,它本质上极其累退,对吧?那些本来就没有多少钱的人,却要支付高达 20% 的费用才能把钱寄回家给家人。费用高得离谱,要花好几天时间,还得亲自去某个地方取现金——一切都是糟糕的。
而我们有 30 亿人在全球各地使用 WhatsApp,每天彼此交流,尤其是朋友和家人之间——正是那种会互相汇款的人群。为什么你不能像发一条短信那样即时、免费、简单地汇款呢?当你坐下来想想这件事,它就应该自然而然地存在。这就是我们当初出发要去做的事情。
当然,我不认为我们把每一张牌都打得完美。如果我能回去重新来过,有很多事情我会换一种做法。
我们试图一次性搞定所有事情。我们想同时推出一条新的区块链,最初是一篮子货币的组合,还要整合进 WhatsApp 和 Messenger。我觉得全世界看到后大概是,“天哪,这一次性变化也太多了。“而且那恰好也是 Facebook 声誉跌到最低谷的时候。所以这并没有帮上忙。另外,那时也不是人们期望推动这种变革的那个 Facebook。这一切我们事先都知道,但我们还是冲了上去。
我觉得有很多种方式可以让我们更温和地引入这些变化,也许同样能达到那个最终目标,但不要同时推出那么多新东西,而是一件一件来。谁知道呢?这些决定是我们共同做出的,我们所有人共同承担。当然,我也承担我那份责任。但从根本上说,这个世界至今缺少这个产品,让我感到深深的失望。如果我们当初能够把那个产品发布出来,世界会变得更美好。我本可以在 WhatsApp 里免费给你转五毛钱,即时到账,每个人的 WhatsApp 账户里都会有一个余额,我们会不断交易……它本来应该存在的。
说实话,现在的政府对加密货币非常友好,Facebook——也就是 Meta——的声誉也处在一个非常不同的位置。也许他们现在应该去把它做出来。
Lenny Rachitsky: 我查了一下它的历史,据说他们把技术以两亿美元卖给了一家私募公司。
Kevin Weil: 对,对,而且——
Lenny Rachitsky: 后来又不得不买回来。
Kevin Weil: 有几个现存的区块链是建立在那套技术之上的,因为那些技术从一开始就是开源的。Aptos 和 Mistin 就是基于这套技术建立的两家公司。所以我们做的所有工作至少没有白费,在这两家公司中延续了下来,而且它们都发展得很好。但即便如此,我们还是应该能在 WhatsApp 里互相转账,而今天还是做不到。
Lenny Rachitsky: 说得太对了。谢谢你分享这个故事,Kevin。在我们进入非常令人期待的快问快答环节之前,你还有什么想分享的吗,或者最后一个负面建议或洞见?
Kevin Weil: 哦,快问快答。直接来吧。
快问快答
Lenny Rachitsky: 来吧。那么,Kevin,我们到了非常令人期待的快问快答环节。准备好了吗?
Kevin Weil: 好了。
Lenny Rachitsky: 开始吧。你最常向别人推荐的两三本书是什么?
Kevin Weil: Ethan Mollick 的《Co-Intelligence》,一本关于 AI 以及如何在日常生活中使用它的非常好的书,无论你是学生还是教师。他非常有思想,顺便说一句,在 Twitter 上也非常值得关注。Peter Zeihan 的《The Accidental Superpower》,如果你对地缘政治以及塑造当前局势的各种力量感兴趣的话,这本书非常好。另外我非常喜欢《Cable Cowboy》,我不记得作者了,但那是 John Malone 的传记。非常精彩。如果你喜欢商业,尤其是如果你想了解……我的意思是,这个人是一个不可思议的交易撮合者,塑造了现代有线电视产业的许多面貌。所以那是一本很好的传记。
Lenny Rachitsky: 这些都是第一次被提到,这总是很棒。
Kevin Weil: 哦,太好了。
影视与产品推荐
Lenny Rachitsky: 下一个问题。你最近有没有特别喜欢的一部电影或电视剧?
Kevin Weil: 我真希望我有时间看电视剧,所以我——
Lenny Rachitsky: 只看 Sora 生成的视频吧。
Kevin Weil: 对,没错。我不知道。小时候我读过《时光之轮》(Wheel of Time)系列,现在 Amazon 把它拍成了剧,正在播第三季,所以我想看看。我还没看。《壮志凌云2》是一部很棒的电影。不过这应该已经不算新片了。
Lenny Rachitsky: 这暴露了你上一次看电影是什么时候了。
Kevin Weil: 但我喜欢那个理念。我想要更多美国精神,更多为强大而自豪的感觉。我觉得《壮志凌云2》在这方面做得非常好。自豪感和爱国情怀,我觉得美国需要更多这样的东西。
Lenny Rachitsky: 有没有你最近发现的、非常喜欢的非自家产品?除了你们内部都能用的超级智能工具之外啊,我开玩笑的。
Kevin Weil: 没错。内部的通用人工智能工具。
Lenny Rachitsky: 对,没错。
Kevin Weil: 我觉得用 Windsurf 这样的产品做感觉编程(vibe coding)真的超级有趣。我玩得很开心。我到现在还是觉得我们的首席人才官用感觉编程做了一些工具这件事太棒了。另一个可能就是 Waymo。只要有机会我就坐 Waymo。那是一种更好的出行方式,而且依然让人觉得像是在未来一样。所以他们做得非常出色。
Lenny Rachitsky: 太棒了。顺便说一下,我请了 Windsurf 的创始人上过播客,可能会在这期之前或之后播出。还有 Cursor 的 CEO 也会来,也是在这期之前或之后。
Kevin Weil: 哦,很酷。我对他们正在做的事情充满敬意。都是非常棒的产品。
Lenny Rachitsky: 改变了所有人构建产品的方式而已。没什么大不了的。
Kevin Weil: 是啊。
人生座右铭
Lenny Rachitsky: 还有几个问题。你有没有一个经常对自己重复的、在工作或生活中觉得特别有用的人生座右铭?
Kevin Weil: 有。说来也巧,它与其说是座右铭,更像是一种哲学,但后来我觉得 Zuck 有一次在 Facebook 的财报电话会议上把它完美地概括了。我真的把这句话做成了海报,挂在我房间里。当时有人问 Mark——这可是在财报电话会议上,所以是一位分析师在财报电话会议上问他。那是 Facebook 增长非常快的某个季度,应该是 2010 年代某个时候。但他问的是,“所以你们做了什么?你们发布了什么?到底是什么推动了所有这些增长?” Mark 说的大意是,“有时候并不是某一件事情,只是长期持续地做好工作。” 这句话一直印在我心里。
我觉得确实如此。我跑超级马拉松,其实就是关于坚持。我觉得人们太经常寻找银弹,而生活中的很多卓越,实际上是日复一日地出现,做好工作,每天都进步一点点,而你可能在一周甚至一个月内都注意不到变化。很多人因此感到沮丧,然后放弃了。但实际上,你继续做下去,进步会持续复利。在一年、两年、五年的时间里,它会积累得惊人。所以就是,长期持续地做好工作。
Lenny Rachitsky: 我太喜欢这句话了。我也得做一张海报。那真是——
Kevin Weil: 我们给你做一张。
Lenny Rachitsky: 我太有共鸣了。好,我收下了。这真的太好了。
提示技巧
Lenny Rachitsky: 好,最后一个问题。我想问你有没有什么提示技巧,不过让我先铺垫一下。你想想有没有一个可以推荐给别人的、能更好地向大语言模型(LLM)提问的技巧。我之前请了一位嘉宾 Alex Komorowski 上播客,他来自 Stripe,每周会写一篇关于世界上正在发生的事情的反思,其中很多与 AI 相关。
他曾经把大语言模型(LLM)形容为全人类知识的一个压缩包。所有的答案都在里面,你只需要找到正确的问题来问,基本上就能获得任何问题的答案。这让我想起提示工程有多么重要,以及掌握如何好好提问是多么关键。你一直在给 ChatGPT 提示,你发现的一个有用的小技巧、小窍门是什么,能帮助你得到你想要的结果?
Kevin Weil: 首先,我想杀掉”你必须是一个好的提示工程师”这个观念。如果我们把工作做好了,这就不应该再成立。这只是模型的一个锋利边缘,专家可以学会驾驭它。但随着时间推移,你不需要了解所有这些。就像以前你必须深入了解”MySQL 的存储引擎是什么?你在用 InnoDB 4.1 吗?“如果你处于 MySQL 性能的极端深度场景,这些仍然有用。但大多数人不需要关心。如果 AI 真的要被广泛采用,你也不应该需要关心提示的细枝末节。
但今天,我们还没有完全做到。顺便说一下,我认为我们正在这方面取得进展,现在需要的提示工程比以前少了。不过,与我之前谈到的微调以及提供示例的重要性相一致,你可以在提示中包含你想要的那种东西的示例以及一个好的回答,这实际上相当于穷人的微调。比如,“这是一个示例,这是一个好的回答。这是一个示例,这是一个好的回答。现在,帮我解决这个问题。” 模型真的会参考并从中学习。
效果不如完整的微调,但比不提供任何示例要好得多。我觉得人们不够经常这样做。
Lenny Rachitsky: 太棒了。我听到过一个小技巧,我很好奇这个是否有效,就是你告诉它”这对我的职业生涯非常重要”。让它真正理解,“如果你不正确回答我,会有人死的。“这有用吗?
Kevin Weil: 这真的很奇怪。这可能有一个很好的解释。但确实,是的,我认为这有一定的道理。你也可以说类似的话,比如,“我想让你成为爱因斯坦,现在,帮我解答这道物理题”,或者,“你是世界上最伟大的营销人,世界上最伟大的品牌营销人,现在这里有一个命名问题。“确实有某种机制,它会把模型切换到某种思维模式中,这实际上可以产生非常积极的效果。
Lenny Rachitsky: 我其实一直都在用这个技巧。我每次……当我为访谈准备问题的时候,偶尔也会用它来想一些我自己没想到的角度,我确实会打上”你是世界上最优秀的播客主持人”。
Kevin Weil: 对。
Lenny Rachitsky: Kevin Weil 要来上节目了……对,这真的有用。
Kevin Weil: 顺便说一下,回到我们之前提过几次的那个观点。你有时候对人也这样做——你把人……你设定一个框架,让他们进入某种心态,得到的回答就完全不同。所以我觉得这种现象在人类身上也有对应,又一次印证了这一点。
Lenny Rachitsky: Kevin,这次对话太精彩了。我刚才在想怎么收尾。我的感觉是……我觉得你不只是站在未来的最前沿,你和你的团队实际上就是那个正在创造未来的前沿。所以真的很荣幸能邀请你来,和你交谈,听你分享你认为事情正在走向何方、我们需要思考什么。谢谢你来做客,Kevin。
Kevin Weil: 哦,非常感谢你的邀请。我有幸和世界上最优秀的团队一起工作,一切功劳都属于他们。也真的很感谢你邀请我来,这次聊天超级有趣。
Lenny Rachitsky: 我差点忘了问你最后两个问题。大家如果想联系你,可以在哪里找到你?听众可以怎么帮到你?
Kevin Weil: 我在几乎所有平台上都是 @kevinweil,K-E-V-I-N-W-E-I-L。这么多年了我仍然是 Twitter 的日活用户,应该说是 X 的日活用户。LinkedIn 也行,哪里都可以。我希望大家能给我的东西就是——反馈。大家都在用 ChatGPT,告诉我哪里做得特别好、你希望我们继续加码的,也告诉我哪里出了问题。我在 Twitter 上非常活跃,喜欢听到大家的反馈,什么好用、什么不好用,所以别害羞。
Lenny Rachitsky: 而且我发现关注你还能帮你了解你们发布的所有东西。你会分享每天、每周、每月推出的新功能,所以这也是一个好处。顺便说一句,四亿周活用户全都给你发反馈,那可来吧。
Kevin Weil: 好啊,来吧。
Lenny Rachitsky: 肯定没问题。好的,谢谢你,Kevin。感谢你来做客。
Kevin Weil: 好的,伙计,非常感谢。回头见。
Lenny Rachitsky: 大家再见。非常感谢大家的收听。如果你觉得这期内容有价值,可以在 Apple Podcasts、Spotify 或你最喜欢的播客应用上订阅本节目。另外,也请考虑给我们评分或留下评论,这真的能帮助更多听众发现这个播客。你可以在 lennyspodcast.com 找到所有往期节目或了解更多关于本节目的信息。下期再见。
术语表
| 原文 | 中文 |
|---|---|
| AGI (Artificial General Intelligence) | 通用人工智能(AGI) |
| Airbnb | Airbnb |
| Alex Komorowski | Alex Komorowski |
| Alexa | Alexa |
| alignment | alignment(保持一致) |
| Andrej Karpathy | Andrej Karpathy |
| API | API |
| Aptos | Aptos |
| Black Product Managers Network | Black Product Managers Network |
| blockchain | 区块链 |
| bottoms-up | 自下而上 |
| brainstorming | 头脑风暴 |
| chain of thought | 思维链 |
| Chief People Officer | 首席人才官 |
| COBOL | COBOL |
| CPO (Chief Product Officer) | 首席产品官 |
| Dario | Dario |
| DAU (Daily Active User) | 日活用户(DAU) |
| David Marcus | David Marcus |
| deep research | 深度研究(deep research) |
| DeepSeek | DeepSeek |
| Elizabeth | Elizabeth |
| ENG (Engineering) | 工程(ENG) |
| ensemble | 集成(ensemble) |
| EQ (Emotional Quotient) | 情商(EQ) |
| Ethan Mollick | Ethan Mollick |
| Ev Williams | Ev Williams |
| evals | 评估(evals) |
| fine-tuned | 微调 |
| foundation model | 基础模型 |
| Ghiblifications | 吉卜力化 |
| hallucinate | 幻觉 |
| hero use cases | 核心用例 |
| high agency | 高能动性 |
| hill climbing | 爬坡优化 |
| ImageGen | ImageGen |
| InnoDB | InnoDB |
| iterative deployment | 迭代部署 |
| John Malone | John Malone |
| Julia | Julia |
| Khan Academy | Khan Academy |
| Lenny & Friends Summit | Lenny & Friends Summit |
| Libra | Libra |
| LLM (Large Language Model) | 大语言模型(LLM) |
| Mike Krieger | Mike Krieger |
| Mistin | Mistin |
| Moats | 护城河 |
| model maximalism | 模型最大化主义 |
| Moore’s Law | 摩尔定律 |
| MySQL | MySQL |
| Novi | Novi |
| OneSchema | OneSchema |
| Patrick | Patrick |
| Peter Zeihan | Peter Zeihan |
| PII (Personally Identifiable Information) | 个人身份信息(PII) |
| Planet | Planet |
| PM (Product Manager) | 产品经理(PM) |
| prompt engineering | 提示工程 |
| proof of concept | 概念验证 |
| remittance | 汇款 |
| Sam | Sam |
| Sam Altman | Sam Altman |
| Sarah Guo | Sarah Guo |
| silver bullet | 银弹 |
| Sora | Sora |
| standard deviation | 标准差 |
| Stories | Stories |
| Strava | Strava |
| super intelligence | 超级智能 |
| system one | 系统一 |
| Tahoe | 太浩湖(Tahoe) |
| the Nature Conservancy | 大自然保护协会(the Nature Conservancy) |
| ultra marathons | 超级马拉松 |
| vibe coding | 感觉编程(vibe coding) |
| Vinod | Vinod |
| Vinod Khosla | Vinod Khosla |
| WAU (Weekly Active User) | 周活用户(WAU) |
| Waymo | Waymo |
| Zuck | Zuck |
此文档由 AI 分片翻译(translate_long_document)
OpenAI’s CPO on how AI changes must-have skills, moats, coding, startup playbooks, more | Kevin Weil
The Pace of AI Evolution
Kevin Weil: The AI models that you’re using today is the worst AI model you will ever use for the rest of your life, and when you actually get that in your head, it’s kind of wild. Everywhere I’ve ever worked before this, you kind of know what technology you’re building on, but that’s not true at all with AI. Every two months, computers can do something they’ve never been able to do before and you need to completely think differently about what you’re doing.
Libra: My Biggest Career Regret
Lenny Rachitsky: You’re chief product officer of maybe the most important company in the world right now. I want to chat about what it’s just like to be inside the center of the storm.
Kevin Weil: Our general mindset is in two months, there’s going to be a better model and it’s going to blow away whatever the current set of limitations are. And we say this to developers too. If you’re building and the product that you’re building is kind of right on the edge of the capabilities of the models, keep going because you’re doing something right. Give it another couple months and the models are going to be great, and suddenly the product that you have that just barely worked is really going to sing.
The Ghibli-Style Image Craze
Lenny Rachitsky: Famously, you led this project at Facebook called Libra.
Kevin Weil: Libra is probably the biggest disappointment of my career. It fundamentally disappoints me that this doesn’t exist in the world today because the world would be a better place if we’d been able to ship that product. We tried to launch a new blockchain. It was a basket of currencies originally. It was integration into WhatsApp and Messenger. I would be able to send you 50 cents in WhatsApp for free. It should exist. To be honest, the current administration is super friendly to crypto. Facebook’s reputation is in a very different place. Maybe they should go build it now.
The “Normalized Miracle” of AI
Lenny Rachitsky: Today my guest is Kevin Weil. Kevin is chief product officer at OpenAI, which is maybe the most important and most impactful company in the world right now, being at the forefront of AI and AGI and maybe someday super intelligence. He was previously head of product at Instagram and Twitter. He was co-creator of the Libra Cryptocurrency at Facebook, which we chat about. He’s also on the boards of Planet and Strava and the Black Product Managers Network and the Nature Conservancy. He’s also just a really good guy and he has so much wisdom to share. We chat about how OpenAI operates, implications of AI and how we will all work and build product, which markets within the AI ecosystem, companies like OpenAI won’t likely go after and thus are good places for startups to own. Also, why learning the craft of writing evals is quickly becoming a core skill for product builders, what skills will matter most in an AI era and what he’s teaching his kids to focus on and so much more.
Why I Joined OpenAI
Kevin Weil: Thank you so much for having me. We’ve been talking about doing this forever and we made it happen.
Working in the Eye of the Storm
Lenny Rachitsky: We did it. I can’t imagine how insane your life is, so I really appreciate that you made time for this and we’re actually recording this the week that you guys launched your new image model, which is a happy coincidence. My entire social feed is filled with ghiblifications of everyone’s life and family photos and everything, so good job.
Kevin Weil: Yep, mine too. My wife, Elizabeth, sent me one of hers, so I’m right there with you.
Why Evals Matter
Lenny Rachitsky: Let me just ask, did you guys expect this kind of reaction? It feels like this is the most viral thing that’s happened in AI, which a high bar since, I don’t know, ChatGPT launched. Just like, did you guys expect it to go this well? What does it feel like internally?
Kevin Weil: There have been a handful of times in my career when you’re working on a product internally and the internal usage just explodes. This was true by the way when we were building stories at Instagram. More than anything else in my career, we could feel it was going to work because we were all using it internally and we’d go away for a weekend. Before it launched we were all using it and we’d come back after a weekend and we would know what was going on and be like, “Oh, hey, I saw you were at that camping trip, how was that?” You were like, “Man, this thing really works.” ImageGen was definitely one of those, so we’d been playing with it for, I don’t know, a couple months and when it first went live internally to the company, there was kind of a little gallery where you could generate your own, you could also see what everyone else was generating and it was just nonstop buzz. So yeah, we had a sense that this was going to be a lot of fun for people to play with.
Customization and Startup Opportunities
Lenny Rachitsky: That’s really cool. That should be a measure of just confidence into something going well that you’re launching is internally everyone’s going crazy for it.
Kevin Weil: Yeah. Especially social things because you have a very tight network as a company socially, so you know each other and you’re experts in your product hopefully. And so there’s some sense in which if you’re doing something social and it’s not taking off internally, you might question what you’re doing.
The Secret to Fast Delivery
Lenny Rachitsky: Yeah, and by the way, the Ghibli thing, is that something you guys seeded or how did that even start? Was that an intentional example?
Bottom-Up and Fast Iteration
Kevin Weil: I think it’s just the style people love and the model is really capable at emulating style or understanding what… It’s very good at instruction following. That’s actually something that I think people… I’m starting to see people discover with it, but you can do very complex things. You can give it two images, one is your living room and the other is a whole bunch of photos or memorabilia or things you want and you say, “Tell me how you would arrange these things.” Or you can say, “I’d like you to show me what this will look like if you put this over here and this thing to the right of that and this one to the left of this, but under that one.” And the model actually will understand all of that and do it. It’s incredibly powerful. So I’m just excited about all the different things people are going to figure out.
Lenny Rachitsky: Yeah. All right. Well, good job. Good job team OpenAI. Let’s get serious here and let’s zoom out a little bit. The way I see it is you’re chief product officer of maybe the most important company in the world right now. Just not to set the bar too high, but you guys are ushering in AI, AGI at some point, super intelligence at some point. No big deal. I have more questions for you than I’ve had for any other guest. Actually put out a call-out on Twitter and LinkedIn and my community just like what would you want to ask Kevin? And I had over 300 well-formed questions and we’re going to go through every single one. So let’s just get started. I’m just joking.
Iterative Deployment and Maximizing Models
Kevin Weil: Cool.
On Competition and Coding Skills
Lenny Rachitsky: I picked out the best and there’s a lot of stuff I’m really curious about.
Kevin Weil: Well, it’s 1 PM here. It doesn’t get dark for a while, so let’s do it.
ChatGPT’s Consumer Mindshare
Lenny Rachitsky: Okay, here we go. Okay, so first of all, I’m just going to take notes here. When is AGI launching? When in December?
Kevin Weil: I mean, we just launched a good ImageGen model. Does that count?
The Charm of Conversational Interfaces
Lenny Rachitsky: It’s getting there. It’s getting there.
The Research-Product Collaboration Model
Kevin Weil: There’s this quote I love, which is “AI is whatever hasn’t been done yet” because once it’s been done, when it kind of works, then you call it machine learning, and once it’s kind of ubiquitous and it’s everywhere, then it’s just an algorithm. So I’ve always loved that we call things AI when they still don’t quite work and then by the time it’s like an AI algorithm that’s recommending you follow, oh, that’s just an algorithm, but this new thing, like self-driving cars, that’s it. I think to some degree we’re always going to be there and the next thing is always going to be AI and the current thing that we use every day and is just a part of our lives, that’s an algorithm.
PM Headcount and Roles
Lenny Rachitsky: It’s so interesting because in the Bay Area you see self-driving cars driving around and it’s so normal now when four years ago and three years ago, you would’ve seen this and you’d be like, “Holy shit, what is… We’re in the future.” And now we’re just so take it for granted.
Kevin Weil: I mean there’s something like that with everything. If I showed you… When GPT-3 launched, I wasn’t at OpenAI then. I was just a user, but it was mind-blowing. And if I gave you GPT-3 now I just plugged that into ChatGPT for you and you started using it, you’d be like, “What is this thing?” It’s like mess.
The Profile of an OpenAI PM
Lenny Rachitsky: Flop, flop.
Kevin Weil: I had the same experience when I first got into a Waymo, your very first ride, at least my very first ride, my first 10 seconds in a Waymo, it starts driving and you’re like, “Oh my God, watch out for that bike.” You’re holding onto whatever you can. And then five minutes in, you’ve calmed down and you realize that you’re getting driven around the city without a driver and it’s working. You’re just like, “Oh my God, I am living in the future right now.” And then another 10 minutes, you’re bored, you’re doing email on your phone, answering Slack messages, and suddenly this miracle of human invention is just an expected part of your life from then on. And there is really something in the way that we all are adapting to AI that’s kind of like that. These miraculous things happen and computers can do something they’ve never been able to do before and it blows our mind collectively for a week and then we’re like, oh, yeah. Oh, now it’s just machine learning on its way to being an algorithm.
How PMs Build Trust
Lenny Rachitsky: The craziest thing about what you just shared actually is, I don’t know, ChatGPT, which is now feels terrible. 3.5 was a couple years ago, and imagine what life will be like in a couple years from now. We’re going to get to that, where things are going, what you think is going to be the next big leap. But I want to start with the beginning of your journey at OpenAI. So you worked at Twitter, you worked at Facebook, you worked at Planet, Instagram. At some point you got recruited to go and come work at OpenAI. I’m curious just what that story was like of the recruiting process of joining OpenAI as CPO. Is there any fun stories there?
Kevin Weil: If I’m remembering the timeline right, we communicated at Planet I was leaving and I was planning to just go take some time. I wasn’t going to stop working, but I was also happy to take the summer. This was maybe April or something. I was like, cool, I’m going to have the summer with my kids. We’re going to go to Tahoe or something and I’ll actually get to hang out rather than what I usually do going up and down and all that. And then Sam and I had known each other lightly for a bunch of years and he’s always involved in so many interesting things like companies building fusion and all these things. So he’d always been somebody that I would call occasionally if I was starting to think about my next thing because I like working on big tech forward, sort of next wave kind of things.
And so I called him and I think Vinod also helped to put us in touch again. And this time it wasn’t like, “Oh, you should go talk to these guys working on fusion.” He said, “Actually, we’re thinking about something, you should come talk to us.” I was like, “Okay, that sounds amazing. Let’s do it.” And it goes really fast, really, really fast. I met most of the management team in a brief period of time, a few days, and they were telling me, ‘Look, we’re basically going to move as fast as we want to move. And if you talk to everyone, everyone likes you, you’re ready to go.” Sam came over for dinner and we had a great evening together just talking about OpenAI in the future and getting to know each other better. And at the end I was like, I was going to go in the next day for a bigger round of interviews and Sam was saying, “Hey, it’s going really well. We’re really excited.”
And I said, “Cool. So how do I think about tomorrow?” And he said, “Oh, you’ll be fine. Don’t worry about it. And if it goes well, we’re basically there.” And so I go in the next day, meet a bunch of people, have a great time. I really enjoyed everybody I met with. In any interview, you can always second guess yourself like, oh, I shouldn’t have said that thing or that thing I gave a bad answer on I wish I could redo, but I came away feeling like I think that went pretty well. And I was expecting to hear that weekend basically because they sort of set expectations as soon as if this goes well, we’re ready to go. And I didn’t hear anything. And then it was like Monday, Tuesday, Wednesday, I still didn’t hear anything and I reached out to folks on the OpenAI side a couple of times, still nothing.
And I was like, “Oh my God, I screwed it up. I don’t know where I screwed it up, but I totally screwed it up. I can’t believe it.” And I was going back to Elizabeth, my wife and being like, “What did I do? Where do you think I…” Getting all crazy about it and then it’s still nothing. And finally it was like nine days later, they finally got back to me and it turned out there was a bunch of stuff happening internally and this, that and the other thing, and there’s just a million things happening. And they finally were like, “Oh yeah, that went well. Let’s do this.” And I was like, “Oh, okay, cool, let’s do it.” But it was nine days of agony and they were just super busy on some internal stuff and there I was fretting every single day and re-going over every line of our interview process.
AI’s Impact on Product Teams
Lenny Rachitsky: It makes me think about when you’re dating someone and you’ve texted them and you’re not hearing anything back, you assume something is wrong.
Kevin Weil: Yeah, totally.
The Rise of Vibe Coding
Lenny Rachitsky: They might just be busy.
Kevin Weil: I have a hard time about it still.
The Future of Product Teams
Lenny Rachitsky: That’s wild. I love that it worked out. And I guess the lesson there is don’t jump to conclusions.
Kevin Weil: Yeah. Have a little bit of chill.
Fine-Tuning Models: A Case Study
Lenny Rachitsky: Speaking of that, I want to chat about what it’s just like to be inside the center of the storm. Again, you work at a lot of, let’s say traditional companies even though they’re not that traditional, Twitter and Instagram and Facebook and Planet, and now you work at OpenAI. I’m curious, what is most different about how things work in your day-to-day life at OpenAI?
Integrating Models into Customer Service
Kevin Weil: I think it’s probably the pace. Maybe it’s two things. One is it’s the pace. The second is everywhere I’ve ever worked before this, you kind of know what technology you’re building on. So you spend your time thinking about what problems are you solving? Who are you building for? How are you going to make their lives better? How are you going to… Is this a big enough problem that you’re going to be able to change habits? Do people care about this problem being solved? All those good product things. But the stuff that you’re building on is kind of fixed. You’re talking about databases and things and I bet the database you used this year is probably 5% better than the database you used two years ago, but that’s not true at all with AI. It’s like every two months computers can do something they’ve never been able to do before and you need to completely think differently about what you’re doing.
There’s something fundamentally interesting about that makes life fun here. There’s also something we will maybe talk about evals later, but it also really, in this world of… Everything we’re used to with computers is about giving a computer very defined inputs. If you look at Instagram for example, there are buttons that do specific things and you know what they do. And then when you give a computer defined inputs, you get very defined outputs. You’re confident that if you do the same thing three times, you’re going to get the same output three times. LLMs are completely different than that. They’re good at fuzzy subtle inputs. Then all the nuances of human language and communication, they’re pretty good at. And also they don’t really give you the same answer. You probably get spiritually the same answer for the same question, but it’s certainly not the same set of words every time. And so you’re much more, it’s fuzzier inputs and fuzzier outputs. And when you’re building products, it really matters whether there’s some use case that you’re trying to build around.
If the model gets it right 60% of the time, you build a very different product than if the model gets it right 95% of the time versus if the model gets it right 99.5% of the time. And so there’s also something that you have to get really into the weeds on your use case and the evals and things like that in order to understand the right kind of product to build. So that is just fundamentally different. If your database works once, it works every time. And that’s not true in this world.
Preparing Kids for the Future
Lenny Rachitsky: Let’s actually follow this thread on evals. I definitely wanted to talk about this. We had this legendary panel at the Lenny & Friends Summit. It was you and Mike Krieger and Sarah Guo moderating.
Kevin Weil: That was fun.
AI’s Future and Societal Impact
Lenny Rachitsky: So fun. The thing that I heard that kind of stuck with people from that panel was a comment you made where you said that writing evals is going to become a core skill for product managers, and I feel like that probably applies further than just product managers. A lot of people know what evals are. A lot of people have no idea what I’m talking about. So could you just briefly explain what is an eval and then just why do you think this is going to be so important for people building products in the future?
Kevin Weil: Yeah, sure. I think the easiest way to think about it is almost like a quiz for a model, a test to gauge how well it knows a certain set of subject material or how good it is at responding to a certain set of questions. So in the same way you take a calculus class and then you have calculus tests that see if you’ve learned what you’re supposed to learn. You have evals that test how good is the model at creative writing? How good is the model at graduate level science? How good is the model at competitive coding? And so you have these set of evals that basically perform as benchmarks for how smart or capable the model is.
The Next Leap in AI Creativity
Lenny Rachitsky: Is it a simple way to think about it, like unit tests for model?
Kevin Weil: Yeah, unit tests, tests in general for models. Totally.
Looking Ahead to the Future
Lenny Rachitsky: Great, great. Okay. And then why is this so important for people that don’t totally understand what the hell’s going on here with evals? Why is this so key to building AI products?
Kevin Weil: Well, it gets back to what I was saying. You need to know whether your model is going to… There are certain things that models will get right. 99.95% of the time and you can just be confident. There are things that they’re going to be 95% right on and things they’re going to be 60% right on. If the model’s 60% right on something, you’re going to need to build your product totally differently. And by the way, these things aren’t static either. So a big part of evals is if you know you’re building for some use case. So let’s take our deep research product, which is one of my favorite things that we’ve released maybe ever. The idea is with deep research for people who haven’t used it, you can give ChatGPT now an arbitrarily complex query. It’s not about returning you an answer from a search query, which we can also do.
It’s here’s a thing that if you were going to answer it yourself, you’d go off and do two hours of reading on the web and then you might need to read some papers and then you would come back and start writing up your thoughts and realize you had some gaps in your thinking. So you go out and do more research. It might take you a week to write some 20 page answer to this question. You can let ChatGPT just like chug for you for 25, 30 minutes. It’s not the immediate answers you’re used to, but it might go work for 25, 30 minutes and do work that would’ve taken you a week. So as we were building that product, we were designing evals at the same time as we were thinking about how this product was going to work and we were trying to go through hero use cases.
Here’s a question you want to be able to ask. Here’s an amazing answer for that question. And then turning those into evals and then hill climbing on those evals. So it’s not just that the model is static and we hope it does okay on a certain set of things, you can teach the model. You can make this a continuous learning process. And so as we were fine-tuning our model for deep research to be able to answer these things, we were able to test is it getting better on these evals that we said were important measures of how the product was working? And it’s when you start seeing that and you start seeing performance on evals going up, you start saying, “Okay, I think we have a product here.”
When Magic Becomes Routine
Lenny Rachitsky: You made a kind of a comment along these same lines around evals that AI is almost capped in how amazing it can be by how good we are at evals. Does that resonate? Any more thoughts along those lines?
Kevin Weil: I mean, these models are their intelligences and intelligence is so fundamentally multidimensional so you can talk about a model being amazing at competitive coding, which may not be the same as that model being great at front-end coding-
… may not be the same as that model being great at front-end coding or back-end coding or taking a whole bunch of code that’s written in COBOL and turning it into Python. And that’s just within the software engineering world. So I think there’s a sense in which you can think of these models as incredibly smart, very factually aware intelligences, but still most of the world’s data, knowledge, process is not public. It’s behind the walls of companies or governments or other things. And same way, if you were going to join a company, you would spend your first two weeks onboarding. You’d be learning the company-specific processes. You’d get access to company-specific data. The models are smart enough, you can teach them anything, but they need to have the raw data to learn from.
So there’s a sense in which I think the future is really going to be incredibly smart, broad-based models that are fine-tuned and tailored with company-specific or use case-specific data so that they perform really well on company-specific, or use case-specific things. And you’re going to measure that with custom evals. So what I was referring to is just like these models are really smart, you need to still teach them things if the data’s not in their training set, and there’s a huge amount of use cases that are not going to be in their training set because they’re relevant to one industry or one company.
The Story of Libra
Lenny Rachitsky: I’m just going to keep following the thread that you’re leading us down, but I’m going to come back because I have more questions around some of these things. So you came to a space that I think a lot of AI founders are thinking about is just, where’s OpenAI not going to come squash me in the future? Or one of the other foundational models. So it’s unclear to a lot of people just like, “Should I build a startup in this space or not?” Is there any advice you have or any guidance for where you think OpenAI, or just foundational models in general likely won’t go and where you have an opportunity to build a company?
Kevin Weil: So this is something that Ev Williams used to say back at Twitter that’s always stuck with me, which is, “No matter how big your company gets, no matter how incredible the people are, there are way more smart people outside your walls than there are inside your walls.” And that’s why we are so focused on building a great API. We have 3 million developers using our API. No matter how ambitious we are, how big we grow, by the way, we don’t want to grow super big, there are so many use cases, places in the world where AI can fundamentally make our lives better. We’re not going to have the people. We’re not going to have the know-how to build most of these things.
And I think, like I was saying, the data is industry-specific, use case-specific, behind certain company walls, things like that. And there are immense opportunities in every industry and every vertical in the world to go build AI-based products that improve upon the state of the art. And there’s just no way we could ever do that ourselves. We don’t want to. We if we did want to, and we’re really excited to power that for 3 million-plus developers and way more in the future.
Rapid Fire Q&A
Lenny Rachitsky: Coming back to your earlier point about the tech changing constantly and getting faster, not exactly knowing what you’ll have by the time you launch something in terms of the power, the model. I’m curious what allows you to ship quickly and consistently in such great stuff? And it sounds like one answer is bottoms-up empowered teams versus a very top-down roadmap that’s planned out for a quarter. What are some of those things that allow you to ship such great stuff so often, so quickly?
Kevin Weil: Yeah. I mean, we try and have a sense of where we’re trying to go, point ourselves in a direction so that we have some rough sense of alignment. Thematically, I don’t for second, and we do quarterly roadmapping. We laid out a year-long strategy. I don’t for a second believe that what we write down in these documents is what we’re going to actually ship three months from now, let alone six or nine. But that’s okay. I think it’s like an Eisenhower quote, “Plans are useless. Planning is helpful,” which I totally subscribe to, especially in this world. It’s really valuable. If you think about quarterly road roadmapping for example, it’s really valuable to have a moment where you stop and go, “Okay. What did we do? What worked? What went well? What didn’t go well? What did we learn and now what do we think we’re going to do next?”
And by the way, everybody has some dependencies. You need the infrastructure team to do the following things, partnership with research here. So you want to have a second to check your dependencies, make sure you’re good to go and then start executing. We try and keep that really lightweight because it’s not going to be right. We’re going to throw it out halfway because we will have learned new things. So the moment of planning is helpful even if it’s only partially.
So I think just expecting that you’re going to be super agile and that there’s no sense writing a three month roadmap, let alone a year long roadmap because the technology’s changing underneath you so quickly. We really do try and go very strongly bottoms up, subject to our overall directional alignment. We have great people. We have engineers and PMs and designers and researchers who are passionate about the products they’re building and have strong opinions about them and are also the ones building them. So they have a real sense of what the capabilities are too, which is super important.
So I think you want to be more bottoms up in this way. So we operate that way. We are happy making mistakes. We make mistakes all the time. It’s one of the things I really appreciate about Sam. He pushes us really hard to move fast, but he also understands that with moving fast comes, we didn’t quite get this right or that we launched this thing, it didn’t work. We’ll roll it back. Look at our naming. Our naming is horrible.
Movie and Product Recommendations
Lenny Rachitsky: That was a lot of questions people had for you. Model names, yeah.
Kevin Weil: It’s absolutely atrocious and we know it, and we will get around to fixing it at some point, but it’s not the most important thing and so we don’t spend a lot of time on it.
My Life Motto
Lenny Rachitsky: But it also shows you how it doesn’t matter. Again, ChatGPT the most popular, fastest growing product in history, it’s the number one AI, API and model. So clearly it doesn’t matter that much.
Kevin Weil: And we name things like o3 mini high.
Top Prompting Tips
Lenny Rachitsky: Man, I love it. Okay. So you talked about roadmapping and bottoms up and I’m really curious, is there a cadence or a ritual of aligning with you or Sam or you review everything that’s going out? Is there a meeting every week or every month where you guys see what’s happening?
Kevin Weil: On key projects. So we do product reviews and things like that, like you would expect. There isn’t a ritual because there isn’t… I would never want us to be blocked on launching something, waiting for a review with me or Sam, if we can’t get there. If I’m traveling or Sam’s busy or whatever, that’s a bad reason for us not to ship. So obviously for the biggest, most high priority stuff, we have a pretty close beat on it, but we really try not to, frankly. We want to empower teams to move quickly, and I think it’s more important to ship and iterate.
So we have this philosophy, we call iterative deployment, and the idea is we’re all learning about these models together. So there’s a real sense in which it’s way better to ship something even when you don’t know the full set of capabilities and iterate together in public. And we co-evolve together with the rest of society as we learn about these things and where they’re different and where they’re good and bad and weird. I really like that philosophy.
I think the other thing that ends up being a part of our product philosophy is the sense of model maximalism. The models are not perfect. They’re going to make mistakes. You could spend a lot of time building all kinds of different scaffolding around them. And by the way, sometimes we do because sometimes there are kinds of errors that you just don’t want to make, but we don’t spend that much time building scaffolding around the parts that don’t match that because our general mindset is in two months there’s going to be a better model and it’s going to blow away whatever the current set of limitations are.
So if you’re building, and we say this to developers too, if you’re building and the product that you’re building is right on the edge of the capabilities of the models, keep going, because you’re doing something right because you give it another couple months and the models are going to be great, and suddenly the product that you have that just barely worked is really going to sing. And that’s how you make sure that you’re really pushing the envelope and building new things.
Lenny Rachitsky: I had the founder of Bolt on the podcast, StackBlitz is the company name, and he shared this story that they’ve been working on this product for seven years behind the scenes and it was failing. Nothing was happening. And then all of a sudden it was, sorry to mention a competitor, but Claude came out or a Sonnet 3.5 came out and all of a sudden everything worked and they’ve been building all this time and finally it worked. And I hear that a lot with YC, just like things that never were possible now are just becoming possible every few months with the updates to the models.
Kevin Weil: Yeah, absolutely.
Lenny Rachitsky: Let me actually ask this, I wasn’t planning to ask this, but I’m curious if you have any quick thoughts just why is Sonnet so good at coding, and thoughts on your stuff getting as good and better at actual coding?
Kevin Weil: Yeah. I mean, kudos to Anthropic. They’ve built very good coding models. No doubt. We think that we can do the same. Maybe by the time this podcast has shipped, we’ll have more to say, but either way, all credit to them. I think intelligence is really multi-dimensional and so I think the model providers… It used to be that OpenAI had this massive model lead, 12 months or something ahead of everybody else. That’s not true anymore. I like to think we still have a lead. I’d argue that we do, but it’s certainly not a massive one. And that means that there are going to be different places where the Google models are really good or where Anthropic models are really good, or where we’re really good and our competitors are like, “We got to get better at that.” And it actually is easier to get better at a certain thing once someone’s proved it possible than it is to forge a path through the jungle and doing something brand new.
So I just think as an example, it was like nobody could break 4 minutes in the mile, and then finally somebody did and the next year 12 more people did it. I think there’s that all over the place and it just means that competition is really intense, and consumers are going to win and developers are going to win and businesses are going to win in a big way from that. It’s part of why the industry moves so fast, but all respect to the other big model providers. Models are getting really good. We’re going to move as fast as we can and I think we’ve got some good stuff coming.
Lenny Rachitsky: Exciting. This makes me also think about, in many ways other models are better at certain things, but somehow ChatGPT is the… If you look at all the awareness numbers and usage numbers, it’s like no matter where you guys are in the rankings, people seem to just think of AI ChatGPT almost as the same. What do you think you did right to win in the consumer mindset, at least at this point and awareness in the world?
Kevin Weil: I think being first helps, which is one of the reasons why we’re so focused on moving quickly. We like being the first to launch new capabilities. Things like deep research. Our models, they can do a lot of things. So they can take real-time video input, you have speech to speech, you can do speech to text and text to speech. They can do deep research. They can operate on a canvas, they can write code. So ChatGPT can be this one- stop-shop where all the things that you want to do are possible. And as we go forward in it, we have more agentic tools like Operator where it’s browsing for you and doing things for you on the web, more and more you’re going to be able to come to this one place to ChatGPT, give it instructions and have it accomplish real things for you in the world. There’s something fundamentally valuable in that. So we think a lot about that. We try to move really fast so that we are always the most useful place for people to come to.
Lenny Rachitsky: What would you say is the most counterintuitive thing that you’ve learned after building AI products or working at OpenAI, something that’s just like, “I did not expect that?”
Kevin Weil: I don’t know, maybe I should have expected this, but one of the things that’s been funny for me is the extent to which you’re trying to figure out how some product should work with AI, or even why some AI thing happens to be true, you can often reason about it the way you would reason about another human and it works. So maybe a couple examples. When we were first launching our reasoning model, we were the first to build a model that could reason, that could, instead of giving you just a quick system one answer right away to every question you asked, it was the third Emperor of the Holy Roman Empire, here’s an answer.
You could ask it hard questions and it would reason. The same way that if I asked you to do a crossword puzzle, you couldn’t just snap fill in everything. You would be, “Well, okay. On this one across, I think it could be one of these two, but that means there’s an A here. So that one has to be this, away, back track, step-by-step build up from where you are.” Same way you answer any difficult logistical problem, any scientific problem. So this reasoning breakthrough was big, but it was also the first time that a model needed to sit and think. And that’s a weird paradigm for a consumer product. You don’t normally have something where you might need to hang out for 25 seconds after you ask a question.
So we were trying to figure out what’s the UI for this? With deep research where the model’s going to go and think for 25 minutes sometimes, it’s actually not that hard because you’re not going to sit and watch it for 25 minutes. You’re going to go do something else. You’re going to go to another tab or go get lunch or whatever, and then you’ll come back and it’s done when it’s like 20, 25 seconds or 10 seconds, it’s a long time to wait, but it’s not long enough to go to do something else.
So you can think, if you asked me something that I needed to think for 20 seconds to answer, what would I do? I wouldn’t just go mute and not say anything and shut down for 20 seconds and then come back. So we shouldn’t do that. We shouldn’t just have a slider sitting there. That’s annoying. But I also wouldn’t just start babbling every single thought that I had. So we probably shouldn’t just expose the whole chain of thought as the model’s thinking, but I might go like, “That’s a good question. All right.” I might approach it like that and then think. You’re maybe giving little updates and that’s actually what we ended up shipping.
You have similar things where you can find situations where you get better thinking sometimes out of a group of models that all try and attack the same problem, and then you have a model that’s looking at all their outputs and integrating it and then giving you a single answer at the end. I mean, sounds a little bit like brainstorming. I certainly have better ideas when I get in a room and brainstorm with other people because they think differently than me. So anyways, there’s just all these situations where you can actually reason about it like a group of humans or an individual human and it works, which I don’t know, maybe I shouldn’t have been surprised but I was.
Lenny Rachitsky: That is so interesting because when I see these models operate, I never even thought about you guys designing that experience. To me, it just feels like this is what the LLM does. It just sits there and tells me what it’s thinking. And I love this point you’re making of let’s make it feel like a human operating and well, how does a human operate? Well, they just talk aloud. They think, here’s the thing I should explore. And I love that deep sequence to the extreme of that where they’re just like, “Here’s everything I’m doing and thinking.” And people actually like that too, I guess. Was that surprising to you, “Maybe that could work too. People seem to like everything?”
Kevin Weil: Yeah. We learned from that actually because when we first launched it, we gave you the subheadings of what the model was thinking about, but not much more. And then deep seek launched and it was a lot and we went, I don’t know if everyone wants that. There’s some novelty effect to seeing what the model’s really thinking about. We felt that too when we were looking at it internally. It’s interesting to see the model’s chain of thought, but it’s not… I think at the scale of 400 million people, you don’t want to see the model babble a bunch of things.
So what we ended up doing was summarizing it in interesting ways. So instead of just getting the subheadings, you’re getting one or two sentences about how it’s thinking about it and you can learn from that. So we tried to find a middle ground that we thought was an experience would be meaningful for most people, but showing everybody three paragraphs is probably not the right answer.
Lenny Rachitsky: This reminds me of something else you said at the summit that has really stuck with me, this idea that chat, people always make fun of chat is not the future interface for how we interact with AI, but you made this really interesting point that may argue the other side, which is, as humans we interface by talking and the IQ of a human can span from really low to really high and it all works talking to them and chat is the same thing and it can work on all kinds of intelligence levels. Maybe I just shared it, but I guess anything there about just why chat actually ends up being such an interesting interface for LLMs?
Kevin Weil: Yeah. I don’t know, maybe this is one of those things I believe that most people don’t believe, but I actually think chat is an amazing interface because it’s so versatile. People tend to go, “Chat. Yeah. We’ll figure out something better.” And I think it’s incredibly universal because it is the way we talk. I can talk to you verbally like we’re talking now. We can see each other and interact. We can talk on WhatsApp and be texting each other, but all of these things is this unstructured method of communication and that’s how we operate.
If I had some more rigid interface that I was allowed to use when we spoke, I would be able to speak to you about far fewer things and it would actually get in the way of us having maximum communication bandwidth. So there’s something magical. And by the way, in the past it never worked because there wasn’t a model that was good at understanding all of the complexity and nuances of human speech, and that’s the magic of LLMs. So to me, it’s like an interface that’s exactly fit to the power of these things. And that doesn’t mean that it always has to be just like I don’t necessarily always want to type, but you do want that very open-ended, flexible communication medium, it may be that we’re speaking and the model’s speaking back to me, but you still want the very lowest common denominator, no restrictions way of interacting.
Lenny Rachitsky: That is so interesting. That’s really changed the way I think about this stuff is that point that chat is just so good for this very specific problem of talking to superintelligence basically.
Kevin Weil: By the way, I think it’s not that it’s only chat either. If you have high volume use cases where they’re more prescribed and you don’t actually need the full generality, there are many use cases where it’s better to have something that’s less flexible, more prescribed, faster to specific task, and those are great too, and you can build all sorts of those. But you still want chat as this baseline for anything that falls out of whatever vertical you happen to be building for. It’s like a catch-all for every possible thing you’d ever want to express to a model.
Lenny Rachitsky: I’m excited to chat with Christina Gilbert, the founder of OneSchema, one of our long-time podcast sponsors. Hi, Christina.
Christina Gilbert: Yes. Thank you for having me on, Lenny.
Lenny Rachitsky: What is the latest with OneSchema? I know you now with some of my favorite companies like Ramp, Vanta, Scale and Watershed. I heard that you just launched a new product to help product teams import CSVs from especially tricky systems like ERPs?
Christina Gilbert: Yes. So we just launched OneSchema FileFeeds, which allows you to build an integration with any system in 15 minutes as long as you can export a CSV to an SFTP folder. We see our customers all the time getting stuck with hacks and workarounds, and the product teams that we work with don’t have to turn down prospects because their systems are too hard to integrate with. We allow our customers to offer thousands of integrations without involving their engineering team at all.
Lenny Rachitsky: I can tell you that if my team had to build integrations like this, how nice would it be to be able to take this off my roadmap and instead, use something like OneSchema and not just to build it, but also to maintain it forever.
Christina Gilbert: Absolutely, Lenny. We’ve heard so many horror stories of multi-day outages from even just a handful of ad records. We are laser-focused on integration reliability to help teams end all of those distractions that come up with integrations. We have a built-in validation layer that stops any bad data from entering your system, and OneSchema will notify your team immediately of any data that looks incorrect.
Lenny Rachitsky: I know that importing incorrect data can cause all kinds of pain for your customers, and quickly lose their trust. Christina, thank you for joining us. And if you want to learn more, head on over to oneschema.co. That’s oneschema.co.
I want to come back to that you talked about researchers and their relationship with product teams. I imagine a lot of innovation comes from researchers just like having an inkling and then building something amazing and then releasing it, and some ideas come from PMs and engineers. How do those teams collaborate? Does every team have a PM? Is it a lot of research-led stuff? Give us a sense of just where ideas and products come from mostly.
Kevin Weil: It’s an area where we’re evolving a lot. I’m really excited about it, frankly. I think if you go back a couple of years when ChatGPT was just getting started, obviously, I wasn’t in OpenAI, but…
Obviously I wasn’t an Open AI, but… We were more of a pure research company at the time. Chat GPT, if you remember, was a low-key research preview.
Lenny Rachitsky: For many years.
Kevin Weil: Yeah. It wasn’t a thing that the team launched thinking it was going to be this massive product.
Lenny Rachitsky: Oh, Chat GPT. Yeah.
Kevin Weil: And it was just a way that we were going to let people play with and iterate on the models. So we were primarily a research company, a world-class research company, and as ChatGPT has grown and as we’ve built our B-to-B products and our APIs and other things, now we’re more of a product company than we were. I still think we can’t… Open AI should never be a pure product company. We need to be both a world-class research company and a world-class product company, and the two need to really work together, and that’s the thing that I think we’ve been getting much better at over the last six months. If you treat those things separately and the researchers go do amazing things and build models and then they get to some state and then the product and engineering teams go take them and do something with them, we’re effectively just an API consumer of our own models.
The best products though are going to be, it’s like I was talking about with deep research, it’s a lot of iterative feedback. It’s understanding the products you’re trying to sell or the problems you’re trying to solve, building evals for them, using those evals to go gather data and fine-tune models to get them to be better at these use cases that you’re looking to solve. It’s a huge amount of back and forth to do it well. And I think the best products are going to be ENG product design and research working together as a single team to build novel things. So that’s actually how we’re trying to operate with basically anything that we build. It’s a new muscle for us because we’re kind of new as a product company, but it’s one that people are really excited about because we’ve seen every time we do it, we build something awesome, and so now every product starts like that.
Lenny Rachitsky: How many product managers do you have at Open AI? I don’t know if you share that number, but if you do.
Kevin Weil: Not that many, actually. I don’t know, 25. Maybe it’s a little more than that. My personal belief is that you want to be pretty PM light as an organization just in general. I say this with love because I am a PM, but too many PMs causes problems. We’ll fill the world with decks and ideas versus execution. So I think it’s a good thing when you have a PM that is working with maybe slightly too many engineers because it means they’re not going to get in and micromanage. You’re going to leave a lot of influence and responsibility with the engineers to make decisions. It means you want to have really product-focused engineers, which we’re fortunate to have. We have an amazingly product focused, high agency engineering team. But when you have something like that, you have a team that feels super empowered, you have a PM that’s trying to really understand the problems and gently guide the team a little bit but has too much going on to get too far into the details, and you end up being able to move really fast. So that’s kind of the philosophy we take.
We want Product ENG leads and product engineers all the way through. We want not too many PMs, but really awesome, high quality ones, and so far that seems to be working pretty well.
Lenny Rachitsky: I imagine being a PM at Open AI is a dream come true for a lot of people. At the same time, I imagine it’s not a fit for a lot of people. There’s researchers involved, very product minded engineers. What do you look for in the PMs that you hire there for folks that are like, “Maybe I shouldn’t go work there. I shouldn’t even think about that.”
Kevin Weil: I think, I’ve said this a few times, but high agency is something that we really look for, people that are not going to come in and wait for everyone else to allow them to do something, they’re just going to see a problem and go do it. It’s just a core part of how we work. I think people that are happy with ambiguity, because there is a massive amount of ambiguity here, it is not the kind of place, and we have trouble sometimes with more junior PMs because of this, because it’s just not the place where someone is going to come in and say, “Okay, here’s the landscape, here’s your area, I want you to go do this thing.” And that’s what you want as an early career PM. I mean, no one here has time and the problems are too ill-formed and we’re figuring them all out as we go. And so high agency, very comfortable with ambiguity, ready to come in and help execute and move really quickly. That’s kind of our recipe.
And I think also happy leading through influence because… I mean it’s usual as a PM, people don’t report to you, your team doesn’t report to you, et cetera, but you also have the complexity of a research function, which is even more sort of self-directed and it’s really important to build a good rapport with the research team. I think the EQ side of things is also super important for us.
Lenny Rachitsky: I know at most companies, a PM comes in and they’re just like, “Why do we need you?” And as a PM you have to earn trust and help people see the value, and I feel like at Open AI it’s probably a very extreme version of that where they’re like, “Why do we need this person? We have researchers, engineers, what are you going to do here?”
Kevin Weil: Yeah, I think people appreciate it done right, but you bring people along. I think one of the most important things a PM can do well is be decisive. So there’s a real fine line. You don’t want to be making… I mean it’s kind of like, I don’t love the PM as the CEO of the product illusion all the time, but just like Sam in his role would be making mistakes if he made every single decision in every meeting that he was in. And he would also be making mistakes if he made no decisions in any meetings that he was in, right? It’s understanding when to defer to your team and to let people innovate. And when there is a decision to be made that people either don’t feel comfortable with or don’t feel empowered to make, or a decision that has too many different disparate pros and cons that are spread out across a big group and someone needs to be decisive and make a call, it’s a really important trait of a CEO.
It’s something Sam does well, and it’s also a really important trait of a PM kind of at a more microscopic level. So because there’s so much ambiguity, it’s not obvious what the answer is in a lot of cases, and so having a PM that can come in and… And by the way, this doesn’t need to be a PM, I’m perfectly happy if it’s anybody else, but I kind of look to the PM to say, if there’s ambiguity and no one’s making a call, you better make sure that we get a call made and we move forward.
Lenny Rachitsky: This touches on a few posts I’ve done of just, where is AI going to take over work that we do versus help us with various work? So let me come at this question from a different direction of just how AI impacts product teams and hiring, things like that. So first of all, there’s all this talk of LM’s doing our coding for us, and 90% of code is going to be written by AI in a year. Dario at Anthropic said that. At the same time, you guys are all hiring engineers like crazy, PM’s like crazy. Every function is dead, but you’re still hiring every single one. I guess just, first of all, let me just ask this, how do you and the team, say engineers, PMs, use AI in your work? Is there anything that’s really interesting or things that you think people are sleeping on in how you use AI in your day-to-day work?
Kevin Weil: We use it a lot. I mean, every one of us is in Chat GPT all the time summarizing docs, using it to help write docs with GPTs that write product specs and things like that, all the stuff that you would imagine. I mean talk about writing evals, you can actually use models to help you write evals and they’re pretty good at it. That all said, I’m still sort of disappointed by us, and I really mean me, in, if I were to just teleport my five-year-old self leading product at some other company into my day job, I would recognize it still. And I think we should be in a world, certainly a year from now, probably even more now, where I almost wouldn’t recognize it because the workflows are so different and I’m using AI so heavily, and I’d still recognize it today. So I think in some sense, I’m not doing a good enough job of that.
Just to give an example, why shouldn’t we be vibe coding demos right, left and center? Instead of showing stuff in Figma, we should be showing prototypes that people are vibe coding over the course of 30 minutes to illustrate proofs of concept and to explore ideas. That’s totally possible today, and we’re not doing it enough. Actually, our chief people officer, Julia, was telling me the other day, she vibe coded an internal tool that she had at a previous job that she really wanted to have here at Open AI and she opened, I don’t know, Windsurf or something, and vibe coded it. How cool is that? And if our chief people officer is doing it, we have no excuse to not be doing it more.
Lenny Rachitsky: That’s an awesome story. And some people may not have heard this term vibe coding. Can you describe what that means?
Kevin Weil: Yeah, I think this was Andrej’s term.
Lenny Rachitsky: Karpathy. Yeah.
Kevin Weil: Andrej Karpathy. Yeah. So you have these tools like Cursor and Windsurf and GitHub Copilot that are very good at suggesting what code you might want to write. So you can give them a prompt and they’ll write code and then as you go to edit it, it’s suggesting what you might want to do. And the way that everyone started using that stuff was, give it a prompt, have it do stuff, you go edit it, give it a prompt, and you’re kind of really going back and forth with the model the whole time. As the models are getting better and as people are getting more used to it, you can kind of just let go of the wheel a little bit. And when the model’s suggesting stuff, it’s just like, tap, tap, tap, tap, tap. Keep going. Yes, yes, yes, yes, yes.
And of course the model makes mistakes or it does something that doesn’t compile, but when it doesn’t compile, you paste the error in and you say, go, go, go, go, go. And then you test it out and it does one thing that you don’t want it to do, so you enter in an instruction and say, go, go, go, go, go, and you just let the model do its thing. And it’s not that you would do that for production code that needed to be super tight today yet, but for so many things, you’re trying to get to a proof of concept, you’re getting to a demo and you can really take your hands off the wheel and the model will do an amazing job, and that’s vibe coding.
Lenny Rachitsky: That’s an awesome explanation. I think the pro version of that, which is, I think, the way Andre even described it as you talk, there’s a step like whisper or super whisper or something like that where you’re talking to the model, not even typing.
Kevin Weil: Yeah, totally.
Lenny Rachitsky: Oh man. So let me just ask, I guess, when you look at product teams in the future, you talked about how you guys should be doing this more, instead of designs, having prototypes, what do you think might be the biggest changes in how product teams are structured or built? Where do you think things are going in the next few years?
Kevin Weil: I think you’re definitely going to live in a world where you have researchers built into every product team. And I don’t even mean just at foundation model companies because I think the future… Actually, frankly one thing that I’m sort of surprised about about our industry in general is that there’s not a greater use of fine-tuned models. A lot of people… These models are very good, so our API does a lot of things really well, but when you have particular use cases, you can always make the model perform better on a particular use case by fine-tuning it. It’s probably just a matter of time. Folks aren’t quite comfortable yet with doing that in every case. But to me, there’s no question that that’s the future. Models are going to be everywhere just like transistors are everywhere, AI is going to be just a part of the fabric of everything we do, but I think there are going to be a lot of fine-tuned models because why would you not want to more specifically customize a model against a particular use case?
And so I think you’re going to want sort of quasi researcher machine learning engineer types as part of pretty much every team because fine-tuning a model is just going to be part of the core workflow for building most products. So that’s one change that maybe you’re starting to see at foundation model companies that will propagate out to more teams over time.
Lenny Rachitsky: I’m curious if there’s a concrete example that makes that real, and I’ll share one that comes to mind as you talk, which is, when you look at Cursor and Windsurf, something I learned from those founders is that they use a Sonnet, but then they also have a bunch of custom models that help along the edges that make the specific experience that’s not just generating code even better like auto-complete and looking ahead to where things are going. So is that one or any other examples of which you… What is a fine-tuned model? Do you think teams will be building with these researchers on their teams?
Kevin Weil: Yeah. I mean, so when you’re a model, you’re basically giving the model a bunch of examples of the kinds of things you want it to be better at. So it’s, “Here’s a problem, here’s a good answer. Here’s a problem, here’s a good answer,” Or, “Here’s a question, here’s a good answer times a thousand or 10,000.” And suddenly you’re teaching the model to be much better than it was out of the gate at that particular thing. We use it everywhere internally. We use ensembles of models much more internally than people might think. So it’s not, “I have 10 different problems. I’ll just ask baseline GPT four oh about a bunch of these things.” If we have 10 different problems, we might solve them using 20 different model calls, some of which are using specialized fine-tuned models, they’re using models of different sizes because maybe you have different latency requirements or cost requirements for different questions.
They are probably using custom prompts for each one. Basically you want to teach the model to be really good at… You want to break the problem down into more specific tasks versus some broader set of high level tasks. And then you can use models very specifically to get very good at each individual thing. And then you have an ensemble that tackles the whole thing. I think a lot of good companies are doing that today. I still see a lot of companies giving the model single, generic, broad problems versus breaking the problem down, and I think there will be more breaking the problem down using specific models for specific things, including fine tuning.
Lenny Rachitsky: And so in your case, because this is really interesting, is that you’re using different levels of Chat GPT, like a 1 0 3 and stuff that’s earlier because it’s cheaper.
Kevin Weil: There’ll be parts of our internal stack. I’ll give you an example. Customer support, with 400 plus million weekly active users, we get a lot of inbound tickets. I don’t know how many customer support folks we have, but it’s not very many, 30, 40, I’m not sure, way smaller than you would have at any comparable company, and it’s because we’ve automated a lot of our flows. We’ve got most questions using our internal resources, knowledge base, guidelines for how we answer questions, what kind of personality, et cetera. You can teach the model those things and then have it do a lot of its answers automatically, or where it doesn’t have the full confidence to answer a particular question, it can still suggest an answer, request a human to look at it and then that human’s answer actually is its own sort of fine tuning data for the model. You’re telling it the right answer in a particular case.
We’re using… At various places. Some of these places, you want a little bit more reasoning, is not super latency sensitive, so you want a little more reasoning, and we’ll use one of our O series models. In other places, you want a quick check on something and so you’re fine to use four oh mini, which is super fast and super cheap. In general, it’s like specific models for specific purposes and then you ensemble them together to solve problems. By the way, again, not unlike how we as humans solve problems, a company is arguably an ensemble of models that have all been fine tuned based on what we studied in college and what we have learned over the course of our careers. We’ve all been fine tuned to have different sets of skills and you group them together in different configurations and the output of the ensemble is much better than the output of any one individual.
Lenny Rachitsky: Kevin, you’re blowing my mind. That sounds exactly correct. And also, different people, you pay them less, they cost less to talk to, some people take a long time to answer, some people hallucinating. This is…
Kevin Weil: I’m telling you. This is a mental model but really does work in thinking…
Lenny Rachitsky: Oh, right. Yeah. This is great. Some people are visual, they want to dry out their thinking, some people want to talk word cell. Wow, this is a really good metaphor. So again, coming back to your advice here because I love that we circled back to it, you’re finding a really good way to think about how to design great AI experiences and LMs, I guess, specifically is think about how a person would do this.
Kevin Weil: Well, it’s maybe not always the answer is to think about how a person would do it, but sometimes to gain intuition for how you might solve a problem, you think about what an equivalent human would do in those situations and use that to at least gain a different perspective on the problem.
Lenny Rachitsky: Wow, this is great.
Kevin Weil: Because some of this really is talking to a model. There’s a lot of prior art because we talk to other humans all the time and encounter them in all sorts of different situations, and so there’s a lot to learn from that.
Lenny Rachitsky: Okay, so speaking of humans, I want to chat about the future a little bit. So you have three kids, and a community member asked me this hilarious question that I think it’s something a lot of people are thinking about. So this is Patrick [inaudible 01:04:47]. I worked with him at Airbnb. He says ask what he’s encouraging his kids to learn to prepare for the future. I’m worried my 6-year-old by the year 2036 will face a lot of competition trying to get into the top roofing or plumbing programs and need a backup plan.
Kevin Weil: That’s funny. So our kids, we have a 10 year old and eight year old twins, so they’re still pretty young. It’s amazing how AI native they are. It’s completely normal to them that there are self-driving cars. That they can talk to AI all day long. They have full conversations with Chat GPT and Alexa and everything else. I don’t know, who knows what the future holds? I think things like coding skills are going to be relevant for a long time, who knows? But I think if you teach your kids to be curious, to be independent, to be self-confident, you teach them how to think, I don’t know what the future holds, but I think that those are going to be skills that are going to be important in any configuration of the future. And so it’s not like we have all the answers, but that’s how Elizabeth and I think about our kids.
Lenny Rachitsky: And do you find that AI… There’s a lot of talk about AI tutoring. Is that something you guys are doing? I know they’re using Chat GPT, I love all the photos you post where they’re playing with prompts and stuff, but I guess is there anything there you’re experimenting with or you think is going to become really important?
Kevin Weil: This is something that… It’s maybe the most important thing that AI could do. Maybe that’s a grand statement. There are lots of important things that AI can do, including speeding up the pace of fundamental science research and discovery, which maybe is actually the most important thing AI can do. But one of the most important things would be personalized tutoring. And it kind of blows my mind that there is still… I know there are a bunch of good products out there. Khan Academy does great things. They’re a wonderful partner of ours. Vinod Khosla has a non-profit that’s doing some really interesting stuff in this space and is making an impact. But I’m kind of surprised that there isn’t a 2 billion kid AI personalized tutoring thing because the models are good enough to do it now, and every study out there that’s ever been done seems to show that when you have… Like, education is still important, but when you combine that with personalized tutoring, you get multiple standard deviation improvements in learning speed.
And so it’s uncontroversial, it’s good for kids, it’s free. Chat GPT is free, you don’t need to pay, and the models are good enough. It still just kind of blows my mind that there isn’t something amazing out there that our kids are using and your future kids are using, and people in all sorts of places around the world that aren’t as lucky as our kids to be able to have this sort of built-in, solid education. Again, Chat GPT is free. People have Android devices everywhere. I really just think this could change the world and I’m surprised it doesn’t exist and I want it to exist.
Lenny Rachitsky: This kind of touches on something I want to spend a little time on, which is a lot of people also worry a lot about AI, where it’s going, they worry about jobs it’s going to take, they worry about the super intelligence squashing humanity in the future. What’s your perspective on that and just the optimistic case that I think people need to hear?
Kevin Weil: I mean, I’m a big technology optimist. I think if you look over the last 200 years, maybe more, technology has driven a lot of the advancements that have made us the world and the society that we are today. It drives economic advancements, it drives geopolitical advancements, quality of life, longevity advancement. I mean, technology’s at the root of just about everything, so I think there are very few examples where this is anything but a great thing over the longer term. That doesn’t mean that there aren’t…
… a great thing over the longer term. That doesn’t mean that there aren’t temporary dislocations or where there aren’t individuals that are impacted, and that matters too. So it can’t just be that the average is good. You’ve got to also think about how you take care of each individual person as best you can.
It is something that we think a lot about and as we work with the administration, as we work with policy, we try and help wherever we can. We do a lot with education. One of the benefits here is that ChatGPT is also perhaps the best reskilling app you could possibly want. It knows a lot of things. It can teach you a lot of things if you’re interested in learning new things.
These are very real issues. I’m super optimistic about the long run, and we’re going to need to do everything we can as a society to ensure that we make this transition as graceful and as well-supported as we can.
Lenny Rachitsky: To give people a sense of where things might be going. That’s a big question in a lot of people’s minds. So someone asked this question that I love, which is, “AI is already changing, creative work in a lot of different ways, writing and design and coding, what do you think is the next big leap? What should we be thinking is the next big leap in AI-assisted creativity specifically, and then just broadly, where do you think things are going to be going in the next few years?”
Kevin Weil: Yeah. This is also an area where I’m a big optimist. If you look at Sora, for example. I mean we talked about ImageGen earlier and the absolute fount of creativity that people are putting across Twitter and Instagram and other places. I am the world’s worst artist like the worst. Maybe the only thing I’m worse at than art is singing. Give me a pencil and a pad of paper and I can’t draw better than our eight-year-old. But give me ImageGen and I can think some creative thoughts and put something into the model and suddenly have output that I couldn’t have possibly done myself. That’s pretty cool.
Even you look at folks that are really talented. I was talking to a director recently about Sora, someone who’s directed films that we would all know, and he was saying, for a film that he’s doing, take the example of some sort of sci-fi-ish, think of Star Wars, and you’ve got some scene where there’s a plane zooming into some Death Star-like thing. And so you’ve got the plane looking at the whole planet, and then you want to cut to a scene where the plane’s kind of at the ground level, and all of a sudden you see the city and everything else. How are we going to manage that cut scene? And that transition?
And he was saying, “In the world of two years ago, I would have paid a 3D effects company a hundred grand and they would’ve taken a month, and they would’ve produced two versions of this cut scene for me. And I would’ve evaluated them. We would’ve chosen one, because what are you going to do? Pay another 50 grand and wait another month. And we would’ve just gone with it. And it would be fine. Movies are great. I love them. And there’ve been…”
Obviously, we can do great things with the technology that we’ve had, but you now look at what you can do with Sora. And his point was, “Now, I can use Sora, our video model, and I can get 50 different variations of this cut scene just me brainstorming into a prompt and the model brainstorming a little bit with me. I’ve got 50 different versions. And then of course, I can iterate off of those and refine them and take different ideas. And now I’m still going to go to that 3D effects studio to produce the final one, but I’m going to go having brainstormed and had a much more creative approach with an outcome that’s much better. And I did that assisted by AI.”
My personal view on creativity in general is that it’s no one’s going to… You don’t type into Sora like, “Make me a great movie.” It requires creativity and ingenuity, and all these things, but it can help you explore more. It can help you get to a better final result. So, again, I tend to be an optimist in most things, but actually, I think there’s a very good story here.
Lenny Rachitsky: I know Sam Altman, I think it was him who tweeted recently, the creative writing piece that you guys are working on where it’s… He is very bad at writing creative stuff, and he shared an example where it’s actually really good. I imagine that’s another area of investment.
Kevin Weil: Yeah, there’s some exciting stuff happening internally with some new research techniques. We’ll have more to say about that at some point. But yeah, Sam sometimes likes to show off some of the stuff that’s coming, which is smart. By the way, it’s very indicative of this iterative deployment philosophy. We don’t have some breakthrough and keep it to ourselves forever, and then bestow it upon the world someday. We kind of just talk about the things we’re working on and share when we can and launch early and often, and then iterate in public. I really like that philosophy.
Lenny Rachitsky: I love all these hints that a few things coming. I know you can’t say too much. You talked about how there might be a coding leap coming in the near future maybe by the time this comes out. Is there anything else people should be thinking about, might be coming in the near future? Any things you can tease that are interesting? Exciting?
Kevin Weil: Man, this hasn’t been enough for you?
Lenny Rachitsky: Only everything is getting better every day.
Kevin Weil: Yeah. I’m like, man, I hope we get some of this stuff out before the episode launches so-
Lenny Rachitsky: This is your new timebox.
Kevin Weil: … I don’t piss people off. The amazing thing to me is we were talking earlier about how far models have come in just a couple of years. If you went back to GPT-3, you’d be disgusted by how bad it was, even though Lenny of two years ago was mind-blown by how good these were. And for a long time, we were iterating every six to nine months on a new GPT model. It was like GPT-3, GPT-3.5, 4, and now with this o-series of reasoning models, we’re moving even faster. Every roughly three months, maybe four months, there’s a new o-series model, and each of them is a step up in capability.
And so the capabilities of these models are increasing at a massive pace. They’re also getting cheaper as they scale. You look at where we were even a couple of years ago. I think the original, I don’t know, what was it, GPT-3.5 or something was like 100 x the cost of GPT-4o mini today in the API. A couple of years, you’ve gone down two orders of magnitude in cost for much more intelligence. And so I don’t know where there’s another series of trends like that in the world. Models are getting smarter, they’re getting faster, they’re getting cheaper, and they’re getting safer too. They hallucinate less every iteration.
And so the Morse Law and transistors becoming ubiquitous. That was a law around doubling the number of transistors on a chip every 18 months. If you’re talking about something where you’re getting 10 x every year, that’s a massively steeper exponential. And it tells us that the future is going to be very different than today. The thing I try and remind myself is, the AI models that you’re using today is the worst AI model you will ever use for the rest of your life. And when you actually get that in your head, it’s kind of wild.
Lenny Rachitsky: I was going to actually say the same thing, and that’s the thing that always sticks with me when I watch this thing. You’re talking about Sora, and I imagine many people hearing that are like, “No, no. It’s not actually ready. It’s not good enough. It’s not going to be as good as a movie I see in the theater.” But the point is what you just made that this is the worst it’s going to be. It will only get better.
Kevin Weil: Yeah, model maximalism. Just keep building for the capabilities that are almost there, and the model’s going to catch up and be amazing.
Lenny Rachitsky: Escape to where the puck is going to be.
Kevin Weil: Yeah.
Lenny Rachitsky: This reminds me, I was just using… I was duplifying everything the other day and I was just like, “What is taking so long.”
Kevin Weil: As one does.
Lenny Rachitsky: Just like cut… What was that?
Kevin Weil: I said, as one does.
Lenny Rachitsky: As one does these days. I was just like, “It’s taking a minute to generate this image of my family in this amazing way.” Come on, what’s taking so long. You just get so used to magic happening in front of you.
Kevin Weil: Yeah, totally.
Lenny Rachitsky: Okay, final question. This is going to go in a completely different direction. A lot of people asked about this. So famously, you led this project at Facebook called Libra, which is now called Novi. A lot of people always wondered, “What happened there? That was a really cool idea.” I know some people have a sense there’s regulation challenges, things like that. I don’t know if you’ve talked about this much. So I guess, could you just give people a brief summary of just what is Libra? This project you working on, and just what happened, and how you feel about it?
Kevin Weil: Yeah. I mean, David Marcus led it, and I happily work for him and with him. I think he’s a visionary and also a mentor and a friend. Honestly, Libra is probably the biggest disappointment of my career. When I think about the problems we were solving, which are very real problems. If you look at, for example, the remittance space, people sending money to family members in other countries, it is maybe… I mean it’s incredibly regressive, right? People that don’t have the money to spend are having to pay 20% to send money home to their family. So outrageous fees, it takes multiple days, you have to go then pick up cash from… It’s all bad.
And here we are with 3 billion people using WhatsApp all over the world, talking to each other every day, especially friends and family, and exactly the kind of people who’d send money to each other. Why can’t you send money as immediately, as cheaply, as simply as you send a text message? It is one of those things when you sit back and think about it, that should just exist. And that was what we set out to try and do.
Now, I don’t think we played all of our cards perfectly. If I could go back and do things, there are a bunch of things I would do differently.
We tried to get it all at once. We tried to launch a new blockchain. It was a basket of currencies originally. It was integration into WhatsApp and Messenger, and I think the whole world kind of went like, “Oh my God, that’s a lot of change at once.” And it happened also to be at the time that Facebook was at the absolute nadir of its reputation. And so that didn’t help. It was also not the Messenger that people wanted for this kind of change. We knew all that going in, but we went for it.
I think there are a bunch of ways that we could do that that would’ve introduced the change a little bit more gently, maybe still gotten to that same outcome, but fewer new things at once and introduced the new things one at a time. Who knows? Those were decisions we made together. So we all own them. Certainly, I own them. But it fundamentally disappoints me that this doesn’t exist in the world today because the world would be a better place if we’d been able to ship that product. I would be able to send you 50 cents in WhatsApp for free. It would settle instantly. Everybody would have a balance in their WhatsApp account. We’d be transact… I mean, it should exist.
I don’t know. To be honest, the current administration is super friendly to crypto. Facebook’s reputation, Meta’s reputation is in a very different place. Maybe they should go build it now.
Lenny Rachitsky: I was looking at the history of it, and apparently, they sold the tech to some private equity company for 200 million bucks.
Kevin Weil: Yeah, yeah, and-
Lenny Rachitsky: They had to buy it back.
Kevin Weil: There are a couple of current blockchains that are built on the tech because the tech was open-sourced from the beginning. Aptos and Mistin are two companies that are built off of this tech. So at least all of the work that we did, did not die and lives on in these two companies, and they’re both doing really well. But still, we should be able to send each other money in WhatsApp, and we can’t today.
Lenny Rachitsky: Hear, hear. Well, thanks for sharing that story, Kevin. Is there anything else you want to share or maybe a last negative advice or insight before we get to our very exciting lightning round?
Kevin Weil: Ooh, the lightning round. Let’s just go do that.
Lenny Rachitsky: Let’s do it. With that, Kevin, we reached our very exciting lightning round. Are you ready?
Kevin Weil: Yeah.
Lenny Rachitsky: Let’s do it. Okay. What are two or three books that you find yourself recommending most to other people?
Kevin Weil: Co-Intelligence by Ethan Mollick, a really good book about AI and how to use it in your daily life as a student, as a teacher. He’s super thoughtful. Also, by the way, a very good follow on Twitter. The Accidental Superpower by Peter Zion. Very good if you’re interested in geopolitics and the forces that sort of shape the dynamics happening. And then I really enjoyed Cable Cowboy, I don’t know who the author is, but the biography of John Malone. Just fascinating. If you like business, especially if you want to get into… I mean the man was an incredible dealmaker and shaped a lot of the modern cable industry. So that was a good biography.
Lenny Rachitsky: These are all first-time mentions, which is always a great,
Kevin Weil: Oh, good.
Lenny Rachitsky: Next question. Do you have a favorite recent movie or TV show that you really enjoyed?
Kevin Weil: I wish I had time to watch a TV show, so I’m-
Lenny Rachitsky: Just Sora videos.
Kevin Weil: Yeah, right. I don’t know. When I was a kid, I read the Wheel of Time series and now Amazon has it as they’re in the third season of it, so I want to watch that. I haven’t yet. Top Gun 2 was an awesome movie. I think that’s no longer new.
Lenny Rachitsky: That shows when the last time you watched a movie was.
Kevin Weil: But I like the idea. I want more Americana. I want more being proud of being strong. And I thought Top Gun 2 did a really good job of that. Pride and patriotism, I think the US could use more of that.
Lenny Rachitsky: Is there a favorite product that you’ve recently discovered that you really love, other than your super intelligence internal tool that you all have access to? I’m just joking.
Kevin Weil: That’s right. Internal AGR.
Lenny Rachitsky: Yeah, that’s right.
Kevin Weil: Well, I think vibe coding with products like Windsurf is just super fun. I’m having a great time doing that. I still just love that our chief people officer vibe coded some tools. Maybe the other one is Waymo. Every chance I get, I’ll take a Waymo. It’s just a better way of riding, and it still feels like the future. So they’ve done an amazing job.
Lenny Rachitsky: That’s awesome. By the way, I had the founder of Windsurf on the podcast. It might come out before this or after this. And also Cursor’s CEO is coming on the podcast either before or after this.
Kevin Weil: Oh, cool. I have a ton of respect for what those guys are doing. Those are awesome products.
Lenny Rachitsky: Just changing the way everyone builds product. No big deal.
Kevin Weil: Yeah.
Lenny Rachitsky: A couple more questions. Do you have a favorite life motto that you often repeat yourself, find really useful in work or in life?
Kevin Weil: Yeah. So actually, this is interestingly enough, it is more of a philosophy, but then I thought Zuck encapsulated it one time on a Facebook earnings call. So I actually had this made into a poster. It sits in my room. But somebody was asking Mark. This is literally on an earnings call, so it’s like an analyst on an earnings call asking him. It was some quarter when Facebook had grown a lot. This was back in the 20 teens sometime, I think. But he’s like, “So what did you do? What was it that you launched? What was the one thing that drove all this growth for you?” And he said something to the effect of, “Sometimes it’s not any one thing, it’s just good work consistently over a long period of time.” And that’s always stuck with me.
And I think it is. I mean I run ultra marathons. It’s like it’s just about grinding. I think people too often look for the silver bullet when a lot of life and a lot of excellence is actually showing up day in and day out, doing good work, getting a little bit better every single day, and you may not notice it over a week or even a month. And a lot of people then kind of get dismayed and stop. But actually, you keep doing it. The gains keep compounding. And over the course of a year, two years, five years, it adds up like crazy. So good work consistently over a long period of time.
Lenny Rachitsky: I love that. I got to make a poster of this now. That is-
Kevin Weil: We’ll get you one.
Lenny Rachitsky: I so resonate with that. Okay, I’ll take it. That is so good. Okay, final question. I’m going to ask if you have any prompting tricks, and I’m going to set it up first. But think about if you have a trick that you could recommend to people for prompting LLMs better. I had a guest, Alex Komorowski, come on the podcast. He’s from Stripe and writes his weekly reflections on what’s happening in the world. A lot of them are AI-related.
And he once described an LLM as a zip file of all human knowledge. All the answers are in there, and you just need to figure out the right question to ask to get the answer to every problem basically. And so it just reminded me how important prompt engineering is and knowing how to prompt well. You’re constantly prompting ChatGPT. What’s one tip, one trick that you found to be helpful in helping you get what you want?
Kevin Weil: Well, I’ll say, first of all, I want to kill the idea that you have to be a good prompt engineer. I think if we do our jobs, that stops being true. It’s just one of those sharp edges of models that experts can learn. But then, just over time, you shouldn’t need to know all that. The same way you used to have to get deep into, “What’s your storage engine in MySQL? Are you using InnoDB 4.1?” There’s still use cases for that if you’re at the deep edge of MySQL performance. But most people don’t need to care. And you shouldn’t need to care about minute details of prompting if AI is really going to become broadly adopted.
But today, we’re not totally there. I think by the way, we are making progress there. I think there is less prompt engineering than there had to be before. But in line with some of the fine-tuning stuff I was talking about and the importance of giving examples, you can do effectively poor man’s fine-tuning by including examples in your prompt of the kinds of things that you might want and a good answer. So like, “Here’s an example and here’s a good answer. Here’s an example, and here’s a good answer. Now, go solve this problem for me.” And the model really will listen and learn from that.
Not as well as if you do a full fine-tune, but much more than if you don’t provide any examples. And I think people don’t do that often enough.
Lenny Rachitsky: That’s awesome. One tip that I heard, I’m curious if this works is you tell it, “This is very, very important to my career.” Make it really understand like, “Someone will die if you don’t answer me correctly.” Does that work?
Kevin Weil: It’s really weird. There’s probably a good explanation for this. But you can also say things. So, yes, I think there is some validity to that. You can also say things like, “I want you to be Einstein. Now, answer this physics problem for me,” or, “You are the world’s greatest marketer, the world’s greatest brand marketer. Now here’s a naming question.” And there is something where it sort of shifts the model into a certain mindset that can actually be really positive.
Lenny Rachitsky: I use that tip all the time actually. I always… When I’m coming up with questions for interviews and I use it occasionally to come up with things I haven’t thought of, I actually type, “You’re the world’s best podcast interviewer.”
Kevin Weil: Right.
Lenny Rachitsky: I have Kevin Weil coming on the pod… Yeah, it actually works.
Kevin Weil: By the way, back to our other point that we made a few times. You do do that sometimes with people. You sort of put them… You frame things, you get them into a certain mindset, and the answer is completely different. So I think there are human analogs of this one more time.
Lenny Rachitsky: Kevin, this was incredible. I was just thinking about a way to end this. The way I feel like… I feel like not only are you at the cutting edge of the future. You and the team are kind of actually the edge that is creating the future. And so it’s a real honor to have you on here and to talk to you and to hear where you think things are going and what we need to be thinking about, so thank you for being here, Kevin.
Kevin Weil: Oh, thank you so much for having me. I get to work with the world’s best team, and all credit to them, but really appreciate you having me on. It’s been super fun.
Lenny Rachitsky: I forgot to ask you the two final questions. Where can folks find you if they want to reach out, and how can listeners be useful to you?
Kevin Weil: I am @kevinweil, K-E-V-I-N-W-E-I-L on pretty much every platform. I’m still a Twitter DAU after all these years. I guess an X DAU, LinkedIn, wherever. And I think the thing I would love from people, give me feedback. People are using ChatGPT. Tell me where it’s working really well for you and where you want us to double down. Tell me where it’s failing. I’m very active and engaged on Twitter. I love hearing from people, what’s working and what’s not, so don’t be shy.
Lenny Rachitsky: And I learned following you helps you figure out all the stuff that you’re launching. You share all the things that are going out every day, or week, month, so that’s also a benefit. And by the way, 400 million weekly active users all emailing you feedback. Here we go.
Kevin Weil: Yes, let’s do it.
Lenny Rachitsky: It’s going to work out great. Okay. Well, thank you, Kevin. Thanks for being here.
Kevin Weil: All right, man, thanks so much. See you soon.
Lenny Rachitsky: Bye, everyone. Thank you so much for listening. If you found this valuable, you can subscribe to the show on Apple Podcasts, Spotify, or your favorite podcast app. Also, please consider giving us a rating or leaving a review as that really helps other listeners find the podcast. You can find all past episodes or learn more about the show at lennyspodcast.com. See you in the next episode.
Glossary
| English | 中文 |
|---|---|
| AGI (Artificial General Intelligence) | 通用人工智能(AGI) |
| Airbnb | Airbnb |
| Alex Komorowski | Alex Komorowski |
| Alexa | Alexa |
| alignment | alignment(保持一致) |
| Andrej Karpathy | Andrej Karpathy |
| API | API |
| Aptos | Aptos |
| Black Product Managers Network | Black Product Managers Network |
| blockchain | 区块链 |
| bottoms-up | 自下而上 |
| brainstorming | 头脑风暴 |
| chain of thought | 思维链 |
| Chief People Officer | 首席人才官 |
| COBOL | COBOL |
| CPO (Chief Product Officer) | 首席产品官 |
| Dario | Dario |
| DAU (Daily Active User) | 日活用户(DAU) |
| David Marcus | David Marcus |
| deep research | 深度研究(deep research) |
| DeepSeek | DeepSeek |
| Elizabeth | Elizabeth |
| ENG (Engineering) | 工程(ENG) |
| ensemble | 集成(ensemble) |
| EQ (Emotional Quotient) | 情商(EQ) |
| Ethan Mollick | Ethan Mollick |
| Ev Williams | Ev Williams |
| evals | 评估(evals) |
| fine-tuned | 微调 |
| foundation model | 基础模型 |
| Ghiblifications | 吉卜力化 |
| hallucinate | 幻觉 |
| hero use cases | 核心用例 |
| high agency | 高能动性 |
| hill climbing | 爬坡优化 |
| ImageGen | ImageGen |
| InnoDB | InnoDB |
| iterative deployment | 迭代部署 |
| John Malone | John Malone |
| Julia | Julia |
| Khan Academy | Khan Academy |
| Lenny & Friends Summit | Lenny & Friends Summit |
| Libra | Libra |
| LLM (Large Language Model) | 大语言模型(LLM) |
| Mike Krieger | Mike Krieger |
| Mistin | Mistin |
| Moats | 护城河 |
| model maximalism | 模型最大化主义 |
| Moore’s Law | 摩尔定律 |
| MySQL | MySQL |
| Novi | Novi |
| OneSchema | OneSchema |
| Patrick | Patrick |
| Peter Zeihan | Peter Zeihan |
| PII (Personally Identifiable Information) | 个人身份信息(PII) |
| Planet | Planet |
| PM (Product Manager) | 产品经理(PM) |
| prompt engineering | 提示工程 |
| proof of concept | 概念验证 |
| remittance | 汇款 |
| Sam | Sam |
| Sam Altman | Sam Altman |
| Sarah Guo | Sarah Guo |
| silver bullet | 银弹 |
| Sora | Sora |
| standard deviation | 标准差 |
| Stories | Stories |
| Strava | Strava |
| super intelligence | 超级智能 |
| system one | 系统一 |
| Tahoe | 太浩湖(Tahoe) |
| the Nature Conservancy | 大自然保护协会(the Nature Conservancy) |
| ultra marathons | 超级马拉松 |
| vibe coding | 感觉编程(vibe coding) |
| Vinod | Vinod |
| Vinod Khosla | Vinod Khosla |
| WAU (Weekly Active User) | 周活用户(WAU) |
| Waymo | Waymo |
| Zuck | Zuck |
Reformatted by reformat_english.py