AI 在新产品开发中的角色 | Ryan J. Salva(GitHub 产品副总裁)
文字记录
Ryan J. Salva (00:00:00): 我们实际上为 GitHub 的公开代码创建了一个快照,用于我们所说的北极代码库(Arctic Code Vault),对吧?简单来说,这东西在芬兰极北之地,那里有一个种子库。我们就想,种子库的真正目的是保存世界植物的种子多样性,以防某种疯狂的自然或人为灾难。但对世界来说,另一个非常重要的资产是我们的代码,我们的开源软件。这其实代表了现代世界大量的集体成果——当然包括软件,某种程度上也包括智能,对吧?
Ryan J. Salva (00:00:44): 我们把公开仓库的这个快照存放在银胶片上,将在北极代码库中保存数千年。然后,我们拿着同样的数据快照去找了 OpenAI 的朋友们,想看看,好吧,基于公开代码构建的大语言模型能做些什么?事实证明,我们能做一些相当酷的事情。
Lenny (00:01:13): Ryan Salva 是 GitHub 的产品副总裁,在众多项目中,他孵化并推出了 GitHub Copilot——在我看来,这是你会遇到的最具魔力的产品之一。如果你还没听说过,它使用 OpenAI 的机器学习引擎,在工程师编码时实时自动补全代码。我认为这是我们一段时间以来在产品开发和生产力方面最大的突破之一。我一直很好奇这样一个大型产品是如何起步、获得内部支持、积聚势头、然后发布的,尤其是在微软这样的大公司里,尤其是 Copilot 这样的产品,它还面临着令人意外的伦理挑战、规模化挑战和商业模式问题。
Lenny (00:01:55): 而且,这个产品出自 GitHub 的一个小型 R&D 团队,听听 Ryan 对于在大公司内部孵化大胆项目、然后从原型扩展到微软规模,学到了什么,真的非常有趣。Ryan 作为一个人也非常有意思。他的背景非常非传统。我很期待你们听到这段对话。那么,我为大家请出 Ryan Salva。如果你在搭建分析技术栈,却还没有使用 Amplitude,那你在干什么?Amplitude 是全球最受欢迎的分析解决方案,既有 Shopify、Instacart 和 Atlassian 这样的大公司在用,也有大多数科技初创公司在用。
Lenny (00:02:38): Amplitude 拥有你所需的一切,包括强大且完全自助式的分析产品、实验平台,甚至还有一个集成的客户数据平台,帮助你以前所未有的方式了解你的用户。为你的团队提供自助式的产品数据,理解你的用户,提升转化率,增加参与度、增长和收入。抛弃那些虚荣指标,相信你的数据,更聪明地工作,增长你的业务。免费试用 Amplitude,只需访问 Amplitude.com 即可开始。本期节目由 Athletic Greens 赞助。我基本上在每个我听的播客上都能听到 AG1,比如 Tim Ferriss 和 Lex Fridman 的节目。
Lenny (00:03:20): 今年早些时候我终于试了一下,它很快就成了我早晨日常的核心部分,尤其是在我需要深度写作或录制像这样的播客的日子里。关于 AG1,我喜欢三点。第一,一小勺溶在水里,你就能吸收 75 种维生素、矿物质、益生菌和适应原。我有点把它当作我营养上的安全网,以防饮食中遗漏了什么。第二,他们把 AG1 当作软件产品来对待。据说他们已经迭代到第 52 版了,而且不断根据最新的科学、研究和内部测试来改进它。
Lenny (00:03:59): 第三,这是我每天可以做的一件简单的事来照顾自己。现在,是时候重拾你的健康,用便捷的日常营养来武装你的免疫系统了。每天只需一勺加一杯水,就这样。不需要一把又一把的药片和补剂来关注你的健康。让它变得简单。Athletic Greens 将为你的首次购买赠送一年用量的免疫支持维生素 D 和五份免费旅行装。你只需要访问 AthleticGreens.com/lenny。再说一次,AthleticGreens.com/lenny,掌控你的健康,获取最极致的日常营养保障。Ryan,欢迎来到播客。
Ryan J. Salva (00:04:42): 谢谢你,我的朋友。我真的非常高兴能来到这里。很高兴能和你一起畅聊一会儿。
Lenny (00:04:48): 我也很兴奋。我们在录制前简单聊了几句,你提到了一些你的背景,对于领导 GitHub 产品的人来说真的非常独特。能不能分享一下你在学校学的是什么,然后简单说说那如何引导你进入了产品管理这个行业?
Ryan J. Salva (00:05:07): 哇!你要让我一直回想到上学的时候。好吧。上学的时候,我不是那种经典的软件工程或计算机科学专业。比较偏门的回答是——美学哲学和 20 世纪批判理论。通俗一点的回答是哲学和英语。但核心其实一直是关于我们作为人如何彼此沟通,如何通过创造力表达自己。自从人类有史以来就在洞穴墙壁上绘画、围着篝火跳舞、写故事和小说、互相歌唱。我就是对我们如何向他人传达自己对世界的体验非常感兴趣。
Ryan J. Salva (00:05:58): 我进入软件开发和产品管理领域,是因为我想投身于创造力的行业。我们正处于人类历史上一个非常非常独特的时刻,我们实际上在见证一种全新媒介的诞生。软件开发及其所创造的世界,在大概 50、60 年前还是不可能的。如果我出生在 1700 年代,我可能就是那个制造新颜色颜料和画笔的人,但我不是。我出生在 21 世纪之交,所以我从事工程工作。
Ryan J. Salva (00:06:39): 这就是我过去大约二十多年一直在做的事情,有时在创业公司工作,有些是别人的,有些是我自己的,在微软大约十年,现在在 GitHub 三年了。
Lenny (00:06:51): 厉害。我不知道发明新颜料颜色还能是一份工作。你会调出什么颜色吗?
Ryan J. Salva (00:06:59): 天哪!碰巧黄色……我想如果做那一行的话,我会调一种非常鲜艳的金色阳光黄。
Lenny (00:07:13): 非常积极、快乐。我喜欢。这可以成为 GitHub 的新品牌色。现在你是 GitHub 的产品副总裁。在此之前,你是微软的资深产品负责人,我一直很好奇这种转型是如何发生的——从一个大公司的长期资深产品负责人,转而接手一个被收购后的公司。我很好奇是什么让你决定迈出这一步,以及在推动这个转型、理清各种安排的过程中,有没有什么有趣的故事?
Ryan J. Salva (00:07:45): 好问题。正如我所说,我在微软时负责开发工具和开发者服务。具体来说,我负责他们所谓的 One Engineering System 的产品。它本质上是所有微软产品(比如 Windows、Office、Azure 等)共享的开发者基础设施,以及微软的 DevOps 解决方案 Azure DevOps。收购发生时,很显然,围绕开发工具和服务的大量能量、关注和创新都将围绕 GitHub 展开。因为那里正是社区在创造的地方。
Ryan J. Salva (00:08:34): 那里是人们学习的地方,是开发者社区心智资源的集中地。就像我说的,我很有动力。我关心的是帮助人们去创造。我很清楚,没有哪个地方能比在 GitHub 产生更大的影响力。我真的抓住了那个机会,从微软一个更偏企业导向的内部角色,转到可以参与各种项目的地方——从 Copilot 这样的 AI 技术,到 Codespaces 这样的云端托管开发环境,再到仓库——地球上几乎每个开发者每年都会以某种方式参与 GitHub 仓库。
Ryan J. Salva (00:09:28): 这就是我想要实现的:如何更紧密地连接社区,尤其是微软自身无法触及的社区。做出这个决定,也不仅仅是因为 GitHub 当时是什么样,而是 GitHub 未来可以成为什么样。GitHub 有超过十年、将近十五年的历史,将开发者聚集在一起通过仓库协作编写代码。但在过去几年里,我们确实大幅扩展了产品组合,覆盖了开发者生命周期中的许多不同环节。
Ryan J. Salva (00:10:13): 刚才我提到了 Codespaces 和 Copilot,但还有用于 CI/CD 的 Actions,以及高级安全功能。作为开发者,我们远不止是存放代码的地方。整个工具链还有很多部分。而能有机会参与这么多 V1 产品,这本身就是创造——从零开始构建一个全新产品,推向市场,测试它,迭代它,并从社区反馈回来的能量中汲取养分。
Lenny (00:10:46): 太棒了。GitHub 确实充满活力。我想把大部分聊天时间花在一个你的团队帮助孵化和发布的产品上,那就是 GitHub Copilot。从旁观者的角度来看,它感觉是十几年来软件开发领域最大的进步之一,也许更久。它绝对是最具魔力的产品之一,而你的团队和你领导了 Copilot 的孵化和发布。
Lenny (00:11:15): 我很想把大部分时间用来聊这个。第一个问题……好的。我的第一个问题,针对那些不太了解 Copilot 的人来说就是:它是什么?你能简单描述一下 Copilot 是什么吗?
Ryan J. Salva (00:11:26): 当然可以。过去二十多年,开发者基本上只有简单的智能自动补全。你敲一个点号,就弹出下一个可能的变量名。它能帮你稍微快一点地写代码,有时也能帮你记住某个方法或函数的特定语法。Copilot 本质上就是把这种补全放大到许多行代码。它是多行自动补全,底层由一个叫 Codex 的 AI 模型驱动,这个模型是你可能熟悉的 GPT-3 的一个衍生版本。
Ryan J. Salva (00:12:15): 当你在编辑器里——可能是 VS Code,可能是 IntelliJ,也可能是 Vim——基本上当你打字时,Copilot 会提供建议,通常以一种斜体灰色文本呈现,而它推断出来的东西确实——如你所说——有点魔力。根据周围的变量、类名、方法名、你的注释,Copilot 推断你打算创建什么,然后希望做得相当不错,提供一个脚手架代码模板,你可以在此基础上发挥。我们发现开发者很喜欢它。他们真的很享受。他们甚至有点上瘾,因为它帮助他们保持心流。
Ryan J. Salva (00:13:08): 作为开发者,我们喜欢那种状态。我喜欢那种创造东西的状态,专注于某个产品、某段将交付给客户和用户的软件。而记住某个 API 的参数顺序,或者某个东西的特定语法是什么来着,又或者我得创建一堆模拟数据——星期几或者月份——这些都是苦力活。这不是在创造,只是打字。
Ryan J. Salva (00:13:47): Copilot 通过把所有这些信息带到编辑器里,帮助开发者保持心流,避免他们去查阅文档、看教程、上 Stack Overflow 找答案,更糟的情况是还得提问然后等回答。它把这一切都带进了编辑器,给开发者通常是多个建议供其选择,让他们挑选适合自己要解决的问题的正确方案。
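Ryan 提到的“模拟数据”(星期几、月份)这类苦力活,大致就是下面这种代码。这只是一个示意性的草图,使用 Python 标准库 `calendar`;函数名 `weekday_names` 和 `month_names` 是为说明而假设的,并不代表 Copilot 的实际输出。

```python
import calendar


def weekday_names():
    """Return the seven weekday names, Monday first (locale-dependent)."""
    return list(calendar.day_name)


def month_names():
    """Return the twelve month names, January first (locale-dependent)."""
    # calendar.month_name[0] is an empty placeholder string, so skip it.
    return list(calendar.month_name)[1:]


if __name__ == "__main__":
    print(weekday_names())
    print(month_names())
```

这类代码没有任何设计难度,只是需要记住标准库里有什么,正好属于访谈中所说的“只是打字”的那部分工作。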
最疯狂的使用故事
Lenny (00:14:21): 太棒了。我最好奇的——我们接下来也会花时间聊——就是这样一个产品在大公司里是怎么诞生的。不过在聊这个之前,有人用 Copilot 写代码最疯狂的故事是什么?我先分享一个。为了准备这次对话我看了些 YouTube 视频,有一个人——也许这是 AI 写代码的图灵测试——他用 Copilot 来居中 div。他说,“哇!它居中了。”还有另一个人,他是一个编程讲师。
Lenny (00:14:51): 他制作 YouTube 视频教人写代码,他说,“Copilot 直接就给你答案了,所以我没法那么容易做这些视频了。我得把它关掉,不然它就全剧透了。”你见过什么?
Ryan J. Salva (00:15:03): 这种故事太多了。我就分享几个最近听到的吧。我和一位开发者聊过……他其实是一位教育工作者,教孩子们写代码,基本上是高中生那个年纪,16 岁、15 岁左右。他的经验和我的经验一致:我们很多人学编程最好的方式不是做那些随意的练习,而是真正去构建有用的东西、解决实际问题。
Ryan J. Salva (00:15:41): 他的做法是把需要构建内部工具的中小企业和班级的学生匹配起来——大概六到八个学生一组——然后给这些学生 Copilot,说:“中小企业,这里有一组学生。去帮这家企业构建这个内部工具吧。”
Ryan J. Salva (00:16:08): Copilot 本质上就像在学生耳边低语,打个比方:“嘿,这个问题这样解决,这个这样做。”学生们不仅构建了企业所需的软件工具——然后可以把它放在简历和大学申请上——还通过使用这些工具来学习,而这些工具很可能在两三年后、随着 AI 渗透到我们的整个技术栈,会成为开发者工具链的核心组成部分。这是我最近听到的一个非常酷的故事。
Lenny (00:16:48): 这太酷了。我之前没想到教育这个角度——它让学习编程变得如此容易,而不仅仅是写代码。
Ryan J. Salva (00:16:56): 重点就在这里,Copilot 特别擅长的不仅是替你省去一些力气,很多时候……学习一门新语言是一回事,踏入一个你不一定熟悉的代码库又是另一回事,对吧?说真的,有时候连我自己半年前或一年前写的代码我都认不出来,感觉像在踏入一片新领地。但如果你需要修复一个你不常碰的应用里的 bug,涉足那个代码库本身就是一种学习,是在为那个代码库构建心智地图。
Ryan J. Salva (00:17:30): Copilot 在这里真正神奇的地方在于,AI 会收集你正在进入的应用的上下文。它能帮你构建心智地图、学习代码库,哪怕那个语言你已经很熟悉了。
Copilot 的起源故事
Lenny (00:17:47): 太棒了。回到 Copilot 的起点,聊聊它是怎么开始的——我一直很好奇,一个最终对大公司来说意义重大的项目是如何起步的,尤其是它如何积攒势能、如何获得内部认同,然后最终推出。你能谈谈这个想法最初的种子吗?比如它来自谁,谁最初有这个愿景,这个想法是怎么涌现并获得动力的,以至于你们投入了资源?
Ryan J. Salva (00:18:13): 哇,这个故事可长了,而且怎么说呢,取决于你的视角,可能觉得它曲折,也可能觉得令人兴奋。微软和 OpenAI 在大语言模型上合作了相当长一段时间了,涉及各种不同的实验和微软软件产品线的不同部分,同时也在帮助 OpenAI 提供所需的计算资源。训练这些模型需要海量的计算。它们主要都是大语言模型。大概两三年前吧,我们渐渐意识到,语言模型处理的不只是英语、西班牙语、德语、韩语、日语,还有 Python、JavaScript、Java、C# 和 Clojure。
Ryan J. Salva (00:19:07): 这些都是语言。事实上,从 AI 的角度看它们还挺不错的,因为它们的语义相对受限,对吧?Python 里能表达的“词汇”数量——我把“词汇”打上引号——比英语小得多,英语有各种不同的语法规则、名词、动词、形容词、副词。我们开始探索把代码引入这些大语言模型会是什么样子。我接触这件事的方式其实挺有趣的。微软和 OpenAI 有了这个想法。
OpenAI 几乎搞垮 GitHub
Ryan J. Salva (00:19:53): 当时我负责的团队之一是 GitHub 的基础设施团队,就是负责数据中心、可靠性、响应时间的那个团队。有一天我们注意到服务器被猛砸——真的是被猛砸——海量的 clone 请求涌进来。我们心想,“天哪!这是拒绝服务攻击吗?我们怎么应对?会发生什么?”我们很快就搞清楚了,其实那是 OpenAI。他们在克隆我们所有的仓库,从 GitHub 上抓取数据。我是说,这完全合规,但确实产生了实际影响。
Ryan J. Salva (00:20:33): 我们很快就介入并缓解了这个问题。没有出现可靠性方面的事故,但我们跟他们说:“嘿,各位,很酷,我们很喜欢。但让我们看看能不能用更负责任的方式把数据给你们,用一种更符合你们需求的打包方式。”恰好在那之前一年,我们为所谓的北极代码库创建了 GitHub 公开代码的快照。本质上,这个代码库在芬兰很靠北的地方,那里有一个种子库。我们想,你知道吗?种子库的目的是保存世界植物的多样性,以防某种疯狂的自然或人为灾难。
Ryan J. Salva (00:21:25): 但世界上另一个非常重要的资产是我们的代码、我们的开源。这实际上代表了现代世界大量的集体智慧——至少是软件,甚至可以说是智能。我们把公开仓库的快照存放在银胶片上,在北极代码库中保存数千年。然后我们把同样的数据快照带给了 OpenAI 的朋友们,想看看,好吧,基于公开代码构建的大语言模型能做些什么?
Ryan J. Salva (00:22:03): 事实证明我们可以做些相当酷的事情。就像翻译工具可以从英语翻到西班牙语、从西班牙语翻到德语一样,你也可以从英语翻到 Python,或者从 Python 翻到 C#。我们想,好的,这很酷。我们不仅能做翻译,还能做一些预测文本。我们大家对代码编辑器中的预测文本应该已经相当熟悉了,比如 IntelliSense。而且,你去你最喜欢的文字处理软件里,很可能也有某种预测文本功能。
用户体验的打磨
Ryan J. Salva (00:22:43): 我们开始尝试不同的用户体验,对吧?比如,我们要不要设计成你右键点击然后弹出一个小侧边栏,显示一堆你可能需要的不同选项?这还不错,因为它能给你完整的函数,但它偏离了光标,对吧?你必须……即使你没有切换到不同的窗口,你仍然需要切换到不同的面板,这本身就有点分散注意力。我们最终想到了内联自动补全这个方案。
Ryan J. Salva (00:23:20): 借助微软那边一些朋友的合作,我们和 Visual Studio Code 团队的伙伴们合作了,他们说,嘿,编辑器目前还没有支持多行自动补全的扩展点,但我们有个想法可以实现。我们反复尝试了实际的呈现方式。按键应该怎么设计?呈现层应该是什么样?灰色斜体文字似乎是表示这是临时内容的好方式。很早的时候,我们就确定了现在大多数开发者所体验到的 Copilot 用户体验。我想说那至少是 16 个月前的事了,14、16 个月前。从那以后,我们把它带给了开发者。
从构想到落地
Lenny (00:24:15): 再确认一下,你是说不到一年半前这个项目才真正开始,现在已经面向世界发布了,对吗?
Ryan J. Salva (00:24:26): 完全正确。完全正确。大概就是一年半前。
Lenny (00:24:30): 这太疯狂了。从 OpenAI 差点搞垮 GitHub 到那个节点之间,这段时间发生了什么?
Ryan J. Salva (00:24:38): 从 OpenAI 差点搞垮 GitHub 到我们真正确定用户体验之间的那段时期,坦白说,其中很大一部分是 OpenAI 一群非常聪明的研究人员在做实验,做着只有世界顶级 AI 研究人员才能做的事。他们做了大量实验,偶尔会要求更新数据集,把模型丢回来让我们试用和把玩。这些模型真的有上千个可以传入的参数。当你真正思考从 GPT-3 到 Codex,再到像 Copilot 这样的产品的过渡时,这不仅仅是模型的问题……
Ryan J. Salva (00:25:27): 创建模型是一回事,但接下来要弄清楚如何使用模型——你想调整哪些参数,你想在哪些方面做优化……性能就是一个很好的例子,对吧?当你在代码编辑器里,你并不希望敲敲敲,然后等一秒、两秒、三秒才得到一个建议,而你的全部目标是保持心流。我们会做实验,看多少毫秒是合适的量,让开发者不觉得自己被 Copilot 的建议打断了。
Lenny (00:26:06): 答案是什么?
Ryan J. Salva (00:26:09): 目前看来大概是 200 毫秒左右。取决于你在世界哪个地方,延迟会上下浮动一点。但最佳点似乎在 200 毫秒左右。
Lenny (00:26:20): 有用的信息。
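上面提到的约 200 毫秒延迟预算,可以用一个简单的超时机制来示意。以下只是一个假设性的草图:`fetch_suggestion` 是虚构的占位函数,用 `asyncio.sleep` 模拟模型调用的耗时,并非 Copilot 的真实接口;`LATENCY_BUDGET_S` 取自访谈中的数字。超出预算的建议直接丢弃,以免打断开发者的心流。

```python
import asyncio

LATENCY_BUDGET_S = 0.2  # roughly 200 ms, the figure quoted in the interview


async def fetch_suggestion(prefix: str, delay_s: float) -> str:
    """Hypothetical stand-in for a model call; sleeps to simulate latency."""
    await asyncio.sleep(delay_s)
    return prefix + " ..."


async def suggest_within_budget(prefix: str, delay_s: float,
                                budget_s: float = LATENCY_BUDGET_S):
    """Return the suggestion, or None if it does not arrive within the budget."""
    try:
        return await asyncio.wait_for(
            fetch_suggestion(prefix, delay_s), timeout=budget_s
        )
    except asyncio.TimeoutError:
        # Too slow: drop the suggestion rather than interrupt the developer.
        return None


if __name__ == "__main__":
    print(asyncio.run(suggest_within_budget("def add(a, b):", 0.05)))
    print(asyncio.run(suggest_within_budget("def add(a, b):", 0.50)))
```

实际产品如何处理超时并未在访谈中说明;这里选择“丢弃”只是最简单的一种策略。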
Ryan J. Salva (00:26:22): 我们还做了大量实验。不仅仅关乎模型本身,还关乎你给模型喂什么。你如何提示模型让它返回一个有用的响应?这就开启了一段我们称之为提示词精炼(prompt crafting)的实验之旅。
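“给模型喂什么”可以用下面的草图来理解。访谈只说明 Copilot 会参考周围的变量、类名、方法名和注释,并未公开具体做法;这里的 `build_prompt`、文件头格式和 `max_chars` 上限都是为说明而假设的。思路是把光标之前的代码拼成提示词,超长时保留离光标最近的部分。

```python
from typing import Sequence


def build_prompt(file_path: str, lines_before_cursor: Sequence[str],
                 max_chars: int = 2000) -> str:
    """Assemble a prompt from the file name and the code before the cursor."""
    header = f"# File: {file_path}\n"
    body = "\n".join(lines_before_cursor)
    # Characters left for the body once the header is accounted for.
    budget = max(0, max_chars - len(header))
    if len(body) > budget:
        # Keep the text closest to the cursor and drop the oldest context.
        body = body[len(body) - budget:]
    return header + body


if __name__ == "__main__":
    print(build_prompt("stats.py", ["import math", "", "def mean(values):"]))
```

保留靠近光标的内容是一个常见的取舍,因为最近的上下文通常与下一行代码最相关;真实系统很可能还会综合其他信号。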
Lenny (00:26:40): 回过头来看这件事是怎么开始的,听起来基本上就是一个幸运的意外——OpenAI 做了一件你们没想到的事,然后你说的那个 PhD 小组里有人说,“哇,也许我们可以用这个做出非常厉害的东西。”大概是这样开始的吗?
Ryan J. Salva (00:26:57): 相当准确。是的。我的意思是,我们有了一个模型,它确实惊人地好,在实际智能上是一个台阶式的飞跃,对吧?然后把它和一个真正好的用例结合起来——一个真正改变开发者对创作过程、创造过程的基础体验的用例。
Lenny (00:27:25): 是否有一个时间点,让你或者领导层明确意识到,我们应该在这件事上加倍投入、大干一场?还是说这个小团队一直在做这个想法,然后突然觉得“哇,这个真的行”?又或者从一开始就是“我们要押注这个东西,这是一个巨大而伟大的想法,我们一定会从一开始就投入资源”?
Ryan J. Salva (00:27:48): 最初在 GitHub 做 Copilot 的团队是我们称之为 GitHub Next 的团队。他们的工作本质上就是做第二和第三地平线的项目。也就是有些人可能说的登月项目,对吧?那些我们并不真正指望在一两年内能成功的事情,但也许三五年后可能会变成有意义的东西。
Lenny (00:28:17): 第二地平线和第三地平线有没有一个具体的定义?像亚马逊那样按年数来划分的吗?
Ryan J. Salva (00:28:23): 不一定是具体的定义。对我来说,我通常粗略地算:第一地平线是接下来一年,第二地平线是接下来三年,第三地平线是接下来五年。但我们通常更多把它当作对模糊度和置信水平的衡量,而不是日历上的日期。
Lenny (00:28:47): 本节目由 Modern Treasury 赞助播出。Modern Treasury 是一个用于资金流动和追踪的下一代操作系统。他们正在为管理复杂支付流的公司现代化开发者工具和金融流程。想想数字钱包通过加密货币通道、分成共享市场、即时借贷等等。他们与 Gusto、Pipe、ClassPass 和 Marqeta 等高增长公司合作。Modern Treasury 强大的 API 让工程团队能将支付流程直接构建到你的产品中,而财务团队可以通过一个精致现代的 Web 仪表盘监控和审批一切。
Lenny (00:29:22): 支持实时支付、自动对账、持续记账和合规方案,Modern Treasury 的平台每月被用于对账超过 30 亿美元。他们是当今市场上最热门的年轻金融科技创业公司之一,已从 Benchmark、Altimeter、SVB Capital、Salesforce Ventures 和 Y Combinator 等顶级机构获得融资。请在 ModernTreasury.com 了解更多。
第二、第三地平线的组织方式
Lenny: 我很想再多花点时间聊聊这个。这太有意思了。这种按三个地平线分配一定比例资源或押注不同地平线的做法,是微软的做法吗?
Ryan J. Salva (00:29:58): 我觉得这不一定是微软的做法,但在 GitHub,我们确实把它真正落地了。并不是说微软内部没有团队也在用这种方法论,但 GitHub 是我们真正明确地、有意识地这样做的地方——我们专门划出了一个团队来思考第二和第三地平线的工作,并把它们与 EPD 分开。EPD 这里指的就是工程、产品和设计(Engineering, Product, and Design),那些负责构建产品化的、可运营的产品并推向市场、免费提供或以某种方式变现的人。
Lenny (00:30:39): 这太有意思了。很多公司都有这类研发(R&D)小组——Facebook 有新产品体验团队,Google 也有一个。我不确定这些团队产出了多少成功案例。据我所见,而且我也很好奇,你们有什么……显然你们取得了一个巨大的成功,至少目前在我看来是这样。关于如何在更大的公司内部投资这些大型登月项目,你有什么心得吗?
Ryan J. Salva (00:31:05): 我觉得第一步是投资它。第一步真的是雇佣非常聪明的人,吸引聪明的人,给他们创造的空间。不要期望他们产出的东西一开始就能赚钱,或者一开始就要满足安全、隐私、可用性、无障碍性这些基本要求,所有那些花哨的东西。他们需要空间去创造和实验。
Ryan J. Salva (00:31:37): 而且,当你真正到了那个阶段——那个团队有了一个想法,这个想法明确地联系到一批有真实问题的代表性客户,并且至少有中等置信度的信号表明这个解决方案,不管它是什么,以一种新颖的方式解决了问题——那就是你该开始考虑的时候了:好,让我们实际上投入一点……我称之为市场测试。没那么正式,其实就像,让我们开始把这个原型带到越来越多的客户面前去测试,看看,嘿,这真的在帮你解决问题吗?这是你会用的东西吗?这就是 GitHub 内部 Next 和 EPD 之间过渡真正开始的地方。
Ryan J. Salva (00:32:35): 这也是我在产品周期中的角色真正开始扩大的地方。在此之前,我其实一直与 Next 团队保持紧密联系,在旁观察他们的工作,偶尔提供一些咨询。但正是在那个时刻——当我们确认,好,这是真实的东西。客户在说,开发者在说,“这太神奇了。它做了一件我自己做不到的非凡之事”——我们开始思考,好,我们如何把这个过渡过来?从那以后,我们基本上就是,好,我们觉得我们有了一个爆款。我们觉得我们有了一个可以真正交付给开发者的东西。
Ryan J. Salva (00:33:21): 我们有意识地做了一个决定:把 Next 团队中的一些研究员调过来,在一段有限的时间内,组建一个新的 EPD 小队。我们希望他们继续做研究,但我们需要做知识转移,而且我们需要真正为一支最终能够将产品运营化和产品化的团队提供种子。这基本上开启了技术预览阶段,我们开始邀请数万人,然后数十万人加入技术预览。在那个技术预览中,我们开始看到大量“脑洞大开”表情符号的推文,以及 Hacker News 上人们为此极度兴奋的讨论帖。
Ryan J. Salva (00:34:09): 这就是我们如何知道该开始扩张了,该真正开始思考如何进行招聘,以便在这些研究员周围建立一层缓冲,让他们最终能够回到 GitHub Next 去做他们最擅长的事——创新、创造、思考下一个登月项目。那个过程,花了……嗯,实际上我们现在还处于它的尾声。就像我说的,距离产品最初创建大约一年半之后,我们经历了技术预览,已经达到了正式发布。我们现在已经在他们周围招聘了一支团队。
Ryan J. Salva (00:34:53): 研究员们实际上从上个月开始就已经逐步搬回 GitHub Next 了。一支 EPD 小队——实际上是多支 EPD 小队——现在正在推进这个产品,开始响应客户反馈,思考,好,作为产品团队,我们现在如何把这条路线图从一个源自 GitHub Next 的想法延续下去?
Lenny (00:35:22): 我很喜欢这个洞察——把人一起带过来,而不是那种“好了,我们从这里接手”的做法。如果你要在某个地方重新建立这样一支团队——这种负责第二或第三地平线的研发团队——你会有什么不同的做法吗?有没有从这段经验中提炼出的教训,可能对那些说“嘿,我们也应该有类似的东西”的创始人或大公司产品经理有用?有没有什么你觉得对让这样的项目成功至关重要的东西?
研究员回归研发团队的条件
Ryan J. Salva (00:35:49): 把研究员送回研发团队的标准——不管在你所在的组织中那个团队是什么——不能基于日历。它必须基于岗位上已经有了接替者,一个真正在做这份工作、已经掌握了所有必要技能的人,只有到那时研究员才能回去。要确保你在交接之前保持了专业能力和领域知识的连续性。我觉得我们今天在这方面做得还不错。同样关键的是,从研发团队手中接手的团队必须觉得自己掌控着自己的未来。你不能真的把路线图委托给一个研发团队。
Ryan J. Salva (00:36:44): 负责维护产品、构建产品的团队——与终端客户反馈闭环最紧密的那个团队——才是真正需要拥有并感觉掌控路线图的人。要确保你不是把创新完全外包给研发团队,而是随着产品团队对想法和客户场景的逐步接管,创新也在产品团队内部发生。最后我想说的是,工程基本功在很多方面正是区分研发团队和运营产品团队的契约。
从研究到运营产品的文化转变
Ryan J. Salva (00:37:30): 把这种基础流程带进来,坦白说,对研究员来说会感觉有点不自然。因此需要一些文化变革管理,让每个人调整自己的工作方式,理解我们正在从一个实验和研究项目毕业,走向一个运营产品。而且通常来说,因为这些研究员……他们是第一批过来的人。他们是这个项目的种子。这对他们来说会有点不自然,他们可能不会具备完成那个转型所需的全部技能。
Ryan J. Salva (00:38:08): 要确保你有一个好的工程师组合——既有擅长维护服务的工程师,也有那些真正在思考“我们创造的想法是什么、我们带给市场的新东西是什么”的工程师和研究员,并且能够将愿景注入其中。
Lenny (00:38:27): 对,我完全能理解这种挑战……“这是我的东西。我一直在做这个。你们在把这个项目搞成什么样?它要往哪里去?我不太确定我感觉……”然后各种新的需求朝你涌来,“天哪,这本来太有趣了,现在我居然要扩展这个该死的东西。”
Copilot 的伦理与法律挑战
Ryan J. Salva (00:38:46): 我是说,这是世界上最好的问题——能遇到这种问题简直是幸运。说到客户需求,特别是 Copilot,涌进来的讨论量、客户反馈量——尤其是对我们做 AI 的来说,坦白讲,世界还在摸索 AI。我是说,我们正在变得更好,尤其是过去几年里有了 DALL-E 和 Copilot 这样的东西。但它不仅带来了工程挑战,坦率地说,还带来了伦理挑战和法律挑战——比如厘清我们对 AI 的期望是什么。如果 AI 生成了冒犯性的内容,谁该负责?
Ryan J. Salva (00:39:37): 我们的立场,我们最终得出的结论,实际上是将 Copilot 定位为 AI 结对编程者——我觉得这个框架很有用。结对编程,我想你的大多数听众应该知道,通常是两个开发者并排坐着一起解决一个问题。一个人在键盘前,另一个人帮忙梳理思路、讨论想法、纠正错误,诸如此类。那么,如果 Copilot 是你的 AI 结对编程者,而它在对你耳语一些疯狂的东西,把政治话题扯进来,或者性别认同话题,或者,我不知道,其他什么……
Ryan J. Salva (00:40:19): 它在冒出俚语、诽谤,诸如此类的东西。你大概没法专注工作了吧?会非常让人分心。回到一些基本原则上:我们试图解决的使用场景是什么,坐在你旁边的 AI 机器人怎样的行为才算“适当”(我打个引号)。这帮助我们为想要打造的开发者体验建立了一些原则和指导方针。
Lenny (00:40:52): 哦,我喜欢这个。就是给这个东西创建一个人格形象,来帮助你来界定这个东西的行为应该如何运作。你们是怎么应对这些挑战的?是你和法律团队的讨论吗?我不知道,这些伦理问题真的很棘手,我猜。作为产品团队,你们怎么处理这类问题?
Ryan J. Salva (00:41:09): 这是一场涉及非常非常广泛的角色的对话。尤其是这个产品,我花在与法务团队沟通上的时间可能比我负责过的任何其他产品都多。都是些出色而有创造力的人。但不仅仅是法务。还有隐私和安全方面的负责人。坦率地说,还有开发者——也就是使用它的人——听取他们的意见。嘿,这里什么行得通?这里什么对你行不通?为什么这有冒犯性?为什么这不冒犯?我们继续用那个疯狂的结对编程者对着我们耳朵说疯话的例子。在我们最初开始的时候,在非常非常早期的日子里,Copilot 其实没有任何过滤机制。
内容过滤与编辑困境
Ryan J. Salva (00:41:58): 后来我们觉得,好吧,它需要是一个稍微更受控的体验。我们需要把一些最过分的东西过滤掉。我们引入了一个简单的词语屏蔽列表,而这种屏蔽列表总是充满了风险——哪些词可以,哪些词不行。突然之间,我们变成了语言的编辑者,这是一个相当令人不安的处境。至少我个人对此并不自在。但在某种程度上,这又是不得不做的,否则你就会创造一个糟糕的开发者体验。
Ryan J. Salva (00:42:35): 我们经常收到开发者的反馈,比如:“嘿,这个特定的词被屏蔽了。它被屏蔽这件事,要么让我觉得受到了冒犯,要么阻碍了我从产品中获得好的价值。”
Lenny (00:42:51): 天哪。
Ryan J. Salva (00:42:52): 始终在做着编辑内容的那种微妙平衡。我们现在已经能够与 Azure 的负责任 AI 部门合作,他们打造了一些非常出色的模型,可以帮助检测——姑且称之为情感吧,没有更好的词了——基本上就是检测那些明显带有冒犯性的内容。因为有些词在某些语境下可能是冒犯性的,但在另一些语境下则完全合理,特别是当你涉及医疗类软件场景时,对吧?
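简单屏蔽列表为什么会误伤合理内容,可以用下面的草图说明。这只是一个假设性的最小示例:`BLOCKED_WORDS` 中的词是无害的占位词,并非 GitHub 实际使用的列表,`is_blocked` 也不是 Copilot 的真实实现。按词匹配无法区分语境,因此完全正当的代码同样会被拦下。

```python
import re

# Hypothetical placeholder entry; not the list GitHub actually used.
BLOCKED_WORDS = frozenset({"kill"})


def is_blocked(suggestion: str, blocked=BLOCKED_WORDS) -> bool:
    """Return True if any alphabetic token of the suggestion is on the list."""
    tokens = re.findall(r"[a-z]+", suggestion.lower())
    return any(token in blocked for token in tokens)


if __name__ == "__main__":
    # Legitimate process-management code is rejected: a false positive.
    print(is_blocked("os.kill(pid, signal.SIGTERM)"))
    # Unrelated code passes through.
    print(is_blocked("total = a + b"))
```

这正是访谈中所说的困境:列表本身不理解语境,而能够结合语境判断的模型才有可能把这两类情况区分开。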
AI 模型替代简单屏蔽列表
Ryan J. Salva (00:43:35): 能够稍微转变方向,依赖那些能比我们用粗糙简单的屏蔽列表做得更好的 AI 模型——这也许可以作为又一个例证,证明 AI 作为解决常见开发问题的方案,正越来越擅长解决我们技术栈中更多的部分,或者说填补我们技术栈中更多的空缺。至少在我们的案例中,我们相当幸运,能够借助或依赖母公司的贡献来解决一个 GitHub 单靠自己可能无法解决的非常尖锐的问题。
Lenny (00:44:16): 我从没想过 Copilot 会……你们还得担心它说出疯狂的话。你们居然得处理这种事,太不可思议了。是不是微软之前有个机器人变得非常负面,最后被关掉了?
Ryan J. Salva (00:44:31): 是的。
Lenny (00:44:31): 他们有这方面的经验。
Ryan J. Salva (00:44:32): 它叫什么来着?Talia 还是类似的什么?
Lenny (00:44:35): 差不多那样的。
Ryan J. Salva (00:44:36): 对,差不多那样的。我们不想再重演那样的事件。
AI 开发者的未来:40% 代码由 AI 编写
Lenny (00:44:40): 哇。这让我想到的是,你的团队处于 AI 实际应用的最前沿。我很好奇你对这一切的发展方向怎么看,尤其是对开发者而言。我看到一个数据,说大概 40% 的代码现在是由 Copilot 写的。我不知道这个数字是否准确。但未来的愿景是会变成 90% 吗?你觉得这一切会走向哪里?
Ryan J. Salva (00:45:02): 先把这个数据说清楚,40% 这个数字是专门针对 Python 开发者的。坦率地说,这个比例因语言而异。因为你可以想象,某些语言在公共领域中的代表比其他语言更好。而且通常来说,训练数据的数量和多样性与建议的质量相关,而建议质量又体现在代码行数、接受率或其他任何指标上。
Lenny (00:45:35): 太好了。感谢澄清。
Ryan J. Salva (00:45:36): 当然。我们看到在所有不同语言中,这个比例从二十多到四十多不等。
Lenny (00:45:43): 说句题外话,作为一个不太出色的工程师——我以前做了大约十年工程师——我欢迎我们的 AI 霸主替我写所有的代码。我很期待它能做越来越多的事情。所以,是的,我很好奇你觉得这会走向哪里。
AI 将渗透整个开发栈
Ryan J. Salva (00:45:58): 确实如此。它让像我这样平庸的开发者也能做出一些相当了不起的事情。但未来会怎样?首先,我认为——我希望大多数开发者已经清楚地看到——AI 在不远的将来会渗透到我们几乎整个开发栈中。Copilot 实际上只是大量创新的最尖端——比如更好地管理我们的构建队列,或者帮助……这里有一个很好的例子。我不知道你怎么样,但我经常看到 commit message 和 PR 里的注释质量不太行。这给代码审查者带来了很大的负担,得自己去弄清楚开发者到底想做什么。
Ryan J. Salva (00:46:55): 如果 AI 能用你完整的请求来总结你所有的变更,而你作为贡献代码的开发者只需要审查一下确保它准确无误,然后发送出去就行了,不需要为此投入额外的精力呢?AI 有大量大量的机会可以从我们的工作中剥离那些枯燥乏味的部分,这样我们就能专注于创造性的行为。我从开发者那里听到的、我自己也体会到的是,Copilot 在某种程度上迫使我更多地思考:我试图创建的设计模式是什么?
Ryan J. Salva (00:47:33): 我试图通过代码实现的最终用户体验或成果是什么,而我可以让 Copilot 帮我搭建大量的脚手架代码,这样我就能专注于更有创造性的工作?这才是我对我们行业五到十年后的期望——不仅通过提供一层抽象,或者至少在开发过程中给予一些帮助,来邀请更多开发者或更多人成为开发者,而且让那些真正经验丰富的开发者能够专注于更大的问题,专注于成果和创造力,而不是那些真正底层的、困难的、需要死记硬背的东西,比如语法或参数顺序之类的。
Lenny (00:48:32): 说得好。退一步说,至少这能让人们不用再开着一个 Stack Overflow 的标签页,每个函数都要去复制粘贴了。
Ryan J. Salva (00:48:42): 我希望 Stack Overflow 能继续经营下去,但我个人确实不那么介意少一点上下文切换。
扩展 Copilot 的最大挑战
Lenny (00:48:48): 在扩展这个产品的过程中,你觉得最大的挑战是什么?是技术上的,还是运营上的——把它扩展成一个人们真正付费的产品?
Ryan J. Salva (00:49:01): 这有几个维度。一个是属于我们这个时代的问题,即过去几年供应链遭到了严重破坏。事实证明,Copilot 无论是训练还是运行模型,都需要一些非常稀有和独特的 GPU,而全球供应量有限。部分挑战就在于:我们能弄到足够的硬件来运行这些东西吗?我们实际上已经预留了相当多的算力容量,而且我们对全球更多的容量是贪婪、贪婪、再贪婪。只要能生产出那些芯片并放进数据中心,我们就立刻去做。
Ryan J. Salva (00:49:50): 这是一种独特的挑战。我还想说,在运营方面,另一个挑战是:我们如何创建一个让社区真正感到自己拥有所有权的模型?将一个 AI 工具推向市场——尤其是训练数据来自公共代码的 AI 工具——所需要进行的大量对话,要求我们和社区之间进行大量的沟通。每一个优秀的产品经理都应该尽可能多地把时间花在客户和潜在客户身上。
Ryan J. Salva (00:50:34): Copilot 尤其是一个更为复杂的推出过程,因为我们作为一个行业、作为一个社会,仍在摸索如何理解它。开发者和我们产品团队之间的这种来回博弈,实际上迫使我们扩展产品团队的规模,甚至超过了工程团队的扩展。
Lenny (00:51:02): 有意思。那是为什么?
应对质疑与焦虑
Ryan J. Salva (00:51:04): 有几个不同的原因。我的意思是,其一,就像我说的,我们的模型是基于公共代码训练的。并不是所有社区成员都真正确信,什么时候可以基于公共代码训练模型?什么时候不可以?Copilot 是否在生成安全的建议?Copilot 是否在生成有 bug 的建议?有很多疑问。有很多非常健康的质疑。我说这话是发自内心的。我希望人们对 Copilot 保持质疑。我们作为社区有责任对任何 AI 保持质疑。
Ryan J. Salva (00:51:40): 因为就像它有巨大的益处潜力一样,它也有巨大的危害潜力。人们让我们承担责任——你们如何防止模型投毒(model poisoning)这类问题?是否会出现一种围绕 AI 的、我们还没有真正想到的新攻击向量(attack vector),可能会产生负面后果?我们认为在这方面我们做得非常出色且负责任,确保首先,我们非常明确——Copilot 不是开发者的替代品。它永远不会是。
Ryan J. Salva (00:52:17): 我们不希望 Copilot 在没有一个有思想、有推理能力、有呼吸的人类坐在键盘另一端做出审慎决策的情况下自动生成代码。我们不希望 Copilot 替代技术栈中的任何其他部分,无论是静态分析工具、单元测试,还是你今天采用的任何其他措施来确保人类产出高质量代码。我们希望你保留所有这些同样的体系,确保使用 Copilot 等工具的人类继续产出高质量代码。
Ryan J. Salva (00:52:56): 但与此同时,也有很多焦虑——AI 处于什么位置?AI 最终会不会……这又回到你关于五到十年后我们会怎样的问题。它会不会编写 90% 的代码?我们不希望 Copilot 变成那样……我们不希望它替代任何东西。我们希望它增强。这里的核心理念是,AI 是一个赋能者,让开发者专注于创造性工作,保持心流,能够更快地推进。化解这些焦虑,化解这些健康的质疑,需要对话。需要交流。这需要我们产品方面与社区进行有引导的对话。
哲学与文学的回响
Lenny (00:53:50): 感觉这和你当年的教育背景——哲学和文学——联系了起来。这不是很巧吗?
Ryan J. Salva (00:53:57): 它常常让人感觉很紧密……我的意思是,教育方面确实教会了我,对话的重要性、质疑的重要性,其价值远不止于象牙塔里那些深奥的冥想。它实际上适用于现实世界。
Lenny (00:54:17): 在我们进入非常令人兴奋的闪电问答之前,也许还有最后一个问题。
Ryan J. Salva (00:54:21): 喔!
大胆下注与渐进改进的分配
Lenny (00:54:23): 回顾这段整体经历——在一家大公司内部构建、孵化、推出这个大胆的赌注——你可以从任一方向来谈:要么是关于大胆下注与渐进式胜利的任何经验教训,以及你如何看待在这两类之间进行投资;要么仅仅是在大公司内部,关于如何将一个像这样的庞然大物——从一个想法的种子一直做到一个潜在的大型新业务线——的经验教训。
Ryan J. Salva (00:54:51): 作为产品经理和多个产品的产品组合经理(portfolio manager),我在 GitHub 负责多条产品线,时间、专注力、精力和资源的分配就成为一个极具挑战性的问题。答案并不总是一样的,取决于时间、世界形势、组织状况、技术环境。作为一条通用规则、一个通用原则,我当然努力确保我们始终保留一定的余力给那些大胆的、有魄力的实验性研究项目。你可以把那些不确定性极高的赌注看作占团队产能的百分之五到十。大约百分之二十五,也许百分之三十的团队产能,一般应该用在运营上。
Ryan J. Salva (00:55:54): 我们如何让已在市场上的产品持续满足客户期望?然后剩余的部分——大概百分之六十左右——真正用于已上市产品的渐进式进步。我们如何做出迭代改进,并继续真正兑现一年、两年、三年、四年前所做的大赌注的回报?从粗略分配来看,这大概就是我运营大团队的方式。但这在你拥有较大团队时才有效。在初创公司里,当我们几乎完全是一个大赌注时,显然你的比例会大不相同,变成全力以赴押注那一张所谓的彩票。
Lenny (00:56:50): 太棒了,谢谢分享。我本来还想问你推荐的比例是多少,谢谢你主动说到了这一点。到此为止,我们已经到了非常令人兴奋的闪电问答环节。我只是简短地问你五个问题,想到什么就说什么,你有什么答案就说出来。开始吧?好的。你最常向别人推荐的两三本书是什么?
闪电问答
Ryan J. Salva (00:57:13): 哦,好问题。其中一本是关于用户体验的书,叫 Make It So。这是对《星际迷航》的致敬,核心理念就是,科幻作品中呈现给我们的用户体验,往往会在二三十年后进入我们的日常产品和工具中。这是一本非常开阔眼界、启发性强、而且真的很有趣的书。这是一本。然后完全不同的方向,我走出科技领域,纯粹推荐娱乐价值。有一本 David Foster Wallace 的书叫 Brief Interviews with Hideous Men,我很喜欢。它是一部短篇小说集。
Ryan J. Salva (00:58:04): 本质上它就是——如果你在看一部电影,反派得到了发表长篇独白的机会,解释自己为什么成为现在的自己,这让他们在那一刻也许显得有些脆弱——这就是那篇独白,为十个不同的可恶之人,可怕、可怕的人,各讲一遍。很有意思的阅读体验。我推荐它。
Lenny (00:58:31): 我喜欢这个。它让我想起一本书,讲的是独裁者的室内设计,给你看萨达姆·侯赛因、希特勒那些人的家。
Ryan J. Salva (00:58:43): 天哪!我的天,太棒了。我得找到那本。你得发给我。
Lenny (00:58:47): 我在一家旧书店找到的,二手书店那种。不知道现在还有没有,但我会找找看。第二个问题。你最喜欢听或推荐的其他播客是什么?
Ryan J. Salva (00:59:02): 天哪,太多了。我每个月要听几百个小时的播客。这很疯狂。我可以选很多。我就给你一个吧。Nate DiMeo 做的 The Memory Palace 是一个出色的故事类播客。他每集大约做二十分钟的小品,通常选材自美国历史。他还曾是华盛顿特区一家博物馆的驻馆艺术家(artist in residence)。如果你去的话——我想是美国历史博物馆之类的——如果你在那里,你可以走到博物馆的不同展厅,他会给你讲述你所看到的物品或展厅的故事。那是一种神奇的体验,推荐给所有人。
Lenny (00:59:56): 哇!我喜欢这种。最近有没有哪部电影或电视节目让你特别喜欢?
Ryan J. Salva (01:00:00): 我不知道这算不算最近的作品,但这是我最近看的——《降临》(Arrival)。嗯,算数。《降临》。表面上讲的是外星人的电影,但实际上讲的是语言和记忆。我觉得它非常、非常引人入胜。
Lenny (01:00:20): 你读过 Ted Chiang 的书和短篇小说吗?
Ryan J. Salva (01:00:23): 没有。我没有读过。
Lenny (01:00:24): 哇!哦,你一定会喜欢的。《降临》就是根据他的一篇短篇小说改编的,我相信,是他的故事之一,同一个人还有一整本包含更多短篇小说的集子。都非常精彩。
Ryan J. Salva (01:00:34): 太好了。那我这个周末有事情做了。
Lenny (01:00:39): 这就对了。放下工作,去读书吧。你在面试时最喜欢问的面试问题是什么?
Ryan J. Salva (01:00:46): 让我想想。我给你一个有趣的,与其说是刁钻不如说是好玩的。这算是我的破冰面试题,尤其适合初级到中级的产品组合经理。我会请他们在一分钟内教会我一个新东西。通常我会掏出手机,给他们一秒钟想一下,然后开始计时。评分标准有三条。第一是完整性——他们有没有在一分钟内真正讲完这堂课?第二是复杂度——如果你教我,比如说,同时拍头和揉肚子,那是一回事。
Ryan J. Salva (01:01:28): 但如果你教我 18 世纪艺术与当时宗教潮流之间的关联,那就是另一回事了。最后一个标准其实是清晰度。对,清晰度是最后一个。清晰度就是——我真的听懂了吗?到课程结束时我真的学到东西了吗?他们是否完整、充分地传达了这个想法?
Lenny (01:01:52): 我得问一下,有人在这个问题里教过你最有趣的东西是什么?
Ryan J. Salva (01:01:57): 我经常随口举的那个例子——关于 18 世纪艺术与当时宗教潮流之间的关联——真的有人教了我这个。太惊人了。她其实是一位还在读大学的候选人,来自范德堡大学(Vanderbilt University)。
Lenny (01:02:18): 那是强烈的录用意向吗?
Ryan J. Salva (01:02:20): 极其强烈的录用意向。她简直太厉害了。非常聪明的人。
业界受尊敬的思想领袖
Lenny (01:02:28): 太棒了。最后一个问题,在行业中你会说你最尊敬谁——作为思想领袖或有影响力的人?
Ryan J. Salva (01:02:36): 有很多,但我想今天如果不提 Oege de Moor,我大概会后悔。Oege 是主要的研究者,可以说是 Copilot 真正的创新者。最初的成果归功于他,他是一位杰出的技术专家和未来学家。我真的非常非常尊敬他。
结束语与联系方式
Lenny (01:03:05): 太棒了。很棒的推荐。Ryan,这次对话真的非常精彩。你们在这么多有趣的工作上走在最前沿。说实话,我等不及想要 Copilot 来帮我写 newsletter,这样我就可以少干点活了。也许有一天会实现的。无论如何,我很期待看到这一切的发展。谢谢你来这里。最后两个问题——如果大家想了解更多或联系你,在哪里可以找到你?然后,听众有没有什么方式可以帮到你?
Ryan J. Salva (01:03:33): 简单。怎么找到我?我在所有平台都叫 Ryan J. Salva——Twitter、GitHub,随你选。LinkedIn 也是 Ryan J. Salva。然后听众怎么帮到我?Copilot 有一个 60 天免费试用,所有人都可以获取和使用。去试试吧。用了之后,请在 Twitter 或 Hacker News 或 GitHub Discussions 上分享你的体验。
Ryan J. Salva (01:04:07): 给我们好的反馈,也给我们不好的反馈。我非常渴望看到人们以新颖的方式使用它,以及在哪里碰到了粗糙的边缘。就像我说的,我们还有很大的成长和改进空间,但我很有信心,开发者们会对它目前的能力感到相当惊叹。
Lenny (01:04:30): 太棒了。谢谢你来这里,Ryan。
Ryan J. Salva (01:04:31): 嘿,哥们,太感谢你了。真的非常非常有趣。
Lenny (01:04:35): 非常感谢你的收听。如果你觉得这期节目有价值,可以在 Apple Podcasts、Spotify 或你最喜欢的播客应用上订阅本节目。另外,也请考虑给我们评分或留下评论,因为这真的能帮助其他听众找到这档播客。你可以在 lennyspodcast.com 找到所有往期节目或了解更多关于本节目的信息。下期再见。
术语表
| 原文 | 中文 |
|---|---|
| Actions | Actions |
| adaptogens | 适应原 |
| ambiguity and confidence level | 模糊度和置信水平 |
| Arctic Code Vault | 北极代码库 |
| artist in residence | 驻馆艺术家(artist in residence) |
| attack vector | 攻击向量(attack vector) |
| Azure Department of a Responsible AI | Azure 负责任 AI 部门 |
| block list | 屏蔽列表 |
| build queues | 构建队列 |
| CI/CD | CI/CD |
| code reviewer | 代码审查者 |
| Codespaces | Codespaces |
| CodeX | Codex |
| commit message | commit message |
| context switching | 上下文切换 |
| critical theory | 批判理论 |
| Dolly | DALL-E |
| early to mid career | 初级到中级 |
| EPD | EPD(工程、产品和设计) |
| flow | 心流 |
| general availability | 正式发布 |
| GitHub Next | GitHub Next |
| GPUs | GPU |
| Hacker News | Hacker News |
| horizon | 地平线 |
| icebreaker | 破冰 |
| inline autocomplete | 内联自动补全 |
| knowledge transfer | 知识转移 |
| mental map | 心智地图 |
| mind share | 心智资源 |
| model poisoning | 模型投毒(model poisoning) |
| moonshots | 登月项目 |
| One Engineering System | One Engineering System(微软统一工程系统) |
| pair programmer | 结对编程者 |
| philosophy of aesthetics | 美学哲学 |
| portfolio manager | 产品组合经理(portfolio manager) |
| PR | PR |
| probiotics | 益生菌 |
| prompt crafting | 提示词精炼(prompt crafting) |
| R&D | 研发(R&D) |
| scaffolding code | 脚手架代码 |
| sentiment | 情感 |
| silver film | 银胶片 |
| strong yes hire | 强烈的录用意向 |
| supply chains | 供应链 |
| technical preview | 技术预览 |
| V1 product | V1 产品 |
| VP of Product | 产品副总裁 |
此文档由 AI 分片翻译。
The role of AI in new product development | Ryan J. Salva (VP of Product at GitHub)
Ryan J. Salva: We had actually created a snapshot of GitHub’s public code for what we call the Arctic Code Vault, right? Essentially, this is up in like way in the Northlands of Finland, there’s a seed vault. We were like, you know what? Seed vaults are really there to preserve the diversity of the world’s flora in seeds in case of some crazy either natural or manmade disaster. But another really important asset to the world is our code, our open source. This represents actually a lot of the collective, well, certainly software, if not intelligence of kind of the modern world, right?
We had put this snapshot of public repositories on this silver film that would be preserved for thousands of years in this Arctic Code Vault. Well, we took that same data snapshot and we brought it to our friends over at OpenAI to see like, okay, what can we do with these large language models built on public code? Well, it turns out we can do some pretty cool things.
Lenny: Ryan Salva is VP of product at GitHub, where, amongst other projects, he incubated and launched GitHub Copilot, which in my opinion is one of the most magical products that you’ll come across. If you haven’t heard of it, it uses OpenAI’s machine learning engine to autocomplete code for engineers in real time as they’re coding. I think it’s one of the biggest advances in product development and productivity that we’ve seen in a while. I’m always really curious how a big product like this starts, gets buy-in, builds momentum, and then launches, especially at a big company like Microsoft and especially a product like Copilot that has surprising ethics challenges, scaling challenges, and business model questions.
Also, this came out of a small R&D team that GitHub has, and it’s so interesting to hear what Ryan has learned about incubating big bets within a large company, and then taking them from prototype to Microsoft scale. Ryan is also just super interesting as a human. He’s got a very non-traditional background. I am excited for you to hear this conversation. With that, I bring you Ryan Salva. If you’re setting up your analytics stack, but you’re not using Amplitude, what are you doing? Amplitude is the number one most popular analytics solution in the world used by both big companies like Shopify, Instacart, and Atlassian, and also most tech startups.
I finally gave it a shot earlier this year, and it has quickly become a core part of my morning routine, especially on days that I need to go deep on writing or record a podcast like this. Here’s three things that I love about AG1. One, with a small scoop that dissolves in water, you are absorbing 75 vitamins, minerals, probiotics, and adaptogens. I kind of like to think of it as little safety net for my nutrition in case I’ve missed something in my diet. Two, they treat AG1 like a software product. Apparently they’re on their 52nd iteration and they’re constantly evolving it based on the latest science, research studies, and internal testing that they do.
And three, it’s just one easy thing that I can do every single day to take care of myself. Right now, it’s time to reclaim your health and arm your immune system with convenient daily nutrition. It’s just one scoop and a cup of water every day. And that’s it. There’s no need for a million different pills and supplements to look out for your health. Make it easy. Athletic Greens is going to give you a free one year supply of immune supporting vitamin D and five free travel packs for your first purchase. All you have to do is visit AthleticGreens.com/lenny. Again, that’s AthleticGreens.com/lenny to take ownership over your health and pick up the ultimate daily nutritional insurance. Ryan, welcome to the podcast.
Ryan J. Salva: Thank you, my friend. I am genuinely very excited to be here. Lovely to geek out with you for a little while.
Lenny: I’m excited as well. We were chatting briefly before we started recording and you mentioned a little bit about your background, which is really unique for someone that is leading product at GitHub. Could you just share what you studied in school, and then briefly just how that led to your career in product management?
Ryan J. Salva: Oh wow! You’re going to make me remember all the way back to school. Okay. Back in school, I was not a classic software engineering, CS major. The kind of esoteric answer is philosophy of aesthetics and 20th century critical theory. The easier access answer is philosophy and English. But primarily it was really about how do we, as people, communicate with each other, how do we express ourselves through creativity. As humans since the dawn of time have been painting on cave walls and dancing around the fire and writing stories and novels and singing to each other. I was just really interested in how we convey our experience of the world to others.
I got started in software development and product management because I wanted to be in the business of creativity. We’re at a really, really unique time in human history where we actually get to witness the advent of a brand new medium. Software development and the worlds that it creates wasn’t possible, I don’t know, maybe 50, 60 years ago now. If I’d been born in the 1700s, I probably would’ve been the guy making, I don’t know, new colors of paint and paint brushes, but I wasn’t. I was born kind of at the turn of the 21st century, and so I work in engineering.
That’s what I’ve been doing for a little bit more than 20 years now, working sometimes in startups, some of them other people’s, some of them my own, about 10 years at Microsoft and now three years at GitHub.
Lenny: Amazing. I didn’t know that was a job to make new paint colors for paint brushes. Is there a color you would come up with?
Ryan J. Salva: Oh man! It so happens that yellow… I think I would do a really vibrant gold sunshine yellow if I was in that business.
Lenny: Very positive, happy. I love it. That could be a new GitHub brand color. Today, you’re VP of product at GitHub. Before that, you were a super senior product leader at Microsoft, and I’m always curious how that transition happens when you move from just a longtime senior product leader at a larger company to taking on something like this that was an acquisition. I’m curious what made you decide to take this leap, and then just was there anything interesting about the machination that went into just making that transition and figuring that out?
Ryan J. Salva: Yeah, it’s a good question. Like I said, I was working on development tools and developer services when I was there at Microsoft. Specifically, I was leading product for what they call One Engineering System. It’s essentially the shared developer infrastructure for all Microsoft products like Windows and Office and Azure and things like that, as well as Microsoft’s DevOps solution called Azure DevOps. When the acquisition happened, it was clear that so much of the energy, so much of the focus and the innovation that was going to be happening around developer tools and services was going to be happening around GitHub. I mean, that’s where the community is creating.
That’s where people are learning, that’s where so much of the mind share of just the development community is focused. Like I said, I’m motivated. What I care about is helping people create. It was very clear to me that there was no place that I could have a larger impact than working at GitHub. I really took that opportunity to make the transition out of a little bit more enterprise-focused internal role at Microsoft to going where I could work on everything from, I don’t know, AI technology like Copilot to cloud-hosted development environments like Codespaces, to repos, where literally every single developer on the planet is participating in some way in GitHub repos in a typical year.
That was what I wanted to accomplish, is just like, how do I get more connected to the community, especially the community outside of what Microsoft could reach on its own. The decision to move as well, I think, was really focused not just on what GitHub was and maybe is at the time, but what GitHub also can be. I mean, GitHub has more than a decade, nearly a decade and a half of history of bringing developers together to collaborate on code through repositories. But in the last few years, we’ve really expanded that portfolio to include so many different parts of the developer life cycle.
Again, I talked there about Codespaces and Copilot, but it’s also Actions for CI/CD and Advanced Security. As developers, we are so much more than just where we put our code. There’s a whole part of the tool chain there. And to get an opportunity to work on so many V1 products, like that is creation itself, to be able to build an entirely new product, get it out to market, test it, iterate on it, and really feed on the energy that’s coming back from the community.
Lenny: Awesome. There’s definitely a lot of energy coming out of GitHub. What I want to spend most of our time chatting about is a product that your team helped launch and incubate, which is GitHub Copilot, which just from my outsider perspective feels like one of the biggest advances in software development in, I don’t know, a decade, maybe more. It’s definitely one of the most magical products out there and your team and you kind of led the incubation and launch of the Copilot.
I’d love to spend most of our time chatting through that. The first question… Okay, cool. My first question just for folks that don’t know a lot about Copilot is just like, what is it? Can you just kind of briefly describe what Copilot is?
Ryan J. Salva: Yeah, sure. Developers for the last 20 years or more have had essentially simple, intelligent autocomplete. You hit the period and you get the next variable that might come up. It’s helpful for moving a little bit faster through your code, helpful sometimes for remembering what the particular syntax might look like for a method or a function. Copilot is essentially that magnified by many lines of code. It is multi-line autocomplete that is fundamentally powered by an AI model called Codex, which is a derivative of another one that you might be familiar with, GPT-3.
When you are in the editor, it could be VS Code, it could be IntelliJ, it could be Vim, essentially, as you are typing, Copilot will provide suggestions usually in kind of this italicized gray text that is really, to your point, kind of magical what it’s able to infer. Based upon the variables around it, the class names, the method names around it, your comments, Copilot infers what you intend to create, and then hopefully does a pretty good job at nailing it by providing scaffolding code template that you can then riff on. Now, what we tend to find is that developers love it. They really enjoy it. They kind of find themselves getting a little addicted to it because it helps them stay in the flow.
As developers, we love to be in that place. I love to be in that place where I’m creating things, where I’m focusing on some product, some piece of software that I’m going to give to my customers, my users. The labor of remembering what’s the order of parameters that need to come into a particular API, or hey, what’s the particular syntax of this thing I’m supposed to do, or oh, I’ve got to create a bunch of dummy data that is days of the week or months in the year. That’s just labor. It’s not creating. It’s just typing.
Copilot helps developers stay in the flow by bringing all of that information into the editor, preventing them from having to go check out documentation or watch tutorial or go to Stack Overflow and either find an answer or worse, have to ask a question and wait for an answer. It just brings all of that into the editor and gives the developer often multiple suggestions that they can choose from and just pick and choose what is the right solution to solve the problem for the thing they’re trying to create.
Lenny: Awesome. What I’m most curious about, and we’re going to spend time on this, is just how a product like this comes to be at a larger company. But before we get into that, what’s the craziest story of someone using Copilot to write code? And I’ll share one real quick. I was watching some YouTube videos to prepare for this chat and one guy, maybe this is the Turing Test of AI writing code, is he used Copilot to center divs. He’s like, “Wow! This did it right.” And then another guy, he’s an instructor of code.
He makes YouTube videos teaching people how to code and he’s like, “Copilot just gives you the answer immediately, and so I can’t make these videos as easily. I have to turn it off so that doesn’t just give it away.” I’m curious, what have you seen?
Ryan J. Salva: There are so many of those. I’ll just kind of give a couple of recent ones that I’ve heard. I was talking to one developer who was… He’s actually an educator and he’s teaching kids how to code, usually like kind of high school age, so 16, 15, that kind of thing. His experience matches my own, which is that many of us, we learn to code best not by arbitrary exercises, but by actually building something that’s going to be useful solving problems.
What he does is he matches small businesses and medium size businesses who need to build internal tools with essentially classes of students, like a group of maybe six or eight students, and then gives those students Copilot and says, “Here, small business, medium size business. Group of students, go build this internal tool for this business.”
Copilot is essentially kind of whispering in the student’s ear, metaphorically speaking, “Hey, here’s how you solve this problem. Here’s how you do this,” and students build not only the tool, the software that the business needs and then get to put that on their resume and their application for college and university, but they also get to learn by using the tools that likely are going to be part of the core DNA of the developer tool chain two, three, four years from now, as AI starts to permeate our entire stack. That was a pretty cool recent one that I talked to.
Lenny: That is very cool. I didn’t think about just the education lever here of just making it so much easier to learn to code, not even just building code.
Ryan J. Salva: And that’s the thing, Copilot is particularly good not just at taking away some of the effort, but often… There’s learning a new language, and then there’s also just wading into a code base that you’re not necessarily familiar with, right? I mean, heck, sometimes I don’t recognize some of the code that I wrote six months ago or a year ago. It feels like I’m wading into new territory. But maybe you need to fix a bug in an app that you don’t often touch, wading into that code base is kind of learning and creating a mental map for that code base.
One of the really magical pieces of Copilot here is that, that AI is collecting context of the application that you’re going into. It can help you build that mental map and learn the code base, even if it’s a language that you’re already familiar with.
Lenny: Awesome. Going back to the beginning of Copilot and how it started, I’m always curious how a project that ends up being a huge deal to a larger company begins and especially how it builds momentum, how it gets buy in, and then just gets out the door. Can you talk about just the original seed of this idea like, who did it come from, who had the original vision, how did this idea emerge and build momentum where you put resources into it?
Ryan J. Salva: Oh wow, what a long, and I don’t know, depending upon your point of view, sordid or exciting story that is. Microsoft and OpenAI have been collaborating for quite a while now on large language models, making its way into all different experiments and different parts of both Microsoft’s software portfolio, as well as just helping OpenAI by providing the compute necessary. It takes massive amounts of compute to train these models. They were mostly large language models. Couple years ago now, it kind of dawned on us that, well, language models aren’t just English and Spanish and German and Korean and Japanese, but Python and JavaScript and Java and C# and Clojure.
All of these are languages too. In fact, they’re kind of nice from an AI perspective because they’re relatively constrained in terms of their semantics, right? The number of words, I put that in scare quotes as it were, that can be expressed in Python, for example, is much smaller than the English language, which has all sorts of different grammar rules and nouns, verbs, adjectives, adverbs. We started to see what it would be like to actually bring code to these large language models. The way that I actually got introduced to it is kind of funny. Microsoft and OpenAI had this idea.
At the time, one of the teams that I was responsible for was GitHub’s infrastructure team, the team responsible for our data centers, our reliability, our uptime. We noticed one day that we were getting hammered, I mean absolutely hammered with a tremendous amount of clone requests. We’re like, “Oh my gosh! Is this like a denial of service attack? How are we going to respond to this? What’s going to happen?” We figured out pretty quickly that it was actually OpenAI. They were cloning all of our repositories to harvest the data out of GitHub. I mean, it’s totally legit practice, but it does have a real consequence.
We were able to step in and mitigate it very quickly. There was not a reliability kind of an uptime incident there, but we’re like, “Hey, you all, cool. Love this thing. Let’s see if we can get that data to you in a more responsible way, in a way that’s packaged a little bit more to meet your needs.” What we did is just the year before that, we had actually created a snapshot of GitHub’s public code for what we call the Arctic Code Vault, right? Essentially, this is up in like way in the Northlands of Finland, there’s a seed vault. We were like, you know what? Seed vaults are really there to preserve the diversity of the world’s flora in seeds in case of some crazy either natural or manmade disaster.
But another really important asset to the world is our code, our open source. This represents actually a lot of the collective, well, certainly software, if not intelligence of kind of the modern world, right? We had put this snapshot of public repositories on this silver film that would be preserved for thousands of years in this Arctic Code Vault. Well, we took that same data snapshot and we brought it to our friends over at OpenAI to see like, okay, what can we do with these large language models built on public code?
Well, it turns out we can do some pretty cool things. Just like a translation tool that goes from English to Spanish, Spanish to German, you can also go from English to Python or Python to C#. We’re like, okay, this is cool. We can start to get not only translation, but a little bit of predicted text here as well. We’re all I think fairly already familiar with predictive text already in our code editors as IntelliSense. But in, I don’t know, you go to your favorite word processor and chances are that you’ve got some kind of predictive text happening there as well.
We started experimenting with different user experiences, right? Do we want it so that you, I don’t know, right click and get a little side panel that comes up with a bunch of different options for things that you might want here. That was nice because it would give you whole functions, but it’s out of the cursor, right? You had to really… Even if you weren’t switching over to a different window, you still had to switch over to a different panel, which itself was a little bit distracting. We eventually came to this idea of inline autocomplete.
We were able to with the kind of partnership of some of our friends over on the Microsoft side of things, partner with our friends in Visual Studio Code, they’re like, hey, there’s not really an extensibility yet in your editor for this multi-line autocomplete, but we’ve got an idea for how this might work. Played around with the actual presentation of it. What should the keystrokes be? What should the presentation layer be? The gray italicized text seemed to be a good way of indicating that it was ephemeral, as it were. Pretty early on, we landed on this user experience that is Copilot as most developers experience it today. I want to say that was at least 16 months ago, 14, 16 months ago. Since then, we brought it to developers.
Lenny: Just to double click on that, you’re saying just less than a year and a half ago, this kind of really started as a project and now it’s out to the world. Is that right?
Ryan J. Salva: That is exactly right. That’s exactly right. It’s about a year and a half ago.
Lenny: That’s insane. What was that period between OpenAI almost taking down GitHub to I guess that point?
Ryan J. Salva: The period in between kind of OpenAI almost taking down GitHub and then us really arriving at the user experience, part of that was, frankly, a lot of really smart researchers at OpenAI experimenting and doing what only world class AI researchers can do. It was a lot of them experimenting, occasionally asking for updates to the data set, tossing back to us a model that we might play with and tinker around with. These models have literally thousands of parameters that you can pass to them. When you’re really thinking about GPT-3 and Codex and then the transition from that to something like Copilot, it was not just like the model…
Ryan J. Salva: Creating the model is one thing, but then figuring out how to use the model in terms of what parameters do you want to adjust for, what do you want to optimize for in terms of… A great example of this is performance, right? When you’re in a code editor, you don’t necessarily want to type, type, type and then have to wait one second, two seconds, three seconds to get a suggestion back when your entire goal is to stay in the flow. We would run experiments to see how many milliseconds are the right amount such that a developer doesn’t feel like they’re being interrupted by Copilot and a suggestion.
Lenny: What’s the answer to that?
Ryan J. Salva: It seems like right now it’s around 200 milliseconds. Depending upon where you’re in the world, your latency can go up or down a little bit from there. But it seems like the sweet spot is somewhere around 200 milliseconds.
Lenny: Good to know.
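The latency budget described above can be illustrated with a minimal Python sketch. This is not how Copilot is implemented; it only shows the general idea of dropping a suggestion that arrives too late to keep the developer in flow. The names `suggest_within_budget`, `fetch`, and `LATENCY_BUDGET_SECONDS` are hypothetical, and the 0.2-second value is taken from the roughly 200 milliseconds mentioned in the conversation.

```python
import asyncio

# Assumption: ~200 ms, the "sweet spot" mentioned in the conversation.
LATENCY_BUDGET_SECONDS = 0.2


async def suggest_within_budget(fetch, budget=LATENCY_BUDGET_SECONDS):
    """Await a suggestion, but give up once the latency budget is spent.

    `fetch` is a hypothetical coroutine function standing in for a call
    to a completion model. Returns the suggestion, or None if it would
    have arrived too late to show without interrupting the developer.
    """
    try:
        return await asyncio.wait_for(fetch(), timeout=budget)
    except asyncio.TimeoutError:
        # Too slow: show nothing rather than interrupt the flow.
        return None
```

An editor integration built this way would simply render nothing when the function returns `None`, so a slow response costs the developer no attention.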
Ryan J. Salva: We also experimented quite a bit. It’s not just about the model, but it’s also about what you feed the model. How do you prompt the model to return back a useful response? This kind of began a journey of experimentation for what we call prompt crafting.
Lenny: Going back to the way this started, it sounds like basically it was kind of this fortunate accident where OpenAI just did something that you didn’t expect. And then somebody within this PhD group that you described is like, “Oh wow. Maybe we could do something really good with this.” Is that kind of how it began?
Ryan J. Salva: That’s fairly accurate. Yeah. I mean, we had a model that really was amazingly good, like a step level change in actual intelligence, right? And then marrying that up against a really good use case that actually changes developers’ fundamental experience of the creation process, the creative process.
Lenny: Was there kind of a point at which it was clear to you or leadership in general like, we should double down on this thing and go big? Or this smaller team was working on this idea and then you’re like, “Oh wow, this is going to work?” Or is it always like, “We will bet on this thing, this is such a big and great idea. We’re going to invest resources for sure from the beginning?”
Ryan J. Salva: The original team that was working on Copilot at GitHub was the team that we call GitHub Next. Essentially their job is to work on second and third horizon projects. What some folks might call moonshots, right? Things that we never really expect to work in the next one or two years, but might three, five years down the line actually turn into something meaningful.
Lenny: Is there a concrete definition of horizon two and three? Is it like number of years out like Amazon style?
Ryan J. Salva: Not necessarily a concrete definition. For me, I usually ballpark it as first horizon is the next year, second horizon, the next three years, third horizon, the next five years. But we generally think of it more as a measure of ambiguity and confidence level more than calendar dates.
Lenny: Enabling real-time payments, automatic reconciliation, continuous accounting and compliance solutions, Modern Treasury’s platform is used to reconcile over $3 billion per month. They’re one of the hottest young FinTech startups on the market today, having raised funding from top firms like Benchmark, Altimeter, SVB Capital, Salesforce Ventures, and Y Combinator. Check them out at ModernTreasury.com. I’d love to spend a little bit more time on this. It’s so interesting. Is this a Microsoft thing, just having these three horizons in a certain percentage of resources or bet on different horizons?
Ryan J. Salva: I would say it is not necessarily a Microsoft thing, but is definitely at GitHub, how we have really contextualized it. Not to say that there aren’t teams at Microsoft who might also use that methodology, but where we’ve been really maybe explicit or intentional about it is at GitHub where we’ve actually ring-fenced a team to think about that horizon two and horizon three work and kept them separate from EPD. EPD here being engineering, product, and design, the folks who are working on building productized operational products that we bring to market and we either give away or monetize in some way.
Lenny: This is so interesting. There’s a lot of companies that have these sorts of R&D groups, the New Product Experimentation team at Facebook, and Google has one. I’m not sure how many successes have come out of these teams. From what I’ve seen, and I’m curious, what have you… And clearly you had a huge success as far as I can tell so far. Is there anything you’ve learned about how to do this, where you invest in these big moonshots within a larger company?
Ryan J. Salva: I mean, I think the first step is to invest in it. The first step is really hire really smart people, attract smart people, and give them the opportunity to be creative. Don’t expect anything out of them that is going to turn into a money maker or something that is going to be beholden to fundamentals around security, privacy, uptime, accessibility, all that groovy kind of stuff upfront. They need space to create and experiment.
And also, when you do get to a place where that team has an idea that is clearly connected to a representative set of customers who have a genuine problem and there is signal with at least medium confidence that this solution, whatever it is, solves it in a novel way, that’s the time to start thinking about, okay, let’s actually put a little bit of… I’m going to call this market testing. It’s nothing so formal as market testing. It’s really just like, let’s start to actually bring prototypes of this in front of more and more customers to kind of test it out and see, hey, is this actually solving a problem for you? Is this something that you would use? This is where the transition between Next and EPD at GitHub really started.
This is actually where my role in the product cycle kind of really started to increase. I had kind of been in tight connection and been monitoring the work and kind of consulting a little bit with the Next team prior to that. But it was that moment when we identified that, okay, this is actually something real. Customers are saying, developers are saying, “This is magical. This does something extraordinary that I could not do on my own,” that we started to think about, okay, how do we transition this over? From there, we’re really just like, okay, we think we’ve got a hit here. We think we’ve got something that we can actually bring to developers.
Ryan J. Salva: We made an intentional decision to take some of the researchers who were in the Next team and for a finite period of time, move them over to create a new EPD squad. We want them to be researchers, but we need to do knowledge transfer and we needed to actually provide the seed for a team that could eventually operationalize and productize. And that kind of began the technical preview where we started to invite tens of thousands, then hundreds of thousands to the technical preview. In that technical preview, we started to see crazy mind-blown emoji tweets and threads on Hacker News about people getting really, really excited about it.
That’s how we knew it was time to start scaling and it was time to really start thinking about how do we do hiring so that we can build in some insulation around these researchers so that they can eventually go back to GitHub Next to do what they do best, which is be innovative and creative and think about the next moonshot. That process, that took… Well, we’re actually still kind of at the tail end of it now. Here we are, like I said, roughly a year and a half after the initial creation of the product, having gone through technical preview, have achieved general availability. We’ve now hired in a team around them.
The researchers actually as early as last month have started to gradually move back over to GitHub Next. An EPD squad, multiple EPD squads actually are now taking the product forward and starting to respond to customer feedback to think about, okay, how do we now as a product team, carry this roadmap forward from an idea that originated in GitHub Next?
Lenny: I love that insight of bringing the people along and not just kind of like, cool, we’ll take it from here. If you were to build a team like this again somewhere to this kind of R&D horizon three or two teams, is there anything else you would do differently, any lessons you take away from this experience for maybe founders or PMs working at larger companies that are like, “Hey, we should have something like this?” Is there anything else that you find is important for making something like this successful?
Ryan J. Salva: The criteria for moving researchers back into their R&D team, whatever that happens to be for your organization, that can’t be based on a calendar. It needs to be based on a replacement in seat, who’s actually doing the job and has picked up all of the skills necessary, and only then can the researcher move back. Make sure that you’ve got continuity of expertise and skill sets and domain familiarity before you move over. I feel like we’ve managed that pretty well today. As well, it’s critical that the team who is taking over from the R&D shop feels like they have control over their own future. You can’t really delegate roadmap to an R&D team.
The team who’s responsible for maintaining the product, for building the product, who has the closest feedback loop with the end customer, they’re the ones who really need to own and feel like they control the roadmap. Making sure that you’re not outsourcing innovation exclusively to an R&D team, but that is happening within the product team as they take ownership over the idea and over the use case in the customer. Last I would say here is really that engineering fundamentals in a lot of ways are the contracts that differentiate an R&D team from an operational product team.
Bringing that fundamentals process into it is going to feel candidly a little bit unnatural to the researchers. That takes therefore a little bit of cultural change management for everyone to just adapt their way of working and understand that we’re graduating from an experiment and a research project to an operational product, and often because those researchers are… They’re the first wave that come over. They’re the seed of the project. It’s going to feel a little bit unnatural to them and they probably won’t have all the right skillsets in order to make that transition.
Making sure that you’ve got a good mix of engineers who are comfortable maintaining a service, as well as engineers and researchers who are really thinking about, what is the idea that we’ve created, what is the new thing that we’ve brought to market, and can bring that vision to it.
Lenny: Yeah, I can totally see the challenge that comes from… This was my thing. I’ve been working on this. What are you guys doing to this project? Where is this going? I’m not sure I’m feeling… And then there’s all these new asks that are coming at you like, oh my God, this was so much fun and now I have to scale this freaking thing.
Ryan J. Salva: I mean, this is the best problem in the world to have. Talk about kind of customer ask, for Copilot in particular, the amount of chatter, the amount of customer feedback that was coming in especially for us with AI, I mean, the world is still figuring out AI, candidly. I mean, we’re getting a lot better at it, especially in the last couple of years with things like DALL-E and Copilot. But it brings with it not only engineering challenges, but also, frankly, ethical challenges and legal challenges, like making sense of what our expectations are of AI. If AI produces something that is offensive, who’s at fault?
Our stance on it, what we ended up coming to is actually the framing of Copilot as an AI pair programmer I think is a useful one. Pair programmer, I suspect most of your listeners will know, but pair programmer is usually two developers sitting side by side working on a problem together. One’s at the keyboard and the other one’s kind of helping them talk through it, talk through the ideas and make corrections, that kind of thing. Well, if Copilot is your AI pair programmer and they’re whispering crazy stuff into your ear and they’re bringing politics into it or gender identity into it or, I don’t know, whatever other…
They’re spouting off slang and slander and all that kind of stuff. You’re probably not going to be able to focus on your work, right? It’s going to be really distracting. Really coming down to some principles about what is the use case we’re trying to solve, what is appropriate, I put this in scare quotes, behavior of the AI bot sitting side by side with you, helped us create some principles or some guidelines for the developer experience that we wanted to create.
Lenny: Oh, I love that. Just kind of creating a persona of the thing to help you inform how the behavior of the thing should work. How do you work through these challenges? Is it discussions with you and the legal team? I don’t know, these ethical things are really tricky, I imagine. How do you approach them like that as a product team?
Ryan J. Salva: It is conversations with a very, very wide cast of characters. This product in particular, I probably spent more time with legal than any other product that I’ve ever kind of been responsible for. All wonderful creative people. But it’s not just legal. It is also privacy and security champions. It is, frankly, developers, like the people who are using it, listening to them. Hey, what works here? What doesn’t work for you here? Why is this offensive? Why is it not offensive? We’ll continue on the example of the crazy pair programmer whispering crazy things into our ear. When we first started out, we didn’t really have any filter on Copilot whatsoever in the very, very, very early days.
Ryan J. Salva: And then eventually we’re like, okay, it needs to be slightly more controlled experience. We need to edit out some of the most egregious stuff. We introduced a simple block list of words, and these block lists are always fraught with peril, like which words are okay, which words are not okay. All of a sudden, we become editors of language and that’s kind of a scary place to be. I’m not comfortable with it at least. But at a certain level, it has to be done, because otherwise you’re going to create a bad developer experience.
Often we would get feedback from developers of like, “Hey, this particular word was blocked. That it was blocked either was offensive to me or prevented me from being able to get good value out of the product.”
Lenny: Oh man.
Ryan J. Salva: Always kind of dancing the dance of editorial content. We’re actually at a place now where we’re able to partner with the Azure Department of Responsible AI, and they’ve created some really extraordinary models that help detect I’ll call it sentiment for lack of a better word, but basically when there is something that is patently offensive. Because there are some words that in some contexts may be offensive and in some contexts may be totally reasonable, especially when you get into software for medical kind of scenarios, right?
Being able to start to shift a little bit to focus or to rely on AI models that can also do a better job than we could with crude or simple block lists is maybe another proof point both of how AI as a solution for common development problems is getting way better at solving more parts of our stack or filling in for more parts of our stack. At least in our case, we were pretty fortunate to be able to deliver on or depend on a parent company’s contributions to solve a real acute problem that GitHub probably could not have solved on our own.
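The trade-off with a simple block list can be seen in a small Python sketch. This is purely illustrative: the word list, the function names `is_blocked` and `filter_suggestions`, and the matching rule are all assumptions, since the conversation does not describe the actual list or how it was applied.

```python
import re

# Hypothetical block list for illustration only; the real list is not
# described in the conversation.
BLOCKED_WORDS = {"kill", "abort"}


def is_blocked(suggestion, blocked=BLOCKED_WORDS):
    """Return True if any whole word in the suggestion is on the block list."""
    words = re.findall(r"[a-z]+", suggestion.lower())
    return any(word in blocked for word in words)


def filter_suggestions(suggestions, blocked=BLOCKED_WORDS):
    """Drop every suggestion that contains a blocked word."""
    return [s for s in suggestions if not is_blocked(s, blocked)]
```

With this list, a perfectly reasonable suggestion such as `os.kill(pid, signal.SIGTERM)` is filtered out because it contains the word "kill". A context-free list cannot tell that use apart from an offensive one, which is the kind of false positive that motivates the move to models that take context into account.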
Lenny: I never thought that Copilot would be… That you would have to worry about it saying things that are crazy. That is wild that you guys have to deal with that. Wasn’t it Microsoft that had that bot that turned really negative and eventually shut down?
Ryan J. Salva: It was.
Lenny: There’s experience there.
Ryan J. Salva: What was its name? Talia or something like that?
Lenny: Something like that.
Ryan J. Salva: Yeah, something like that. We don’t want another one of those incidents.
Lenny: Wow. What this makes me think about is your team is at the forefront of AI in this applied way. I’m curious what your thinking is on just where this goes for developers especially. I saw a stat that maybe 40% of people’s code is now written by Copilot. I don’t know if that’s right. But is the vision in the future becomes something like 90? Where do you see this all going?
Ryan J. Salva: Just to put a fine point on that stat, it is 40% is specifically for Python developers. Candidly, it varies depending upon the language. Because as you might imagine, some languages have better representation in the public domain than others. And usually both the volume and the diversity of training data correlates with the quality of suggestions, which is then represented by either the number of lines written or the acceptance rate or any one of a number of other metrics.
Lenny: Awesome. Thanks for clarifying.
Ryan J. Salva: Yeah, totally. We see it range anywhere from the upper twenties to the forties across all the different languages.
Lenny: Just to throw this out there, as a not great engineer, I used to be an engineer for about 10 years, I welcome our AI Overlords writing all my code. I’m excited for this to do more and more. And yes, I’m curious where you think this goes.
Ryan J. Salva: It does. It enables even mediocre developers like myself to be able to do some pretty amazing things. But where’s it going? First, I think, I hope it’s obvious to most developers that AI is going to infuse pretty much our entire development stack in the not so distant future. Copilot is really just the very tip of the spear for a lot of innovations and better managing maybe our build queues or helping to… Here’s a great one. I don’t know about you, but often the comments that I get with commit messages and PRs aren’t super great. It puts a lot of effort onto the code reviewer to go figure out what the developer was actually trying to do.
What if AI could summarize all of your changes with your pull request and you just have to, as the contributing developer, just review it to make sure it’s accurate, send it on its way, and you don’t have to put in extra effort for that. There are lots and lots of different opportunities for AI to essentially be able to take some of the drudgery out of our work so that we can focus on creative acts. What I hear from developers and what I experience myself is that Copilot kind of forces me to think a little bit more about what are the design patterns I’m trying to create?
What is the end user experience or the outcomes that I’m trying to drive with my code, and that I can rely on Copilot to scaffold out a lot of that so that I can focus on more creative work? That is really what I hope for our industry five, 10 years from now, is that not only will we be inviting more developers or more people to become developers by essentially providing a layer of abstraction a little bit, or at least a little bit of a hand in development, but that also the really experienced developers are focusing on much larger problems and focusing on outcomes and creativity rather than really low level difficult rote memorization of things like syntax or ordering of parameters and the like.
Lenny: Great. If nothing else, that’ll keep people from just having a tab of Stack Overflow, copy and pasting every function that they’re trying to figure out.
Ryan J. Salva: I want Stack Overflow to stay in business, but I wouldn’t mind a little bit less context switching myself.
Lenny: In the experience of scaling this thing, what would you say has been the biggest challenge either technologically or even operationally just kind of scaling it to a real product that people are paying for?
Ryan J. Salva: There’s a few dimensions of that. One is a problem that’s very much of our time in the world, namely that supply chains have been disrupted dramatically over the course of the last few years. It turns out that Copilot for both training and operating the models requires some very rare and unique GPUs that there’s not a lot of global supply of. Part of it is just like, can we get enough hardware in order to run these things? We’ve actually earmarked quite a bit of capacity, and we are greedy, greedy, greedy for more capacity globally. As soon as we can produce those chips and get them in data centers, we do it.
That’s been one kind of unique challenge. I would also say here that operationally, another challenge has been, how do we create a model that the community really feels ownership over, right? The dialogue that’s had to happen as we brought an AI tool to market, especially one that is trained on public code, has required a lot of dialogue between us and our community. Every good product manager should be spending as much of their time as possible with their customers, with their potential customers.
Copilot, in particular, has been a more complicated kind of rollout because we as an industry, as a society are still figuring out how to make sense of it. The amount of give and take between developers and us as a product team has really required us to scale up more of the product team than it has the engineering team.
Lenny: Interesting. And why is that?
Ryan J. Salva: It’s a couple of different reasons. I mean, one, like I said, we are trained on public code. Not all of the community is really sure like, when is it okay to train a model on public code? When is it not okay to train a model on public code? Is Copilot producing secure suggestions? Is Copilot producing buggy suggestions? There’s a lot of doubt. There’s a lot of very healthy skepticism. Actually I mean that genuinely. I want people to be skeptical of Copilot. We owe it to ourselves as a community to be skeptical of any AI.
Because just like there’s great potential for benefit, there’s also great potential for harm. People keeping us accountable like, how are you preventing things like model poisoning? Is there going to be a new attack vector that we just haven’t really thought of yet around AI that might produce negative consequences? We think that we’ve done a really good and responsible job of that by making sure that first, we are very clear that Copilot is not a replacement for a developer. It will never be.
We do not want Copilot auto generating code where a thinking, reasoning, breathing human being is not on the other side of that keyboard making reasoned decisions. We do not want Copilot to replace any other part of the stack, whether it is static analysis tools or your unit tests or whatever kind of measures you’re putting in today to make sure that your humans produce good quality code. We want you to keep all of those same systems in place to make sure that humans who are leveraging tools like Copilot continue to produce that good quality code.
But at the same time there’s a lot of anxiety of like, where does AI stop? Is AI eventually going to be… This is back to your question about where will we be five, 10 years from now. Will it be writing 90% of the code? We don’t want Copilot to be that… We don’t want it to replace anything. We want it to augment. The idea here is really that AI is an enabler for developers to focus on the creative work, to stay in the flow, to be able to move faster. Working through those anxieties, working through that healthy skepticism takes conversation. It takes dialogue. And that takes us on the product side having that guided conversation with the community.
Lenny: It feels like it connects back to your education back in the day, philosophy and literature. How convenient is that?
Ryan J. Salva: It often feels very connect… I mean, certainly the education side of things taught me that the importance of dialogue, the importance of skepticism is valuable in so much more than esoteric armchair ponderings. It’s actually applicable to the real world.
Lenny: Maybe a final question before we get to our very exciting lightning round.
Ryan J. Salva: Woo!
Lenny: Just looking back at this whole experience of, one, just building, incubating, launching this big bold bet within a big company, you can go in either direction, either just any lessons on just taking a bold bet versus incremental wins and how you think about investing in these two kind of categories, or just within a large company, a lesson of just how to build something like this, like a massive new product from just a seed of an idea to a large new business line potentially.
Ryan J. Salva: As both a product manager and a portfolio manager of multiple products, I’m responsible for multiple product lines at GitHub, the allocation of time, of focus, energy, and resources becomes a really challenging question. The answer to which isn’t always the same, depending upon the time, world circumstances, organizational circumstances, technology circumstances. As a general rule, as a general principle, I certainly try to make sure that we’re always reserving some capacity for bold, audacious experimental research projects. You can think of those really uncertain bets as being five to 10% of the team’s capacity. About 25, maybe 30% of the team’s capacity should generally be on just operations.
How do we keep our in-market products meeting customer expectations? And then the remainder of it, what is that, about 60% or so, is really on incremental progress for our in-market products. How do we make iterative improvements and continue to actually realize payoff for the larger bets that we made one, two, three, four years back? And from a rough distribution, that’s generally how I run my larger teams. That works when you have larger teams though. At startups, where we were pretty much only a big bet, obviously your percentages get very different and it becomes a matter of you’re all in for that one proverbial lottery ticket.
Lenny: Awesome. Thanks for sharing that. I was going to ask you the percentages that you recommend. Thank you for getting to that. With that, we’ve gotten to our very exciting lightning round. I’m just going to ask you five questions briefly and just whatever comes to mind, whatever answer you have. Let’s do it. Sound good? Okay. What are two or three books that you recommend most to other people?
Ryan J. Salva: Oh, good question. One of them is a book on user experience called Make It So. It’s a reference back to Star Trek, and the idea here is essentially that user experiences that are presented to us in sci-fi often make their way into our everyday products and tools 20, 30 years down the line. It is a great eye-opening, illuminating and just really fun book. That’s one. And then completely different take, I’ll go outside of tech and I’ll just do entertainment value. There’s a David Foster Wallace book called Brief Interviews with Hideous Men that I love. It’s a collection of short stories.
And essentially what it is, is it is if you’re watching a movie and the villain gets their opportunity to have their big speech, which kind of explains why they are who they are, it makes them maybe a little bit vulnerable in that moment, it’s that speech 10 times over for different hideous people, terrible, terrible people. Interesting read. I recommend it.
Lenny: I love that. It reminds me of this book that is the interior design of dictators and they show you the homes of Saddam Hussein, Hitler, and all these guys.
Ryan J. Salva: Dude! Oh my gosh, that’s awesome. I got to find that one. You’ll have to send it to me.
Lenny: I found one at an old bookstore, like used bookstore. I don’t know if they’re around anymore, but I’ll find it. Second question. What’s a favorite other podcast that you like to listen to or recommend if there’s any?
Ryan J. Salva: Oh god, there’s so many. I consume hundreds of hours of podcasts every month. It is crazy. I can choose many. I’ll give you just one. The Memory Palace with Nate DiMeo is an excellent storytelling podcast. He does about 20 minute vignettes, usually selected from kind of American history. He also was the artist in residence at one of the museums in Washington, DC. And if you’re ever at I think it’s the American History Museum or something like that, if you’re ever there, you can go to different rooms in the museum and he’ll tell you stories about the objects or the rooms that you see there. It’s a magical experience recommended to anyone.
Lenny: Wow! I love those. What’s a recent movie or TV show that you’ve really enjoyed?
Ryan J. Salva: I don’t know if this counts as recent, but it’s one that I watched recently, which was Arrival. Yeah, that counts. Arrival. Movie ostensibly about aliens, but is really about language and memory. I found that really, really compelling.
Lenny: Have you read Ted Chiang books and short stories?
Ryan J. Salva: I have not. I have not.
Lenny: Oh wow! Oh, you would love it. Arrival is based on one of his stories, I believe, and there’s a whole book of many more short stories by the same guy. They’re amazing.
Ryan J. Salva: Brilliant. I’ve got my weekend cut out for me then.
Lenny: There you go. Just leave work and get to reading. What’s a favorite interview question that you like to ask in interviews?
Ryan J. Salva: Let’s see here. I’ll give you a fun one more than it is a challenging one. This is kind of my icebreaker interview question, particularly for more early to mid career product managers. I ask them to teach me something new in one minute. Usually I’ll pull up my phone and I’ll start the timer. I’ll give them a second to think about it and start the timer. They’re graded on three different criteria. One is completeness. Did they actually finish the lesson inside of one minute? Two is complexity. It’s one thing if you teach me how to, I don’t know, pat my head and rub my stomach at the same time.
It’s another thing if you teach me something about 18th century art and its connection to religious trends at the time. And then last is really clarity. Oh yeah, clarity is the last one. Clarity is like, do I actually understand? Did I learn something by the end of the lesson? Did they convey the idea fully and wholly?
Lenny: I have to ask, what’s the most interesting thing somebody has taught in this question?
Ryan J. Salva: My go-to kind of throwaway answer there about did they teach me something about 18th century art and its connection to religious trends at the time, someone taught me that. It was astounding. It was actually a university candidate, so someone who was still in university, and she was from Vanderbilt University.
Lenny: And was that a strong yes hire?
Ryan J. Salva: It was an extremely strong yes hire. She was freaking amazing. Such a smart person.
Lenny: Amazing. Final question, who else in the industry would you say you most respect as a thought leader or just influential person?
Ryan J. Salva: There are many, but I think for today I’d probably beat myself up if I didn’t say Oege de Moor. Oege is the primary researcher who really kind of is the true innovator for Copilot. He deserves credit for the initial work and is a brilliant technologist and futurist. I really, really respect him a lot.
Lenny: Amazing. Cool call out. Ryan, this has been so fascinating. You guys are at the forefront of so much interesting work. I honestly can’t wait for Copilot for my newsletter so that I can do less work. Maybe that’ll come someday. But in any case, I’m excited to see where this whole thing goes. Thank you for being here. Two last questions. Where can folks find you online if they’re curious to learn more, reach out? And then is there a way that listeners can be useful to you?
Ryan J. Salva: Easy one. How can folks find me? I am Ryan J. Salva everywhere, Twitter, GitHub. Pick your choice. LinkedIn, Ryan J. Salva. And then how can folks be useful to me? Please, there is a 60 day free trial of Copilot that is there for everyone to pick up and use. Go try it out. When you do, post either on Twitter or Hacker News or on discussions, GitHub Discussions, your experience.
Give us the good feedback. Give us the bad feedback. I am so hungry to see how people are using it in novel ways and where they’re running up against the rough edges too. Like I said, there’s lots of room for us to grow and improve from here, but I’m pretty confident that developers will be pretty freaking amazed at what it’s already capable of.
Lenny: Awesome. Thanks for being here, Ryan.
Ryan J. Salva: Yeah, dude, thank you so much. It’s been a lot, a lot of fun.
Lenny: Thank you so much for listening. If you found this valuable, you can subscribe to the show on Apple Podcasts, Spotify, or your favorite podcast app. Also, please consider giving us a rating or leaving a review, as that really helps other listeners find the podcast. You can find all past episodes or learn more about the show at lennyspodcast.com. See you in the next episode.
Glossary
| English | 中文 |
|---|---|
| Actions | Actions |
| adaptogens | 适应原 |
| ambiguity and confidence level | 模糊度和置信水平 |
| Arctic Code Vault | 北极代码库 |
| artist in residence | 驻馆艺术家(artist in residence) |
| attack vector | 攻击向量(attack vector) |
| Azure Department of a Responsible AI | Azure 负责任 AI 部门 |
| block list | 屏蔽列表 |
| build queues | 构建队列 |
| CI/CD | CI/CD |
| code reviewer | 代码审查者 |
| Codespaces | Codespaces |
| CodeX | CodeX |
| commit message | commit message |
| context switching | 上下文切换 |
| critical theory | 批判理论 |
| Dolly | DALL-E |
| early to mid career | 初级到中级 |
| EPD | EPD(工程、产品和设计) |
| flow | 心流 |
| general availability | 正式发布 |
| GitHub Next | GitHub Next |
| GPUs | GPU |
| Hacker News | Hacker News |
| horizon | 地平线 |
| icebreaker | 破冰 |
| inline autocomplete | 内联自动补全 |
| knowledge transfer | 知识转移 |
| mental map | 心智地图 |
| mind share | 心智资源 |
| model poisoning | 模型投毒(model poisoning) |
| moonshots | 登月项目 |
| One Engineering System | One Engineering System(微软统一工程系统) |
| pair programmer | 结对编程者 |
| philosophy of aesthetics | 美学哲学 |
| portfolio manager | 产品组合经理(portfolio manager) |
| PR | PR |
| probiotics | 益生菌 |
| prompt crafting | 提示词精炼(prompt crafting) |
| R&D | 研发(R&D) |
| scaffolding code | 脚手架代码 |
| sentiment | 情感 |
| silver film | 银胶片 |
| strong yes hire | 强烈的录用意向 |
| supply chains | 供应链 |
| technical preview | 技术预览 |
| V1 product | V1 产品 |
| VP of Product | 产品副总裁 |