Inside OpenAI | Logan Kilpatrick (head of developer relations)
Finding High-Agency People
Logan Kilpatrick: Finding people who are high agency and work with urgency, if I was hiring five people today, those are some of the top two characteristics that I would look for in people because you can take on the world if you have people who have high agency and not needing to get 50 people’s different consensus. They hear something from our customers about a challenge that they’re having, and they’re already pushing on what the solution for them is and not waiting for all the other things to happen that… People just go and do it and solve the problem, and I love that. It’s so fun to be able to be a part of those situations.
Inside the OpenAI Board Saga
Lenny: Today my guest is Logan Kilpatrick. Logan is head of developer relations at OpenAI, where he supports developers building on OpenAI’s APIs and ChatGPT. Before OpenAI, Logan was a machine learning engineer at Apple and advised NASA on their open source policy. If you can believe it, ChatGPT launched just over a year ago and transformed the way that we think about AI and what it means for our products and our lives. Logan has been at the front lines of this change, and every day is helping developers and companies figure out how to leverage these new AI superpowers.
In our conversation, we dig into examples of how people are using ChatGPT and the new GPTs and other OpenAI APIs in their work and their life. Logan shares some really interesting advice on how to get better at prompt engineering. We also get into how OpenAI operates internally, how they ship so quickly, and the two key attributes they look for in the people that they hire, plus where Logan sees the biggest opportunities for new products and new startups building on their APIs.
We also get a little bit into the very dramatic weekend that OpenAI had with the board and Sam Altman and all of that, and so much more. A huge thank you to Dan Shipper and Dennis Yang for some great questions and suggestions. With that, I bring you Logan Kilpatrick after a short word from our sponsors.
Then, when you’re ready to share, you can use Hex’s drag and drop app builder to configure beautiful reports or dashboards that anyone can use. Join the hundreds of data teams like Notion, AllTrails, Loom, Mixpanel, and Algolia using Hex every day to make their work more impactful. Sign up today at hex.tech/lenny to get a 60-day free trial of the Hex team plan. That’s hex.tech/lenny.
Logan, thank you so much for being here and welcome to the podcast.
Timing and Team Cohesion
Logan Kilpatrick: Thanks for having me, Lenny. I’m super excited.
Amazing New AI Interfaces
Lenny: I want to start with the elephant in the room, which I think the elephant is actually leaving the room because I think this is months ago at this point, but I’m still just really curious. What was it like on the inside of OpenAI during the very dramatic weekend with the board and Sam and all those things? What was it like? And is there a story maybe you could share that maybe people haven’t heard about what it was like on the inside, what was going on?
Startup Moats: Vertical vs. General
Logan Kilpatrick: Yeah, it was definitely a very stressful Thanksgiving week. I think in broad context, OpenAI had been pushing for a really long time, since ChatGPT came out, and that was supposed to be one of the first weeks that the whole company had taken time away to actually reset and have a break. So very selfishly, I was super excited to spend time with my family, all that stuff. And then, yeah, Friday afternoon we got the message that all of the changes were happening, and I think it was super shocking because I think, and this is a perspective a lot of folks share, everybody had and continues to have such deep trust in Sam and Greg and our leadership team that it was just very surprising. And we’re also a very, as far as company cultures go, very transparent and very open. So when there’s problems or there’s things going on, we tend to hear about them. And again, it was the first time that a lot of us had heard some of the things that were happening between the board and the leadership team, so very, very surprising.
I think my being someone who’s not based in San Francisco, I was, again, very selfishly kind of happy that it happened over the Thanksgiving break because a lot of folks actually had gone home to different places. So it felt like I had a little bit of comfort knowing I wasn’t the only one not in San Francisco, because everybody was meeting up in person to do a bunch of stuff and be together during that time. So it was nice to know that there was a few other folks who were sort of out of the loop with me.
I think the thing that surprised me the most was just how quickly everybody got back to business. I flew to San Francisco the next week after Thanksgiving, which I wasn’t planning to do, to be with the team in person. And literally Monday morning, I was walking into the office expecting, I don’t know, something weird to be going on or happening. And really it was like people laser focused and back to work, and I think that speaks to the caliber of our team and everybody who’s just so excited about building towards the mission that we’re building towards. So I think that was, yeah, that was the most surprising thing of the whole incident. I think a lot of companies would’ve had the potential to truly be derailed for some non-trivial amount of time by this, and everybody was just right back to it, which I love.
Boosting Internal Efficiency with AI
Lenny: I feel like it also maybe brought the team closer together. It feels like it was this kind of traumatic experience that may bring folks together because it was something they all shared. Is there anything along those lines that’s like, “Wow, things are a little different now?”
Logan Kilpatrick: One of my takeaways was I’m actually very grateful that this happened when it happened. I think today the stakes are… They’re still relatively high. People have built their businesses on top of OpenAI. We have tons of customers who love ChatGPT, so if something bad happens to us, we definitely impact our customers. But on the world scale, somebody else will build a model if OpenAI disappeared and continue towards this progress of general intelligence. I think fast-forward five or 10 years, if something like this would’ve happened and we hadn’t gone through the hopeful upcoming work transformation and all those changes that are going to happen, I think it would’ve been a little bit, or potentially much worse, of an outcome. So I’m glad that things happened when the stakes are a little bit lower.
And I totally agree with you. It’s like the team has been growing so rapidly over the last year since I joined. It’s been crazy to think about how many new folks there are, and I really think that this brought people together. Historically, for many of the folks when I joined, what banded us all together was the launch of ChatGPT and the launch of GPT-4. For folks who weren’t around for some of those launches, it was perhaps Dev Day. For folks who weren’t around for Dev Day, it was probably this event. So I think we’ve had these events that have really brought the company together cross-functionally, so hopefully all the future ones will be really exciting things like GPT-5, whenever that comes, and stuff like that.
The Rise of Prompt Engineering
Lenny: Awesome. We’re going to talk about GPT-5. Going in a totally different direction, what is the most mind-blowing or surprising thing that you’ve seen AI do recently?
Logan Kilpatrick: The things that are getting me most excited are these new interfaces around AI, like the Rabbit R1. I don’t know if you’ve seen that, but it’s a consumer hardware device. And there’s this company called tldraw. I don’t know if you’ve seen tldraw.
Practical Tips for Better Prompts
Lenny: I think so. You sketch something and then it turns it into a website?
Logan Kilpatrick: And that’s only a small piece of what tldraw is actually working on, but there’s all of these new interfaces to interact with AI. I was having a conversation with the tldraw folks a couple of days ago, and it really blows my mind to think about how chat is the predominant way that folks are using AI today. And I actually think, and this is my bull case for the folks at tldraw, I’m super excited for them to build what they’re building, but they’re sort of building this infinite canvas experience and you can imagine how, as you’re interacting with an AI on a daily basis, you might want to jump over to your infinite canvas, which the AI has filled in with all the details, and you might see a reference to a file and to a video and all of these different things.
And it’s such a cool way. It actually makes a lot more sense for us as humans to see stuff in that type of format than, I think, just listing out a bunch of stuff in chat. So I’m really, really excited to see more people. I think 2024 is the year of multimodal AI, but it’s also the year that people really push the boundaries of some of these new UX paradigms around AI.
The Launch of GPTs
Lenny: It’s funny. I feel like with chatbots, as a PM for many years, it feels like every brainstorming session we had about new features, it’s like, “Hey, we should build a chatbot to solve this problem.” It’s like the perennial, “Oh, chatbot,” or, “Someone’s going to suggest we do a chatbot,” and now they’re actually useful and working and everyone’s building chatbots, a lot of them based on OpenAI APIs.
There’s not really a question there, but maybe the question, I was going to get to this later, is just when people are thinking about building a product like, say, tldraw, what should they think about? Where is OpenAI not going to go, versus here’s what OpenAI is going to do for us? When shouldn’t we worry about them building a version of tldraw in the future? What’s the way to think about where you won’t be disrupted essentially by OpenAI, knowing also they may change their mind?
GPTs: Real Value and Future Potential
Logan Kilpatrick: That’s a great question. I think we’re deeply focused on these very, very general use cases like the general reasoning capabilities, the general coding, the general writing abilities. I think where you start to get into some of these very vertical applications… And I think a great example of this is actually Harvey. I don’t know if you’ve seen Harvey, but it’s this legal AI use case where they’re building custom models and tools to help lawyers and people at legal firms and stuff like that. And that’s a great example of our models are probably never going to be as capable as some of the things that Harvey’s doing because our goal and our mission is really to solve this very general use case and then people can do things like fine-tuning and build all their own custom UI and product features on top of that.
I have a lot of empathy and a lot of excitement for people who are building these very general products today. I talk to a lot of developers who are building just general purpose assistants and general purpose agents and stuff like that. I think it’s cool and it’s a good idea. I think the challenge for them is they are going to end up directly competing against us in those spaces and I think there’s enough room for a lot of people to be successful, but to me you shouldn’t be surprised when we end up launching some general purpose agent product because, again, we’re sort of building that with GPTs today. Versus, we’re not going to launch some of these very verticalized products. We’re not going to launch an AI sales agent. That’s just not what we’re building towards. And companies who are, and who have some domain specific knowledge and are really excited about that problem space, they can go into that and leverage our models and end up continuing to be on the cutting edge without having to do all that R&D effort themselves.
OpenAI’s Workflow and Rapid Iteration
Lenny: Got it. So the advice I’m hearing is get specific about use cases, and that could be either models that are tuned to be especially useful for a use case like sales or make an interface or experience solving a more specific problem.
High Agency and High Urgency
Logan Kilpatrick: And I think if you’re going to try and solve this very general, if you’re going to try to build the next general assistant to compete with something like ChatGPT, it has to be so radically different. People have to really be like, “Wow, this is solving these 10 problems that I have with ChatGPT, and therefore I’m going to go and try your new thing.” Otherwise we’re just putting a ton of engineering efforts and research effort into making that an incredible product, and it’s just going to be the normal challenges of building companies. It’s just hard to compete against something like that.
OpenAI’s Planning and Priorities
Lenny: Awesome. Okay, that’s great. I was going to get to that later, but I’m glad we touched on that. I imagine that’s on the minds of many developers and founders. Kind of along the same lines, there’s a lot of talk about how ChatGPT and GPTs and many of the tools you guys offer are going to make a company much more efficient. They don’t need as many engineers, data scientists, PMs, things like that, but I think it’s also hard for companies to think about what can we actually do to make our company more efficient. I’m curious if there’s any examples that you can share of how companies have built, say, a GPT internally to do something so that they don’t have to spend engineering hours on it, or generally just used OpenAI tooling to make their business internally more efficient?
Logan Kilpatrick: Yeah, that’s a great question. I wonder if you can put this in the show notes or something like that, but there’s a really great Harvard Business School study about… And I forgot which consulting firm they did it with. Maybe it was like Boston Consulting or something like that, but it might’ve been one of the other ones. And they talk about the order of magnitude of efficiency gain for those folks who are using AI tools, I think it was ChatGPT specifically in those use cases that they were using, compared against folks who aren’t using AI. I’m really excited, also, as more time passes after the release of this technology, for us to get more empirical studies. I feel just for myself, as somebody who’s an engineer today, I use ChatGPT and I can ship things way faster than I would be able to otherwise.
I don’t have any good metrics for myself to put a specific number on it, but I’m guessing people are working on those studies right now. I think engineering is actually one of the highest leverage things that you could be using AI to do today and really unlocking, probably on the order of at least a 50% improvement, especially for some of the lower hanging fruit software engineering tasks. The models are just so capable at doing that work. And it’s crazy to think… And I’m guessing, actually, GitHub probably has a bunch of really great studies they publish around Copilot and you could use those as an analogy for what people are getting from ChatGPT as well. But those are probably the highest leverage things.
I think now with GPTs, people are able to go in and solve some of these more tactical problems. I think one of the general challenges with ChatGPT is it gives a decent answer for a lot of different use cases, but oftentimes it’s not particular enough to the voice of your company or the nuance of the work that you’re doing. And I think now with GPTs and people who are using the teams in ChatGPT and Enterprise in ChatGPT, they can actually build those things, incorporate the nuance of their own company, and make solving those tasks much, much more domain specific. So we literally just launched GPTs a couple of months ago, so I don’t think there’s been any good public success stories, but I’m guessing that success is happening right now at companies, and hopefully we’ll hear more about that in the months to come as folks get super excited about sharing those case studies.
Slack and Communication Culture
Lenny: I’ll share an example. So I have this good friend, his name’s Dennis Yang, he works at Chime, and he told me about two things that they’re doing at Chime that seem to be providing value. One is he built a GPT that helps write ads for Facebook and Google. It just gives you ideas for ads to run, and so that takes a little load off the marketing team or the growth team. And then he built another GPT that delivers experiment results, kind of like a data scientist, with here’s the result of this experiment. And then you could talk to it and ask for like, “Hey, how much longer do you think we should run this for,” or, “What might this imply about our product,” and things like that. And I think it’s really-
Logan Kilpatrick: I love that.
Team Size at OpenAI
Lenny: Like you said. Is there anything else that comes to mind? Just things you’ve heard people do just like, “Wow, that was a really smart way of… ” So I get there’s engineering, co-piloting type tooling. Is there anything else that comes to mind? Just to give people a little inspiration of like, “Wow, that’s an interesting way I should be thinking about using some of these tools.”
Advantages of Small Research Teams
Logan Kilpatrick: I’ve seen some interesting GPTs around the planning use cases, like you want to do OKR planning for your team or something like that. I just actually saw somebody tweet it literally yesterday. I’ve seen some cool venture capital ones of doing diligence on a deal flow, which is kind of interesting, and getting some different perspectives. I think all of those horizontal use cases where you can bring in a different personality and get perspective on different things I think is really cool. I’ve personally used a GPT, the private GPT that I use myself that helps with some of the planning stuff for different quarters, and just making sure that I’m being consistent in how I’m framing things like driving back to individual metrics, stuff that, when people do planning, they often miss in our data, and then it’s been super helpful for me to have a GPT to force me to think about some of those things.
Future Directions and New Modalities
Lenny: Wait, can you talk more about this? What does this GPT do for you and what do you feed it?
Logan Kilpatrick: Yeah, I forgot what article I saw online, but it was some article that was talking about what are the best ways to set yourself up for success in planning. And I took a bunch of the… I’ll see if I can make it public after this and send you a link, but took a bunch of the examples from that and went in and put some of those suggestions into the GPT, and then now when I do any of my planning of I want to build this thing, I put it through and have it generate a timeline, generate all the specifics of what are the metrics and success that I’m looking for, who might be some important cross-functional stakeholders to include in the planning process, all that stuff, and it’s been helpful.
GPT-5: Expectations vs. Reality
Lenny: Wow, that is very cool. That would be awesome if you made it public. And if you do, we’ll link to it and we’ll make it the number one most popular GPT in the store.
Logan Kilpatrick: I love it.
OpenAI’s B2B Products
Lenny: Going in a slightly different direction, there’s this whole genre of prompt engineering. It feels like it’s one of these really emerging skills. I actually saw a startup hiring a prompt engineer, one of the startups I’ve invested in, and I think that’s going to blow a lot of people’s minds that there’s this new job that’s emerging. And I know the idea is this won’t last forever, that in theory AI will be so smart you don’t need to really think about how to be smart about asking it for things you need it to do. But can you just describe this idea of what is prompt engineering, this term that people might be hearing? And then even more interestingly, just what advice do you have for people to get better at writing prompts for, say, ChatGPT or through the API in general?
Logan Kilpatrick: Yeah, this is such an interesting space, and I think it’s another space where I’m excited for people to do more scientific empirical studies, because there’s so much gut feeling, best practices that maybe aren’t actually true in a certain way. I think the reason that prompt engineering exists and comes up at all is because the models are so inclined, because of the way that they’re trained, to give you just an answer to the question that you ask. Crap in, crap out. If you ask a pretty basic question, you’re going to get a pretty basic response. And actually the same thing is true for humans, and you can think of a great example of this. When I go to another human and I ask, “How’s your day going,” they say, “It’s going pretty good.”
Literally, absolutely zero detail, no nuance, not very interesting at all versus, again, if you have some context with a person, if you have a personal relationship with them and I ask you, “Hey Lenny, how’s your day going? How did the last podcast go,” et cetera, et cetera, you just have a little bit more context and agency to go and answer my question. I think this is prompt engineering.
My whole position on this is prompt engineering is a very human thing. When we want to get some value out of a human, we do this prompt engineering. We try to effectively communicate with that human in order to get the best output. And the same thing is true of models. And I think it’s like, again, because we’re using a system that appears to be really smart, we assume that it has all this context, but it’s really like imagine a human level intelligence but literally no context. It has no idea what you’re going to ask it. It’s never met you before. It has no idea who you are, what you do, what your goals are. And it’s the reason that you get super generic responses sometimes is because people forget they need to put that context in the model.
So I think the thing that is going to help solve this problem, and we already kind of do this in the context of DALL·E, so when you go to the image generation model that we have, DALL·E, and you say, “I want a picture of a turtle,” what it does is it actually takes that description, “I want a picture of a turtle,” and it changes it into this high fidelity prompt, like generate a picture of a turtle with a shell, with a green background and lily pads in the water and all this other stuff. It adds all this fidelity because that’s the way that the model is trained. It’s trained on examples with super high fidelity. This will happen with text models.
You can imagine a world where you go into ChatGPT and you say, “Write me a blog post about AI.” It automatically will go and be like, “Let me generate a much higher fidelity description of what this person really wants, which is generate me a blog post about AI that talks about the trade-offs between these different techniques and some example use cases and references some of the latest papers,” and it does all that for you, and then you as the user will hopefully be able to be like, “Yep, this is kind of what I wanted. Let me edit this. Let me edit this here.”
And again, the inherent problem is we’re lazy as humans. We don’t want to type all… We don’t really want to type what we mean, and I think AI systems are actually going to help solve some of that problem.
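The rewrite-then-answer flow described here might be sketched roughly as follows in Python. This is an illustration under stated assumptions, not how ChatGPT or DALL·E is actually implemented: `expand_then_answer`, `REWRITE_INSTRUCTION`, and the stubbed `fake_model` are hypothetical names, and `complete` stands in for whatever text-completion call you use.

```python
from typing import Callable

# Hypothetical instruction asking a model to turn a terse request into a detailed one.
REWRITE_INSTRUCTION = (
    "Rewrite the request below as a detailed, specific prompt. "
    "Spell out the audience, scope, and format the user most likely wants. "
    "Return only the rewritten prompt.\n\nRequest: "
)


def expand_then_answer(request: str, complete: Callable[[str], str]) -> tuple[str, str]:
    """Two-step flow: expand a terse request, then answer the expanded version.

    `complete` is an assumed stand-in for any function that sends text to a
    model and returns its reply.
    """
    detailed = complete(REWRITE_INSTRUCTION + request)  # step 1: add fidelity
    answer = complete(detailed)                         # step 2: answer the richer prompt
    return detailed, answer


def fake_model(prompt: str) -> str:
    """Stub model so the sketch runs without network access."""
    if prompt.startswith(REWRITE_INSTRUCTION):
        return (
            "Write a blog post about AI covering trade-offs between "
            "techniques, with example use cases."
        )
    return f"[draft written for: {prompt}]"


detailed, answer = expand_then_answer("Write me a blog post about AI.", fake_model)
print(detailed)
print(answer)
```

Returning the expanded prompt alongside the answer leaves room for the user to review and edit it before the second step, which is the part of the flow Logan highlights.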
Upcoming New Features
Lenny: So until that day, what can people do better when they’re prompting, say, ChatGPT? And I’ll give you an example. Tim Ferriss suggested this really good idea that I’ve been stealing, which is when you’re preparing for an interview, you go to ChatGPT. And so I did this for you. I was like, “Hey, I’m interviewing Logan Kilpatrick, he is head of developer relations at OpenAI, on my podcast. Give me 10 questions to ask him in the style of Tyler Cowen,” who I think is the best interviewer. He is so good at just very pointed original questions. So what advice would you have for me to improve on that prompt to have better results? The questions were fine. They’re great. They’re interesting enough, but they weren’t like, “Holy, these are incredible.” So I guess what advice would you give me in that example?
Logan Kilpatrick: Yeah, that’s a great example where you have to think about the context of who it is that you’re asking questions about. I’m probably not somebody who has enough information about me on the internet where the model actually has been trained and knows the nuances of my background. I think there’s probably much more famous guests where it might be that there’s enough context on the internet to answer the questions. You actually have to do some of that work. Say if you’re using Browse with Bing, for example, you could say, “Here’s a link to Logan’s blog and some of the things that he’s talked about. Here’s a link to his Twitter. Go through some of his tweets, go through some of his blogs and see what his interesting perspectives are that we might want to surface on the blog,” or something like that.
It’s, again, giving the model enough context to answer the question. I think, again, that prompt actually might work really well for somebody who has it, if you were interviewing Tom Cruise or something like that, somebody who has a lot of information about them on the internet. It probably works a little bit better.
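A minimal sketch of the "give the model enough context" advice, written in Python. The `build_prompt` helper and its labels are hypothetical names introduced here for illustration, not part of any OpenAI tooling; the guest details come from the introduction above. It only assembles the prompt text and does not call a model.

```python
def build_prompt(task: str, context: dict[str, str]) -> str:
    """Prepend labeled background context to a task before sending it to a chat model."""
    lines = ["Background context:"]
    for label, detail in context.items():
        lines.append(f"- {label}: {detail}")
    lines.append("")  # blank line between the context and the task
    lines.append(f"Task: {task}")
    return "\n".join(lines)


prompt = build_prompt(
    "Give me 10 pointed, original interview questions for this guest.",
    {
        "Guest": "Logan Kilpatrick, head of developer relations at OpenAI",
        "Background": "Former machine learning engineer at Apple; advised NASA on open source policy",
        "Interview style": "Tyler Cowen",
    },
)
print(prompt)
```

In practice the context entries could also hold pasted excerpts from a guest's blog or posts, which is the kind of material Logan suggests supplying when the model is unlikely to know much about the person.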
How PMs and Founders Use AI
Lenny: So the advice there is just give more context. It doesn’t tell you, “Hey, I don’t actually know that much about Logan, so give me some more information.” It’s just like, “Here you go. Here’s a bunch of good questions.”
Advice for AI Developers
Logan Kilpatrick: Exactly. It wants to. It so deeply wants to answer your question. It doesn’t care that it doesn’t have enough context. It’s the most eager person in the world you could imagine to answer the question, and without that context it’s just hard to give anything of value. If we got t-shirts printed, they should say, “Context is all you need. Context is the only thing that matters.” It’s such an important piece of getting a language model to do anything for you.
Rapid Fire Q&A
Lenny: Any other tips? Just as people are sitting there, maybe they have ChatGPT open right now as they’re crafting a prompt, is there anything else that you’d say would help them have better results?
Logan Kilpatrick: We actually have a prompt engineering guide, which folks should go and check out. It has some of the examples. It depends on the order of magnitude of how much performance increase you can get. There’s a lot of really small silly things, like adding a smiley face, that increase the performance of the model. I’m sure folks have seen a lot of these silly examples, like telling the model to take a break and then answer the question, all these kinds of things. And again, if you think about it, it’s because the corpus of information that’s trained these models is the same things that humans have sent back and forth to each other. So for a human, “When I go take a break and then I come back to work, I’m fresher and I’m able to answer questions better and do work better,” and very similar things are true for these models. And again, when I see a smiley face at the end of someone’s message, I feel empowered that this is going to be a positive interaction and I should be more inclined to give them a great answer and spend more effort on the thing that they asked me for.
Recently Discovered Cool Products
Lenny: Wow, wait. So that’s a real thing. If you add a smiley face, it might give you better results.
Personal Life Motto
Logan Kilpatrick: Again, the challenge with all this stuff is it’s very nuanced and it’s also a small jump in performance. You could imagine on the order of 1 or 2%, which for a few sentence answer might not even be a discernible difference. Again, if you’re generating an entire saga of text, the smiley face could actually make a material difference for you, but for something small and textual it might not.
Lenny: Okay, good tip. Amazing. Okay, we’ve talked about GPTs. I think it might be helpful to describe what this new thing is that you guys launched, GPTs, and I’m curious just how it’s going. This is a really big change and element of OpenAI now, with this idea that you could build your own mini, and I’m almost explaining it, your own mini ChatGPT and then people can… I think you can pay for it. You can charge for your own GPT or is it all free right now?
AI Stand-Up Comedy
Logan Kilpatrick: It’s all free right now. It’s all free.
Lenny: Okay. In the future I imagine people will be able to charge. So there’s this whole store now. Basically it’s the whole app store that you guys have launched. How’s it going? What’s happening? What surprised you there? What should people know?
Contact Info and Feedback
Logan Kilpatrick: Yeah, it’s going great, and again, historically the thing that you would have to do, let’s say for example, you have a really cool ChatGPT use case, what you would have to do to share it with somebody else is actually go in and start the conversation with the model, prompt it to do the things that you wanted to, and then you would share that link with somebody else before the action has actually happened and be like, here now you can essentially finish this conversation with ChatGPT that I started.
So GPTs kind of change this, where you take all that important context, you put it into the model to begin with, and then people can go and chat with essentially a custom version of ChatGPT. And the thing that’s really interesting is you can upload files, you can give it custom instructions, you can add all these different tools. Like Code Interpreter is built in, which essentially allows you to do math. You have browsing built in, image generation built in. You can also, for more advanced use cases if you’re a developer, connect it to external APIs, so you can connect it to the Notion API or Gmail or all these different things, and have it actually take actions on your behalf.
So there’s so many cool things that people are unlocking. And what’s been most exciting to me, actually, is the non-developer persona is now empowered to go and solve these really, really, really more challenging problems by giving the model enough context on what that problem is to be able to solve it. Going back to context is all you need, this is very true in the context of GPTs, and if you give it enough context, you can solve much more interesting problems.
There’s so many things that I’m excited about with this. I think monetization, when it comes to the store later this quarter, I think is going to be extremely exciting when people can get paid based on who’s using their GPT. That’s going to be a huge unlock and open a lot of people’s eyes to the opportunity here. I also think continuing to push on making more capabilities accessible to GPTs for people who can’t code is really exciting. Even for me as someone who is a software engineer, it’s not super easy to connect the Notion API or the Gmail API to my GPT, and really I’d love to just be able to one click sign in with Gmail and then all of a sudden it’s like my Gmail is accessible, or someone else can sign in with their Gmail and make it accessible. So I think over time all those types of things will come, but today it’s really custom prompts is essentially one of the biggest value adds with GPTs.
Lenny: Awesome. I have it pulled up here on different monitor and Canva has the top GPT currently, and I was trying to play with it as you’re chatting just to see. I was going to make a big banner that said, “It’s the context stupid,” and it doesn’t. I’m not doing something right, but I’m not paying that much attention to it because we’re talking, but this is very cool. Just maybe a final question there. Is there a GPT that you saw someone built that was like, “Wow, that’s amazing. That’s so cool,” something that surprised you? And I’ll share one that was very cool, but is there anything that comes to mind when I ask that?
Logan Kilpatrick: I think my instinct is Zapier. All of the stuff that Zapier has done with GPTs is the most useful stuff that you could imagine. You can go so far with what… And I don’t know how it’s packaged for Zapier’s GPT right now, but you can actually, as a third party developer, integrate Zapier into your GPT without knowing how to code. So they’re pushing a lot of this stuff, and then basically all 5,000 connections that are possible with Zapier today, you can bring into your GPT and essentially enable it to do anything. So I’m incredibly excited for Zapier and for people who are building with them. There are so many things that you can unlock using that platform. So I think that’s probably the most exciting thing to me for people who aren’t developers.
Lenny: Awesome. Zapier’s always in there, getting in there connecting things.
Logan Kilpatrick: Yeah, they’re great.
Lenny: So the one that I had in mind: a buddy of mine, [inaudible 00:30:28], who’s the CEO of a company called Runway, built this thing called Universal Primer, which helps you learn. It’s described as, “Learn everything about anything,” and basically, I think, it’s kind of this Socratic method of helping you learn stuff. So it’s like, “Explain how transformers work in LLMs,” and then it just kind of goes through stuff and then asks you questions, I think, and helps you learn new concepts. And I think it’s the number two education GPT.
Logan Kilpatrick: I love that. [inaudible 00:30:53] is incredible, so…
Lenny: Yes, it’s true. Let me tell you about a product called Arcade. Arcade is an interactive demo platform that enables teams to create polished, on-brand demos in minutes. Telling the story of your product is hard and customers want you to show them your product, not just talk about it or gate it. That’s why product-forward teams such as Atlassian, Carta and Retool use Arcade to tell better stories within their homepages, product change logs, emails and documentation.
But don’t just take my word for it. Quantum Metric, the leading digital analytics platform created an interactive product tour library to drive more prospects. With Arcade, they achieved a 2X higher conversion rate for demos and saw five times more engagement than videos. On top of that, they built the demo 10 times faster than before. Creating a product demo has never been easier. With browser-based recording Arcade is the no-code solution for building personalized demos at scale.
Arcade offers product customization options, designer approved editing tools and rich insights about how your viewers engage every step of the way. Ready to tell more engaging product stories that drive results? Head to arcade.software/lenny and get 50% off your first three months. That’s arcade.software/lenny.
I want to talk about just what it’s like to work at OpenAI and how the product team operates and how the company operates. So you worked at… Your two previous companies were Apple and NASA, which are not known for moving fast. And now you’re at OpenAI, which is known for moving very fast, maybe too fast for some people’s taste, as we saw with the whole board thing. And so what I’m curious about is just what is it that OpenAI does so well that allows them to build and ship so quickly and at such a high bar? Is there a process or a way of working that you’ve seen that you think other companies should try in order to move more quickly and ship better stuff?
Logan Kilpatrick: Yeah, there’s so many interesting trade-offs and all of this tension around how quickly companies can move. I think for us, again, if you think about Apple as an example, if you think about NASA as an example, just older institutions, lots of… Over time, the tendency is things slow down. There’s additional checks and balances that are put in place, which drags things down a little bit. So we’re a young and new company, so we don’t have a lot of those institutional legacy barriers that have been put in place.
I think the biggest thing, and there’s a good Sam tweet somewhere in the ether about this from, I think, 2022 or something like that, but finding people who are high agency and work with urgency is one of the most… If I was hiring five people today, those are some of the top two characteristics that I would look for in people because you can take on the world if you have people who have high agency and not needing to either get 50 people’s different consensus, because you have people who you trust with high agency and they can just go and do the thing, I think, is one of the most… It is the most important thing, I’m pretty sure, if you were to distill it down.
And I see this in folks that I work with. Folks are so high agency. They see a problem and they go and tackle it. They hear something from our customers about a challenge that they’re having and they’re already pushing on what the solution for them is and not waiting for all the other things to happen that I think traditional companies are stuck behind because they’re like, “Oh, let’s check with all these seven different departments to try to get feedback on this.” People just go and do it and solve the problem. And I love that. It’s so fun to be able to be a part of those situations.
Lenny: That is so cool. I really like these two characteristics because I haven’t heard this before. Those are the two, maybe the two most important things you guys look for, high agency, high urgency. To give people a clear sense of what these actually look like when you’re hiring, you shared maybe this example of customer service. Someone’s hearing a bug and then going to fix it. Is there anything else that can illustrate what that looks like, high agency? And then a similar question on urgency other than just move, move, move, ship, ship, ship.
Logan Kilpatrick: I think the Assistants API that we released for DevDay, we continue to get this feedback from developers that people wanted these higher levels of abstraction on top of our existing APIs, and a bunch of folks on the team just came together and were like, “Hey, let’s put together what the plan would look like to build something like this,” and then very quickly came together and actually built the actual API that now powers so many people’s assistant applications that are out there. And I think that’s a great example of it wasn’t this top down, oh, someone’s sitting there being like, “Oh, let’s do these five things,” and then like, “Okay, team, go and do that.” It’s like people really seeing these problems that are coming up and knowing that they can come together as a team and solve these problems really quickly. And I think the Assistants API, and there’s like 1,001 other examples of teams taking agency and doing this, but I think that’s a great one off the top of my head.
Lenny: That makes me want to ask. Just how does planning work at OpenAI? So in this example is just like, “Hey, we think we need to build this. Let’s just go and build it.” I imagine there’s still a roadmap and priorities and goals and things that that team had. How does road mapping and prioritization and all of that generally work to allow for something like that?
Logan Kilpatrick: I think this is one of the more challenging pieces at OpenAI. There’s so many. Everyone wants everything from us, and today, especially, in the world of ChatGPT and how large and well-used our API is, people will just come to us and say, “Hey, we want all of these things.” I think there’s a bunch of core guiding principles that we look at. One, going back to the mission, is this actually going to help us get to AGI? So there’s a huge focus on there’s this potential shiny reward right in front of us, which is optimize user engagement, or whatever it is. And is that really the thing? Maybe the answer is yes. Maybe that is what is going to help us get to AGI sooner, but looking at it through that lens I think is always the first step of deciding any of these problems.
I think, on the developer side, there’s also these core tenets of reliability like, “Hey, it would be awesome if we had additional APIs that did all these cool things like new endpoints, new modalities, new abstractions, but are we giving customers a robust and reliable experience on our API?” And that’s often the first question. And I think there have been times where we’ve fallen short on that, and there was a bunch of other things that we’ve been thinking about doing and really bringing the focus and priority back to that reliability piece because, at the end of the day, nobody cares if you have something great if they can’t use it robustly and reliably.
So there’s these core tenets. And I think, again, other than all the principles about how we’re making the decision, I think the actual planning process is pretty standard. We come together. There’s H1 Q1 goals. We all sprint on those. I think the real interesting thing is how stuff changes over time. You think we’re going to do these very high level things and new models, new modalities, whatever it is. And then as time goes on, there’s all of this turmoil and change, and it’s interesting to have mechanisms to be like, “Hey, how do we update our understanding of the world and our goals as the ground changes underneath us, as is happening in the craziness of the AI space today?”
Lenny: It’s interesting that it sounds a lot like most other companies. There’s H1 planning. There’s Q1 planning. Are there metrics and goals like that? Do you guys have OKRs or anything like that? Or is it just, “Here, we’re going to launch these products”?
Logan Kilpatrick: I think it’s much higher level. I actually don’t think OpenAI is a big OKR company. I don’t think teams do OKRs today and I don’t have a good understanding of why that’s the case. I don’t even know if OKRs are still the industry standard. You’re probably talking to a lot more folks who are making those decisions. So I’m curious. Is that something that you’re seeing from folks? Is it still common for people to do OKRs?
Lenny: Yeah, absolutely. Many companies use OKRs, love OKRs. Many companies hate OKRs. I’m not surprised that OpenAI is not an OKR driven company. Along those lines, I don’t know how much you can share about all this stuff, but how do you measure success for things that you launch? I know there’s this ultimate goal, AGI. Is there some way to track we’re getting closer? What else do you guys look at when you launch, say, the GPT Store or Assistants or anything that’s like, “Cool, that was exactly what we’re hoping for.” Is it just adoption?
Logan Kilpatrick: Yeah, adoption is a great one. I think there’s a bunch of metrics around revenue, number of developers that are building on our platform, all those things. And a lot of these, and I don’t want to dive… I’ll let Sam or someone else on our leadership team go more into details, but I think a lot of these are actual abstractions towards something else. Even if revenue is a goal, it’s like revenue is not actually the goal. Revenue is a proxy for getting more compute, which is then actually what helps us get towards getting more GPUs so that we can train better models and actually get to the goal. So there’s all these intermediate layers where even if we say something is the goal, and you hear that in a vacuum and you’re like, “Oh, well OpenAI just wants to make money,” and it’s like, “Well, really money is the mechanism to get better models so that we can achieve our mission.” And I think there’s a bunch of interesting angles like that as well.
Lenny: I don’t know if I’ve heard of a more ambitious vision for a company, to build artificial general intelligence. I love that. I imagine many companies are like, “What’s our version of that?” Before we leave this topic, is there anything else that you’ve seen OpenAI do really well that allows it to move this fast and be this successful? You talked about hiring people with high agency and high urgency. Is there anything else that’s just like, “Oh wow, that’s a really good way of operating”? I imagine part of it’s just hiring incredibly smart people. I think that’s probably an unsaid thing, but yeah, anything else?
Logan Kilpatrick: I think there’s a non-trivial benefit to using Slack, and I think maybe that’s controversial and maybe some people don’t like Slack, but OpenAI has such a Slack-heavy culture and it really… The instantaneous real time communication on Slack is so crucial. And I just love being able to tag in different people from different teams and get everybody coalesced. So everybody is always on Slack, so even if you’re remote or you’re on a different team or in a different office, so much of the company culture is ingrained in Slack, and it allows us to really quickly coordinate where it’s actually faster to send someone a Slack message sometimes than it would be to walk over to their desk because they’re on Slack and they’re going to be using it.
And I don’t know if you saw the recent Sam and Bill Gates interview, but Sam was talking about how Slack is his number one most used app on his phone and, “I don’t even look at the time thing on my phone anymore because I don’t want to know how long I’m using Slack,” but I’m sure the Salesforce people are looking at the numbers and they’re like, “This is exactly what we wanted.”
Lenny: I also love Slack. I’m a big promoter of Slack. I think there’s a lot of Slack hate, but it’s such a good product. I’ve tried so many alternatives and nothing compares. I think what’s interesting about Slack for you guys is one of the… You don’t know if someone in there is just an AGI, that is not actually a person, that’s just there working at the company.
Logan Kilpatrick: I know there are real people. There are no AGIs yet. But I think, yeah, even Slack is building a bunch of really cool AI tools, which I’m excited to… And that’s why there’s so much cool AI progress. And at the end of the day, it’s so exciting, from being a consumer of all these new AI products. Google’s a great example. I’m so happy that Google is doing really cool AI stuff because I’m a Google Docs customer and I love using Google Docs. I like a bunch of their other products, and it’s awesome that people are building such useful things around these models.
Lenny: How big is the OpenAI team at this point, whatever you can share? Just to give people a sense of the scale.
Logan Kilpatrick: Yeah, I think the last public number was something around like 750 near the end of last year, 780 or something like that near the end of last year. And we’re still growing so quickly, so I won’t be the messenger to share the specific updated numbers, but the team is growing like crazy and we’re also hiring across all of our engineering teams and PM teams, so folks who are interested, would love to hear from folks who are curious about joining.
Lenny: Maybe one last question here. So you’re growing, maybe getting to 1,000 people, clearly still very innovative and moving incredibly fast. Is there anything you’ve seen about what OpenAI does well to enable innovation and not slow down new big ideas?
Logan Kilpatrick: Yeah, there’s a couple of things, one of which is the actual research team, who seed most of the innovation that happens at OpenAI, is intentionally small. Most of the growth that OpenAI has seen is around our customer facing roles, our engineering roles to provide the infrastructure for ChatGPT and things like that. The research team is, again, intentionally kept small and there’s all of this talk. And it’s really interesting. I just saw this thread from one of our research folks who was talking about how in a world where you’re constrained by the amount of GPU capacity that you have as a researcher, which is the case for OpenAI researchers, but also researchers everywhere else, each new researcher that you add is actually a net productivity loss for the research group unless that person is up-leveling everyone else in such a profound way that it increases the efficiency.
If you just add somebody who’s going to go and tackle some completely different research direction, you now have to share your GPUs with that person and everyone else is now slower on their experiments. So it’s a really interesting trade-off that research folks have that I don’t think product folks… If I add another engineer to our API team or to some of the ChatGPT teams, you can actually write more code and do more. And that’s actually a net beneficial improvement for everybody. And that’s not always the case for researchers, which is interesting, in a GPU-constrained world, which hopefully we won’t always be in.
Lenny: I want to zoom out a bit and then there’s going to be a couple follow-up questions here. Where are things heading with OpenAI? What’s in the near future of what people should expect from the tools that you guys are going to have and launch?
Logan Kilpatrick: Yeah, new modalities. I think ChatGPT continuing to push all of the different experiences that are going to be possible. Today, ChatGPT is really just text in, text out or, I guess three months ago, it was just text in, text out. We started to change that with now you can do the voice mode and now you can generate images and now you can take pictures. So I think continuing to expand the way in which you interface with AI through ChatGPT is coming.
I think GPTs is our first step towards the agent future. Again, today when you use a GPT, it’s really you send a message, you get an answer back almost right away, and that’s kind of the end of your interaction. I think as GPTs continue to get more robust, you’ll actually be able to say, “Hey, go and do this thing and just let me know when you’re done. I don’t need the answer right now. I want you to really spend time and be thoughtful about this.”
And again, if you think back to all these human analogies, that’s what we do as humans. I don’t expect somebody, when I ask them to do something meaningful for me, to do it right away and give me the answer back right away. So I think pushing more towards those experiences is what is going to unlock so much more value for people.
And I think the last thing is GPTs as this mechanism to get the next few hundred million people into ChatGPT and into AI. So I think if you’ve had conversations with people who aren’t close to the AI space, oftentimes you talk about, even if they’ve heard of ChatGPT… A lot of people haven’t heard of ChatGPT, but if they have, they show up in ChatGPT and they’re like, “I don’t really know what I’m supposed to do with this blank slate. I can kind of do anything. It’s not super clear how this solves my specific problem.”
But I think the cool thing about GPTs is you can package down like, “Here’s this one very specific problem that AI can solve for you and do it really well,” and I can share that experience with you and now you can go and try that GPT, have it actually solve the problem and be like, “Wow, it did this thing for me. I should probably spend the time to investigate these five other problems that I have to see if AI can also be a solution to those.” So I think so many more people are going to come online and start using these tools because very narrow vertical tools are what’s going to be a huge unlock for them.
Lenny: So in that last case, it’s a classic horizontal product problem where it does so many things and people don’t know what exactly it should do for them. So that makes a ton of sense. Just being a lot more template oriented, use case specific, helping people onboard makes tons of sense. A common problem for so many SaaS products out there. The other ones you mentioned, which were really interesting: basically more interfaces to more easily interact with OpenAI, like voice. You mentioned audio and things like that. That makes tons of sense. And then this agents piece where the idea is, instead of just it’s a chat, it’s like, “Hey, go do this thing for me.”
Kind of along those lines, GPT-5, we touched on this a bit. There’s a lot of speculation about the much better version. People just have these wild expectations, I think, for where GPT is going. GPT-5 is going to solve all the world’s problems. I know you’re not going to tell me when it’s launching and what it’s going to do, but I heard from a friend that there’s kind of this tip that when you’re building products today, you should build towards a GPT-5 future, not based on limitations of GPT-4 today. So to help people do that, what should people think about that might be better in a world of GPT-5? Is it just faster? It’s just smarter? Is there anything else that might be like, “Oh wow, I should really rethink how I’m approaching my product?”
Logan Kilpatrick: If folks have looked through the GPT-4 technical report that we released back in March when GPT-4 came out, GPT-4 was the first model that we trained where we could reliably predict the capabilities of that model beforehand based on the amount of compute that we were going to put into it. We did a scientific study to show like, “Hey, this is what we predicted and here is what the actual outcome was.” So it’ll be interesting, I think, just as somebody who’s interested in technology, to see whether that continues to hold for GPT-5, and hopefully we’ll share some of that information whenever that model comes out.
I also think you can probably draw a few observations. One of them, which is GPT-4 came out. The consensus from the world is, “Everything is different. All of a sudden everything is different. This changes the world. This changes everything.” And then slowly but surely, we come back to reality of like, “This is a really effective tool and it’s going to help solve my problems more effectively.”
And I think that is, undoubtedly, the lens through which people should look at all of these model advancements, like GPT-5 is surely going to be extremely useful and solve some whole new echelon of problems. Hopefully it’ll be faster. Hopefully it’ll be better in all these ways, but fundamentally, the same problems that exist in the world are still going to be the same problems. You now just have a better tool to solve those problems.
And I think going back to vertical use cases, I think people who are solving very specific use cases are just now going to be able to do that much more effectively. I don’t think that’s going to… People have these unrealistic expectations that GPT-5 is going to be doing back flips in the background in my bedroom while it also writes all my code for me and talks on the phone with my mom or something like that.
I’m like, “That’s not the case.” It’s just going to be this very effective tool, very similar to GPT-4, and it’s also going to become very normal very quickly. And I think that is actually a really interesting piece. If you can plan for the world where people become very used to these tools very quickly, I actually think that’s an edge. And assuming that this thing is going to absolutely change everything, which in many ways I think is actually a downside, is the wrong mental framing to have of these tools as they come out.
Lenny: Kind of along these lines, you guys are investing a lot into B2B offerings. I think half the revenue, last I heard, was B2B and then half is B2C. I don’t know if that’s true, but that’s something I heard. What is it that you get if you work with OpenAI as a company, as a business? What does it unlock? Is it just called OpenAI Enterprise? What’s it called and what do you get as a part of that?
Logan Kilpatrick: Yeah, so I think a lot of our B2B customers are using the API to build stuff. So I think that’s one angle of it. I think if you’re a ChatGPT B2B customer, we sell Teams, which is the ability to get multiple subscriptions of ChatGPT packaged together. We also have an enterprise version of ChatGPT. There’s a bunch of enterprise-y things that enterprise companies want, around SSO and stuff like that, related to ChatGPT Enterprise.
I think the coolest thing is actually being able to share some of these prompt templates and GPTs internally. So again, you can make custom things that work really well for your company with all of the information that’s relevant to solving problems at your company and share those internally. And to me, that’s like you want to be able to collaborate with your teammates on the cool things you create using AI. So that’s a huge unlock for companies. I think that those are the two biggest value adds. There’s higher limits and stuff like that on some of those models, but I think being able to share your very domain specific applications is the most useful thing.
Lenny: I think if you’re a company listening and you think a lot of employees are using ChatGPT, basically the simplest thing you could do is just roll it up into a business account with single sign-on and that probably saves you money and makes it easier to coordinate and administer.
Logan Kilpatrick: Yeah, there’s also a bunch of security stuff too, like if you want to control. You don’t want people to use certain GPTs from the GPT store because you’re worried about security or privacy and stuff like that. You don’t want your private data going in places. It makes a lot of sense to sign up for that so that you have a little bit more control over what’s happening.
Lenny: Okay, got it. There’s a launch happening tomorrow, I think, after we’re recording this. Can you talk about what is new, what’s coming out? I think this is going to come out a couple weeks after recording, but just what should people know that’s new that’s coming out from OpenAI tomorrow in our world?
Logan Kilpatrick: Yeah, updated, so there’s a few different things. A couple of quick ones are an updated GPT-4 Turbo model, the preview model that we released at DevDay. There’s an updated version of that. It fixes this, if folks have seen online people talking about this sort of laziness phenomenon in the model. We improve on that and it fixes a lot of the cases where that was happening. So hopefully the model will be a little bit less lazy. The big thing is the third generation embeddings model. So we were talking off-camera before recording about all of the cool use cases for embeddings. So if folks have used embeddings before, it’s essentially the technology that powers many of these question-answering experiences over your own documentation or your own corpus of knowledge. And like you were saying, you actually have a website where people can ask questions about recordings of the podcast.
Lenny: Lennybot.com. Check it out.
Logan Kilpatrick: Yeah, lennybot.com. And my assumption was that lennybot.com is actually powered by embeddings. So you take all of the corpus of knowledge. You take all the recordings, your blog posts. You embed them, and then when people ask questions, you can actually go in and see the similarity between the question and the corpus of knowledge and then provide an answer to somebody’s question and reference an empirical fact, something that’s true from your knowledge base. And this is super useful and people are doing a ton of this. It’s like trying to ground these models in reality, in what they know to be true. We know all the things from your podcast to be at least something that you’ve said before and to be true in that sense, and we can bring them into the answer that the model is actually generating in response to a question. So that’ll be super cool.
And these new V3 embeddings models, again, state-of-the-art performance, the cool thing is actually the non-English performance has increased super significantly. I think historically, people really were only using embeddings for… It only worked really well for English, and I think now you can use it across so many new languages because it’s just so much more performant across those languages and it’s five times cheaper as well, which is wonderful. There’s no better feeling than making things cheaper for people. I love it. I think now it’s like you can embed, I’m pretty sure, it was like 62,000 pages of text for $1, which is very, very cheap. So lots of really cool things that you can do with embeddings and exciting to see people invent more stuff.
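[Editor's sketch] The retrieval flow Logan describes above (embed the corpus, embed the question, rank passages by similarity) can be illustrated roughly as follows. This is a minimal sketch, not how lennybot.com or any OpenAI product is actually built. To keep it runnable without an API key, the `embed` function here is a toy word-count stand-in for a real embedding model, and the names `embed`, `cosine_similarity`, `top_matches`, and the sample corpus are all hypothetical. A real system would replace `embed` with a call to an embeddings model and would usually precompute and store the corpus vectors.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector.

    ASSUMPTION: a real system would call an embeddings model here and get
    back a dense list of floats; word counts are used only for illustration.
    """
    words = (w.strip(".,?!\"'").lower() for w in text.split())
    return Counter(w for w in words if w)


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse vectors stored as Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_matches(question: str, corpus: list[str], k: int = 1) -> list[tuple[float, str]]:
    """Return the k passages most similar to the question, best first."""
    q_vec = embed(question)
    scored = [(cosine_similarity(q_vec, embed(p)), p) for p in corpus]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Hypothetical mini-corpus standing in for podcast transcripts and posts.
corpus = [
    "Hire people who have high agency and work with urgency.",
    "Embeddings let you search a corpus of podcast transcripts.",
    "Reliability of the API comes before new endpoints.",
]

best_score, best_passage = top_matches("How do embeddings search transcripts?", corpus)[0]
print(best_passage)
```

The passage that comes back is what would be handed to the model as grounding context alongside the user's question, which is the "reference something from your knowledge base" step Logan mentions.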
Lenny: What a deal. Final question before we get to a very exciting lightning round. Say you’re a product manager at a big company, or even a founder, what do you think are the biggest opportunities for them to leverage the tech that you guys are building, GPT-4, all the other APIs? How should people be thinking about, “Here’s how we should really think about leveraging this power in our existing product,” or new product, whichever direction you want to go.
Logan Kilpatrick: Yeah, I think going back to this theme of new experiences is really exciting to me. I think consumers are just going to be… You are going to have an edge on other people if you’re providing AI that’s not accessible in a chatbot. People are using a ton of chat and it’s a really valuable surface area. It’s clearly valuable because people are using it. But I think products that move beyond this chat interface really are going to have such an advantage. And also, thinking about how to take your use case to the next level. I’ve tried a ton of chat examples that are very, very basic and providing a little bit of value to me, but I’m like, “Really this should go much further and actually build your core experience from the ground up.”
I’ve used this product that allows you to essentially manage or view the conversations that are happening online around certain topics and stuff like that. So I can go and look online. What are people saying about GPT-4? And what I just said out loud, “What are people saying about GPT-4,” is the actual question that I have. And in a normal product experience today, I have to go into a bunch of dashboards and change a bunch of filters and stuff like that. And what I really want is just ask my question. What are people doing? What are people saying about GPT-4? Get an answer to that question in a very data grounded way.
And I’ve seen people solve part of this problem where, oh, they’ll be like, “Oh, well here’s a few examples of what people are saying,” and, well, that’s not really what I want. I want this summary of what’s happening. And I think it just takes a little bit more engineering effort to make that happen. But I think it’s like that is the magical unlock of like, “Wow, this is an incredible product that I’m going to continue to use,” instead of like, “Yeah, this is kind of useful, but I really want more.”
Lenny: Awesome. I’ll give a shout-out to a product. I’m not an investor, but I know the founder, called visualelectric.com, which I think is doing exactly this. It’s basically a tool specifically built for creatives, I think specifically graphic design, to help them create imagery. So there’s like DALL·E, obviously, but this takes it to a whole new level where it’s kind of this canvas, infinite canvas, where you can just generate images, edit, tweak them, and continue to iterate until you have the thing that you need. Visualelectric.com.
Logan Kilpatrick: I’m going to try that out. Is it similar to Canva?
Lenny: It’s even more niche, I think, for more sophisticated graphic design, I think, is the use case. But I’m not a designer, so I’m not the target customer. But I will say my wife is a graphic designer. She had never used AI tools. I showed her this and she got hooked on it. She paid for it without even telling me that she was going to become a paid customer, and she created imagery of our dog and all this art. And now it’s like on our TV. The art she created is now sitting… It’s like we have a frame TV, and that’s the image on our TV. So anyway…
Logan Kilpatrick: I love that. What was it called again?
Lenny: Visualelectric.com. Anyway, anything else you wanted to touch on or share before we get to a very exciting lightning round?
Logan Kilpatrick: I’ve made this statement a few times online and other places, but for people who have cool ideas that they should build with AI, this is the moment. There are so many cool things that need to be built for the world using AI. And again, if I or other folks on the team at OpenAI can be helpful in getting you over the hump of starting that journey of building something really cool, please reach out. The world needs more cool solutions using these tools, and would love to hear about the awesome stuff that people are building.
Lenny: I would’ve asked you this at the end, but how would people reach out? What’s the best way to actually do that?
Logan Kilpatrick: Twitter, LinkedIn. My email should be findable somewhere. I don’t want to say it and then get slammed with a bunch of email. You should be able to find my email, if you need it, online somewhere. But yeah, Twitter and LinkedIn is usually the easiest place.
Lenny: And how do they find you on Twitter?
Logan Kilpatrick: It’s just Logan Kilpatrick, or I think my name shows up as Logan.GPT or-
Lenny: Logan.GPT?
Logan Kilpatrick: Or official Logan K.
Lenny: Yeah. Awesome. Okay. And we’ll link to it in the show notes. Amazing. Well Logan, with that, we’ve reached a very exciting lightning round. Are you ready?
Logan Kilpatrick: I’m ready.
Lenny: First question, what are two or three books that you’ve recommended most to other people?
Logan Kilpatrick: I think the first one is one that I read a long time ago and came back to recently: The One World Schoolhouse by Sal Khan. Incredible. Yeah, I don’t want… It’s the lightning round so I won’t say too much, but incredible story and AI is what is going to enable Sal Khan’s vision of a teacher per student to actually happen. So I’m really excited about that. And the other one that I always come back to is Why We Sleep. Sleep science is so cool. If you don’t care about your sleep, it’s one of the biggest up-levels that you can do for yourself.
Lenny: What is a favorite recent movie or TV show that you really enjoyed?
Logan Kilpatrick: I’m a sucker for a good inspirational human story. So I watched, with my family recently over the holidays, this Gran Turismo movie, and it’s a story about a kid from London who grew up doing sim racing, which is virtual race car driving, entered a competition, and ended up becoming a real professional race car driver. And it’s just really cool to see someone go from driving a virtual car to driving a real car and competing in the 24 Hours of Le Mans and all that stuff.
Lenny: I used to play that game and it was a lot of fun, but I don’t think I have any clue how to drive a real car, race car. So that’s inspiring. Do you have a favorite interview question that you’d like to ask candidates that you’re interviewing?
Logan Kilpatrick: Yeah, I’m always curious to hear what people’s… The thing that they so strongly believe that people disagree with them on.
Lenny: What do you look for in an answer that seems like, “Wow, that’s a really good signal”?
Logan Kilpatrick: Oftentimes, it’s just an entertaining question to ask in some sense, but it’s also… It’s interesting to see what somebody’s deeply held strong belief is, I think. And not to judge whether or not I believe in that, but just curious to see why people feel that way.
Lenny: What is a favorite product that you’ve recently discovered that you really like?
Logan Kilpatrick: On the narrative of sleep, I have this really nice sleep mask from this company called… Not being paid. I just say this, but it’s called Manta Sleep or something like that. It’s a weighted sleep mask and it feels incredible when I… I don’t know. Maybe I just have a heavy head or something like that, but it feels good to wear a weighted sleep mask at night. I really appreciate it.
Lenny: I have a competing sleep mask that I highly recommend. I’m trying to find it. I’ve emailed people about it a couple of times in my newsletter for gift guides.
Logan Kilpatrick: Yeah.
Lenny: Okay. My favorite is called the Waoaw Sleep Mask, W-A-O-A-
Logan Kilpatrick: What do you like about it?
Lenny: W-A-O-A-W. I’ll link to it in the show notes. It makes a lot of room. It’s very large and there’s space for your eyes, so your eyelashes and eyes aren’t pressed on, and it just fits really nicely around the head. And my wife and I both wear eye masks at night. It just, speaking of sleep, really helps us sleep. [inaudible 01:02:37] It doesn’t have the weighted piece, so it might be worth trying, but everyone I’ve recommended this to is like, “That changed my life. Thank you for helping me sleep better.” And so we’ll link to [inaudible 01:02:51].
Logan Kilpatrick: Look at that.
Lenny: Look at us. So adult. Two more questions. Do you have a favorite life motto that you often come back to share with friends or family, either in work or in life?
Logan Kilpatrick: Yeah, I’ve got it. It’s on a Post-It note that I… Right behind my camera and it’s “Measure in hundreds.” I love this idea of measuring things in hundreds, and it’s for folks who are at the beginning of some journey. I talk to people all the time, they’re like, “Yeah, I’ve tried this thing and it hasn’t worked.” And if your mental model is to measure in hundreds, “I measure in hundreds,” the five times that you failed at something, you failed and tried zero times. And I love that. It’s such a great reminder that everything in life is built on compounding and multiple attempts at stuff. And if you don’t try enough times, you’re never going to be successful at it.
Lenny: I love that. I could see why you are successful at OpenAI and why you’re a good fit there. Final question. So I asked ChatGPT for a very silly question. “Give me a bunch of silly questions to ask Logan Kilpatrick, head of developer relations at OpenAI,” and I went through a bunch. I have three here, but I’m going to pick one. If an AI started doing standup comedy, what do you think would be its go-to joke or funny observation about humans?
Logan Kilpatrick: I think today, if you were to do this, the go-to joke would be something along the lines of, “So an AI walks into a bar,” likely because, again, it’s trained on some distribution of training data, and that’s the most common joke that comes up, and that’s probably… I wonder if you came up with a joke right now, whether or not that would show up in one of the examples.
Lenny: I love it. What would be the joke though? We need the joke. We need the punchline. I’m just joking. I know you can’t come up with amazing-
Logan Kilpatrick: That’s what we have ChatGPT for.
Lenny: We’re already irrelevant. Amazing. Logan, thank you so much for being here. Two final questions, even though you’ve already shared this information, but just for folks to remind them. Where can folks find you if they want to reach out and ask you more questions? And how can listeners be useful to you?
Logan Kilpatrick: Yeah, Twitter and LinkedIn, Logan Kilpatrick or Logan.GPT on Twitter. Please shoot me messages. I get a ton of DMs from people and it’s always really, really interesting stuff. I think the thing that I would love to have help on is if people find bugs and things that don’t work well in ChatGPT. I oftentimes see people be like, “This thing didn’t work really well.” And the key, and I think we as OpenAI need to do a better job of messaging this to people, is that shared chats or actual, tangible, reproducible examples are the two things that we need in order to actually fix the problems that people have. The model laziness was a good example where it was hard to figure out what was going on because people would be like, “Oh, the model’s lazier,” but it’s hard to figure out what prompts they were using, what the examples were, all that stuff. So send those examples as you come upon things that don’t work well and we’ll make stuff better for you.
Lenny: Amazing. And I’ll also just remind people, if you’re listening to this and you’re like, “Oh, okay, cool. A lot of cool ideas for OpenAI and ChatGPT,” what you need to do is actually just go to chat.openai.com and try this stuff out. There’s a lot of just theorizing, but I think once you actually start doing it, you start to see things a little differently. And at this point, every day I’m in there doing something, like asking for ideas for questions, doing research on a newsletter post, and it’s just a tab I’m always coming back to. And I know there’s a lot of people just talking about this sort of thing, and I just want to remind people. Just go. Sign in. Play with it. Ask it questions on something you’re working on and just see how it goes and keep coming back to it. Is there anything else you want to share along those lines to inspire people to give this a shot?
Logan Kilpatrick: I love it. I think the phrase of people being worried about humans being replaced by AI, and I’ve seen this narrative online, that it’s not AI that’s going to replace humans. It’s other humans that are being augmented and using AI tools that are going to be more competitive in a job market and stuff like that. So go and try these AI tools. This is the best time to learn. You’re going to be more productive and empowered in your job and the things that you’re excited about. So yeah, excited to see what people use ChatGPT for.
Lenny: And then you can expense your account. I think it’s 10 or 20 bucks a month. A lot of companies are paying for this for you, so ask your boss if you can just have it expensed and make sure you use the latest version. Anyway, Logan, thank you again so much for being here.
Logan Kilpatrick: This is awesome, Lenny. Thanks for having me in. Thoughtful questions. Hopefully those weren’t all from ChatGPT.
Lenny: Nope, only the last one. I did have a bunch of others I had in the belt or in the pocket. I don’t know what the metaphor is. In the back pocket, that’s the metaphor, but I did not get to them because we had enough great stuff. So no, that was all me. Human AI.
Logan Kilpatrick: Thank you.
Lenny: Thanks, Logan.
Logan Kilpatrick: Lenny.ai.
Lenny: I love it. Lennybot.com, check it out. Okay, thanks Logan. Bye everyone. Thank you so much for listening. If you found this valuable, you can subscribe to the show on Apple Podcasts, Spotify, or your favorite podcast app. Also, please consider giving us a rating or leaving a review as that really helps other listeners find the podcast. You can find all past episodes or learn more about the show at lennyspodcast.com. See you in the next episode.
Glossary
| English | 中文 |
|---|---|
| agent | 代理 |
| AGI | AGI(通用人工智能,保留原文) |
| Assistants API | Assistants API(保留原文) |
| copilot/Copilot | Copilot(保留原文) |
| corpus | 语料库 |
| deal flow | 项目交易流 |
| Dev Day | Dev Day(开发者大会,保留原文) |
| Developer Relations | 开发者关系 |
| diligence | 尽职调查 |
| empirical study | 实证研究 |
| fine-tuning | 微调 |
| GPT Store | GPT Store(保留原文) |
| GPTs | GPTs(保留原文) |
| GPU | GPU(保留原文) |
| Harvey | Harvey(法律 AI 公司,保留原文) |
| high agency | 高能动性 |
| OKR | OKR(目标与关键成果,保留原文) |
| prompt engineering | 提示词工程 |
| Rabbit R1 | Rabbit R1(保留原文) |
| TLDraw | TLDraw(保留原文) |
| Tom Cruise | 汤姆·克鲁斯 |
| UX | UX(用户体验,保留原文) |
走进 OpenAI | Logan Kilpatrick(开发者关系负责人)
文字记录
找到高能动性的人
Logan Kilpatrick: 找到高能动性(high agency)且以紧迫感工作的人——如果我今天要招五个人,这两点是我最看重的特质。因为如果你身边有高能动性的人,不需要获得五十个人的共识就能采取行动,那你几乎可以征服世界。他们一听到客户遇到的挑战,就已经在着手寻找解决方案,而不是等待其他所有事情先发生……这些人直接去做、去解决问题,我很喜欢这一点。能够身处这样的情境中,真的很有趣。
Lenny: 今天我的嘉宾是 Logan Kilpatrick。Logan 是 OpenAI 的开发者关系负责人,负责支持在 OpenAI 的 API 和 ChatGPT 上进行开发的开发者们。在加入 OpenAI 之前,Logan 曾是 Apple 的机器学习工程师,并为 NASA 的开源政策提供咨询。不管你信不信,ChatGPT 发布至今才一年出头,却已经彻底改变了我们对 AI 的认知,以及它对产品和生活的意义。Logan 一直身处这一变革的最前线,每天都在帮助开发者和企业弄清楚如何利用这些新的 AI 超能力。
在我们的对话中,我们深入探讨了人们如何在工作和生活中使用 ChatGPT、新的 GPT 以及其他 OpenAI API 的实际案例。Logan 分享了一些关于如何提升提示词工程(prompt engineering)能力的非常有趣的建议。我们还聊到了 OpenAI 的内部运作方式、他们如何如此快速地发布产品,以及在招聘时最看重的两个关键特质。此外,Logan 也谈到了他认为基于 OpenAI API 构建新产品和新创业公司的最大机会在哪里。
我们也稍微聊到了 OpenAI 那个非常戏剧性的周末——关于董事会、Sam 和所有那些事情——以及更多内容。非常感谢 Dan Shipper 和 Dennis Ing 提出了很好的问题和建议。接下来,请收听我对 Logan Kilpatrick 的采访。
OpenAI 董事会事件的内幕
Lenny: Logan,非常感谢你的到来,欢迎来到播客。
Logan Kilpatrick: 谢谢你邀请我,Lenny,我非常兴奋。
Lenny: 我想先聊聊那个显而易见但大家可能已经开始淡忘的话题——虽然这已经是几个月前的事了,但我仍然非常好奇。在那个涉及董事会、Sam 和所有那些事情的戏剧性周末,OpenAI 内部是什么样的情况?当时到底是什么感受?有没有什么外界可能还没听过的故事,可以分享一下内部当时的情况和正在发生的事?
Logan Kilpatrick: 那确实是一个非常紧张的感恩节假期。先说大背景——ChatGPT 发布以来,OpenAI 一直在高强度推进,而那一周本来应该是全公司第一次真正放假休息、重新调整的时间。所以,说点私心的——我当时特别期待能和家人共度时光。然后,周五下午我们收到了变动的消息,这确实非常令人震惊。因为——这也是很多人的共同感受——大家对 Sam、Greg 和我们的领导团队一直有着、并且现在依然有着深厚的信任,所以这一切让人非常意外。而且 OpenAI 的公司文化是非常透明和开放的,当有问题或事情发生时,我们通常都会知情。而这次,很多事情发生在董事会和领导团队之间,我们中的很多人确实是第一次听说,所以非常、非常令人意外。
作为一个不在旧金山的人,说句私心的话,我其实有点庆幸这件事发生在感恩节假期,因为很多人已经回到了各自不同的地方。所以我觉得有一丝安慰,知道自己不是唯一一个不在旧金山的人——因为大家都在现场聚集在一起处理各种事情、共度那段时光。知道还有几个同事和我一样处于信息圈外,心里稍微好受一些。
最让我惊讶的是大家回归工作的速度之快。感恩节后的那一周我飞去了旧金山——这本来不在我的计划内——为了当面和团队处理事务。周一早上我走进办公室时,本以为会看到什么不对劲的情况。但事实上,大家都在全神贯注地投入工作。我觉得这体现了我们团队的水准——每个人都对我们正在建设的使命充满热情。所以,这确实是整件事中最令人惊讶的地方。我想很多公司遇到这种情况,很可能会被打乱相当长一段时间,但大家立刻就回到了正轨,这一点我非常喜欢。
Lenny: 我感觉这件事也许还让团队走得更近了。它像是一种共同经历的创伤体验,可能把人们凝聚在一起,因为这是所有人共同经历过的事情。这方面有什么感受吗——比如”哇,现在有些东西确实不太一样了”?
时机与团队凝聚力
Logan Kilpatrick: 我的一个感触是,这件事发生的时机其实让我很庆幸。我认为如今的代价仍然不算低。很多人把自己的业务建在了 OpenAI 之上,我们有大量的 ChatGPT 用户,如果我们出了什么问题,确实会影响我们的客户。但放到世界层面来看,如果 OpenAI 不在了,别人也会去建模型,继续推进通用智能的进程。快进五到十年,如果类似的事情在那时发生,而我们还没有经历希望中即将到来的工作变革以及所有那些将要发生的变化,我认为后果可能会严重一些,甚至严重得多。所以我很庆幸这件事发生在代价还比较低的时候。
我完全同意你的看法。自从我加入以来,团队在过去一年里增长得非常快。想到有那么多新同事加入,真是有点不可思议。我真的觉得这次事件把大家凝聚在了一起,因为在我加入的时候,把我们团结在一起的是 ChatGPT 的发布、GPT-4 的发布。对于那些没有赶上这些发布的同事来说,可能是某个重要的日子;对于参加了开发者日的同事来说,可能就是这次事件。所以我们经历了这些真正把公司跨部门凝聚在一起的事件,希望未来所有这样的事件都是像 GPT-5 这种令人兴奋的事情——不管它什么时候发布。
AI 最令人惊叹的新界面
Lenny: 太好了。我们接下来聊聊 GPT-5。换个完全不同的话题——最近你看到 AI 做过的最令人震撼或最出人意料的事情是什么?
Logan Kilpatrick: 最让我兴奋的是围绕 AI 的一些新界面,比如 Rabbit R1。不知道你有没有见过,那是一个消费级硬件设备。还有一家公司叫 TLDraw,不知道你有没有看过 TLDraw?
Lenny: 我想我看过。就是你画一个草图,然后它把它变成一个网站?
Logan Kilpatrick: 而这其实只是 TLDraw 正在做的很小一部分。现在有各种各样与 AI 交互的新界面,前几天我和 TLDraw 的团队聊过,想到如今人们使用 AI 的主要方式还是聊天,真是令人震惊。我实际上认为——这也是我对 TLDraw 团队非常看好的原因,我特别期待他们在做的产品——他们正在构建一种无限画布的体验。你可以想象,当你每天与 AI 交互时,你可能想跳转到你的无限画布上,AI 已经填充了所有细节,你可以看到对一个文件的引用、对一个视频的引用,以及各种不同的内容。
这是一种非常酷的方式。实际上,对于我们人类来说,以那种格式去查看信息,比在聊天里列出一堆东西要合理得多。所以我真的很期待看到更多人推动这件事。我认为 2024 年是多模态 AI 的元年,但也是人们真正突破围绕 AI 的一些新 UX 范式边界的一年。
Lenny: 很有意思。作为一个做了多年产品经理的人,我觉得聊天机器人这东西——我们每次头脑风暴讨论新功能的时候,总会有人说”嘿,我们应该做个聊天机器人来解决这个问题”。它就像一个永恒的老梗——“哦,聊天机器人”,或者”又有人建议我们做聊天机器人了”。而现在它们真的变得有用了、能正常工作了,大家都在做聊天机器人,其中很多都是基于 OpenAI 的 API。
这其实不算个问题,但我真正想问的——我本来打算后面再聊这个——就是当人们考虑构建一个产品的时候,比如 TLDraw,他们应该怎么思考?OpenAI 不会去做什么,而 OpenAI 会帮我们做什么?我们不用担心他们未来会做一个 TLDraw 的竞品。简单来说就是,在知道 OpenAI 可能会改变主意的情况下,怎么判断自己不会被 OpenAI 颠覆?
创业者的护城河:垂直 vs 通用
Logan Kilpatrick: 这是个好问题。我认为我们高度专注于那些非常通用的使用场景——通用推理能力、通用编程、通用写作能力。当你开始进入一些非常垂直的应用领域时……我觉得一个很好的例子其实是 Harvey。不知道你有没有了解过 Harvey,那是一个法律 AI 产品,他们在构建定制化的模型和工具来帮助律师和律所的工作人员。这是一个很好的例子——我们的模型可能永远不会做到 Harvey 正在做的那些事情,因为我们的目标和使命是去解决那些非常通用的使用场景,然后人们可以在此基础上进行微调,构建自己的定制化 UI 和产品功能。
我对那些今天正在构建非常通用的产品的人有很多共情,也有很多期待。我和很多开发者交流过,他们在构建通用助手、通用代理等等。我觉得这很酷,是个好想法。但他们面临的挑战是,他们最终会在这条赛道上直接和我们竞争。我认为这里有足够的空间让很多人取得成功,但对我来说,当我们最终推出某种通用代理产品时,你不应该感到惊讶——因为我们现在其实已经在通过 GPTs 来构建这个方向了。而我们不会去推出那些各种各样的垂直化产品。我们不会去做一个 AI 销售代理。那不是我们的方向。而那些正在做这件事、拥有特定领域知识、并且对那个问题空间充满热情的公司,可以深入其中,利用我们的模型,继续站在前沿,而不必自己去做所有的研发工作。
Lenny: 明白了。所以我听到的建议是:把使用场景做具体,这可以是针对某个场景做专门优化的模型,比如销售场景,也可以是通过界面或体验来解决一个更具体的问题。
Logan Kilpatrick: 而且我觉得,如果你要试图解决这种非常通用的问题——如果你要试图构建下一个通用助手来和 ChatGPT 竞争,那它必须要有根本性的不同。用户必须真心觉得”哇,这解决了我用 ChatGPT 时的这十个问题,所以我要去试试这个新东西”。否则,我们正在把大量的工程和研究精力投入到让 ChatGPT 成为一个出色的产品上,而这就只是创办公司常见的挑战了——和这样的东西竞争确实很难。
AI 如何提升公司内部效率
Lenny: 太好了,这个问题本来打算后面再聊的,很高兴提前聊到了。我猜很多开发者和创始人都在想这个问题。沿着类似的思路,现在有很多讨论说 ChatGPT、GPTs 以及你们提供的很多工具会让公司变得更高效——不需要那么多工程师、数据科学家、产品经理之类的。但我觉得对公司来说,也很难想清楚我们到底能做什么来让自己的公司变得更高效。我很好奇你能不能分享一些例子,比如有公司内部构建了一个 GPT 来做某件事,从而不必花费工程时间,或者总体上利用 OpenAI 的工具让内部业务变得更高效?
Logan Kilpatrick: 这个问题问得好。我不知道你能不能把这个放在节目笔记里之类的,哈佛商学院有一项非常好的研究……我忘了他们跟哪家咨询公司合作做的了,也许是波士顿咨询之类的,也可能是其他某一家。他们谈到的是,使用 AI 工具的人——我记得在那个研究场景中具体用的是 ChatGPT——相比不使用 AI 的人,效率提升了一个数量级。我也很期待,随着这项技术发布后时间推移,我们能获得更多实证研究。就我自己而言,作为一个工程师,我使用 ChatGPT 之后,交付速度比以前快了很多。
我没办法给自己一个具体的数字,但我猜现在应该已经有人在做这些研究了。我认为工程实际上是目前用 AI 能做的杠杆率最高的事情之一,至少能带来 50% 以上的提升,尤其是一些比较容易上手的软件工程任务。模型做这些工作的能力真的很强。想想其实挺疯狂的……我猜 GitHub 大概也发布了大量围绕 Copilot 的研究,你可以用那些来类比人们从 ChatGPT 上获得的收益。但那些可能确实是杠杆率最高的应用场景了。
我认为现在有了 GPTs,人们可以去解决一些更具体的战术性问题。ChatGPT 有一个普遍的挑战,就是它在很多不同的场景下都能给出一个还算不错的答案,但往往不够贴合你们公司的风格,也不够贴合你工作的细微之处。而现在有了 GPTs,使用 ChatGPT 团队版和企业版的用户可以真正构建这些东西,融入自己公司的细节,让解决那些任务变得更有领域针对性。我们几个月前才刚发布 GPTs,所以我觉得目前还没有什么好的公开成功案例,但我猜现在各个公司里已经在取得成功了,希望接下来几个月随着大家越来越热衷于分享这些案例研究,我们会听到更多。
Lenny: 我分享一个例子。我有一个好朋友叫 Dennis Yang,他在 Chime 工作,他跟我讲了他们在 Chime 做的两件事,看起来挺有价值的。一个是他建了一个 GPT,帮助撰写 Facebook 和 Google 的广告,给你广告投放的创意,这样就减轻了营销团队或增长团队的一点负担。然后他还做了另一个 GPT,能交付实验结果,有点像数据科学家的角色,告诉你这个实验的结果是什么,然后你可以跟它对话,问”你觉得我们还需要再跑多长时间”,或者”这对我们的产品可能意味着什么”之类的。我觉得这真的很——
Logan Kilpatrick: 我很喜欢这个。
Lenny: 就像你说的那样。还有没有其他例子?就是你听说别人做了什么,让你觉得”哇,这做法真聪明……”我知道工程方面有 Copilot 那样的工具,还有没有其他让你印象深刻的?给大家一点灵感,比如”哇,这种思路很有意思,我也应该这样去用这些工具”。
Logan Kilpatrick: 我见过一些围绕规划场景的很有意思的 GPT,比如你想为你的团队做 OKR 规划之类的。我昨天还真看到有人发推文分享了。我也见过一些做风险投资相关事务的,用来对项目交易流做尽职调查,挺有意思的,能获得一些不同的视角。我觉得所有这类横向应用场景——你能引入一个不同的角色,从不同角度获取观点——都非常酷。我个人也用一个 GPT,是我自己用的私有 GPT,帮助我做不同季度的规划之类的事情,确保我在框架设计上保持一致,比如把所有内容都回溯到具体的指标上,这些都是人们做规划时在数据方面经常遗漏的东西。有一个 GPT 来强迫我思考这些事情,对我来说帮助非常大。
Lenny: 等等,能多说一些吗?这个 GPT 具体帮你做什么,你给它输入什么?
Logan Kilpatrick: 我忘了之前在网上看到的是哪篇文章了,但它讲的是如何最好地为规划做好准备。我从那篇文章里提取了一堆内容——我看能不能之后把它公开,把链接发给你——把其中的一些建议放进 GPT 里,现在每当我做任何规划、想构建什么东西的时候,我把它输进去,让它生成一个时间线,生成所有具体的细节,包括我要追踪的指标和成功标准是什么、规划过程中可能需要纳入哪些重要的跨职能利益相关方,等等这些,确实很有帮助。
Lenny: 哇,这太酷了。如果你把它公开的话那就太好了。如果你真公开了,我们会在节目里放链接,然后把它捧成 GPT 商店里最受欢迎的 GPT。
Logan Kilpatrick: 好极了。
提示词工程的兴起
Lenny: 换一个稍微不同的方向,现在有整个一类关于提示词工程的东西。感觉它是一个正在快速兴起的技能。我真的看到一家创业公司在招聘提示词工程师——是我投资的一家公司——我觉得这会让很多人大吃一惊,居然出现了这样一个新职位。我知道大家的看法是这个不会永远存在,理论上 AI 会变得足够聪明,你不需要再费心去想怎么巧妙地让它做你需要它做的事。但你能不能描述一下提示词工程这个概念——人们可能经常听到这个术语——然后再更有意思的是,你对大家提升写提示词的能力有什么建议,比如在 ChatGPT 上或者通过 API 使用时?
Logan Kilpatrick: 这个领域确实非常有意思,我也很期待大家做更多科学的实证研究,因为现在有太多凭感觉总结的最佳实践,也许在某种意义上并不真正成立。提示词工程之所以存在、之所以被提起,是因为模型由于训练方式的原因,非常倾向于对你提出的问题直接给出一个答案。垃圾进垃圾出。如果你问一个很基础的问题,你就会得到一个很基础的回答。其实对人类来说也是一样的,你可以想一个很好的例子。当我走到另一个人面前问”今天过得怎么样”,对方说”还不错”。
几乎零细节,没有层次,一点都不有趣。反之,如果你跟一个人有一些上下文、有私人关系,我跟你打招呼说“Lenny,你今天怎么样?上一期播客做得如何”等等,你就有更多的上下文和能动性来回答我的问题。我觉得这就是提示词工程。
Logan Kilpatrick: 我对这件事的整体看法是,提示词工程其实是一件非常人类化的事情。当我们想从一个人身上获取价值时,我们就在做这种提示词工程——我们努力与那个人进行有效沟通,以获得最好的输出。模型也是一样的。我觉得,同样地,因为我们使用的是一个看起来非常聪明的系统,我们就假设它拥有所有这些上下文,但实际上你可以把它想象成一个具有人类水平智能却完全没有上下文的存在。它完全不知道你会问它什么。它从来没见过你。它不知道你是谁、做什么、目标是什么。有时候你得到非常泛泛的回答,原因就在于人们忘记了需要把这些上下文放进模型里。
所以我认为能帮助解决这个问题的方案——其实我们在 DALL-E 的场景中已经在某种程度上这么做了。当你使用我们的图像生成模型 DALL-E,说”我想要一张乌龟的图片”时,它实际上会接收这个描述,把”我想要一张乌龟的图片”转化成一个高保真度的版本——生成一张乌龟的图片,带壳,绿色背景,水里有睡莲等等。它添加了所有这些丰富的细节,因为模型就是用这些高保真度的示例训练出来的。文本模型也会走上这条路。
你可以想象这样一个场景:你进入 ChatGPT,说”帮我写一篇关于 AI 的博客文章”,它会自动去生成一个更高保真度的描述——“帮我写一篇关于 AI 的博客文章,讨论不同技术之间的权衡,给出一些示例用例,引用一些最新论文”——它把这些全部替你做好,然后你作为用户就可以说”对,这差不多就是我想要的,我来编辑一下,这里改改那里改改”。
说到底,根本问题在于我们人类是懒惰的。我们不想把所有……我们并不真的愿意把自己的意思全部打出来,而我认为 AI 系统实际上会帮助解决一部分这个问题。
提升提示词的具体建议
Lenny: 那么,在那一天到来之前,大家在使用比如 ChatGPT 写提示词时,可以做哪些改进?我举个例子。Tim Ferriss 提过一个很好的建议,我一直在偷师——就是当你准备做访谈时,你去 ChatGPT。所以这次我就为你这么做了。我说”嘿,我要在我的播客上采访 Logan Kilpatrick,他是 OpenAI 的开发者关系负责人。以 Tyler Cowen 的风格给我十个问题”。Tyler Cowen 我觉得是最好的访谈者,他特别擅长问非常尖锐、原创的问题。所以你有什么建议能帮我改进这个提示词、获得更好的结果吗?那些问题还不错,挺好的,也挺有趣,但没有到那种”天哪,这些问题太厉害了”的程度。所以在这种场景下,你会给我什么建议?
Logan Kilpatrick: 这是一个很好的例子,你需要思考你在问的那个人是谁。我可能不是一个在互联网上有足够多信息的人,模型在训练中并没有掌握我背景的细节。我想如果你的嘉宾更有名,互联网上可能有足够的上下文来回答这些问题。实际上你需要自己做一些工作。比如说,如果你使用了 Bing 联网浏览功能,你可以说”这里是 Logan 的博客链接,还有他谈到的一些内容。这是他的 Twitter 链接。去看看他的一些推文和博客文章,找出他有哪些有趣的观点值得在播客中展现”,诸如此类。
这又是给模型足够的上下文来回答问题。我觉得,那个提示词对于某些人来说可能效果很好——比如你去采访 Tom Cruise 这样在互联网上有大量信息的人,可能效果会好一些。
Lenny: 所以建议就是多给上下文。它不会跟你说”嘿,我对 Logan 其实不太了解,给我更多信息”。它就是直接给你一堆好问题。
Logan Kilpatrick: 没错。它太想回答你的问题了。它根本不在意自己没有足够的上下文。它是你能想象的世界上最渴望回答问题的人,没有那些上下文,就很难给出任何有价值的东西。如果我们要印 T 恤的话,上面应该写着”上下文就是你所需要的一切,上下文是唯一重要的东西”。让语言模型为你做任何事情,上下文是极其重要的一环。
Lenny: 还有其他建议吗?就是大家坐在那里,也许现在正打开 ChatGPT 在写提示词,你还有什么能帮助他们获得更好结果的建议吗?
Logan Kilpatrick: 我们其实有一份提示词工程指南,大家可以去看看,里面有一些示例。具体能在多大程度上提升性能,取决于各种因素。有很多很小但看起来有点傻的做法,比如加一个笑脸表情,就能提升模型的表现。我相信大家见过很多这类看起来很傻的例子——让模型休息一下再回答问题,诸如此类。再想想看,之所以如此,是因为训练这些模型的语料库(corpus)就是人类之间互相发送的信息。你告诉一个人”我休息一下再回来工作,状态会更清醒,回答问题会更好,工作也会更出色”——这些模型也遵循非常类似的规律。同样地,当我看到消息末尾有一个笑脸,我会觉得这是一次积极的互动,我应该更愿意给出一个好回答,在对方要求的事情上花更多心思。
Lenny: 哇,等等。所以这是真的?加一个笑脸表情确实可能给你更好的结果?
Logan Kilpatrick: 再说一次,所有这些东西的挑战在于它非常微妙,而且性能提升幅度很小。你可以想象大概百分之一二的水平,对于几句话的回答来说可能根本看不出区别。但如果你在生成一整部长篇文本,笑脸表情可能确实会带来实质性的差异,对于简短的文本内容可能就看不出来了。
GPTs 的推出
Lenny: 好的,好建议。太神奇了。好,我们之前聊到了 GPTs——也许可以介绍一下你们推出的这个新东西 GPTs 是什么?我也很好奇它的进展。这是 OpenAI 一个很大的变化和新要素,就是你可以构建自己的迷你——我差点自己在解释了——迷你版 ChatGPT,然后人们可以……我觉得你可以收费?你可以对自己的 GPT 收费,还是现在全部免费?
Logan Kilpatrick: 现在全部免费。
Lenny: 好的。我猜未来大家应该可以收费。所以现在有了这个完整的商店。基本上就是你们推出的一个完整的应用商店。进展如何?有什么动态?有什么让你意外的地方?大家需要了解什么?
GPTs 的实际价值与未来潜力
Logan Kilpatrick: 进展很好。再说一下,过去如果你有一个很酷的 ChatGPT 用例,想分享给别人,你实际上得进去跟模型开始一段对话,用提示词引导它做你想做的事,然后在对话中的操作还没发生之前就把那个链接分享给别人,说”来,你可以在 ChatGPT 里接着我开的这个对话聊下去”。
GPT 改变了这一点——你把所有重要的上下文预先放进模型里,然后人们就可以去跟一个定制版的 ChatGPT 聊天。真正有趣的是,你可以上传文件,可以给它自定义指令,可以添加各种工具。比如内置了代码解释器,可以做数学运算;还内置了浏览功能、图像生成功能。对于更高级的开发者用例,你还可以连接外部 API——比如 Notion API、Gmail API 等等,让它真正替你执行操作。
大家解锁了非常多很酷的东西。最让我兴奋的,其实是非开发者现在也能去解决那些非常非常有挑战性的问题了——只要给模型提供足够的上下文,告诉它问题是什么,它就能解决。回到”上下文就是你所需要的一切”这个理念,在 GPT 的场景下确实如此:给够上下文,你就能解决更有趣的问题。
这方面让我兴奋的事情太多了。关于变现,本季度晚些时候商店上线后,人们可以根据谁在使用自己的 GPT 来获得收入,这将是一个巨大的解锁,会让很多人看到这里的机会。我也认为继续为不会写代码的人提供更多 GPT 能力是非常令人兴奋的。就算是我作为一个软件工程师,把 Notion API 或 Gmail API 接到我的 GPT 里也不是特别容易。我更希望能一键用 Gmail 登录,然后我的 Gmail 就能被访问了,或者别人也能登录自己的 Gmail 并授权访问。我觉得这些功能随着时间推移都会到来,但今天来看,自定义提示词基本上还是 GPT 最大的价值所在。
Lenny: 太棒了。我在另一个显示器上打开了,Canva 的 GPT 目前排在第一。我刚才一边跟你聊一边试着玩一下,想做一个大横幅写着”关键在上下文啊傻瓜”,但没成功,肯定是我哪里操作不对,不过我也没太专注在上面毕竟咱们在聊天。确实很酷。最后再问一个问题:你有没有看到过谁做了一个 GPT,让你觉得”哇,太厉害了,太酷了”,有让你惊喜的?我也分享一个我觉得很酷的例子,你听到这个问题时有想到什么吗?
Logan Kilpatrick: 我的第一反应是 Zapier。Zapier 围绕 GPT 做的所有东西都是你能想象到的最有用的。你能用它做到的事情非常深入……我不确定 Zapier 的 GPT 目前是怎么打包的,但作为第三方开发者,你实际上不需要会写代码就能把 Zapier 集成到你的 GPT 里。他们在这方面推进了很多,基本上 Zapier 目前支持的 5000 种连接你都可以带入你的 GPT,让它基本上什么都能做。所以我对 Zapier 和基于它们构建的开发者都感到非常兴奋,用那个平台能解锁太多东西了。对于非开发者来说,这可能是我觉得最令人兴奋的。
Lenny: 太棒了。Zapier 总是能在那里把东西串起来。
Logan Kilpatrick: 是的,他们很棒。
Lenny: 我想到的那个例子是,我有个朋友,他是一家叫 Runway 的公司的 CEO,做了一个叫 Universal Primer 的东西,帮助人们学习。它的描述是”关于任何事物,学习一切”,基本上是一种苏格拉底式的方法来帮你学习。比如你说”解释一下语言模型中 transformer 是怎么工作的”,然后它会逐步讲解并提问,帮你学习新概念。我觉得它是教育类 GPT 里排名第二的。
Logan Kilpatrick: 我很喜欢那个。他确实很厉害。
OpenAI 的工作方式与快速迭代
Lenny: 我想聊聊在 OpenAI 工作是什么感觉,产品团队怎么运作,公司怎么运作。你之前待过的两家公司是 Apple 和 NASA,这两家都不以速度见长。而现在你在 OpenAI,以速度极快著称——对某些人来说可能太快了,就像我们看到的那次董事会事件。所以我很好奇,OpenAI 到底做对了什么,才能以这么高的标准这么快地构建和发布?有没有什么流程或工作方式是你觉得其他公司也应该尝试的,以便更快地行动、发布更好的产品?
Logan Kilpatrick: 这里面有非常多有趣的权衡和围绕公司行动速度的张力。对我们来说,如果你以 Apple 和 NASA 为例,它们都是比较老牌的机构,随着时间的推移,趋势就是事情变慢。越来越多的检查和制衡机制被建立起来,拖慢了进度。我们还年轻,是一家新公司,所以没有太多那种随着时间积累下来的制度性障碍。
高能动性与高紧迫感
Logan Kilpatrick: 我觉得最重要的一点——Sam 似乎在 2022 年左右发过一条推文谈到这个——找到具有高能动性并且带着紧迫感工作的人,这是最重要的……如果今天让我招五个人,这两项特质是我最看重的。因为如果你身边都是高能动性的人,不需要去获取五十个人的共识——你信任这些高能动性的人,他们可以直接去做该做的事——我觉得这是最重要的,如果把一切浓缩到本质的话,这就是最关键的东西。
我和身边的同事都能看到这一点。大家的能动性非常高。看到问题就直接上手解决。听到客户反馈什么困难,就已经在推进解决方案了,而不是等待其他环节——我觉得传统公司就是被这些拖住了,“哦,先跟七个部门确认一下,收集反馈。“我们的人直接去做,解决问题。我非常喜欢这一点。能成为这种环境的一部分,太有意思了。
Lenny: 太酷了。我特别喜欢这两个特质,因为之前没听人这么说过。这两点可能是你们最看重的——高能动性、高紧迫感。为了让大家对这两个词在招聘中的具体表现有更清晰的认识,你刚才举了一个客服的例子,听到一个 bug 就去修。还有没有其他例子能说明高能动性是什么样的?紧迫感呢,除了”快快快,发发发”之外,还有什么更具体的体现?
Logan Kilpatrick: 我们在 Dev Day 发布的 Assistants API 就是个很好的例子。我们不断收到开发者的反馈,说希望在现有 API 之上有更高层的抽象。然后团队里一群人就凑到一起说,“嘿,我们来规划一下,看看怎么做一个这样的东西。“然后很快就协作开发了实际的 API,现在这个 API 驱动着市面上大量的助手应用。我觉得这是一个很好的例子——它不是自上而下的,不是某个人坐在那里说”我们做这五件事”,然后”好,团队去执行”。而是大家真正看到了问题,并且知道可以作为一个团队迅速聚在一起解决这些问题。Assistants API 就是一例,类似团队主动出手的情况还有无数个,但这个是我在脑海中首先想到的。
OpenAI 的规划与优先级
Lenny: 这就让我想问了——OpenAI 的规划是怎么做的?在这个例子里,感觉就是”嘿,我们觉得需要做这个,那就去做吧。“但我想应该还是有路线图、优先级、目标之类的吧?路线图制定和优先级排序一般来说是怎么运作的,才能允许这种事情发生?
Logan Kilpatrick: 我觉得这是 OpenAI 更有挑战性的部分之一。挑战很多。所有人都想从我们这里得到一切,尤其是在 ChatGPT 如此庞大、我们的 API 使用如此广泛的今天,人们会直接来说”嘿,我们要所有这些东西。“我们有一系列核心指导原则。首先回到使命——这件事真的能帮助我们迈向 AGI 吗?所以焦点在于:眼前可能有一个闪亮的诱惑——比如优化用户参与度之类的东西——但那真的是我们该做的吗?也许答案是肯定的,也许那确实能帮我们更快走向 AGI,但用这个透镜来审视一切,我认为永远是做任何决策的第一步。
在开发者这一侧,也有一些核心原则,比如可靠性——“嘿,如果我们增加各种很酷的新 API、新端点、新模态、新抽象,那当然很好,但我们有没有给客户提供稳健可靠的 API 体验?“这往往是第一个问题。我觉得我们在这方面有过做得不够的时候,当时还有很多其他想做的事情,但我们必须把焦点和优先级拉回到可靠性上——因为说到底,如果你的东西别人没法稳定可靠地使用,那再好也没人在乎。
所以有这些核心原则。我想再说一次,除了这些决策原则之外,实际的规划流程其实相当标准化。我们聚在一起,有 H1、Q1 的目标,大家全力冲刺。我觉得真正有意思的是事情随时间如何变化。你以为要做这些高层级的事情——新模型、新模态,等等。然后随着时间推移,各种动荡和变化层出不穷。有趣的是如何建立机制来说,“嘿,当脚下的地面不断变化时——就像今天 AI 领域的疯狂一样——我们如何更新对世界的理解、更新我们的目标?”
Lenny: 有意思,听起来跟大多数其他公司差不多。有 H1 规划,有 Q1 规划。那你们有没有具体的指标和目标?比如 OKR 之类的?还是说就是”好,我们要发布这些产品”?
Logan Kilpatrick: 我觉得粒度要高得多。说实话,我不认为 OpenAI 是一个很 OKR 导向的公司。我不觉得各团队现在在做 OKR,我也不太清楚为什么会这样。我甚至不知道 OKR 是不是还是行业主流。你大概在跟更多做这类决策的人聊,所以我反而好奇——你从其他人那里看到的情况是怎样的?OKR 现在还普遍吗?
Lenny: 是的,绝对普遍。很多公司用 OKR,喜欢 OKR。也有很多公司讨厌 OKR。OpenAI 不是一个 OKR 驱动的公司,我并不意外。沿着这个话题,我不知道这些你能分享多少——你们怎么衡量发布的东西是否成功?我知道有 AGI 这个终极目标,那有没有什么方式追踪我们离它越来越近了?发布 GPT Store 或者 Assistants 之类的东西时,你们还会看什么,来判断”好,这正是我们期望的”?就是看采用率吗?
Logan Kilpatrick: 是的,采用率是一个很好的指标。还有一系列围绕收入的指标、在我们平台上构建应用的开发者数量等等。我不想深入太多——让 Sam 或者我们领导团队的其他人来谈更多细节。但我觉得其中很多指标其实是对其他东西的代理。即使收入是一个目标,收入本身并不是真正的目标。收入是获取更多算力的代理,而算力才是真正帮助我们获得更多 GPU、训练更好的模型、最终达成目标的手段。所以有这些中间层——即使我们说某件事是目标,你单独听到可能会想,“哦,OpenAI 就想赚钱。“但实际上,钱是获得更好模型的机制,这样我们才能实现使命。我觉得这些角度都很有意思。
Slack 与沟通文化
Lenny: 我不确定我是否听说过比这更宏大的公司愿景——构建通用人工智能。我很喜欢这一点。我想很多公司大概都在想,“我们的版本是什么?“在离开这个话题之前,你有没有看到 OpenAI 还有什么做得特别好的地方,让它能这么快地行动、这么成功?你谈到了招聘高能动性、高紧迫感的人。还有没有什么让你觉得,“哇,这真是一种很好的运作方式?“我想其中一部分可能就是招到了极其聪明的人,这点大概不用明说。但除此之外还有别的吗?
Logan Kilpatrick: 我觉得使用 Slack 带来了不可忽视的好处。这可能有些争议,也许有些人不喜欢 Slack,但 OpenAI 的文化重度依赖 Slack,这真的……Slack 上的即时实时沟通至关重要。我非常喜欢可以把不同团队的人拉进来,让大家迅速汇聚到一起。所有人都一直在 Slack 上,所以不管你是远程的、在不同团队、还是在不同的办公室,公司文化很大一部分都深植在 Slack 里,它让我们能非常快速地协调,有时候给某人发一条 Slack 消息甚至比走到他工位还快,因为他就挂在 Slack 上,而且肯定在看。
我看到——不知道你有没有看——最近 Sam 和 Bill Gates 的那次访谈,Sam 说到 Slack 是他手机上使用时间最长的应用,“我甚至都不看手机上的使用时间了,因为我不想知道自己用 Slack 用了多久。“但我确信 Salesforce 那边的人看到这些数据肯定觉得,“这正是我们想要的。”
Lenny: 我也很喜欢 Slack,我是 Slack 的大力推广者。我知道有很多人黑 Slack,但它真的是一个非常好的产品。我试过很多替代品,没有一个能比的。我觉得对你们来说 Slack 有趣的一点是——你都不知道里面会不会有一个 AGI,不是真正的人,就在公司里工作。
Logan Kilpatrick: 我知道里面都是真人。目前还没有 AGI。不过我觉得,Slack 自己也在做很多很酷的 AI 工具,我很期待能用到。这也是为什么现在有这么多令人兴奋的 AI 进展。说到底,作为所有这些新 AI 产品的消费者,真的很令人激动。Google 就是个很好的例子——我很高兴 Google 在做很酷的 AI 东西,因为我是 Google Docs 的用户,我很喜欢用 Google Docs,还有他们其他一些产品。看到大家在围绕这些模型构建如此有用的东西,太棒了。
团队规模
Lenny: OpenAI 团队现在大概多大了?你能分享多少就分享多少,给大家一个规模感。
Logan Kilpatrick: 我记得去年年底公开的数字大概是 750 人左右,780 之类的。我们现在还在快速增长,所以具体最新的数字我就不当那个披露的人了,但团队正在疯狂扩张,我们的工程团队和产品经理团队也都在招人。如果有朋友感兴趣,非常欢迎来聊聊。
Lenny: 也许再问最后一个问题。你们在增长,可能要接近 1000 人了,但显然仍然非常创新,行动速度极快。你有没有观察到 OpenAI 在保持创新、不让新的大想法慢下来方面,有什么做得好的地方?
研究团队与小团队优势
Logan Kilpatrick: 有几件事。其中一件是,实际的研究团队——他们孵化了 OpenAI 大部分的创新——被有意保持在一个小规模。OpenAI 的大部分增长都集中在面向客户的角色、为 ChatGPT 等产品提供基础设施的工程角色上。研究团队是刻意保持小规模的。而且有各种各样的讨论,这真的很有意思。我刚看到我们一位研究人员发的帖子,他在讲:在一个受 GPU 算力容量约束的世界里——这对 OpenAI 的研究人员来说是现实,对其他所有地方的研究人员也是一样——每增加一名新的研究人员,对整个研究组来说实际上是一个净生产力损失,除非这个人能以一种如此深刻的方式提升所有其他人的水平,从而提高整体效率。
如果你只是加一个人,让他去搞一个完全不同的研究方向,那现在你就得跟他共享 GPU,其他所有人的实验都变慢了。所以这是研究人员面临的一个非常有趣的权衡,我觉得做产品的人不会遇到这种情况——如果我给 API 团队或某个 ChatGPT 团队加一个工程师,他们确实能写更多代码、做更多事情,这对所有人都是一个净正面的改善。但对研究人员来说并非总是如此,这是很有意思的一点,至少在一个 GPU 受限的世界里是这样——希望我们不会永远处于这种状态。
未来方向与新模态
Lenny: 我想把视角拉远一点,然后还有几个后续问题。OpenAI 接下来的方向是什么?大家应该预期你们即将推出和发布的工具有哪些?
Logan Kilpatrick: 新的模态。我认为 ChatGPT 会持续拓展各种可能的交互体验。今天的 ChatGPT 基本上还是文本输入、文本输出——或者应该说三个月前还只是文本输入、文本输出。我们已经开始改变这一点:现在你可以用语音模式,可以生成图片,可以拍照。所以我认为,持续扩展你与 AI 交互的方式,是 ChatGPT 接下来要做的事情。
我认为 GPTs 是我们迈向代理未来的第一步。同样,今天你使用一个 GPT 的时候,基本就是你发一条消息,几乎立刻得到一个回答,交互就结束了。我觉得随着 GPTs 变得更加成熟,你实际上可以说,“嘿,去做这件事,做完了告诉我。我现在不需要答案,我想让你真正花时间、认真对待这件事。”
回想一下所有那些人类类比——我们作为人类就是这么做的。当我请别人帮我做一件重要的事情时,我不指望他立刻做完、立刻把答案给我。所以我认为,推动更多这样的体验,将为大家释放多得多的价值。
最后一点是,GPTs 作为一个机制,把接下来几亿人带进 ChatGPT、带进 AI。我觉得如果你跟那些不近距离接触 AI 领域的人聊过,很多时候你会发现——即使他们听说过 ChatGPT(很多人其实没听说过),但当他们打开 ChatGPT,面对那个空白界面,他们会觉得,“我真的不知道我该拿这个东西做什么。它什么都能做,但不清楚它到底怎么解决我的具体问题。”
但 GPTs 的厉害之处在于,你可以把它打包成——“这是 AI 可以为你解决的一个非常具体的问题,而且做得非常好”——我可以把这个体验分享给你,你去试试那个 GPT,它真的帮你解决了问题,然后你会想,“哇,它帮我做了这件事。也许我应该花点时间看看我另外五个问题,看看 AI 是不是也能解决它们。“所以我觉得会有非常非常多的人开始上线、开始使用这些工具,因为那些非常窄的垂直工具对他们来说将是一个巨大的解锁。
GPT-5 的预期与现实
Lenny: 那么在刚才说的这个情况里,其实就是一个典型的横向产品问题——它能做的事情太多了,人们反而不知道它到底能为自己做什么。所以,转向更模板化、更针对具体使用场景、帮助用户上手,这些都非常合理。这也是很多 SaaS 产品面临的共同难题。你提到的另外几点也非常有意思,基本上就是更多交互界面,让人们更方便地与 OpenAI 的语音交互。你提到了音频之类的,这些都很有道理。然后就是代理这个方向,核心理念是——不再仅仅是一个聊天,而是”嘿,帮我把这件事做了”。
关于 GPT-5,我们之前稍微聊到过。大家对更强的版本有很多猜测,我觉得人们对 GPT 的未来发展抱有一些不切实际的期望——GPT-5 会解决世界上所有的问题。我知道你不会告诉我它什么时候发布、会做什么,但我从一个朋友那里听到过一个建议:今天构建产品时,应该面向 GPT-5 的未来来构建,而不是基于 GPT-4 目前的局限。为了帮助大家做到这一点,在 GPT-5 的世界里,哪些方面可能会变得更好?仅仅是更快吗?仅仅是更聪明吗?还是会有其他让人惊叹的东西,让人觉得”我真的应该重新思考我设计产品的方式”?
Logan Kilpatrick: 如果大家看过我们在三月份 GPT-4 发布时公布的技术报告,GPT-4 是我们训练的第一个模型,能够根据我们投入的计算量可靠地预测其能力。我们做了一项实证研究来展示——“这是我们的预测,这是实际的结果。“所以,作为一个对技术感兴趣的人,我觉得值得关注的是这个规律在 GPT-5 上是否仍然成立,希望在那个模型发布时我们能看到相关信息。
我同时也觉得可以得出几个观察。其中一个就是 GPT-4 发布后,世界的共识是”一切都变了。一切都突然不同了。这改变了世界,改变了一切。“然后慢慢地,大家回归现实——“这是一个非常有效的工具,它能更有效地帮我解决问题。”
我认为这毫无疑问是人们看待所有模型进步应该采用的视角。GPT-5 肯定会非常有用,会解决一些全新层级的问题。希望它会更快,希望在各方面都会更好,但从根本上说,世界上存在的问题还是那些问题,你只是有了一个更好的工具来解决它们。
回到垂直使用场景,我认为那些在解决非常具体场景的人,只是会能够做得更加高效。我不认为……人们对 GPT-5 有些不切实际的期望,觉得它会在背景里后空翻,同时帮我写代码,还替我跟妈妈打电话之类的。
这不是事实。它只会是一个非常有效的工具,跟 GPT-4 非常相似,而且它也会非常快地变得稀松平常。我觉得这其实是一个非常有趣的点。如果你能提前规划一个人们很快就会习惯这些工具的世界,我认为这本身就是一种优势。而假设这个东西会彻底改变一切——在很多方面我认为这反而是一个劣势——这是看待这些工具面世时错误的思维框架。
OpenAI 的 B2B 产品
Lenny: 顺着这个方向,你们在 B2B 产品上投入了很多。我上次听说,收入的一半是 B2B,另一半是 B2C,不过不确定是不是真的。作为一家公司、一个企业,如果与 OpenAI 合作,你能获得什么?它能解锁什么?是不是叫 OpenAI Enterprise?它叫什么,能获得什么?
Logan Kilpatrick: 好的,我觉得我们很多 B2B 客户是通过 API 来构建产品的,这是一个方面。如果你是 ChatGPT 的 B2B 客户,我们销售 Teams 版,就是把多个 ChatGPT 订阅打包在一起。我们也有 ChatGPT 的企业版。企业客户需要一堆企业级功能,比如 SSO 之类的,这些都跟 ChatGPT Enterprise 相关。
我觉得最酷的其实是能在内部共享一些提示词模板和 GPTs。你同样可以创建非常适合你公司的定制化工具,使用与解决你公司问题相关的所有信息,并在内部共享。对我来说,你肯定希望能跟团队成员协作你用 AI 创建的那些很棒的东西。所以这对企业来说是一个巨大的解锁。我认为这是两个最大的价值点。还有一些更高的模型使用限额之类的,但我觉得能共享你非常垂直的专业应用才是最有用的。
Lenny: 如果有公司在听这个播客,而且觉得很多员工都在用 ChatGPT,最简单的做法就是把它整合成一个企业账户,加上单点登录,这大概能省钱,也更容易协调和管理。
Logan Kilpatrick: 对,还有很多安全方面的功能,比如你想控制——你不希望员工使用 GPT Store 里的某些 GPTs,因为你担心安全或隐私之类的问题。你不想让你的私密数据流到不该去的地方。注册企业版就很有意义,这样你就能对正在发生的事情有更多控制。
即将发布的新功能
Lenny: 好的,明白了。明天有一场发布会,在我们录制这期节目之后。你能聊聊有什么新的东西、即将发布什么吗?我想这期节目会在录制几周后才上线,但大家应该知道明天 OpenAI 在我们的领域有什么新东西发布?
Logan Kilpatrick: 好的,更新了,有几件不同的事情。简单来说,更新了 GPT-4 Turbo 模型,也更新了在 Dev Day 上发布的预览模型,有了一个更新版本。它修复了这个问题——如果大家在网上看到过人们讨论的模型”懒惰”现象——我们在这方面做了改进,修复了很多出现这种情况的场景。所以希望模型会不那么”懒惰”一点。最关键的是第三代嵌入模型。我们之前在录制前聊过嵌入的各种酷炫使用场景。如果大家之前用过嵌入,它本质上是支撑很多”用你自己的文档或你自己的语料库进行问答”的技术。就像你说的,你有一个网站,人们可以在上面针对播客录音提问。
Lenny: Lennybot.com,去看看吧。
Logan Kilpatrick: 对,lennybot.com。我的推测是 lennybot.com 实际上就是基于嵌入运行的。所以你把整个语料库——所有的录音、你的博客文章——全部做嵌入,然后当人们提问时,你就可以去比对问题和知识库之间的相似度,进而给出一个答案,并且引用一个实证事实,也就是你知识库中真实存在的内容。这非常有用,大家在这方面做了大量的工作。它的核心思路就是让这些模型扎根于现实、扎根于它们已知为真的内容。我们知道你播客中的所有内容至少是你曾经说过的,在这个意义上是真实的,我们可以把它们带入模型针对问题生成的回答中。所以这会非常酷。
新的 V3 嵌入模型,同样是业界最先进的性能。很酷的一点是非英语语言的性能有了极大提升。我觉得过去大家基本上只能用嵌入做英语的内容,而且效果也只有在英语上才真正好。而现在我觉得你可以在那么多新的语言上使用了,因为跨语言的性能确实提升了很多,而且价格也降到了原来的五分之一,这太棒了。没有什么比帮大家降低成本更让人开心的了。我很喜欢这件事。我记得现在大概 1 美元可以嵌入大约 62,000 页文本,非常非常便宜。所以嵌入能做的酷事情非常多,很期待看到大家发明出更多新玩法。
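上面描述的“先把语料库做成嵌入,再比对问题与知识库的相似度”的思路,可以用下面这个最小示意来理解。这只是一个草图,并不是 lennybot.com 或 OpenAI 的实际实现:其中的三维向量是手工编造的占位数据(真实场景中向量应由嵌入模型生成),`CORPUS`、`cosine_similarity` 和 `top_matches` 都是为说明而假设的名称,相似度度量这里假设采用余弦相似度。

```python
import math

# 假设的占位数据:真实系统里,这些向量由嵌入模型对每段文字计算得到。
CORPUS = [
    ("招聘时看重高能动性和紧迫感", [0.9, 0.1, 0.0]),
    ("提示词工程的关键是提供上下文", [0.1, 0.9, 0.1]),
    ("嵌入模型的价格下降", [0.0, 0.2, 0.9]),
]


def cosine_similarity(a, b):
    """计算两个等长向量的余弦相似度。"""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_matches(query_vector, corpus, k=1):
    """返回与问题向量最相似的 k 段文字,按相似度从高到低排列。"""
    scored = [(cosine_similarity(query_vector, vec), text) for text, vec in corpus]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


# 假设这是用户问题“怎样写出更好的提示词”对应的嵌入向量(同样是占位数据)。
query_vector = [0.2, 0.8, 0.0]
print(top_matches(query_vector, CORPUS, k=1))
```

检索到的文字随后会连同问题一起交给语言模型,让回答能够引用知识库里真实存在的内容;这一步在示意中被省略。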
产品经理和创始人如何利用 AI
Lenny: 真划算。在进入非常令人兴奋的快问快答环节之前,最后一个问题。假设你是一家大公司的产品经理,或者你是一个创始人,你觉得对他们来说,利用你们正在构建的技术——GPT-4 以及其他所有 API——最大的机会是什么?人们应该怎么思考”我们到底该如何在现有产品中真正利用这种能力”这个问题?或者新产品也可以,你想往哪个方向聊都可以。
Logan Kilpatrick: 我觉得回到”新体验”这个主题,真的很让人兴奋。我觉得消费者会……如果你提供的 AI 体验不仅仅是一个聊天机器人可以获取的那种,你就会比别人有优势。大家大量使用聊天界面,它确实是一个很有价值的服务领域,用户量说明了一切。但我认为那些超越聊天界面的产品,真的会有巨大的优势。另外,也要思考如何把你的使用场景提升到下一个层次。我试过大量非常基础的聊天应用,提供给我的价值很有限,但我会想,“这其实应该走得更远,真正从零开始构建你的核心体验。”
我用过一款产品,它可以让你管理和查看网上围绕某些话题的讨论之类的。所以我可以在网上去看,大家对 GPT-4 都在说什么?而我刚才说出口的那句话——“大家对 GPT-4 都在说什么”——就是我真正想问的问题。在今天典型的产品体验中,我得进入一堆仪表盘,调整一堆筛选器之类的东西。而我真正想要的只是直接问我的问题:大家在做什么?大家对 GPT-4 在说什么?然后以一个有数据支撑的方式得到答案。
我也看到有人解决了这个问题的一部分,比如他们会展示,“哦,这里有几个人在说什么什么”,但这其实不是我想要的。我想要的是对整体情况的总结。我觉得实现这一点只需要再多一点工程上的努力。但我认为那种才是真正的魔法解锁感——“哇,这是一个不可思议的产品,我会一直用下去”,而不是”嗯,这个有点用,但我真的想要更多”。
Lenny: 太棒了。我来推荐一款产品。我不是投资人,但我认识创始人,叫 visualelectric.com,我觉得它做的正是你说的这件事。它基本上是一个专门为创意人士打造的工具,我觉得主要是平面设计方向,帮助他们创作图像。市面上有 DALL-E 之类的东西,但这个把它带到了一个全新的高度:它是一种画布,一个无限画布,你可以在上面生成图像、编辑、微调,不断迭代直到得到你需要的东西。Visualelectric.com。
Logan Kilpatrick: 我要去试试。它跟 Canva 类似吗?
Lenny: 我觉得它更细分,面向更专业的平面设计,大概是这样的使用场景。但我不是设计师,所以不是目标用户。不过我妻子是平面设计师。她以前从没用过 AI 工具。我把这个给她看,她就上瘾了。她都没告诉我一声就直接付费了,然后用它创作了我们家狗的图像和各种艺术作品。现在那些作品出现在我们电视上了。她创作的那些艺术作品现在就放在……我们有一个画框电视,那个就是电视上显示的画面。所以……
Logan Kilpatrick: 我太喜欢这个了。那个叫什么来着?
Lenny: Visualelectric.com。总之,在进入非常令人兴奋的快问快答之前,你还有什么想补充或分享的吗?
给 AI 开发者的建议
Logan Kilpatrick: 我在网上和其他场合说过几次这句话,但对于那些有酷想法、想用 AI 来实现的人来说,现在就是时候。有太多酷的事情需要用 AI 来为这个世界构建。还是那句话,如果我或 OpenAI 团队的其他同事能帮你迈出开始构建很酷的东西的第一步,请随时联系。这个世界需要更多用这些工具做出的优秀解决方案,也很期待看到大家在构建的厉害作品。
Lenny: 我本来打算最后再问你这个问题的,但大家怎么联系你呢?最好的方式是什么?
Logan Kilpatrick: Twitter、LinkedIn。我的邮箱应该在网上能找到。我不想在这里说出来然后被邮件淹没。如果你需要的话,应该能在网上找到我的邮箱。不过 Twitter 和 LinkedIn 通常是最方便的。
Lenny: 在 Twitter 上怎么找到你?
Logan Kilpatrick: 就是 Logan Kilpatrick,或者我的显示名好像是 Logan.GPT 或者——
Lenny: Logan.GPT?
Logan Kilpatrick: 或者 official Logan K。
Lenny: 好的,太棒了。我们会在节目简介里放上链接。太好了。Logan,说到这里,我们进入非常令人兴奋的快问快答环节。准备好了吗?
Logan Kilpatrick: 准备好了。
快问快答
Lenny: 第一个问题,你向别人推荐最多的两三本书是什么?
Logan Kilpatrick: 第一本是很久以前读过、最近又重读的一本书,是 Sal Khan 写的《The One World Schoolhouse》。非常精彩。对,我不想……这是快问快答所以我不多说,但故事非常精彩,而且 AI 正是能让 Sal Khan 的“每个学生一位老师”愿景真正实现的技术。所以我对这个特别兴奋。另一本我一直反复提到的是《我们为什么需要睡眠》(Why We Sleep)。睡眠科学太酷了。如果你不关注自己的睡眠,这可能是你能为自己做的最大的提升之一。
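Lenny: What's a favorite recent movie or TV show?
Logan Kilpatrick: I'm a sucker for inspirational true stories. Over the recent holidays I watched the Gran Turismo movie with my family. It's about a kid from London who grew up doing sim racing, entered a contest, and ultimately, through that competition, became a real professional race car driver. Watching someone go from driving virtual race cars to driving real ones, racing in the 24 Hours of Le Mans and all that, is really cool.
Lenny: I used to play that game too. It was a lot of fun, but I never felt like I could drive a real race car. So that story really is inspiring. Do you have a favorite interview question you like to ask candidates?
Logan Kilpatrick: Yes. I'm always curious what people believe strongly that other people disagree with.
Lenny: What do you look for in the answer that makes you think, "Wow, that's a really good signal"?
Logan Kilpatrick: A lot of the time the question is just interesting in itself, but it also... I think it's interesting to see what someone's deeply held convictions are. It's not about judging whether I agree with them; I'm simply curious why people think that way.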
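Recently Discovered Products
Lenny: Is there a product you've recently discovered that you really love?
Logan Kilpatrick: Sticking with the sleep theme, I have a really good sleep mask from a company called... this isn't an ad, I'm just mentioning it... Manta Sleep or something like that. It's a weighted sleep mask and it feels great to wear. I don't know, maybe I just have a heavy head or something, but wearing a weighted mask at night is really comfortable. I really like it.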
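Lenny: I have a competing sleep mask that I also highly recommend. Let me look it up. I've already recommended it to a bunch of people in my newsletter's gift guide.
Logan Kilpatrick: Oh yeah?
Lenny: My favorite one is called the Waoaw Sleep Mask, W-A-O-A-
Logan Kilpatrick: What do you like about it?
Lenny: W-A-O-A-W. I'll link to it in the show notes. It has a lot of room. It's big overall and leaves plenty of space for your eyes, so your eyelashes and your eyes don't get squished, and it fits really snugly on your head. My wife and I both wear masks at night. Speaking of sleep, it really has helped us sleep better. It isn't weighted, so maybe yours is worth a try, but everyone I've recommended it to has told me, "This changed my life, thank you for helping me sleep better." So we'll link to it in the show notes.
Logan Kilpatrick: Look at this.
Lenny: Look at us. What a grown-up topic. Two more questions. Do you have a life motto that you come back to often and share with friends or family, whether in work or in life?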
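Life Motto
Logan Kilpatrick: I do. I have a sticky note... it's right behind my camera, and it says "Measure in hundreds." I love this idea of measuring things in hundreds. It's mostly for people who are just starting out on some journey. I talk to people all the time who say, "Well, I tried this thing, but it didn't work." If your mindset is to measure in hundreds, then failing five times counts as zero: you haven't really tried yet. I love this idea. It's a good reminder that everything in life is built on compounding and repeated attempts. If you don't try enough times, you'll never succeed.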
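Lenny: I love that. I can see why you're so successful at OpenAI and why you're such a good fit there. Last question. I had ChatGPT come up with some very silly questions for me: "Give me some silly questions to ask Logan Kilpatrick, head of developer relations at OpenAI." I went through them and picked out three, but I'm only going to ask one. If an AI started doing stand-up comedy, what do you think its go-to joke or funny observation about humans would be?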
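AI Stand-Up Comedy
Logan Kilpatrick: I think today, if you actually did this, the most common joke would be something like "So an AI walks into a bar," probably because, at the end of the day, it's trained on some distribution of training data and that happens to be the most common type of joke. That's probably... I wonder whether, if you made up a joke right now, it would show up in some training example.
Lenny: I love it. But what's the joke? We need the joke. We need the punchline. I'm kidding, I know you can't come up with a great one on the spot...
Logan Kilpatrick: That's what we have ChatGPT for.
Lenny: We're already obsolete. Awesome. Logan, thank you so much for being here. Two final questions, even though you've already shared this information, just to recap for everyone. Where can people find you if they want to reach out and ask more questions? And how can listeners be useful to you?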
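Contact and Feedback
Logan Kilpatrick: Twitter and LinkedIn, Logan Kilpatrick, or Logan.GPT on Twitter. Please message me. I get a ton of DMs and they're always really interesting. As for how people can help: if you find bugs in ChatGPT or things that don't work well, I often see people say, "This feature doesn't work well." The key thing, and I think we at OpenAI need to do a better job of communicating this to users, is that a shared conversation link or a concrete, reproducible example are the two things we need to actually fix a user's problem. Model laziness is a good example. It was hard to figure out what was going on, because people would say "the model got lazy," but it was hard to find out what prompt they were using, what the example was, all of that. So when something doesn't work well, please send us those examples and we'll make it better for you.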
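Lenny: Great. I also want to remind people that if you're listening to this and thinking, "Oh, okay, there are a lot of cool ideas around OpenAI and ChatGPT," all you need to do is go to chat.openai.com and try this stuff yourself. A lot of people only talk about it in theory, but I think once you actually get hands-on, you see these things differently. At this point I do something on it every day, like asking it for interview question ideas or doing research for a newsletter post. It's a tab I always come back to. I know a lot of people are just talking about this kind of thing, and I just want to remind everyone: just go. Sign up. Play around. Ask it questions about whatever you're working on, see how it does, and keep coming back. Is there anything else you'd share along these lines to inspire people to give it a try?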
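Logan Kilpatrick: Well said. There's this idea that people worry humans will be replaced by AI, and I see that narrative online too. But it's not that AI is going to replace humans. It's that people who are augmented by AI, who are actually using AI tools, will be more competitive in the job market and elsewhere. So go try these AI tools. Now is the best time to learn. You'll become more productive and more capable in your work and in the things you love. So I'm excited to see what people make with ChatGPT.
Lenny: And you can expense the account. It's something like ten or twenty bucks a month. A lot of companies will pay for it, so ask your boss if you can expense it, and make sure you're using the latest version. All right, Logan, thank you again so much for being here.
Logan Kilpatrick: This was awesome, Lenny. Thanks for having me. The questions were really thoughtful. Hopefully they weren't all generated by ChatGPT.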
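Lenny: No, only the last one. I did have a bunch of backup questions in my belt... no, in my back pocket, it should be back pocket, that's the right metaphor. But what we talked about was already great, so I never got to them. So no, it was all me. Human AI.
Logan Kilpatrick: Thank you.
Lenny: Thank you, Logan.
Logan Kilpatrick: Lenny.ai.
Lenny: I love it. You can check out Lennybot.com. All right, thank you, Logan, and bye everyone. Thank you so much for listening. If you found this valuable, you can subscribe to the show on Apple Podcasts, Spotify, or your favorite podcast app. Also, please consider giving us a rating or leaving a review, as that really helps other listeners find the podcast. You can find all past episodes or learn more about the show at lennyspodcast.com. See you in the next episode.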
Glossary
| Original | Chinese |
|---|---|
| agent | 代理 |
| AGI | AGI(通用人工智能,保留原文) |
| Assistants API | Assistants API(保留原文) |
| copilot/Copilot | Copilot(保留原文) |
| corpus | 语料库 |
| deal flow | 项目交易流 |
| Dev Day | Dev Day(开发者大会,保留原文) |
| Developer Relations | 开发者关系 |
| diligence | 尽职调查 |
| empirical study | 实证研究 |
| fine-tuning | 微调 |
| GPT Store | GPT Store(保留原文) |
| GPTs | GPTs(保留原文) |
| GPU | GPU(保留原文) |
| Harvey | Harvey(法律 AI 公司,保留原文) |
| high agency | 高能动性 |
| OKR | OKR(目标与关键成果,保留原文) |
| prompt engineering | 提示词工程 |
| Rabbit R1 | Rabbit R1(保留原文) |
| TLDraw | TLDraw(保留原文) |
| Tom Cruise | 汤姆·克鲁斯 |
| UX | UX(用户体验,保留原文) |