Becoming evidence-guided | Itamar Gilad (Gmail, YouTube, Microsoft)

Itamar Gilad 2023-09-21

Validating Gmail Tabs via Wizard of Oz

Itamar Gilad: You fake it. You do a fake door test, you do a smoke test, Wizard of Oz tests. We used a lot of those in the tabbed inbox, by the way. In one of the first early versions, we actually showed the tabbed inbox working to people. But it wasn’t really Gmail. It was just a facade of HTML, and behind the scenes, according to the permissions that the users gave us, some of us moved just the subject and the sender into the right place. So initially the interviewer kind of distracted them and then showed them their inbox, and the top 50 messages were sorted to the right place, more or less, if we got it right. And people were like, “Wow, this is actually very cool.” That gave us some evidence to go and say, “Hey, we should try and build this thing.”

Lenny: I actually used Ezra earlier this year, unrelated to this podcast and completely on my own dime, because my wife did one and loved it and I was super curious to see if there’s anything that I should be paying attention to in my body as I get older. The way it works is you book an appointment, you come in, and you put on some very cool silky pajamas that they give you and that you get to keep afterwards. You go into an MRI machine for 30 to 45 minutes, and then about a week later you get a detailed report sharing what they found in your body. Luckily, I had what they called an unremarkable screening, which means they didn’t find anything cancerous. But they did find some issues in my back, probably because I spend so much time sitting in front of a computer, which I’m getting checked out at a physical next month. Half of all men will have cancer at some point in their lives, as will one third of women. Half of all of them will detect it late.

Vanta helps companies obtain the reports they need to accelerate growth, build efficient compliance processes, mitigate risks to their businesses and build trust with external stakeholders. Over 5,000 fast-growing companies use Vanta to automate up to 90% of the work involved with SOC 2 and these other frameworks. For a limited time, Lenny’s Podcast listeners get $1,000 off Vanta. Go to vanta.com/lenny, that’s V-A-N-T-A.com/lenny, to learn more and to claim your discount. Get started today.

Itamar, thank you so much for being here. Welcome to the podcast.

Lessons from Google+: Opinion-Based Development

Itamar Gilad: It’s a pleasure being here, thank you for inviting me.

Lenny: It’s my pleasure. I thought we’d start with the story of your work on Google+ and Gmail and how those experiences formed your perspective on how to build a successful product. Can you share that story?

Google’s DNA: From Evidence to Execution

Itamar Gilad: Google+ was my first experience at Gmail. I joined Gmail in August 2011, and the first thing they asked me was, “Let’s connect Gmail with Google+.” If you’re hazy about the story, back then Facebook was massive. It’s still massive, but then it was growing like mushrooms; people were spending hours on it. That really freaked out Google, and the obvious solution was to launch a social network of Google’s own, called Google+. We all believed in this thing. It caught on very well initially: we all used it, we all believed in it. So our mission was to build this thing, and Google really spared no cost. It created a whole new division within Google and a whole strategy around Google+, and we had to connect Gmail and YouTube and Search to Google+ to make them more personalized, in a sense, and more social. So that was the idea, and we went on and launched a series of features in Gmail for a couple of years, honestly. Google+ itself became this massive project, very feature rich, with a lot of redesigns and iterations, and none of it worked.

It turned out people actually didn’t need another social network. People didn’t love it, people didn’t use it. Eventually in Gmail we rolled back all the Google+ integration a few years later, and Google+ itself was shut down in 2019. So, putting aside all the tremendous waste that went into this, all the millions of person-hours and person-weeks, in hindsight not only did Google bet on the wrong thing, it missed much easier opportunities. Not far from Google’s headquarters there was WhatsApp, not very famous in the US, but they actually created massive impact. Hundreds of millions of people were using their stuff, and they became a threat to Facebook much more than Google was. So Google missed the opportunity of social mobile apps like WhatsApp, like Snapchat, etc. For me this story was kind of the epitome of what I call today opinion-based development. We come up with an idea, we believe in it, all the indications show it’s good.

Maybe the early tests show it’s good, and then we just go all in and try to implement it. I made this very mistake many times as the product manager; I was the guy pushing for the ideas. So for me this was kind of a turning point. I felt we needed to adopt a different system.

The Birth of Gmail Tabs

Lenny: And just before you move on to the next story, how big was the team? Roughly how many years was spent on this area? Just to give people a sense of the waste as you said.

The Evidence-Guided Company

Itamar Gilad: So there was a tremendous earthquake inside Google to create the Google+ team, teams and the entire divisions were kind of thrown apart and reformatted and I think at its peak it was about 1000 people inside-

When Are Top-Down Decisions Justified?

Lenny: Wow, [inaudible 00:07:56].

Itamar Gilad: It was a division the size of Android and Docs and a really sizable thing, they’re under their own buildings. It’s taken from the playbook of Steve Jobs, create this whole secretive project inside and just run like hell.

Presenting Data to Skeptical Leadership

Lenny: Yeah. I remember, though, Facebook was really scared. I remember they shut everything down. It was like a code DEFCON one situation, so it really scared Facebook at the same time.

Itamar Gilad: Yeah, it’s true. But at the end of the day, neither Google’s advertising revenue nor Facebook was affected. So it turned out this idea was not that necessary after all.

Are You Truly Evidence-Guided?

Lenny: Yeah, okay. So that’s an example of something that didn’t work because it was opinion based software, I think the phrase you used and then there’s a different experience with tabs I think with Gmail.

Itamar Gilad: That’s right. Google is a very successful company. It’s not for me to criticize it or, in hindsight, to say you guys need to be better. Some of the people who were behind Google+ were some of the smartest leaders, and I still think they are, despite this story. If you look back at the history of Google, at how things started in the first decade or so, Google was what I call an evidence-guided company. Essentially it put a high premium on focusing on customers, on coming up with a lot of ideas, on looking at the data and at how these ideas actually worked out. They weren’t shy about launching betas and things that were very rough and incomplete and learning from that, and then they expected people to take action based on the results. Fail fast is a very famous paradigm, so you had to kill your project or pivot it seriously if it didn’t work out. I think it would really have helped Google+ if we had kept this fail-fast mentality.

But for some reason, with Google+, Google put this playbook aside and used a different playbook, which I call plan and execute. I think inside Google the DNA still existed, though. Inside Gmail, the next project after Google+ was the tabbed inbox, and it was kind of the reverse of Google+. It started as a very small idea that no one believed in, and we started looking at what’s behind the idea. What’s the goal? What’s the problem we’re actually trying to solve? It turned out that a lot of people were receiving social notifications, promotions, etc., and most of them were very passive. They weren’t clearing their inbox, they were just living in this world of clutter, and I came up with an idea for how to fix this. I was sure it was great. I wanted to push it, plan and execute, but my colleagues were like, “Hold on, we actually tried this. We have a bunch of ideas to help people organize their inbox, and they’re not using them. Why is your idea good?”

So that sent me and my team into researching these users, into establishing a goal that was much more user-centric, and then into thinking of other ideas. Then we started testing them much more rigorously. Basically we started testing on our own inboxes, then we recruited other dogfooders, other Googlers, to test the same inbox, then we put it outside for external testers. We did usability studies, we did data, we built a whole data mining team and a whole machine learning team to build the right categorization, and we ended up with a solution that turned out to be very successful for a lot of these passive users. This was a surprise to a lot of people, because most of my colleagues and most of the people I talk with actually know how to manage their inbox. So for them that solution is complete nonsense; splitting promotions and social to the side sounds like the stupidest idea. But there’s about 85% of the population, 85 to 88%, that absolutely love it, and today Gmail has about 1.8 billion active users according to Gmail.

Most of these users are using this feature, so it was a pretty high impact feature as well.

Overview of the GIST Model

Lenny: And the feature specifically, just in case people aren’t totally getting it is the promotions folder and the social I think and then the regular.

The Four Levels of GIST

Itamar Gilad: Yeah, there are a couple more that you can enable in settings if you like.

Where Strategy and Vision Fit

Lenny: Yeah, I use it, I love it. Except it puts my newsletter in people’s promotions folder, who do I talk to about that?

Itamar Gilad: Yeah. Newsletters are a very complicated scenario for the categorization engine.

What Questions Does GIST Answer?

Lenny: Yeah. We just need an exception for my newsletter and then we’re good. Okay, but go on.

Itamar Gilad: So in hindsight I was asking, “Why was this project so different?” I think the reason is that we didn’t have that much confidence in our opinions. We had opinions, we had ideas, but we didn’t just go all in and say, “Let’s build it.” We actually used an evidence-guided system, and I think that’s not unique to Google. I think every successful product company out there that you look at, Amazon, Airbnb, anyone you check, at least in their best periods found a way to balance human judgment with evidence. They didn’t try to obliterate human judgment and opinion, just to supercharge them with evidence, and they came up with very different models. Apple is another example, but the principle still holds in all of these companies.

Goals and Value Exchange Loops

Lenny: Awesome. So you took that experience and all the experience you’ve had from coaching product leaders working with companies and you wrote this book called Evidence-Guided, which people on YouTube could see sitting there behind you. So I want to talk through some of these stories and then some of these other lessons and frameworks that emerged. But maybe just to start, what’s the elevator pitch for this book?

Itamar Gilad: So this is a book for people like us, product people who want to bring evidence guided thinking or modern product management if you like into their organizations. There’s a lot of challenges, it’s not simple, we all read the books, we all know the theory, we all know some parts of the system. It tries to give you a system how to do that, it’s a meta framework that kind of helps you lift your organization in the direction of evidence guidance if that’s what you want to do.

The Metric Tree

Lenny: So going back to the story briefly, before we get into the frameworks and lessons of the book. In the first example, Google+, it basically came top down: “Hey, we need to build a social network, go build it.” Obviously that happens at a lot of companies, and I don’t know if there’s an easy answer to this. But are there cases where it does make sense to approach it that way? Obviously Apple is a classic example: Steve Jobs is like, “We need to build an iPhone.” I don’t know if that’s exactly how it went. But are there instances where it is worth approaching new product ideas that way, based on the experience and creativity and insights of the founder? Or is your thinking that it should always come from this evidence-based approach?

Ideas Layer and ICE Scoring

Itamar Gilad: I think the founders are very important, especially in the startup and scale-up phases. They come up with many of the most important ideas, and it’s super important that they have the space to express them and to push the organization to look at them. So it’s not about shutting them down, it’s about looking at their ideas critically. You need to create an environment in the organization where the leader can come and say, “You know what? I talked to these three customers, I figured it out. Here’s what we need to do in the next five years,” and you can ask, “Where’s your evidence?” And by the way, the example you gave is a classic: Steve Jobs just brainstormed the iPhone in his kitchen and then told the team to build it. That’s the story Steve Jobs told, but it’s not the real story at all. Now we know what actually happened, and the iPhone is actually a story of discovery, of trial and error, of multiple projects to do it, multitouch with phones, most of which failed.

Steve Jobs was the architect. He managed to connect the dots and eventually come up with this perfect device, but he wasn’t actually the creator; it wasn’t his brainchild. He was actually against it for a while, but over time, as he saw the evidence, as he saw what this thing could do, as he saw the demos, he was able to piece together something that was very useful.

Lenny: That’s a really important insight. People who are hearing this might feel like, “I like this idea of pushing back and encouraging the founders to be more evidence guided.” In the case of, say, Google+, was it even possible? Could you have come to Larry and Sergey and been like, “Here’s all this data I’ve gathered that tells us this is not going to work”? Do you have any advice for how to push back and encourage founders and execs to really take the counterpoint seriously, or really vet their idea?

Delivery Speed vs. Discovery Speed

Itamar Gilad: Another nice thing about Google is that it’s a very open culture, and people are not shy to tell even Sergey and Larry that they are wrong. They do this all the time. In certain forums, right? You need to know the right channels. But there was a very big discussion about Google+ and whether it was the right thing to create a clone of Facebook; there was a very public internal discussion. What I would change is not to have this discussion based on opinions, because when everyone comes to the discussion with their own opinions, usually the most senior person’s opinion wins. That’s just the way it is. If we had come with hard data and said, “Listen, things are not actually panning out the way you guys are expecting. What can we do? Should we continue? Should we pivot this?” I think the discussion would’ve gone better. Now, I may be doing a huge disservice here; I was not in all the discussions. Probably inside Google+ there were very serious discussions happening along these lines.

But as a general trend, I find that evidence is very empowering: it lets us smaller people in the organization, or mid-level managers, challenge the opinions.

Adapting for Different Company Stages

Lenny: Is there anything tactically you found to be useful and effective in giving people, say they don’t work at Google. They work at companies where founders and bosses and execs are not as open to challenge. [inaudible 00:18:26] any tactically found about how to present a counter proposal or like, “Hey, I have this data that we should really pay attention to?”

Which Two Companies Benefit Most?

Itamar Gilad: I think if you come with data, if you run a secret experiment and you come back and show them, you usually get one of two results. Either they get extremely mad at you and tell you to get back to work and do what you were told, and in that case you probably need to start polishing your resume and look for another place, either inside the organization or outside it, because that person is not being reasonable, to be honest. But the more common case is that they’re pleasantly surprised, and that’s what happened with Steve Jobs as well. He was against phones, but then people showed him all sorts of evidence that Apple could make a phone. He was against multitouch initially, but then he changed his mind; there was a lot of back and forth. So even Steve Jobs, given evidence, was willing to flip, and I see this in many organizations. Evidence is so powerful; that’s why this is the principle I based the book on.

Where to Begin

Lenny: You have this concept of being evidence guided. People listening may feel like, “Hey, we’re evidence guided, we run experiments, we make decisions using data.” Oftentimes they actually aren’t. So what are signs that maybe you’re not actually evidence guided, or not as evidence guided as you think you are?

Steps Layer: Build While Learning

Itamar Gilad: I think there are a few telltale signs that I look for. First, the goals are very unclear. Either there are many, or they’re obscure and vague, or they are about output, or there’s misalignment. So the goals part is not there. Usually this goes hand in hand with metrics: missing metrics, or just using revenue and business metrics with no user-facing metrics. So that’s another telltale sign. Then there’s a lot of time and effort spent on planning, especially on roadmapping. Creating the perfect roadmap can really consume a lot of the time of top management, PMs, etc. Then as you go down, you see there’s not a lot of experimentation, and if there is experimentation there’s not a lot of learning. And finally, another telltale sign is that the team is disengaged. The engineers are getting the signal that what they need to do is deliver; they’re focused on output, because that’s what they’re measured on. So they’re disengaged from the users and from the business, and they don’t care that much.

It’s usually something that you can fix by adopting a more evidence guided system.

Lenny: Okay. So let’s dive into your approach to becoming more evidence guided. In the book, you share this model that you call the GIST model which is kind of this overarching approach to building a product that almost forces you to be more evidence guided. So let’s just start with what’s the simplest way to understand this GIST model?

Low-Cost Validation Methods

Itamar Gilad: With your permission, I can show a few slides.

Lenny: Oh, let’s do it.

From Tests to Experiments to Launch

Itamar Gilad: And maybe that will help.

Lenny: Here we go, and then yeah, a good excuse to go check this out on YouTube.

The Gap Between Two Worlds

Itamar Gilad: All right, you’re seeing this? So this is the GIST model: goals, ideas, steps and tasks. Essentially it tries to break the change, which is a really big change for a lot of companies, into four slightly more manageable parts. They’re still big, but each one you can tackle on its own, and that’s the reason I split it this way. Goals are about defining what we’re trying to achieve. Ideas are hypothetical ways to achieve the goals. Steps are ways to implement the idea and validate it at the same time, so essentially build-measure-learn loops. And tasks are the things we manage in Kanban and Jira and all these good tools. These are the things that your development team is usually very focused on. Just listening to this, a lot of it will sound familiar to you, because GIST is not a brand new invention. It’s a meta-framework that puts in place a lot of existing methodologies. It’s based on lean startup, on design thinking, product discovery, growth. There’s a lot of all of these things here. It just tries to put them all into one framework or one model.
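To make the four layers concrete, here is a minimal sketch of how goals, ideas, steps and tasks might nest, so that every task can be traced back to the goal it serves. The class names, field names and the `trace_tasks` helper are assumptions made for illustration, not something taken from the book, and the sample data is only loosely based on the tabbed-inbox story told earlier.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    description: str  # a unit of work tracked in a tool such as Jira


@dataclass
class Step:
    description: str  # one build-measure-learn loop
    tasks: list[Task] = field(default_factory=list)


@dataclass
class Idea:
    hypothesis: str  # a hypothetical way to achieve the goal
    steps: list[Step] = field(default_factory=list)


@dataclass
class Goal:
    outcome: str  # what we are trying to achieve
    ideas: list[Idea] = field(default_factory=list)


def trace_tasks(goal: Goal) -> list[tuple[str, str, str, str]]:
    """Return (goal, idea, step, task) for every task under the goal."""
    return [
        (goal.outcome, idea.hypothesis, step.description, task.description)
        for idea in goal.ideas
        for step in idea.steps
        for task in step.tasks
    ]


# Illustrative data only; the wording of each item is invented.
goal = Goal(
    outcome="Passive users can find important mail despite clutter",
    ideas=[
        Idea(
            hypothesis="Sort mail into tabs automatically",
            steps=[
                Step(
                    description="Show study participants an HTML facade of a sorted inbox",
                    tasks=[
                        Task("Build static HTML mock"),
                        Task("Hand-sort top 50 messages"),
                    ],
                )
            ],
        )
    ],
)

for row in trace_tasks(goal):
    print(" -> ".join(row))
```

The point of the nesting is that a task with no step, idea and goal above it has no stated reason to exist.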

GIST Kanban: Connecting Planning and Execution

Lenny: So what’s the simplest way to think about what this model is meant for? Is this how you think about your roadmap? Is this how you plan? What is this trying to tell people to do differently in the way they build product broadly?

Itamar Gilad: I would say these are four areas that you need to look at and ask, are we doing the right thing in each? In each you may need to change or even transform. As I go and explain each one of those, I’ll give you basically three things, because in each chapter in the book I try to touch on three things: the principles behind the layer, the frameworks or models that implement the principles, and then process. The process, honestly, is the most brittle part and the one that you will need to change and adapt to your company, because no two companies are exactly the same. It’s very tempting when you write a book not to give any process, but that’s the part that people actually want the most. So it’s included as well, but just be aware that you will have to change this process.

Learning Milestones Over Engineering Milestones

Lenny: Awesome. Okay, so we’re going to talk about each of these four layers. Before we do that, where do vision and strategy fit into this? Do they bucket into one of these four layers and how do you think about strategy and vision?

Itamar Gilad: That’s a great question. There’s this whole strategic context that is outside of GIST. GIST is not trying to tackle that; it assumes it’s in place. There’s another huge blob, which is research. GIST is not about research; it’s more about discovery and delivery. But strategy is extremely important, and you can use some of the tools we will talk about to develop your strategy as well. In many companies the strategy is just a roadmap on steroids; it’s more plan and execute, just on a grand scale. Google+, again, was actually a strategic choice if you think about it. So in the book there is a chapter where I touch on strategy and explain how the same evidence-guided methods are being used by companies to develop their strategy as well.

GIST Kanban and the Roadmap

Lenny: Awesome, maybe one last context question. People might be seeing this and thinking, “Okay, cool, I have goals, I have ideas, I have steps, I have tasks, I’m already doing this.” What is this a counter or reaction to? What are people probably missing, so that when they see this they’re like, “Oh, I see. This is what we’re not doing, this is the most important, this is something we should probably change”? And we’ll go through these in detail too.

Itamar Gilad: I think talking about each one will help.

How to Put It Into Practice

Lenny: Okay, let’s do it.

Itamar Gilad: But we can talk about in each level what’s actually being done. So when people say I have goals, usually they take the goals layer and use it as a planning session. They talk about what shall we build by when, what are the resources? And that’s actually not goals at all, that’s planning work.

Lenny: Cool, let’s talk about goals and I know part of this is OKRs related too, so I’m excited to hear your take on OKRs.

Itamar Gilad: Oh, that’s a whole different discussion. You had Christina on, the real expert over there, so I doubt I can add more to that. But it’s true, OKRs are all part of it. Let’s start with goals. What are goals supposed to be? Goals are supposed to paint the end state, to define where we want to end up, and the evidence will not guide you unless you know where you want to go. In many companies what you have is goals at the top for revenue, market share, whatever it is, and then a bunch of siloed goals for each department. There are engineering goals, design goals, marketing goals, etc., and that actually pushes people along different vectors, and it’s really hard to decide. I would argue that evidence-guided companies, and you’ve worked for a few so you’ve probably seen this, use models in order to construct overarching goals for the entire organization. One of the models I show in the chapter about goals is the value exchange loop.

Where basically the organization is trying to deliver as much value as it can to the market and to capture as much value back, and by creating a feedback loop between these two you are actually able to grow very fast. Now, I would argue that you want to measure both of these and to put a metric on each and the metric we usually use to measure value delivered is called the North Star metric. I know you wrote an article, a very good article about it.

Lenny: Thank you.

Itamar Gilad: And in it you listed dozens and dozens of leading companies and what they considered their North Star metric, which is super interesting. I would argue that what they told you is the most important metric they measure, the number one metric for them. But that’s not what I call the North Star metric. The North Star metric measures how much value we create for the market. For example, let’s take WhatsApp. WhatsApp for a very long time measured messages sent, because every message sent is a little increment of value for the sender and the receiver. It’s free, it’s rich media, you can send it anywhere in the world; compared to SMS, that’s huge value. So if in year one we have a billion messages being sent and in year two, two billion, we probably doubled the amount of value. At Airbnb, I think one of your key metrics, or the real North Star metric, was nights booked. I don’t know if that was still the case while you were there?

Lenny: Yeah, absolutely.

Itamar Gilad: And there are other examples like this. Amplitude, for example, measures active learning users, or weekly active learning users, which are users who found some insight in the tool that was so important that they shared it with at least two other users, who consumed it. So it’s a very powerful thing to point at this metric and say, “This is the most important metric, combined with the value metric that we want to capture: revenue, market share, whatever it is.” Once you have these two, you can further break them down into what I call metrics trees. So there’s a metrics tree for the North Star metric, and there’s a metrics tree for the top KPI, the top business metric, which you see here on the left side in blue, and usually they overlap. So you might find in the middle some metrics that are super, super important, because moving them actually moves the needle on everything else.
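As a rough illustration of how a value-based metric like “weekly active learning users” could be computed, here is a sketch that counts users whose shared insight was consumed by at least two other users. The event schema (`ViewEvent`), the function name, and the assumption that the events are already filtered to a single week are all invented for this example; this is not Amplitude’s actual definition or implementation.

```python
from collections import defaultdict
from typing import Iterable, NamedTuple


class ViewEvent(NamedTuple):
    """Hypothetical event: `viewer` opened an insight shared by `sharer`."""
    insight_id: str
    sharer: str
    viewer: str


def active_learning_users(events: Iterable[ViewEvent], min_viewers: int = 2) -> set[str]:
    """Users with at least one shared insight consumed by `min_viewers` other users.

    Assumes `events` contains only the events of the week being measured.
    """
    viewers_by_share: dict[tuple[str, str], set[str]] = defaultdict(set)
    for event in events:
        if event.viewer != event.sharer:  # viewing your own insight does not count
            viewers_by_share[(event.sharer, event.insight_id)].add(event.viewer)
    return {
        sharer
        for (sharer, _insight), viewers in viewers_by_share.items()
        if len(viewers) >= min_viewers
    }


week = [
    ViewEvent("chart-1", sharer="ana", viewer="bo"),
    ViewEvent("chart-1", sharer="ana", viewer="cy"),
    ViewEvent("chart-2", sharer="bo", viewer="ana"),   # only one other viewer
    ViewEvent("chart-3", sharer="cy", viewer="cy"),    # self-view, ignored
    ViewEvent("chart-3", sharer="cy", viewer="ana"),
]
print(sorted(active_learning_users(week)))
```

The design choice worth noting is that the metric counts an outcome for the user (someone else learned from their insight), not raw activity such as logins.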

Lenny: Can you clarify again the difference between what you call this top KPI versus North Star metric?

Itamar Gilad: So the North Star metric is measuring how much value we’re creating for the user, the core value that they’re getting. In this case this is some productivity suite, so this is the number of documents created per month, for example. Because we think that every document created (maybe it’s a small document, I don’t know, AI is in fashion now) is a little incremental value, so that’s the number we’re trying to grow. The top KPI is what we expect to get; it should be revenue or profit.

Lenny: I see, this is the value exchange. I see, one is what users are getting, one is what you’re getting back from them.

Itamar Gilad: Exactly.

Lenny: Basically how the business is benefiting. Awesome. I think this is a really important concept, the metric tree. I think a lot of people think they have something like this in mind where they’re just like, “Cool, here’s our North Star Metric, here’s the levers and things that we can work on to move that.” But I think actually mapping it out the way you have it here where it kind of goes layers and layers deep to all of the different variables that impact this metric. Not only is it a way to think about impact and goals and things like that, but also helps you estimate the impact of the experiment you’re potentially thinking about running. So if you’re going to work on something at the bottom here like activation rate, say you move that 10%. How much is that going to impact this global metric? It’s probably a very small amount.

Itamar Gilad: This is a very important one, and we’ll talk about impact assessment shortly; this helps with it. It also helps with alignment, because the entire organization is trying to move these two metrics. They’re the two sides of our mission, essentially. We have the mission, that’s the top objective of the company, and these are the two topmost key results, if you like. So when you go to another team and say, “Hey, why don’t you work on my project?” they might say, “This idea actually might move the North Star metric more than your idea.” That helps you guys align, and I’ve seen cases where team B put aside their own ideas to jump on the ideas of team A because of this model. It also creates an opportunity to give teams some sub-metrics to own on an ongoing basis, so it creates a little sense of ownership and mission within the tree as well.

Lenny: It also helps you figure out what teams you should have, which teams have the biggest potential to impact the metric.

Itamar Gilad: Another thing that happens in a lot of organizations, the team topology reflects the structure of the software or some hierarchical model where we want to organize the organization in a particular way. But if you start with a metrics tree, you can try to arrange the topology around goals and sometimes you need to readjust. It’s not a constant reorg but from time to time you will realize the goals have changed and we need to reorganize, so the tree helps visualize that as well.

Lenny: I think for people that are listening to this and thinking about this, I think the simplest way to even think about this is basically there’s a math formula that equals your North Star metric or your revenue or whatever you’re trying to do and if you don’t have some ideally really clear sense of what that math formula is you should work on that. Because that will inform so much of how you think about where to invest, what teams to have, where to invest more resources, less resources.

Itamar Gilad: Right.
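One way to picture the “math formula” behind a metric tree is to write the North Star metric as a function of its leaf metrics and then ask how far a change in one leaf moves the top. The sketch below does that for the documents-created example. The formula, the leaf names and every number in it are assumptions made up for illustration; a real tree would come from the product’s own data.

```python
def documents_per_month(m: dict[str, float]) -> float:
    """Hypothetical North Star formula for a productivity suite."""
    new_user_docs = m["signups"] * m["activation_rate"] * m["docs_per_new_user"]
    returning_user_docs = m["returning_users"] * m["docs_per_returning_user"]
    return new_user_docs + returning_user_docs


def relative_lift(metrics: dict[str, float], name: str, change: float) -> float:
    """Relative change in the North Star when one leaf metric moves by `change`."""
    baseline = documents_per_month(metrics)
    adjusted = {**metrics, name: metrics[name] * (1 + change)}
    return documents_per_month(adjusted) / baseline - 1


# Invented baseline values for one month.
baseline = {
    "signups": 10_000,
    "activation_rate": 0.4,
    "docs_per_new_user": 2.0,
    "returning_users": 100_000,
    "docs_per_returning_user": 5.0,
}

lift = relative_lift(baseline, "activation_rate", 0.10)
print(f"10% better activation moves documents created by {lift:.2%}")
```

With these made-up numbers, a 10% improvement in activation rate moves the top metric by well under 1%, because new users account for a small share of all documents. That is the kind of estimate the tree makes possible before an experiment is run.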

Lenny: Imagine a place where you can find all your potential customers and get your message in front of them in a cost-efficient way. If you’re a B2B business, that place exists and it’s called LinkedIn. LinkedIn Ads allows you to build the right relationships, drive results, and reach your customers in a respectful environment. Two of my portfolio companies, Webflow and Census, are LinkedIn success stories. Census had a 10x increase in pipeline with the LinkedIn startup team. For Webflow, after ramping up on LinkedIn in Q4, they had the highest marketing source revenue quarter to date. With LinkedIn Ads, you’ll have direct access to and can build relationships with decision makers, including 950 million members, 180 million senior execs and over 10 million C-level executives. You’ll be able to drive results with targeting and measurement tools built specifically for B2B. In tech, LinkedIn generated two to five times higher return on ad spend than any other social media platform. Audiences on LinkedIn have two times the buying power of the average web audience, and you’ll work with a partner who respects the B2B world you operate in.

Make B2B marketing everything it can be and get $100 credit on your next campaign, just go to linkedin.com/podlenny to claim your credit. That’s linkedin.com/podlenny, terms and conditions apply. Okay. So metrics trees, what comes next?

Itamar Gilad: All right. So next we need to go to the ideas layer, and the ideas layer is there to help us sort through the many ideas we might encounter. They may come, as you said, from the founders, the managers, the stakeholders, from the team, from research, from competitors. We’re flooded with ideas, and what usually happens inside organizations is some sort of battle of opinions, or some sort of politics sometimes, or the highest paid person’s opinion. You had Ronny Kohavi, who invented this term, on your show. What doesn’t happen is a very rational, logical decision that these are the best ideas, because it’s really, really hard to predict, honestly. There is so much uncertainty in the needs of the users, in the changes in the market, in our technology, in our product, in our own organization. It’s almost impossible to say this idea is going to be the best, but we do say this because we have cognitive biases that kind of convince us that this idea is far superior to anything else and it’s definitely the right choice.

In order to avoid this, what we want to do is to evaluate the ideas in a much more objective and consistent and transparent way. In the book I suggest using ICE: impact, confidence and ease. I think I have a slide coming on this. So impact, confidence and ease, which is basically a way to assign three values to each idea. The impact tries to assess how much impact it’ll have on the goals, and that’s why it’s so important that we have very clear goals and not many. Are we measuring the ideas on the North Star metric, on the top business KPI, on a local metric of the team? Whatever it is, let’s be clear about it and then let’s evaluate the ideas against this thing. Ease is basically the opposite of effort: how easy or hard it’s going to be. But both of those are guesstimates, both of those are things we need to estimate. I would argue that just by breaking the question into these two questions we usually have a slightly better discussion than just “my idea is better than yours.”

But then there’s the third element which is confidence, which tries to assess how sure are we or should we be about our first guesstimates about the impact and the ease.
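As a minimal sketch of how the three ICE values might be turned into a ranking in a script or spreadsheet: the conversation names the three values but does not say how to combine them, so multiplying them is an assumption here, and the idea names and numbers below are invented purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Idea:
    name: str
    impact: float      # guesstimated effect on the goal, 0-10
    confidence: float  # how sure we are about the guesstimates, 0-10
    ease: float        # opposite of effort, 0-10 (higher means easier)

    def ice_score(self) -> float:
        # Assumption: combine by multiplication; the talk only names the values.
        return self.impact * self.confidence * self.ease


def rank_ideas(ideas: list[Idea]) -> list[Idea]:
    """Return ideas sorted from highest to lowest ICE score."""
    return sorted(ideas, key=lambda idea: idea.ice_score(), reverse=True)


# Hypothetical ideas with invented numbers.
ideas = [
    Idea("Onboarding wizard", impact=8, confidence=0.1, ease=4),
    Idea("Reorder settings", impact=2, confidence=5, ease=9),
]
ranked = rank_ideas(ideas)
```

With a multiplicative score, a high-impact idea backed only by opinion sinks below a modest idea with real evidence, which is the behavior the confidence value is meant to produce.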

Lenny: It’s interesting you use the word ease, because I think it’s usually effort. You kind of make it positive, is that an intentional tweak you made?

Itamar Gilad: I’m using the definitions of Sean Ellis. Sean invented ICE. You know Sean? I don’t know if you’ve had him yet. But he’s-

Lenny: I haven’t had him on yet.

Itamar Gilad: Yeah. For the people who don’t know him, Sean is amazing. He’s like one of the fathers of the growth movement, he coined the term growth hacking and he popularized the concept of product market fit.

Lenny: Yeah.

Itamar Gilad: He created ICE, he created a bunch of things that we use in product that we don’t even know.

Lenny: Wow, I didn’t know he came up with ICE. Okay, cool. So the original version of ICE is ease instead of effort.

Itamar Gilad: Exactly, yeah.

Lenny: Fun fact.

Itamar Gilad: A lot of your viewers are wondering where’s the R, because there’s another variant of this called RICE, where there’s reach as well. I prefer ICE because I prefer to fold the reach into the I for various reasons, but both are valid, both are equivalent in a sense.

Lenny: I’m in your boat, that’s exactly how I think about it. I think people over complicate this stuff and try to get so many math formulas involved with estimating impact, and I feel like these are just simple heuristics to kind of bubble the best ideas to the top. It doesn’t have to be a perfect estimate of impact and confidence and all those things, so I think the simpler is better and it always ends up being a spreadsheet. People always have these tools to estimate these things but it’s like a spreadsheet, Google Sheets. Great.

Itamar Gilad: So yeah, you’re actually leading me to my next point. So when you come to estimate impact you will realize it’s the hardest part. So sometimes it’s just a gut feeling and it’s a guess, and sometimes it’s based on some spreadsheet or some analysis, a back-of-the-envelope calculation you’ve done, and I think that’s legitimate. Sometimes these things do show you some things you didn’t think of. And sometimes, in the best case, it’s based on tests. You actually tested it, you interviewed 12 customers, you showed them the thing and out of those only one actually liked it. You should reduce your impact based on that usually, or you do other types of tests. We’ll talk about testing in a second. What happens is that people tend to just go with gut instinct and then give themselves a high confidence. They say it’s an eight and I’m pretty convinced, so it’s eight for confidence, and I found this a bit disturbing because it kind of subverts the whole system.

So I wanted to help people realize when they have strong evidence in support of their guesses and when it’s weak evidence, how to calculate confidence in a sense. For that I created a tool called the confidence meter, which you can see here this colorful thing and should I go and explain it?

Lenny: Yeah, let’s do it. And then again, if you’re just listening to this you can check this out on YouTube and you can see the actual slide.

Itamar Gilad: All right, awesome. So basically I constructed it a bit like a thermometer. It goes from very low confidence, which is the blue area at the upper right, all the way to high confidence, which is the red area, and you can see the numbers going from zero to 10. Zero is very low confidence: we don’t know basically anything, we’re just guessing in the dark. And 10 is full confidence: you know for sure this thing is a success, no doubt about it. Across the circle I put various classes of evidence you might find along the way. So for example, starting at the top right, all of these blue areas are about opinions. It could be your own self-confidence in the idea, your self-conviction, you feel it’s a great idea. Guess what? Behind every terrible idea there was someone who thought it was great, so that gives you 0.01 out of 10. Maybe you created a shiny pitch deck or a six-page document that explains in detail why this is a great idea. Slightly harder to do but still very low confidence. Maybe you connected it to some theme, it’s about the blockchain…

Well sorry, the blockchain is out of fashion. What’s hot right now?

Lenny: AI.

Itamar Gilad: Exactly, AI. It’s about AI, that makes it a good idea? Absolutely not. Or the strategy of the company, that’s another thematic support. Thousands and thousands of terrible ideas are being implemented right now as we speak based on these themes. So all these things combined can give you a maximum of 0.1 out of 10 according to the tool, if you follow it. Then we move into slightly harder tests. One is reviewing the idea with your colleagues, your managers, your stakeholders. They don’t know it either, they don’t have a crystal ball, they’re usually not the users, they cannot predict. But they can evaluate it in a slightly more objective way and maybe find flaws in your idea. On the other hand, groups tend to have biases too: politics, groupthink. So groups can actually arrive sometimes at worse decisions than individuals, there’s some research on that. Next are estimates and plans. So you may do some sort of back-of-the-envelope calculation, or your colleagues might go out and try to evaluate the ease a little bit better.

That gives you a little bit more confidence, but still we’re at the level of guesswork at this point. Next we’re moving to data, and data could be anecdotal. So you find a few data points dotted across your data, or you talk to a handful of customers, or maybe one competitor has that same idea. In many companies I meet, if the leading competitor has this feature and we think it’s a good idea, validation is done. Let’s launch it, that’s it. It’s a great idea, we need to do it. It never works, honestly. You should not assume that your competitor actually knows what they’re doing any more than you do. Data could also be what I call market data. That comes from surveys, from assessing a lot of your data, from doing a deep competitive analysis, and there are other methods where you create a larger dataset and you contrast your idea against it. Finally, to gain medium and high confidence you really need to build your idea and test it, and that’s where the red area is.

So there’s various forms of tests, we’ll talk about them if we have time and they give you various levels of confidence.
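The ordering of evidence classes described above can be sketched as a small lookup. This is an illustrative sketch, not the confidence meter itself: the class names paraphrase the conversation, only the two numbers that were actually stated (0.01 and 0.1) are encoded, and the function names are hypothetical.

```python
from enum import IntEnum
from typing import Optional


class Evidence(IntEnum):
    """Evidence classes in the order they were described, weakest first."""
    SELF_CONVICTION = 1
    PITCH_DECK = 2
    THEMATIC_SUPPORT = 3
    OTHERS_OPINION = 4
    ESTIMATES_AND_PLANS = 5
    ANECDOTAL_DATA = 6
    MARKET_DATA = 7
    TEST_RESULTS = 8


# The only two scores stated in the conversation (out of 10).
SELF_CONVICTION_SCORE = 0.01
OPINION_ONLY_CAP = 0.1


def strongest(evidence: set[Evidence]) -> Evidence:
    """Return the strongest class of evidence collected so far."""
    return max(evidence)


def confidence_ceiling(evidence: set[Evidence]) -> Optional[float]:
    """Cap confidence when the idea is backed only by conviction, decks or themes.

    Returns None for stronger evidence, because the conversation gives no
    numbers for those classes and this sketch does not invent them.
    """
    if not evidence:
        return 0.0  # no evidence at all: guessing in the dark
    if strongest(evidence) <= Evidence.THEMATIC_SUPPORT:
        return OPINION_ONLY_CAP
    return None
```

For example, an idea supported by a pitch deck and a hot theme stays capped at 0.1 however many such items are stacked up, while adding market data lifts it out of the capped range.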

Lenny: Awesome, this is a very cool visual. We’ll link to an image of this in the show notes too if people want to check it out. I think what’s awesome about this is you could just use this as a little tool on your team of just like where are we along the spectrum? We think the impact of this is very high. But we’re probably in this blue area of confidence, and so let’s just make sure we understand that, and it’s really clear language to help people understand. I see, if we had this, we’d be a lot more confident.

Itamar Gilad: So you can also tie your investment in the idea to the level of confidence you have found, essentially. So early on you want to do the cheap stuff just to gain more confidence, and then you can go and invest more. If it’s a really cheap idea, you can jump to a high confidence test: you can do an AB experiment, an early adopter program, whatever it is, and then launch it. Some ideas you don’t need to test, sometimes the expert opinion is enough. If you’re just changing the order of the settings, no one sees this or no one will be impacted. The risk is low, you can launch it without testing. So part of the trick is also knowing when to stop, not just trying to force your way all the way up when you don’t have to.

Lenny: That’s a really important point. The other important point here is just a big part of a PM’s job is to say no and to stop stupid shit from happening and this is an awesome tool to help you do that. To be like, okay, here’s this idea you have, just like let’s just be real, how confident are we in this? And, okay, it’s going to take us three months to do this. Maybe we should think about something different, maybe we should work up the confidence meter before we actually commit to this.

Itamar Gilad: Yeah. This is a real world usage that I hear about a lot. Some people use this as an objective way to say no, and gently. Or to say we’ll think about it, but look at these other ideas we have and how their impact and confidence stack up.

Lenny: Classic PM move, just like that was a great idea but what about this better idea? Coming back to something that we talked a bit about at the beginning, say you have a founder who’s actually very smart and experienced. Say even at a startup where you don’t really have the time to build tons of evidence for ideas. Do you have a different perspective on how much time to spend building confidence in ideas versus just like cool, they actually have really good ideas let’s just see what happens?

Itamar Gilad: So there’s always like a trade-off between speed of delivery and speed of discovery, and that actually leads to the next layer of how do we combine the two? Because people tend to think it’s an either/or. Either we are building very fast or we are learning and then we’re building very slow, but I think we’re using the wrong metric. The metric is not how fast can we get the bits into production. When there’s a lot of uncertainty, and we all face uncertainty, and startups especially, it’s not about getting the bits to production, it’s about getting the right bits to production. It’s about creating the outcomes that you need, the impact, and so it’s about time to outcomes, and I would argue that the evidence guided method is far more impactful. It’s far faster, it’s far more resource efficient than the opinion-based method. Because opinion-based methods tend to waste a lot more of your resources, building the wrong things or discovering, learning too late, while evidence guided helps you learn earlier.

Plus it is a fallacy that if you learn you don’t build, good teams know how to do both at the same time and that’s actually what the steps layer is meant to teach you or to help you do.

Lenny: Awesome. So maybe just to close off that loop, say someone listening is at a bigger company, say Netflix versus a series A, series B or startup. Is there something you’d recommend about them approaching this differently? Any kind of guidance there of just how to take what you’re sharing differently if you’re at different sorts of companies like that?

Itamar Gilad: Absolutely. I think the concept we talked about of the North Star metric, the value created versus the value captured, is very important in every company. Building your entire metrics tree may be overkill, doing heavyweight OKRs may be overkill for early stage. Early stage companies don’t even know how they create value, so they need to iterate, and their goal is really to find product market fit. Beyond that, what happens is that you need to start building your business model. So that’s your goal and you iterate towards that and you need to put metrics on that, and then when you move into scale, you need to try to create order because when you scale up… And all of this is covered in the book, there’s a special chapter just about these questions. When you scale up, you get a lot of people and a lot of money and everything is happening at the same time. So there you need some order, evaluating ideas in a very systematic way. In a company like Netflix, by the way, I don’t know if they need this specific method. They’re very-

Lenny: Yeah, maybe that was a bad example. They’re probably doing things pretty well.

Itamar Gilad: One thing I discovered by the way, there’s two types of companies that really benefit from this technique. One is those companies that are kind of emerging into modern product development. They have product teams, they have product managers, they have OKRs, they’re starting to do Agile, they’re starting to do experimentation, but they’re struggling to put it all together. Every CPO is building their own little framework. The other type is those companies that used to be evidence guided and they regressed, and that happens way too often. Change of management, change of culture, and then all of a sudden they need to rediscover, to rekindle that spirit that was lost along Google+. So some of the people that actually respond to this the strongest are actually, surprisingly, in these companies.

Lenny: What I love about your frameworks and kind of all these things we’re talking about is these are just a… You can almost think of them as a grab bag set of tools to make you more evidence guided as a company. You could start with thinking about the confidence meter, you could start using ICE more. You could start using the metrics tree and all these things just push you closer and closer to being more evidence guided, you don’t have to adopt this whole thing all at once.

Itamar Gilad: Absolutely. I would recommend that you don’t try because if the transformation is way too big, you will get fatigued and you will just create a lot of process for a lot of people and you would not see the results and after a quarter you’ll give up. So exactly what you suggested is the right approach.

Lenny: What would be the first thing you’d suggest if people were trying to move closer to being less opinion oriented and more evidence-based? Which of these frameworks or models would you recommend first?

Itamar Gilad: I recommend that they discuss internally where the biggest problem they’re facing is. If the goals are unclear, there’s misalignment, we keep chasing the wrong things, start at the goals layer. Try to establish your North Star metric, your top business metric, your metrics trees, start assigning teams their own area of responsibility. If you’re spending a lot of time in debates and you’re constantly fighting and changing your mind, start with the ideas layer and establish impact, confidence, ease or whatever prioritization model you like, but involve evidence in it. I think the confidence meter is a good tool to use irrespective. If you’re building too much and you’re not learning enough, start adopting the steps layer, which we haven’t seen yet. And if your team is very disengaged, you have one of these teams where the developers are very into Agile, very into quality, very into launching things, start working on the tasks layer.

Lenny: Awesome. Okay, let’s keep going.

Itamar Gilad: All right, so steps. Steps are about kind of helping us learn and build at the same time, as we said, and one of the patterns I see is that organizations don’t know that they can actually learn at a much lower cost. They believe they need to build this elaborate MVP, which is not minimal in any way, and then launch it and then they will discover. Basically it’s what we used to call beta 20 years ago, but just with a different name. What I’m trying to do here in the steps layer is to help companies realize there’s a gamut of ways to validate your ideas, or more specifically to validate the assumptions in your idea, and I created a little model for this. It’s called AFTER: assessment, fact finding, tests, experiments and release results. But again, it’s just putting together things that much smarter people invented. So in assessment you have very easy things, things that don’t require a lot of work. You check if it aligns with the goals, this idea that you have in your hand.

You do maybe some business modeling, you do ICE analysis, you do Assumption Mapping, which is a great tool by David J. Bland, or you talk to your stakeholders one-on-one just to see if there are any risks, etc. These are usually not expensive things and they can teach you an awful lot about the impact and the ease of your idea. The next step is to dig for data, and usually that goes hand in hand with this. So you can find data in your data analysis, through surveys, through competitive analysis, through user interviews and through field research, observing your users. Obviously these last two are pretty expensive, so it’s often good not to wait until you have the idea and then start doing your research. It’s best to keep doing your research ongoing, and then you have some sort of data to rely on and to compare your idea against. But until now we didn’t build anything. Now you’re ready to start testing, building versions of the product and putting them in front of users and measuring the results. But initially you don’t build anything, you fake it.

You do a fake door test, you do a smoke test, a Wizard of Oz test, a concierge test, a usability test. We used a lot of those in the tabbed inbox by the way. One of the first early versions was actually we showed the tabbed inbox working to people. But it wasn’t really Gmail, it was just a facade of HTML, and behind the scenes, according to the permissions that the users gave us, some of us moved just the subject and the sender into the right place. So initially the interviewer distracted them and then showed them their inbox, and in it the top 50 messages were sorted to the right place, more or less, if we got it right. And people were like, “Wow, this is actually very cool.” And that gave us a lot of evidence.

Lenny: That’s an awesome story. So that was in the user research, it wasn’t rolled out to people? It was a manual individual?

Itamar Gilad: There wasn’t a single line of code written, this was just cooked up by the researchers and our designers. But it gave us some evidence to go and say, we should try and build this thing.

Lenny: Love that.

Itamar Gilad: So initially you fake it, mid-level tests are about building a rough version of it, it’s not complete, it’s not polished, it’s not scalable, but it’s good enough to give to users to start using. So those are early adopter programs, alphas, longitudinal user studies and fish food. Fish food is testing on your own team.

Lenny: Fish food? I haven’t heard that term before. So it’s dog fooding, but more local to your team.

Itamar Gilad: I think it’s a Googly thing, but some people told me that they use the name fish food in their company as well. So I’m using it, I don’t know if there’s a better name for it.

Lenny: I wonder why it’s called fish food, because it’s like little? It’s like little gentle little clicks?

Itamar Gilad: It could be. Yeah, I don’t know.

Lenny: Wow. Okay, super cool. I’m learning a lot here.

Itamar Gilad: So the next stage is to actually build a kind of more complete version of this, and then you can dog food it, then you can give this to your users internally. When I joined Microsoft many years ago, the first thing I noticed was that Outlook was very buggy and I asked people what’s going on? And they told me we are all dog fooding the next version of Outlook that hasn’t come out yet, and that’s a very common practice in Silicon Valley. You can do previews, you can do betas, you can do labs, so those are tests. Now, there’s a special class of tests which are experiments, because they have a control element. So AB tests, multivariate tests, those are all experiments. I’m using the word experiment the way data scientists use it, although people tend to call everything that you see here experiments. And finally, even with the release, you can do staged releases, you can do percent launches, you can do holdbacks. All of these things help you further validate your assumptions. Sometimes you need to roll back and change things, but it’s another opportunity to learn.

So the key point is you don’t have to start at the right-hand side, which is expensive. You can start early on and that leads to poking a lot of ideas very quickly. You realize they’re not as good as you thought, and then you can invest more effort into the good ideas. If they generate positive evidence, you can go further and further until that point where you feel you’re ready for delivery.
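The five stages described above, and the techniques named under each, can be laid out as a simple lookup. This is only a sketch of the grouping as it was presented in the conversation (cheap end first, expensive end last); the dictionary and function names are hypothetical.

```python
# Validation techniques grouped by stage, in the order presented in the talk.
VALIDATION_STAGES = {
    "assessment": ["goals alignment", "business modeling", "ICE analysis",
                   "assumption mapping", "stakeholder review"],
    "fact finding": ["data analysis", "survey", "competitive analysis",
                     "user interview", "field research"],
    "tests": ["fake door test", "smoke test", "Wizard of Oz test",
              "concierge test", "usability test", "early adopter program",
              "alpha", "longitudinal user study", "fish food", "dog food",
              "preview", "beta", "labs"],
    "experiments": ["A/B test", "multivariate test"],
    "release results": ["staged release", "percent launch", "holdback"],
}


def stage_of(technique: str) -> str:
    """Return the stage that a technique belongs to."""
    for stage, techniques in VALIDATION_STAGES.items():
        if technique in techniques:
            return stage
    raise KeyError(f"Unknown technique: {technique}")


def earlier_stages(technique: str) -> list[str]:
    """List the stages that come before the given technique's stage."""
    stages = list(VALIDATION_STAGES)
    return stages[:stages.index(stage_of(technique))]
```

A team about to commit to an A/B test could call `earlier_stages("A/B test")` to see which cheaper stages it might try first.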

Lenny: Okay. So we’ve talked about goals, we’ve talked about ideas, we’re talking about steps here. Is there anything else along steps? And then next I know comes tasks.

Itamar Gilad: No, this is it for steps. There’s a lot more with this, we will not go into all of it.

Lenny: Okay, that sounds good. Let’s talk about tasks and what you mean there.

Itamar Gilad: All right, awesome. So in many organizations there’s these two worlds. There’s the planning world, where basically you have the managers, the stakeholders, some of the PMs really sit and think about what we need to launch, and that’s where we create the strategies and the roadmaps and the projects. But guess who is not invited to the party? The people who are actually doing the work. They live in Agile world, they’re very focused on moving tickets to the done state, on burning story points, pushing stuff into production, and there’s a big gap between these two worlds. They don’t understand each other, they don’t see eye to eye, there’s a lot of mistrust being built sometimes against the plans, or the managers feel that the teams are just not being very effective. We’ve seen all of this, and the solution, the stop gap, is to put a PM in the middle. The PM is supposed to make all of this work, deliver on the roadmap like a project manager, feed the Agile machine with perfectly prioritized product backlogs and stories, and it just doesn’t work, honestly.

And the PMs I meet are very tired and they have to spend so much time in planning and roadmap discussions and they’re very busy, they don’t have time to do research or to test ideas. So I suggest changing this and bringing the developers a little bit out of their Agile cage, if you like, and no disrespect to Agile, it’s a great thing, but let’s let them do more than just develop. Let’s let them discover as well. One of the tools I suggest, and again this is a process, is what I call the GIST board. So it’s basically the top three layers of GIST. The goals are on the right; these are just the key results, and usually per team I suggest not more than four. So you create a GIST board per team. Then come the ideas we’re working on right now, sometimes with their ICE scores, and then the next few steps that we might want to pursue in order to validate these ideas, and this is a very dynamic thing.

It changes all the time, the team leads need to update it and the team needs to meet around it at least once every other week to talk about what’s going on. Are we still following the right ideas? How are we doing on the goals? What are the next steps? What’s blocking us from completing the most important steps? And this is a discussion that is not happening today, because most of the discussion happens at the roadmap level and then there’s a lot of discussion at the task level. But this middle layer of what actually are we trying to achieve and how well are we doing on it doesn’t exist. If you do have this, you create a lot more context in the minds of your team and then they need to ask you fewer questions. You need to tell them less what to do. They know what success is and they are able to actually do a lot more on their own.

Lenny: Is the way to think about the GIST board as the way you should be roadmapping, or is this more of a strategy framework to think about why you should be prioritizing broadly?

Itamar Gilad: The way I say this is at the beginning of the quarter, the team defines its goals. The leads of the team define the goals, but they review it with the team, they review it with the managers, of course with the stakeholders. Everyone’s in agreement, these are the maximum four key results and the one or two objectives you guys need to work on, teams cannot deliver on more than that. You copy these key results into the GIST board, then you start looking at your idea bank or you start generating ideas and say, how can we achieve these key results?

Lenny: And to clarify the thing you copy is the key result as the goal?

Itamar Gilad: Yes, exactly. You can write the objectives alongside that to remind people what we are trying to achieve, but the key results are the thing we show here. Then you pick some ideas, the ones that look most promising, and as counterintuitive as this sounds, I would recommend that you let the team pick these ideas. The managers or the stakeholders can propose the ideas, everyone can propose, but the team should use the ICE process, and especially the product manager is very important here, to choose which ideas to test first. Then the team together needs to develop which steps should we run, how can we validate this? Some of the steps will be done by the PM, some by the data analyst, some by the user researcher. But some will involve the team, there’ll be some coding, there’ll be some running of experiments, and so there’s some ownership around the steps. A sub team owns each one of these steps and we will change the board very actively.

So if an idea turns out to be bad we will take it off the board and put another idea in its place, or maybe we achieve the goal, we don’t need to work on this anymore, we can focus on something else. So it’s a project management tool in a sense.

Lenny: Awesome. So I’m looking at it and I think maybe the most important piece of this is that steps aren’t just like a project, like launch a better onboarding or add the step to onboarding. It’s you want to emphasize the steps that you’re going to take to get to more and more confidence essentially, and more and more evidence guided thinking versus just, “Well, let’s figure out how to launch this feature idea.”

Itamar Gilad: Exactly. It’s not an engineering milestone or a design milestone, it’s a learning milestone. So we build something and along the way we actually grow the scope of what we build. We are building the product in the process and we learn, so the two have to come hand in hand.

Lenny: And for folks that aren’t watching this on YouTube, just to walk through an example, we’ll do it real quick. So one of your goals here is average onboarding time: you want the average onboarding time to be less than two days, and it’s currently five and a half days. An idea there is an onboarding wizard, and then the steps are a usability test with mockups, then a usability test with a prototype, and then an AB test?

Itamar Gilad: Yeah, basically, and you can alter this as you go along. Sometimes you can run multiple steps in parallel it’s not always sequential. But that’s basically the process, yeah.
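The example just walked through can be written down as a small data structure. This is a sketch of one possible representation, not the GIST board as defined in the book; the class and field names are hypothetical, and treating steps as strictly sequential is a simplification, since steps can sometimes run in parallel.

```python
from dataclasses import dataclass, field
from typing import Optional

MAX_KEY_RESULTS_PER_TEAM = 4  # "not more than four" per team, per the talk


@dataclass
class Step:
    description: str
    done: bool = False


@dataclass
class Idea:
    name: str
    steps: list[Step] = field(default_factory=list)

    def next_step(self) -> Optional[Step]:
        """Return the first step that has not been completed yet."""
        for step in self.steps:
            if not step.done:
                return step
        return None


@dataclass
class Goal:
    key_result: str
    current: float
    target: float
    ideas: list[Idea] = field(default_factory=list)


# The onboarding example from the conversation.
board = [
    Goal(
        key_result="Average onboarding time under 2 days",
        current=5.5,
        target=2.0,
        ideas=[
            Idea("Onboarding wizard", steps=[
                Step("Usability test with mockups"),
                Step("Usability test with a prototype"),
                Step("A/B test"),
            ]),
        ],
    ),
]
assert len(board) <= MAX_KEY_RESULTS_PER_TEAM
```

Marking a step as done and calling `next_step()` again gives the team its next learning milestone; an idea whose evidence turns negative would simply be removed from `ideas`.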

Lenny: Awesome. So again, what you’re trying to emphasize here as a team is just we’re not just going to launch this onboarding wizard and we’re not going to figure it out later. It’s like let’s be upfront about the steps we’re going to take to build more and more confidence. This is something we should keep investing more and more in, which is really interesting.

Itamar Gilad: Yeah, and another interesting thing that happens: every time you run a step, if it’s successful you have evidence, and you can go back to the managers and share it and say, “With this idea, we thought it was great, but we got this result. What do you think that means?” And sometimes the manager that proposed it would say, “I think the test failed, let’s rerun it.” Or sometimes they will say, “Maybe it’s not as strong as I thought.” The discussion just becomes that much more nuanced and objective, if you like.

Lenny: Maybe just to close out this framework. How does this relate to a roadmap that they may have in a spreadsheet or in Jira or in Asana or something like that? Does this sit on top of that? Is this replacing a roadmap somewhere else?

Itamar Gilad: I would say that release roadmaps, where you are just saying by Q3 we want to launch this or by October we have to launch that, they’re kind of competing with this. If you’re doing that and people know that the goal is to launch that thing by October, forget about learning, forget about evidence guided. I recommend using outcome roadmaps, saying by October we want to achieve this outcome. By Q4 we want to launch in another three countries, or we want to grow our usage in India by that much, by this time we need to tackle the problem of churn, and how we achieve this is left open. Sometimes we know we have a concrete idea that is high confidence, that we already tested. We switch into delivery, then we can put it on the roadmap and say, “Yeah, we’re going to build this thing and we’ll aim for October.” But otherwise you want to keep it open, and the roadmaps can kind of suffocate this process if you decide upfront with low confidence that this particular idea must be launched.

Lenny: Okay. So you’re proposing people switch the roadmapping practice to this, which is very ambitious. I love it.

Itamar Gilad: Well, this is not a roadmap. This is just a tool for the team to manage the project, but I have a proposal for outcome roadmaps inside the book.

Lenny: Okay, awesome. Okay. So I was going to ask, if people wanted to try this approach, the book is the best way to fully understand the framework and how to implement it?

Itamar Gilad: That’s one way. I have articles, I have resources on my site, but I try to condense much of what we just discussed, with a lot more nuance, in the book. So if you are interested in that, I would give it a go.

Lenny: Awesome. Maybe just on the topic of OKRs real quick. How do OKRs connect to all this? It sounds like broadly you kind of assume people will keep working on here’s our metric or key results or objectives and then that plugs into this kind of GIST framework.

Itamar Gilad: So the metrics trees, plus your mission, plus the individual missions of the teams give you most of what you need to populate your OKRs. There’s of course a process of alignment, top down, bottom up, side to side, which I talk a little bit about as well. OKRs is a very rich topic, but those things are usually the core. There are usually some other OKRs that are about the health of the company, the health of the product, etc. Those are called supplementary OKRs, I talk about those as well. So yeah, I think OKRs are a helpful tool if you like them.

Lenny: And just zooming out again. Basically you don’t need to take all of these ideas and lump them all together and change the way you work as a business. You can start with picking some of these ideas and starting to become more and more evidence guided. It sounds like this GIST board isn’t where you probably want to start, but maybe it’s once you have more and more experience using some of these tools or you tell me. Do you sometimes go straight to this way of thinking about the roadmap and the plan?

Itamar Gilad: So it might not be the full board because you’re missing some of the pieces, maybe your goals are not as good or your idea prioritization isn’t as good. But if your team is very, very delivery focused (and sometimes it’s the opposite, the managers are telling them how to build) and you want to break this kind of dynamic, you can create a step backlog. So instead of a product backlog, let’s create a backlog of steps, which are just validation steps, betas and previews, etc., and that changes the dynamic pretty strongly.

Lenny: So by the time this podcast comes out, the book will be out. What is the best place to find the book?

Itamar Gilad: Hopefully on Amazon, you can search for it. You can go to my site, itamargilad.com, and it’ll be presented prominently there. There’s also the book’s landing page, evidenceguided.com, where you’ll find everything you need to know about the book.

Lenny: Well, with that we’ve reached our very exciting lightning round. Are you ready?

Itamar Gilad: Yes, let’s go.

Lenny: What are two or three books you’ve recommended most to other people?

Itamar Gilad: So I’m going to cheat, I’m going to recommend a series of books so two series. One is the-

Lenny: Cheating is allowed.

Itamar Gilad: All right, cool. And those are obvious ones. One is the series published by SVPG, Silicon Valley Product Group: INSPIRED, EMPOWERED, and now I think TRANSFORMED has come out. I haven’t read it yet but I’m sure it’s amazing. This is Marty Cagan and his colleagues; they write some tremendous books and every product manager should read them. The other series, a bit older, is the Lean series: The Lean Startup, Lean Enterprise, Lean Analytics, Lean UX. There’s gold in all these books. They are really, really important books and I think they’re not as appreciated as they should be. Running Lean is another example.

Lenny: What is a favorite recent movie or TV show?

Itamar Gilad: I’m not really a big TV or movie buff, I just put on whatever comes up. I’m discovering that YouTube is actually becoming one of my sources of information and entertainment. I’m learning a lot of Spanish recently, so I discovered this channel called Dreaming Spanish, which is incredible if you’re learning Spanish. So that’s my recommendation.

Lenny: That’s a unique choice, I love it. Favorite interview question you like to ask candidates.

Itamar Gilad: I like to ask them to design something for a niche audience. So a navigation system for elderly people or some sort of laptop for people with vision impairment, etc. So those are good questions to see their customer empathy, their creativity, their ability to evaluate multiple ideas, their ability to find flaws in their own ideas. So there’s a lot of room to dig in there and kind of see how this person is thinking as a product person.

Lenny: What is a favorite product you recently discovered that you love?

Itamar Gilad: It’s a cliche, but it’s AI. There’s a company called ElevenLabs that does voices, the best synthetic voices you’ve heard, and they can also replicate your own voice, so you can create a voice signature. If you’re American you can use their kind of default free version or cheap version to replicate your own voice, and that could be pretty useful if you need to narrate an audiobook or do some online course. So I’m finding this service very interesting.

Lenny: This is all part of my big retirement plan, find all of these components together that can replace me eventually. You got AI generating content, we’ll have this voice thing. I love it, it’s all happening.

Itamar Gilad: There’s an AI version of you, right? I can ask you questions now with-

Lenny: Oh, there is lennybot.com.

Itamar Gilad: Right.

Lenny: It’s all part of the plan. Okay. What is a favorite life motto that you repeat most to yourself that you share with others?

Itamar Gilad: That’s a big one. Albert Einstein I think said, “Strive not to be a success, but to be of value.” And I think that’s a great motto for people and for companies. It’s something that kind of guides me and this whole concept of the value exchange, etc, is kind of loosely connected to that.

Lenny: I love that, that’s such an important point for people putting out content online. So many people are just like, “I just want to be successful, get followers, here’s all these things I’m tweeting and showing,” and the thing that actually works is to deliver value, create stuff that people really value and want. I find the signal for that is, do you find it interesting and valuable? If you’re like, “Oh wow, that’s really interesting,” oftentimes other people are going to find it interesting. So I love that, great choice, I’m going to look that one up. Two more questions. What’s the most valuable lesson you learned from your mom or your dad?

Itamar Gilad: I think both of them in their own way, they had relatively modest jobs, teaching or doing other things, but they always strived again to be the best they can and to deliver the most value they can. So it’s very connected somehow, maybe I’m seeing the world through this lens. But they kind of taught me to strive to be the best I can at what I do.

Lenny: The final question. You’re Israeli, for folks that can’t tell. What is your favorite Israeli food that people should definitely check out or try to get whenever they can?

Itamar Gilad: When I arrive in Israel I usually go for shawarma, which is like döner kebab if you know it, it’s just better. So if you’re in Israel, if you go visit Haifa, which is the city where I grew up definitely check out the shawarma.

Lenny: Awesome. Itamar, I hope people got the gist of your book from our conversation. What’s the best way to find it? What’s the best way to learn about you and reach out if they want to ask any questions? And then also, how can listeners be useful to you?

Itamar Gilad: To find it you can go to itamargilad.com or to evidenceguided.com, and you’ll find the book and you’ll find me. Best value to me: try it out. Just take some of these ideas, bring them back to your office, talk with your colleagues, and ask, “What do you think we should do about this?” Just give it a go and reach back to me; I’m easy to find on my website. Tell me what happened, I’m really interested.

Lenny: Amazing. Itamar, thank you again so much for being here.

Itamar Gilad: Thank you.

Lenny: Bye everyone. Thank you so much for listening. If you found this valuable, you can subscribe to the show on Apple Podcasts, Spotify or your favorite podcast app. Also, please consider giving us a rating or leaving a review as that really helps other listeners find the podcast. You can find all past episodes or learn more about the show at lennyspodcast.com. See you in the next episode.

Glossary

English	中文
activation rate	激活率
Assumption Mapping	Assumption Mapping（David J. Bland 创建的工具，保留原文）
build-measure-learn loops	构建-测量-学习循环
concierge test	礼宾测试
confidence	信心
CPO	CPO（Chief Product Officer 的缩写，保留原文）
David J. Bland	David J. Bland（人名保留原文）
design thinking	设计思维
dog fooding	dog fooding（团队内部使用自己开发的产品）
Dreaming Spanish	Dreaming Spanish（频道名保留原文）
döner kebab	土耳其烤肉
ease	简易度
ElevenLabs	ElevenLabs（公司名保留原文）
evidence guided	证据引导
fake door test	假门测试
fish food	fish food（在自有团队内部进行的产品测试，源自 Google 的说法）
GIST	GIST（Goals, Ideas, Steps, Tasks 的首字母缩写，不译）
GIST board	GIST 看板
growth hacking	增长黑客
Haifa	海法（以色列城市）
ICE	ICE（Impact, Confidence, Ease 的首字母缩写，不译）
impact	影响
lean startup	精益创业
Marty Cagan	Marty Cagan（人名保留原文）
meta framework	元框架
metrics trees	指标树
MVP	MVP（Minimum Viable Product 的缩写，保留原文）
North Star metric	北极星指标
OKR	OKR（Objectives and Key Results 的缩写，保留原文）
output	产出
product discovery	产品发现
product market fit	产品市场契合
reach	覆盖面
RICE	RICE（Reach, Impact, Confidence, Effort 的首字母缩写，不译）
roadmap	路线图
Ronny Kohavi	Ronny Kohavi（人名保留原文）
Sean Ellis	Sean Ellis（人名保留原文）
shawarma	沙瓦尔玛（中东烤肉卷）
siloed goals	竖井式目标
smoke test	冒烟测试
story points	故事点
SVPG (Silicon Valley Product Group)	SVPG（硅谷产品集团，保留原文缩写）
top KPI	顶层 KPI
value exchange loop	价值交换循环
Wizard of Oz test	Wizard of Oz 测试

走向以证据为导向 | Itamar Gilad (Gmail, YouTube, Microsoft)

绿野仙踪测试：Gmail 标签收件箱的早期验证

Itamar Gilad： 你可以做假门测试（fake door test），做冒烟测试（smoke test），做绿野仙踪测试（Wizard of Oz test）。顺便说一下，我们在分标签收件箱（tabbed inbox）项目中大量使用了这些方法。最早期的版本之一，实际上是向用户展示了标签收件箱的效果，但那并不是真正的 Gmail，只是一个 HTML 的外壳。在幕后，根据用户授予我们的权限，我们几个人手动把邮件的主题和发件人挪到正确的位置。一开始，采访者先转移用户的注意力，然后展示他们的收件箱：如果我们操作正确的话，顶部大约 50 封邮件或多或少都被归到了正确的位置。用户的反应是：“哇，这真的很酷。”这给了我们一些证据，让我们可以说：“嘿，我们应该试着把这个东西做出来。”

Lenny： 欢迎收听 Lenny’s Podcast，在这里我采访世界级的产品领导和增长专家，从他们来之不易的经验中学习如何打造和增长当今最成功的产品。今天的嘉宾是 Itamar Gilad。Itamar 是一位产品教练、作者、演讲者，也是 Google 的长期产品经理，曾负责 Gmail、身份系统和 YouTube 的工作。他还刚刚出版了一本很棒的新书，叫《Evidence-Guided: Creating High-Impact Products in the Face of Uncertainty》。Itamar 对为什么以及如何推动你的团队和组织从基于观点的决策过程转向以证据为导向的方法，有着重要的见解。在我们的对话中，Itamar 分享了若干非常实用和便捷的框架来实现这一目标，包括置信度仪表（confidence meter）、指标树（metrics trees）、GIST 和 GIST 看板，以及他对人们在用 ICE 进行想法优先级排序时常犯错误的看法，还有如何让你的 OKRs 更有效等等。尽情收听这期与 Itamar Gilad 的节目。

Itamar Gilad，非常感谢你的到来，欢迎来到播客。

Itamar Gilad： 很高兴来到这里，谢谢你的邀请。

Lenny： 这是我的荣幸。我想我们可以从你在 Google+ 和 Gmail 上的工作经历开始，以及这些经历如何塑造了你对如何打造成功产品的看法。你能分享一下这个故事吗？

Google+ 的教训：基于观点的开发

Itamar Gilad： Google+ 是我在 Gmail 的第一次经历。我于2011年8月加入 Gmail，他们交给我的第一件事就是：“让我们把 Gmail 和 Google+ 连接起来。“如果你对这段故事有些模糊，当时 Facebook 非常庞大——现在仍然很大——但那时它像蘑菇一样疯长，人们花好几个小时在上面。这让 Google 非常紧张，显而易见的解决方案就是推出 Google 自己的社交网络 Google+。我们都相信这个东西，它最初确实很受欢迎，我们都在使用，我们都对它充满信心。所以我们的使命就是把它做出来，而 Google 不惜代价，在公司内部创建了一个全新的部门，围绕 Google+ 制定了完整的战略，我们必须把 Gmail、YouTube 和搜索都连接到 Google+ 上，让它们变得更加个性化、更加社交化。这就是当时的想法。于是我们开始行动，在接下来的几年里在 Gmail 中发布了一系列功能。而 Google+ 本身变成了一个庞大的项目，功能非常丰富，经历了大量的重新设计和迭代——但这一切都没有奏效。

事实证明人们其实并不需要另一个社交网络，人们不喜欢它，也不使用它。最终在 Gmail 中，我们几年后回滚了所有 Google+ 的集成，而 Google+ 本身在2019年被关闭。暂且不谈投入其中的巨大浪费——数百万的人时和人周。事后看来，Google 不仅押错了方向，还错过了更容易把握的机会。就在离 Google 总部不远的地方，有 WhatsApp——在美国不太出名，但它实际上创造了巨大的影响力，数亿人在使用他们的产品，他们对 Facebook 构成的威胁远比 Google 大。所以 Google 错过了像 WhatsApp、Snapchat 等社交移动应用的机会。对我来说，这个故事几乎就是我今天所说的基于观点的开发（opinion-based development）的缩影——我们想出一个主意，我们相信它，所有迹象都显示它很好，也许早期测试也显示它很好，然后我们就全力投入去实现它。作为产品经理，我自己犯过很多次这个错误，我就是那个推动这些想法的人。所以对我而言，这是一个转折点，我感到我们需要采用一种不同的体系。

Lenny： 在你讲下一个故事之前，团队有多大？大概在这个方向上花了多少年？只是让大家感受一下你说的那种浪费的规模。

Itamar Gilad： Google 内部发生了一场巨大的震荡来组建 Google+ 团队，各个团队甚至整个部门被打散重组，我想在巅峰时期大约有1000人——

Lenny： 哇，难以置信。

Itamar Gilad： 这是一个与 Android 和 Docs 规模相当的部门，体量相当大，有自己专属的大楼。这套做法取自 Steve Jobs 的剧本——在公司内部创建一个完全保密的项目，然后拼命地干。

Lenny： 对。不过我记得 Facebook 当时真的很害怕，我记得他们停掉了所有东西，简直就是 DEFCON 一级戒备状态，确实也把 Facebook 吓得不轻。

Itamar Gilad： 是的，确实如此。但最终结果是，Google 的广告收入没有受到影响，Facebook 也没有受到影响。所以事实证明这个想法毕竟并没有那么必要。

Lenny： 好。所以这是一个失败的例子，因为它是基于观点的软件，我记得你用了这个说法。然后 Gmail 的标签页则是完全不同的经历。

Google 的基因：从证据驱动到计划执行

Itamar Gilad： 没错。Google 是一家非常成功的公司，轮不到我来批评它，也不是事后诸葛亮地说”你们应该做得更好”。而且 Google+ 背后的一些人是最聪明的领导者，尽管有这个故事，我至今仍然这样认为。如果你回顾 Google 的历史，看头十年左右的发展轨迹——Google 是一家我所说的证据引导型公司。它高度重视关注客户，提出大量想法，观察数据，看这些想法实际表现如何。他们不怯于发布测试版和非常粗糙、不完整的产品，从中学习，然后他们期望人们根据结果采取行动。所以”快速失败”（fail fast）是一个非常著名的范式，如果你的项目不行，就必须砍掉或大幅转型。我认为如果我们当时保持快速失败的心态，对 Google+ 会大有帮助。

但出于某种原因，Google+ 这个项目把这套打法搁置了，转而采用了一套我称之为”计划与执行”（plan and execute）的做法。不过我认为 Google 内部的 DNA 仍然存在。所以在 Gmail 内部，Google+ 之后的下一个项目就是标签页收件箱。它恰好是 Google+ 的反面——它始于一个非常小的、没人看好的想法，然后我们开始追问：背后的本质是什么？目标是什么？我们到底要解决什么问题？结果发现很多人都在接收社交通知和促销信息等等，其中大多数人非常被动。他们不清理收件箱，就这样生活在一堆杂乱之中。我提出了一个解决方案，我当时觉得棒极了，想直接推行，走计划与执行的路子。但我的同事说：“等等，我们其实试过了。我们有一堆帮用户整理收件箱的想法，他们根本不用。你的想法凭什么行？“

Gmail 标签页的诞生

这促使我和我的团队开始研究这些用户，建立一个更加以用户为中心的目标，然后思考其他方案。接着我们开始更严格地测试——基本上先在自己的收件箱上测试，然后招募其他内部测试者，也就是其他 Googler 来测试同样的收件箱，再开放给外部测试者。我们做了可用性研究，收集数据，组建了一整个数据挖掘团队和一整个机器学习团队来构建正确的分类方案。最终我们得到的解决方案对这些被动用户来说非常成功。这对很多人来说是个意外，因为我大多数同事、大多数与我交流的人其实都知道如何管理自己的收件箱。所以对他们来说那个方案完全说不通——把促销和社交邮件分开听起来是最愚蠢的主意。但大约 85% 到 88% 的用户绝对喜欢它。如今 Gmail 拥有大约 18 亿活跃用户，其中大多数人都在使用这个功能，所以它也是一个影响力相当大的功能。

Lenny： 具体的功能，以防有人不太清楚——就是促销邮件文件夹、社交邮件文件夹，还有常规收件箱。

Itamar Gilad： 对，如果你愿意的话，在设置里还可以启用更多分类。

Lenny： 我就在用，我很喜欢。只不过它把我的 Newsletter 放到了用户的促销文件夹里，这事该找谁说理？

Itamar Gilad： 嗯，Newsletter 对分类引擎来说是一个非常复杂的场景。

Lenny： 对，只需要给我的 Newsletter 加个例外就好了，其他都好说。好吧，继续。

证据引导型公司

Itamar Gilad： 事后回顾，我问自己：“为什么这个项目如此不同？“我认为原因在于我们对自己的观点没有那么强的信心。我们有观点、有想法，但我们没有直接全力押上、直接开干。我们实际采用的是一套证据引导的体系，我认为这不仅限于 Google。我认为每一家成功的产品公司——Amazon、Airbnb，你去考察任何一家，至少在它们最好的时期，都找到了一种平衡人类判断与证据的方式。它们并没有试图消灭人类判断和观点，而是用证据来强化它们。它们发展出了非常不同的模式。Apple 是另一个例子，但这个原则在所有这些公司中都成立。

Lenny： 太棒了。所以你把这些经历，加上你指导产品负责人、与企业合作的所有经验，写成了这本书，叫 Evidence-Guided，看 YouTube 的人可以看到它就在你身后。我想聊聊其中的一些故事，以及从中涌现出的其他经验和框架。不过作为开始，这本书的电梯演讲（elevator pitch）是什么？

Itamar Gilad： 这本书是写给我们这样的人的——产品人，想把证据引导思维或者说现代产品管理带入自己组织的人。这中间有很多挑战，并不简单。我们都读过那些书，都知道理论，都知道体系的一部分。这本书试图给你一套系统性的方法来做到这件事，它是一个元框架，帮助你将组织朝着证据引导的方向提升——如果这是你想要的话。

自上而下的产品决策是否有时合理？

Lenny： 让我们稍微回到前面那个故事，在进入书中的框架和经验之前。第一个 Google+ 的例子，基本上是自上而下的——“我们需要建一个社交网络，去建吧。“显然这在很多公司都会发生，我不知道是否有一个简单的答案。但是否存在一些情况下，这种方式是合理的？显然 Apple 是一个经典例子，Steve Jobs 说我们需要做 iPhone。我不知道具体是不是那样发生的。但是否存在一些情况，基于创始人的经验、创造力和洞察力来推进新产品想法是值得的？还是说你认为应该始终采用这种基于证据的方法？

Itamar Gilad： 我认为创始人非常重要，尤其是在初创和扩张阶段。很多最重要的想法都是他们提出的，给他们空间去表达、推动组织关注这些想法，至关重要。但关键不是要压制他们，而是要用审视的眼光来看待这些想法。你需要创造一种组织环境——当领导者走过来说：“你知道吗？我跟三个客户聊了，我想明白了，接下来五年我们该这么做”——你可以问：“你的证据在哪里？“顺便说一句，你举的那个例子很经典。Steve Jobs，据说他在厨房里头脑风暴出了 iPhone，然后让团队去造——这是 Steve Jobs 自己讲的故事，但完全不是真实的情况。现在我们知道实际发生了什么：iPhone 的诞生其实是一个不断探索、不断试错的过程，涉及多个项目——多点触控、手机结合——大部分都失败了。

Steve Jobs 是那个架构师。他成功地把各种线索串联起来，最终打造出了那款完美的设备。但这并不是他的独创，不是他一个人的灵感产物。他最初其实反对做手机，但随着他看到越来越多的证据，看到这个东西能做到什么，看到那些演示，他逐步拼凑出了一件非常有价值的产品。

Lenny： 这个洞察非常重要。听到这里的人可能会想——我很喜欢这个推动反方意见、鼓励创始人走向证据引导的想法。但拿 Google+ 来说，在当时的情况下，这真的可能吗？你能去找 Larry 和 Sergey，说”我收集了所有这些数据，告诉我们这行不通”吗？你有没有什么建议，关于如何提出反对意见，让创始人和高管真正认真对待反面观点，或者说真正去审视他们的想法？

Itamar Gilad： Google 有一个很好的特点，就是文化非常开放。人们并不怯于告诉 Sergey 和 Larry 他们是错的，而且大家一直这么做。当然要通过合适的渠道。关于 Google+ 确实有过很大的讨论——到底要不要做一个 Facebook 的克隆品——有过非常公开的内部讨论。我认为需要改变的是：不要让这些讨论基于观点。因为当讨论基于观点时，每个人都会带着自己的观点来，而通常最资深的人的观点会胜出，就是这样。如果我们当时带着硬数据来说：“听着，实际情况并没有按照你们预期的那样发展。我们该怎么办？继续还是转型？“我认为讨论的效果会好得多。当然，我这么说有点以偏概全，我并没有参与所有讨论。我知道在 Google+ 项目中，类似这样的严肃讨论肯定也在发生。

但总体趋势上，我发现证据确实能赋予我们很大的力量——让我们这些组织中职位较低的人、中层管理者，有能力去挑战那些观点。

如何在不太开放的领导面前用数据说话

Lenny： 在战术层面，你有没有发现什么方法特别有效？比如说，有些人并不在 Google 工作，他们所在的公司创始人、老板和高管没有那么乐于接受挑战。关于如何提出一个反向方案，或者”嘿，我有这份数据，我们真的应该重视它”——你有什么战术上的建议吗？

Itamar Gilad： 我认为如果你带着数据去，如果你偷偷跑了一个实验，然后拿着结果去给他们看，通常会得到两种反应。一种是他们极其愤怒，让你回去干活、照吩咐做事——如果是这种情况，你可能需要开始打磨简历，在组织内部或外部找下一个去处了，因为那个人说实话已经不讲道理了。但更常见的情况是他们会感到惊喜。Steve Jobs 的情况也是如此——他反对做手机，但后来有人向他展示了各种证据，证明 Apple 能做手机。他最初也反对多点触控，但后来改变了想法，中间经过了大量反复讨论。所以即使是 Steve Jobs，面对证据也愿意翻转立场。我在很多组织中都看到过这种情况。证据的力量就是这么大，这也是我把整本书建立在这个原则之上的原因。

你真的做到证据引导了吗？

Lenny： 你提出了一个”证据引导”（evidence guided）的概念。听众可能会觉得：“嘿，我们是证据引导的，我们做实验，我们用数据做决策。“但很多时候他们其实并没有做到。那么，有哪些迹象说明你可能并没有你想象的那样做到证据引导？

Itamar Gilad： 我认为有几个典型的信号。首先，目标非常不清晰——要么目标太多，要么非常模糊、晦涩，要么只关注产出（output），缺乏对齐。目标这块出了问题，通常和指标密切相关——指标缺失，或者只使用收入和商业指标，而没有面向用户的指标。这是另一个典型信号。然后你会看到大量时间和精力花在规划上，尤其是路线图（roadmap）规划上。打造一份完美的路线图，确实会消耗高管和产品经理等人大量的时间。再往下看，你会发现实验很少，即使有实验，也没有从中产生太多学习。最后一个典型信号是团队缺乏参与感——工程师们接收到的信号就是”你需要做的就是交付”，他们专注于产出，这也是他们被考核的标准。所以他们实际上脱离了用户、脱离了业务，并不那么在意。

这些问题通常可以通过采用更证据引导的体系来解决。

GIST 模型概览

Lenny： 好，那我们来深入谈谈你关于走向证据引导的方法。在书中，你分享了一个叫做 GIST 模型的框架，它是一种几乎能迫使你变得更加证据引导的、构建产品的整体方法。我们先从最简单的角度来理解——这个 GIST 模型到底是什么？

Itamar Gilad： 如果你允许的话，我可以展示几张幻灯片。

Lenny： 好啊，来吧。

Itamar Gilad： 也许这样会有帮助。

Lenny： 开始了，这也算是去 YouTube 上看视频版的一个好理由。

Itamar Gilad： 好，你看到了吗？这就是 GIST 模型：目标（Goals）、想法（Ideas）、步骤（Steps）和任务（Tasks）。本质上，它试图把对很多公司来说非常巨大的变革，拆解成四个稍微更容易管理的部分。每一部分仍然不小，但你可以逐一应对。这就是我把它们分开的原因。目标（Goals）定义我们要达成什么；想法（Ideas）是达成目标的假设性方案；步骤（Steps）是实施方案并同时验证它的方式，本质上就是“构建-测量-学习”循环（build-measure-learn loops）；任务（Tasks）是我们在看板（Kanban）、Jira 等工具中管理的那些东西，也就是开发团队通常非常关注的事项。听到这里，你会发现很多东西听起来很熟悉，因为 GIST 并不是一个全新的发明。它是一个元框架（meta framework），整合了大量已有的方法论。它基于精益创业（lean startup）、设计思维（design thinking）、产品发现（product discovery）、增长（growth）等，所有这些都在这里面。它只是试图把它们统一到一个框架或模型中。

Lenny： 那么理解这个模型用途最简单的方式是什么？它是你思考路线图的方式吗？是你的规划方式吗？从根本上说，它想告诉人们在构建产品的方式上要做出哪些不同的改变？

GIST 的四个层面

Itamar Gilad： 我会说这是四个你需要审视的领域，在每个领域中都要问自己：我们做的是对的事情吗？在每个层面上，你可能需要改变，甚至彻底转型。当我逐一解释每个部分时，我基本上会给出三样东西。在书中每一章，我都试图涵盖三个方面：背后的原则、落实这些原则的框架或模型，以及流程。说实话，流程是最脆弱的部分，也是你最需要根据自己公司的情况去调整和适配的部分。因为没有两家公司是完全一样的。写书的时候，不给任何流程是很诱人的做法，但流程恰恰是人们最想要的部分。所以书中也包含了流程，但要意识到你必须对其进行调整。

战略与愿景的位置

Lenny： 太好了。好，我们会逐一讨论这四个层面。在此之前，愿景和战略放在哪里？它们归入这四个层面中的某一个吗？你怎么看待战略和愿景？

Itamar Gilad： 这个问题很好。战略有一个整体的上下文，它是在 GIST 之外的。GIST 并不试图解决战略问题，它假设战略已经到位了。还有另一个巨大的板块是研究（research）。GIST 不涉及研究，它更多是关于发现和交付的。但战略极其重要，你也可以用我们将要讨论的一些工具来制定你的战略。在很多公司里，战略不过是一份加强版的路线图，本质上还是”计划-执行”模式，只是规模更大。回想一下 Google+，其实它本身就是一个战略选择。书中有一个章节专门讨论战略，我解释了同样的证据引导方法如何被公司用来制定战略。

GIST 在回应什么问题

Lenny： 好，也许最后一个背景问题。人们看到这个模型可能会想：挺酷的，我有目标、有想法、有步骤、有任务，我已经在做这些了。那这个模型是对什么的反思和回应？人们看到它的时候，可能错过了什么？他们会意识到”哦，原来这是我们没在做的，这是最重要的，是我们应该改变的”。我们后面也会详细展开每一部分。

Itamar Gilad： 我觉得逐一讨论每一层会更有帮助。

Lenny： 好，开始吧。

Itamar Gilad： 我们可以逐层讨论实际在发生什么。当人们说”我有目标”的时候，通常他们把目标这一层当成了一个规划会议。他们讨论的是：我们要在什么时候之前构建什么，需要什么资源。但这其实根本不是目标，这是规划工作。

目标与价值交换循环

Lenny： 好，让我们来聊聊目标，我知道这部分也和 OKR 有关，我很期待听到你对 OKR 的看法。

Itamar Gilad： 哦，那是另一个完整的话题了。你之前请过 Christina，她是真正的专家，我不确定自己还能补充什么。但确实，OKR 也是其中的一部分。我们先从目标说起。目标应该是什么样的？目标应该描绘出终态，定义我们想要到达哪里。如果你不知道自己想去哪里，证据是无法引导你的。在很多公司里，你看到的是顶层有关于收入、市场份额之类的目标，然后每个部门各自有一堆竖井式的目标——工程有工程的目标，设计有设计的目标，营销有营销的目标等等。这实际上把人们推向了不同的方向，很难做出决策。我认为在证据引导的公司中——你在几家公司工作过，可能也见过这种现象——他们会使用模型来构建覆盖整个组织的统一目标。我在关于目标的章节中展示的一个模型叫做”价值交换循环”（value exchange loop）。

它的基本思路是，组织试图向市场交付尽可能多的价值，并尽可能多地从市场中获取价值回来。通过在两者之间建立反馈循环，你实际上能够实现非常快速的增长。我认为你应该同时度量这两者，为每一方设定一个指标。我们通常用来度量交付价值的指标叫做北极星指标（North Star metric）。我知道你写过一篇非常好的文章来介绍它。

Lenny： 谢谢。

Itamar Gilad： 你在那篇文章里列出了几十家领先公司，以及它们各自的北极星指标是什么，非常有趣。不过我认为他们告诉你的其实是”我们度量哪些最重要的指标？我们排名第一的指标是什么？“但这并不完全是我所说的北极星指标。北极星指标度量的是我们为市场创造了多少价值。以 WhatsApp 为例，WhatsApp 很长一段时间度量的是发送消息数，因为每一条发送的消息都是一小份增量价值——对发送者、接收者来说都是如此；它是免费的、支持富媒体，可以从世界任何地方发送。相比短信，这是巨大的价值提升。如果第一年我们发送了十亿条消息，第二年二十亿条，那我们可能就把创造的价值翻了一倍。再看 Airbnb，我认为你们的一个关键指标，或者说真正的北极星指标，是预订夜晚数。我不确定你在的时候是不是还是这样？

Lenny： 是的，没错。

Itamar Gilad： 还有很多类似的例子。比如 Amplitude，他们度量的是活跃学习用户（active learning users），或者叫周活跃学习用户——这些用户在工具中发现了一条非常重要的洞察，以至于至少分享给了另外两个用户，并且这些用户也消费了这条洞察。所以，指向这个指标并说”这是最重要的指标，再加上我们要获取的价值指标——收入、市场份额等等”，是一件非常有力的事情。一旦你有了这两个指标，就可以进一步将它们拆解成我所说的指标树（metrics trees）。北极星指标有一棵指标树，顶层 KPI 也有一个指标树——也就是你在这里左侧蓝色部分看到的顶层业务指标，两者通常会有重叠。你可能会在中间发现一些超级重要的指标，因为推动它们实际上能带动其他一切。

Lenny： 你能再解释一下你所说的顶层 KPI 和北极星指标之间的区别吗？

Itamar Gilad： 北极星指标度量的是我们为用户创造了多少价值，也就是他们获得的核心价值。在这个例子中，这是一个生产力套件，所以指标可能是每月创建的文档数。因为我们认为每创建一份文档——也许是很小的文档，我不知道——加上现在 AI 很流行——都是一小份增量价值，这就是我们要增长的数字。顶层 KPI 则是我们期望获得的回报，应该是收入或利润。

Lenny： 我明白了，这就是价值交换。一个是用户得到了什么，一个是你从用户那里获得了什么。

Itamar Gilad： 没错。

指标树

Lenny： 也就是业务如何受益。太好了。我认为指标树是一个非常重要的概念。很多人觉得自己脑子里已经有类似的东西了——“好，这是我们的北极星指标，这些是我们能拉动的影响它的杠杆”。但我认为像你这样把它真正画出来，一层一层深入到影响这个指标的所有不同变量，这种映射方式不仅在思考影响力、目标等方面有用，还能帮助你估算你打算进行的实验的潜在影响。比如，如果你要在这棵树的底层对激活率（activation rate）做工作，假设你提升了 10%，它对全局指标的影响有多大？可能是一个非常小的数值。

Itamar Gilad： 这是一个非常重要的点，我们稍后会谈及影响评估，指标树对此很有帮助。它也有助于对齐，因为整个组织都在努力推动这两个指标，它们本质上就是我们使命的两面。我们有一个使命，那是公司的最高目标，而这两个就是最顶层的两个关键成果，可以说是最重要的东西。所以当你去找另一个团队合作时说：“嘿，你们要不要来一起做我的项目？“他们可能会说：“根据你的这个模型，这个想法确实有可能推动北极星指标。“这就帮助你们对齐了。我见过这样的案例——B 团队放下自己的想法，转而投身 A 团队的想法，正是因为这个模型。它还创造了一个机会，可以把一些子指标交给各个团队长期负责，从而在树中建立起一种主人翁意识和使命感。

Lenny： 它还能帮你判断应该设立哪些团队，哪些团队对指标有最大的潜在影响力。

Itamar Gilad： 在很多组织中还有另一种常见现象——团队拓扑反映的是软件结构，或者是某种层级模型，我们希望以特定方式来组织团队架构。但如果你从指标树出发，就可以尝试围绕目标来安排团队拓扑。有时候你需要重新调整，这并不是持续不断的重组，但时不时你会意识到目标已经变了，需要重新组织，而指标树可以帮助可视化这个过程。

Lenny： 我觉得对于正在听这段内容、思考这个问题的人来说，最简单的理解方式就是——本质上存在一个数学公式，等于你的北极星指标，或者你的收入，或者你试图实现的任何目标。如果你对这个数学公式没有一个尽可能清晰的认识，你应该先把这个搞清楚。因为它会深刻影响你如何思考资源投向哪里、应该有哪些团队、哪里多投资源、哪里少投资源。

Itamar Gilad： 没错。
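
编者注：上面说的“数学公式”可以用一个很小的脚本粗略演示。下面是一个示意性的草图，假设北极星指标是每月创建的文档数，并假设它由新用户和老用户两条分支相加得到；树的结构和所有数字都是为演示而虚构的，并非书中或任何公司的真实模型。

```python
# 假设的指标树：结构和数字都是虚构的，仅用于演示计算方式
def north_star(signups, activation_rate, docs_per_new_user,
               returning_users, docs_per_returning_user):
    """每月创建的文档数 = 新用户贡献 + 老用户贡献。"""
    new_user_docs = signups * activation_rate * docs_per_new_user
    returning_docs = returning_users * docs_per_returning_user
    return new_user_docs + returning_docs


baseline = dict(signups=10_000, activation_rate=0.30, docs_per_new_user=2,
                returning_users=50_000, docs_per_returning_user=4)

# 把激活率相对提升 10%（0.30 -> 0.33），其他叶子节点保持不变
improved = dict(baseline, activation_rate=baseline["activation_rate"] * 1.10)

before = north_star(**baseline)
after = north_star(**improved)
relative_lift = (after - before) / before
print(f"{before:.0f} -> {after:.0f}，顶层指标提升 {relative_lift:.2%}")
```

在这组假设数字下，激活率提升 10% 只让顶层指标增长不到 1%，这正是把树画出来之后才容易看清的事情。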

（广告部分已跳过）

Lenny： 好的，那指标树之后，接下来是什么？

想法层与 ICE 评估

Itamar Gilad： 好。接下来我们需要进入想法层。想法层的作用是帮我们从可能遇到的众多想法中进行筛选。这些想法可能来自——正如你所说——创始人、管理者、利益相关者，来自团队本身，来自用户研究，来自竞争对手。我们被想法淹没，而在组织内部通常发生的是某种观点之争，有时候甚至是政治博弈，或者是最高薪人士的意见。你之前在节目里请过 Ronny Kohavi，他发明了这个说法。而很少发生的是理性、逻辑性地做出”这些是最好的想法”这样的决策——因为说实话，这真的非常难预测。用户需求、市场变化、技术、产品、我们自己的组织，处处充满了不确定性。几乎不可能断言某个想法一定会是最好的。但我们确实会这样说，因为认知偏差让我们相信某个想法远胜于其他一切，绝对是正确的选择。

为了避免这种情况，我们需要以一种更加客观、一致且透明的方式来评估想法。在书中我建议使用 ICE——影响（impact）、信心（confidence）和简易度（ease）。我想我接下来会有一页幻灯片讲这个。ICE 本质上就是为每个想法赋予三个值。影响试图评估这个想法对目标的影响程度——这就是为什么拥有非常清晰且数量不多的目标如此重要。我们是在用北极星指标、顶层业务 KPI，还是团队的局部指标来衡量想法？不管是什么，我们先把它明确下来，然后以此为标准来评估想法。简易度基本上是投入的反面——做这件事有多容易或多困难。但这二者都是粗略估计，都是我们需要去估算的东西。我认为，仅仅通过把问题拆分成这两个维度来讨论，我们通常就能获得比”我的想法比你的好”稍微好一点的讨论质量。

然后还有第三个要素——信心，它试图评估的是：我们对自己的前两个粗略估计——影响和简易度——到底有多确定，或者说应该有多确定。

Lenny： 你用了”简易度”（ease）这个词，很有意思，因为通常人们用的是”投入”（effort）。你把它变成了一个正向的表述，这是你有意为之的调整吗？

Itamar Gilad： 我用的是 Sean Ellis 的定义。ICE 是 Sean 发明的。你认识 Sean 吗？我不确定你有没有请过他？他是——

Lenny： 我还没请过他。

Itamar Gilad： 对。给不认识他的人介绍一下，Sean 非常了不起。他可以说是增长运动的开创者之一，他创造了”增长黑客”（growth hacking）这个词，并且推广了产品市场契合（product market fit）的概念。

Lenny： 对。

Itamar Gilad： 他发明了 ICE，还创造了很多我们在产品领域使用却不自知的东西。

Lenny： 哇，我不知道 ICE 也是他发明的。好的，很酷。所以 ICE 的原始版本用的就是”简易度”而不是”投入”。

Itamar Gilad： 没错，是的。

Lenny： 有意思。

Itamar Gilad： 很多观众可能在想，R 在哪里？因为还有另一个变体叫 RICE，多了一个 R（reach，覆盖面）。我更倾向于 ICE，因为出于各种原因我更愿意把覆盖面折叠到 I 里面去，但两者都是有效的，在某种意义上也是等价的。

Lenny： 我和你立场一致，我的想法完全一样。我觉得人们把这件事过度复杂化了，试图引入各种数学公式来估算影响。我觉得这些只是简单的启发式方法，用来把最好的想法浮上来。它不需要是对影响、信心等东西的完美估算，所以越简单越好。而且最终它总是落到一个电子表格上。人们总想用各种工具来估算这些东西，但其实就是一个电子表格，Google Sheets，就很好了。

Itamar Gilad： 所以，你正好引到了我的下一个要点。当你去估算影响的时候，你会发现这是最难的部分。有时候它就是直觉和猜测，有时候基于一些电子表格或分析，以及你做的粗略估算——我觉得这是合理的。有时候这些方法确实能揭示你没想到的东西，而最理想的情况是，它基于测试——你实际做了测试，访谈了 12 位客户，给他们看了这个东西，结果其中只有一个真正喜欢。你通常应该据此降低你的影响估算，或者你做其他类型的测试。我们稍后会聊到测试。问题在于，人们往往就凭直觉行事，然后给自己一个很高的信心。他们说影响是 8 分，而且我很确定，所以信心也给 8 分——我觉得这有点令人不安，因为它实际上颠覆了整个系统。

所以我希望能帮助人们认识到，什么时候他们的猜测有强有力的证据支撑，什么时候只是薄弱的证据——从某种意义上说，就是如何计算信心。为此我创建了一个叫信心仪表盘（confidence meter）的工具，就是你能看到的这个彩色的东西——我要不要详细解释一下？

Lenny： 好，来吧。同样提醒一下，如果你只是在听音频，可以去 YouTube 上看，能看到实际的幻灯片。

Itamar Gilad： 好的。基本上我把它设计得有点像温度计。从低信心（蓝色区域，右上角）一直到高信心（红色区域），数字从 0 到 10。0 是非常低的信心——基本上我们什么都不知道，只是在黑暗中猜测；10 是完全的信心——你确定这个东西会成功，毫无疑问。我在圆盘上标出了你可能遇到的各种证据类型。比如，从右上角开始，所有这些蓝色区域都是关于意见的。可以是你自己对想法的自信、你的自我信念——你觉得这是个好主意。但你猜怎么着？每一个糟糕透顶的想法背后，都有人觉得它棒极了——这只能给你 10 分中的 0.01 分。也许你做了一个精美的演示文稿，或者一份六页文档，详细解释为什么这是个好主意。做起来稍微难一点，但信心仍然非常低。也许你把它和某个主题联系起来了——比如区块链……

嗯抱歉，区块链已经不时髦了。现在什么最火？

Lenny： AI。

Itamar Gilad： 没错，AI。这和 AI 有关，所以它就是个好主意？绝对不是。或者和公司战略挂钩，那是另一种主题支撑。就在我们说话的此刻，成千上万个糟糕的主意正在基于这些主题被执行。按照这个工具，所有这些加在一起最多也只能给你 10 分中的 0.1 分。接下来我们进入稍微严格一些的测试。一是和同事、经理、利益相关者一起评审这个想法。他们也不知道答案，他们没有水晶球，他们通常也不是用户，无法预测。但他们可以用稍微客观一点的方式来评估，也许能发现你想法中的缺陷。另一方面，群体也有偏见——政治、从众心理。所以群体有时反而会做出比个人更糟的决策，有这方面的研究支持这一点。接下来是估算和计划。你可能做一些粗略计算，或者你的同事去尝试更好地评估简易度。这会给你稍微多一点信心，但此时我们仍然停留在猜测的层面。接下来是数据——数据可以是轶事性的。你在数据中找到零星的几个数据点，或者和几位客户聊了聊，或者某个竞争对手有同样的想法。在我接触的很多公司里，如果领先的竞争对手有这个功能，我们就认为验证完成了——上线吧，这就是个好主意，我们得做。说实话这从来都不管用，你不应该假设你的竞争对手比你更知道自己在做什么。数据也可以是我所说的市场数据——来自调查，来自对你大量数据的分析，通过深入的竞争分析获得——还有其他方法可以创建更大的数据集，把你的想法与之对照。最后，要获得中等和高等信心，你真正需要的是把你的想法构建出来并测试它——这就是红色区域所在。所以有各种形式的测试，如果时间允许我们会谈到，它们能给你不同程度的信心。

Lenny： 太棒了，这个可视化非常酷。我们会在节目备注中链接这张图片，方便大家查看。我觉得这个工具的厉害之处在于，你完全可以在团队里把它当作一个小工具来用——就是问：我们在光谱上的哪个位置？我们认为这个的影响非常高，但我们的信心可能还停留在蓝色区域——那我们就确保认识到这一点。它用非常清晰的语言帮助人们理解。如果我们有这个工具，大家会更有信心得多。

Itamar Gilad： 所以你还可以根据你获得的信心水平来决定对这个想法的投入程度。本质上，早期你只想做成本低的事情来获取更多信心，然后再加大投入。如果是一个非常低成本的想法，你可以直接跳到高信心的测试——你可以做 AB 实验、早期采用者计划，什么都行，然后上线。有些想法不需要测试，有时候专家意见就足够了。如果你只是在改设置的顺序，没人会注意到，也没人会受影响——风险很低，你可以不测试直接上线。所以诀窍的一部分也在于知道什么时候该停下来，而不是在不必要的时候一路强行推到底。
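
编者注：ICE 打分和信心仪表盘的配合方式，可以写成下面这样的草图。这里假设综合分取影响、信心、简易度三者的乘积（也有团队取平均值），三项都在 0 到 10 之间；证据类型到信心分的对应表中，只有 0.01 和 0.1 两个数值出自这段对话，其余数值以及所有想法和打分都是演示用的假设。

```python
from dataclasses import dataclass

# 证据类型 -> 信心分（0-10）。0.01 和 0.1 取自对话，其余为演示用的假设值
CONFIDENCE_BY_EVIDENCE = {
    "self_conviction": 0.01,
    "opinions_and_themes": 0.1,
    "anecdotal_data": 1.0,
    "market_data": 2.0,
    "test_results": 5.0,
}


@dataclass
class Idea:
    name: str
    impact: float   # 对目标指标的预估影响，0-10
    ease: float     # 实现的难易程度，0-10（越容易分越高）
    evidence: str   # CONFIDENCE_BY_EVIDENCE 中的键

    @property
    def confidence(self) -> float:
        return CONFIDENCE_BY_EVIDENCE[self.evidence]

    @property
    def ice(self) -> float:
        # 假设：三项相乘得到综合分
        return self.impact * self.confidence * self.ease


ideas = [
    Idea("AI 摘要功能", impact=8, ease=4, evidence="opinions_and_themes"),
    Idea("引导向导", impact=5, ease=6, evidence="test_results"),
    Idea("竞品同款功能", impact=6, ease=5, evidence="anecdotal_data"),
]

ranked = sorted(ideas, key=lambda idea: idea.ice, reverse=True)
for idea in ranked:
    print(f"{idea.name}: 信心 {idea.confidence:g}，ICE {idea.ice:g}")
```

按这个草图排序，只有观点支撑的想法即使影响估得很高，也会排在有测试结果支撑的想法之后。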

Lenny： 这是一个非常重要的观点。另一个重要的观点是，产品经理工作的很大一部分就是说”不”，阻止愚蠢的事情发生——这是一个非常好的工具来帮你做到这一点。就是说，好吧，你有这个想法，我们现实一点——我们对此有多大信心？好的，这要花我们三个月来做这件事。也许我们应该考虑别的，也许我们应该在真正投入之前先把信心等级提上去。

Itamar Gilad： 是的。这是我经常听到的真实使用场景，有些人用这个来做一个……客观而温和地说”不”的方式。或者说我们会考虑，但看看我们的其他想法，它们的影响和信心是怎么排的。

Lenny： 经典的产品经理招数：“这个想法很好，但看看这个更好的想法怎么样？”回到我们开头聊过的一点。假设你有一个非常聪明且经验丰富的创始人，甚至是在初创公司里，你其实没有时间为想法构建大量证据。是该花时间建立信心，还是干脆相信他们的想法确实很好、直接上手看看效果？在这两者之间如何取舍，你有没有不同的看法？

交付速度与发现速度的权衡

Itamar Gilad： 交付速度和发现速度之间始终存在一种权衡，这实际上引出了下一个层面：我们如何将二者结合起来？因为人们倾向于认为这是一个非此即彼的选择：要么我们非常快速地构建，要么我们在学习、但构建得非常慢。但我认为我们用错了衡量标准。衡量标准不是你能多快把代码推上线。当存在大量不确定性的时候（我们所有人都面临不确定性，初创公司尤其如此），关键不在于把代码推上线，而在于把正确的代码推上线。关键在于创造你所需要的成果、所需要的影响，所以衡量标准是通往成果的时间。我认为，与基于意见的方法相比，证据引导方法产生的影响要大得多，速度要快得多，资源效率也高得多。因为基于意见的方法往往会浪费你更多的资源，构建错误的东西，或者学得太晚。而证据引导帮助你更早地学习。

另外，认为”如果你在学习就不能构建”是一个谬误。优秀的团队知道如何同时做到这两点，而这正是步骤层旨在教你、帮助你做到的事情。

不同阶段公司的应用方式

Lenny： 太好了。那么也许来闭合这个话题——假设听众中有人在更大的公司工作，比如 Netflix，对比处于 A 轮、B 轮的初创公司。你会建议他们以不同的方式来应用这些方法吗？对于不同阶段的公司，有没有什么指导性的建议？

Itamar Gilad： 当然。我认为我们讨论过的北极星指标这个概念，创造的价值与捕获的价值之间的区分，对每家公司都非常重要。构建完整的指标树，在早期阶段可能有些过度；做沉重的加权 OKR 也可能有些过度。早期阶段的公司甚至不知道自己如何创造价值，所以他们需要迭代，他们的目标真正是找到产品市场契合。在此之后，你需要开始构建你的商业模式。这就是你的目标，你朝着这个方向迭代，你需要为此设定指标。然后当你进入规模化阶段，你需要尝试建立秩序，因为当你扩大规模时……所有这些在书里都有涉及，有一个专门的章节专门讨论这些问题。当你扩大规模时，你会有很多人、很多钱，所有事情同时在发生。所以在那里你需要一种非常系统化的方式来评估想法。顺便说一下，像 Netflix 这样的公司，我不知道他们是否需要这个具体方法。他们非常——

Lenny： 对，也许那个例子不太好。他们可能已经做得很好了。

最受益的两类公司

Itamar Gilad： 顺便说一下，我发现有两类公司特别受益于这个方法。一类是那些正在逐步进入现代产品开发的公司。他们有产品团队，有产品经理，有 OKR，开始做敏捷开发，开始做实验，但他们在将这一切整合在一起方面遇到了困难。每个 CPO 都在构建自己的小框架。另一类是那些曾经是证据引导但后来倒退了的公司，这种情况发生的频率太高了。管理层变动，文化变动，然后突然之间他们需要重新发现、重新点燃那种曾经失去的精神，就像 Google+ 那样。所以一些对这个方法反应最强烈的人，出人意料地正是在这些公司里。

Lenny： 我很喜欢你的框架以及我们讨论的所有这些东西——你几乎可以把它们看作一整套工具箱，帮助你的公司变得更加证据引导。你可以从信心仪表盘开始，可以开始更多地使用 ICE，可以开始使用指标树——所有这些东西都在一步步推动你更加证据引导，你不必一次性全部采用。

Itamar Gilad： 完全同意。我建议你不要试图一次性全部采用，因为如果变革规模太大，你会感到疲惫，你只是给很多人带来了很多流程，却看不到结果，一个季度之后就会放弃。所以你刚才建议的正是正确的方法。

从哪里开始

Lenny： 如果人们想从更多以意见为导向转向更多以证据为导向，你会建议首先做什么？这些框架或模型中你会优先推荐哪个？

Itamar Gilad： 我建议他们先在内部讨论，目前面临的最大问题在哪里。如果目标不清晰，存在不一致，我们一直在追错的东西——从目标层开始。尝试确立你的北极星指标、你的顶层业务指标、你的指标树，开始为各团队分配各自的责任领域。如果你花大量时间在争论上，不断争执、反复改变主意——从想法层开始，建立影响-信心评分或任何你喜欢的优先级模型，但要把证据纳入其中。我认为信心仪表盘是一个不论在什么情况下都值得使用的好工具。如果你构建得太多、学习得太少——开始采用我们还没看到的步骤层。如果你的团队参与度很低——你有一种团队，开发人员非常投入于敏捷、非常注重质量、非常注重发布——从任务层开始着手。

Lenny： 太好了。好，我们继续。

步骤层：在学习的同时构建

Itamar Gilad： 好，那么步骤。步骤就是帮助我们同时学习和构建，正如我们所说的。我看到的一个模式是，组织不知道他们其实可以以低得多的成本来学习。他们相信需要构建一个精心设计的 MVP——而这个 MVP 实际上一点也不”最小化”——然后发布它，然后他们才会发现结果，本质上这就是我们 20 年前所谓的 beta 版，只是换了个名字。我在步骤层试图做的，就是帮助公司意识到，验证你的想法——或者更具体地说，验证你想法中的假设——有一个完整的范围。我为此创建了一个小模型，叫做评估、事实调查、测试、实验和发布结果。但同样，这只是把更聪明的人发明的东西组合在一起。在评估阶段，你有非常简单的事情，不需要大量工作。你检查它是否与目标一致——你手头的这个想法。

你可能做一些商业建模，做 ICE 分析，做 Assumption Mapping（这是 David J. Bland 的一个很好的工具），或者你与利益相关者一对一交谈，看看是否存在风险等等。这些通常花费不大，但能让你对想法的影响和简易度了解很多。下一步是挖掘数据，这通常与评估同步进行。你可以通过数据分析、问卷调查、竞品分析、用户访谈和实地研究（观察你的用户）来获取数据。显然最后两种方式花费较高，所以最好不要等到有了想法才开始做研究。最好是持续不断地做研究，这样你就有一些数据可以依赖，可以将你的想法与之对比。但到目前为止我们还没有构建任何东西。现在你准备开始测试了：构建产品的版本，放到用户面前，衡量结果。但最初你不需要构建任何东西，你只需要模拟它。

低成本验证方法

Itamar Gilad： 你可以做假门测试（fake door test）、冒烟测试（smoke test）、Wizard of Oz 测试、礼宾测试（concierge test）、可用性测试。顺便说一下，我们在标签式收件箱项目中大量使用了这些方法，最早的一个版本实际上是我们向人们展示标签式收件箱的工作效果。但这并不是真正的 Gmail，只是一个 HTML 的外壳。在幕后，根据用户授予我们的权限，我们的一些人手动把邮件的主题和发件人移到了正确的位置。所以一开始访谈者先转移他们的注意力，然后向他们展示收件箱，其中前 50 封邮件被分到了正确的位置（如果我们没搞错的话）。人们的反应是：“哇，这真的很酷。”这给了我们很多证据。

Lenny： 这个故事太棒了。所以那是在用户研究阶段，并没有真正向用户发布？是手动逐个操作的？

Itamar Gilad： 没写一行代码，都是研究人员和设计师拼凑出来的。但它给了我们一些证据，让我们有理由说，我们应该尝试构建这个东西。

Lenny： 太喜欢这个故事了。

Itamar Gilad： 所以最初你先伪造它。中级测试则是构建一个粗糙版本，不完整，不精致，不可扩展，但足以让用户开始使用。这些就是早期采用者计划、alpha 版、纵向用户研究和 fish food。Fish food 是在你自己的团队上测试。

Lenny： Fish food？我没听过这个词。所以就是 dog fooding，但仅限于你自己的团队？

Itamar Gilad： 我觉得这是 Google 的说法，不过有人告诉我他们公司也用 fish food 这个名字。所以我就用了，不知道有没有更好的叫法。

Lenny： 我想知道为什么叫 fish food，因为是小小的？像是轻轻的小口咬？

Itamar Gilad： 可能是吧。嗯，我不知道。

Lenny： 哇，好的，太酷了。我学到了很多东西。

从测试到实验再到发布

Itamar Gilad： 下一个阶段是构建一个更完整的版本，然后你可以 dog food 它，把它交给内部用户使用。多年前我刚加入微软时，第一件注意到的事就是 Outlook 漏洞百出。我问大家怎么回事？他们告诉我，我们都在 dog food 尚未发布的下一个版本的 Outlook，这在硅谷是非常普遍的做法。你可以做预览版、beta 版、labs 实验，这些也都是测试。现在，有一类特殊的测试是实验（experiment），因为它们包含对照组。所以 AB 测试、多变量测试，这些都是实验。我使用的”实验”一词是数据科学家的用法，虽然人们倾向于把这里看到的一切都叫实验。最后，即使是发布，你也可以做分阶段发布、百分比上线、回退保留。所有这些方法都能帮助你进一步验证假设。有时你需要回滚并修改，但这又是一个学习的机会。

关键点是，你不必从右侧——也就是高成本的那端——开始。你可以从早期阶段开始，这样可以快速试探很多想法。你发现它们并不像你想的那么好，然后你可以把更多精力投入到好的想法上。如果它们产生了正面证据，你可以一步步深入，直到你觉得自己准备好交付为止。

Lenny： 好的。我们已经谈了目标，谈了想法，现在在谈步骤。关于步骤还有什么要补充的吗？然后我知道接下来是任务。

Itamar Gilad： 没有了，步骤就到这里。这中间还有很多内容，我们不会全部展开。

Lenny： 好的。我们来谈谈任务，你指什么。

两个世界的鸿沟

Itamar Gilad： 在很多组织里，存在着两个世界。一个是规划世界，基本上是管理者、利益相关者、一些 PM 坐在一起思考我们需要发布什么，在这里我们制定策略、路线图和项目。但猜猜谁没被邀请参加这个派对？那些真正干活的人。他们生活在敏捷世界里，非常专注于把工单移到完成状态，燃尽故事点，把东西推上线。这两个世界之间存在巨大的鸿沟。他们互不理解，意见不一，有时会积累很多不信任：对计划的不信任，或者管理者觉得团队就是效率不高。这些我们都见过，而常见的权宜之计就是把一个 PM 放在中间。PM 本应让这一切运转起来，像项目经理一样按路线图交付，用完美排序的产品待办列表和用户故事喂养敏捷机器。说实话，这根本行不通。

GIST 看板：连接规划与执行

我遇到的 PM 都非常疲惫，他们不得不花大量时间在计划会议和路线图讨论上，忙得没有时间做研究或测试想法。所以我建议改变这种状况，让开发者稍微走出他们的敏捷牢笼——如果你愿意这么叫的话。没有不尊重敏捷的意思，它是好东西，但让我们让他们做的不仅仅是开发，也让他们参与发现。我建议的工具之一——同样这也是一个流程——就是我所说的 GIST 看板。它基本上就是 GIST 的上面三层。目标在右侧，通常就是每个团队的关键结果，我建议不超过四个。所以每个团队创建一个 GIST 看板，然后是我们正在推进的想法——有时带有 ICE 评分——然后是我们可能要推进的接下来几个步骤，以验证这些想法。这是一个非常动态的东西。

它一直在变，团队负责人需要更新它，团队需要至少每两周围绕它开一次会，讨论当前的情况。我们还在跟进正确的想法吗？目标完成得怎么样？下一步是什么？什么阻碍了我们完成最重要的步骤？而这是如今不存在的一种讨论，因为大多数讨论发生在路线图层面，然后又有很多讨论发生在任务层面。但这个中间层——我们到底在试图达成什么、做得怎么样——是不存在的。如果你确实有了这个，你就在团队脑海中建立了更多的上下文，然后他们需要问你的问题就更少了，你需要告诉他们做什么也更少了。他们知道什么是成功，能够真正更多地自主行动。

Lenny： 理解 GIST 看板的方式，是把它看作你做路线图的方法，还是更像一个战略框架，帮助你思考为什么要进行广泛的优先级排序？

Itamar Gilad： 我通常是这样说的：在季度初，团队定义自己的目标。团队负责人定义目标，但他们会和团队一起审核，和管理层一起审核，当然也和利益相关者一起审核。大家达成一致：这些是你们最多四个关键结果和一两个需要推进的目标，团队不可能交付比这更多的东西。你把这些关键结果复制到 GIST 看板上，然后开始查看你的想法库，或者开始生成想法，说：我们怎样才能达成这些关键结果？

Lenny： 确认一下，你复制的是关键结果，把它作为目标？

Itamar Gilad： 是的，没错。你也可以把目标写在旁边，提醒大家我们在追求什么，但关键结果才是我们在这里展示的核心。然后你挑选一些想法，那些看起来最有前景的。听起来可能不太直觉，甚至有点反直觉，但我建议让团队来挑选这些想法。管理者或利益相关者可以提出想法，每个人都可以提出，但团队应该用 ICE 流程来——尤其是产品经理在这里非常关键——决定先测试哪些想法。然后团队一起需要确定我们应该执行哪些步骤，怎么验证？有些步骤由产品经理完成，有些由数据分析师完成，有些由用户研究员完成。但有些需要团队一起参与，会涉及一些编码、一些实验运行等，所以步骤层面也有一定的归属感。一个子团队负责每个步骤，我们会非常频繁地更新看板。

所以如果一个想法被证明不好，我们会把它从看板上拿掉，换成另一个想法；或者也许我们已经达成了目标，不需要再在这上面花时间了，可以专注于其他事情。所以从某种意义上说，它也是一个项目管理工具。

GIST 看板的核心：学习里程碑而非工程里程碑

Lenny： 很好。我看着这个看板，觉得也许最重要的部分是——步骤不仅仅是像”启动一个更好的引导流程”或”在引导中加一个步骤”这样的项目。你想强调的是，你要采取的步骤本质上是为了获得越来越多的信心，越来越多的证据引导思考，而不是仅仅说”好吧，我们来想办法把这个功能点子上线。”

Itamar Gilad： 完全正确。它不是工程里程碑，也不是设计里程碑，而是学习里程碑。所以我们构建一些东西，在此过程中实际上也在扩大构建的范围。我们在过程中构建产品，同时也在学习，这两者必须齐头并进。

Lenny： 对于那些没有在 YouTube 上看视频的人，我们快速走一个例子。你这里的一个目标是平均引导时间，你希望平均引导时间少于两天，目前是五天半。那里的一个想法是引导向导，然后步骤是用模型做可用性测试，接着用原型做可用性测试，然后做 A/B 测试？

Itamar Gilad： 对，基本上是这样，而且你可以边走边调整。有时候你可以并行运行多个步骤，不一定是串行的。但基本上就是这个过程。
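
编者注：刚才这个例子可以用一个简单的数据结构来表示。下面是一个示意草图，类名和字段名都是为演示而假设的，并不是书中规定的格式；它只表达“关键结果、想法、验证步骤”三层之间的关系。

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Step:
    description: str
    done: bool = False


@dataclass
class Idea:
    name: str
    steps: List[Step] = field(default_factory=list)

    def next_step(self) -> Optional[Step]:
        """返回第一个尚未完成的验证步骤；全部完成则返回 None。"""
        return next((s for s in self.steps if not s.done), None)


@dataclass
class KeyResult:
    metric: str
    current: float
    target: float
    ideas: List[Idea] = field(default_factory=list)


wizard = Idea("引导向导", steps=[
    Step("用模型做可用性测试"),
    Step("用原型做可用性测试"),
    Step("A/B 测试"),
])
board = [KeyResult("平均引导时间（天）", current=5.5, target=2.0, ideas=[wizard])]

# 第一步完成后，下一个待执行的步骤变成原型测试
wizard.steps[0].done = True
upcoming = wizard.next_step()
print(f"{board[0].metric}: {board[0].current} -> {board[0].target}，下一步：{upcoming.description}")
```

如果某一步的结果不理想，团队可以把这个想法从 `ideas` 列表里移除，换上另一个想法，关键结果本身保持不变。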

Lenny： 很好。所以再次强调，你作为团队要在这里突出的是——我们不会直接上线这个引导向导，然后再慢慢摸索。而是我们要提前规划好将要采取的步骤，一步步建立越来越多的信心。这是我们应该持续投入越来越多的东西，这真的很有意思。

Itamar Gilad： 是的，还有一件有趣的事——每次你运行一个步骤，如果成功了，你就有了证据，可以拿回去给管理者看，跟他们分享，说：“这个想法我们本来觉得很好，但得到了这样的结果。你们觉得这意味着什么？“有时候提出这个想法的管理者会说：“我觉得测试失败了，我们重新跑一次。“有时候他们会说：“也许它没有我想的那么强。“讨论由此变得更加细致、更加客观。

GIST 看板与路线图的关系

Lenny： 也许作为这个框架的收尾。这和人们可能在电子表格或 Jira 或 Asana 里维护的路线图是什么关系？它是叠在那上面的？还是替代其他地方的路线图？

Itamar Gilad： 我觉得发布路线图——就是你写”Q3 之前我们要上线这个”或者”十月之前我们必须上线那个”——那种路线图和这个是冲突的。如果你那样做，人们知道目标是十月上线那个东西，那就别谈学习、别谈证据引导了。我建议使用成果路线图（outcome roadmap），说”到十月我们想达成这个成果”。到 Q4 我们想在另外三个国家上线，或者我们想在那之前把印度的使用量增长多少，到这个时候我们需要解决流失问题，至于怎么达成——有时候我们确实有一个具体想法，是高信心、已经测试过的，那就切换到交付模式，可以把它放上路线图，说”对，我们会构建这个东西，目标是十月”。但其他时候你想保持开放，而路线图如果你在低信心的时候就提前决定某个特定想法必须在某个时间上线，它反而会扼杀这个过程。

Lenny： 好的。所以你在提议人们把路线图的实践切换到这种方式，这非常大胆。我很喜欢。

Itamar Gilad： 嗯，这不是路线图。这只是团队用来管理项目的工具，但我在书里确实有一个成果路线图的方案。

如何开始实践

Lenny： 好，很好。我本来想问，如果人们想尝试这种方法，书是充分理解这个框架及其实施方式最好的途径吗？

Itamar Gilad： 那是一种方式。我有文章，网站上也有资源，但我尽量把我们刚才讨论的很多内容以更细致的方式浓缩在书里。所以如果你对此感兴趣，不妨读一读。

Lenny： 好。也许快速谈谈 OKR 的话题。OKR 怎么和这一切联系起来？听起来大致上你假设人们会继续按”这是我们的指标或关键结果或目标”的方式工作，然后那套东西接入 GIST 框架。

Itamar Gilad： 指标树，加上你的使命，加上各个团队的单独使命，基本就能提供你填写 OKR 所需的大部分内容。当然还有一个对齐的过程——自上而下、自下而上、横向之间——我在书里也谈了一些。OKR 是一个非常丰富的话题，但那些东西通常是核心。通常还有一些关于公司健康度、产品健康度等的 OKR，叫做补充 OKR（supplementary OKR），我也讨论了这些。所以是的，我认为 OKR 如果你喜欢的话是一个有用的工具。

Lenny： 再拉远一点看。基本上你不需要把所有这些想法一口气全盘接受，改变整个业务的工作方式。你可以从挑选其中一些想法开始，逐步变得越来越证据引导。听起来这个 GIST 看板可能不是你起步的地方，也许是在你积累了越来越多使用这些工具的经验之后才用的——或者你来告诉我。你有时候会直接从这种路线图和规划的思考方式开始吗？

Itamar Gilad： 所以可能不是完整的看板，因为你缺少一些组成部分，也许你的目标还不够好，或者想法优先级做得还不够好。但如果你的团队非常、非常以交付为中心——有时候情况恰恰相反，管理者在告诉他们怎么构建——你想打破这种动态，你想创建一个步骤 backlog。所以不是产品 backlog，而是创建一个步骤的 backlog，就是验证步骤、beta 版、预览版等等，这会相当强烈地改变动态。

闪电问答环节（续）

Lenny： 有哪两三本书是你最常推荐给别人的？

Itamar Gilad： 我要耍点小聪明，我要推荐的是系列书，所以是两个系列。一个是——

Lenny： 允许耍小聪明。

Itamar Gilad： 好，太好了。第一个，这些都是比较显而易见的。一个是 SVPG，也就是 Silicon Valley Product Group 出版的系列。《INSPIRED》《EMPOWERED》，现在好像《TRANSFORMED》也出了，我还没读，但我肯定它很棒。这是 Marty Cagan 和他的同事们写的，都是非常好的书，每个产品经理都应该读。另一个系列稍微老一些，是精益系列，《精益创业》《Lean Enterprise》《Lean Analytics》，这些书里都有真金白银，《Lean UX》也是，非常重要非常好的书，我觉得它们没有得到应有的重视。《Running Lean》也是一个例子。

影视与学习

Lenny： 最近最喜欢的电影或电视剧是什么？

Itamar Gilad： 我其实不算什么影视迷，随便放什么就看什么。我发现 YouTube 实际上正在成为我获取信息和娱乐的来源之一。我最近在学西班牙语，所以发现了一个叫 Dreaming Spanish 的频道，如果你在学西班牙语的话，这个频道太棒了。这就是我的推荐。

Lenny： 这个选择很独特，我很喜欢。你最喜欢问候选人的面试问题是什么？

Itamar Gilad： 我喜欢让他们为一个小众受众群体设计一个东西。比如为老年人设计一个导航系统，或者为视力障碍人群设计某种笔记本电脑等等。这类问题能很好地观察他们的用户同理心、创造力、评估多个想法的能力，以及发现自身想法缺陷的能力。所以有很大的空间可以深入挖掘，观察这个人作为产品人的思维方式。

AI 与产品发现

Lenny： 最近发现并喜欢的某个产品是什么？

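Itamar Gilad: It’s a bit of a cliché, but it’s AI. There’s a company called ElevenLabs that does voice. It’s the best synthetic voice you’ve ever heard, and they can also replicate your own voice, so you can create a voice signature. If you’re American, you can use their free default version or a cheap version to replicate your own voice, and if you need to narrate an audiobook or record an online course, that’s very useful. So I find this service really interesting.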

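Lenny: This is all part of my grand retirement plan: put all these components together and eventually replace me. There’s AI-generated content, and now this voice thing. I love it, it’s all happening.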

Itamar Gilad: There’s already an AI version of you, right? I can ask it questions...

Lenny: Oh, there’s lennybot.com.

Itamar Gilad: Right.

Lenny: All part of the plan. Okay. What’s a favorite life motto that you repeat to yourself most often and share with others?

A Motto for Life

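Itamar Gilad: That’s a big question. Einstein said, “Try not to become a man of success, but rather try to become a man of value.” I think that’s an excellent motto for individuals and for companies alike. It guides me, and the whole idea of value exchange and so on relates to it to some degree.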

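Lenny: I love that quote, and it matters a lot for anyone putting content out online. Many people just think, “I want to be successful, I want more followers,” so they post all kinds of tweets and show off all kinds of things, but what actually works is delivering value: creating something valuable that people really need and want. I think one signal is whether you yourself find it interesting and valuable. If you think, “Wow, this is really interesting,” other people usually will too. So I really like that quote. Great choice; I’m going to look up the original. Two more questions. What’s the most valuable lesson you learned from your mom or dad?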

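Itamar Gilad: Both of them, each in their own way. The work they did was fairly modest, teaching or other things, but they always tried to do their best and deliver as much value as they could. So it connects in a way, and maybe that’s the lens I see the world through. But they taught me to strive to be the best in my own field.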

Taste of Home

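Lenny: Last question. You’re Israeli, which some people may not be able to tell. What’s your favorite Israeli food, something everyone has to try or should eat if they get the chance?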

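Itamar Gilad: When I get to Israel, I usually head straight for shawarma, which is similar to döner kebab, if you know it, only tastier. So if you’re in Israel, and if you go to Haifa, the city where I grew up, definitely try the shawarma there.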

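Lenny: Awesome. Itamar, I hope people got the gist of your book from our conversation. What’s the best way to find the book? What’s the best way to learn more about you and get in touch? And how can listeners be useful to you?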

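Itamar Gilad: To find the book, go to itamargilad.com or evidenceguided.com; you’ll find the book there, and me too. The most valuable thing for me is that you try it out. Take some of these ideas back to your office, talk with your colleagues, and ask, “What do you think we should do?” Give it a try, then come back to me; I’m easy to find through my website. Tell me what happened. I’m really interested in hearing about it.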

Lenny: Great. Itamar, thank you so much again for being here.

Itamar Gilad: Thank you.

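Lenny: Bye, everyone. Thank you so much for listening. If you found this episode valuable, you can subscribe to the show on Apple Podcasts, Spotify, or your favorite podcast app. Also, please consider giving us a rating or leaving a review, as that really helps other listeners find the podcast. You can find all past episodes or learn more about the show at lennyspodcast.com. See you in the next episode.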

Glossary

English term	Chinese rendering
activation rate	激活率
Assumption Mapping	Assumption Mapping (a tool created by David J. Bland; kept in the original)
build-measure-learn loops	构建-测量-学习循环
concierge test	礼宾测试
confidence	信心
CPO	CPO (abbreviation of Chief Product Officer; kept in the original)
David J. Bland	David J. Bland (personal name; kept in the original)
design thinking	设计思维
dog fooding	dog fooding (a team using its own product internally)
Dreaming Spanish	Dreaming Spanish (channel name; kept in the original)
döner kebab	土耳其烤肉
ease	简易度
ElevenLabs	ElevenLabs (company name; kept in the original)
evidence guided	证据引导
fake door test	假门测试
fish food	fish food (product testing inside the owning team; a term that comes from Google)
GIST	GIST (acronym for Goals, Ideas, Steps, Tasks; not translated)
GIST board	GIST 看板
growth hacking	增长黑客
Haifa	海法 (a city in Israel)
ICE	ICE (acronym for Impact, Confidence, Ease; not translated)
impact	影响
lean startup	精益创业
Marty Cagan	Marty Cagan (personal name; kept in the original)
meta framework	元框架
metrics trees	指标树
MVP	MVP (abbreviation of Minimum Viable Product; kept in the original)
North Star metric	北极星指标
OKR	OKR (abbreviation of Objectives and Key Results; kept in the original)
output	产出
product discovery	产品发现
product market fit	产品市场契合
reach	覆盖面
RICE	RICE (acronym for Reach, Impact, Confidence, Effort; not translated)
roadmap	路线图
Ronny Kohavi	Ronny Kohavi (personal name; kept in the original)
Sean Ellis	Sean Ellis (personal name; kept in the original)
shawarma	沙瓦尔玛 (a Middle Eastern grilled-meat wrap)
siloed goals	竖井式目标
smoke test	冒烟测试
story points	故事点
SVPG (Silicon Valley Product Group)	SVPG (硅谷产品集团; abbreviation kept in the original)
top KPI	顶层 KPI
value exchange loop	价值交换循环
Wizard of Oz test	Wizard of Oz 测试


Becoming evidence-guided | Itamar Gilad (Gmail, YouTube, Microsoft)

Validating Gmail Tabs via Wizard of Oz

Lessons from Google+: Opinion-Based Development

Google’s DNA: From Evidence to Execution

The Birth of Gmail Tabs

The Evidence-Guided Company

When Are Top-Down Decisions Justified?

Presenting Data to Skeptical Leadership

Are You Truly Evidence-Guided?

Overview of the GIST Model

The Four Levels of GIST

Where Strategy and Vision Fit

What Questions Does GIST Answer?

Goals and Value Exchange Loops

The Metric Tree

Ideas Layer and ICE Scoring

Delivery Speed vs. Discovery Speed

Adapting for Different Company Stages

Which Two Companies Benefit Most?

Where to Begin

Steps Layer: Build While Learning

Low-Cost Validation Methods

From Tests to Experiments to Launch

The Gap Between Two Worlds

GIST Kanban: Connecting Planning and Execution

Learning Milestones Over Engineering Milestones

GIST Kanban and the Roadmap

How to Put It Into Practice

Resources and Lightning Q&A

Movies and Learning

AI and Product Discovery

A Motto for Life

Taste of Home

Glossary
