An inside look at X’s Community Notes | Keith Coleman & Jay Baxter
Opening Episode Preview
Lenny Rachitsky: The work that you guys do has had such a tremendous impact on the way the world works. I want to start with just giving people a brief understanding of what is Community Notes.
Keith Coleman: Someone on X can see a post. If they think it’s misleading, they can propose a note that they think other people might find informative. Other people can then rate that note.
Jay Baxter: We actually look for agreement from people who have disagreed in the past. And what we see is when people actually have that sort of surprising agreement, that’s what makes the notes so neutral and accurate and well-written, really, overall.
Podcast Episode Introduction
Lenny Rachitsky: There’s many people that are very polarized. How do you deal with people that are super anti-vax, super Jan 6?
Keith Coleman: One philosophical thing that’s important is that we want all of humanity to participate and sometimes people are surprised by that. We have all of humanity. We then have the data to understand what notes will be helpful to actual humanity. Every post is eligible for notes. We shouldn’t exempt Elon. We shouldn’t exempt government figures. We should be like everyone… Even advertisers can get notes.
Jay Baxter: There have been external studies run by people totally independent of us who have found that if you take a post with or without a Community Note, that actually people’s agreement with the core claims in the post does change if they see it with a note versus without.
The Official Interview
Lenny Rachitsky: Is there anything else along the lines of just working for Elon within an org Elon runs that might surprise people?
Keith Coleman: If I were to start a company in that company, it would be even leaner than I would’ve made it before. I’ve been amazed with just how much the team is able to accomplish with a small group and I think because of a small group-
What Are Community Notes
Lenny Rachitsky: Today, my guests are Keith Coleman, Product Lead for Community Notes, and Jay Baxter, Founding ML Engineer and Researcher for Community Notes. This conversation may be my newest favorite podcast episode so far. Community Notes is one of the most impactful and clever and, also, underappreciated products in the world right now.
If you ever use X/Twitter and you see a note underneath a tweet correcting the misinformation in that tweet, that is Community Notes. I’ve never heard a deep dive into the story behind the product and the team that built it and I’m excited to bring you just that. We get into the surprising origin story of the product, how the algorithm actually works, how the algorithm emerged out of an internal contest within Twitter, the principles behind Community Notes, and why staying true to them has been so key to its success. Also, how it survived four different leaders, including Elon and Jack, and why it’s now a big part of the solution to solving misinformation on the internet. Including recently being adopted by Meta as their main fact-checking tool. This is an incredibly special episode and I’m so excited to bring it to you.
If you enjoy this podcast, don’t forget to subscribe and follow it in your favorite podcasting app or YouTube. Also, if you become a subscriber of my newsletter, you now get a year free of Notion and Superhuman and Granola and Linear and Perplexity Pro. Check that out at lennysnewsletter.com.
Today, hundreds of companies are already powered by WorkOS. Including ones you probably know like Vercel, Webflow, and Loom. WorkOS also recently acquired Warrant, the Fine-Grained Authorization service. Warrant’s product is based on a groundbreaking authorization system called Zanzibar, which was originally designed for Google to power Google Docs and YouTube. This enables fast authorization checks at enormous scale while maintaining a flexible model that can be adapted to even the most complex use cases.
If you’re currently looking to build role-based access control or other enterprise features like single sign-on, SCIM, or user management, you should consider WorkOS. It’s a drop-in replacement for Auth0 and supports up to 1 million monthly active users for free. Check it out at workos.com to learn more. That’s workos.com.
And now, product leaders can get even more visibility into customer needs with Productboard Pulse, a new voice of customer solution. Built-in intelligence helps you analyze trends across all of your feedback and then dive deeper by asking AI your follow-up questions. See how Productboard can help your team deliver higher-impact products that solve real customer needs and advance your business goals. For a special offer and free 15-day trial, visit productboard.com/lenny. That’s productboard.com/lenny.
Keith and Jay, thank you so much for being here. Welcome to the podcast.
Keith Coleman: It’s great to be here. [inaudible 00:05:25].
Jay Baxter: Thanks for having us on.
How the Algorithm Works
Lenny Rachitsky: It’s so my pleasure. I’m so thrilled to be having this conversation. The work that you guys do has had such a tremendous impact on the way the world works. So many product teams are always talking about driving impact and want to drive impact. You guys have actually built things that have changed the world in meaningful ways and continue to do that. And I’ve never really heard the backstory of how Community Notes came to be and how it works and all these things, so I’m really appreciative of you guys making time to chat.
Principles of Open Participation
Keith Coleman: Yeah. First, thanks for saying that. That’s why we built this thing is to help people and it’s great to hear it. It’s great to see people enjoying it and finding it useful.
Fun Community Notes Examples
Lenny Rachitsky: I want to start with just giving people a brief understanding of what is Community Notes. I think a lot of people kind of heard about it, kind of maybe see it on X. As they scroll through, they see these notes but they’re like, “I don’t actually know what this is.” So can you just briefly describe what is Community Notes?
Keith Coleman: Community Notes is a way for the people, like the public, to add context to posts that might be misleading. The basic way it works is that someone on X can see a post. If they think it’s misleading, they can propose a note that they think other people might find informative. Other people can then rate that note. And if the note is found helpful by people who normally disagree with each other, indicating that it’s probably accurate, it’s probably really neutrally-worded, it’s probably informative, then it will show to everyone on X. The goal is just to get people more information about what they’re seeing so they can make better decisions in their lives.
What Triggers Community Notes
Lenny Rachitsky: Amazing, and I think hearing this, it’s absurd that this works. I think when people originally heard this idea like, “No way this is going to work.” And so, just to dive a little bit deeper, can you give us a deeper understanding of how it actually works? Because I think it’s the algorithm that you guys designed that is so clever that allowed this to work. So talk a little bit about that algorithm.
Jay Baxter: Yeah. So I think a key misunderstanding a lot of people have if they haven’t really dived into details, they just think that maybe someone can write a note and it appears immediately or we’re just taking a majority rules vote of who thinks the note’s good. I think both of those approaches would probably lead to biased or inaccurate notes. I think the key thing, really, that we do is we actually look for agreement from people who have disagreed in the past.
And what we see is when people actually have that sort of surprising agreement, that’s what makes the notes so neutral and accurate and well-written, really, overall. It’s just that people who are very polarized, overall, often can’t find agreement when things aren’t accurate, right? I think it also provides some good anti-manipulation properties. I think people are often… If you said… I think back in 2020 before we started building anything here, whether this could work at all, I think a room of ML engineers would say, “Oh, you have to keep it closed source. People are going to be manipulating this all the time. You have to use ground truth labels from fact checkers. There’s no way that you could bootstrap the system without external labels.” But it turns out that you can do that with this kind of bridging-based agreement algorithm is what we call it.
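Editor's sketch: to make the contrast Jay draws concrete, the snippet below compares a majority-rules vote with a bridging-style rule. It is a toy under stated assumptions, not X's implementation: the two rater groups, the `Rating` type, and the simple per-group majority cutoff are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rating:
    group: str     # hypothetical label for a cluster of raters who usually agree
    helpful: bool  # did this rater find the note helpful?


def majority_rule(ratings):
    """Show the note if more than half of all raters found it helpful."""
    helpful = sum(r.helpful for r in ratings)
    return helpful > len(ratings) / 2


def bridging_rule(ratings):
    """Show the note only if a majority in every group found it helpful."""
    groups = {r.group for r in ratings}
    if len(groups) < 2:
        return False  # no cross-group signal yet
    for g in groups:
        votes = [r.helpful for r in ratings if r.group == g]
        if sum(votes) <= len(votes) / 2:
            return False
    return True


# A note that one large group likes and the other group rejects.
one_sided = [Rating("A", True)] * 6 + [Rating("B", False)] * 3
# A note that both groups mostly like.
bridging = [Rating("A", True)] * 3 + [Rating("A", False)] + [Rating("B", True)] * 3

print("one-sided note:", majority_rule(one_sided), bridging_rule(one_sided))
print("bridging note: ", majority_rule(bridging), bridging_rule(bridging))
```

Under majority rule the larger group can push a one-sided note through on its own; the bridging-style rule only passes the note that both groups rate as helpful.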
The Scale of Community Notes
Lenny Rachitsky: Okay. So just to summarize and make it super clear. It’s basically people… Someone writes a note. This information is false… What’s a good example, just as we talk about this, like a classic example?
Note Coverage and Growth
Keith Coleman: A really classic example is an AI generated image or an out of context image like, “Look what’s happening here.” But it’s actually from five years ago in a different country and a different topic or something-
User Feedback and Notifications
Lenny Rachitsky: Oh, man. I’ve seen this so many times where it’s like, “Look what’s happening in San Francisco,” and I’m like, “No, this is a whole different city and that’s not-”
Keith Coleman: Totally. Yeah.
Note Triggering Thresholds
Lenny Rachitsky: Yeah. Okay. So someone posts this AI image. Someone writes a note, “This is actually five years ago in a different city,” and this algorithm helps understand if this note is true and it’s just regular people doing this.
Jay Baxter: Yep. Regular people who have signed up to be Community Notes contributors. So there are a few checks, like you do have to have a verified phone number for instance. But yeah, at the end of the day, these are regular people. Not necessarily professional fact checkers or anything like that.
Note Approval Rates
Keith Coleman: And yeah, that was really important to us too. There was a question at the beginning, to the point Jay was making of like, “Did anyone think this was going to work?” Obviously, it was a crazy idea. We didn’t know if regular people were going to be able to do this task and certainly people had concerns about whether they would do it effectively.
Initially, some people inside the company were suggesting like, “Hey, why don’t you have journalists or some select group be the first participants?” But very specifically we were like, “No. We’re trying to move away from the idea of curated editorial decisions being made around this. This is supposed to be open to everyone.” So we very intentionally try to allow all humans in. People are randomly selected and that’s important to it feeling fair, feeling open, feeling trustable.
Handling Extreme Viewpoint Users
Lenny Rachitsky: Yeah. And again, it’s just like this sounds like the holy grail of understanding what is true and it actually works. And works so well that Meta recently, as you all know, decided to adopt this exact system for them instead of having tens of thousands of fact checkers reviewing things.
Jay Baxter: One distinction that I would make, which maybe can come off as nitpicky but I think is important, is Community Notes adds additional context. It’s not fact-checking necessarily, right? So there are cases where the post could be true. But maybe, it’s just misleading because there’s no context or there’s missing context. We cover those cases and I think that’s an important distinction. We just have the philosophy that users should be able to make up their own minds, right? Like, “Here’s extra context, take it or leave it,” right?
Lenny Rachitsky: Yeah. What I think about, you shared this with me, this example of a picture with a cat and somebody’s Community Note was just, “That’s a dog.” Or is it the other way around or that’s a-
Jay Baxter: Yeah. “A Palestinian boy shares his bread with a dog,” was the post and it’s a picture of this cat. So obviously, this particular note is not super necessary because it just says, “That’s a cat,” and links to a Wikipedia for cat. It’s a good example that the system is… This is not something a professional fact-checker or whatever or you think would need fact-checking. But it’s proof that the system is really run by the users at the end of the day and adds some comic relief, I guess. And the note is correct.
Lenny Rachitsky: Okay. It’s important.
Jay Baxter: Yeah.
Lenny Rachitsky: When does a post get triggered to even be considered for Community Note? Is there a threshold or is it just you can write a Community Note on anything and people decide what they would vote on? How does that work?
The Philosophy of Global Participation
Keith Coleman: So every post is eligible for notes and that was, again, another really important principle. It’s like, “We shouldn’t exempt Elon. We shouldn’t exempt government figures. We should…” Everyone, even advertisers, can get notes. So any post on the platform can get a note. And if you look in practice, you’ll see notes appearing on world leaders, on Elon, on ads, on media organizations, and on, obviously, just regular people using social media. But yeah, the idea is really that it’s an even playing field. For a note to be proposed, the person proposing it has to have earned the ability to write notes. So there is that aspect where you have to earn in to be able to do this. And the way you earn that ability is through your ratings by demonstrating the ability to help identify notes that are found helpful to a broad range of people. So basically, if you have an ability to see a note and recognize what’s helpful to a lot of people, then you have the ability to start proposing notes.
Lenny Rachitsky: I actually signed up to be on… What do you call these people? Note take-
Jay Baxter: Contributors.
Lenny Rachitsky: Okay. Contributors. Yeah. So I’ve been rating. I haven’t achieved-
Volunteer Motivation and Impact
Keith Coleman: Nice.
Lenny Rachitsky: I can’t write notes yet.
Keith Coleman: Yeah. It’s not super easy. It takes some effort.
Lenny Rachitsky: Are there stats you can share about the scale of Community Notes at this point, especially things that might surprise people?
Keith Coleman: Yeah. I mean, the service is growing rapidly, so there are hundreds of notes per day. And to put that into context, I saw some stats recently from someone at UC Berkeley saying there was something like 10 traditional fact checks a day. So in contrast, there’s hundreds of notes a day that are getting shown. They span a huge range of topics from, obviously, politics, news, out to entertainment, sports, gaming. Just whatever is going on that day.
In addition to there being hundreds of these individual notes, they can also be matched to multiple posts. So if someone writes a note on an image or a video, like let’s say it’s AI generated or something like that, that note will automatically be matched to all posts that contain the same image. So you can have a single note matching to thousands of posts. And over let’s say the last year, 2024, we had something like 95,000 notes that were seen about 30 billion times. That’s more than double the prior year. Prior year was something like 37K notes seen 14 billion times. So that rate is increasing dramatically when you think about 30 billion views, that’s a lot of information that is getting out there that might not have been out there otherwise, which is pretty cool. And part of the reason it is expanding like that is the contributor base is expanding. There’s something like 950,000 contributors around the world. That’s nearing a million people making this happen which is amazing.
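Editor's sketch: the figures Keith quotes can be checked with a few lines of arithmetic. The numbers below are the approximate ones from the conversation ("something like"), and the variable names are illustrative.

```python
# Approximate figures as quoted in the conversation.
notes_2024, views_2024 = 95_000, 30_000_000_000
notes_prior, views_prior = 37_000, 14_000_000_000


def views_per_note(notes, views):
    """Average number of times each shown note was seen."""
    return views / notes


growth_notes = notes_2024 / notes_prior
growth_views = views_2024 / views_prior

print(f"2024:       about {views_per_note(notes_2024, views_2024):,.0f} views per note")
print(f"prior year: about {views_per_note(notes_prior, views_prior):,.0f} views per note")
print(f"notes grew {growth_notes:.2f}x, views grew {growth_views:.2f}x")
```

On these rounded numbers, both notes shown and total views more than doubled year over year, which matches the "more than double" claim. The average of roughly 300,000 views per note is plausible given that a single note can be matched to many posts.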
Lenny Rachitsky: Wow. And I’m one of those, right? I count as a contributor?
Keith Coleman: Yeah. Yep. No. If you’re signed up as a contributor, you count.
Lenny Rachitsky: Okay. Cool.
Massive Impact on Content Distribution
Jay Baxter: Then, there’s more people on the waitlist too. So there’s plenty of headroom for more growth. Regarding the matching on media and URLs, I think that’s a huge way to get extra coverage. Also, I do think we’ve been very careful to make sure that those matches are precise. Because I think one thing that people love about Community Notes compared to other types of fact checking is that, actually, the notes are custom written for the particular claim you’re seeing, right? So often, a fact check warning would just say something like, “Get the facts here.” And then, there’s a link to some generic page about voting information, which is so not helpful to have the information behind a click. So pulling the context up so that you have zero clicks that you need to make and keeping it specific is so important.
Lenny Rachitsky: One feature I love that I imagine you guys thought deeply about is if I liked the post in the past, I get notified later if a community note shows up, so that I’m not remembering this false information.
Joining the Twitter Team
Keith Coleman: Yeah. I mean, we try to make notes as fast as we can, so we want them to appear instantly if possible. But inevitably, there’s going to be a time gap between when a post goes live and when people figure out what’s going on and when they get the note out there. And so, we send those notifications to try to close that gap. And yeah, we get a lot of love for that. We see people take screenshots and share them. They’re excited about it. And it’s also a pretty cool example of something you can do on the internet, in the social media world that was difficult in a print or standard news world where you would see maybe a correction the next day in a corner of a paper that was hard to read. Here, you’re getting a ping about it if you’ve engaged with a post and a note shows up.
Lenny Rachitsky: One user feedback point is I’d love the push to just tell me, “Here’s what you got wrong.” Because I find that I actually have to go into it and read it and I feel like the push could just be like, “Here’s more context to this thing.” You’re like-
Jay Baxter: Agreed.
Leaving Management for New Directions
Keith Coleman: We’ll go take a look at that-
Twitter’s State in 2016
Lenny Rachitsky: There we go. Live user feedback.
Keith Coleman: Nice.
From Birdwatch to Community Notes
Lenny Rachitsky: Okay. I want to get into the origin story of this whole thing. But two more questions, because we’re on this thread. One is what’s the threshold for a note to show up on a post? Is that information you can share, just how does that work?
Jay Baxter: So just because of the details of the way the algorithm works, it uses this machine learning algorithm called matrix factorization where we fit it with gradient descent and whatnot. The threshold is it’s 0.4 on this made-up scale-
Lenny Rachitsky: 0.4. Great.
Jay Baxter: Yeah. I mean, in practice, what it means is basically a majority of people… If there is a polarized divide relevant to the notes. Obviously, some notes are not about politics or something polarizing. But if there is, then a sizable majority of people on both sides would generally need to find the note helpful. And then, there are other rules that come into play beyond that main one. So even if it’s above that threshold, it might get filtered out if… There’s a separate algorithm that’s looking at agreement between people’s incorrect tags. So like maybe people found the note helpful but incorrect, right? It happens. And in those cases, it doesn’t matter if it’s above the helpfulness threshold.
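Editor's sketch: the snippet below illustrates the general idea of matrix factorization fit with gradient descent, where a note's score is its learned intercept after a one-dimensional "viewpoint" factor has absorbed the polarized part of the ratings. It is a minimal toy, not the production algorithm. The model form (rating ≈ global mean + rater intercept + note intercept + rater factor × note factor), the regularization weights, the learning rate, and the toy ratings are all assumptions for illustration; only the 0.4 threshold comes from the conversation.

```python
import random

# Toy data: (rater, note, rating), 1.0 = helpful, 0.0 = not helpful.
# Raters a1/a2 and b1/b2 stand in for two groups that usually disagree.
RATINGS = [
    ("a1", "bridging", 1.0), ("a2", "bridging", 1.0),
    ("b1", "bridging", 1.0), ("b2", "bridging", 1.0),
    ("a1", "one_sided", 1.0), ("a2", "one_sided", 1.0),
    ("b1", "one_sided", 0.0), ("b2", "one_sided", 0.0),
    ("a1", "other_side", 0.0), ("a2", "other_side", 0.0),
    ("b1", "other_side", 1.0), ("b2", "other_side", 1.0),
]


def fit_note_intercepts(ratings, steps=3000, lr=0.02,
                        lam_intercept=0.15, lam_factor=0.03, seed=0):
    """Fit the assumed model by full-batch gradient descent; return note intercepts."""
    rng = random.Random(seed)
    raters = sorted({u for u, _, _ in ratings})
    notes = sorted({n for _, n, _ in ratings})
    mu = 0.0
    b_u = {u: 0.0 for u in raters}                       # rater intercepts
    b_n = {n: 0.0 for n in notes}                        # note intercepts (the score)
    f_u = {u: rng.uniform(-0.1, 0.1) for u in raters}    # rater viewpoint factors
    f_n = {n: rng.uniform(-0.1, 0.1) for n in notes}     # note viewpoint factors

    for _ in range(steps):
        # Start each gradient with its L2 regularization term.
        g_mu = lam_intercept * mu
        g_bu = {u: lam_intercept * b_u[u] for u in raters}
        g_bn = {n: lam_intercept * b_n[n] for n in notes}
        g_fu = {u: lam_factor * f_u[u] for u in raters}
        g_fn = {n: lam_factor * f_n[n] for n in notes}
        # Add the squared-error gradient contributed by every rating.
        for u, n, r in ratings:
            err = mu + b_u[u] + b_n[n] + f_u[u] * f_n[n] - r
            g_mu += err
            g_bu[u] += err
            g_bn[n] += err
            g_fu[u] += err * f_n[n]
            g_fn[n] += err * f_u[u]
        mu -= lr * g_mu
        for u in raters:
            b_u[u] -= lr * g_bu[u]
            f_u[u] -= lr * g_fu[u]
        for n in notes:
            b_n[n] -= lr * g_bn[n]
            f_n[n] -= lr * g_fn[n]
    return b_n


THRESHOLD = 0.4  # the value Jay mentions; it only has meaning on the model's own scale

intercepts = fit_note_intercepts(RATINGS)
for note, score in sorted(intercepts.items(), key=lambda kv: -kv[1]):
    print(f"{note:12s} intercept={score:+.3f} clears_threshold={score >= THRESHOLD}")
```

The point of the sketch is the ordering, not the absolute values: the note that both groups rate helpful ends up with a higher intercept than either one-sided note. Whether any toy note clears 0.4 depends entirely on the made-up data and hyperparameters, which echoes Keith's remark that the number is arbitrary and would have to change if other parts of the algorithm changed.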
Lenny Rachitsky: This is probably the wrong way to think about it, but is it 40% of people that normally disagree, agree-
Jay Baxter: No.
Lenny Rachitsky: Okay. It’s-
Jay Baxter: It means nothing like that. It’s just like on some arbitrary scale-
Lenny Rachitsky: Okay.
Jay Baxter: Yeah.
Returning to a Small Team
Keith Coleman: Yeah. If we change random other things about the algorithm, that number would also have to change to an equally seemingly arbitrary number. We arrived at some numbers like that by gauging user feedback. So we could share a lot of notes with people, get feedback on which ones are helpful, and just a line emerged about indicating where things go from questionable to pretty clearly helpful.
Jay Baxter: Yeah. And it is set right now, by the way, to be really conservative, I think. We just are pretty particular about quality and we really want note quality to be really high. I think Keith and I both believe that we live or die based on the quality of the notes at the end of the day. So we’d rather not show a note that may be good, but we didn’t have enough signal on than the other way around.
Thermal: The Isolated Innovation Team
Lenny Rachitsky: That makes so much sense. I’ve never seen a Community Note that is wrong and breaking that promise is a big deal. So I completely get why you guys are super conservative there. Okay. Two more questions [inaudible 00:19:53] because I’m just curious. These weren’t on my list of questions to ask, but I feel like people wonder this. How many notes are written versus end up showing up and triggering on a-
Keith Coleman: We probably show about 8% of notes that get proposed. It’s been between, let’s say, 7% and 10% or 11%, something like that over time. The number can vary a little bit. And as Jay said, there are undoubtedly… And you can see it, there’s clearly more good notes than we show, but the goal is to hold a really high bar. We want to show a note when it’s going to be helpful, when it’s not going to appear biased and undermine trust in the system. We want these to be neutral, informative, helpful. And as Jay was saying, we view the worst possible mistake as showing a bad note because that’s going to undermine trust and the trust is why people like the product.
So yeah, the bar is there. And like I said, there’s clearly some in that remaining, let’s call it 90%, that are good. And then, there’s a lot that are just not that great and there’s some that are bad. And if you write one of these ones that are bad, bad being defined as people who normally disagree finding the note not helpful, so it’s like the inverse of the ones we show. If you write one that people who normally disagree find not helpful, you actually will ultimately lose your ability to write and have to earn it back. That other 90% is a mix. Sometimes people look at the number, they’re like, “Oh, why don’t you show more?” It’s like, “Well, you probably actually don’t really want us showing most of those.” The gold here is that the system is able to filter out the good ones.
Key Design Elements of Thermal
Lenny Rachitsky: That makes sense. Okay. One other question is there’s many people that are very polarized, like very disagreeable with a lot of things. How do they filter into this algorithm? How do you deal with people that are super anti-vax, super Jan 6, like all these very extreme potential views?
Jay Baxter: If people really are so polarized that there isn’t agreement among people that typically disagree, it’s possible that this is one of those notes that might be correct, but it wouldn’t be helpful to show as context. Maybe it’s about a claim that people have really entrenched opinions about and they’ve read hundreds of things about it already.
Probably this is just not going to improve people’s understanding. It’s just not going to be a helpful user experience. So it might not be the worst thing in those cases to not show the note. People, a few years ago, were pretty pessimistic that maybe fact-checking never changes people’s understandings about what’s true. Actually, there have been external studies run by people totally independent of us who have found that if you take a post with or without a Community Note, people’s agreement with the core claims in the post does change if they see it with the note versus without. So we are having an impact on this thing that people previously thought was maybe not so easy to do.
And so, it’s nice to focus on the cases where there is the bridging agreement. I would also say there is this reputation component to the algorithm as well. So if you consistently rate notes in a way that is counter to the bridging-based consensus, then we will stop counting your ratings. So if you’re the kind of person who constantly rates bad notes as helpful, we do filter you out. So there’s a difference between those types of people versus the good but polarized ones.
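Editor's sketch: the reputation component Jay describes can be pictured as a filter over raters. The snippet below is a hypothetical illustration only: the function names, the 0.5 cutoff, and the sample data are assumptions, and the real system's criteria are not described in this conversation.

```python
def rater_agreement(rater_ratings, note_outcomes):
    """Fraction of a rater's ratings that match each note's eventual outcome.

    Returns None when the rater has not rated any note with a known outcome.
    """
    scored = [(note, helpful) for note, helpful in rater_ratings.items()
              if note in note_outcomes]
    if not scored:
        return None
    matches = sum(helpful == note_outcomes[note] for note, helpful in scored)
    return matches / len(scored)


def counted_raters(all_ratings, note_outcomes, min_agreement=0.5):
    """Keep raters with no track record yet or with agreement at or above the cutoff."""
    kept = []
    for rater, ratings in all_ratings.items():
        score = rater_agreement(ratings, note_outcomes)
        if score is None or score >= min_agreement:
            kept.append(rater)
    return kept


# True means the bridging-based consensus found the note helpful.
note_outcomes = {"n1": True, "n2": False, "n3": True}

all_ratings = {
    "careful": {"n1": True, "n2": False, "n3": True},
    "polarized_but_fair": {"n1": True, "n2": True, "n3": True},
    "contrarian": {"n1": False, "n2": True, "n3": False},
    "newcomer": {},
}

print(counted_raters(all_ratings, note_outcomes))
```

In this toy, a rater who sometimes disagrees with the consensus still counts, while one who consistently rates against it is dropped, which mirrors the distinction Jay draws between "good but polarized" contributors and those who constantly rate bad notes as helpful.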
Early Team Composition
Keith Coleman: Yeah. I think one philosophical thing that’s important is that we want all of humanity to participate. And sometimes, people are surprised by that. They’ll be like, “Oh, aren’t there people who shouldn’t be doing this?”, or like, “Their thinking is so extreme or something, maybe they shouldn’t participate.” But our view is it’s actually we want to have all of humanity here. Because if we have all of humanity, we then have the data to understand what notes will be helpful to actual humanity. We can better model that, better understand it, and better show those notes.
So it’s advantageous to have people who have all sorts of points of views and we don’t expect that every note will be loved by every single person. That’s an impossible bar. But we do intend to show the notes that 80% of people are going to read and say, “Wow. I’m glad I knew that.” And so, in that sense, it doesn’t matter how maybe extreme someone views a person’s views as. It’s still great to have them in the program. So no matter what your views are, please sign up and participate. It helps identify what’s really helpful.
Lenny Rachitsky: Cool. And we’ll link to people if they want to actually sign up, so they know how to do this. Something we didn’t actually specify, these are all volunteers. No one’s getting paid to be doing these notes and voting, right?
Keith Coleman: Yeah. It’s totally based on intrinsic motivation and we think that’s a great reason to be doing it. When you talk to the most active contributors, a lot of them, they want to have better information out in the world and that’s a great motivation. So yeah, that’s why they… If you think about, like for these people, the impact they can have is nuts. So when we first launched US-wide, this was like in 2022, a note appeared on a White House tweet and the White House deleted the tweet and reissued an updated statement.
Imagine being the person who wrote that. You probably have 12 followers. Your posts probably get a couple likes. And here, you just put a note on the White House and they changed their public talking points based on what you did. That is an incredible amount of impact. So you could see why people are motivated to do it when they care about what’s going on in the world. You don’t have to be a big, well-known person to shape the discourse and information flow in a way that’s helpful.
Lenny Rachitsky: It’s insane. There’s so much to love about this. One is just the meritocracy of this whole operation of just anybody that is true and correct can participate and have impact. Also, it just shows you how much information we get that is just wrong. We had no idea how often we see things that are wrong and now we do.
Keith Coleman: Working on this product has made me realize just how many things I used to trust by default, that now I look at more skeptically.
Lenny Rachitsky: Definitely me these days. Okay. Before we get to the origin story, is there anything else along those lines you guys think might be really important to share, that are really interesting?
Algorithm Evolution: PageRank to Bridging
Jay Baxter: Sure. I guess one other thing is that although we don’t actually use the fact that a post was noted in the core ranking algorithm, which we think is a nice property, there is a really big impact just organically, meaning not from the algorithm but just from user behavior, where people will like and re-share or quote posts way less when notes are applied.
I don’t know, for people out there who typically run A/B tests on big platforms, you may already be familiar with this, but 1% is typically an awesome effect size for any algorithm change. We saw more like 30 to 40% engagement rate drops for likes and reposts in A/B tests we ran when showing a post with or without a note, which is just crazy big. That’s just an A/B test on the engagement rate, so that’s not the network effect. If you capture the overall network effect of how posts spread less by that person’s repost, basically if you look top line with a difference-in-differences approach, multiple different external research groups have consistently found that there’s a 50 or 60% drop in total reposts after a note is applied, which is just nuts. It’s having a really big impact on spread actually, too.
How the Team Operates
Lenny Rachitsky: That’s so great to hear. It’s what I would want to see and it’s incredible impact. Basically, an AI image of something false would just go crazy on Twitter, and did before Community Notes came out, and now what you’re saying is just adding that context, not actually… Like you’re saying, the algorithm doesn’t demote it. If there’s something incorrect, it’s just people are like, “Okay, this is false, why would I want to retweet this?” That makes sense.
Keith Coleman: Correct.
Jay Baxter: Right.
Keith Coleman: Yeah, the notes just totally take the wind out of these stories. The thing will be going viral, note appears, resharing drops 50 to 60%, and that’s it. At 50 to 60% per generation, the virality quickly goes to zero.
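Editor's sketch: Keith's "per generation" remark can be read as a compounding effect, where each round of resharing retains only 40 to 50% of what it would otherwise have had. That reading is an assumption; the snippet simply works out the arithmetic under it.

```python
def remaining_share(drop_per_generation, generations):
    """Fraction of un-noted resharing left after the drop compounds."""
    return (1.0 - drop_per_generation) ** generations


for drop in (0.5, 0.6):
    series = [remaining_share(drop, g) for g in range(1, 6)]
    formatted = ", ".join(f"{x:.1%}" for x in series)
    print(f"drop of {drop:.0%} per generation: {formatted}")
```

Under this reading, a 50% drop leaves about 3% of the original resharing after five generations, and a 60% drop leaves about 1%, which is the sense in which the spread "quickly goes to zero."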
Jay Baxter: By the way, I have very mixed feelings about this next one, but authors become 80% more likely to decrease, sorry, to delete their post after they get noted, which okay, that’s great, because less misinfo out there, but I’m torn about it, because those are usually the best notes. If the note was just so good that you had no other option but to delete your post, those notes don’t get seen by other people, right? Because-
Summary of Team Operations
Lenny Rachitsky: That’s hard.
Jay Baxter: There’s an argument, by the way, that seeing… Just because you might see the same misleading claim elsewhere off X, or somewhere else on X, it might be good to actually show… Better to have seen the post with the note than not see it at all.
Lenny Rachitsky: Yeah.
Jay Baxter: Unsure about that claim.
Lenny Rachitsky: That is so interesting.
Jay Baxter: Yeah.
Lenny Rachitsky: Yeah, I’d be so sad if I was that community note writer and just… Man, it’s so good. They just can’t even keep the post up. Okay. Coming back from today’s world, where this small amount of code is changing the way people understand the world and what they believe, and making the White House rescind their announcements, zooming back to the beginning of how this whole project started, what I heard just briefly is, Keith, you were just tired of managing PMs, you wanted to just work on something yourself, you wanted to work on something impactful away from corporate BS, and you basically just started looking for something that was impactful, important, and you found this. Talk about just how it all came to be at the beginnings of the story.
The Importance of Team Architecture
Keith Coleman: Yeah. I mean, for me, the beginnings actually go back to why I joined, it was then, Twitter in 2016. I had a startup and we’d had some acquisition offers, and one of them was from this company, Twitter. It was 2016, it was the middle of the election between Donald Trump and Hillary Clinton, and there were something like three televised debates, but every day, there was a debate happening on Twitter, and it was very clear, this is where people are talking about these things that matter, where information is being shared, where ideas are being formed. As a user, it was obvious that I could get good information there, but it was also obvious that there was questionable information floating around. I remember just looking, as an outsider, thinking like, “Wow, this is a really hard problem and it also seems really important,” so we ended up going to Twitter and the company was in a turnaround at that point.
My first three years was just helping to get the company growing again, working on everything that was the consumer product, getting user growth going back and people wanting to work there again, et cetera, but a few years in, I was reflecting on what we had done. I think we had done a lot of good work getting momentum going, and people in the us and in the industry had tried things to deal with misleading information, but nothing was really working. It was obvious nothing was working. Nothing could handle the scale of the problem, nothing could handle the speed, and a lot of people just didn’t trust the existing approaches. The existing approaches were either fact-checkers or internal trust and safety teams making decisions about what was or was not misleading. A lot of people just didn’t want or trust that to be the way this was decided, which is very reasonable.
I’m looking at that, I was still managing a large PM team. That’s a whole story in itself. That job required a lot of energy in, and I didn’t feel like I always saw the output that I wanted to see from it. I didn’t see the change in the product I wanted to see and I was contemplating, “Should I go start a company? Should I do something else?” And I kept coming back to this problem. I’m like, “Man, how is the world going to deal with this information quality issue of what we get on social media?” Wherever we get it. I’m at this company where you can make a difference on this problem, why not go and try some crazy ideas and see if one of them might work? I had a kid, I came back from paternity leave, I went to my boss, Kayvon. I was like, “Hey, Kayvon. How about I just stop doing my job and I go work on this instead? ‘This’ being trying some crazy ideas to see if we can deal with misleading info.”
He was stoked, so I went off and started working on that. It started with just reading any research I could on the problem and existing solutions. What was or was not working, what were the issues, and then into prototyping. Then it ultimately led to us building and piloting this idea that became Community Notes.
Lenny Rachitsky: Amazing. I have so many questions and we’re going to keep going through the story, but when you joined Twitter, what was the… It was called Twitter. At this point, I’m going to try to call it X now, which I know is important to your boss. What era of Twitter was it at that point? It was Kayvon joined and who was the CEO? Because there’s been many.
The Self-Selection Mechanism
Keith Coleman: Okay, yeah. I came in December 2016, so Jack had relatively recently come back as CEO to turn the company around, and just to give you a sense of the state of the company, something like a third of employees were leaving every year. Just imagine a third of your team gone every year. The stock was in the toilet, the product was not really growing, so Jack was working on a turnaround and Kayvon was there already. Kayvon was running Periscope with a bunch of video stuff, and that group continued to… Jack was there up through the start of the Community Notes (then Birdwatch) project, and… Yeah.
Lenny Rachitsky: Okay, and it was called Birdwatch. I don’t think we’ve used that term yet, but that’s an important point. It was called Birdwatch initially.
A More Streamlined Team
Keith Coleman: Yeah. It was originally called Birdwatch when we started the project, but obviously, somewhat famously the name changed along the way.
Surviving the 80 Percent Layoffs
Lenny Rachitsky: Yeah, maybe let’s just tell that story real quick, and I know we’re zooming it forward, but just… I have this Twitter thread that I saw between Jack and Elon when they’re debating what to call it, and Elon’s like, “Birdwatch sounds creepy, I want to change it”. Is there anything there you can share?
Keith Coleman: Yeah, the story there… The story, that’s funny. Elon came in, acquired the company, and we had just launched the product relatively recently in the US. It had been in pilot for a year, but we had just made it available US-wide, and I guess he’d been seeing the notes. Soon after the acquisition, he DM’d me and he was like, “Hey, this Community Notes thing is awesome,” and I was like, “I’m glad you like it, let’s talk,” so we talked the next day and he kept referring to it as “This Community Notes thing.” I was like, “It’s interesting you keep calling it that, because that’s actually the very first thing that I called it.” The very first Figma mockup I made depicting this thing was called “Community Notes.” I don’t know why, it just felt really natural, so that’s the first prototype we had tested.
Later, the project changed the name to Birdwatch, but Elon was like, “Hey, let’s just call it that.” The next day, we just changed the name. It’s always notable for the team when you change the name, but really, the team was excited about it. I think it is a much more understandable name. Jack has made fun of it, calling it “The ultimate Facebook name,” or something like that.
Jay Baxter: The most boring Facebook name [inaudible 00:36:44].
Keith Coleman: Boring name, which is funny, because they’re now launching Community Notes. I think it is a very understandable, intuitive name, and I think it has served the product really well. There’s a reason it was the name in the very first mockup.
Birdwatch’s Low-Key Launch Strategy
Lenny Rachitsky: Yeah, I think descriptive names just make sense. This connection with Elon, and I want to talk later about just how you’ve dealt with so many strong personalities and kept this alive throughout so many changes, but before we get to that, you did something that I think a lot of product leaders, eng leaders, just people that have managed people dream of: give up all this power, in air quotes, and career trajectory and influence and just, “Forget all that. I’m going to go back to just building something awesome, small team.” Is there any advice there that you could share from that experience that you think might be helpful for other leaders to hear to help them maybe do that same jump? Because that’s really difficult in practice. Easy to talk about, hard to do.
Keith Coleman: Yeah, I think it is a difficult jump. I’ve done it a bunch of times in my career and I’ve always been very happy with it, where I started with a small team, it grew into something bigger, and then I was like, “We’re dealing with a lot of big production stuff, team’s really big. I want to go back to doing something like crazy and new with a small team again.” I’ve done that sawtooth leap a bunch of times, but it can be hard, because certainly, the natural… The classic career path, I don’t know, rewards running a large organization or being a manager, or things like that, but I think, at the end of the day, you got to work on stuff you love, you got to be having fun, and I think people want to be having impact.
I think there’s one myth that can get in people’s ways. The idea that the more people you manage or the larger your scope is, the more impact you have. I definitely do not think that is true. I mean, look at Community Notes for example. If I had stayed running a large consumer PM team, what would I have produced? 16 more pages of OKRs? I don’t know, a bunch of documents? I think building Community Notes has had way bigger impact on the world. It’s become the industry standard for how to deal with this now, which is super cool. People love it, it’s the first thing that is plausibly dealing with the internet-scale issue of information quality. I think it’s unquestionably a bigger impact than I would’ve had if I were just doing whatever, doing some standard management track thing like I was doing before. I think that’s true of so many other small companies and startups. Someone screenshotted I think it’s Blake Scholl’s LinkedIn the other day. He went from director of coupons or something to building the first supersonic-
Elon’s Early Interest
Lenny Rachitsky: Yeah, from Groupon.
Keith Coleman: Those stories are everywhere when you look, so I definitely have found that, for me, I love building hands-on, I love trying crazy new ideas. I love the zero-to-one experience. It’s fun to scale things up too, and it can be fun to operate at scale, but this team is a good example of one that operates at a very large scale, but that is still very small.
Core Principles of Community Notes
Lenny Rachitsky: Yeah, I think the way you guys operate is what more and more companies are trying to do, remove middle management layers, create small teams that just execute and build impact, just like ICs. Whenever I say IC, I get a comment on YouTube like, “What is IC?” I’m just going to explain: individual contributor, non-manager, is what I mean when I say IC. Let me follow this thread. When I asked people about how you set up the team to operate effectively and protect it initially, there’s this term, “Thermal,” that came up a lot. It was like a Thermal team, if that’s how you describe it.
The Principle of Transparency
Keith Coleman: Yeah.
Lenny Rachitsky: What is thermal?
Keith Coleman: Yeah, so anyone who’s worked in a larger company probably knows that things can get bureaucratic or bogged-down, decision-making can be slow. There’s these large planning cycles, people can try to take someone from one team and move them to another at arbitrary times that can disrupt a project, all sorts of things like that. Our company, this is a number of years ago when we started this project, we had a lot of founders in the company. Kayvon is an example of a founder who is helping to run the company, and he had this idea, “Hey, why don’t we create this program, call it Thermal, where we could have teams that were somewhat isolated from that.” They could run through their own process, they would have one clear owner. The team would be entirely dedicated to that project and we would just repeatedly make funding decisions as to whether to continue the effort.
Lenny Rachitsky: Why was it called Thermal, by the way? What was the idea there?
Keith Coleman: I think it was an old bird analogy of thermals lifting the bird on their wings. Twitter 1.0 obviously had a lot of bird analogies, bless its heart, so that was one of them. I loved the idea, as someone who liked the startup environment, so when we were starting this project, I was like, “Hey, Kayvon. Why don’t we make this the first Thermal project?” And he was like, “Yeah, let’s do it,” so we started with that way of operating and it gave us, from day one, a lot of freedom and autonomy that I think was really important to make the product work.
Lenny Rachitsky: Just be very specific about it. What makes it a Thermal project? How do you set that up? This is asking from perspective, if a company wants to build their own something like this, what does that look like?
Keith Coleman: Yeah, I think there’s a bunch of key attributes. One key attribute is there’s one clear driver of the project, who’s effectively a founder. I guess maybe you could have two or something, but really clear, there’s one driver of the project and also there’s one clear decision-maker that they go to.
Lenny Rachitsky: Outside of the team?
Keith Coleman: Outside of the team. That was true back when we started and it is true now. If we need something or have a question about something, I talk to Elon. It was like that from the beginning, it’s like that now, and I think that’s a big reason we’re able to make decisions effectively, quickly, in a simple way.
Lenny Rachitsky: It probably has to be someone very senior, not [inaudible 00:43:05] manager.
Keith Coleman: Someone senior who can make the decisions you need made, whatever they are. I think that’s really important, that clear decision-making structure. Another was 100% focus, so everyone on the project is expected to be totally focused on it. A lot of companies, it can be easy to have people’s attention spread across a bunch of things, and it makes it hard to get stuff done. You’ll talk to whoever that person is, you’ll ask them for help on something, and they’ll be like, “Yeah, I’ll help you. I got to finish this thing, and it’ll take me a week or two and then I’ll get to it.” A week or two delay totally changes the momentum of a project. When we were 100% focused, we talk in the morning, it’s like, “Hey, Jay. Why don’t we try this thing in the algorithm?” He’s like, “Yeah.” Then that afternoon or the next day, we’re looking at results.
Because of that total focus, the rate of iteration goes way up. Then beyond that, there was also just the ability to use whatever our own decision-making process was. We didn’t need to write OKRs or follow other standard practices. Obviously, we had to make sure we were responsibly building the product and everything, but we didn’t need to use the standard practices. I think that’s another great example, OKRs. I understand why they can be helpful, but they can also be not necessarily the right cadence at which to set goals. I think it’s really unclear that quarterly or annual goals are actually the right pace. We would set the goal for the next milestone that mattered, and we would work on that. When we reached that milestone, we would have an idea of what was coming after, and then when we hit that, we’d set the next milestone. Whether that was two weeks, a month, three months, whatever it was. We set our own pace and goals at that pace, and that just I think is a lot more natural for the development of something.
The Cost of Open Sourcing
Jay Baxter: The whole OKR determination and planning process took longer than it would take us to pick a goal and then execute on it and finish it.
Power in Standing by Principles
Lenny Rachitsky: How big was the team early on that you set up? How many engineers?
Keith Coleman: It started with just me and then, when we decided to build the thing, we figured we needed about five. We wanted it to be as small as we possibly could. It was clear we needed someone on ML doing scoring, it was clear we needed someone to do some client engineering work, someone to do backend engineering work. There may have been one or two others. We needed a designer and a researcher to help us understand the customer base and make sure we were building the thing in a way that was actually going to resonate with people. I think it was backend, frontend, ML, design, research. That was the original team, from what I remember.
Key Moments the System Worked
Lenny Rachitsky: Amazing. Basically, one of each function. A question I have for Jay, actually, is there’s all this talk of small teams and moving fast, but sometimes you just need more engineers to build the thing. Is there anything you’ve learned about just how to keep a team small while moving as fast as you are, and not need to hire more engineers?
Jay Baxter: I think, in the beginning when we were iterating on what the requirements should even be, it was definitely good to just have one ML engineer, but I think, at some point, we got clear on what the goals of the algorithm should really be and we tried… I think, at the very beginning, it wasn’t clear that we needed to build this bridging-based algorithm. The actual first algorithm that I put into production was very focused on anti-manipulation. It was this PageRank variant, but it didn’t solve the problem of bias, basically. If there are more users on one side, a PageRank-type graph algorithm can actually amplify those biases. I think, after building that prototype and getting data from that, it was clear that the bridging-based algorithm was going to be the way that we needed to solve it, and at that point, basically I set up a bake-off. Kind of a Kaggle competition or something. That was the key time where it was really important to pull in other engineers.
Lenny Rachitsky: That is such a cool story. I want to follow that thread. Before we do that, you just mentioned you guys yell “Thermal.” What does that mean? Is that YOLO, like a version of… Okay.
Anonymous Contributors and Trust Mechanisms
Keith Coleman: We’re just going to ship, because we’re a Thermal project.
Jay Baxter: Ship it.
Lenny Rachitsky: Okay, so coming back to this algorithm, this is actually really interesting, because I’ve never heard any of this. I was going to ask just what inspired this actual algorithm, and you basically did an internal competition amongst ML engineers to see who had the most successful algorithm. Netflix-contest style, Kaggle style.
Jay Baxter: Yeah, yeah. This particular idea of finding content that is liked by people on opposite sides of a polarized divide who typically disagree, this was not an idea out of thin air. I think Keith had found some of Chris Bail’s work, he had made this list of accounts that were often liked by people who were on both sides politically. There are other projects, like Polis, out there that look for agreement among people who typically disagree, but I think that it wasn’t obvious that our project definitely needed to use that from the very beginning. When you implement it and compare it against these other type… PageRank seems, obviously, it’s designed to be manipulation-resistant. Naturally, if you just have a voting ring of people who all vote themselves up, then PageRank can filter that out very well, but that just wasn’t the main attack vector, I guess.
We had to get some real data from the pilot to realize that, “Okay, the real thing going on here is people are polarized,” so it was only once we got that, the real data from the pilot, that I think it was clear that the bridging-based algorithm was the direction we really needed to go.
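The contrast Jay describes, between a tally that follows whichever side has more raters and a score that requires agreement across sides, can be illustrated with a toy sketch. This is not the production Community Notes scorer. The group labels "A" and "B", the function names, and the sample ratings are all hypothetical, and the sketch assumes each rater's side is already known, whereas a real system would have to infer it from rating history.

```python
from collections import defaultdict

def majority_score(ratings):
    """Fraction of all raters who marked the note helpful, ignoring sides."""
    return sum(1 for _, helpful in ratings if helpful) / len(ratings)

def bridging_score(ratings):
    """Lowest helpful-fraction across viewpoint groups (toy bridging rule).

    A note only scores well if every group, taken separately, tends to
    find it helpful. Notes rated by a single group score 0.0.
    """
    by_group = defaultdict(list)
    for group, helpful in ratings:
        by_group[group].append(helpful)
    if len(by_group) < 2:
        return 0.0
    return min(sum(votes) / len(votes) for votes in by_group.values())

# Hypothetical ratings as (group, found_helpful) pairs.
# Group "A" has four times as many raters as group "B".
partisan_note = [("A", True)] * 8 + [("B", False)] * 2
bridged_note = [("A", True)] * 6 + [("A", False)] * 2 + [("B", True)] * 2

for name, note in [("partisan", partisan_note), ("bridged", bridged_note)]:
    print(name, majority_score(note), bridging_score(note))
```

Both sample notes get the same majority score, because the larger group dominates the tally. Only the note that both groups rate helpful keeps a high bridging score, which is the property the pilot data pointed toward.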
Lenny Rachitsky: I want to come back to the way you operate the team. I hear that you run the whole team off a single Google Doc that’s like a four-year-old doc that you just keep adding goals to, bullet points. Is that true?
Transitioning to Anonymous Contributions
Keith Coleman: There is a very long-running doc that has had to be chopped and purged, because it was breaking Google Docs in Chrome at various points in time. It’s like a note-taking doc. It’s really where we coordinate what we’re doing. The team meets on a daily basis, we spend whatever amount of time we need to get on the same page about what we’re building. We might talk about anything from what’s most important right now to, “What should we work on next?” To, “What are we trying to launch right now, and why is it not launched? What’s in the way of launching it?” We might review a new modeling or scoring algorithm update and try to understand what’s working in it, what’s not. We’ll just cover whatever we want or whatever feels most important. As you said, we set our goals very dynamically, so whatever seems like the most important thing for us to work on now and next is what we spend our time on. I think that’s served the project really well versus feeling attached to some quarterly goals, or something. We’ll look at, “What is going to help people the most?” Or, “What’s the biggest problem right now?” Whichever of those it is, we will go tackle it. We might change our roadmap multiple times in two weeks based on what we see.
Overhauling Trust and Safety Systems
Lenny Rachitsky: I’m hearing no Jira, no Asana, no Monday.com.
Keith Coleman: No.
From Impossible to Worth Considering
Lenny Rachitsky: Okay.
Keith Coleman: Yeah, I mean, we have to use Jira to coordinate with some other teams. Sometimes when we file a request, we have to make a Jira ticket. But no, I am not a fan of heavyweight task management. I love being on the same page, being able to keep most things in my head, and having a really light way to write down the things that the team can’t keep in its head.
Jay Baxter: We did use Asana briefly, but my memory of it is that you spent more time in the meeting grooming a backlog of irrelevant stuff than actually talking about the proper priorities. I think it’s nice in the Google Doc that, if something becomes irrelevant, it can just fall off without needing explicit backlog grooming.
Surviving Multiple Leadership Changes
Lenny Rachitsky: Just to maybe summarize a little bit of how you guys operate that might inspire other companies to set teams up like this, I’m going to go through a few things you shared. One is one person in charge of the team, like the founder almost. They’re basically the founder of the team. They have one very senior, essentially, sponsor/decision-maker that they interface with. In your case, Elon, no big deal. In other cases, it could be the CTO, CPO, someone like that. The team is focused 100% on its product and goal. You keep the team very small, so you start with one person of each function. One front-end engineer, back-end, ML person, designer, researcher, PM, and then Google Docs is basically your project management. Yeah, it’s basically run with Google Docs. Don’t use big, complicated products.
Keith Coleman: I think that’s a pretty good recipe. On the Google Docs, people can do what they want. If they want to use thumbnails, go for it. I think those first ingredients, really, are key structurally. And then beyond that, it’s a matter of having an ambitious goal that gets people fired up to go do great work.
Low Ego and Project Success
Lenny Rachitsky: Yeah. Awesome. I think there’s a lot there that a lot of people think they should do when they set these teams up, but they don’t actually do, and it feels like each of these is just a really key ingredient to actually succeeding.
Keith Coleman: It definitely really helped us succeed. I don’t know that the project would be here if it was not for some of those elements.
The Future of Community Notes
Lenny Rachitsky: That’s a powerful statement. This thing that has changed the way the world understands what is true would not have existed if you didn’t set it up in this specific way.
A Product Built by People
Keith Coleman: Yeah. I don’t know if I would’ve begun the project had I not known we had that structure, that ability to make decisions, the autonomy, the speed, the ability to go fast. We started with that in 1.0 and it’s been continued and, if anything, furthered in X. X as a whole company operates with a lot of those attributes, and I think it’s one of the reasons the product is successful. I think those are big reasons why at least, Jay can speak for himself, I have so much fun working on this. I love working on it. It’s great to wake up every day and solve these problems. We get to do them efficiently, make decisions quickly, build stuff that helps a lot of people. It’s awesome.
Unexpected Uses for Community Notes
Jay Baxter: This way of operating, whether Thermal or Elon’s, is definitely more fun and the fact that… That combined with the awesome mission is super important for internal recruiting. I remember when I was first chatting to Keith about this back in early 2020, I had another project. I worked on a few, but one was like personalize the number of push notifications that we send, and it drove a lot of DAU without losing opt-outs significantly. So that was setting me on track, or if I had kept working on that, I could have probably gotten a promotion from that with low risk, or I could take this huge career… It’s not as big of a career risk as joining or founding an actual external startup, but there is still career risk, I guess, in joining a team like this. I think all of the same aspects of recruiting that apply to external startups also apply internally, and if you can have an exciting vision, that is key.
Reasons to Be Optimistic
Keith Coleman: Related to that and your list, Lenny, one thing we missed that’s super important is that on this project, and I think of successful projects like it in startups, is that people are self-selecting to join. We did not assign anyone to this project. People reached out to join or they applied to join the job. I and the team interviewed every single person that joined the team and we were like, “We want that person on the team. They want to be on the team.” And so people are totally bought in to the goal, mission, the way the team works, the other people they’re going to be working with. And that makes a huge difference.
So a great time to do that is at the start of one of these things. If you’re going to try something crazy, it’s going to be tough if you’re just assigning random people to it. But if you let people opt in and self-select, it’s much more likely to be successful. And one thing that I have observed at X, which really surprised me, was that this is also possible at a large scale. One of the things that Elon did when he bought the company was he basically asked people to self-select to stay. You had to click the button. And he sent an email out that was like, “Hey, Twitter 2.0.”
Lenny Rachitsky: Fork in the road. Right? [inaudible 00:57:46]
Keith Coleman: Fork in the road. Fork in the road. Exactly. He’s like, “Twitter 2.0, now X, it’s going to be hardcore. We’re going to do ambitious things. You’re going to work your butt off.” And you had to click on the form and say, “Yes, I want to join.” And I think that was really important for the company because you want people to opt into that. You want the people to be saying, “Yeah, that’s what I want to do,” and the company’s going to be a lot more successful. If people aren’t sure, it’s better for them probably to go do something else and where they’re naturally more aligned and happier. And I thought that was a great approach to taking a large company and getting it down to people who are really excited about working together on a mission. So for us, we did it from day one, which I think is an easy way to do it, but it’s possible to do it later as well.
Lenny Rachitsky: I love that you described it as fun and I think a lot of people when they see Elon laying off a bunch of people, being very hardcore himself, people don’t imagine it as a fun place to work. And it’s clear how much you guys love working on this, how fun it is and how interesting it is. And it’s interesting to hear that ‘cause I think a lot of people don’t feel that externally. Is there anything else along the lines of just working for Elon within an org Elon runs that might surprise people about just the way of working that’s interesting or surprising or you think other companies might want to think about adopting?
Keith Coleman: I’ve always liked lean teams, but my experience at X has changed the way I would think about running a future org. If I were to start a company, it would be even leaner than I would’ve made it before. I’ve been amazed with just how much the team is able to accomplish with a small group, and I think because of a small group. Shortly after the acquisition, we had this product called Spaces. It had been in the product before, but it was pretty small scale, and Elon wanted to run these large Spaces. I forget who the first people he was going to bring on were, but he was going to be there. Ultimately, these things have gone on to host politicians and things like that, and he’s like, “Guys, we got to scale this up.” I forget the numbers.
He’s like, “We need to be able to scale to a million people,” or something like that. I’m getting the numbers wrong. “You need to be able to scale way up.” This is the kind of thing that at 1.0 would’ve taken a year if it had ever happened, and the team did it in two or three weeks. And it was really exciting and inspiring to see. I didn’t work on that, but I watched it from the outside. I’m like, “Wow, with this tiny team motivated behind a big goal that was like, ‘Hey, guys, it’s not like, are we going to do this?’ It’s, ‘We are going to do this.’” They got it done in two or three weeks. That must’ve felt amazing for them. It was certainly exciting to see. But I’ve definitely come to appreciate just how lean something can be and not just get by but actually thrive because it’s that lean.
Lenny Rachitsky: I think the point you made about people opting into that is important, ‘cause I think a lot of people hearing that would be like, “I would never want to be asked to build something like that in two weeks.” And I think a lot of people do, and would love that kind of experience, especially working with Elon, especially shipping something at that scale. But I think there’s an important element there of just like, “Okay, I don’t want to do that. I have other things to do in my life other than ship Spaces.” So I think that’s a key point you’ve raised of just there’s an opt-in step.
Keith Coleman: Totally. I think the opt in is important, and it may even be that you want to opt in at one point in your life, and maybe at another point in your life something else is better. I think whatever it is you’re choosing to do, it’s nice to be opting in to feel like it’s aligned with how you want to spend your time.
Lenny Rachitsky: Something on my mind, and I don’t know if you guys want to go here, but it’s something I think a lot of people think about is when Elon came in, he let go of 80% of folks. And everyone’s just like, “Twitter is dead. It’s all going to fall apart. There’s no way they can run this thing with that small of a staff,” and clearly they were wrong. Clearly, it’s working great. It’s becoming a massive deal in the world and continues to grow. Is there anything about that that you were surprised by or anything about just how it continues to operate so well in spite of that big shift?
Keith Coleman: I think the leaner team, the reduced process and bureaucracy, is a big reason it does move as fast as it does. It’s easier to get stuff done faster here. Yeah. I think that shrinking is actually a big reason for the increased pace of launches, the increased pace of experimentation. One thing that I noticed as a result of that is the people who are here, they seem to all really feel like owners. They take the sense of responsibility that an owner takes in the product. They’ll try to track down what’s wrong, fix whatever is needed, jump in to help build or fix, improve any system that needs help, even if it’s outside of their space. And there’s the flip side of that too. For people who’ve worked at big companies, they may have experienced this thing where there’s like ano-… You want to change something in some other system or product, and so you reach out to that team. And maybe they’re a little resistant, they’ll maybe be like, “Oh, we’ll get to that next quarter or so-”
Lenny Rachitsky: They have their own goals to hit. Yeah. [inaudible 01:03:08]
Keith Coleman: Yeah. Exactly. They don’t really necessarily want to help you or they’re busy. Here, you’re like, “Hey, guys, we need to do this thing with that other system you work on.” And they’re like, “Great! Here’s the code. Here are the docs. Send us the fab if you have any questions, and we’ll get it in.” And it’s just the thing, you can just jump in and get it done. And that kind of collaborative effort, like the sense of shared ownership, I think from my experience came from or was a result of the shrinking of the team down to people who wanted to be there and work together to build this thing. So I think that’s been a really positive impact. It’s not always easy. Certainly, a lot of people have a lot of responsibilities, but they’re here because they’re up for it.
Jay Baxter: Yeah. I think one other thing that’s key is when you are forced to have such a small team, well, this is important anyways, but deleting code is more important than writing it a lot of the time. So I think so often maybe due to promotion incentives or just regular human tendency, engineers have a tendency to add these little incremental wins that actually add more of a long-term maintenance cost than is clear, because you just run a little one-month A/B test, you see this significant win and you don’t realize the maintenance burden you just added to your team for the rest of eternity until you turn the thing off. So I think there’s a lot to be gained and you get forced to do this, by the way, when you have such a small team. It’s just auditing parts of your system and deleting the things where the maintenance cost is worse than the gains. So I think we did have to do this across the company after the big layoffs, and systems are leaner now and they can be worked on by fewer people.
Lenny Rachitsky: That’s an amazing point. I remember Elon being like, “Here, we have to throw away the whole thing. We have to re-architect everything. It’s stupid the way it’s built.” And it sounds like that actually worked.
Jay Baxter: Yeah, so-
Lenny Rachitsky: Well.
Jay Baxter: You don’t have to rewrite everything from scratch. Some things are good, I guess, to rewrite. But just even deleting the unnecessary cruft and keeping the rest of the core system, that’s awesome.
Lenny Rachitsky: I love that we’re creating a formula to run these sorts of companies and teams. There’s so much here. I want to go back to the building of the original product. I took us on a long tangent and an amazing tangent, but I heard a story of when you launched Birdwatch at that point. You specifically wanted to keep expectations very low and there was a GIF in the thing, and it just looked like clearly this is not ready for prime time. Talk about just how you did that, how you launched it in a way where people weren’t like, “It’s never going to work.”
Keith Coleman: We were very disciplined, I guess you could say, about having the product prove itself at every given point. When we built the first mockups, these were just pictures depicting what community notes might look like. We showed those to people across the political spectrum. We saw, hey, people really like these. Whether they’re on the right or left, they seem very open to reading these community notes even when they’re critical of people on their own side. So we’re like, “All right. That gives us confidence that if we can build this, if we can actually make this a reality, it’s going to work.” Then there’s a question of can we make it a reality? Will people in the real world be able to write notes that are of this quality?
And so we had an internal pilot test version of this where you could write notes. And we first basically ran this through an Amazon MTurk type of participant test just to see if you just put some normal people in there, will they be able to write these notes? Not all of those notes were good, but it was clear that there were people out there who could write good notes. So then like, “Okay, this is possible. What will happen if we actually do this out in the real world? And let’s run a pilot and find out.” And so we took that pilot that we’d run the MTurk test on, and we released it to at first 1000 people, totally out in public, and we didn’t know what was going to show up. You could imagine the notes could have been terrible.
And so we were talking, “Well, what do we do? We’re going to put this out there. Everyone’s going to have all these questions. They’re probably going to be really skeptical, and we know it might be a total dumpster fire. And so what do we do to set expectations appropriately?” We felt like we could probably get there in the end, but we just didn’t know what was going to happen at first. We wanted to set expectations, and so we’re like, “Well, why don’t we just stick…” There’s the page where you see a post and the notes below. We’re like, “Why don’t we just stick a dumpster fire GIF on that page?” And you go there, you’re like, “Hey, anything you see below here might just be a total dumpster fire.” At least it would show we were aware of that as a possible risk. In the end, we did not do that. It cracked me up, but we thought it was like-
Lenny Rachitsky: Oh, you didn’t actually launch. Okay. That was just a concept. Okay.
Keith Coleman: We had mockups of it, and every time I looked at the mockup, I laughed, but ultimately we had so much to explain on that page, like, what is this thing and how does it work? Ultimately, we’re like, “Okay, this is probably going to distract from the point.” So we pulled it. I wish maybe it had seen the light of day at one point, but yeah, ultimately we kept it simple and we focused that page on explaining what was going on here. But again, as has happened many times with the project, we put the pilot out there and the notes were good.
They weren’t all good. It was a mixed bag, but there was gold in there. And from the very early days with just 1000 contributors, it was obvious that people could write notes that were informative, that were neutral, that spoke to controversial challenging topics, and that if we could just identify those from the rest, this was going to work. It was going to work as well as the very first mockups we had made. So that became the focus that is, how do we sift out the gold from the rest?
Lenny Rachitsky: I think you may have shared this with me, when someone noticed you guys were testing this and they took screenshots and tweeted it, and I think Elon replied, “This is cool.”
Keith Coleman: Yeah. Yeah. So in the very early days when it was just a Figma prototype, we were running these usertesting.com unmoderated studies. I guess one of the participants sent one to an NBC reporter who wrote a bunch of stories on it. Anyway, that day, there was a lot of chatter about it on the service, and Elon… To put this back in time perspective, this is, I think, 2020, so two years before any acquisition stuff happened, Elon is just a Twitter user building rockets and electric cars and other cool stuff and stumbles on this thing that depicts the prototype that we’ve been testing. And he writes back, “Definitely worth trying, IMO.” And I remember thinking that was cool back then and it’s interesting to see, he’s obviously had a very consistent point on it. I think the idea was appealing and he has obviously been a big fan of it in the product and has been a big supporter and proponent. So yeah, it was cool that it came from… that support has been from the very early days before he was ever involved in the company.
Lenny Rachitsky: I love that moment. That must have felt really wild for Elon to be commenting on this Figma prototype you were testing.
Keith Coleman: It was cool. It was cool.
Lenny Rachitsky: Oh, man. So when we were preparing for this interview, I asked you guys what’s the main thing you want to make sure people get and understand about why community notes has been so effective? And Keith, you specifically said that it was the principles behind how you wanted to approach this and how you continue to stick to this throughout. And we’ll talk about how you kept it alive throughout all these different CEO changes and leaders. But just talk about these principles, what the actual principles are and why that was so key to it working out.
Keith Coleman: There are a number of principles that I think when we first shared them with people at the company seemed maybe a little bit crazy. But I think they are the reason the product works, and I think they’ve been very important, and we do. We come back to them regularly, today, all the time. Probably the craziest one is just that this thing is going to be the voice of the people. It’s going to represent the voice of people. It’s not going to represent the company’s voice. So it is not a tech company deciding what shows. It is the people deciding what shows, and that had a lot of implications on the design. First of all, we don’t have a button that will change the status of a note. So if a note is showing because the people have rated it and found it helpful, it is going to show. We can’t change that.
And that is the kind of thing that when we first proposed this, that’s unsettling to people. They’re like, “Wait, so something can go up and the company can’t take it down, or can’t change its status, get it to stop showing.” And we’re like, “Yeah, and it has to work that well. If it doesn’t work well enough to do that, then it doesn’t work.” One of our key principles was, if there’s a problem with a note that’s so bad you want to do something about it, it’s a problem with the system. We need to redesign the system to be showing good notes. And so yeah, we had to get everyone comfortable with the idea that there was no button to change the status of a note. Similarly, as we talked about earlier, we wanted this to represent all of humanity.
And so we didn’t want to be arbiters of who can come in and be a contributor and who can’t. So we open it to everyone. You just have to meet really basic objective criteria. You have to have a verified phone to help reduce the likelihood of having bots or things like that participating. But beyond that, it’s random selection and it still is that way today. And again, that took some time to get people comfortable with. But I think that the fact that this is the voice of the people and reflects their output through an open and transparent process is so key to both why it is good, why it works, but also why it’s trusted. So that’s number one and I think will forever be the heart of the product. Another one that people thought was crazy was transparency.
The previous approaches to dealing with misleading info, it felt to a lot of people, like black box tech companies or media companies or elites or whatever making decisions. We’re like, “People need to get comfortable with this. They need to trust this. So the whole thing has to be out in the open.” The code that decides what notes show has to be out in the open. All of the data and ratings that make it happen have to be out in the open. People should be able to take the code and data and replicate the whole service and verify that we have done exactly what we’ve said we’ve done. And they should be able to audit it. They should be able to go and look and say, “Hey, I think this part could be better.”
Or if they think we’re biased, they should be able to work with the data and point it out. And if people have good observations, that should factor back into the code. And this is, again, something that’s difficult to get people comfortable with, that everything is out there, you can’t cover anything up. But I think that’s so essential to people trusting it. Yeah, we set these out on day one. We go back to them constantly because we’re always evolving the product, and we’re always like, “Got to make sure every new change is open.” Whenever we update the scoring system, there’s an update in GitHub, and the data is published daily so you can download it. And so yeah, I think those have been really essential to the thing working.
Jay Baxter: And by the way, these do not come without a cost. It’s actually really hard from an eng perspective to actually open source the actual algorithm that’s running on the actual data. Because the way large-scale services like this are usually architected does not naturally lend itself to being run as a script by someone who’s downloaded a TSV. So we actually have to make weird architectural decisions to make this possible in a way that probably wouldn’t have been if we didn’t start with this assumption from scratch. We would’ve had to maybe rewrite the system to make it like this.
Lenny Rachitsky: What’s an example of that?
Jay Baxter: For instance, there’s a matrix factorization that we train. Usually, you would train a matrix factor… train your ML model once and then serve it, I guess with a separate service. But we didn’t want to have people externally spinning up services to be able to replicate the system that we had. So basically, I don’t think it would’ve been actually very cool if we had open sourced the code in a way that wasn’t actually runnable, I guess, by someone just… At this point, you can download Python code and run a script. You do need a lot of RAM right now, but you can do it on one machine.
Lenny Rachitsky: Okay. How much RAM are we talking about?
Jay Baxter: Oh, only like 500 gigs.
Lenny Rachitsky: Okay. Okay. That’s reasonable.
Jay Baxter: It’ll take a day if you don’t do anything special to speed it up. Good to know, but yeah.
Lenny Rachitsky: Cool.
Jay Baxter: Possible is the key thing, and people have done it. Vitalik Buterin had a blog post where he talks about his explorations, making sure the algorithm really does what it says it does. And I think just the fact that a handful of people have done this, there’s enough people who have done it that there’s someone you’d probably trust who’s verified it.
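[Editor's illustration] The matrix factorization Jay mentions, combined with the earlier point about looking for agreement from people who have disagreed in the past, can be sketched on toy data. This is a minimal sketch and not the production scorer: it assumes each rating is modeled as a global mean plus a rater intercept, a note intercept, and the product of a one-dimensional rater factor and note factor. The function name `fit_notes`, the regularization weights, and the toy raters and notes are all hypothetical.

```python
import random

def fit_notes(ratings, epochs=500, lr=0.05, reg_intercept=0.15, reg_factor=0.03, seed=0):
    """Fit rating ~ mu + user_b + note_b + user_f * note_f by plain SGD.

    ratings: list of (user, note, value), value 1.0 = helpful, 0.0 = not helpful.
    All hyperparameters are illustrative assumptions, not official values.
    """
    rng = random.Random(seed)
    users = sorted({u for u, _, _ in ratings})
    notes = sorted({n for _, n, _ in ratings})
    mu = sum(v for _, _, v in ratings) / len(ratings)  # global mean, held fixed here
    user_b = {u: 0.0 for u in users}
    note_b = {n: 0.0 for n in notes}
    # Small random factors so the viewpoint axis can emerge during training.
    user_f = {u: rng.uniform(-0.1, 0.1) for u in users}
    note_f = {n: rng.uniform(-0.1, 0.1) for n in notes}
    for _ in range(epochs):
        for u, n, v in ratings:
            err = v - (mu + user_b[u] + note_b[n] + user_f[u] * note_f[n])
            fu, fn = user_f[u], note_f[n]
            user_b[u] += lr * (err - reg_intercept * user_b[u])
            note_b[n] += lr * (err - reg_intercept * note_b[n])
            user_f[u] += lr * (err * fn - reg_factor * fu)
            note_f[n] += lr * (err * fu - reg_factor * fn)
    return note_b, note_f

# Toy data: two groups of raters that disagree on partisan notes
# but both rate the "bridging" note as helpful.
ratings = []
for u in ["a1", "a2", "a3"]:
    ratings += [(u, "bridging", 1.0), (u, "partisan_a", 1.0), (u, "partisan_b", 0.0)]
for u in ["b1", "b2", "b3"]:
    ratings += [(u, "bridging", 1.0), (u, "partisan_a", 0.0), (u, "partisan_b", 1.0)]

note_intercepts, note_factors = fit_notes(ratings)
for name in sorted(note_intercepts):
    print(name, round(note_intercepts[name], 2), round(note_factors[name], 2))
```

In this sketch the factor term absorbs the split between the two groups, so a note liked by only one side gets a lower intercept than the note both sides rate helpful. Ranking notes by intercept is one way to express "agreement across people who usually disagree" in a single script that runs on one machine.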
Lenny Rachitsky: And that’s rolling out to Meta. No big deal. I love just as you described these principles, just I could imagine a PM at a company being like, “Okay, guys. Here, I want to do this project.” There’s so much idealism to it that rarely works in real life; going to be open source. You’re going to give it to everyone. We don’t have actual control over what it’s going to do, don’t worry about it. It’s going to just change the way people see this thing that we’ve been very careful about and then it works. And I think that’s very rare and it’s really impressive. And what I’m hearing partly is that sticking to those principles was actually really fundamental to it working and not bending over when someone’s like, “No, no, no, we can’t do this. What if we change this part?”
Keith Coleman: I think if we had broken with any of those principles, if there was anything black box, if there was whatever, the product would be a lot harder to trust. And so I think it’s because we’ve just stuck to them so cleanly and simply that people can trust it.
Lenny Rachitsky: You’ve talked about a few moments when it was like, wow, the White House changed their announcement because of the community note. We talked about the dog is a cat. Are there any other moments after you launched of, “Holy shit, this is working. This is going to actually work”?
Keith Coleman: All along, we saw it working. We wanted to be confident whenever we expanded it to new audiences or new countries or whatever, we wanted to be confident it was going to work. So maybe we held our breath a little bit just to see that it would do what we expected, but we always expected that. But that said, there were definitely stress cases. The one that comes to mind is the start of the Israel-Hamas conflict in 2023 in October. That was probably the largest deluge of misleading information I’ve ever seen shared on the internet at one time. It was overwhelming. A number of photos and videos and whatever coming out related to that, it was insane. And just to give you an example, I think it was first three days or something of that conflict, we had 500 notes covering all sorts of different… out of context imagery.
Someone would say, “Hey, this is happening here.” It’s actually from 2013 in Syria. There were people making fake battle footage in the video game simulator Arma 3. So there were notes explaining, this stuff looked realistic. And unless you saw the note, you wouldn’t really know. There are all sorts of claims about what was going on on the ground, and that was definitely… The product was still pretty new at that point. We’d expanded in the U.S. less than a year before that. We had been rolling out throughout the world that year and then this large event happened. And I felt like we were just enough prepared at the right time for the system to be able to handle that.
Probably one of the most important things we did right before that was launch the ability to write notes on images and videos and have those matched to other posts. I remember at that time thinking, “Wow, I’m glad we launched that feature a few months ago versus still had it on the shelf,” because it was really important in that conflict. And I think even it was just a few weeks before we had launched a major speed up in notes too. When we first built the product, the number one focus was always quality. We knew that the product would live and die by the quality of the notes. That was the thing we could never give up on. We also knew it needed to deliver speed and scale, but we’re like, “We will get the quality in the right place, and we can speed it up and scale it out over time.”
And we had actually just launched a speed-up that took three hours off the time it needed to go live, and it was I think a matter of weeks before that conflict happened, so again, super glad that was out there. In the first few days of the conflict the median time from a post going live to a note showing up was five hours, which is like crazy fast. Typical fact checking is like two to four, at least it’s really common to see it take two to four days. These notes were showing up in five hours and we’re like, we are so glad we got those things out before this happened, it made the service a lot more helpful.
Jay Baxter: One other thing that was, I think, nice to see working then was, one criticism of Community Notes some people bring up is, well if you always need agreement from people who typically disagree, then in these super polarized settings, that conflict being probably number one, then you wouldn’t see any notes. But actually the reality was there were tons of notes about that conflict. So I think there was this kind of nice property where actually, and maybe this is a surprising fact, that there’s more agreement out there across polarized divides than maybe conventional wisdom says, and the places where people agreed were really objectively true and verifiable. I guess maybe this is more true the more polarized the setting is, but where the agreement actually lands you is basically notes that are very neutrally written, very focused on the facts and easy to verify information.
Lenny Rachitsky: There’s this talk for a while of just there’s no more facts, nobody believes there is a single true fact anymore, everything is subjective, and I think Community Notes proves the opposite. Facts matter, there are facts that we can all agree with even on the most controversial topics.
Keith Coleman: Yeah, we saw this really from day one, when we would show those prototypes to people just depicting the idea, it was really obvious that people cared more about, or they cared a lot about understanding reality and what was going on and they were willing to disagree with their side, so to speak, to recognize that. And I think that’s not always that obvious to people. The world does feel really polarized, but people definitely are willing to cross partisan boundaries to get to accurate information and that’s why the product works.
Lenny Rachitsky: It feels like as we rely more and more on what we know and understand about the world is becoming social media online and moving this quickly, it’s like I’m so thankful this exists because otherwise it’d just be, what do we trust anymore? This being out aligns with we need this thing to exist at the same time. And it feels like at the same time there’s also people I just don’t trust. I think people have shifted from I trust what I read to, okay, I shouldn’t just believe everything I’m reading. Is there anything there you’re noticing about just how people think about news they see and their shift of just like, I’m not going to believe everything. Is there anything that you’ve noticed about just human behavior or just the way we’ve shifted understanding what is true?
Keith Coleman: We haven’t done any research to look broadly at how people’s perceptions are changing there, but I certainly have found myself that particularly seeing notes, I am more skeptical about what I read at first, and I think that’s been helpful. And we hear that from people, that they think about things a bit more, and I think that’s a good secondary effect and benefit of something like this, which is the more you see the patterns of how what you’re reading can be wrong, the more you can thoughtfully question it and try to get a better understanding of what’s really going on. So historically I think this was called media literacy, but the basic idea is can you understand the ways in which things can go wrong and try to catch them yourself.
Jay Baxter: Another aspect I think we help with that is discovery of the Community Notes. I think often before Community Notes you could have just been living in a little news filter bubble, or maybe there were fact checks out there that you should have been reading but you weren’t discovering them. So the fact that the note applies, it is directly attached to the post and visible by anyone who sees the post helps cross those filter bubbles and can kind of… I think for some people it’s the first time they’ve actually seen counter arguments to claims made in their own little echo chamber.
Lenny Rachitsky: That’s incredible, yeah. I love the point you’re making about how it actually teaches people to be a little more skeptical of the things they read. It’s an education system more than just, here, this one thing is wrong. I love that.
Okay, just a few more questions. There was an audience question asked on Twitter. I asked on Twitter, “What do people want to know about Community Notes?” One was actually why you guys switched to anonymous contributors, what was the decision behind that?
Keith Coleman: Yeah, we had this pilot where we were testing with a small number of contributors, a few thousand contributors, and we learned a lot through that pilot. Probably the biggest thing we learned was related to anonymity or pseudonymity of contributors. We had originally assumed that it was important that people contribute under their real handle, or their real name, or whatever it was. The first prototypes depicted that, we kind of thought that would be important for people trusting the note, and actually it was totally wrong. The best option was actually opposite of what we first tried.
We found a few things. One, people were hesitant to write a note on a controversial topic because they didn’t want to get attacked or harassed online. And so some people were comfortable doing this but others were not, and so it meant there were more potential good notes to be written than were getting written, and this was very clear feedback from the pilot.
Two, and this is super interesting, people are actually more willing to cross partisan boundaries when they are anonymous or pseudonymous than when they are under their real name, and it intuitively makes a lot of sense. If you are publicly using your name and you feel affiliated with one side versus the other, you might hesitate to be perceived as breaking with that side. But you may actually, for example, find a note helpful that’s critical of that side, and there’s a bunch of studies that show when people are anonymous, they’re much more willing to cross partisan boundaries and work with the other side, agree with the other side, and we saw that too. And so by allowing people to be pseudonymous, you actually get more honest answers about what they really think and it helps find disagreement that really-
Lenny Rachitsky: That’s so counterintuitive.
Keith Coleman: Yes.
Lenny Rachitsky: You hear the opposite always, and it’s so interesting that it’s the opposite.
Keith Coleman: Yeah, yeah.
Jay Baxter: I think the same principle applies to making the likes private.
Lenny Rachitsky: I was just thinking that.
Jay Baxter: Yeah.
Lenny Rachitsky: Yeah, I like a lot more stuff that’s a little, definitely, I wouldn’t have liked, yeah.
Keith Coleman: It allows freedom for honesty, which is pretty great. And one of the criticisms of pseudonymity is that maybe people reduce the quality threshold of what they put out there, but we have so many quality mechanisms in the system that that wasn’t an issue, so we could keep quality high while opening up for that honesty.
Lenny Rachitsky: Another question, you touched on this a little bit, which is around navigating the existing trust and safety apparatus of Twitter, which as you described, basically, previously, it was like we make decisions on what is true and not, and every company works this way, you guys basically upended that like, here’s a completely different way, you have no control over what we say is true or not. Talk about just that experience of overcoming that, I imagine, very difficult hurdle of like, okay, forget all that, we’re going to do it totally different.
Keith Coleman: Yeah, it was definitely, what we were proposing was very different. I will say that I think people were sort of open-minded to it, generally speaking, and I think everyone had a sense that what was being done at the time wasn’t really working that well or solving the problem, and people were open to new ideas, so that’s a good foundation.
But I think one thing we did that was probably very helpful in that is we wanted the product to prove itself at any point. First it had to prove that people could possibly find notes helpful, then it had to prove that people could possibly write these notes that would be good quality. And so anytime that we were proposing doing something with the product, like running some research test, or running the pilot, or expanding the pilot, we always had the data that had convinced us that that was a good decision, like we were stepping into the next phase of expansion that made sense. And so I think we probably rarely proposed anything that seemed unwise, because we were holding such a high bar for quality ourselves, and I suspect that went a long way.
Lenny Rachitsky: So it’s partly, what I’m hearing is, take it step by step to prove this is actually working, and partly be confident it is working to yourself before you try to convince the trust and safety team this is the way to go.
Keith Coleman: Exactly.
Lenny Rachitsky: Was there a moment along that journey it shifted from no way this is a thing to okay, wow, let’s actually consider this? Or was it this very gradual process?
Keith Coleman: Whether other people were saying no way to wow, let’s actually-
Lenny Rachitsky: Yeah, just internally of just like, okay, we’re going to actually stop this trust and safety way of operating and instead rely on Community Notes, was there a moment of like, okay, let’s actually make that switch, or was that Elon actually, is that the big switch?
Keith Coleman: The biggest change there happened in X. The biggest changes prior to that were just the decision to put this out there and have it be operating in public, at first at US-wide scale. But yeah, then the bigger switches came in the X period.
Jay Baxter: I think even though there was original research before Birdwatch had even started, or Community Notes had even started, from external researchers showing that crowdsourced fact-checkers, laypeople can do about as well as fact-checkers and actually the agreement rates were kind of similar between the groups. I think even though that research was out there, I think there were definitely a lot of people who didn’t really believe it could work until it already worked.
Lenny Rachitsky: Basically prove it, prove that it works. Yeah, that makes sense, versus just a bunch of docs and strategy and thinking, it’s just like, look, it’s actually working, you can see for yourself.
Jay Baxter: Yeah.
Lenny Rachitsky: Makes sense. Okay, possibly last question, we’ll see which fractals of questions you guys bring up here. I referenced this a couple times, this incredible achievement of keeping a project alive through Jack and then, I have this note, and Kayvon running the show then, and then Parag running Twitter, and then Elon, and then Linda taking over as CEO, quite rare, especially something this visible, this impactful to everything that X is. Any lessons or keys to that actually working, of this project surviving throughout so many org changes and leaders?
Keith Coleman: It definitely has been a crazy time to be building something. It’s been fun. The craziness has been entertaining. I think one reason perhaps the product has done so well and survived is the nature of the product itself. It is designed to produce information that is found helpful by people who normally disagree. And so even if you have CEOs or leaders who might disagree, there’s a good chance actually they’ll find it helpful, they’ll be like, wow, this thing does produce pretty useful output. So I think there’s something in the nature of the product itself, that when people see it, whatever side they’re on, left, up, down, they’re likely to find it pretty helpful, so I do think that helps.
I also think the team executed really well. We had ambitious goals that were exciting, they solved a real problem. This is a real problem that matters in the world. At every step, as we talked about, the product needed to prove itself, and we would make sure it proved itself and we would bring the results that convinced us and we’d share those with people. And so they would say, oh yeah, I agree, it kind of proved itself, let’s take the next leap. And we’ve done that all along the way and we continue to operate that way, and I think that focus on the outcome and goal that matters, and executing against it, really helped.
The team did not get distracted by much all through the period during which the acquisition happened. There was a lot of opportunity for distraction. This team was shipping every week, we were super focused on the goal, let’s make this thing work, let’s get these notes out there, and I think people saw that execution and were excited to support it.
Lenny Rachitsky: Yeah, like it’s working, why would we mess with that? And it’s important, and it keeps us from having to hire tens of thousands of people to fact check.
Keith Coleman: The interesting thing about that is no one ever asked us or brought up or seemed to care about anything related to cost savings in this process. And I think that’s an assumption people have outside the company, that this must have been a reason there was interest in it. But that was never a goal, it was not at all why the project was started, it was not why people were excited about the project. And I think that’s also, for people outside who maybe don’t see the conversations, it’s kind of a heartening thing to know, is that the focus was always on solving the problem. The other approach is even if you had 10,000 people doing it, the real issue is that they don’t work that well because they’re not trusted or they don’t scale or they’re too slow. And so the goal was really always just help people stay informed at scale. Let’s build an internet scale solution to an internet scale problem that people like.
Lenny Rachitsky: Something I heard about you, Keith, when I was asking people about how this worked and why this worked so well is that they describe you as having a very low ego, and that allowed you to give up this whole team and power and influence and just the name, forget it, whatever you want, we’ll call it Community Notes, great. Is there anything in there you can share of just how you think about that and how important that is as a product leader to have a low ego?
Keith Coleman: For me, this project, I feel like I get to do community service with this project. I see my work as in service of the people and the community, and that’s what motivates me. The only thing that I care about is delivering the outcome that the world finds helpful. And so in some ways the project has not been about ego, it’s about truth-seeking, let’s find… Not truth in the sense of what information is true, but let’s find out what’s actually going to make this work. How does it need to be structured, what should it be called? Whatever is going to produce the best outcome is what we should do. So I think I feel more attached to the product being helpful than to anything else, and so to whatever degree it might seem like low ego is probably more a result of wanting to actually solve the problem.
Lenny Rachitsky: And I think partly what I’m hearing is just if you win and succeed, good things will happen, so focus on that.
Keith Coleman: Certainly satisfying things will happen, it’s very satisfying to have people appreciate it. It’s satisfying that people on the left and right love it. It’s satisfying that even people who receive notes, love notes, and reach out to them and post them, that’s amazing, it feels so good to have helped give people that, and yeah, it’s very motivating. It’s a great reason to wake up in the morning.
Lenny Rachitsky: It’s absurd this has worked, but it’s also like of course this would work, of course something like this should work. It’s like such interesting-
Keith Coleman: It’s the internet, it’s of the internet, that’s why it works.
Lenny Rachitsky: Oh man. Where’s Community Notes going from here? What’s happening, where’s it going, what’s the future?
Keith Coleman: We’re always working on basically more better notes faster. So there’s clearly an opportunity to get more notes out there, we want them to stay as good or better than they are, we want to get them there faster, so we’re always working on core product changes to help deliver that. Recently, for example, we just released an update to what we call the Community Notes bat signal, or the ability to request a Community Note. So anyone on X can say, “Hey, I think this post needs a Community Note,” and now they can even add a source explaining why so that when a prospective writer sees that it’s much easier for them to write a note. So we’re always working on core things like that, core algorithm improvements.
I think there are also new frontiers that show a lot of potential, AI and LLMs are one. It’s easy to imagine a lot of ways that AI could assist the people in this task they’re doing of trying to get information out there quickly. And maybe Jay should talk about the Supernotes work that we’ve done with some folks outside the company.
Jay Baxter: Yeah, so one cool thing about having public data and code is that external researchers can collaborate with you, and in this case Supernotes had this idea that we can basically take existing notes as input, existing proposed notes that maybe they have some problem, maybe they have part of the story, maybe they’re worded in kind of a biased way. Basically take all these in, have an LLM generate a ton of different variants, and then basically make a simulated jury to basically get a representative group of contributors for Community Notes who would be rating the note and try to predict based on their past ratings how they would rate these LLM-generated notes. And so this way you can actually, rather than just having an LLM write a note from scratch and hoping it’s good, you can simulate the entire Community Notes rating process and explicitly create notes that are likely to be rated helpful by people.
So I think ideas like that are very promising for the future, and it’s a nice way that LLMs and humans can work together. Obviously agents can browse the web too, and that’s one way that you could imagine agents assisting humans is maybe checking whether a note is actually supported by the source. Although then you get into things like, well, are people going to actually be as diligent? Right now I think raters are very diligent because they know just some Community Notes contributor wrote this like, I better check this before I rate it helpful. But hopefully we can design things in a way such that people don’t trust the output and actually verify it themselves before issuing a helpful rating.
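The loop Jay describes for Supernotes (generate variants, score them with a simulated jury, keep the one most likely to be rated helpful) might be sketched roughly as follows. This is a toy illustration under stated assumptions, not the Supernotes implementation: `Rater`, `predict_helpful`, `jury_score`, and `pick_best_variant` are hypothetical names, the candidate texts stand in for LLM output, and the keyword-based predictor is a placeholder for a model trained on each contributor's past ratings.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Rater:
    """A simulated juror. `leaning` is a hypothetical viewpoint label."""
    name: str
    leaning: str
    disliked_words: frozenset  # toy proxy for what this rater has rated unhelpful


def predict_helpful(rater, note):
    """Toy predictor: score in [0, 1] that `rater` would find `note` helpful."""
    words = set(note.lower().split())
    penalty = len(words & rater.disliked_words)
    return max(0.0, 1.0 - 0.5 * penalty)


def jury_score(jury, note):
    """Lowest per-group mean score, so a note must do well with every group."""
    by_group = {}
    for rater in jury:
        by_group.setdefault(rater.leaning, []).append(predict_helpful(rater, note))
    return min(mean(scores) for scores in by_group.values())


def pick_best_variant(jury, variants):
    """Return the candidate note with the highest simulated jury score."""
    return max(variants, key=lambda v: jury_score(jury, v))


# Hypothetical jury and hand-written stand-ins for LLM-generated variants.
jury = [
    Rater("r1", "A", frozenset({"shameful"})),
    Rater("r2", "A", frozenset({"shameful", "obviously"})),
    Rater("r3", "B", frozenset({"hoax"})),
    Rater("r4", "B", frozenset({"hoax", "obviously"})),
]
variants = [
    "This shameful post is obviously wrong",
    "This hoax was debunked",
    "The photo was taken in 2019 in another city",
]
best = pick_best_variant(jury, variants)
print(best)
```

Taking the minimum across groups is one simple way to reward notes that both sides are predicted to like; the actual project may combine predictions differently.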
Lenny Rachitsky: Yeah, that is such an interesting area to explore where you want to avoid AI hallucinating slop versus make it easier and scale it even further. What an interesting challenge.
Keith Coleman: What’s cool about this project, in addition to the AI element, is that it’s being done outside the company. We talked earlier about the open source transparency. The key reason we made this all open source was so people could see how it worked, but the dream is actually that, it’s not just that the contributions to the notes and ratings are from the people, but the dream is actually the product is built by the people. What if the scoring algorithm were significantly or entirely written by the public? That would be incredible. And Supernotes is probably the first very substantial potential change in the algorithm of the way it works, that was kind of coming from the outside and plausibly could be part of the core, so we’d love to see the product go in that direction as well.
Lenny Rachitsky: Sweet, go Supernotes. Well guys, the work you’re doing is tremendous. This is every product person’s dream, I think, to work on something like this. Small team, lots of support, lots of impact, just innately interesting, and so I think this is going to inspire a lot of people.
So let me just ask you, is there anything else you wanted to share? Anything else you think might be helpful for folks to leave them with?
Jay Baxter: Sure, I guess one thing that just I thought was interesting over the course of working on this product is just there’s… I think in a similar way to how retweets originally were not something Jack came up with, I think users just started doing it and then it became a core part of the product. There’s a huge way already in which there’s just a lot of surprising things that people wanted to use Community Notes for that I don’t think we really expected, and it’s kind of cool to see those user desires emerge.
I think one example, I guess we had always been imagining political type of misinformation, but for whatever reason there’s a lot of people who love debating whether Messi or Ronaldo got more goals. I guess it’s kind of a funny one. There’s a community moderation aspect, so I think we also thought that this would be specifically for adding context to misleading or potentially misleading information, but what you can see is that there are some notes that go beyond that towards calling out content that they think is spammy or something. So I think that’s just another dimension in which Community Notes is a product that’s driven by the people.
Lenny Rachitsky: That’s so beautiful, basically they’re trying to keep Twitter/X healthy and they’re just like, no, this should be taken down, this tweet is spam.
Jay Baxter: Yeah.
Lenny Rachitsky: I love that. Is there an answer on the Messi versus, who is the other soccer player?
Jay Baxter: Ronaldo.
Lenny Rachitsky: Ronaldo, okay. Is there a definitive fact there or is that just unknowable?
Jay Baxter: Yeah, I guess that’s an interesting one because it’s a case where raters are actually very polarized. I guess it actually kind of fits into the core algorithm where there’s some people who are just diehard Messi fans or Ronaldo fans, just like they could be on politics, so we actually specifically modeled that topic, as well as some other topics, so we can estimate people’s opinion on that particular debate. It’s kind of funny that something like that would emerge.
Lenny Rachitsky: You’re saying that’s the most controversial topic on X, Ronaldo versus Messi.
Jay Baxter: That’s a controversial one.
Lenny Rachitsky: Oh wow, who knew? Okay. Keith, is there anything you wanted to add?
Keith Coleman: Yeah, Community Notes is cool itself, but I think what it points to about society is actually even bigger. Society often feels really polarized, you hear people talk about it all the time, no one can ever agree on anything, but actually Community Notes shows you people really can agree on quite a lot. Even on super controversial topics related to politics and everything, there’s a lot of agreement, that’s why notes work.
And I think that’s a really big reason for optimism about the world, is that while it might feel polarized, there’s probably like an 80% set of people that agree on quite a lot of things. And imagine if we could use the same kind of approaches we use with notes, but to find agreement on legislation, or policies, or things like that that people want the government or the world to do, possibly we could get a lot more momentum behind these ideas that the people really want and everyone would be a lot happier. Maybe 10% of the people on the edges wouldn’t be happy, but I bet there’s a lot of agreement that we are not identifying, and if we did it, we’d all be pretty happy. So I don’t know, I think it’s easy for people to feel pessimistic about the world, but I think this product is a good reason to be optimistic about the future.
Lenny Rachitsky: What an incredible way to end it. I can also see, Keith, why people want to join you and work with you and work on this team.
Keith Coleman: Appreciate it. If you do want to join, we are hiring an ML engineer. You get to work on these amazing problems with us and have a lot of fun, so we’re accepting applications at x.com/communitynotes.
Lenny Rachitsky: Okay, great, I’m glad you gave the URL. Oh man, you’re about to get flooded.
Guys, thank you so much for doing this. Is there anywhere other than that place to go off, join the team as an ML engineer, is there any other place you want to point people to, either your socials or anything else?
Keith Coleman: I’m KeithColeman on X, please reach out if you have any feedback or want to help us out, whether you may want to work here or want to do something from the outside, we would love to talk.
Jay Baxter: Yeah, I’m @JayBaxter on X. Yeah, I think in particular, besides just using Community Notes, it would be great to get more substantial contributions, pull requests, collaborate on projects like Supernotes, I think that’s the most exciting type of stuff if people do want to contribute.
Lenny Rachitsky: Ship some code guys. Amazing. Guys, thank you so much for doing this.
Keith Coleman: Thanks for having us, Lenny.
Jay Baxter: Thank you, thanks so much.
Lenny Rachitsky: It’s my pleasure. Bye everyone. Thank you so much for listening. If you found this valuable, you can subscribe to the show on Apple Podcasts, Spotify, or your favorite podcast app. Also, please consider giving us a rating or leaving a review as that really helps other listeners find the podcast. You can find all past episodes or learn more about the show at LennysPodcast.com. See you in the next episode.
Glossary
| English | 中文 |
|---|---|
| A/B testing | A/B 测试(即对照实验,保留原文写法) |
| Amazon MTurk | Amazon MTurk(亚马逊的众包微任务平台,保留原文) |
| Arma 3 | Arma 3(军事模拟视频游戏) |
| Asana | Asana(项目管理工具,保留原文) |
| Birdwatch | Birdwatch(Community Notes 的项目原名) |
| Blake Scholl | Blake Scholl(Boom Supersonic 创始人兼 CEO) |
| bridging-based agreement algorithm | 基于桥接的共识算法(bridging-based agreement algorithm) |
| Chris Bale | Chris Bale(研究者,保留原文) |
| Community Notes | Community Notes(X 平台的社区注释功能,暂保留原文) |
| Contributor | 贡献者(Contributor) |
| DAU | DAU(Daily Active Users,日活跃用户数) |
| echo chamber | 回音室 |
| Elon | Elon(指 Elon Musk,X/Twitter 所有者) |
| Figma | Figma(设计工具) |
| filter bubble | 过滤气泡 |
| IC | IC(Individual Contributor,个人贡献者,非管理岗) |
| Jack | Jack(指 Jack Dorsey,Twitter 前CEO) |
| Jay Baxter | Jay Baxter(X 高级工程师,Community Notes 算法负责人) |
| Jira | Jira(项目管理工具,保留原文) |
| Kaggle | Kaggle(数据科学竞赛平台,保留原文) |
| Kayvon | Kayvon(Twitter/X 高管,Keith Coleman 的上司) |
| Keith Coleman | Keith Coleman(X 副总裁,负责 Community Notes) |
| Lenny Rachitsky | Lenny Rachitsky(播客主持人,《Lenny’s Newsletter》作者) |
| Linda | Linda(Linda Yaccarino,X 现任 CEO,保留原文) |
| matrix factorization | 矩阵分解(matrix factorization,一种机器学习方法) |
| media literacy | 媒介素养 |
| Messi | Messi(Lionel Messi,阿根廷足球运动员) |
| Meta | Meta(科技公司,Facebook 母公司) |
| Monday.com | Monday.com(项目管理工具,保留原文) |
| OKR | OKR(Objectives and Key Results,目标与关键成果管理方法) |
| PageRank | PageRank(Google 的网页排名算法,保留原文) |
| Parag | Parag(Parag Agrawal,Twitter 前CEO,保留原文) |
| Periscope | Periscope(Twitter 旗下的直播视频应用) |
| pseudonymity | 化名性 |
| Ronaldo | Ronaldo(Cristiano Ronaldo,葡萄牙足球运动员) |
| Spaces | Spaces(X/Twitter 的语音直播聊天室功能) |
| Supernotes | Supernotes(一个利用 LLM 辅助生成 Community Notes 的外部研究项目) |
| Thermal | Thermal(Twitter 内部的隔离式创新团队机制) |
| Twitter 2.0 | Twitter 2.0(Elon Musk 收购 Twitter 后的改革计划名称) |
| Vitalik Buterin | Vitalik Buterin(以太坊联合创始人) |
| difference in differences | 双重差分(difference in differences,一种因果推断方法) |
深入了解 X 的 Community Notes | Keith Coleman & Jay Baxter
转录文本
开场预览
Lenny Rachitsky: 你们所做的工作对世界的运作方式产生了如此巨大的影响。我想先让大家简要了解一下什么是 Community Notes。
Keith Coleman: X 上的用户可以看到一条帖子。如果认为它有误导性,就可以提出一条他们认为其他人会觉得有帮助的注释。然后其他用户可以对这条注释进行评分。
Jay Baxter: 我们实际上是在寻找过去曾经持不同意见的人之间的共识。我们发现,当人们真正达成那种出人意料的共识时,注释就会变得非常中立、准确,总体来说写得非常好。
Lenny Rachitsky: 有很多人立场非常极化。你们怎么应对那些超级反疫苗的、超级挺 1 月 6 日事件的人?
Keith Coleman: 一个重要的理念是,我们希望全人类都能参与,有时候人们对此感到惊讶。我们拥有全人类(的数据),然后就有数据来判断哪些注释对真正的人类是有帮助的。每条帖子都有资格获得注释。我们不应该豁免 Elon,不应该豁免政府人物,应该一视同仁……甚至广告商也能收到注释。
Jay Baxter: 有完全独立于我们的外部研究,他们发现如果你把一条带有和不带有 Community Note 的帖子做对比,人们在看到带注释和不带注释时,对帖子核心主张的认同程度确实会发生变化。
Lenny Rachitsky: 关于在 Elon 运营的组织中为他工作这件事,还有什么可能让人惊讶的?
Keith Coleman: 如果我要在那种公司里创业,我会把它做得比我以前做的更精简。我对这个小团队能够完成这么多工作感到惊讶,我觉得正是因为是小团队——
节目介绍
Lenny Rachitsky: 今天我的嘉宾是 Keith Coleman,Community Notes 的产品负责人,以及 Jay Baxter,Community Notes 的创始机器学习工程师和研究员。这次对话可能是我迄今为止最喜欢的一期播客。Community Notes 是目前世界上最具影响力、最巧妙,同时也是最被低估的产品之一。
如果你用过 X/Twitter,并在一条推文下面看到一条纠正该推文中错误信息的注释,那就是 Community Notes。我从未听过关于这个产品背后故事和构建团队的深入介绍,我很高兴能为你带来这些。我们会谈到这个产品令人惊讶的起源故事、算法实际的工作原理、算法如何从 Twitter 内部竞赛中诞生、Community Notes 背后的原则,以及为什么忠于这些原则对其成功如此关键。还有,它如何在四位不同的领导者——包括 Elon 和 Jack——手下幸存下来,以及为什么它现在已成为解决互联网错误信息方案的重要组成部分,包括最近被 Meta 采用作为其主要事实核查工具。这是一期非常特别的节目,我非常激动能带给你。
正式访谈
Lenny Rachitsky: Keith 和 Jay,非常感谢你们来到这里。欢迎收听播客。
Keith Coleman: 很高兴来到这里。
Jay Baxter: 谢谢邀请我们。
Lenny Rachitsky: 这是我的荣幸。我非常激动能有这次对话。你们所做的工作对世界的运作方式产生了如此巨大的影响。很多产品团队总是在谈论推动影响力、想要推动影响力,而你们真正构建了以有意义的方式改变世界的东西,并且继续在做。我真的从未听过 Community Notes 是如何诞生的、如何运作的等等这些背后的故事,所以我非常感谢你们抽出时间来聊。
Keith Coleman: 是的。首先,谢谢你这么说。这就是我们构建这个东西的原因——帮助人们,很高兴听到你这样说。很高兴看到人们喜欢它、觉得它有用。
Community Notes 是什么
Lenny Rachitsky: 我想先让大家简要了解一下什么是 Community Notes。我想很多人大概听说过它,大概在 X 上见过——刷的时候看到这些注释,但他们会说:“我不太清楚这到底是什么。”所以你能否简要描述一下什么是 Community Notes?
Keith Coleman: Community Notes 是一种让公众——让普通人——为可能有误导性的帖子添加上下文的方式。它的基本运作方式是:X 上的用户看到一条帖子,如果认为它有误导性,就可以提出一条他们认为其他人会觉得有帮助的注释。然后其他人可以对这些注释进行评分。如果一条注释被那些通常持不同意见的人认为是有帮助的——这表明它可能是准确的、措辞可能是真正中立的、可能是提供有用信息的——那么它就会展示给 X 上的所有人。目标就是让人们获得更多关于他们所看到内容的信息,以便在生活中做出更好的决定。
算法如何运作
Lenny Rachitsky: 太棒了。我听到这些,觉得这能运行简直不可思议。我想人们最初听到这个想法的时候,都会说“这不可能行得通”。那么,再深入一点,你能否让我们更深入地理解它究竟是怎么运作的?因为我觉得关键是你们设计的那个算法太巧妙了,才让这一切成为可能。请谈谈那个算法。
Jay Baxter: 是的。我觉得很多人的一个关键误解是,如果他们没有真正深入了解细节,就会以为也许有人写了一条注释,它就立刻出现了,或者我们只是用少数服从多数的投票来决定谁觉得注释好。我觉得这两种做法都可能导致有偏见或不准确的注释。我认为我们做的真正关键的一点是,我们实际上寻找的是过去持不同意见的人之间的共识。
当我们看到人们真正达成那种出人意料的共识时,这正是注释之所以如此中立、准确、措辞精良的原因。那些在整体上非常极化的人,在不准确的事情上往往是找不到共识的,对吧?我认为这也提供了一些很好的防操纵特性。人们经常……如果说……回想2020年我们还没开始构建这些的时候,这到底能不能行,我想一屋子机器学习工程师会说:“哦,你必须保持闭源。人们会一直操纵这个系统。你必须使用事实核查员提供的真值标签。没有外部标签,你不可能引导这个系统启动。“但事实证明,你可以用这种我们称之为基于桥接的共识算法(bridging-based agreement algorithm)来实现。
Lenny Rachitsky: 好的。总结一下,说得清楚一些。就是人们……有人写了一条注释。这个信息是错误的……举个什么好例子呢,我们聊到这里,一个经典的例子是什么?
Keith Coleman: 一个非常经典的例子是 AI 生成的图片,或者是脱离了上下文的图片——“看看这里发生了什么。”但实际上那是五年前在另一个国家、完全不同的主题下的照片——
Lenny Rachitsky: 天哪,这种我见过太多次了,就是那种“看看旧金山发生了什么”,然后我一看,“不,这完全是另一个城市,那根本不是——”
Keith Coleman: 完全正确,是的。
Lenny Rachitsky: 好。所以有人发了这张 AI 图片。有人写了一条注释:“这实际上是五年前在另一个城市拍的。”然后这个算法帮助判断这条注释是否真实,而做这些的就是普通人。
Jay Baxter: 没错。是注册成为 Community Notes 贡献者的普通人。所以会有一些门槛,比如你必须有一个经过验证的电话号码。但归根结底,这些就是普通人,不一定是职业事实核查员之类的。
开放参与的原则
Keith Coleman: 是的,这一点对我们来说也非常重要。最开始就有这样一个问题,也是 Jay 刚才提到的——“有没有人觉得这能行?”显然,这是一个疯狂的想法。我们不知道普通人能不能完成这项任务,人们当然也担心他们能不能做得好。
最初,公司内部一些人建议说:“嘿,你们为什么不找记者或者某个特定群体来做第一批参与者呢?”但我们非常明确地说:“不。我们试图摆脱的就是围绕这件事做编辑筛选式决策的思路。这应该对所有人开放。”所以我们非常刻意地让所有人都能参与进来。参与者是随机选择的,这对于让人觉得公平、开放、值得信赖非常重要。
Lenny Rachitsky: 是的。再说一次,这听起来就像是辨别真相的圣杯,而它居然真的有效。而且效果好到 Meta 最近——大家都知道——决定采用完全一样的系统来替代他们原本数以万计的事实核查员审核内容的方式。
Jay Baxter: 我想做一个区分,也许听起来有点吹毛求疵,但我觉得很重要——Community Notes 是添加额外的上下文,它不一定是事实核查,对吧?所以有些情况下帖子本身可能是真的,但只是有误导性,因为没有上下文或缺少上下文。这些情况我们也能覆盖到,我觉得这是一个重要的区别。我们的理念就是用户应该能够自己做判断,对吧?就像“这里是额外的上下文,接受不接受随你。”
Community Notes 的趣味案例
Lenny Rachitsky: 对。我想起来了,你跟我分享过这个例子——一张猫的照片,然后有人的 Community Notes 就写了“那是一条狗”。还是反过来,那个——
Jay Baxter: 对。原帖是“一个巴勒斯坦男孩把面包分给一只狗”,配了一张猫的照片。所以这条注释确实不是特别必要,因为它只是说“那是一只猫”,然后附了一个猫的维基百科链接。这是一个很好的例子,说明这个系统——这不是什么职业事实核查员会觉得需要核查的事情。但它证明了归根结底这个系统确实是由用户在运行,而且也增添了一些喜剧效果吧。而且注释本身是正确的。
Lenny Rachitsky: 好,这很重要。
Jay Baxter: 是的。
什么样的帖子可以触发 Community Notes
Lenny Rachitsky: 一条帖子什么时候会触发 Community Notes 的审核?有没有一个门槛?还是说你可以给任何帖子写 Community Notes,然后由人们来投票决定?这是怎么运作的?
Keith Coleman: 每条帖子都有资格获得注释,这又是一个非常重要的原则。就像——“我们不应该豁免 Elon,不应该豁免政府官员,我们应该……”所有人,甚至广告主,都可能被加上注释。所以平台上的任何帖子都可能获得注释。你在实际中可以看到,注释出现在世界各国领导人、Elon、广告、媒体机构的帖子下面,当然也出现在普通社交媒体用户的帖子下面。但没错,核心理念就是一个公平的竞争环境。要提出一条注释,提出者必须已经获得了写注释的资格。所以确实有这方面的要求——你必须先取得资格才能做这件事。而获得这个资格的方式是通过你的评分记录来证明你有能力帮助识别那些被广泛人群认为有帮助的注释。基本上,如果你有能力辨别、识别什么对很多人来说是有帮助的,那你就获得了开始撰写注释的资格。
Lenny Rachitsky: 我其实注册成了……你们怎么称呼这些人?写注释的——
Jay Baxter: 贡献者(Contributors)。
Lenny Rachitsky: 好,贡献者。是的,我一直在评分。我还没有达到——
Keith Coleman: 不错。
Lenny Rachitsky: 我还写不了注释。
Keith Coleman: 是的,这不是特别容易,需要一些努力。
Community Notes 的规模
Lenny Rachitsky: 你能分享一些关于目前 Community Notes 规模的数据吗,尤其是那些可能会让人们感到意外的?
Keith Coleman: 好的。这个服务正在快速增长,目前每天有数百条注释。作为一个对比参照,我最近看到 UC Berkeley 有人提供的数据,说传统的专业事实核查每天大约只有10条。相比之下,每天有数百条注释被展示出来。它们涵盖的话题非常广泛——从显而易见的政治、新闻,到娱乐、体育、游戏。基本上就是当天发生的任何事情。
注释的覆盖范围与增长
Keith Coleman: 除了每天有数百条独立的注释之外,一条注释还可以匹配到多条帖子。比如有人对一张图片或一段视频写了注释,假设它是 AI 生成的之类的,这条注释会自动匹配到所有包含相同图片的帖子。所以一条注释可以匹配到数千条帖子。以2024年为例,我们大约有95,000条注释,被展示了约300亿次。这比前一年翻了一倍多。前一年大约是37,000条注释,被展示140亿次。增长速度非常惊人——300亿次浏览,意味着大量原本不会出现的信息得以传播,这非常了不起。增长如此迅速的部分原因是贡献者群体在不断扩大。全球大约有950,000名贡献者,将近一百万人参与其中,这太了不起了。
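A quick arithmetic check of the figures Keith quotes here. The inputs are the rounded numbers from the interview (about 95,000 notes and 30 billion views in 2024, about 37,000 notes and 14 billion views the year before), so the ratios are approximate; the variable names are illustrative only.

```python
# Rounded figures as quoted in the interview.
notes_2024, views_2024 = 95_000, 30_000_000_000
notes_prev, views_prev = 37_000, 14_000_000_000

note_growth = notes_2024 / notes_prev          # growth in notes shown
view_growth = views_2024 / views_prev          # growth in note views
views_per_note_2024 = views_2024 / notes_2024  # average views per shown note
views_per_note_prev = views_prev / notes_prev

print(f"notes grew {note_growth:.2f}x, views grew {view_growth:.2f}x")
print(f"views per note: {views_per_note_2024:,.0f} (2024) vs {views_per_note_prev:,.0f} (prior year)")
```

On these rounded inputs, views grew a little over 2x, consistent with "more than doubled", while the number of notes shown grew faster, so the average views per note comes out somewhat lower in 2024.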
Lenny Rachitsky: 哇,我也算其中一个,对吧?我也算贡献者?
Keith Coleman: 是的,没错。如果你注册成为贡献者,就算。
Lenny Rachitsky: 好的,太酷了。
Jay Baxter: 而且候补名单上还有很多人,所以增长空间还很大。关于媒体和URL的匹配,我认为这是扩大覆盖面的一个巨大途径。同时,我们也非常谨慎地确保这些匹配是精确的。因为我认为与其他类型的事实核查相比,人们喜欢 Community Notes 的一个原因是,注释确实是针对你看到的特定说法量身定制的,对吧?传统的事实核查警告通常只会说类似”点击此处获取事实”之类的话,然后链接到一个关于投票信息的通用页面,把信息藏在一次点击之后,这种方式毫无帮助。所以把上下文直接呈现出来,让你零点击就能看到,并且保持针对性,这是极其重要的。
用户反馈与通知机制
Lenny Rachitsky: 有一个我很喜欢的功能,我想你们一定对此深思熟虑过——如果我之前给某条帖子点了赞,后来出现了 Community Note,我会收到通知,这样我就不会继续记住那条错误信息。
Keith Coleman: 是的。我们尽可能让注释更快出现,理想情况下希望它们即时显示。但不可避免地,从帖子发布到人们搞清楚怎么回事、再到注释上线,中间总会有一个时间差。所以我们发送这些通知来尽量弥补这个时间差。是的,这个功能收获了很多好评。我们看到人们截图分享,对此很兴奋。这也是互联网和社交媒体世界能做到的一件很酷的事——在印刷媒体或传统新闻领域很难做到。在报纸上你可能第二天才能在一个不起眼的角落看到一条更正。而在这里,只要你与某条帖子有过互动,注释出现时你就会收到推送通知。
Lenny Rachitsky: 一个用户反馈是——我希望推送能直接告诉我”你看到的这条有误”。因为我觉得我得点进去自己读,而我觉得推送完全可以直接写”这条内容有更多背景”。就像——
Jay Baxter: 同意。
Keith Coleman: 我们会去看看这个建议——
Lenny Rachitsky: 好了,实时用户反馈。
Keith Coleman: 不错。
注释的触发阈值
Lenny Rachitsky: 好的。我想聊聊这整个事情的起源故事。但还有两个问题,因为我们正聊到这儿。第一个是,一条注释要在什么条件下才会显示出来?这个信息能分享吗,具体是怎么运作的?
Jay Baxter: 简单来说,由于算法的具体实现方式,它使用了一种叫 Matrix Factorization(矩阵分解)的机器学习算法,通过 Gradient Descent(梯度下降)等方式来拟合。阈值是这个人为设定的尺度上的0.4——
Lenny Rachitsky: 0.4,好的。
Jay Baxter: 对。实际操作中,它基本上意味着……如果存在与注释相关的对立分歧。显然,有些注释不涉及政治或其他引发分歧的话题。但如果存在分歧的话,那么双方中通常都需要有相当多的人认为这条注释是有帮助的。除此之外还有其他规则会发挥作用。所以即使超过那个阈值,也可能被过滤掉——有一个单独的算法会审查人们在“不正确”标签上的一致性。比如,有人可能觉得注释有帮助但不正确,对吧?这种情况是有的。在这些情况下,不管有没有超过有帮助的阈值都不行。
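The model Jay describes, matrix factorization fitted by gradient descent, can be illustrated with a toy version. The sketch below is an illustration under stated assumptions, not the production scorer: it assumes each rating is predicted as a global mean plus a rater intercept, a note intercept, and the product of one-dimensional rater and note factors, with the note intercept acting as the helpfulness score. The function name `fit`, the regularization weights, the learning rate, and the toy ratings are all hypothetical; only the use of matrix factorization, gradient descent, and a 0.4 threshold comes from the interview.

```python
import random

HELPFUL_THRESHOLD = 0.4  # figure quoted in the interview; only meaningful on the real model's scale


def fit(ratings, n_raters, n_notes, steps=5000, lr=0.01,
        lam_intercept=0.15, lam_factor=0.03, seed=0):
    """Fit r ~ mu + bu[u] + bn[n] + fu[u] * fn[n] by full-batch gradient descent."""
    rng = random.Random(seed)
    mu = 0.0
    bu = [0.0] * n_raters                      # rater intercepts
    bn = [0.0] * n_notes                       # note intercepts ("helpfulness")
    fu = [rng.uniform(-0.1, 0.1) for _ in range(n_raters)]  # rater viewpoint factor
    fn = [rng.uniform(-0.1, 0.1) for _ in range(n_notes)]   # note viewpoint factor
    for _ in range(steps):
        # Start each gradient with the L2 regularization term.
        g_mu = 0.0
        g_bu = [2 * lam_intercept * b for b in bu]
        g_bn = [2 * lam_intercept * b for b in bn]
        g_fu = [2 * lam_factor * f for f in fu]
        g_fn = [2 * lam_factor * f for f in fn]
        for u, n, r in ratings:
            err = mu + bu[u] + bn[n] + fu[u] * fn[n] - r
            g_mu += 2 * err
            g_bu[u] += 2 * err
            g_bn[n] += 2 * err
            g_fu[u] += 2 * err * fn[n]
            g_fn[n] += 2 * err * fu[u]
        mu -= lr * g_mu
        bu = [b - lr * g for b, g in zip(bu, g_bu)]
        bn = [b - lr * g for b, g in zip(bn, g_bn)]
        fu = [f - lr * g for f, g in zip(fu, g_fu)]
        fn = [f - lr * g for f, g in zip(fn, g_fn)]
    return mu, bu, bn, fu, fn


# Toy data: raters 0-1 form one camp, raters 2-3 the other (1.0 = helpful).
RATINGS = (
    [(u, 0, 1.0) for u in range(4)]                      # note 0: everyone agrees
    + [(0, 1, 1.0), (1, 1, 1.0), (2, 1, 0.0), (3, 1, 0.0)]  # note 1: one camp only
    + [(0, 2, 0.0), (1, 2, 0.0), (2, 2, 1.0), (3, 2, 1.0)]  # note 2: the other camp only
)
mu, bu, bn, fu, fn = fit(RATINGS, n_raters=4, n_notes=3)
for note_id, intercept in enumerate(bn):
    print(note_id, round(intercept, 3), "factor:", round(fn[note_id], 3))
```

In this toy, the note rated helpful by both camps ends up with the highest intercept, while the one-sided notes have their support explained by the factor term instead. As Jay says right after this, the 0.4 threshold is a point on an arbitrary scale, so it is not applied to the toy intercepts, whose scale depends on the made-up hyperparameters.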
Lenny Rachitsky: 我这种理解可能不对,但意思是40%通常持不同意见的人表示同意——
Jay Baxter: 不是。
Lenny Rachitsky: 好的,那是——
Jay Baxter: 完全不是那个意思。它只是某个任意尺度上的一个数字——
Lenny Rachitsky: 好的。
Jay Baxter: 对。
Keith Coleman: 是的。如果我们改变算法中的其他随机因素,这个数字也得相应变成另一个看起来同样任意的数字。我们是通过收集用户反馈来确定这些数字的。我们可以向人们展示大量注释,收集哪些是有帮助的反馈,然后一条分界线就自然浮现出来了——标示着从”有争议”到”相当明确有帮助”的界限。
Jay Baxter: 是的。而且顺便说一句,目前的阈值设置是非常保守的。我们对质量非常讲究,真的希望注释质量非常高。我和 Keith 都相信,归根结底我们成败取决于注释的质量。所以我们宁可不去展示一条可能不错但信号不够充足的注释,也不愿反过来。
Lenny Rachitsky: 这太有道理了。我从没见过一条错误的 Community Note,打破这个承诺后果很严重。所以我完全理解你们为什么在那方面超级保守。
注释的通过率
好的,还有两个问题,因为我就好奇。这些不在我的问题清单上,但我觉得大家会想知道。写了多少注释,最终实际展示出来的又有多少——
Keith Coleman: 提交的注释中我们大约展示8%左右。随着时间推移,这个比例大概在7%到10%或11%之间浮动。数字可能会有一些变化。正如 Jay 所说,毫无疑问……你也看得出来,确实有比我们展示的更多的好注释,但我们的目标是保持极高的标准。我们希望在注释确实有帮助的时候才展示,在不会显得有偏见、不会损害系统公信力的时候才展示。我们希望这些注释是中立的、信息丰富的、有帮助的。正如 Jay 所说,我们认为最糟糕的错误就是展示了一条不好的注释,因为那会损害公信力,而公信力正是人们喜欢这个产品的原因。
所以门槛就在那里。正如我所说,剩下的那90%里,确实有一些是好的。然后有很多就是质量一般的,还有一些是不好的。如果你写了一条不好的——“不好”的定义是:通常持不同意见的人都认为这条注释没有帮助,正好和我们展示的那些相反——如果你写了一条通常持不同意见的人都认为没有帮助的注释,你最终会失去写注释的权限,需要重新挣回来。那90%是一个混合体。有时候人们看到这个数字会说:“哎,你们为什么不多展示一些?“答案就是:“你大概实际上不希望我们展示那些中的大部分。“这个系统的价值所在,就是能够筛选出好的那些。
极端观点用户的处理
Lenny Rachitsky: 这说得通。还有一个问题是,有很多人观点非常两极化,对很多事情都极度持反对态度。他们在算法中是怎么被筛选的?你们怎么处理那些极度反疫苗、与1月6日事件相关、持有各种极端观点的人?
Jay Baxter: 如果人们确实如此两极化,以至于在通常持不同意见的人之间都无法达成共识,那这条注释可能是正确的,但作为上下文展示出来未必有帮助。也许它涉及的是一个人们已经形成了根深蒂固观点的话题,而且他们已经阅读了大量相关内容。这可能确实不会增进人们的理解,也不会带来好的用户体验。所以在这些情况下,不展示注释未必是坏事。
几年前,人们相当悲观,认为事实核查永远不会改变人们对真相的认知。实际上,有一些完全独立于我们的外部研究发现,如果把带有 Community Note 的帖子和不带 Community Note 的帖子做对比,人们在看到注释后,对帖子核心主张的认同度确实会发生变化。所以我们确实在影响这个人们以前认为可能很难做到的事情。因此,聚焦在能达成桥接共识的案例上是很好的。我还想说,算法中也有一个声誉机制。如果你持续以违背基于桥接的共识算法所达成的共识的方式对注释进行评分,我们就会停止计入你的评分。如果你是那种不断把差评注释评为”有帮助”的人,我们确实会把你过滤掉。所以这两种人——那些恶意评分的人和那些只是观点两极化但诚实参与的人——是有区别的。
让全人类参与的哲学
Keith Coleman: 对。我认为一个重要的哲学理念是,我们希望全人类都能参与。有时候人们听到这个会感到惊讶,他们会说:“不是有些人不应该参与吗?“或者”他们的思想太极端了,也许不应该让他们参与。“但我们的观点是,实际上我们希望全人类都在这里。因为如果我们拥有全人类,我们就有了数据,可以理解什么样的注释对真正的人类是有帮助的。我们可以更好地建模、更好地理解、更好地展示这些注释。所以,拥有持有各种各样观点的人对我们是有利的。我们不期望每条注释都能被每个人喜欢——这是不可能的标准。但我们确实希望展示那些80%的人看了会说”哇,我很高兴知道了这个信息”的注释。从这个意义上说,不管别人觉得某个人的观点有多极端,让他们参与进来仍然是件好事。所以无论你的观点是什么,请注册并参与进来,这有助于识别真正有帮助的内容。
Lenny Rachitsky: 好。我们会附上链接,方便想注册的人知道怎么操作。有一点我们之前没有明确说明,这些都是志愿者。没有人因为写注释和投票而获得报酬,对吧?
志愿者的动机与影响力
Keith Coleman: 对。完全基于内在动机,我们认为这是一个很好的参与理由。如果你和最活跃的贡献者交流,他们中的很多人希望世界上有更好的信息环境,这是一个很棒的动机。想想看,对这些人来说,他们能产生的影响力是惊人的。我们最初在美国全面推出的时候,大概是2022年,一条注释出现在了一条白宫推文上,白宫随后删除了那条推文并重新发布了一份修正后的声明。想象一下写那条注释的人。你可能只有12个粉丝,你的帖子可能只得到几个赞。而在这一刻,你在白宫的推文上加了一条注释,他们根据你的行为修改了公开声明。这是不可思议的影响力。所以你能理解,当人们关心世界上正在发生的事情时,他们为什么会有动力去做这件事。你不需要是一个大人物、知名人士,也能以有益的方式塑造公共讨论和信息传播。
Lenny Rachitsky: 太不可思议了。这个系统有太多值得称道的地方。一是整个机制的精英属性——任何真实、正确的人都可以参与并产生影响。另外,它也让我们看到了我们接收到的信息中有多少是错误的。我们以前根本不知道自己看到错误信息的频率有多高,现在我们知道了。
Keith Coleman: 做这个产品让我意识到,以前我默认信任的很多东西,现在我会用更审慎的眼光去看待。
Lenny Rachitsky: 如今确实如此。好的,在我们进入起源故事之前,你们还有什么觉得非常重要或非常有趣的事情想分享的吗?
Community Notes 对传播的巨大影响
Jay Baxter: 好的。还有一点,虽然我们实际上并没有把帖子被添加注释这一事实用于核心排序算法中——我们认为这是一个很好的特性——但确实存在一个非常大的有机影响,也就是说不是来自算法的影响,而是来自用户行为本身的影响。当帖子被添加注释后,人们的点赞、转发或引用转发的行为会大幅减少。对于在大型平台上运行 A/B 测试的人来说,你们可能已经知道这一点,但对于任何算法改动来说,1% 通常已经是非常好的效果了。我们在 A/B 测试中观察到,展示和不展示注释的情况下,点赞和转发的参与率下降了大约30%到40%,这个数字大得惊人。这只是参与率的 A/B 测试结果,还不包括网络效应。如果把帖子通过转发减少而减少传播的整体网络效应考虑进去,用双重差分方法来看,多个不同的外部研究团队都一致发现,总转发量下降了50%到60%,这在帖子被添加注释后简直是惊人的数字。它对信息传播确实产生了非常大的影响。
Lenny Rachitsky: 听到这个太好了。这正是我希望看到的,而且影响力令人难以置信。基本上,一张虚假的 AI 图片以前会在 Twitter 上疯传,在 Community Notes 出现之前确实如此。而现在你说的是,仅仅是添加了那个上下文——而且如你所说,算法并没有对其进行降权——如果内容有误,人们就会说:“好吧,这是假的,我为什么要转发这个?“这很合理。
Keith Coleman: 对。
Jay Baxter: 没错。
Keith Coleman: 是的,注释完全是给这些热门内容釜底抽薪。一个内容正在疯传,注释出现,转发量骤降50%到60%,就这样了。每经过一轮传播减少50%到60%,病毒的传播力很快就归零了。
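Keith's remark that a 50% to 60% cut per round of sharing quickly drives virality to zero can be illustrated with a simple compounding model. This is a toy sketch, not a model used by the team: `expected_reshares`, the seed of 100 shares, and the base rate of 1.2 reshares per share are hypothetical, and the 55% cut is just the midpoint of the quoted range.

```python
def expected_reshares(seed_shares, reshares_per_share, rounds):
    """Expected total shares over `rounds` generations of a simple cascade."""
    total = 0.0
    current = float(seed_shares)
    for _ in range(rounds):
        total += current
        current *= reshares_per_share  # each share triggers this many new shares on average
    return total


BASE_RATE = 1.2   # hypothetical: each share leads to 1.2 further shares without a note
NOTE_CUT = 0.55   # midpoint of the 50-60% reduction quoted in the interview

without_note = expected_reshares(100, BASE_RATE, rounds=10)
with_note = expected_reshares(100, BASE_RATE * (1 - NOTE_CUT), rounds=10)
print(round(without_note), round(with_note))
```

The point of the sketch is the threshold effect: once the per-share rate drops below 1, each generation is smaller than the last and the total stays bounded, whereas above 1 it keeps growing.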
Jay Baxter: 顺便说一下,我对接下来这个数据感受很矛盾。作者在收到注释后删除帖子的可能性增加了80%——好吧,这很好,因为错误信息少了,但我又觉得有点遗憾,因为那些通常是最好的注释。如果一条注释写得好到让作者别无选择只能删帖,那其他人是看不到这条注释的,对吧?因为——
Lenny Rachitsky: 这确实很矛盾。
Jay Baxter: 有一种观点认为,因为你可能在 X 之外的其他地方,或者 X 上的其他地方看到同样的误导性说法,所以实际上展示带注释的帖子可能比完全不看到这个帖子要好。
Lenny Rachitsky: 对。
Jay Baxter: 不过我对这个说法也不太确定。
Lenny Rachitsky: 这太有意思了。
Jay Baxter: 是的。
Lenny Rachitsky: 是啊,如果我就是那条 Community Notes 的撰写者,我肯定特别沮丧……天哪,写得那么好,结果帖子直接被删了。好吧。回到当下——如今这么一小段代码正在改变人们理解世界和形成信念的方式,甚至让白宫撤回了他们的公告——我们把镜头拉回到这个项目最初是怎么开始的。我听到的简短版本是:Keith,你当时厌倦了管理产品经理,想自己亲手做点东西,想做有意义的事情,想远离那些公司里的破事,于是你开始寻找真正有影响力、重要的事情来做,然后你找到了这个。跟我们聊聊这一切最初是怎么来的吧。
加入 Twitter 的契机
Keith Coleman: 好。对我来说,这个故事实际上要追溯到当初为什么加入当时还叫 Twitter 的这家公司,那是 2016 年。我当时有一家创业公司,收到了一些收购要约,其中一个就来自 Twitter。那是 2016 年,正值 Donald Trump 和 Hillary Clinton 之间的选举期间,电视上播了三场辩论,但每一天,Twitter 上都在上演辩论。非常明显,人们就是在讨论这些重要的事情,信息在这里被分享,观点在这里被形成。作为用户,我明显能从这里获取好的信息,但同样明显的是,也有存疑的信息在流传。我记得当时作为一个局外人看着这些,心想,“哇,这是一个非常难的问题,而且看起来也非常重要。“于是我们最终去了 Twitter,而当时的公司正处于转型期。
我的前三年就是帮助公司重新恢复增长,负责消费者产品的所有工作,让用户增长重回正轨,让人们重新愿意来这里工作,等等。但过了几年,我开始回顾我们做的事情。我觉得我们在恢复势头方面做了很多好的工作,美国和行业内的人也尝试了各种方法来应对误导性信息,但没有什么真正奏效的。很显然,没有一种方法奏效。没有一种方法能应对这个问题的规模,没有一种方法能应对这个问题的速度,而且很多人也不信任现有的方案。现有的方案要么是事实核查机构,要么是内部的信任与安全团队来判定什么是或不是误导性信息。很多人就是不想、也不信任由这种方式来做决定,这非常合理。
放下管理岗,寻找新方向
我当时看着这一切,还在管理一个很大的产品经理团队。这本身就是一个很长的故事。那个职位需要投入大量精力,但我常常觉得看不到我期望的产出。我看不到产品上我想要看到的变化。我在考虑,“我是不是应该去创办一家公司?是不是应该做点别的?“但我不断回到这个问题上。我在想,“这个世界到底要怎么应对社交媒体上的信息质量问题?“不管你在哪里获取信息。我就在这家公司,在这里你可以对这个问题产生影响,为什么不去试试一些疯狂的想法,看看其中有没有哪个能行呢?我当时刚有了一个孩子,休完陪产假回来,去找我的上司 Kayvon。我说,“嘿,Kayvon,我能不能不做现在的工作了,转去做这个?""这个”就是指尝试一些疯狂的想法,看看能不能解决误导性信息的问题。
他非常支持,于是我就开始着手了。一开始就是尽可能多地阅读关于这个问题和现有方案的研究资料。哪些有效,哪些无效,问题在哪里,然后进入原型开发阶段。最终,我们构建并试运行了这个后来成为 Community Notes 的想法。
2016 年 Twitter 的状况
Lenny Rachitsky: 太精彩了。我有很多问题想问,我们会继续聊这个故事,但当你加入的时候……当时它还叫 Twitter。我现在会尽量叫它 X,我知道这对你老板很重要。那个时候 Twitter 处于什么阶段?Kayvon 已经在了,当时的 CEO 是谁?因为中间换过好几位。
Keith Coleman: 好,是的。我是 2016 年 12 月加入的,当时 Jack 相对较近才重新回来担任 CEO 来扭转公司局面。给你一个公司当时状况的概念——大概每年有三分之一的员工离职。想象一下,你的团队每年有三分之一的人离开。股价跌到了谷底,产品基本没有增长。所以 Jack 正在做转型工作,Kayvon 已经在了。Kayvon 当时在负责 Periscope,做一些视频相关的东西,那个团队后来继续……Jack 一直在那里,直到 Community Notes 启动的时候,当时还叫 Birdwatch 项目,然后……对。
从 Birdwatch 到 Community Notes
Lenny Rachitsky: 好,它当时叫 Birdwatch。我觉得我们之前还没用过这个名字,但这是一个重要的点。它最初叫 Birdwatch。
Keith Coleman: 对。我们开始这个项目的时候,它最初叫 Birdwatch,但显然,后来名字改了,这件事某种程度上还挺有名的。
Lenny Rachitsky: 对,也许我们可以快速讲讲这个故事。我知道我们在时间线上往前跳了,但是……我看到了 Jack 和 Elon 之间的一条推文串,他们在讨论该叫什么名字,Elon 说 “Birdwatch 听起来有点阴森,我想改掉它”。关于这个你有什么可以分享的吗?
Keith Coleman: 好的,那个故事……那个故事挺有意思的。Elon 收购了公司,而我们刚刚在美国相对不久前才发布了这个产品。它之前已经试运行了一年,但我们刚刚把它推广到全美范围。他应该已经看到了那些注释。收购完成后不久,他给我发了私信,说,“嘿,这个 Community Notes 的东西太棒了。“我说,“很高兴你喜欢,我们聊聊吧。“于是我们第二天聊了,他一直管它叫 “这个 Community Notes 的东西”。我心想,“你一直这么叫它,这很有意思,因为那其实是我最初给它起的名字。“我做的第一个描绘这个东西的 Figma 模型就叫 “Community Notes”。不知道为什么,就是感觉很自然。所以那是我们测试的第一个原型的名字。
后来,项目的名字改成了 Birdwatch,但 Elon 说,“嘿,就叫那个名字吧。“第二天,我们就改了名字。对于一个团队来说,改名总是一件值得注意的事,但实际上,团队很兴奋。我觉得这是一个好理解得多的名字。Jack 还调侃过它,说它是 “最典型的 Facebook 名字” 之类的。
Jay Baxter: 最无聊的 Facebook 名字……
Keith Coleman: 无聊的名字,这很好笑,因为他们现在也在推出 Community Notes。我觉得这是一个非常容易理解、直觉化的名字,而且我认为它很好地服务了这个产品。它之所以成为第一个模型里的名字,是有原因的。
Lenny Rachitsky: 是的,我觉得描述性的名字就是更合理。关于你和 Elon 的这段联系——我之后想聊聊你是怎么在这么多强势人物之间周旋、在这么多变化中让这个项目存活下来的——但在这之前,你做了一件我觉得很多产品负责人、管理层的领导者、所有管过人的人都梦寐以求的事情:放弃了所有那些加了引号的”权力”,放弃了职业上升路径和影响力,然后说,“管他呢,我要回到一个小团队,就去做一个很棒的东西。“关于这段经历,你有没有什么建议可以分享,是你觉得其他领导者可能需要听到、可能帮助他们做出同样跳跃的?因为这件事实际操作起来真的很难。说起来容易,做起来难。
回归小团队的锯齿形路径
Keith Coleman: 是的,我觉得这确实是一个艰难的跳跃。我在职业生涯中做过好几次这样的事情,每次都非常开心——从一个小团队开始,它发展成更大的东西,然后我就想,“我们在处理很多大型产品的事情,团队真的很大了。我想回到小团队,去做一些疯狂的新东西。“这种锯齿形的跳跃我经历过很多次,但确实不容易,因为很自然地……经典的职业路径是,怎么说呢,追求回报、管理大型组织、做管理者之类的,但我觉得归根结底,你得做自己热爱的事情,得享受其中,而且我认为人们想要的是产生影响力。
我觉得有一个迷思会阻碍人们,就是认为你管理的人越多、范围越大,影响力就越大。我绝对不认为这是真的。你看 Community Notes 就是一个例子。如果我继续管理一个大型消费者产品经理团队,我会产出什么?十六页 OKR?还是一堆文档?我觉得打造 Community Notes 对世界的影响力要大得多。它已经成为行业处理此类问题的标准做法,这真的很酷。人们喜欢它,它是第一个有可能真正应对互联网规模信息质量问题的方案。我觉得这毫无疑问比我继续做以前那种标准管理路径上的事情影响力要大得多。我认为这对许多小公司和创业团队来说也是如此。前几天有人截图了,好像是 Blake Scholl 的 LinkedIn。他从优惠券总监之类的职位,转而去打造第一架超音速……
Lenny Rachitsky: 对,从 Groupon 出来的。
Keith Coleman: 这种故事到处都是,只要你去找。所以我确实发现,对我来说,我热爱亲自动手打造,热爱尝试疯狂的新想法。我热爱从零到一的体验。规模化运作也很有趣,大规模运营也可以很有意思,但这个团队就是一个很好的例子——它在大规模上运作,但团队本身仍然非常小。
Lenny Rachitsky: 对,我觉得你们团队的运作方式正是越来越多公司想要做到的——去掉中间管理层,创建小团队,直接执行、产出影响力,就像 IC 一样。每次我说 IC,YouTube 上都会有人评论问,“IC 是什么?“我就解释一下,IC 就是指个人贡献者(Individual Contributor),也就是非管理岗。说到这个话题,当我问大家你们是怎么组建团队、让团队高效运作并在初期保护好它的,有一个词频繁出现,就是”Thermal”。好像是叫 Thermal 团队,不知道是不是这么说的。
Keith Coleman: 对。
Thermal:隔离式创新团队机制
Lenny Rachitsky: 什么是 Thermal?
Keith Coleman: 在大公司工作过的人大概都知道,事情容易变得官僚化或陷入泥沼,决策可以很慢。有那种大型的规划周期,人们可能随意就把某人从一个团队调到另一个团队,打乱整个项目,诸如此类的事情。我们公司,这是好几年前我们开始这个项目的时候了,公司里有很多创始人。Kayvon 就是一位参与公司运营的创始人,他提出了一个想法,“嘿,我们为什么不做这样一个项目,叫做 Thermal,让团队在一定程度上与那些东西隔离。“它们可以按自己的流程运作,有一个明确的负责人。团队完全专注于那个项目,然后我们就反复做资金决策,决定是否继续推进。
Lenny Rachitsky: 对了,为什么叫 Thermal?这背后的想法是什么?
Keith Coleman: 我觉得是一个老的鸟类比喻,热气流托起鸟的翅膀。Twitter 1.0 显然有很多鸟类相关的比喻,愿它安好吧,这就是其中之一。我喜欢这个想法,作为一个喜欢创业环境的人,所以当我们启动这个项目的时候,我就说,“嘿,Kayvon,我们把这个做成第一个 Thermal 项目怎么样?“他说,“好,干吧。“于是我们就以那种方式开始运作,从第一天起就赋予了我们大量的自由和自主权,我觉得这对产品的成功真的非常重要。
Thermal 的关键设计要素
Lenny Rachitsky: 能具体说说吗?什么让一个项目成为 Thermal 项目?你是怎么搭建的?这是从一个公司的角度来问,如果一家公司想建一个类似的东西,具体是什么样的?
Keith Coleman: 我觉得有几个关键要素。一个关键要素是有唯一明确的项目推动者,实际上就是一个创始人角色。也许可以有两个人之类的,但必须非常清楚谁是项目推动者,同时团队外部也有一个唯一的明确决策者。
Lenny Rachitsky: 团队外部的?
Keith Coleman: 团队外部的。从一开始就是这样,现在也是。如果我们需要什么,或者对什么有疑问,我就去找 Elon。从一开始就是这样,现在也是,我觉得这就是我们能够高效、快速、简单地做出决策的一个重要原因。
Lenny Rachitsky: 这个人应该得是非常高层的人,不能是普通经理吧。
Keith Coleman: 一个能做出你需要的各种决策的高层人士。我觉得这一点非常重要,就是那种清晰的决策结构。另一个要素是百分之百的专注,所以项目上的每个人都被期望完全专注于它。在很多公司,人们的注意力很容易分散在很多事情上,这让推进事情变得困难。你去找某个人,请他在某个事情上帮忙,他会说,“好,我帮你。我得先完成这个东西,可能要一两周,然后我来处理。“一两周的延迟完全会改变一个项目的势头。当我们百分之百专注的时候,我们早上讨论,“嘿,Jay,我们试试在算法里做这个调整?“他说,“好。“然后当天下午或者第二天,我们就在看结果了。因为这种完全的专注,迭代速度大幅提升。除此之外,我们还可以使用自己的决策流程。我们不需要写 OKR 或者……其他标准做法。显然,我们得确保负责任地打造产品等等,但我们不需要使用那些标准做法。我觉得 OKR 是另一个很好的例子,我理解它们为什么有用,但它们也不一定就是设定目标的正确节奏。我认为季度或年度目标到底是不是正确的步调,这一点其实很不清楚。我们会为下一个重要里程碑设定目标,然后为此努力。达到那个里程碑后,我们对接下来要做什么会有想法,等我们完成了那个,再设定下一个里程碑。不管是两周、一个月、三个月,还是多长时间。我们按照自己的节奏,以那个节奏设定目标,我觉得这对开发一个东西来说要自然得多。
Jay Baxter: 整个 OKR 制定和规划流程花的时间,比我们选定一个目标然后执行完毕还要长。
Lenny Rachitsky: 你们早期搭建的团队有多大?多少个工程师?
早期团队构成
Keith Coleman: 一开始只有我一个人,后来当我们决定要把这个东西做出来的时候,我们觉得大概需要五个人。我们希望团队尽可能小。很明显我们需要一个做机器学习的人负责打分,需要一个做客户端工程的人,一个做后端工程的人。可能还有一两个其他角色。我们需要一个设计师和一个研究员来帮助我们理解用户群体,确保我们做出来的东西能真正引起人们的共鸣。我记得是后端、前端、机器学习、设计和研究。据我回忆,这就是最初的团队。
Lenny Rachitsky: 太棒了。基本上每个职能一个人。我有个问题想问 Jay,大家总说小团队、快速推进,但有时候你就是需要更多工程师才能把东西做出来。关于如何在保持如此快的速度的同时让团队保持精简,不需要招聘更多工程师,你有什么心得吗?
算法的演进:从 PageRank 到桥接算法
Jay Baxter: 我觉得在最初我们还在探索需求到底是什么的时候,只有一个机器学习工程师肯定是好的。但到了某个阶段,我们对算法的目标变得非常清晰了,我们尝试了……我觉得在最开始,我们并不清楚需要构建这种基于桥接的共识算法(bridging-based agreement algorithm)。我实际上线部署的第一个算法非常专注于反操控。它是一个 PageRank 的变体,但它基本上没有解决偏见问题。如果一方的用户更多,PageRank 这种图算法实际上会放大那些偏见。我觉得在构建了那个原型并从中获取数据之后,就很清楚基于桥接的共识算法才是我们需要解决问题的方式。于是在那个时候,基本上我搞了一个内部竞赛。类似于 Kaggle 竞赛那种形式。那是一个关键时期,引入其他工程师非常重要。
Lenny Rachitsky: 这个故事太酷了。我想顺着这条线往下聊。在这之前,你刚才提到你们会喊 “Thermal”。这是什么意思?是 YOLO 那种,类似于……好吧。
Keith Coleman: 就是直接上线,因为我们是 Thermal 项目。
Jay Baxter: 上线。
Lenny Rachitsky: 好,回到算法这个话题,这其实非常有趣,因为这些我从来没听说过。我本来就想问这个算法本身的灵感来源是什么,而你们基本上是在机器学习工程师之间搞了一个内部竞赛,看谁的算法最成功。Netflix 大赛那种风格,Kaggle 那种风格。
Jay Baxter: 对,对。寻找那些被立场对立、通常持不同意见的人共同认可的内容——这个想法并不是凭空而来的。我觉得 Keith 之前发现了一些 Chris Bale 的研究,他整理了一个账号列表,那些账号经常被政治立场两边的人同时点赞。还有其他一些项目,比如一些民调也在寻找通常持不同意见的人之间的共识。但我觉得从一开始就确定我们的项目一定需要用到这个思路,这并不是显而易见的。当你把它实现出来,和其他类型的算法做比较……PageRank 看起来显然是设计来抗操控的。如果你的投票环里一群人互相给自己投票,PageRank 能很好地过滤掉这种情况,但那并不是主要的攻击向量。
我们得从试点中获取一些真实数据才能意识到,“好吧,这里真正的问题是人们的极化。“所以我认为直到我们从试点中拿到了真实数据之后,才清楚地看到基于桥接的共识算法是我们真正需要走的方向。
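The contrast Jay draws between counting support and requiring agreement across people who usually disagree can be shown with a deliberately simplified example. The explicit group labels, the function names, and the 0.5 share cutoff below are assumptions made for illustration; the real system infers viewpoints from rating history rather than from labels.

```python
from collections import defaultdict


def majority_helpful(ratings):
    """Plain majority vote over (group, is_helpful) pairs."""
    helpful = sum(1 for _, ok in ratings if ok)
    return helpful > len(ratings) / 2


def bridged_helpful(ratings, min_share=0.5):
    """Require more than `min_share` helpful ratings within every group."""
    per_group = defaultdict(list)
    for group, ok in ratings:
        per_group[group].append(ok)
    if len(per_group) < 2:
        return False  # no cross-group signal to bridge
    return all(sum(votes) / len(votes) > min_share for votes in per_group.values())


# A note liked only by the larger group, and one liked across both groups.
one_sided = [("A", True)] * 7 + [("B", False)] * 3
cross_group = [("A", True)] * 4 + [("A", False)] + [("B", True)] * 3 + [("B", False)]

print(majority_helpful(one_sided), bridged_helpful(one_sided))
print(majority_helpful(cross_group), bridged_helpful(cross_group))
```

The one-sided note wins a majority vote simply because its group is larger, which is the bias amplification Jay attributes to the early prototype, but it fails the bridging check.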
团队运作方式
Lenny Rachitsky: 我想回到你们团队运作方式的话题。我听说你们整个团队就靠一个 Google 文档来运转,一个用了四年的文档,不断往里面加目标和要点。这是真的吗?
Keith Coleman: 确实有一个运行了很长时间的文档,而且不得不定期裁剪和清理,因为它有时候会导致 Chrome 里的 Google Docs 崩掉。它就像一个笔记文档,真的是我们协调工作的地方。团队每天开会,花我们需要的任何时间来对齐我们正在构建什么。我们可能讨论从”现在最重要的是什么”到”我们接下来应该做什么?“到”我们现在正试图上线什么,为什么还没上线?是什么阻碍了上线?“我们可能会审查新的建模或打分算法更新,试着理解哪些地方有效,哪些不行。我们就讨论任何想讨论的或者觉得最重要的事情。正如你所说,我们非常动态地设定目标,所以对我们来说现在和接下来最重要的事情就是我们要花时间做的。我觉得这对项目来说效果非常好,而不是执着于某些季度目标之类的东西。我们会看”什么最能帮助用户?“或者”现在最大的问题是什么?“两者中任何一个是什么?然后我们就去解决它。我们可能根据我们看到的情况,在两周内多次改变路线图。
Lenny Rachitsky: 我听到的是没有 Jira,没有 Asana,没有 Monday.com。
Keith Coleman: 没有。
Lenny Rachitsky: 好的。
Keith Coleman: 是的,我的意思是,我们必须用 Jira 来和其他一些团队协调。有时候我们提交请求的时候,必须创建一个 Jira 工单。但确实,我不喜欢重量级的任务管理。我喜欢大家保持信息同步,能把大部分事情记在脑子里,然后用一种非常轻量的方式把团队记不住的东西写下来。
Jay Baxter: 我们确实短暂用过 Asana,但我的记忆是,你花在会议上整理一堆无关紧要的待办事项上的时间,比真正讨论正确的优先级还要多。我觉得 Google 文档的好处是,如果某件事变得不相关了,它可以自然地消失,而不需要显式地清理待办列表。
团队运作方式总结
Lenny Rachitsky: 也许可以总结一下你们的运作方式,可能会启发其他公司也这样组建团队。我来梳理一下你们分享的几个要点。首先,团队有一个负责人,几乎就像创始人一样,基本上就是团队的创始人。他们有一个非常资深的管理者/决策者作为对接人,在你们这里就是 Elon,不是什么小事。在其他情况下,可能是 CTO、CPO 之类的角色。团队百分之百专注于自己的产品和目标。保持团队非常小,每个职能一开始只设一个人——一个前端工程师、后端工程师、ML 工程师、设计师、研究员、PM,然后项目管理基本上就用 Google 文档。没错,基本上就是用 Google 文档跑起来的,别用那些又大又复杂的产品。
Keith Coleman: 我觉得这是一个相当好的配方。Google 文档那部分,大家想用什么都可以,如果有人喜欢用便利贴,尽管用。但我认为前面提到的那些要素,在结构上才是真正关键的。除此之外,还需要有一个让人热血沸腾的远大目标,激发大家去做伟大的工作。
Lenny Rachitsky: 是的,太棒了。我觉得这里面很多东西,很多人在组建这类团队时觉得自己应该做,但实际上并没有做到,而每一个要素似乎都是真正成功的关键。
Keith Coleman: 这些确实极大地帮助了我们成功。如果不是因为其中一些要素,我不确定这个项目能不能走到今天。
Lenny Rachitsky: 这句话分量很重。这个改变了世界认知何为真实的东西,如果不是你们以这种特定方式组建起来,就不会存在。
团队架构的关键性
Keith Coleman: 是的,如果不是事先知道我们有那样的结构——那种决策能力、自主权、速度、快速推进的能力——我可能根本不会启动这个项目。我们从 1.0 版本开始就有这样的架构,一直延续下来,到了 X 甚至进一步强化了。X 作为一家公司整体上就以很多这样的特质在运作,我认为这是产品成功的原因之一。我想至少对我来说——Jay 可以代表自己说话——这些正是我在这个项目上工作得如此开心的原因。我热爱这个工作。每天醒来去解决这些问题,感觉很棒。我们能高效地推进,快速做出决策,构建出帮助很多人的东西。太棒了。
Jay Baxter: 无论是 Thermal 还是 Elon 式的运营方式,确实更有趣。而且……这一点加上令人振奋的使命,对内部招聘来说极其重要。我记得 2020 年初刚开始和 Keith 聊这个项目的时候,我手上还有别的项目。我做了几个,其中一个大概是优化推送通知的发送数量,它在没有显著增加用户关闭通知的情况下大幅提升了 DAU。所以那条路是稳妥的,如果我继续做下去,大概率可以低风险地拿到晋升。但我也可以选择承担这个巨大的职业……虽然不像加入或创办一个真正的外部创业公司那样风险巨大,但加入这样一个团队确实还是有一定的职业风险。我觉得外部创业公司招聘的所有要素,同样适用于内部。如果你能有一个令人兴奋的愿景,那就是关键。
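Self-Selection
Keith Coleman: Related to that, there's one really important thing missing from the list you just ran through, Lenny: on this project, and I think on similarly successful startup-like projects, people opt in. We didn't assign anyone to this project. People reached out to join, or they applied for the role. The team and I interviewed every single person who joined, and we'd say, "We want this person on the team, and they want to be on it." So everyone is fully bought in on the goal, the mission, the way the team works, and the people they'll be working with. That makes a huge difference.
So the best time to do this is at the beginning. If you're going to do something crazy, it's going to be tough if you just randomly assign people to it. But if you let people opt in, it's far more likely to succeed. And one thing I've observed at X that really surprised me is that this can work at a very large scale too. One of the things Elon did when he acquired the company was basically let people opt in to staying. You had to click a button. He sent an email that said something like, "Hey, Twitter 2.0."
Lenny Rachitsky: "A fork in the road," right?
Keith Coleman: Fork in the road, exactly. He said, "Twitter 2.0, now X, is going to be very hardcore. We're going to do ambitious things and you're going to work extremely hard." And you had to click on a form and say, "Yes, I'm in." I think that was really important for the company, because you want people to opt in. You want the people who say, "Yes, this is what I want to do," and the company will be far more successful for it. If someone isn't sure, they might be better off doing something else, somewhere they're naturally more aligned and happier. I think it's a great way to slim a big company down to the people who are really excited to work on the mission together. For us, we did it from day one, which I think is the easier way to do it, but it can be done later too.
Lenny Rachitsky: I love that you used the word "fun" to describe it. I think a lot of people, when they see Elon doing massive layoffs and being extremely hardcore himself, don't imagine that as a fun place to work. But it's obvious how much you love doing this, how fun and interesting it is. That's interesting to hear, because I don't think a lot of people on the outside feel that. Is there anything else along the lines of working for Elon, within an org Elon runs, that might surprise people? Anything fun or surprising about the way of working, or something you think other companies might want to borrow?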
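Leaner Teams
Keith Coleman: I've always preferred lean teams, but my experience at X has changed how I think about running organizations in the future. If I were to start a company, I'd have to change how I think about starting it; I'd make it even leaner than I would have before. I've been amazed by how much the team is able to accomplish with a small group. And I think it's because of the small group... Shortly after the acquisition, we had a product called Spaces. It already existed in the product, but at small scale, and Elon wanted to run massive Spaces. I forget who the first guests were going to be, but he was going to be there himself. Later these events grew to host political figures and so on. He said, "Guys, we've got to scale this." I forget the exact numbers.
Keith Coleman: He said, "We need to support a million people concurrently," something like that. I don't remember the exact number. "You need to massively increase capacity." That kind of thing, in the 1.0 era, would have taken a year if it could be done at all, and the team got it done in two or three weeks. It was really exciting and inspiring to watch. I wasn't on that project, I was watching from the outside, and I thought, "Wow, this is just a small team, driven by a big goal, where the goal isn't 'Hey everyone, should we do this?' but 'We are going to get this done.'" And they did it in two or three weeks. That must have felt amazing for them. It was definitely exciting for me. I really started to appreciate just how lean a team can be and not only survive, but thrive because it's lean.
Lenny Rachitsky: The opt-in point you made is important, because I think a lot of people hear this and think, "I don't want to be asked to build something like that in two weeks." But I also think a lot of people would. We love that kind of experience, especially working with Elon, especially shipping something at that scale. But I think a key piece is being able to say, "Okay, I don't want to do this. I have other things going on besides launching Spaces." So I think the opt-in you mentioned is a key point.
Keith Coleman: Totally agree. I think opting in is important, and you might be willing to opt in at one stage of your life while at another stage a different choice might be better. Whatever you choose to do, I think it's great to be able to opt in and feel that it's aligned with how you want to spend your time.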
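Why Things Didn't Collapse After 80% of Staff Were Cut
Lenny Rachitsky: There's a question I've been thinking about. I don't know if you want to talk about it, but I think a lot of people wonder: Elon came in and cut 80% of the people. Everyone said, "Twitter is done. Everything's going to collapse. There's no way it can run with that few people." But clearly they were wrong. Clearly, everything is running well now. It's becoming more and more important in the world, and it keeps growing. Is there anything that surprised you about that? Or any thoughts on why it still works so well after such a big change?
Keith Coleman: I think leaner teams, with less process and bureaucracy, are a big reason it's able to move so fast. It's easier to get things done faster here. Yeah. I think that reduction is actually a big reason the pace of shipping and the pace of experimentation have gone up. Another thing I've noticed is that the people who are here now all really seem to have an ownership mindset. They take responsibility like owners of the product. They'll track down where a problem is, fix what needs fixing, and jump in to help build, fix, or improve whatever system needs help, even if it's not their area. There's another side to this. Anyone who has worked at a big company has probably had this experience: you want to change something in another system or product, so you reach out to that team. They might be a little resistant, they might say, "Oh, let's talk about it next quarter..."
Lenny Rachitsky: They have their own goals to hit. Yeah.
Keith Coleman: Yeah, exactly. They don't necessarily want to help you, or they're busy. But here, you say, "Hey folks, we need to do something on that system you own." And they say, "Great! Here's the code, here's the doc, message us on fab if you have questions, and let's get it merged." That's how it goes, and you can just jump in and get it done. That kind of collaboration, that shared sense of ownership, in my experience comes precisely from the team being reduced to the people who want to be here and want to build this product together. So I think that's been a really positive effect. It isn't always easy. A lot of people carry a lot of responsibility, of course, but they're here because they want to take it on.
Jay Baxter: Yeah. I think another key point is that when you're forced to have such a small team, and really this matters regardless of team size, deleting code is often more important than writing code. I think a lot of the time, maybe because of promotion incentives or just human nature, engineers tend to keep adding improvements that look small but do deliver incremental gains, and those improvements carry a much larger long-term maintenance cost than it appears. You ran an A/B test for only a month, saw a significant win, and didn't realize you'd just added a permanent maintenance burden for your team that lasts until you turn that feature off. So I think there's a lot to gain here, and when you have such a small team, you're forced to do it: audit the parts of your system and delete the things whose maintenance cost outweighs the benefit. So I think after the big layoffs we really did have to do this across the company, and now the systems are leaner and can be maintained by fewer people.
Lenny Rachitsky: That's such a good point. I remember Elon saying at the time, "We have to throw the whole thing out. We have to re-architect everything. The way this is built is so dumb." It sounds like that really worked.
Jay Baxter: Yeah, so...
Lenny Rachitsky: Mm-hmm.
Jay Baxter: You don't need to rewrite everything from scratch. Some things are definitely worth rewriting. But just deleting the unnecessary cruft and keeping the rest of the core system is already great.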
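Birdwatch's Low-Key Launch Strategy
Lenny Rachitsky: I love that we're distilling a formula for running these kinds of companies and teams. There's so much here. I want to go back to how the original product was built. We just went on a long tangent, a great tangent, but I heard a story that when you launched Birdwatch, you really wanted to set expectations low, and you put a GIF in there so it obviously looked like a product that wasn't ready. Talk about how you did that, how you launched in a way that made people think, "This thing is never going to work."
Keith Coleman: We were very disciplined, I'd say, about making the product prove itself at every stage. When we built the first mocks, they were just pictures of what Community Notes might look like. We showed them to people across the political spectrum. And we found, hey, people really like these. Whether they were on the right or the left, they seemed happy to read these Community Notes, even when the notes were critical of their own side. So we thought, "Okay. This gives us confidence that if we can build this, if we can make it real, it will work." The next question was, can we make it real? Can people in the real world write notes of this quality?
So we first built an internal pilot version where you could write notes. We started by running a participant test, basically through something like Amazon MTurk, to see whether regular people, if you put them in, could write these notes. The notes weren't all good, but it was clear some people really could write good ones. So we thought, "Okay, this is feasible. What happens if we actually do this in the real world? Let's run a pilot and see." So we took the pilot we'd tested on MTurk and opened it up to 1,000 people, fully in public, with no idea what would show up. You can imagine the notes could have been terrible.
So we talked about it: "What do we do? We put this out there and everyone's going to have a bunch of questions. They may be very skeptical, and we know it could be a complete disaster. So how do we set expectations appropriately?" We felt we'd be able to get it right eventually, but we really didn't know how it would start. We wanted to set expectations, so we thought, "What if we just put a..." You know that page where you see a post with the notes underneath? We said, "What if we just put a dumpster-fire GIF on that page?" You'd click in and it would say, "Hey, anything you see below could be a complete disaster." At least it would show we were aware of the risk. In the end we didn't do it. I thought it was hilarious, but we felt...
Lenny Rachitsky: Oh, you didn't actually ship the GIF. Okay. It was just a concept. Okay.
Keith Coleman: We made the mock, and I laughed every time I looked at it, but in the end there was so much to explain on that page, like, what is this thing and how does it work? Ultimately we decided, "Okay, this GIF might be distracting." So we pulled it. Sometimes I think it would have been nice for it to see the light of day once, but in the end we kept it simple and kept that page focused on explaining what was going on. But as has happened so many times on this project, once we put the pilot out, the quality of the notes was good.
Not all of them. It was mixed, but there was gold in there. From the very start, with just 1,000 contributors, it was clear people could write informative, neutral notes on controversial, difficult topics, and that if we could just sift those good notes out from the rest, this thing could work. It would work as well as the mocks we'd made at the beginning. So that became the focus: how do you sift the gold from the rest?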
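Elon's Early Attention
Lenny Rachitsky: You may have shared this with me: someone noticed you testing this, screenshotted it and posted it on Twitter, and I remember Elon replied, "This is cool."
Keith Coleman: Yeah. Yeah. That was at a very early stage, when it was still just a Figma prototype and we were running controlled user research on usertesting.com. One of the participants apparently sent it to an NBC reporter, who wrote a series of stories. Anyway, the platform was buzzing about it that day, and then Elon... rewind to that time, this was around 2020, two years before the acquisition or anything like that. Elon was just a Twitter user, building rockets, building electric cars, doing all kinds of cool stuff, and he happened to see this depiction of the prototype we were testing. He replied, "Definitely worth trying, IMO." I remember thinking that was pretty cool, and what's interesting is that he has clearly held a very consistent view all along. I think the idea itself is appealing, and he has clearly been a big fan of it as a product and a steadfast supporter the whole time. So yeah, that support really started very early, long before he was involved with the company.
Lenny Rachitsky: I love that moment. Having Elon comment on a Figma prototype test must have felt pretty incredible.
Keith Coleman: It was cool. It was cool.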
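Core Principles of Community Notes
Lenny Rachitsky: Oh man. When we were preparing for this conversation, I asked what you most want people to understand about why Community Notes works so well. Keith, you specifically pointed to the principles behind your approach, the ones you set at the start and have stuck to ever since. We'll also get to how you preserved it through so many CEO changes. But first, talk about these principles: what are they, and why are they so critical to its success?
Keith Coleman: There are several principles. When we first shared them with people inside the company, they probably seemed a little crazy. But I think they're exactly why this product works, I think they're really important, and we still come back to them constantly today. Probably the craziest one is that this thing is going to be the voice of the people. It represents the voice of the people, not the voice of the company. So it isn't some tech company deciding what gets shown, it's the people, and that has a lot of implications for the design. For one, we have no button to change a note's status. If a note is showing because people voted that it was helpful, it shows. We can't change that.
And that's exactly what made people uneasy when we first proposed it. They'd say, "Wait, so something can go live and the company can't take it down, or can't change its status so it stops showing?" And we'd say, "Right, and it has to work that way. If the quality isn't good enough to support that, then it isn't a success." One of our core principles is that if a note is bad enough that you'd want to intervene, the problem is with the system. We need to redesign the system so that it shows good notes. So yeah, we had to get everyone comfortable with the idea that there's no button to change a note's status. Similarly, as we discussed earlier, we want this product to represent all of humanity.
So we don't want to be the arbiter deciding who gets in as a contributor and who doesn't. We're open to everyone. You just have to meet a very basic, objective bar. You need a verified phone number, which helps reduce the likelihood of bots and things like that participating. But beyond that, it's random selection, and it still is today. Again, that took time to get people comfortable with. But I think the fact that it represents the voice of the people, and reflects their output through an open, transparent process, is critical to why it's good, why it works, and why it's trusted. That's the first principle, and I think it'll always be core to the product. The other principle people found crazy is transparency.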
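The Transparency Principle
Previous approaches to misleading information felt like a black box to a lot of people: a tech company or a media company or some authority making the call. Our thinking was, "People need to be comfortable with this. They need to trust it. So the whole process has to be open and transparent." The code that decides which notes show has to be public. All the data and ratings that drive it have to be public. People should be able to take the code and the data, reproduce the whole service, and verify that we're really doing what we say we do. They should be able to audit it. They should be able to look at it and say, "Hey, I think this part could be better."
Or if they think we're biased, they should be able to use the data to point that out. And if people make good observations, those should go into improving the code. Again, that's a hard thing to get comfortable with: everything is out there and you can't hide anything. But I think it's essential to people trusting it. Yeah, we set these principles on day one. We keep coming back to them, because the product keeps evolving and we always have to make sure every new change is open. Whenever we update the scoring system, there's an update on GitHub; the data is published every day and you can download it. So yeah, I think these principles are fundamental to this thing's success.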
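The Cost of Open Source
Jay Baxter: By the way, these don't come for free. It's actually very hard to open source the algorithm that really runs, on the real data, in a form an end user can use. The way a large-scale service like this is usually architected doesn't naturally lend itself to someone downloading a TSV file and running it as a script. So we actually had to make some unusual architecture decisions to make that possible. If we hadn't built from scratch with that assumption from the beginning, we might have had to rewrite the whole system to get there.
Lenny Rachitsky: Can you give an example?
Jay Baxter: For example, we train a matrix factorization model. Normally you'd train a matrix factorization model and then deploy it behind a separate service. But we didn't want people on the outside to have to stand up a service just to reproduce our system. Basically, I think it wouldn't be very cool if we open sourced the code in a way that people couldn't actually run, if it wasn't something you could download and just run. Today you can download the Python code and run a script. You do need a lot of memory, but it can be done on a single machine.
Lenny Rachitsky: Okay, how much memory?
Jay Baxter: Oh, only about 500 GB or so.
Lenny Rachitsky: Okay, okay, that's pretty reasonable.
Jay Baxter: If you don't do any speedups, it takes about a day to run. Just so you know.
Lenny Rachitsky: Very cool.
Jay Baxter: The point is that it can be done, and people have actually done it. Vitalik Buterin wrote a blog post about his exploration, confirming that the algorithm really does what it claims to do. I think just the fact that a handful of people have done this means enough people have verified it that there's bound to be someone among them you'd trust.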
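As a rough illustration of the kind of model Jay describes, here is a minimal matrix factorization sketch in Python. It is a toy built on stated assumptions, not the published Community Notes scorer: the function name `fit_bridging_model`, the toy ratings, the hyperparameters, and the use of plain gradient descent with NumPy are all illustrative choices. Each rating is approximated as a global offset, plus a rater intercept, plus a note intercept, plus the product of a rater factor and a note factor. The factor term is free to absorb agreement that follows viewpoint, so a note's intercept ends up high mainly when raters on both sides found it helpful.

```python
import numpy as np


def fit_bridging_model(ratings, n_raters, n_notes, epochs=3000, lr=0.2,
                       reg_intercept=0.05, reg_factor=0.01, seed=0):
    """Fit rating ~ mu + rater_b[r] + note_b[n] + rater_f[r] * note_f[n].

    `ratings` is a list of (rater_index, note_index, value) tuples.
    All hyperparameter values are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    raters = np.array([r for r, _, _ in ratings])
    notes = np.array([n for _, n, _ in ratings])
    values = np.array([v for _, _, v in ratings], dtype=float)
    count = len(values)

    mu = 0.0
    rater_b = np.zeros(n_raters)
    note_b = np.zeros(n_notes)
    # Small random factors so the viewpoint dimension can break symmetry.
    rater_f = rng.normal(0.0, 0.1, n_raters)
    note_f = rng.normal(0.0, 0.1, n_notes)

    for _ in range(epochs):
        pred = mu + rater_b[raters] + note_b[notes] + rater_f[raters] * note_f[notes]
        err = pred - values
        # Gradients of mean squared error / 2 plus L2 regularization.
        grad_mu = err.sum() / count
        grad_rater_b = (np.bincount(raters, weights=err, minlength=n_raters) / count
                        + reg_intercept * rater_b)
        grad_note_b = (np.bincount(notes, weights=err, minlength=n_notes) / count
                       + reg_intercept * note_b)
        grad_rater_f = (np.bincount(raters, weights=err * note_f[notes],
                                    minlength=n_raters) / count
                        + reg_factor * rater_f)
        grad_note_f = (np.bincount(notes, weights=err * rater_f[raters],
                                   minlength=n_notes) / count
                       + reg_factor * note_f)
        mu -= lr * grad_mu
        rater_b -= lr * grad_rater_b
        note_b -= lr * grad_note_b
        rater_f -= lr * grad_rater_f
        note_f -= lr * grad_note_f

    return note_b, note_f


# Toy data: raters 0-2 form one group, raters 3-5 another.
# Note 0 is rated helpful by everyone; notes 1 and 2 by one group each.
ratings = []
for rater in range(6):
    in_first_group = rater < 3
    ratings.append((rater, 0, 1.0))
    ratings.append((rater, 1, 1.0 if in_first_group else 0.0))
    ratings.append((rater, 2, 0.0 if in_first_group else 1.0))

note_intercepts, note_factors = fit_bridging_model(ratings, n_raters=6, n_notes=3)
best_note = int(np.argmax(note_intercepts))
print("note intercepts:", np.round(note_intercepts, 3))
print("note with broadest support:", best_note)
```

In this toy data, note 0 is the only one both groups rate as helpful, so it ends up with the largest intercept. The sketch does not reproduce the thresholds, data formats, or other details of the production scorer.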
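The Power of Holding to Principles
Lenny Rachitsky: And this is now spreading to Meta. No big deal. Just hearing you describe these principles, I can imagine a product manager at a company saying, "Okay everyone, I'm going to do this project." There's so much idealism in it that rarely works in reality: it'll be open source, it'll be for everyone, we have no real control over what it does, don't worry, it'll just change how people see things, and we've been really careful. And then it actually worked. I think that's very rare and really impressive. Part of what I'm hearing is that sticking to these principles was actually fundamental to its success, rather than compromising whenever someone said, "No, no, no, we can't do it that way, what if we changed this part?"
Keith Coleman: I think if we had violated any one of these principles, if there were any black box, anything like that, the product would be much harder to trust. So I think it's precisely because we've held to these principles so cleanly that people can trust it.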
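Key Moments: When the System Really Worked
Lenny Rachitsky: You've mentioned a few amazing moments, like the White House changing its announcement because of a Community Note, and the example of the dog that was a cat. Were there other moments after launch where you thought, "Oh man, this actually works? This is really going to work"?
Keith Coleman: We've been seeing it work all along. Whenever we expand it to a new set of users, a new country, or whatever, we need to be confident it will work. So we might hold our breath a little to see whether it works the way we expect, but we always expect it to. That said, there have been some extremely high-pressure cases. The one that stands out most is when the Israel-Hamas conflict broke out in October 2023. That was probably the largest flood of misleading information I've ever seen spreading on the internet at one time. It was overwhelming. Huge numbers of photos, videos, and related content poured out; it was crazy. For example, within roughly the first three days of the conflict we had 500 notes, covering all kinds of... out-of-context images. Someone would say, "Hey, this is happening here," and it was actually footage from Syria in 2013. Others used the video game simulator Arma 3 to create fake combat footage. So there were notes explaining those, and that stuff looks realistic; unless you see the note, you really can't tell. There were all kinds of claims about what was actually happening on the ground. That one really was... the product was still pretty new. We had rolled out in the US less than a year earlier, we'd spent that year rolling out globally, and then this huge event happened. I feel like we got just enough ready, at just the right time, for the system to handle it.
Probably the most important thing we did before that was launching the ability to write notes on images and videos and match them to other posts. I remember thinking, "Thank goodness we launched this a few months ago instead of it still sitting on the shelf," because it was really important in that conflict. I remember that just a few weeks before, we'd also shipped a major speed-up. When we first built the product, the top priority was always quality. We knew the product would live or die on the quality of the notes, and that was something we couldn't give up. We also knew it needed to hit the bar on speed and scale, but we thought, "Let's get quality right first, then gradually speed up and scale." We had just launched a speed-up that cut three hours off the time it takes a note to go live, only a few weeks before the conflict, and again we were glad that update was already out. In the first days of the conflict, the median time from a post being published to a note showing was five hours, which is crazy fast. A typical fact check usually takes two to four days, or at least two to four days is common. These notes were showing up within five hours, and we thought, we're so glad those improvements shipped before this, because they made the service far more useful.
Jay Baxter: There's one more thing I think we saw work well then. One criticism some people have of Community Notes is that if you always need agreement among people who usually disagree, then in these highly polarized situations, and that conflict is probably the number one example, you won't see any notes show up. But in fact a large number of notes about that conflict did show up. So I think it's a nice property. Maybe it's a surprising fact that there is more agreement across polarized divides than conventional wisdom suggests, and the places where people agree really are objectively true and verifiable. Maybe that's even more the case the more polarized the situation is, but what agreement really gives you is notes that are written very neutrally, focused on the facts, and easy to verify.
Lenny Rachitsky: There was this narrative going around for a while: there are no facts anymore, nobody believes in a single truth, everything is subjective. I think Community Notes proves the opposite. Facts matter, and there are facts we can agree on, even on the most contentious topics.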
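Anonymous Contributors and Trust
Keith Coleman: Yeah, we saw that from day one. When we showed people the mocks that merely depicted the idea, it was clear that people care a lot about understanding reality and knowing what's going on, and that they're willing to, so to speak, go against their own side to acknowledge it. I don't think that's always obvious to a lot of people. The world does feel polarized, but people are absolutely willing to cross party lines to get accurate information, and that's why this product works.
Lenny Rachitsky: As we rely more and more on information we pick up quickly from social media to understand the world, I'm really glad this exists, because otherwise we'd have no idea what to believe. It showed up right when we urgently needed a mechanism like this. But at the same time, there's also a lot I just don't trust. I think people have gone from "I believe what I read" to "Okay, I shouldn't believe everything I read." Have you noticed anything about how people look at the news they see, and this shift to "I don't believe everything anymore"? Have you observed anything about human behavior, or about changes in how we come to understand what's true?
Keith Coleman: We haven't done any research looking broadly at how people's perceptions have changed in that way. But I've certainly experienced it myself: especially after seeing all kinds of notes, I've become more careful about what I first read, and I think that's actually healthy. We hear similar feedback from users, that they stop and think more, and I think that's a great side effect and benefit of a system like this. The more you see the patterns by which what you read can go wrong, the better you can thoughtfully question it and try to understand what's really going on. Historically I think this has been called media literacy, but the basic idea is: can you understand the ways things can go wrong and try to sort it out for yourself?
Jay Baxter: I think we've also helped in another way, which is discovery. Before Community Notes, I think you could live in a little news filter bubble, or maybe there were fact checks out there you should have seen but never found. The fact that the note is attached directly to the post, where anyone who sees the post can see it, helps cut through those filter bubbles, and in a way... I think for some people it's the first time they've really seen counterarguments to claims made inside their own echo chamber.
Lenny Rachitsky: That's remarkable. I love your point that it actually teaches people to question what they read a bit more. It's an educational system, not just something that says, "Hey, this is wrong." I love that.
Okay, a few more questions. Earlier we asked people on Twitter what they wanted to know about Community Notes, and one was: why did you move to anonymous contributors, and what was the thinking behind that decision?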
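From Real Names to Anonymity
Keith Coleman: Yeah, we ran a small pilot with a few thousand contributors, and we learned a lot from it. Probably the most important thing we learned had to do with contributor anonymity, or pseudonymity. We originally assumed it was important for people to contribute under their real account, real name, and so on. That's how the original prototype was designed, and we thought it would be important for people to trust the notes, but it turned out to be completely wrong. The best approach was the exact opposite of what we first tried.
We found a few problems. First, people were reluctant to write notes on controversial topics because they didn't want to be attacked or harassed online. Some people were fine with it, but others weren't, which meant far fewer good notes were being written than could have been, and that was very clear feedback from the pilot.
Second, and this one is super interesting, people are actually more willing to cross party lines when they're anonymous or pseudonymous than when they use their real name, and it makes intuitive sense. If you're publicly using your name and you feel you belong to one side, you may hesitate to be seen as breaking from that side. But in reality you might, for example, find a note that criticizes your own side helpful. There's a lot of research showing that when people are anonymous, they're more willing to cross party lines, cooperate with the other side, and agree with the other side. We observed the same thing. So by letting people use pseudonyms, you actually get more honest answers about what they really think, which helps surface the real disagreements...
Lenny Rachitsky: That's so counterintuitive.
Keith Coleman: Yeah.
Lenny Rachitsky: You always hear the opposite, and it turns out to be exactly the reverse. That's so interesting.
Keith Coleman: Yeah, yeah.
Jay Baxter: I think the same principle applies to making likes private.
Lenny Rachitsky: I was just thinking that.
Jay Baxter: Yeah.
Lenny Rachitsky: Yeah, I now like more things that are a little, you know, things I definitely wouldn't have liked before.
Keith Coleman: It gives people the freedom to be honest, which is pretty great. One criticism of pseudonymity is that it could lead to... maybe people don't hold themselves to the quality bar they would when posting publicly. But we have so many quality safeguards in the system that it isn't a problem, and we can open the door to that honesty while keeping quality high.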
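Overturning the Existing Trust and Safety Approach
Lenny Rachitsky: Another question, and you touched on it a bit earlier: how you navigated Twitter's existing trust and safety setup. As you described, it used to basically be "we decide what's true and what isn't," and every company operated that way. You basically flipped that and said, "Here's a completely different way, and you can't control what we say is or isn't true." Talk about that experience, getting over that hurdle, the hard challenge of "Okay, forget all that, we're doing it a totally different way."
Keith Coleman: Yeah, what we proposed really was very different. I'd say people were generally open to it. I think there was a shared sense that the approach at the time wasn't very effective and wasn't really solving the problem, and people were open to new ideas, so that was a good foundation.
But I think one thing we did that was probably really helpful was that we wanted the product to prove itself at every stage. First it had to prove that people could find notes helpful, then it had to prove that people could write high-quality notes. So every time we proposed doing something with the product, like running a research test, launching the pilot, or expanding the pilot, we always had data in hand showing that it was a good decision and that the next expansion we were pushing was reasonable. So I'd guess we rarely proposed anything that seemed unwise, because we held ourselves to such a high quality bar, and I suspect that helped a lot.
Lenny Rachitsky: So part of it, what I'm hearing, is going step by step and proving that it works; and the other part is convincing yourselves that it works before trying to convince the trust and safety team that this is the right direction.
Keith Coleman: Exactly.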
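From "Impossible" to "Let's Seriously Consider It"
Lenny Rachitsky: Was there a moment along the way when people's reaction shifted from "This could never work" to "Okay, wow, let's take this seriously"? Or was it a very gradual process?
Keith Coleman: You mean other people going from "impossible" to "wow, let's seriously..."
Lenny Rachitsky: Yeah, inside the company, going from "we're going to keep doing things the way the trust and safety team does" to "we're going to rely on Community Notes instead." Was there a moment of "Okay, we're really making this switch"? Or was that really something Elon brought, and that was the biggest turning point?
Keith Coleman: The biggest change came in the X era. Before that, the biggest decision was just to launch this thing and have it run in public at full US scale. But yeah, the bigger shift happened in the X era.
Jay Baxter: I think even before Birdwatch launched, before Community Notes launched, there was already original research from outside researchers showing that crowdsourced fact checkers, regular people, can do about as well as professional fact checkers, and that the agreement rate between the two groups is actually pretty close too. But I think even with that research out there, a lot of people only really believed it could work once it was already working.
Lenny Rachitsky: Basically proving it with facts, proving that it works. Yeah, that makes sense: not a pile of docs and strategy and thinking, just "Look, it's actually working, you can see for yourself."
Jay Baxter: Yeah.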
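How the Project Survived Multiple Leadership Changes
Lenny Rachitsky: Makes sense. Okay, maybe a last question, depending on what threads it opens up. I've mentioned a few times what an incredible achievement this is: the project survived under Jack, then I have here that Kayvon took over managing it, then Parag ran Twitter, then Elon, then Linda took over as CEO. That's pretty rare, especially for a project this high-profile and this influential on everything at X. Do you have any lessons or key takeaways on how the project survived so many organizational changes and leadership transitions?
Keith Coleman: It has definitely been a crazy period to be doing this, but also fun; the craziness is entertaining in itself. I think one reason the product has done so well and survived may be the nature of the product itself. It's designed to produce information that people who normally disagree will still find helpful. So even if you have CEOs or leaders who disagree with each other, there's a good chance they'll each find it helpful: "Wow, this thing really does produce useful stuff." So I think there's something about the product itself where, when people see it, whichever side they're on, left, right, up, down, they're quite likely to find it helpful, and I think that really helped.
I also think the team executed really well. We had exciting, ambitious goals and were solving a real problem, one that really matters in the world. As we talked about earlier, the product needed to prove itself at every step, and we'd make sure it had, then take the results that convinced us and share them with others. And they'd say, "Oh yeah, I agree, it has proven itself, let's take the next step." We did that all along the way and we still operate that way now. I think that focus on outcomes and goals that matter, along with solid execution, really helped a lot.
Around the time of the acquisition, the team was hardly distracted by anything. There were plenty of opportunities for distraction in that period, but this team was shipping every week, and we were highly focused on the goal: get this thing working, get these notes out there. I think people saw that execution and were willing to support us.
Lenny Rachitsky: Right, like, "It's working, why would we mess with it?" And it's important, and it saves us from hiring tens of thousands of people to do fact checking.
Keith Coleman: What's interesting about that is that through this entire process, no one has ever asked us about, brought up, or seemed to care about anything related to saving costs. I think that's an assumption people outside the company make, that this must be why people are interested in it. But it was never the goal, it's not at all why the project started, and it's not why people are excited about it. I think for people on the outside who may not see the internal conversations, that's also a heartening thing: the focus has always been on solving the problem. With the alternative approaches, even if you had ten thousand people doing it, the real problem is that they don't work well, because they aren't trusted, or can't scale, or are too slow. So the goal has always just been to help people stay informed at scale. Let's build an internet-scale solution to an internet-scale problem, and one that people like.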
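Low Ego and the Project's Success
Lenny Rachitsky: Keith, when I asked around about how this works and why it's been so successful, people brought you up and said you're very low ego, which let you give up a whole team, power, and influence, and not even care about the name: "Whatever, if you want to call it Community Notes, call it Community Notes, no problem." Is there anything you can share about that? How do you think about it? How important is low ego as a product leader?
Keith Coleman: For me, this project... I feel like I'm doing community service with this project. I see my work as serving people and the community, and that's what drives me. The only thing I care about is delivering something the world finds helpful. So in a sense, this project has never been about ego; it's about seeking the truth, not "truth" in the sense of what information is true, but figuring out what it really takes to make this work. What structure does it need, what should it be called? Whatever produces the best outcome is what we should do. So I think I'm more focused on whether the product is helpful than on anything else. To some extent, what looks like low ego is probably more that I really want to solve the problem well.
Lenny Rachitsky: Part of what I'm hearing is also that if you win, if you succeed, good things will follow, so just focus on that.
Keith Coleman: There are definitely satisfying things that happen. It's very satisfying to see people appreciate it. It's satisfying that people on both the left and the right like it. Even people who receive notes like notes, reach out to the authors, and post about it. That's awesome, and it feels great to help people get this. Yeah, it's very motivating, and a great reason to wake up every morning.
Lenny Rachitsky: It's absurd that this worked, and at the same time it feels like, of course it should work, of course something like this should work. It's just really interesting...
Keith Coleman: It's a product of the internet, it belongs to the internet, and that's why it works.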
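The Future of Community Notes
Lenny Rachitsky: Oh man. Where is Community Notes going next? What's happening now, and what's the future direction?
Keith Coleman: We're basically always working toward one thing: more, better, faster notes. There's clearly an opportunity for more notes to show, we want them to stay at the current quality or better, and we want them to show up faster. So we keep making core product changes to get there. For example, we just recently shipped an update, the so-called Community Notes bat signal, which is the ability to request a Community Note. Anyone on X can say, "Hey, I think this post needs a Community Note," and now they can even attach a source to explain why, so that when potential writers see it, it's easier to write a note. So we're always making core improvements and core algorithm optimizations like that.
I think there are also some new frontiers with a lot of potential, and AI and LLMs are one of them. It's easy to imagine AI helping people with this work in many ways, getting information out there quickly. Maybe Jay can talk about the Supernotes project we did with some people outside the company.
Jay Baxter: Yeah, one cool benefit of having public data and code is that outside researchers can collaborate with you. In this case, the Supernotes team came up with an idea: take existing notes as input, gather up the submitted notes that may have some problem, that cover only part of the facts, or whose wording is slanted in some way, feed them all in, have an LLM generate lots of different variants, and then basically build a simulated jury. That means simulating a representative set of Community Notes contributors rating the note, and predicting, based on their past rating history, how they would rate these LLM-generated notes. That way you're not having an LLM write a note from scratch and hoping it's good; you can simulate the entire Community Notes rating process and explicitly create notes that are likely to be rated helpful.
I think ideas like this are very promising for the future and a great way for LLMs and humans to work together. Of course, AI agents can also browse the web, and you can imagine one way an agent could assist humans is by checking whether a note really is supported by its sources. But this raises some questions too: will people still verify as carefully? Right now I think raters are very careful, because they know this was just written by a Community Notes contributor, so "I'd better check it before I rate it helpful." But we'd want to design the product so that people don't blindly trust the output and instead verify it themselves before rating it helpful.
Lenny Rachitsky: Yeah, that's a really interesting area to explore: avoiding AI hallucinations and slop while making it easier to scale and cover more. What an interesting challenge.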
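The simulated-jury idea Jay describes can be sketched in a few lines of Python. Everything here is hypothetical: the `Juror` and `Candidate` fields, the hand-assigned `slant` and `quality` numbers, and the `predict_helpful` formula stand in for models that would be learned from real rating history, and the LLM step that generates candidate notes is not shown. The sketch only illustrates the selection logic, which is to pick the candidate that the less-convinced side of a simulated panel is still predicted to find helpful.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Juror:
    leaning: float     # hypothetical viewpoint in [-1, 1], inferred from past ratings
    strictness: float  # hypothetical baseline reluctance to rate anything helpful


@dataclass(frozen=True)
class Candidate:
    text: str
    slant: float    # hypothetical: how far the wording leans, on the same axis
    quality: float  # hypothetical: sourcing and clarity, in [0, 1]


def predict_helpful(juror, candidate):
    """Stand-in for a learned predictor; returns a score clamped to [0, 1]."""
    raw = candidate.quality - juror.strictness + 0.5 * candidate.slant * juror.leaning
    return min(1.0, max(0.0, raw))


def jury_score(jurors, candidate):
    """Score a candidate by the side of the panel that likes it least.

    Assumes the panel contains at least one juror on each side.
    """
    left = [predict_helpful(j, candidate) for j in jurors if j.leaning < 0]
    right = [predict_helpful(j, candidate) for j in jurors if j.leaning >= 0]
    return min(mean(left), mean(right))


def pick_best(jurors, candidates):
    """Return the candidate with the highest jury score."""
    return max(candidates, key=lambda c: jury_score(jurors, c))


panel = [
    Juror(leaning=-0.8, strictness=0.1),
    Juror(leaning=-0.6, strictness=0.1),
    Juror(leaning=0.7, strictness=0.1),
    Juror(leaning=0.9, strictness=0.1),
]
candidates = [
    Candidate(text="well-sourced but slanted rewrite", slant=0.8, quality=0.9),
    Candidate(text="neutral rewrite", slant=0.0, quality=0.7),
    Candidate(text="rewrite slanted the other way", slant=-0.5, quality=0.8),
]

best = pick_best(panel, candidates)
print(best.text, round(jury_score(panel, best), 2))
```

With these made-up numbers the neutral rewrite wins even though it has the lowest `quality` value, because each slanted candidate loses support from one side of the panel. Taking the minimum across sides is one simple way to encode "helpful to people who usually disagree"; a real system would presumably reuse the production scoring model instead.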
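A Product Built by the People
Keith Coleman: Beyond the AI element, another cool thing about this project is that it was done outside the company. We talked earlier about open source transparency. The key reason we open sourced all of this is so people can see how it works, but the dream actually goes further: it's not just that the notes and ratings come from the people, the real dream is that the product is built by the people. What if the scoring algorithm were largely, or even entirely, written by the public? That would be amazing. And Supernotes may be the first really substantial potential change that could alter how the algorithm works, and it came from outside and could become part of the core. We'd love to see the product go in that direction.
Lenny Rachitsky: Awesome, go Supernotes. You two, the work you're doing is amazing. I think it's every product person's dream: a small team, with a lot of support, having a huge impact, on something with natural pull. I think this will inspire a lot of people.
So let me ask you, is there anything else you'd like to share? Anything else you think would be helpful for listeners?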
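Unexpected Uses of Community Notes
Jay Baxter: Sure. One thing I've found interesting while building this product is... I think it's similar to how the retweet wasn't originally Jack's idea; users started doing it on their own, and then it became a core product feature. It's the same with Community Notes. A lot of surprising uses have emerged, many things people do with Community Notes that we didn't anticipate, and it's really cool to see those user needs emerge naturally.
As an example, we'd always had political misinformation in mind, but for some reason a lot of people are passionate about debating whether Messi or Ronaldo has scored more goals. That one's pretty fun. There's also a community self-governance aspect. We originally thought this was specifically for adding context to misleading or potentially misleading information, but you can see some notes go beyond that and flag content people consider spam. So I think that's another dimension of Community Notes as a product powered by the people.
Lenny Rachitsky: That's lovely. Basically they're working to keep Twitter/X healthy, like saying, no, this should be dealt with, this tweet is spam.
Jay Baxter: Yeah.
Lenny Rachitsky: I love that. On Messi versus, who was the other soccer player?
Jay Baxter: Ronaldo.
Lenny Rachitsky: Ronaldo, okay. Is there a definitive factual answer there, or is it just impossible to say?
Jay Baxter: Yeah, I think it's an interesting case because the raters are actually very polarized. It happens to fit our core algorithm well: some people are just die-hard Messi or Ronaldo fans, much like the positions they might hold on political issues, so we actually model that topic specifically, along with some other topics, so that we can estimate where people stand in that particular debate. It's kind of fun that things like this emerge.
Lenny Rachitsky: So you're saying that's the most contentious topic on X: Ronaldo versus Messi.
Jay Baxter: It's definitely a very contentious topic.
Lenny Rachitsky: Wow, who would have thought? Okay. Keith, anything you'd like to add?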
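Reasons for Optimism About the World
Keith Coleman: Yes. Community Notes is cool in itself, but what it reveals about society is actually bigger. Society often feels very polarized, and you hear people say that all the time, that no one can agree on anything. But Community Notes actually shows you that people really can agree on a lot. Even on highly contentious topics related to politics and so on, there's a lot of agreement, and that's why notes work.
I think that's a really important reason to be optimistic about the world: although the world may look very polarized, maybe something like 80% of people agree on a lot of things. Imagine if we could take the same kind of approach used in Community Notes and use it to find agreement on legislation, policy, or other things people want governments or the world to act on. We might build more momentum behind the ideas that people actually want, and everyone would be happier. Maybe the 10% at each end wouldn't be satisfied, but I'd bet there's a lot of agreement out there we haven't identified, and if we found it and acted on it, we'd all be pretty happy. So I don't know, I think it's easy to feel pessimistic about the world, but I think this product is a great reason to be optimistic about the future.
Lenny Rachitsky: What a great way to wrap up. Keith, I can now see why people want to join you and work with you.
Keith Coleman: Thanks. If you do want to join, we're hiring a machine learning engineer. You can work on these exciting problems with us and have a lot of fun too. We're taking applications at x.com/communitynotes.
Lenny Rachitsky: Okay, glad you gave the URL. Oh man, you're about to be flooded with applications. Guys, thank you so much for doing this. Besides going to that address to join the team as a machine learning engineer, is there anywhere else you'd like to point people, like your social media or anything else?
Keith Coleman: I'm KeithColeman on X. If you have any feedback or want to help us, whether that's coming to work here or contributing from the outside, please reach out. We'd love to talk.
Jay Baxter: I'm @JayBaxter on X. Beyond using Community Notes, I think it'd be great to get more substantive contributions, like submitting pull requests or collaborating on projects like Supernotes. If people really do want to contribute, I think that's the most exciting way.
Lenny Rachitsky: Everyone, go submit code. Awesome. Thanks again for being here.
Keith Coleman: Thanks for having us, Lenny.
Jay Baxter: Thanks, really appreciate it.
Lenny Rachitsky: My pleasure. Bye, everyone. Thank you so much for listening. If you found this valuable, you can subscribe to the show on Apple Podcasts, Spotify, or your favorite podcast app. Also, please consider giving us a rating or leaving a review, as that really helps other listeners find the podcast. You can find all past episodes or learn more about the show at LennysPodcast.com. See you in the next episode.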
Glossary
| Term | Chinese rendering and notes |
|---|---|
| A/B test | A/B 测试 (a controlled experiment) |
| Amazon MTurk | Amazon's crowdsourced microtask platform |
| Arma 3 | Military simulation video game |
| Asana | Project management tool |
| Birdwatch | Original project name of Community Notes |
| Blake Scholl | Founder and CEO of Boom Supersonic |
| bridging-based agreement algorithm | 基于桥接的共识算法 |
| Chris Bale | Researcher |
| Community Notes | Community annotation feature on X |
| Contributor | 贡献者 |
| DAU | Daily Active Users |
| echo chamber | 回音室 |
| Elon | Elon Musk, owner of X/Twitter |
| Figma | Design tool |
| filter bubble | 过滤气泡 |
| IC | Individual Contributor, a non-management role |
| Jack | Jack Dorsey, former Twitter CEO |
| Jay Baxter | Senior engineer at X, Community Notes algorithm lead |
| Jira | Project management tool |
| Kaggle | Data science competition platform |
| Kayvon | Twitter/X executive, Keith Coleman's manager |
| Keith Coleman | Vice president at X, responsible for Community Notes |
| Lenny Rachitsky | Podcast host, author of Lenny's Newsletter |
| Linda | Linda Yaccarino, CEO of X |
| matrix factorization | 矩阵分解 (a machine learning method) |
| media literacy | 媒介素养 |
| Messi | Lionel Messi, Argentine soccer player |
| Meta | Technology company, parent of Facebook |
| Monday.com | Project management tool |
| OKR | Objectives and Key Results, a goal management method |
| PageRank | Google's web page ranking algorithm |
| Parag | Parag Agrawal, former Twitter CEO |
| Periscope | Live video streaming app owned by Twitter |
| pseudonymity | 化名性 |
| Ronaldo | Cristiano Ronaldo, Portuguese soccer player |
| Spaces | Live audio room feature on X/Twitter |
| Supernotes | External research project that uses LLMs to help generate Community Notes |
| Thermal | Twitter's internal mechanism for isolated innovation teams |
| Twitter 2.0 | Name of the reform plan after Elon Musk acquired Twitter |
| Vitalik Buterin | Co-founder of Ethereum |
| difference in differences | 双重差分 (a causal inference method) |