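Workflows with Model Context Protocol (MCP)
Summary
This lesson introduces the Model Context Protocol (MCP), an open standard that lets AI agents connect securely to external tools, APIs, and data sources. The instructor demonstrates connecting Gemini CLI to Canva through MCP, pulling brand assets from the remote design tool, and automatically generating a social media kit page for a conference website.
Key points
- MCP is an open standard that lets LLMs interact with external systems through a single protocol; servers exist for services such as MongoDB, Google Cloud, and GitHub
- Add a remote server with gemini mcp add, authenticate over OAuth with /mcp auth, and list the available tools with /mcp
- Gemini CLI can call several MCP tools in parallel, discovering and running the operations it needs on its own (for example, searching designs, reading files, and writing code)
- Choose "Yes, allow always" or press Shift+Tab to enter accepting edits mode and speed up multi-step tasks
- MCP is not tied to a specific platform, so the tools remain reusable when you migrate an application
Video: Workflows with Model Context Protocol (MCP)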
中文翻译
MCP 让你能够将 Gemini CLI 连接到你最喜欢的外部工具和服务,从 Canva 到数据库都可以。在本节课中,你将学习如何访问远程服务器,利用品牌素材来生成一个网页。让我们开始写代码吧。
模型上下文协议(Model Context Protocol,简称 MCP)是一个开放标准,旨在使 AI 代理能够无缝且安全地连接到外部工具、API 和数据源。MCP 提供了一个标准,规定了 LLM(大语言模型)如何与数据和上下文进行交互。
让我们以 MongoDB 为例。我们在 MongoDB 数据库中存储了信息,并且他们提供了一个 MCP 服务器。我们可以在 Gemini CLI 中使用该 MCP 服务器来访问所有数据,列出表格,获取数据,并通过这个标准协议对我们的实际数据库进行读写操作。而且这不仅仅局限于 Gemini CLI。如果我们决定迁移我们的应用程序或使用不同的代理,我们仍然可以使用完全相同的工具,因为我们使用了 MCP。
这非常有价值,因为 MCP 允许你在 Gemini CLI 的内置功能之上进行构建,扩展你可以访问的工具和命令。你可以使用 MCP 连接到 Google Cloud、GitHub、Snyk 等。你甚至可以构建自己的 MCP 服务器来运行你自己的代码。你在日常任务中使用的任何流行的云服务,很可能都已经作为 MCP 服务器可供使用了。
在我们将第一个 MCP 服务器连接到 Gemini CLI 之前,让我们先看看我们的网站及其结构。我们可以使用 Gemini CLI 的 shell 模式。该模式会透传并实际运行命令,而无需 Gemini CLI 进行思考。现在,你运行的任何命令就像是在常规终端中直接运行一样。
现在我们已经安装了网站依赖项。让我们继续运行实际的开发服务器。对于这一点,Gemini CLI 实际上会从我们的上下文中得知。它甚至不需要去读取任何文件。它仅仅知道要启动我们的开发服务器,命令是 npm run dev。它会继续提示我们确认 shell 命令。让我们允许它。我们将打开运行 npm 服务器的 localhost 端口 5173。
哎呀,这有点刺眼。让我们切换到深色模式。好的,这看起来好多了。看起来很棒。我们已经有了一个很好的基础。看起来我们想要的大部分功能都已经实现了。
我发现缺少的一样东西是社交媒体套件。参加会议的人会希望能够在社交媒体上进行分享。我们应该告诉他们使用什么话题标签(hashtag),以及会议的配色方案和字体。我们的社交媒体团队在 Canva 中策划了一个品牌套件,我们可以利用它。与其下载并保存图像和设计,我们实际上可以利用 Canva 的 MCP 服务器,通过 Gemini CLI 来完成这项工作,而无需离开终端。
Gemini CLI 拥有用于 MCP 的子命令,我们可以使用它通过一行代码添加 MCP 服务器。gemini mcp add -t 指定类型。我们使用的是 http 远程服务器。我们传递服务器的名称 canva,以及远程 MCP 端点的 URL。你可以在 Canva 的官网上找到这个 URL。一旦我们运行该命令,我们应该会看到 MCP 服务器已添加到我们的设置中。
现在,当我们启动 Gemini CLI 时,我们应该会看到它要求我们对远程服务器进行身份验证。我们可以使用 /mcp auth 来完成此操作。它会列出你配置的 MCP 服务器。让我们选择 Canva。这将在你的本地浏览器中打开一个 OAuth 流程。你需要点击允许。然后你可以回到 Gemini CLI,你应该会看到你已通过验证。要验证这一点,请运行 /mcp。这将列出你可用的 MCP 服务器,并且会有一个图标(如绿色图标)显示你已连接。/mcp 命令将列出你拥有的所有可用工具。如果你想查看详细描述,可以运行 /mcp desc。
现在我们已经连接到了来自 Canva 的 MCP 服务器,我们可以向它询问有关我们 Canva 设计的问题了,例如,“你能列出我最近的设计吗?” Gemini CLI 将查看 Canva 提供的工具,并意识到它需要调用 search-designs 工具。它知道我们最近编辑的设计是 TechStack 品牌套件。
让我们继续让 Gemini CLI 读取该设计,并为我们的社交媒体套件创建一个 /socials 页面。Gemini CLI 将再次发现工具……它将继续获取设计。我们实际上看到它在这里并行调用不同的工具。所以它在读取文件的同时,实际上也在去从我们的 Canva 设计中获取内容。我们可以看到它正在拉取我们 Canva 设计中的不同元素,比如我们在屏幕上看到的内容,以及所有不同的字体和颜色。
它现在正在逐步执行并弄清楚它的计划,以及如何将 socials 页面添加到我们的网站。现在它将继续写入文件。与其手动允许每一个确认,我们可以选择"是,始终允许(Yes, allow always)",这将有助于加快我们的流程。你可以看到它现在让我们进入了接受编辑模式。你也可以通过运行 Shift Tab 来进入这里。
好了,完成了。我们现在应该有了一个 socials 页面,并且在页脚中,我们应该有一个指向 Social Kit 的链接。让我们看一看。就在那里,Social Kit。我们现在有了一个官方页面,可以发送给与会者,这样他们在社交媒体上发帖时就可以使用官方资源了。
非常不错。Gemini CLI 可以替你完成相当复杂的任务。我鼓励你把复杂的任务交给 Gemini CLI。你可能会对它能完成的事情感到惊讶。接下来,我们将通过 Gemini CLI 的扩展生态系统,超越 MCP 的范畴。
English Script
MCP lets you connect Gemini CLI to your favorite external tools and services, from Canva to databases. In this lesson, you’ll learn how to access remote servers and use brand materials to generate a web page. Let’s code.
Model Context Protocol, or MCP, is an open standard enabling AI agents to seamlessly and securely connect with external tools, APIs, and data sources. MCP provides a standard for how LLMs, or large language models, interact with data and context.
Let’s take MongoDB as an example. We have information stored in a MongoDB database, and they have an MCP server. We can use that MCP server within Gemini CLI to get access to all of the data, list the tables, fetch the data, do reads and writes to our actual database through this standard protocol. And it’s not locked into just Gemini CLI. If we decide to move our application or use a different agent, we can still use those exact same tools because we’ve used MCP.
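To make the "standard protocol" idea a bit more concrete: MCP messages are JSON-RPC 2.0, and a client discovers what a server offers with a `tools/list` request. The sketch below builds that request and pulls the tool names out of a response. The sample response and its tool names (`example-read`, `example-write`) are placeholders made up for illustration; they are not taken from the MongoDB server.

```python
import json


def build_list_tools_request(request_id: int) -> str:
    """Serialize a JSON-RPC 2.0 request asking an MCP server for its tools."""
    return json.dumps({"jsonrpc": "2.0", "id": request_id, "method": "tools/list"})


def tool_names(response_text: str) -> list[str]:
    """Extract the tool names from a tools/list response."""
    response = json.loads(response_text)
    return [tool["name"] for tool in response["result"]["tools"]]


# Hypothetical response: the tool names and descriptions are placeholders.
sample_response = json.dumps({
    "jsonrpc": "2.0",
    "id": 1,
    "result": {"tools": [
        {"name": "example-read", "description": "Read documents",
         "inputSchema": {"type": "object"}},
        {"name": "example-write", "description": "Write documents",
         "inputSchema": {"type": "object"}},
    ]},
})

print(build_list_tools_request(1))
print(tool_names(sample_response))
```

Because any MCP client can send the same request, a different agent could discover the same tools without server-side changes, which is the portability point made above.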
This is valuable because MCP lets you build on top of the built-in capabilities of Gemini CLI, extending the tools and commands you have access to. You can go ahead and use MCP to connect to Google Cloud, to GitHub, to Snyk. You can even build your own MCP server that runs your own code. Any of the popular cloud services that you use in your day-to-day tasks are probably available as MCP servers.
Before we connect our first MCP server to Gemini CLI, let’s take a look at our website and see how it’s structured. We can use Gemini CLI’s shell mode. This passes the command through and actually runs it without Gemini CLI having to think about it. Now, any command you run behaves as if you were running it directly in your normal terminal.
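As a rough analogy for what "passing through" means, the sketch below hands a command string straight to the system shell and returns its output, with no model call in between. This is only an illustration of the idea, not how Gemini CLI implements shell mode, and the `run_passthrough` helper is a hypothetical name.

```python
import subprocess


def run_passthrough(command: str) -> str:
    """Run a command in the system shell and return its stdout (no model involved)."""
    completed = subprocess.run(
        command, shell=True, capture_output=True, text=True, check=True
    )
    return completed.stdout


print(run_passthrough("echo hello from the shell"))
```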
So we now have our website dependencies installed. Let’s go ahead and run the actual dev server. For this, Gemini CLI is actually going to know this from our context. It doesn’t even have to go and read any files. It just knows that to start our dev server, it’s npm run dev. And it’s going to go ahead and prompt us for the shell command. Let’s go ahead and allow it. We’ll go ahead and open localhost on port 5173 where the npm server is running.
Oof, that’s a little tough on the eyes. Let’s go ahead and change into dark mode. Okay, that looks a lot, lot better. This looks awesome. We’ve got a great foundation. It looks like we have most of the stuff we want implemented.
One thing I see that’s missing is a social media kit. People who attend the conference are going to want to be able to share on social media. We should tell them what hashtag to use as well as the color schemes and fonts for the conference. Our social media team has curated a brand kit in Canva that we can use for this. Instead of downloading and saving the image and design, we can actually leverage Canva’s MCP server to do this with Gemini CLI without having to leave our terminal.
Gemini CLI has subcommands for MCP that we can use to add an MCP server with a single line. gemini mcp add -t for the type. We’re using an http remote server. We pass the name of the server, canva, as well as the URL to the remote MCP endpoint. You can find this URL online from Canva. Once we run the command, we should see that the MCP server has been added to our settings.
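The command described above can be sketched as follows. The script assembles the `gemini mcp add` invocation and prints the kind of entry it might leave in the settings file. Several things here are assumptions: the URL is a placeholder (the lesson does not give the real Canva endpoint), the argument order follows the narration, and the `mcpServers` and `httpUrl` key names are my understanding of the settings layout and may differ between Gemini CLI versions.

```python
import json
import shlex

# Placeholder only: the real endpoint URL is published by Canva.
CANVA_MCP_URL = "https://example.invalid/mcp"


def build_add_command(name: str, url: str, transport: str = "http") -> list[str]:
    """Return the argv for the `gemini mcp add` call described in the lesson."""
    return ["gemini", "mcp", "add", "-t", transport, name, url]


def settings_entry(name: str, url: str) -> dict:
    """Assumed shape of the settings entry for an HTTP MCP server."""
    return {"mcpServers": {name: {"httpUrl": url}}}


print(shlex.join(build_add_command("canva", CANVA_MCP_URL)))
print(json.dumps(settings_entry("canva", CANVA_MCP_URL), indent=2))
```

Checking the settings file after running the real command is a quick way to confirm which keys your installed version actually writes.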
Now when we start up Gemini CLI, we should see that it’s asking us to authenticate to the remote server. We can do this with /mcp auth. It will give you a list of your configured MCP servers. Let’s choose Canva. It will open up an OAuth flow in your local browser. You’ll want to go ahead and click Allow. You can come back to Gemini CLI and you should see that you are authenticated. To verify this, run /mcp. This will list your available MCP servers, with an indicator such as a green icon to show that you are connected. The /mcp command will also list all the available tools you have. If you want to see detailed descriptions, you can run /mcp desc.
Now that we are connected to the MCP server from Canva, we can ask it questions about our Canva designs, such as, “Can you list my most recent design?” Gemini CLI will look at the tools Canva provides and realize that it needs to call the search-designs tool. It knows that the design we edited most recently was the TechStack Brand Kit.
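Under the hood, a tool invocation like this one is an MCP `tools/call` request that names the tool and passes its arguments. Here is a minimal sketch of that message for `search-designs`. The `query` argument is hypothetical, since the lesson does not show the tool's input schema.

```python
import json


def build_tool_call(request_id: int, tool: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 `tools/call` request for an MCP server."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }


# "query" is a hypothetical argument name used only for illustration.
request = build_tool_call(2, "search-designs", {"query": "brand kit"})
print(json.dumps(request, indent=2))
```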
Let’s go ahead and ask Gemini CLI to read in the design and create a /socials page for our social media kit. Gemini CLI will again discover the tools it… It’s going to go ahead and get the design. We’re actually seeing it call different tools in parallel here. So it’s reading the file at the exact same time that it’s going to actually go and get the content from our Canva design. We can see that it’s pulling in the different elements from our Canva design, such as what we had on screen, as well as all the different fonts and colors.
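The parallel behavior described here can be pictured with a small concurrency sketch. Two stand-in coroutines, one for a local file read and one for a remote design fetch, are started together and awaited as a group. The function names, the file path, and the simulated delays are all hypothetical; this is not Gemini CLI's scheduler.

```python
import asyncio


async def read_local_file(path: str) -> str:
    """Stand-in for a local file-read tool."""
    await asyncio.sleep(0.1)  # simulate I/O latency
    return f"contents of {path}"


async def fetch_design(design_name: str) -> str:
    """Stand-in for a remote MCP tool call."""
    await asyncio.sleep(0.1)  # simulate network latency
    return f"design data for {design_name}"


async def gather_context() -> list[str]:
    # Both coroutines run concurrently; results come back in argument order.
    return await asyncio.gather(
        read_local_file("src/App.jsx"),  # hypothetical path
        fetch_design("TechStack Brand Kit"),
    )


results = asyncio.run(gather_context())
print(results)
```

Since the two calls do not depend on each other, running them together takes roughly as long as the slower one rather than the sum of both.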
It’s now stepping through and figuring out its plan, and how it can add the socials page to our website. And now it’s going to go ahead and write the file. Instead of having to manually allow each and every single one of these confirmations, we can do “Yes, allow always”, and it will help speed up our flow. You can see that it has now put us into accepting edits mode. You can also get here by pressing Shift+Tab.
All right, it’s complete. We should now have a socials page, and in the footer we should have a link to the Social Kit. Let’s take a look. And there it is, Social Kit. We now have an official page we can send to conference goers so they can use official assets when they post on socials.
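One way to picture what "Yes, allow always" does is a small approval gate that remembers which tools have already been approved, so later calls skip the prompt. This is a sketch of the general idea under my own assumptions, not Gemini CLI's implementation. The `ApprovalGate` class, the answer strings, and the `write_file` tool name are hypothetical.

```python
from typing import Callable


class ApprovalGate:
    """Remembers tools the user approved with an 'always' answer."""

    def __init__(self, ask: Callable[[str], str]) -> None:
        self._ask = ask  # returns "once", "always", or "no"
        self._always_allowed: set[str] = set()

    def approve(self, tool: str) -> bool:
        if tool in self._always_allowed:
            return True  # already approved, so no prompt is shown
        answer = self._ask(tool)
        if answer == "always":
            self._always_allowed.add(tool)
        return answer in ("once", "always")


prompts: list[str] = []


def scripted_user(tool: str) -> str:
    """Simulated user who picks 'always' whenever asked."""
    prompts.append(tool)
    return "always"


gate = ApprovalGate(scripted_user)
decisions = [gate.approve("write_file") for _ in range(3)]
print(decisions, "prompts shown:", len(prompts))
```

In this simulation three writes trigger a single prompt, which is the speed-up the narration describes.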
Not bad at all. Gemini CLI can accomplish pretty complex tasks on your behalf. I encourage you to give complex tasks to Gemini CLI. You might be surprised by what it can accomplish. Coming up, we’ll go above and beyond MCP with Gemini CLI’s extensions ecosystem.