Jack Wotherspoon · 2025-01-30

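Using Gemini CLI for Content Creation

Summary

This lesson shows how Gemini CLI can be applied to content creation. Working from a conference podcast video, it demonstrates how to use AI tools to automate marketing tasks such as video transcription, short-clip editing, blog writing, and social media content generation. The lesson draws on Gemini's multimodal capabilities, together with the Nano Banana extension, to build an efficient workflow from raw video to a full set of promotional assets.

Key Points

  • Video transcription and tool integration: Gemini CLI can automatically invoke tools such as ffmpeg, whisper, and uv to transcribe a video and generate timestamped subtitle files
  • Smart short-clip editing: it identifies highlights from the transcript and extracts short videos from the original footage for social media promotion
  • Creative blog generation: it combines video screenshots with creative images generated by the Nano Banana extension to write a blog post with visual content
  • Social media automation: it generates several versions of social media posts from the existing content, simplifying the marketing process
  • Multimodal workflow: the lesson shows how Gemini CLI handles video, images, text, and other media types together

Video: Gemini CLI for Content Creation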


Chinese Translation

到目前为止,你已经看到了 Gemini CLI 的许多编程用例。但许多开发者正在将他们对代理型编程工具的使用扩展到非编程用例中。Gemini CLI 擅长处理多媒体数据,无论是图像还是视频。让我们通过处理一项营销任务来发挥创意吧。

现在我们要来看看如何将 Gemini CLI 用于内容创作。正如我们提到的,Gemini 的优势之一是它的多模态能力。它真的可以很好地与各种不同类型的文本、视频、图像等进行交互。

在会议期间,我们录制了一个视频播客。我们的一位团队成员刚刚把它发给了我们,我们已经把它下载到了本地机器上。我们想用 Gemini CLI 为这段录音制作宣传内容。我们希望它能真正深入分析视频,并帮助我们制作一些宣传素材,比如几个短视频(Shorts)、一篇博客文章和一些社交媒体帖子。

还有一些扩展对此非常有用,在本节课中我们将利用 Nano Banana 这个扩展。我们现在位于一个 CONTENT 文件夹中,这里有我们的本地播客文件,以及我们团队成员发来的 GEMINI.md 上下文文件。

我们想要进行的第一个提示操作实际上是让 Gemini CLI 对视频进行转录,并带有时间戳和引用。我们先让它把内容保存到一个本地的 .txt 文件中。看起来 Gemini CLI 已经通知我们,它需要某些工具来完成手头的任务,比如 ffmpeg 和 whisper。它会去检查这些工具是否已安装。它还将尝试通过检查 uv 来设置环境。现在它将调用 uv 以便使用 whisper 来转录视频。

看起来 Gemini CLI 遇到了一些问题,但它已经诊断出了问题所在,并将尝试恢复。看起来它已经运行完毕了。它遍历了整个视频并生成了带有时间戳的转录。这看起来正是我们想要的,非常完美。看起来 Gemini CLI 还创建了不同格式的字幕文件。

利用这些转录内容,我们将要求 Gemini CLI 为我们创建一些短视频,实际上就是深入我们的原始视频,将其剪切并提供给我们可用于社交推广的片段。做这种事情有很多不同的方法。我个人很多时候都是手动完成的。所以,有一个像 Gemini CLI 这样的工具能替我完成这项工作真是太好了。

看起来 Gemini CLI 已经确定了它想要用来剪切视频的三个短片或时间戳。现在完成了。它选取了三个片段并进行了剪辑。看起来它剪辑了“Gemini CLI 简介”、“10 倍与 100 倍的见解”以及“关于 Gemini CLI 扩展的讨论”。

“有了 AI,如今使用 AI,让自己实现 10 倍的提升太容易了。但要实现 100 倍的提升,那才是难点所在。”

既然我们有了不同的视频片段,我们想更进一步。我们还想要一篇可以发布的博客文章。我们会希望在博客中包含图片。所以这就是我们要使用 nanobanana 扩展的地方。我已经完成了安装它的过程,其方式与你看到的 Google Workspace 扩展的安装方式完全相同。

所以现在让我们告诉 Gemini CLI 使用它之前创建的转录内容来创建那篇博客。Gemini CLI 将从实际视频本身截取屏幕截图,然后将这些作为输入提供给 nanobanana,以帮助使它们更具创意。

看起来已经完成了。它把博客写到了一个 Markdown 文件中,博客的标题是"编程的未来在你的终端里:深入探索 Gemini CLI"。相当吸引人。我就不让大家通读全文了,但我们可以快速浏览一下。我们可以看到截图和 Nano Banana 生成的图像很好地融合在了一起。

我们给 Gemini CLI 的最后一个任务。现在我们有了不同的短视频以及博客,我们想创建宣传用的社交媒体帖子。那就让 Gemini CLI 为我们做这件事吧。

看起来它已经生成了四个社交媒体帖子。让我们看一看。这是它想出的一些帖子。老实说,还不错。你甚至可能会看到我发布其中一条。

我们已经看到了 Gemini CLI 的许多不同用例。我们将稍微转换一下思路,离开技术会议这个用例,给你带来一个我们认为你会觉得有用的额外课程。我们知道你正在上一门在线课程,所以我们要展示如何将 Gemini CLI 用于学习,这应该会很有用。

English Script

So far, you’ve seen many coding use cases for Gemini CLI. But many developers are expanding their use of agentic coding tools into non-coding use cases. Gemini CLI excels at working with multimedia data, whether that’s images or videos. Let’s get creative by tackling a marketing task.

So now we’re going to take a look at how Gemini CLI can be used for content creation. As we mentioned, one of the strengths of Gemini is its multimodality. It can really interact well with all different types of text, video, images, you name it.

During the conference, we recorded a video podcast. One of our team members just sent it over to us and we’ve downloaded it onto our local machine. We want to use Gemini CLI to produce promotional content for the recording. We want to have it actually go through and analyze the video and help us make some promotional assets, like a few shorts, a blog post, and some social media posts.

There are also a few extensions that are very useful for this, and we’ll take advantage of the Nano Banana one in this lesson. We’re now in a CONTENT folder where we have our local podcast, as well as a GEMINI.md context file that our team member sent over.

The first prompt we want to run is to have Gemini CLI transcribe the video with timestamps and quotes. We’ll have it save the result to a local .txt file for now. It looks like Gemini CLI has notified us that there are certain tools it needs to accomplish the task at hand, such as ffmpeg and whisper. It’s going to check whether these are installed. It’s also going to try to set up the environment by checking for uv. It’s now going to call uv in order to transcribe the video using whisper.

It looks like Gemini CLI ran into an issue, but it’s diagnosed it and it’s going to try to recover. It looks like it’s done running. It’s gone through the whole video and transcribed it with timestamps. This looks like exactly what we wanted, which is perfect. It also looks like Gemini CLI went ahead and created different formats for subtitles.
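The recording does not show the script Gemini CLI wrote for this step, so the following is only a minimal sketch of how timestamped segments might be turned into a subtitle file. It assumes each segment is a dict with `start` and `end` (in seconds) and `text` keys, which is the shape the open-source whisper package returns under `segments`; the function names are hypothetical.

```python
from typing import Dict, List


def format_timestamp(seconds: float) -> str:
    """Render a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments: List[Dict]) -> str:
    """Build SRT subtitle text from segments with start, end, and text keys."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        # Each SRT cue is: running number, time range, then the caption text.
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}")
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    # Hypothetical segments standing in for a real transcription result.
    demo = [
        {"start": 0.0, "end": 2.5, "text": " Welcome to the podcast."},
        {"start": 2.5, "end": 6.0, "text": " Today we talk about Gemini CLI."},
    ]
    print(segments_to_srt(demo))
```

Keeping the timestamp formatting in its own function makes it easy to reuse the same segments for other subtitle formats or for a plain-text transcript.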

Using these transcriptions, we’re going to ask Gemini CLI to create some shorts for us: to actually go into our original video, chop it up, and give us clips that we can use for social promo. There are a lot of different ways to do this kind of thing. I personally do it manually a lot of the time. So it’s nice to have a tool like Gemini CLI that will go ahead and do it for me.

It looks like Gemini CLI has gone ahead and pinpointed the three shorts or timestamps that it wants to use for cutting up the videos. It’s now done. It’s picked up three clips and gone ahead and clipped them. It looks like it did “introduction to Gemini CLI”, “the 10x vs 100x insight”, as well as “discussion on Gemini CLI extensions”.
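The exact commands Gemini CLI ran to cut the clips are not shown in the recording. As a rough sketch of how a clip between two timestamps could be cut, the helper below builds an ffmpeg command that seeks to the start time and copies a fixed duration without re-encoding. The file names and function names are hypothetical, and using ffmpeg's `-ss`, `-t`, and `-c copy` options here is an assumption about one reasonable approach, not a record of what the tool did.

```python
from typing import List


def to_seconds(timestamp: str) -> float:
    """Convert 'HH:MM:SS' or 'MM:SS' (fractions allowed) to seconds."""
    seconds = 0.0
    for part in timestamp.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


def build_clip_command(source: str, start: str, end: str, output: str) -> List[str]:
    """Return an ffmpeg argument list that cuts source between start and end."""
    duration = to_seconds(end) - to_seconds(start)
    if duration <= 0:
        raise ValueError("end must be later than start")
    return [
        "ffmpeg",
        "-ss", start,              # seek to the clip start before reading input
        "-i", source,
        "-t", f"{duration:.3f}",   # keep only this many seconds
        "-c", "copy",              # copy streams instead of re-encoding
        output,
    ]


if __name__ == "__main__":
    # Hypothetical file names and timestamps; pass the list to subprocess.run
    # to actually execute it on a machine with ffmpeg installed.
    print(build_clip_command("podcast.mp4", "00:01:00", "00:01:45", "short_1.mp4"))
```

Stream copying is fast, but cuts land on keyframes, so the clip boundaries can be slightly off; dropping `-c copy` lets ffmpeg re-encode for more precise cuts at the cost of speed.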

“With AI, using AI these days, it is so easy to 10x yourself. 100x it. That’s the hard part.”

So now that we have different video clips, we want to take it a step further. We also want a blog that we can post. We’re going to want to include images in our blog. So this is where we’re going to use the nanobanana extension. I’ve already gone through the process of installing it, which I did in the exact same fashion that you saw with the Google Workspace extension.

So let’s now tell Gemini CLI to go ahead and create that blog using the transcriptions it created earlier. Gemini CLI is going to go ahead and take screen grabs from the actual videos themselves, and then use those as input to nanobanana to help make them more creative.
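How Gemini CLI captured the screen grabs is not shown, so this is only an illustrative sketch of one way to pull single frames from a video at chosen moments. It builds one ffmpeg command per timestamp using `-frames:v 1` to write a single image. The file names, the `frame` prefix, and the function name are hypothetical.

```python
from typing import List, Sequence


def frame_grab_commands(
    source: str, timestamps: Sequence[float], prefix: str = "frame"
) -> List[List[str]]:
    """Return one ffmpeg argument list per timestamp, each saving one PNG frame."""
    commands = []
    for index, ts in enumerate(timestamps, start=1):
        output = f"{prefix}_{index:02d}.png"
        commands.append([
            "ffmpeg",
            "-ss", f"{ts:.3f}",   # seek to the moment of interest, in seconds
            "-i", source,
            "-frames:v", "1",     # write exactly one video frame
            output,
        ])
    return commands


if __name__ == "__main__":
    # Hypothetical timestamps, e.g. taken from the transcript segments.
    for command in frame_grab_commands("podcast.mp4", [12.5, 90]):
        print(" ".join(command))
```

The resulting image files could then be handed to an image tool such as the nanobanana extension as input, which is the flow the lesson describes.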

It looks like it’s complete. It’s written a blog to a markdown file, and the title of the blog is “The Future of Coding is in Your Terminal: A Deep Dive with Gemini CLI”. Pretty catchy. I won’t make you all read through this, but let’s take a quick look. We can see a nice mix of screenshots as well as Nano Banana generated images.

We’ve got one final task for Gemini CLI. Now that we have our different short videos as well as our blog, we want to create promotional social media posts. So let’s have Gemini CLI do that for us.

It looks like it’s gone ahead and generated four social media posts. Let’s take a look. Here are some of the posts that it came up with. Honestly, it’s not too bad. You might even see me post one of these.

We’ve seen so many different use cases for Gemini CLI. We’re going to switch gears slightly and head away from the tech conference use case and give you a bonus lesson that we think you’ll find useful. We know you’re taking an online course, so we thought it would be useful to showcase how you can use Gemini CLI for learning.