创建自定义技能
摘要
本课程深入讲解 Claude 技能(Skills)的结构规范与创建最佳实践,包括 YAML 前置元数据要求、技能主体编写准则和可选目录组织方式。随后通过两个实战案例(生成练习题技能和时间序列分析技能)演示如何构建生产级技能,并介绍使用 skill-creator 工具评估技能质量的方法,最后探讨技能测试与验证策略。
要点
- 技能结构与规范:每个技能必须包含 SKILL.md 文件,以及带有名称和描述的 YAML 前置元数据;名称只能使用小写字母、数字和连字符,描述应明确说明技能的功能及触发条件
- 编写最佳实践:提供逐步指令、指定边缘情况、保持在 500 行以内、使用正斜杠路径、明确工作流顺序;根据任务性质选择自由度(遵循最佳实践宜低自由度,创意输出可高自由度)
- 可选目录组织:scripts 存放可执行代码,references 存放参考文档,assets 存放模板和资源文件;采用渐进式披露策略按需加载,以优化 token 使用
- 实战案例分析:生成练习题技能演示如何引用外部模板处理不同输出格式(Markdown/LaTeX/PDF);时间序列分析技能展示确定性工作流设计,通过 Python 脚本实现可视化和诊断
- 评估与测试:使用 skill-creator 技能和并行子智能体评估技能质量,建立类似单元测试的验证框架,确保输入输出正确、工作流顺序无误,并兼容计划使用的各个模型
中文翻译
我们现在将仔细研究技能(Skills)是如何构建的,以及创建技能的最佳实践。然后,我们将把你学到的知识应用到两个例子中。一个是根据讲义创建练习题,另一个是分析时间序列数据的特征。我们开始吧。
在这节课中,我们将重点关注技能的结构,以及与之相关的一些最佳实践。然后我们将看看我们制作的两个技能,看看它们在通过技能创建器(Skill Creator)运行时,在各项最佳实践方面表现如何。复习一下:我们制作的每一项技能都有一个必需的 SKILL.md 文件,其中包含一些 YAML Frontmatter(前置元数据),该元数据要求包含名称和描述。在 SKILL.md 的正文中,我们放入技能的内容,然后是对脚本、额外文本文件或必要资源的引用,这些资源仅在必要时才会被加载。
当我们看看关于名称和描述的一些最佳实践时,你可以想象这是至关重要的。你的名称和描述不仅决定了 Claude 如何理解你的技能做什么,也决定了它如何检测何时使用该特定技能。名称有字符数上限,描述也是如此。我们简要提到过,名称只能包含小写字母、数字和连字符,一般来说,技能的名称要坚持使用"动词 + ing"的形式。对于描述,你不仅要描述它做什么,还要描述何时使用它。如果有特定的关键字会触发智能体使用此技能,请确保充分利用这些关键字。
除了你拥有的必需字段外,智能体技能规范还允许使用可选字段。这可能是许可证、兼容性以及元数据中的任意键值对。这里需要注意的重要一点是,虽然智能体技能有一个标准,但你可能会遇到一些技能,有些是由 Anthropic 构建的,有些是其他人构建的,它们并不完全遵循这个规范。技能正处于积极开发中,技能规范也是如此,因为我们跨许多不同的模型提供商和许多不同的智能体工具生态系统开展工作。
当我们越过 YAML Frontmatter 进入技能的正文时,对技能的格式并没有硬性限制。但是,当你考虑构建可预测的工作流时,你要确保有分步说明。正如我们在其他技能中看到的那样,特别是技能创建器技能,重要的是要指定边缘情况和分步说明,如果有理由跳过某个步骤,要非常清楚地说明原因。一般来说,将其保持在 500 行以内是最佳实践,因为我们总是可以在必要时引用外部文件、资源和脚本。清晰简洁是有价值的,在路径中使用正斜杠至关重要,即使在 Windows 上也是如此:确保技能能在许多不同的环境中工作很重要。
当你考虑创建技能时,你要稍微考虑一下你想给该技能多少自由度。我们应该允许通用的方法和通用的方向,还是应该专注于特定的顺序?你可以想象,为了遵循最佳实践,我们可能希望自由度较低,但对于更具创造性的输出,如多种颜色、多种风格、多种字体,我们可以允许那种高自由度。当我们开始考虑具有多个技能的更复杂的工作流时,将事情分解为连续的步骤总是比拥有一个试图做所有事情的非常非常大的技能更有价值。这些系统可以处理 100 多个技能。重要的是要确保它们的命名恰当,不令人困惑,并且可以遵循可预测的模式。
在规范中,有空间容纳可选目录。正如我们在许多不同的技能中看到的那样,有用于脚本、参考资料和资源的子文件夹。你的脚本包括任何需要读取和执行的代码。你还要确保你有错误处理和清晰的文档。我们的参考资料包含额外的文档或参考文件。一般来说,如果参考文件很长,指示技能读取整个参考文件通常很有价值。最后,我们有底层资源。这些可以包括输出模板、图像、徽标、数据文件、架构等等。
值得注意的是,这些目录(scripts、references 和 assets)都遵循智能体技能的标准。但你可能会遇到相当多尚未遵循该特定标准的技能。该标准正在迅速发展,技能本身也在迅速发展。因此展望未来,我们期望新创建的技能遵循此标准,但你可能会遇到一些使用不同文件夹名称和不同约定的技能。现在我们对最佳实践、可选目录以及如何编写生产级技能有了很好的了解,让我们来看看我们创建的两个技能示例,逐步了解它们,然后通过 skill-creator 运行它们以分析最佳实践,并讨论如何评估这些技能,确保我们为投入生产做好准备。
我现在在 VS Code 中。在这里我们要深入研究两个自定义技能。第一个是生成练习题的技能。如果我们看看这个技能,可以看到它的描述是:根据讲义生成教育练习题以测试理解能力。你可以想象你是一名老师或讲师,你想为输入和输出指定特定的格式,并且想生成全面的问题来测试理解能力。让我们逐步了解这个技能。首先是我们支持的输入格式:我们指定使用哪些特定的库,并指定要提取什么文本。接下来是我们的问题结构。同样,我们要非常具体,所以我们指定了生成这些问题的确切顺序:从判断题开始,一直到现实应用题。
对于每一类问题,我们在下面都有子指南。我们可以看到,这个技能不超过 500 行。但如果它需要变得越来越大,我们总是可以在必要时把内容放到外部文件中供引用。当我们看看其中一些例子时,无论是判断题还是编码问题等等,我们都非常明确地规定了这些特定问题的范围、结构和所需的输出。当我们更深入地研究输出格式时,我们指定它取决于用户的请求。而且我们实际上是在引用 assets 文件夹内的模板,而不是为每种输出直接给出示例。无论是处理 LaTeX 还是 Markdown,我们都确切指定了希望它呈现的样子。
例如,对于 Markdown,这就是判断题可能的样子;对于 LaTeX,这就是我们的判断题和示例可能的样子。如果你发现自己需要某种特定的输出格式,与其把所有内容都放在 SKILL.md 中,不如在外部资源或文件中引用它。请记住,这些文件、这些模板仅在必要时才会被加载。所以我们可以只加载所需数据格式对应的特定文件,在 token 和上下文窗口方面做到极其高效。如果我们需要外部资源、特定领域的示例,我们也可以链接到它,就像我们在这里的 references 文件夹中所做的那样。我们正是在践行渐进式披露(Progressive Disclosure)的理念:只加载绝对必要的内容,并且仅在需要时引用外部文件。
我们要看的第二个技能是一个分析时间序列数据的技能。我们将提供一个 CSV 文件,希望在进行预测之前先了解其特征。这里需要注意的重要一点是,在这个特定技能中,我们要有一个非常明确的确定性工作流。我们利用几个不同的 Python 脚本来完成这项工作。首先,我们有一个 Python 脚本用于可视化要处理的数据:绘制时间序列图、直方图、滚动统计数据、箱线图等等。对于自相关,我们也有可以绘制的图,分解也是如此。
当我们看看我们的 diagnose.py 时,我们有用于分析我们要处理的数据的底层功能。虽然这里有很多函数,但我想提请你注意我们在运行诊断结束时所做的事情。我们利用这些函数来分析数据质量、分布、平稳性测试、季节性、趋势、自相关,最后以变换建议结束。我们在这里拥有的是一个可预测的工作流,我们要每次都按特定顺序运行它。所以让我们回去看看我们的技能,看看究竟是如何做到的。
首先我们从输入格式开始。我们非常明确应该寻找什么:列的名称和特定的数据类型。接下来进入这个技能最重要的部分之一,即工作流。请注意,我们对步骤非常明确,明确告诉这个技能和 Claude 在开始诊断时要运行哪个确切的脚本。然后我们可以选择生成必要的图表并将这些数据报告给用户:获取数据,查找 summary.txt 中的内容并展示相关图表。我们还可以看到,为了回答可能遇到的一些问题,我们有一个 interpretation.md 文件作为指导。
当我们看看一些脚本选项时,如果有必要,我们可以添加额外的标志。当我们开始思考输出什么时,我们可以确切指定要输出的文件树、文本文件、图像等等。我们要让传入的数据、执行的操作以及最终的输出都具有极高的可预测性。一如既往,如果有外部参考资料,我们要确保在这里列出它们。鉴于我们的脚本依赖 Python 库,我们需要明确指出这些依赖是什么,并确保它们已安装,以便这些脚本能正确运行。
现在我们已经查看了我们创建的这两个自定义技能,让我们把它们通过技能创建器技能运行一遍,看看我们是否遵循了最佳实践、它们的表现如何。我们可以在几个环境中这样做。我们可以回到 Claude Desktop,但我想向你展示的是如何将 Claude Code 与技能一起使用。我们在未来的课程中会更深入地看到这一点。但现在,我要打开 Claude Code,安装必要的技能(在我们的例子中是 skill-creator),然后并行使用两个子智能体来评估我们的分析时间序列技能和生成练习题技能。这是启动评估过程、了解我们这些技能写得如何的一个非常有用的方法。
所以我们将继续进入 Claude Code。与 Claude AI 不同,Claude Code 不自带包括 skill-creator 在内的内置技能,所以我们需要安装它们。我们要使用市场(Marketplace)来做到这一点。前往我们的市场,为 anthropic/skills 添加一个市场。这是我们之前看到的存储库,包含两个集合。首先是 document-skills(文档技能),包括处理 Excel 文件、PowerPoint、Word 文档和 PDF。另一个集合是 example-skills(示例技能),是我们之前看到的一些其他技能,包括技能创建器技能。让我们在项目范围内安装它。安装之后,我们会看到需要重启 Claude Code,还会看到在 .claude 文件夹的 settings.json 中,enabledPlugins 条目包含了这些技能。
让我们重启 Claude Code,看看我们有什么技能。我们可以使用 /skills 命令来做到这一点。如果我们操作正确,应该会看到 skill-creator 技能就在这里,正如预期的那样。让我们继续使用该技能。我们要让 Claude Code 使用 skill-creator 技能来评估这些技能遵循最佳实践的情况。为了快一点,我们要并行使用子智能体,每个子智能体评估我的一个自定义技能。执行时,我们会被提示使用 skill-creator 这个技能,这很棒,说明它按预期工作。我们将成功加载该技能,读取必要的文件,并调度子智能体来检查最佳实践。
我们可以看到,它找到了正确的技能:生成练习题和分析时间序列。让我们启动两个智能体,根据我们的最佳实践来评估这些技能。好,让我们看看结果。嗯,还不错:生成练习题得了九分(满分十分)。我们可以在简洁性方面稍作改进,这里也有一些很好的建议。好消息是我们在分析时间序列上做得更好。我们可以看到一些观察结果,在避免重复、Frontmatter 质量和简洁性方面都表现出色。评估你的技能的一个非常好的方法,就是把它们通过这个技能创建器运行一遍,它开箱即用地包含了最佳实践。
我们已经通过技能创建器运行了我们的技能,以分析底层 SKILL.md 和相关文件中的最佳实践,但我们如何确保技能按预期工作呢?这里有一个例子:我们可以围绕技能构建一个测试工具,考虑为技能编写单元测试,类似于我们为软件编写单元测试的方式。从生成练习题技能开始,当我们思考评估可能是什么样子时,我们会从几个不同的查询开始:生成问题并将其保存到 Markdown 文件、LaTeX 文件、PDF 文件。然后我们可以确保以正确的格式传入正确的文件。
然后我们可以确保预期行为符合需要:使用正确的库处理 PDF 输入,按我们指定的方式提取学习目标,生成不同类型的问题并遵循相应的指南,使用正确的输出结构,使用我们在 assets 文件夹中看到的正确输出模板,确保某些数据格式(如 LaTeX)能成功编译。最后,确保我们的问题生成到正确的文件中并采用正确的格式。我们还希望在这个过程中收集人工反馈,并在我们计划使用的所有模型中进行测试。
对于我们用于分析时间序列的第二个技能,我们利用了三个不同的 Python 脚本。所以我们假设已经用软件中的传统单元测试测试过那些 Python 脚本。假设这些脚本正在做我们想让它们做的事情,现在让我们测试一切是否按正确的顺序发生,并具有适当的输入、输出和预期行为。你在这里可能会用的查询是:分析时间序列数据并生成图表。我们会想传入一些有代表性的 CSV,确保我们展示的用于可视化和诊断的 Python 脚本正确运行。
更重要的是,确保工作流中的所有步骤顺序正确。如果我们要求绘制图表,就要确保包含了那个可选步骤。然后我们要返回摘要,解释这些发现,最后创建一个文件夹,把所有必需的文件放在正确的位置。如果你还记得,在输出中,我们为不同的文件、不同的文件夹和底层资源指定了非常明确的位置。与我们的另一个技能类似,我们要收集人工反馈并在我们使用的模型中进行测试。在下一节课中,我们将把这两个技能带入 Jupyter Notebook,并使用 Claude Messages API 配合代码执行工具来运行这些技能,以编程方式生成输出。
English Script
We’ll now take a closer look at how skills are structured and best practices for creating skills. Then we’ll apply what you learn to two examples. One to create practice questions based on lecture notes, and another to analyze the characteristics of time series data. Let’s go.
In this lesson, we’re going to focus a bit on the structure of a skill and some of the best practices associated with it. Then we’re going to take a look at two skills that we made and see how they fare against some of those best practices when run through the skill creator. To review, every skill that we make has a required SKILL.md file with some YAML Frontmatter that requires a name and a description. In the body of the SKILL.md, we have the content that goes in our skill, and then any references to scripts, additional text files, or assets, which are loaded only when necessary.
As we take a look at some of the best practices for names and descriptions, you can imagine these are mission critical. Your name and description are not only how Claude analyzes what your skill does, but also how it detects when to use that particular skill. The name has a maximum character count, and so does the description. As we mentioned briefly, the name can only contain lowercase letters, numbers, and hyphens, and in general you should stick with a verb-plus-ing form for the name of your skill. For the description, you want to describe not only what the skill does, but also when to use it. And if there are specific keywords that lead to agents triggering this skill, make sure to lean into those.
In addition to the required fields, the agent skills specification allows for optional fields. These could be the license, compatibility, and arbitrary key-value pairs in your metadata. What’s important to note here is that while there is a standard for agent skills, you might come across some skills, some built by Anthropic, some by others, that don’t follow this specification to a T. Skills are in active development, as is the specification, as we work across many different model providers and many different agent tooling ecosystems.
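To make this concrete, here’s a rough sketch of what frontmatter following these conventions could look like. Only name and description are required; license, compatibility, and metadata are optional, and the values shown are illustrative rather than taken from any real skill:

```yaml
---
name: generating-practice-questions    # lowercase letters, numbers, hyphens; verb + ing
description: >-
  Generates educational practice questions from lecture notes to test
  understanding. Use when asked to create practice questions, quizzes,
  or review material from course notes.
license: MIT                           # optional
compatibility: claude-code             # optional; exact values vary by ecosystem
metadata:                              # optional, arbitrary key-value pairs
  version: "1.0"
---
```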
As we move past the YAML Frontmatter and into the body of the skill, there are no hard restrictions on the format. However, when you think about building predictable workflows, you want to make sure you have step-by-step instructions. As we saw in other skills, especially the skill creator skill, it’s important to specify edge cases and step-by-step instructions, and if there’s a reason for a step to be skipped, to be very clear why that is. In general, keeping the file under 500 lines is best practice, because we can always reference external files, assets, and scripts when necessary. Being clear and concise is valuable, and using forward slashes in paths is mission critical, even on Windows: it’s important to make sure the skill works across many different environments.
When you think about creating skills, you want to think a little bit about how much freedom you want to give to that skill. Should we allow for general approaches and general directions, or should we insist on a specific sequence? You can imagine that for following best practices, we might want a low degree of freedom, but for more creative outputs, with multiple colors, styles, and fonts, we can allow a high degree of freedom. As we start to think about more complex workflows with multiple skills, breaking things down into sequential steps is always more valuable than having one very, very large skill that tries to do it all. These systems can handle 100+ skills; it’s important to make sure they’re named appropriately, aren’t confusing, and follow a predictable pattern.
In the specification, there’s room for optional directories. And as we’ve seen with quite a few different skills, there are subfolders for scripts, references, and assets. Your scripts include any kind of code that needs to be read and executed. You also want to make sure you have error handling and clear documentation. Our references contain additional documentation or reference files. And in general, it’s often valuable to instruct the skill to read the entire reference file if it happens to be quite long. Finally, we have underlying assets. These could include templates for output, images, logos, data files, schemas, and so on.
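Putting these pieces together, a skill using all three optional directories might be laid out like this (the directory names come from the convention just described; the individual file names are hypothetical):

```
generating-practice-questions/
├── SKILL.md               # required: YAML Frontmatter + instructions
├── scripts/               # code to be read and executed, with error handling
│   └── extract_text.py
├── references/            # longer documentation, loaded only when needed
│   └── domain-examples.md
└── assets/                # templates for output, images, logos, schemas
    ├── markdown-template.md
    └── latex-template.tex
```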
It’s important to note that these directories (scripts, references, and assets) follow the agent skills standard. But you might come across quite a few skills that don’t follow that particular standard yet. The standard is rapidly evolving, and so are skills. Going forward, we’d expect newly created skills to follow this standard, but you might come across some that have different folder names and different conventions. Now that we have a good sense of best practices, optional directories, and how to write production-grade skills, let’s take a look at two examples of skills that we’ve created, step through them, run them through the skill-creator to analyze for best practices, and talk about evaluating these skills to make sure we’re ready for production.
So I’m in VS Code now, and here we have two custom skills that we’re going to dive into. The first one is a generating-practice-questions skill. If we take a look at this skill, we can see that the description is for generating educational practice questions from lecture notes to test understanding. You can imagine you’re a teacher or instructor: you want to provide a particular format for input and output, and you want to generate comprehensive questions to test understanding. Let’s step through this skill. To start, we have the supported input formats: we specify which particular libraries to use and what text to extract. We then follow with our question structure. Again, we want to be very specific, so we specify the exact order we want these questions generated in, starting with True/False and working all the way towards realistic application questions.
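A hypothetical excerpt of that portion of the SKILL.md might be shaped like this. Only the first and last question types are named in this walkthrough, and the PDF library is an assumption, so treat it purely as a sketch of the structure:

```markdown
## Supported input formats
- PDF: extract text with a specific named library (e.g. pypdf)
- Markdown / plain text: read directly

## Question structure
Generate questions in exactly this order:
1. True/False
2. ... (intermediate question types, each with sub-guidelines)
N. Realistic application questions
```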
For each of these questions, we have sub-guidelines below. We can see here that this skill is under 500 lines. But if it needed to grow larger and larger, we could always factor content out into underlying files to reference when necessary. As we take a look at some of these examples, for true/false and even coding questions and so on, we can see that we’re being very explicit with the scope, the structure, and the required output for these particular questions. As we dive deeper into that output format, we specify that it depends on the user request. And instead of giving direct examples of every single kind of output, we’re actually referencing templates inside of our assets folder. Whether we’re dealing with LaTeX or with Markdown, we specify exactly how we want that to look.
For example, with Markdown, here’s how true/false might look. With LaTeX, here’s how our true/false questions and examples might look as we go through. If you find yourself needing a particular kind of output format, instead of putting it all in the SKILL.md, reference it in an external asset or file. Remember that these files, these templates, are only loaded when necessary. So we can be extremely efficient with our tokens and context window by only loading the particular file for the data format that we need. If there are external resources that we need, such as domain-specific examples, we can link to those as well, like we do in the references folder here. We’re leaning into that concept of progressive disclosure by only loading what’s absolutely necessary and referencing external files only when we need them.
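For instance, a Markdown true/false template living in assets/ could be as simple as the following; this is a hypothetical reconstruction, not the course’s actual template:

```markdown
## True/False

**Q{n}.** {statement to evaluate}

- [ ] True
- [ ] False

*Answer:* {True|False}. {one-sentence explanation}
```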
The second skill we’re going to look at is a skill for analyzing time series data. We’re going to provide a CSV, and we want to understand its characteristics before forecasting. What’s important to note here is that as we go through this particular skill, there is a very particular deterministic workflow that we want to have. We make use of a few different Python scripts to perform that work. To start, we have a Python script for visualizing the data that we’re working with: plotting the time series, a histogram, rolling stats, box plots, and quite a few more. For working with autocorrelation, we also have plots that we can draw, and similarly for decomposition.
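As a minimal sketch of what such a visualization script could look like, assuming pandas and matplotlib are installed (the function name and plot layout are assumptions, not the course’s actual script):

```python
# visualize.py (sketch): plot the series, a histogram, rolling stats, and a box plot
import sys

import pandas as pd
import matplotlib.pyplot as plt

def plot_overview(csv_path: str, value_col: str, out_path: str = "overview.png") -> None:
    # Assume the first column holds timestamps and acts as the index
    df = pd.read_csv(csv_path, parse_dates=[0], index_col=0)
    series = df[value_col].dropna()

    fig, axes = plt.subplots(2, 2, figsize=(12, 8))
    series.plot(ax=axes[0, 0], title="Time series")
    series.hist(ax=axes[0, 1])
    axes[0, 1].set_title("Histogram")
    series.rolling(window=30).mean().plot(ax=axes[1, 0], title="30-step rolling mean")
    axes[1, 1].boxplot(series.values)
    axes[1, 1].set_title("Box plot")
    fig.tight_layout()
    fig.savefig(out_path)

if __name__ == "__main__":
    plot_overview(sys.argv[1], sys.argv[2])
```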
As we take a look at our diagnose.py, we have the underlying functionality for analyzing the data that we’re working with. While there are quite a few functions here, I want to draw your attention to what we do at the end when we run our diagnostics. We make use of these functions to analyze data quality, distribution, stationarity tests, seasonality, trend, and autocorrelation, and finally end with a transform recommendation. What we have here is a predictable workflow that we want to run in a particular order each time. So let’s go back and look at our skill to see exactly how that’s done.
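The key idea is the fixed order, which a short sketch can capture. This assumes pandas and statsmodels; the names and heuristics are illustrative rather than the actual diagnose.py, and the seasonality and decomposition checks are elided for brevity:

```python
# Sketch of a fixed-order diagnostics pass ending in a transform recommendation
import pandas as pd
from statsmodels.tsa.stattools import acf, adfuller

def run_diagnostics(series: pd.Series) -> dict:
    clean = series.dropna()
    report = {}
    report["data_quality"] = {"missing": int(series.isna().sum())}      # 1. quality
    report["distribution"] = clean.describe().to_dict()                 # 2. distribution
    report["adf_pvalue"] = float(adfuller(clean)[1])                    # 3. stationarity
    report["trend"] = "up" if clean.diff().mean() > 0 else "down/flat"  # 4. trend
    report["acf_lag1"] = float(acf(clean, nlags=1)[1])                  # 5. autocorrelation
    # 6. Transform recommendation, based on the evidence gathered above
    report["recommend_log"] = bool((clean > 0).all() and clean.std() > clean.mean())
    return report
```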
First we start with the format for our input. We’re very explicit about what we should be looking for: the names of the columns and the particular data types. Next we move on to one of the most important parts of this skill, the workflow. Notice here that we’re being extremely explicit with the steps that we have, telling our particular skill and Claude to run this exact script when we begin our diagnostics. We then have the option of generating the necessary plots and reporting this data to the user: taking this data, finding what’s in the summary.txt, and presenting the relevant plots. We can also see that, for answering some of the questions we might get, we have an interpretation.md file for guidance.
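A hypothetical workflow section with that level of explicitness might read like this; the names diagnose.py, summary.txt, and interpretation.md appear in the walkthrough, while the exact command forms and the visualize.py name are assumptions:

```markdown
## Workflow
1. Run `python scripts/diagnose.py <input.csv>` to produce `summary.txt`.
2. If the user asked for plots, also run `python scripts/visualize.py <input.csv>`.
3. Read `summary.txt` and present the relevant findings and plots to the user.
4. When interpreting results, consult `references/interpretation.md` for guidance.
```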
As we take a look at some of the script options, we can add additional flags if necessary. And as we start to think about what’s being output, we can specify exactly the tree of files, text files, images, and so on that we output. We want to be extremely predictable with the data that’s coming in, the operations that we perform, and finally the output. As always, if there are external references, we make sure to list those here. And given that we have scripts that depend on Python libraries, we need to highlight exactly what those dependencies are and make sure they’re installed so that these scripts run correctly.
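That output contract might be sketched as a tree like the one below; this layout is hypothetical, since the actual skill pins down its own names and locations:

```
analysis-output/
├── summary.txt            # diagnostic summary
├── plots/
│   ├── timeseries.png
│   ├── histogram.png
│   └── autocorrelation.png
└── report.md              # findings and transform recommendation
```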
Now that we’ve taken a look at these two custom skills that we’ve created, let’s see how they stand up when we run them through the skill creator skill and determine if we’re following best practices. We could do this in a couple of environments. We could go back to Claude Desktop, but what I’d like to show you is how we can use Claude Code with skills. We’re going to see this in much more depth in a future lesson, but right now, I’m going to open up Claude Code, install the necessary skill, in our case skill-creator, and then use two subagents in parallel to evaluate our analyzing-time-series and generating-practice-questions skills. This is a really helpful way to start the evaluation process and see how well we’ve done with writing these skills.
So we’re going to go ahead and hop into Claude Code. Unlike Claude AI, Claude Code does not come with the built-in skills that include skill creator, so we need to install those. We’re going to do that using a marketplace. So we head over to our marketplaces and add a marketplace for anthropic/skills. This is the repository that we saw earlier that contains two collections. First, document-skills: these include processing Excel files, PowerPoints, Word docs, and PDFs. The other collection is the example-skills: these are some of the other ones that we saw, including the skill creator skill. So let’s install it in the project scope. Once we install it, we’ll see that we need to restart Claude Code, and we’ll also see that in our .claude folder, our settings.json has an enabledPlugins entry that includes these skills.
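From memory of the Claude Code plugin flow, the steps look roughly like the following; treat the exact command forms and the enabledPlugins shape as assumptions to verify against your version of Claude Code:

```
/plugin marketplace add anthropic/skills    # register the marketplace repository
/plugin install example-skills              # install the collection containing skill-creator

# .claude/settings.json then gains an entry along these lines:
#   "enabledPlugins": { "example-skills@skills": true }
```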
So let’s go ahead and restart Claude Code and see what skills we have. We can do that using the /skills command. If we’ve done this correctly, we should see our skill-creator skill right here, as expected. Let’s go ahead and make use of it. We’re going to ask Claude Code to use the skill-creator skill to evaluate how well these skills follow best practices. To do this a little faster, we’ll use subagents in parallel, where each subagent evaluates one of the custom skills that I have. When we do this, we’re prompted to use that skill, skill-creator, which is great: it’s working as expected. We successfully load the skill, read the necessary files, and go ahead and dispatch our subagents to check for best practices.
We can see here that it’s found the correct skills: generating practice questions and analyzing time series. Let’s launch our two agents to evaluate these skills against the best practices that we have. Alright, let’s see how we did. Well, not too bad: generating practice questions scored nine out of ten. We could improve a bit on conciseness, and we’ve got some nice recommendations here. The good news is we did even better on analyzing time series. We can see some observations here, with excellent marks for avoiding duplication, Frontmatter quality, and conciseness. A really nice way to evaluate your skills is to run them through this skill creator, which includes best practices out of the box.
So we’ve run our skills through the skill creator to analyze for best practices in the underlying SKILL.md and associated files, but how can we make sure the skills are working as expected? This is a case where we could build a test harness: think of it as writing unit tests for our skills, similar to how we write unit tests for software. To start with our generating practice questions skill, when we think about what the evaluation might look like, we would start with a couple of different queries: generating questions and saving them to a Markdown file, to a LaTeX file, to a PDF. We can then go ahead and make sure that we’re passing in the correct files in the correct format.
We can then make sure that our expected behavior is what we need: using the correct libraries for PDF input, extracting the learning objectives as we specified, generating the different kinds of questions and following the guidelines for those, using the correct output structure, using the correct output templates that we saw in our assets folder, and making sure that certain data formats, like LaTeX, compile successfully. And then finally, making sure that our questions are generated to the correct files in the right format. We would also want to gather human feedback in this process and test across all the different models that we’re planning to use.
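A minimal pytest-style sketch of such a harness might look like this. Here run_skill_query is a hypothetical helper that would drive an agent with the skill loaded (for example via the API or CLI), and the assertions mirror the expected behaviors listed above:

```python
from pathlib import Path

def run_skill_query(prompt: str, workdir: Path) -> None:
    """Hypothetical helper: send `prompt` to an agent with the
    generating-practice-questions skill enabled, writing outputs into `workdir`."""
    raise NotImplementedError  # wire up to your agent harness of choice

def test_markdown_output(tmp_path: Path) -> None:
    run_skill_query("Generate practice questions from lecture-notes.pdf as Markdown.", tmp_path)
    files = list(tmp_path.glob("*.md"))
    assert files, "expected a Markdown file to be produced"
    # True/False should appear, since it is the first required question type
    assert "True/False" in files[0].read_text()

def test_latex_output(tmp_path: Path) -> None:
    run_skill_query("Generate practice questions from lecture-notes.pdf as LaTeX.", tmp_path)
    assert list(tmp_path.glob("*.tex")), "expected a LaTeX file to be produced"
    # A fuller harness would also invoke a LaTeX compiler and assert a clean exit
```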
For our second skill, analyzing time series, we make use of three different Python scripts. So we’re going to assume that we’ve already tested those Python scripts with traditional software unit tests. Assuming those scripts are doing what we want them to do, let’s now test that everything happens in the correct order, with the appropriate inputs and outputs and the expected behavior. The query you might have here is to analyze and generate plots for some time series data. We’d want to pass in some representative CSVs and make sure that the Python scripts we showed for visualizing and diagnosing run correctly.
More importantly, we want to make sure that all the steps in the workflow happen in the correct order. If we’re asking for plots, we want to make sure that optional step is included. We then want to return a summary, interpret those findings, and finally create a folder with all the required files in their right place. If you recall, in the output we specified very specific locations for different files, different folders, and underlying assets. As with our other skill, we want to get human feedback and test across the models that we use. In the next lesson, we’re going to take these two skills, bring them into a Jupyter notebook, and use the Claude Messages API to run them with the code execution tool, producing outputs programmatically.