The pipeline at a glance
Each spec type follows this same pipeline, but the internal steps differ depending on what the skill needs to analyze. The diagrams below show what happens inside each skill.

Triggering a skill
Skills are triggered by typing @ followed by the skill name in Cursor’s chat.
Select a skill

Continue typing to filter (e.g., @create-v) or use arrow keys to select. The skill name must match exactly: @create-voice, not create voice or voice spec.
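For example, a complete trigger typed into the chat might look like this (the request wording and component name are made up for illustration):

```
@create-voice Generate the spec for the "Article Card" component in the selected frame.
```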
Inside each skill

Every skill loads an instruction file, reads platform-specific or domain-specific reference files, extracts data from Figma via MCP, runs through a checklist, and outputs validated JSON. The reference files determine what the agent knows about each domain; a sketch of the output shape follows the list below.

- Screen Reader
- Color Annotation
- Overview
- API
- Structure
- Changelog
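The output schema itself isn't published here, so the shape below is only a sketch. The type name and every field in it are assumptions made for illustration; the real validated JSON will differ per skill.

```ts
// Hypothetical shape of a skill's validated JSON output.
// All names here are illustrative assumptions, not the actual schema.
type SpecType =
  | "screen-reader"
  | "color-annotation"
  | "overview"
  | "api"
  | "structure"
  | "changelog";

interface SkillOutput {
  spec: SpecType;                                      // which skill produced the spec
  component: string;                                   // component name pulled from Figma via MCP
  platforms: Array<"ios" | "android" | "web">;         // platforms the spec covers
  checklist: Array<{ item: string; passed: boolean }>; // results of the skill's checklist pass
  body: Record<string, unknown>;                       // per-skill payload (tables, annotations, etc.)
}
```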
The screen reader skill loads four reference files, one for general instructions and one per platform, then runs a merge analysis to determine focus stops before generating per-platform tables.

The merge analysis is the critical step. It determines which visual parts become independent focus stops and which get merged into a parent announcement. The three platform files provide the exact property names and announcement patterns for iOS, Android, and Web.
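To make the merge analysis concrete, here is a minimal sketch of a focus-stop entry under assumed names; the properties and announcement strings are invented for illustration, not the skill's actual output.

```ts
// Hypothetical focus-stop entry produced by the merge analysis (illustrative only).
interface FocusStop {
  part: string;              // the visual part that owns the focus stop
  mergedChildren: string[];  // parts folded into this stop instead of getting their own
  announcement: {
    ios: string;             // VoiceOver phrasing
    android: string;         // TalkBack phrasing
    web: string;             // name/role exposed to web screen readers
  };
}

// Example: a card whose thumbnail and title merge into a single announcement.
const cardStop: FocusStop = {
  part: "Card container",
  mergedChildren: ["Thumbnail", "Title"],
  announcement: {
    ios: "Getting started guide, button",
    android: "Getting started guide, button, double-tap to activate",
    web: "Getting started guide, button",
  },
};
```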
What the agent sees vs. what you provide
The agent can extract structure, tokens, and styles from Figma automatically. But some information only exists in your head; a sketch of how you might hand it over follows the table.

| The agent can extract | You need to describe |
|---|---|
| Component layers and hierarchy | States not visible in the current frame |
| Design token names and values | Behavioral modes (fill vs. hug, truncation) |
| Variant axes and properties | Focus order preferences |
| Visual dimensions and spacing | Platform-specific interaction details |
| Styles and color values | Business logic or conditional rules |
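In practice, the right-hand column becomes the description you type alongside the trigger. Here is a sketch of what that might look like; the component, states, and rules are entirely invented:

```
@create-voice for the Card component. Context the frame doesn't show:
- A loading skeleton and an error state exist but aren't in this frame.
- The title hugs its content and truncates after two lines.
- Preferred focus order: thumbnail, title, then the primary action.
- On Android, the primary action also exposes a long-press shortcut.
- The "New" badge only renders when the item is unread.
```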