Browse Skills
Discover and install AI Agent skills
Showing 1-20 of 2602 skills
transcript-fixer
Corrects speech-to-text transcription errors in meeting notes, lectures, and interviews using dictionary rules and AI. Learns patterns to build personalized correction databases. Use when working with transcripts containing ASR/STT errors, homophones, or Chinese/English mixed content requiring cleanup.
agent-browser
Browser automation for AI agents via inference.sh. Navigate web pages, interact with elements using @e refs, take screenshots, record video. Capabilities: web scraping, form filling, clicking, typing, drag-drop, file upload, JavaScript execution. Use for: web automation, data extraction, testing, agent browsing, research. Triggers: browser, web automation, scrape, navigate, click, fill form, screenshot, browse web, playwright, headless browser, web agent, surf internet, record video
spotify-linux
Spotify CLI for headless Linux servers. Control Spotify playback via terminal using cookie auth (no OAuth callback needed). Perfect for remote servers without localhost access.
node-llm-web-automation
Build or update Node.js + LLM API workflows that replace Codex/MCP UI exploration for web automation at lower cost. Use when a user asks to productize page-structure discovery into a Node runner (Puppeteer/Playwright) with LLM-based element selection, deterministic playback control, cost controls, logging, and fallback paths.
suno-music-creator
Professional music creation with Suno AI V5 and Suno Studio. Use this skill when users want to create songs, playlists, corporate anthems, jingles, workout music, ambient soundscapes, or any AI-generated music. Triggers on requests mentioning Suno, music creation, playlist generation, song composition, or specific music projects like "create a track", "make a playlist", "compose music for", "corporate anthem", "workout mix", or any music production task.
tts
Convert text to speech using Hume AI (or OpenAI) API. Use when the user asks for an audio message, a voice reply, or to hear something "of vive voix".
video
Generate videos using fal.ai (Wan, Kling) or Sora. Text-to-video and image-to-video.
audio-transcriber
Transform audio recordings into professional Markdown documentation with intelligent summaries using LLM integration
douyin-video
抖音无水印视频下载和文案提取工具. 从抖音分享链接获取无水印视频下载链接, 下载视频, 提取视频中的语音文案并自动保存到文件. 适用场景包括获取抖音视频信息, 下载无水印视频, 批量提取视频文案. 当用户需要处理抖音视频链接或提取视频内容时触发.
vidu-video
使用 Vidu Q3 Pro 模型生成视频。当用户想要文生视频、生成带音频的视频,或提到 vidu 时使用此 skill。
video-clipper
从长视频(直播回放、会议录像、播客)中批量生成短视频切片。基于转写文稿精确定位观点边界,自动去除静音卡顿和口吃,输出音画同步的短视频。适用于:直播切片、会议精华提取、短视频二创、播客精彩片段。
agent-tools
Run 150+ AI apps via inference.sh CLI - image generation, video creation, LLMs, search, 3D, Twitter automation. Models: FLUX, Veo, Gemini, Grok, Claude, Seedance, OmniHuman, Tavily, Exa, OpenRouter, and many more. Use when running AI apps, generating images/videos, calling LLMs, web search, or automating Twitter. Triggers: inference.sh, infsh, ai model, run ai, serverless ai, ai api, flux, veo, claude api, image generation, video generation, openrouter, tavily, exa search, twitter api, grok
youtube-clipper
YouTube 视频智能剪辑工具。下载视频和字幕,AI 分析生成精细章节(几分钟级别), 用户选择片段后自动剪辑、翻译字幕为中英双语、烧录字幕到视频,并生成总结文案。 使用场景:当用户需要剪辑 YouTube 视频、生成短视频片段、制作双语字幕版本时。 关键词:视频剪辑、YouTube、字幕翻译、双语字幕、视频下载、clip video
audio-reply
Generate audio replies using TTS. Trigger with "read it to me [URL]" to fetch and read content aloud, or "talk to me [topic]" to generate a spoken response. Also responds to "speak", "say it", "voice reply".
audio-processing
Audio ingestion, analysis, transformation, and generation (Transcribe, TTS, VAD, Features).
audio-transcriber
Transcribe audio files using Groq's Whisper API (fast, cloud-based). Use when the user sends voice messages, audio files (ogg, mp3, wav, m4a, etc.), or asks for speech-to-text transcription. Requires GROQ_API_KEY environment variable.
ai-content-pipeline
Build multi-step AI content creation pipelines combining image, video, audio, and text. Workflow examples: generate image -> animate -> add voiceover -> merge with music. Tools: FLUX, Veo, Kokoro TTS, OmniHuman, media merger, upscaling. Use for: YouTube videos, social media content, marketing materials, automated content. Triggers: content pipeline, ai workflow, content creation, multi-step ai, content automation, ai video workflow, generate and edit, ai content factory, automated content creation, ai production pipeline, media pipeline, content at scale
summarize
Summarize or extract text/transcripts from URLs, podcasts, and local files (great fallback for “transcribe this YouTube/video”).
building-mcp-servers
Use when building MCP servers in TypeScript, Python, or C#; when implementing tools, resources, or prompts; when configuring Streamable HTTP transport; when migrating from SSE; when adding OAuth authentication; when seeing MCP protocol errors
mv-pipeline
End-to-end automated Music Video pipeline. Covers songwriting (lyrics/composition), Suno music generation (browser automation), lyrics alignment (stable-ts), video generation (Veo 3.1 via Vertex AI or Google Flow via browser), Remotion-based editing (subtitles, effects, telops), and YouTube upload. Use when creating a full MV from scratch, or running any individual stage of the pipeline.
Page 1 of 131