Submagic does captions on clips you already have. Podcli does the full pipeline: it finds the clip, crops it, captions it, and exports it. Podcli's captions are open-source React components you can edit.
The features that change the day-to-day for clip creators.
Submagic's pitch is captions. You bring an edited clip and it burns in beautiful word-level captions. It does that part well.
Podcli's pitch is the full pipeline. It picks moments from a long podcast, crops them to 9:16 with face tracking, and burns in captions in one pass. The output overlaps with Submagic at the captions step.
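To make the crop step concrete, here is a minimal sketch of the geometry behind a face-centered 9:16 crop. It is not Podcli's code: the `cropTo9x16` function, the `CropWindow` type, and the idea that the tracker reports a single horizontal face center are all assumptions for illustration.

```typescript
// Hypothetical helper, not Podcli's actual implementation.
// Computes a 9:16 crop window inside a landscape frame, centered on a face.
interface CropWindow {
  x: number;
  y: number;
  width: number;
  height: number;
}

function cropTo9x16(
  srcWidth: number,
  srcHeight: number,
  faceCenterX: number, // assumed tracker output, in source pixels
): CropWindow {
  // Keep the full source height; the width follows from the 9:16 ratio.
  const width = Math.round((srcHeight * 9) / 16);
  if (width > srcWidth) {
    throw new Error("source frame is narrower than 9:16");
  }
  // Center the window on the face, then clamp it inside the frame.
  const ideal = Math.round(faceCenterX - width / 2);
  const x = Math.min(Math.max(ideal, 0), srcWidth - width);
  return { x, y: 0, width, height: srcHeight };
}

// A 1920x1080 source with the speaker in the middle of the frame.
const centered = cropTo9x16(1920, 1080, 960);
console.log(centered); // { x: 656, y: 0, width: 608, height: 1080 }
```

The clamp keeps the window inside the frame when the speaker sits near an edge. Under these assumptions, the cropped region would then be scaled up to the 1080x1920 export size.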
Submagic captions are template-driven. Podcli's are rendered by Remotion (a React-based video renderer). Podcli ships four caption styles, and each one is a real React component. If you want a fifth style or a brand variant, write a component and Remotion renders it the same way as the built-ins.
If captions are the only thing you need and you already edit clips in another tool, Submagic is purpose-built and faster to start. Podcli covers more of the pipeline, but you pay for that with the install step.
The honest version. Steps in the order you'd actually do them.
Install Podcli: git clone https://github.com/nmbrthirteen/podcli && cd podcli && ./setup.sh.
If you already have a clip from another tool, pass it to Podcli's caption step via the CLI.
Pick a caption style. The Branded style, with its dark pill, is the closest match to Submagic's default look.
If your transcript came from somewhere else, use the import_transcript MCP tool or the --transcript flag.
Render. Output: 1080x1920 MP4 with captions burned in.
Direct answers to the searches people run before they decide.
Yes. Podcli is MIT-licensed and free for any use. Submagic charges a monthly subscription.
Yes. Whisper provides word-level timestamps and Podcli's caption renderer (Remotion) syncs each word to its exact start and end frame. The four styles all animate at the word level.
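As a rough illustration of word-level sync, the sketch below converts per-word timestamps in seconds into frame ranges and looks up which word is active on a given frame. The `TimedWord` and `FramedWord` shapes and both function names are hypothetical. They are not Podcli's or Whisper's actual types, and the rounding rule is an assumption.

```typescript
// Hypothetical shapes for illustration; not Podcli's or Whisper's real types.
interface TimedWord {
  text: string;
  start: number; // seconds
  end: number; // seconds
}

interface FramedWord {
  text: string;
  startFrame: number; // inclusive
  endFrame: number; // exclusive
}

// Convert second-based timestamps into frame ranges at a given frame rate.
function toFrames(words: TimedWord[], fps: number): FramedWord[] {
  return words.map((w) => {
    const startFrame = Math.round(w.start * fps);
    // Keep every word on screen for at least one frame.
    const endFrame = Math.max(startFrame + 1, Math.round(w.end * fps));
    return { text: w.text, startFrame, endFrame };
  });
}

// Index of the word to highlight on `frame`, or -1 in a gap between words.
function activeWordIndex(words: FramedWord[], frame: number): number {
  return words.findIndex((w) => frame >= w.startFrame && frame < w.endFrame);
}

const framed = toFrames(
  [
    { text: "open", start: 0.0, end: 0.4 },
    { text: "source", start: 0.5, end: 1.0 },
  ],
  30,
);
console.log(activeWordIndex(framed, 6)); // 0
console.log(activeWordIndex(framed, 13)); // -1
console.log(activeWordIndex(framed, 15)); // 1
```

In a Remotion component, the current frame and frame rate would typically come from the `useCurrentFrame()` and `useVideoConfig()` hooks, so a caption style could call a lookup like this on every render.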
Not directly. Podcli's caption styles are React components, not template files. You can replicate any Submagic look by editing one of the four built-in styles, but it is a manual port.
Yes. The output is 1080x1920 with burned-in captions, which is the format both TikTok and YouTube Shorts accept directly.
Open source, MIT, no signup, no watermark, no upload. Clone and run.