Auto Subtitles for Video, Generated by AI
VoxCut transcribes your video and turns it into clean, word-by-word captions that are timed to the speech and burned straight into the export. You pick a style, the captions appear on the words as they're spoken, and the file is ready to post to TikTok, Reels or Shorts. It runs in your browser, with a free plan to start.
Word-level timing that follows the speech
Short-form captions live or die on timing. VoxCut's Auto Captions generate subtitles at the word level, so each word lights up as it's spoken instead of dumping a whole sentence on screen at once. That word-by-word rhythm helps keep viewers reading and watching, even on muted feeds.
Because the captions are burned in, they're part of the video file itself. There's no separate .srt to upload, no platform stripping your formatting, and no risk of the styling changing between TikTok, Reels and Shorts. What you preview is what your audience sees.
Animated styles you can match to your content
Captions come in many animated styles, so you can pick a look that fits the video rather than settling for one default. The text moves with the words, and the styling stays consistent across every clip you make.
If you publish under a brand, the Brand Kit lets you lock fonts, colors and a watermark across every export, so your captions and your channel look the same from one video to the next without redoing the setup each time.
Multilingual captions, built for short-form
Auto Captions are multilingual, so you can generate subtitles for videos in many languages, not just English. The transcription follows the audio and times the words the same way regardless of the spoken language.
Captions are one piece of the workflow. In the same browser tab you can reframe footage to vertical 9:16, use Clip Factory to split one long video into a batch of short clips in a single pass, and let Best Moments surface the strongest segments to turn into clips. The captions ride along on whatever you export.
From upload to a ready-to-post clip
The flow is straightforward: upload a video, VoxCut transcribes it and generates word-level captions, you choose a style, and you export a vertical clip with the subtitles burned in. The VoxCut interface is available in 10 languages, and nothing needs to be installed.
When the clip is ready, you can post or schedule it straight to TikTok and YouTube, or add titles, hooks and descriptions with the AI tools before it goes out. VoxCut has a free plan to try captions, and paid plans start at $5.67/month if you need more.
Frequently asked questions
How does VoxCut generate captions automatically?
You upload a video and VoxCut transcribes the speech, then generates animated, word-level subtitles timed to the audio. You pick a style and export the clip with the captions burned in, so they're part of the video file.
Can it caption videos in languages other than English?
Yes. Auto Captions are multilingual and follow the spoken audio, so you can generate accurate subtitles for videos in many languages. Separately, the VoxCut interface itself is available in 10 languages.
Are the subtitles burned in or a separate file?
They're burned in. The captions are rendered directly into the exported video, so the timing and styling stay exactly the same on TikTok, Reels and Shorts with no separate subtitle file to upload.
What platforms and formats are these captions made for?
Vertical short-form: TikTok, Instagram Reels and YouTube Shorts. VoxCut can reframe footage to 9:16, and you can post or schedule finished clips straight to TikTok and YouTube.
Is there a free plan, and what does it cost otherwise?
Yes, there's a free plan you can use to try Auto Captions in your browser with no install. Paid plans start at $5.67/month for higher limits and more features.