SRT vs VTT vs TTML: Subtitle Specs, Safe Area, Timing

If you’ve ever delivered subtitles for OTT platforms or YouTube, you’ve likely faced two painful problems:

The platform asks for a subtitle format you don’t have (TTML, WebVTT, etc.).
The subtitles “work” but fail in real viewing because of safe area, timing, or readability issues.

Quick Answer

Use SRT when you need a simple, widely compatible subtitle file with basic timing and text.
Use VTT (WebVTT) when you need subtitles for web players and want more styling/positioning support than SRT.
Use TTML when you are delivering to OTT/broadcast pipelines that require structured, styled, XML-based subtitles (often with strict spec compliance).

Most teams succeed with this approach:
Author in SRT (fast) → Convert to VTT/TTML (as required) → QC again after conversion (mandatory).

1) What Subtitle Formats Are

Subtitles are timed text plus the rules for displaying it.

Each format is a container that defines:

How timecodes are written,
How text is stored,
How styling/positioning can be expressed.
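
To make the differences concrete, here is a minimal Python sketch that writes the same cue in each of the three formats. The function names are made up for this illustration, and the TTML output is only a single <p> element, not a complete document that a platform would accept.

```python
from xml.sax.saxutils import escape


def to_timestamp(ms: int, sep: str) -> str:
    """Format milliseconds as HH:MM:SS<sep>mmm."""
    hours, rest = divmod(ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}{sep}{millis:03d}"


def as_srt(index: int, start_ms: int, end_ms: int, text: str) -> str:
    # SRT: numeric index, then a timing line with a comma before the milliseconds.
    return f"{index}\n{to_timestamp(start_ms, ',')} --> {to_timestamp(end_ms, ',')}\n{text}\n"


def as_vtt(start_ms: int, end_ms: int, text: str) -> str:
    # WebVTT: the file opens with a "WEBVTT" header; milliseconds follow a period.
    return f"WEBVTT\n\n{to_timestamp(start_ms, '.')} --> {to_timestamp(end_ms, '.')}\n{text}\n"


def as_ttml_p(start_ms: int, end_ms: int, text: str) -> str:
    # TTML: a cue is a <p> element with begin/end attributes; the text must be XML-escaped.
    return f'<p begin="{to_timestamp(start_ms, ".")}" end="{to_timestamp(end_ms, ".")}">{escape(text)}</p>'


print(as_srt(1, 1000, 4000, "R&D is closed."))
print(as_vtt(1000, 4000, "R&D is closed."))
print(as_ttml_p(1000, 4000, "R&D is closed."))
```

Note that only the TTML version needs the ampersand escaped, which is one reason special characters deserve a check after conversion.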

Your “subtitle quality” depends on more than translation:

Timing,
Segmentation (line breaking),
Reading speed,
Safe area placement,
Consistency rules,
Platform expectations.

2) SRT vs VTT vs TTML: Comparison Table

Feature	SRT	WebVTT (VTT)	TTML
Structure	Plain text	Plain text with cues	XML-based (structured)
Styling support	Minimal	Moderate (supports more styling/positioning than SRT)	Strong (built for styling, regions, layout)
Best for	Simple workflows, broad compatibility	Web players, online video workflows	OTT/broadcast delivery pipelines, strict specs
Complexity	Low	Medium	High
Common problems	Limited styling; inconsistent handling across players	Some features not consistently supported across players	Easy to “break” with invalid XML/spec mismatches
Conversion friendliness	Great source format	Great for web delivery	Requires careful conversion + strict QC

If your client is an OTT platform, assume you will eventually need TTML (or a TTML-compatible variant), even if you start with SRT.

3) When to Use Each Format

When SRT is best

Use SRT when:

Speed matters,
You need maximum compatibility,
You’re building a working master subtitle file,
Styling requirements are minimal.

Best for: initial translation workflows, internal reviews, quick pilots, creator content.

When VTT is best

Use WebVTT when:

Delivery is primarily web-player based,
You need better handling of cues, positioning, or certain styling features,
You are working in web-first pipelines.

Best for: web video platforms, certain creator pipelines, some internal players.

When TTML is best

Use TTML when:

The delivery requirement is OTT/broadcast-grade,
The platform enforces formatting and layout rules,
You need structured styling/regions to comply with spec expectations.

Best for: enterprise OTT pipelines, distribution partners, spec-driven deliveries.

Important note: Different platforms use different TTML profiles/requirements. Treat TTML as “spec-sensitive.” Always validate against the platform’s delivery sheet.

4) Subtitle Safe Area

Safe area is one of the most common failure points—especially for vertical video and mobile viewers.

What “safe area” means

Safe area is the region where text remains readable and not blocked by:

UI overlays (play/pause bars, progress bars, captions toggles),
Platform controls,
Mobile gestures and cutouts,
Other on-screen elements.

Why safe area is not “one universal box”

A common misconception is “just keep subtitles 10% above the bottom.”
In reality, safe area is:

Platform dependent (different players overlay UI differently),
Format dependent (16:9 vs vertical),
Context dependent (controls appear/disappear).

Practical safe area rules that work in real deliveries

Avoid placing subtitles too low; keep enough bottom margin for player UI.
For vertical video, assume the bottom UI overlays are heavier.
Avoid placing subtitles over important on-screen text (names, labels, chat overlays).
Test subtitles on mobile (not only desktop preview).
If the platform supports positioning, use it carefully and re-QC.
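
For WebVTT specifically, vertical placement can be requested with the `line` cue setting on the timing line. The sketch below appends such a setting. The helper name is hypothetical, the percentage is measured from the top of the video, and players differ in how faithfully they honor it, so treat the output as something to test rather than a guarantee.

```python
def with_line_setting(timing_line: str, line_percent: int) -> str:
    """Append a WebVTT `line` cue setting (e.g. 'line:80%') to a cue timing line.

    Assumes the timing line does not already carry a `line` setting.
    """
    if not 0 <= line_percent <= 100:
        raise ValueError("line_percent must be between 0 and 100")
    return f"{timing_line.rstrip()} line:{line_percent}%"


# Ask the player to lift the cue away from the bottom edge.
print(with_line_setting("00:00:01.000 --> 00:00:04.000", 80))
```

The value 80 is only an example. The right number depends on the player UI of the platform you are delivering to.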

If you are delivering to multiple platforms, safe area should be treated like a spec item, not an afterthought.

5) Timing and Spotting Rules

Subtitles fail most often because of timing—not translation.

A) Timecode alignment

Subtitles should appear when speech starts (not late).
Subtitles should disappear when speech ends (not early).
Avoid extremely short “flash” cues that can’t be read.

B) Reading speed (CPS) discipline

Reading speed is typically measured as characters per second (CPS).
If CPS is too high, viewers miss text and disengage.

Practical rule: keep subtitles readable for the average viewer, especially on mobile. If your content has fast dialogue, you must:

Compress lines (without losing meaning),
Improve segmentation,
Avoid over-long cues.
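
CPS is easy to compute once you have a cue's text and timing. The sketch below is one way to do it in Python, under two assumptions: spaces count as characters and line breaks do not (style guides differ on this), and the default limit of 17 is a placeholder, not a platform requirement.

```python
def chars_per_second(text: str, start_ms: int, end_ms: int) -> float:
    """Characters shown per second of display time; line breaks are not counted."""
    duration_s = (end_ms - start_ms) / 1000
    if duration_s <= 0:
        raise ValueError("cue must have a positive duration")
    visible = text.replace("\n", "")
    return len(visible) / duration_s


def is_too_fast(text: str, start_ms: int, end_ms: int, max_cps: float = 17.0) -> bool:
    # max_cps is a placeholder value; take the real limit from the platform's style guide.
    return chars_per_second(text, start_ms, end_ms) > max_cps


# 11 characters shown for half a second is far too fast to read.
print(chars_per_second("Hello there", 0, 500))
print(is_too_fast("Hello there", 0, 500))
```

Cues flagged this way are candidates for compression or re-timing, not automatic failures.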

C) Segmentation (line breaking) matters more than people think

Bad segmentation makes even accurate translation hard to read.

Good segmentation:

Breaks on natural phrase boundaries,
Keeps names and verbs together,
Avoids splitting articles/prepositions awkwardly,
Avoids breaking numbers and units.

D) Overlap and continuity

Avoid overlaps that cause flicker or hidden cues.
Maintain consistency in timing rhythm across episodes.
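
Overlaps and "flash" cues can both be caught with a simple pass over the cue timings. The following is a rough sketch, assuming cues are already sorted by start time and expressed as (start_ms, end_ms) pairs. The 800 ms minimum is an illustrative threshold, not a value from any platform spec.

```python
from typing import List, Tuple

Cue = Tuple[int, int]  # (start_ms, end_ms)


def find_overlaps(cues: List[Cue]) -> List[Tuple[int, int]]:
    """Return index pairs (i, i+1) where a cue starts before the previous one ends.

    Assumes cues are sorted by start time.
    """
    return [
        (i, i + 1)
        for i in range(len(cues) - 1)
        if cues[i + 1][0] < cues[i][1]
    ]


def find_flash_cues(cues: List[Cue], min_duration_ms: int = 800) -> List[int]:
    """Return indices of cues shown for less than min_duration_ms (placeholder threshold)."""
    return [i for i, (start, end) in enumerate(cues) if end - start < min_duration_ms]


cues = [(0, 2000), (1500, 3000), (3000, 3300)]
print(find_overlaps(cues))    # cue 1 starts while cue 0 is still on screen
print(find_flash_cues(cues))  # cue 2 lasts only 300 ms
```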

6) Formatting Rules

Line length and layout

A common best practice is:

Keep lines short enough to read comfortably,
Avoid cramming too much into one cue,
Use 1–2 lines where possible,
Avoid three-line subtitles unless the platform explicitly allows it.
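
These layout rules can be checked mechanically. Here is a small sketch that flags cues with too many lines or overly long lines. Both limits are parameters because they vary by platform and language, and the defaults (2 lines, 42 characters) are assumptions chosen for illustration.

```python
from typing import List


def layout_issues(text: str, max_lines: int = 2, max_chars: int = 42) -> List[str]:
    """Describe layout problems in one cue; an empty list means none were found.

    max_lines and max_chars are placeholder defaults; use the platform's own limits.
    """
    issues = []
    lines = text.split("\n")
    if len(lines) > max_lines:
        issues.append(f"{len(lines)} lines (limit {max_lines})")
    for n, line in enumerate(lines, start=1):
        if len(line) > max_chars:
            issues.append(f"line {n} has {len(line)} characters (limit {max_chars})")
    return issues


print(layout_issues("I told you yesterday,\nwe leave at dawn."))
print(layout_issues("One\nTwo\nThree"))
```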

Speaker identification

If two speakers share a cue:

Use hyphens or speaker labels (based on your style guide),
Ensure clarity without overloading text.

SDH and closed captions

SDH (subtitles for the deaf and hard of hearing) includes non-dialogue cues like:

[door slams]
[music]
Speaker IDs when needed

This is required for accessibility in many workflows, but it must follow the platform’s style rules.

Consistency rules

Names spelled consistently across episodes
Terminology consistent (use a glossary)
Punctuation style consistent
Numerals style consistent (e.g., “10” vs “ten” based on guide)

7) Conversion Workflows

Conversion is not a mechanical step. It’s a quality risk.

SRT → VTT conversion

Usually straightforward, but you must QC:

Timestamp syntax and precision (SRT writes milliseconds after a comma, WebVTT after a period),
Cue formatting changes,
Any styling cues you rely on.
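
As a rough illustration of why this conversion is usually straightforward, the Python sketch below covers the two mandatory changes: adding the WEBVTT header and switching the millisecond separator on timing lines. It assumes well-formed SRT input and does not translate any styling tags. The function name is made up for this example.

```python
import re

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:04,000".
SRT_TIMING = re.compile(
    r"^(\d{2}:\d{2}:\d{2}),(\d{3}) --> (\d{2}:\d{2}:\d{2}),(\d{3})\s*$"
)


def srt_to_vtt(srt_text: str) -> str:
    """Convert SRT text to WebVTT: add the header and switch ',' to '.' in timings."""
    out = ["WEBVTT", ""]
    cleaned = srt_text.lstrip("\ufeff").replace("\r\n", "\n").strip()
    for line in cleaned.split("\n"):
        match = SRT_TIMING.match(line)
        if match:
            # Only timing lines are rewritten; commas in dialogue are left alone.
            line = f"{match[1]}.{match[2]} --> {match[3]}.{match[4]}"
        out.append(line)
    return "\n".join(out) + "\n"


srt = "1\n00:00:01,000 --> 00:00:04,000\nHello, world\n"
print(srt_to_vtt(srt))
```

The SRT index numbers are kept, since WebVTT accepts them as cue identifiers. A naive find-and-replace of every comma would also corrupt the dialogue, which is why the sketch only touches lines that match the timing pattern.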

SRT → TTML conversion

TTML conversion is where teams get burned.

You must QC:

XML validity
Timecode format compatibility
Whether special characters and punctuation are preserved correctly
Styling/region compliance (if used)
Line break behavior (often changes during conversion)

Best practice: always perform a post-conversion QC pass before delivery.
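
Two of the checks above (special characters and XML validity) can be sketched with Python's standard library. This example builds a bare-bones TTML document from cue tuples and confirms it parses. It is a simplification: it adds no styling or regions, the function names are hypothetical, and a document that parses as XML may still fail a platform's TTML profile.

```python
import xml.etree.ElementTree as ET

TTML_NS = "http://www.w3.org/ns/ttml"
XML_NS = "http://www.w3.org/XML/1998/namespace"


def cues_to_ttml(cues, lang="en"):
    """Build a minimal TTML document from (begin, end, text) tuples.

    begin/end are strings like "00:00:01.000". No styling or regions are added.
    """
    ET.register_namespace("", TTML_NS)
    tt = ET.Element(f"{{{TTML_NS}}}tt", {f"{{{XML_NS}}}lang": lang})
    body = ET.SubElement(tt, f"{{{TTML_NS}}}body")
    div = ET.SubElement(body, f"{{{TTML_NS}}}div")
    for begin, end, text in cues:
        p = ET.SubElement(div, f"{{{TTML_NS}}}p", {"begin": begin, "end": end})
        lines = text.split("\n")
        p.text = lines[0]
        for extra in lines[1:]:
            # TTML expresses a forced line break as a <br/> element, not "\n".
            br = ET.SubElement(p, f"{{{TTML_NS}}}br")
            br.tail = extra
    # ElementTree escapes &, < and > in text while serializing.
    return ET.tostring(tt, encoding="unicode")


def is_well_formed(xml_text: str) -> bool:
    """True if the text parses as XML. This is not a TTML profile validation."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


doc = cues_to_ttml([("00:00:01.000", "00:00:04.000", "Tom & Jerry\nare back")])
print(doc)
print(is_well_formed(doc))
```

Building the XML through a library rather than string concatenation is a deliberate choice here, because it removes a whole class of escaping mistakes.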

8) Subtitle QC Checklist

Use this checklist before sending files to an OTT platform or client.

A) Technical QC

Correct format requested (SRT / VTT / TTML)
Correct encoding (no garbled characters)
Timecodes valid, consistent, no negative durations
No overlaps that break playback
No cues too short to read (“flash” cues)
Correct frame rate/timebase assumptions (where relevant)
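
Some of these technical checks are simple to automate. The sketch below covers two of them, encoding and durations, assuming the delivery spec calls for UTF-8 (confirm this against the platform's requirements) and that cues are available as (start_ms, end_ms) pairs. The function names are illustrative.

```python
from typing import List, Tuple


def check_utf8(raw: bytes) -> List[str]:
    """Report encoding problems in a subtitle file's raw bytes (assumes UTF-8 is expected)."""
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        return [f"not valid UTF-8 at byte {exc.start}"]
    problems = []
    if "\ufffd" in text:
        # U+FFFD usually means an earlier tool already mangled a character.
        problems.append("contains U+FFFD replacement characters")
    return problems


def check_durations(cues: List[Tuple[int, int]]) -> List[int]:
    """Return indices of cues whose end time is not after their start time."""
    return [i for i, (start, end) in enumerate(cues) if end <= start]


print(check_utf8("café".encode("utf-8")))    # clean file
print(check_utf8("café".encode("latin-1")))  # wrong encoding
print(check_durations([(0, 1000), (2000, 1500), (3000, 3000)]))
```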

B) Readability QC

Reading speed reasonable
Line breaks are natural
No overcrowded cues
Punctuation supports readability
Italics used consistently (if required)

C) Language QC

Meaning accurate (not literal-only)
Names/terms consistent (glossary applied)
Tone and register consistent
Spelling and grammar clean (proofread)

D) Safe area and viewing QC

Check on mobile playback
Ensure subtitles are not covered by UI overlays
Ensure subtitles do not block critical on-screen text

9) Common Mistakes That Cause Rejections

Delivering the wrong format (or wrong TTML variant)
Skipping post-conversion QC (conversion can break layout and timing)
Subtitles placed too low (covered by platform UI)
High reading speed (viewers can’t keep up)
Bad segmentation (hard to read even if accurate)
Inconsistent terminology and names (breaks continuity in series)
No style guide (different episodes feel “different”)
No version control (v1/v2 confusion and wrong files go live)

If you’re delivering subtitles at scale, the fastest way to reduce rejections is a standard format + QC workflow.

If you share:

a sample episode
target platform requirements (format + any style constraints)
target languages

we can propose a delivery-ready subtitle workflow covering translation, proofreading, and QC. Contact Sukudo Studios today!

FAQ: SRT vs VTT vs TTML

1) Which subtitle format is best: SRT, VTT, or TTML?

There is no universal best. SRT is simplest and widely supported. VTT is better for web workflows with more cue features. TTML is best for spec-driven OTT/broadcast deliveries.

2) Why do OTT platforms often request TTML?

Because TTML supports structured styling, layout rules, and spec compliance in enterprise pipelines. It is more predictable for platform ingestion when done correctly.

3) Can I convert SRT to TTML automatically?

You can convert it, but you must QC after conversion. TTML is spec-sensitive and conversion can break timing, line breaks, or formatting.

4) What causes subtitles to be hidden behind player controls?

Safe area issues—subtitles are positioned too close to the bottom, and UI overlays cover them. This is common on mobile.

5) What is “spotting” in subtitling?

Spotting is the timing and segmentation process: deciding when subtitles appear/disappear and how text is split into readable lines.

6) What is CPS and why does it matter?

CPS means characters per second. If CPS is too high, viewers can’t read subtitles comfortably, especially on mobile.

7) Do YouTube subtitles require VTT?

YouTube supports common formats including SRT and VTT. What matters most is timing accuracy and readability.

8) What is SDH and when is it required?

SDH (Subtitles for the Deaf and Hard-of-hearing) includes sound cues and sometimes speaker IDs. It’s required for accessibility in many distribution workflows.

9) What is the biggest QC mistake teams make?

Skipping post-conversion QC. Conversion can change line breaks and timing and introduce errors that weren’t present in the source file.

10) What should I send a subtitle vendor to get started?

A sample video/episode, transcript (if available), target languages, platform delivery specs (format + style requirements), and any glossary/style rules.
