
SRT vs VTT vs TTML: Subtitle Formats, Safe Area, and Timing Rules

SRT vs VTT vs TTML - Sukudo Studios

If you’ve ever delivered subtitles for OTT platforms or YouTube, you’ve likely faced two painful problems:

  1. The platform asks for a subtitle format you don’t have (TTML, WebVTT, etc.).
  2. The subtitles “work” but fail in real viewing because of safe area, timing, or readability issues.

Quick Answer

  • Use SRT when you need a simple, widely compatible subtitle file with basic timing and text.
  • Use VTT (WebVTT) when you need subtitles for web players and want more styling/positioning support than SRT.
  • Use TTML when you are delivering to OTT/broadcast pipelines that require structured, styled, XML-based subtitles (often with strict spec compliance).

Most teams succeed with this approach: Author in SRT (fast) → Convert to VTT/TTML (as required) → QC again after conversion (mandatory).

1) What Subtitle Formats Are

Subtitles are basically timed text + rules.

Each format is a container that defines:

  • How timecodes are written,
  • How text is stored,
  • How styling/positioning can be expressed.

Your “subtitle quality” depends on more than translation:

  • Timing,
  • Segmentation (line breaking),
  • Reading speed,
  • Safe area placement,
  • Consistency rules,
  • Platform expectations.

2) SRT vs VTT vs TTML: Comparison Table

Feature | SRT | WebVTT (VTT) | TTML
Structure | Plain text | Plain text with cues | XML-based (structured)
Styling support | Minimal | Moderate (supports more styling/positioning than SRT) | Strong (built for styling, regions, layout)
Best for | Simple workflows, broad compatibility | Web players, online video workflows | OTT/broadcast delivery pipelines, strict specs
Complexity | Low | Medium | High
Common problems | Limited styling; inconsistent handling across players | Some features not consistently supported across players | Easy to “break” with invalid XML/spec mismatches
Conversion friendliness | Great source format | Great for web delivery | Requires careful conversion + strict QC

If your client is an OTT platform, assume you will eventually need TTML (or a TTML-compatible variant), even if you start with SRT.

3) When to Use Each Format

When SRT is best

Use SRT when:

  • Speed matters,
  • You need maximum compatibility,
  • You’re building a working master subtitle file,
  • Styling requirements are minimal.

Best for: initial translation workflows, internal reviews, quick pilots, creator content.

When VTT is best

Use WebVTT when:

  • Delivery is primarily web-player based,
  • You need better handling of cues, positioning, or certain styling features,
  • You are working in web-first pipelines.

Best for: web video platforms, certain creator pipelines, some internal players.

When TTML is best

Use TTML when:

  • The delivery requirement is OTT/broadcast-grade,
  • The platform enforces formatting and layout rules,
  • You need structured styling/regions to comply with spec expectations.

Best for: enterprise OTT pipelines, distribution partners, spec-driven deliveries.

Important note: Different platforms use different TTML profiles/requirements. Treat TTML as “spec-sensitive.” Always validate against the platform’s delivery sheet.

4) Subtitle Safe Area

Safe area is one of the most common failure points—especially for vertical video and mobile viewers.

What “safe area” means

Safe area is the region where text remains readable and not blocked by:

  • UI overlays (play/pause bars, progress bars, captions toggles),
  • Platform controls,
  • Mobile gestures and cutouts,
  • Other on-screen elements.

Why safe area is not “one universal box”

A common misconception is “just keep subtitles 10% above the bottom.”
In reality, safe area is:

  • Platform dependent (different players overlay UI differently),
  • Format dependent (16:9 vs vertical),
  • Context dependent (controls appear/disappear).

Practical safe area rules that work in real deliveries

  • Avoid placing subtitles too low; keep enough bottom margin for player UI.
  • For vertical video, assume the bottom UI overlays are heavier.
  • Avoid placing subtitles over important on-screen text (names, labels, chat overlays).
  • Test subtitles on mobile (not only desktop preview).
  • If the platform supports positioning, use it carefully and re-QC.
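As one illustration of positioning, WebVTT lets a cue carry a `line` setting after its timing; for horizontal text, a percentage value is measured from the top of the video. The sketch below appends such a setting to a timing line. The helper name `add_line_setting` and the 80% value are assumptions chosen for illustration, and players differ in how fully they honor cue settings, so the result still needs a viewing check.

```python
def add_line_setting(timing_line: str, line_percent: int) -> str:
    """Append a WebVTT `line` cue setting to a cue timing line.

    line_percent is the vertical position as a percentage of the video
    height, measured from the top (hypothetical helper, not a library API).
    """
    if not 0 <= line_percent <= 100:
        raise ValueError("line_percent must be between 0 and 100")
    # Leave cues that already carry an explicit line setting untouched.
    if " line:" in timing_line:
        return timing_line
    return f"{timing_line} line:{line_percent}%"


print(add_line_setting("00:00:01.000 --> 00:00:03.500", 80))
```

A lower percentage moves the cue higher on screen, which is the direction you want when bottom UI overlays are heavy.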

If you are delivering to multiple platforms, safe area should be treated like a spec item, not an afterthought.

5) Timing and Spotting Rules

Subtitles fail most often because of timing—not translation.

A) Timecode alignment

  • Subtitles should appear when speech starts, not late.
  • They should disappear when speech ends, not abruptly early.
  • Avoid extremely short “flash” cues that can’t be read.

B) Reading speed (CPS) discipline

Reading speed is typically measured as characters per second (CPS).
If CPS is too high, viewers miss text and disengage.

Practical rule: keep subtitles readable for the average viewer, especially on mobile. If your content has fast dialogue, you must:

  • Compress lines (without losing meaning),
  • Improve segmentation,
  • Avoid over-long cues.
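A CPS check is simple arithmetic: visible characters divided by cue duration. The sketch below is one way to compute it. The threshold `MAX_CPS = 17.0` is an assumed placeholder, not a value from any platform spec, and whether spaces and punctuation count toward the total also varies by style guide (here they do).

```python
MAX_CPS = 17.0  # assumed threshold; replace with the platform's own limit


def chars_per_second(text: str, start_s: float, end_s: float) -> float:
    """Return the reading speed of one cue in characters per second."""
    duration = end_s - start_s
    if duration <= 0:
        raise ValueError("cue must have a positive duration")
    # Collapse line breaks and repeated whitespace into single spaces,
    # so a two-line cue is counted the same as its one-line equivalent.
    visible = " ".join(text.split())
    return len(visible) / duration


cue_text = "I never said that.\nYou must have misheard me."
cps = chars_per_second(cue_text, 12.0, 14.5)
print(f"{cps:.1f} CPS:", "OK" if cps <= MAX_CPS else "too fast")
```

When a cue comes out too fast, the options are the ones listed above: shorten the text, re-segment, or extend the out-time if the next cue allows it.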

C) Segmentation (line breaking) matters more than people think

Bad segmentation makes even accurate translation hard to read.

Good segmentation:

  • Breaks on natural phrase boundaries,
  • Keeps names and verbs together,
  • Avoids splitting articles/prepositions awkwardly,
  • Avoids breaking numbers and units.

D) Overlap and continuity

  • Avoid overlaps that cause flicker or hidden cues.
  • Maintain consistency in timing rhythm across episodes.
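Flash cues and overlaps can be caught mechanically before anyone watches the file. The following is a minimal sketch, assuming cues are already parsed into `(start, end)` pairs in seconds and sorted by start time; the `MIN_DURATION_S` value is an assumption and should come from your style guide.

```python
from typing import List, Tuple

Cue = Tuple[float, float]  # (start_seconds, end_seconds)

MIN_DURATION_S = 0.8  # assumed minimum; use the style guide's value


def find_timing_issues(cues: List[Cue],
                       min_duration: float = MIN_DURATION_S) -> List[str]:
    """Report flash cues, non-positive durations, and overlaps.

    Assumes cues are sorted by start time.
    """
    issues = []
    for i, (start, end) in enumerate(cues):
        if end <= start:
            issues.append(f"cue {i + 1}: non-positive duration")
        elif end - start < min_duration:
            issues.append(f"cue {i + 1}: flash cue ({end - start:.2f}s)")
        # A cue overlaps when it starts before the previous cue has ended.
        if i > 0 and start < cues[i - 1][1]:
            issues.append(f"cue {i + 1}: overlaps cue {i}")
    return issues


for issue in find_timing_issues([(0.0, 2.0), (1.5, 1.9), (3.0, 2.5)]):
    print(issue)
```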

6) Formatting Rules

Line length and layout

A common best practice is:

  • Keep lines short enough to read comfortably,
  • Avoid cramming too much into one cue,
  • Use 1–2 lines where possible,
  • Avoid three-line subtitles unless the platform explicitly allows it.
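These layout rules are easy to enforce with a small check. The sketch below flags cues with too many lines or lines that are too long. Both limits (`MAX_CHARS_PER_LINE` and `MAX_LINES`) are assumed values for illustration; the real numbers belong to the platform's style guide.

```python
MAX_CHARS_PER_LINE = 42  # assumed limit; check the platform style guide
MAX_LINES = 2            # assumed limit


def layout_issues(text: str,
                  max_chars: int = MAX_CHARS_PER_LINE,
                  max_lines: int = MAX_LINES) -> list:
    """Return a list of layout problems for one cue's text."""
    lines = text.split("\n")
    issues = []
    if len(lines) > max_lines:
        issues.append(f"{len(lines)} lines (max {max_lines})")
    for number, line in enumerate(lines, start=1):
        if len(line) > max_chars:
            issues.append(f"line {number}: {len(line)} chars (max {max_chars})")
    return issues


print(layout_issues("First line\nSecond line\nThird line"))
```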

Speaker identification

If two speakers share a cue:

  • Use hyphens or speaker labels (based on your style guide),
  • Ensure clarity without overloading text.

SDH and closed captions

SDH includes non-dialogue cues like:

  • [door slams]
  • [music]
  • Speaker IDs when needed

This is required for accessibility in many workflows, but it must follow the platform’s style rules.

Consistency rules

  • Names spelled consistently across episodes
  • Terminology consistent (use a glossary)
  • Punctuation style consistent
  • Numerals style consistent (e.g., “10” vs “ten” based on guide)

7) Conversion Workflows

Conversion is not a mechanical step. It’s a quality risk.

SRT → VTT conversion

Usually straightforward, but you must QC:

  • Timing precision differences,
  • Cue formatting changes,
  • Any styling cues you rely on.
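For plain cues, the core of the conversion is small: add the `WEBVTT` header and switch the millisecond separator in each timing line from a comma to a period. The sketch below does only that, under the assumption that the SRT file uses the standard `HH:MM:SS,mmm` timing form. It does not translate player-specific SRT markup such as font tags, which is why the QC items above still apply. The function name `srt_to_vtt` is illustrative.

```python
import re

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:03,500".
TIMING = re.compile(
    r"^(\d{2}:\d{2}:\d{2}),(\d{3}) --> (\d{2}:\d{2}:\d{2}),(\d{3})",
    re.MULTILINE,
)


def srt_to_vtt(srt_text: str) -> str:
    """Convert simple SRT text to WebVTT (timing and header only)."""
    # Normalize line endings, drop a leading byte order mark, trim padding.
    body = srt_text.replace("\r\n", "\n").lstrip("\ufeff").strip()
    # WebVTT uses a period before the milliseconds instead of a comma.
    body = TIMING.sub(r"\1.\2 --> \3.\4", body)
    # SRT cue numbers can stay: WebVTT accepts them as cue identifiers.
    return "WEBVTT\n\n" + body + "\n"


print(srt_to_vtt("1\n00:00:01,000 --> 00:00:03,500\nHello.\n"))
```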

SRT → TTML conversion

TTML conversion is where teams get burned.

You must QC:

  • XML validity
  • Timecode format compatibility
  • Whether special characters and punctuation are preserved correctly
  • Styling/region compliance (if used)
  • Line break behavior (often changes during conversion)
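Several of these risks come from building XML by string concatenation. A minimal sketch of a safer approach, using Python's standard `xml.etree.ElementTree` so that special characters are escaped automatically and line breaks become explicit `br` elements, is shown below. It emits only a bare `tt/body/div/p` skeleton with no styling or regions, which is an assumption for illustration; a real delivery must follow the TTML profile named in the platform's delivery sheet.

```python
import xml.etree.ElementTree as ET

TTML_NS = "http://www.w3.org/ns/ttml"
XML_NS = "http://www.w3.org/XML/1998/namespace"


def cues_to_ttml(cues, lang="en"):
    """Build a bare TTML document from (start, end, text) cues.

    start/end are SRT-style timecodes such as "00:00:01,000".
    """
    ET.register_namespace("", TTML_NS)  # serialize TTML as the default namespace
    tt = ET.Element(f"{{{TTML_NS}}}tt", {f"{{{XML_NS}}}lang": lang})
    body = ET.SubElement(tt, f"{{{TTML_NS}}}body")
    div = ET.SubElement(body, f"{{{TTML_NS}}}div")
    for start, end, text in cues:
        # Write the fraction with a period instead of SRT's comma.
        attrs = {"begin": start.replace(",", "."), "end": end.replace(",", ".")}
        p = ET.SubElement(div, f"{{{TTML_NS}}}p", attrs)
        lines = text.split("\n")
        p.text = lines[0]
        for line in lines[1:]:
            # Each additional subtitle line follows an explicit <br/>.
            br = ET.SubElement(p, f"{{{TTML_NS}}}br")
            br.tail = line
    return ET.tostring(tt, encoding="unicode")


ttml = cues_to_ttml([("00:00:01,000", "00:00:03,500", "Tom & Jerry\n<laughs>")])
ET.fromstring(ttml)  # raises ParseError if the output is not well-formed
print(ttml)
```

Parsing the output back, as the last lines do, only proves the XML is well-formed. It does not prove compliance with a platform's TTML profile, which needs the platform's own validator or delivery sheet.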

Best practice: always perform a post-conversion QC pass before delivery.

8) Subtitle QC Checklist

Use this checklist before sending files to an OTT platform or client.

A) Technical QC

  • Correct format requested (SRT / VTT / TTML)
  • Correct encoding (no garbled characters)
  • Timecodes valid, consistent, no negative durations
  • No overlaps that break playback
  • No cues too short to read (“flash” cues)
  • Correct frame rate/timebase assumptions (where relevant)
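The encoding item can be checked before a file is even opened in a subtitle tool. The sketch below assumes the delivery spec asks for UTF-8, which is common but not universal; the function name `encoding_issues` is illustrative.

```python
def encoding_issues(raw: bytes) -> list:
    """Check raw subtitle file bytes, assuming UTF-8 is the required encoding."""
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        return [f"not valid UTF-8 at byte {exc.start}"]
    issues = []
    if text.startswith("\ufeff"):
        issues.append("starts with a byte order mark; confirm the platform accepts it")
    if "\ufffd" in text:
        # U+FFFD usually means an earlier lossy conversion already damaged the text.
        issues.append("contains U+FFFD replacement characters")
    return issues


print(encoding_issues("café".encode("latin-1")))
```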

B) Readability QC

  • Reading speed reasonable
  • Line breaks are natural
  • No overcrowded cues
  • Punctuation supports readability
  • Italics used consistently (if required)

C) Language QC

  • Meaning accurate (not literal-only)
  • Names/terms consistent (glossary applied)
  • Tone and register consistent
  • Spelling and grammar clean (proofread)

D) Safe area and viewing QC

  • Check on mobile playback
  • Ensure subtitles are not covered by UI overlays
  • Ensure subtitles do not block critical on-screen text

9) Common Mistakes That Cause Rejections

  1. Delivering the wrong format (or wrong TTML variant)
  2. Skipping post-conversion QC (conversion can break layout and timing)
  3. Subtitles placed too low (covered by platform UI)
  4. High reading speed (viewers can’t keep up)
  5. Bad segmentation (hard to read even if accurate)
  6. Inconsistent terminology and names (breaks continuity in series)
  7. No style guide (different episodes feel “different”)
  8. No version control (v1/v2 confusion and wrong files go live)

If you’re delivering subtitles at scale, the fastest way to reduce rejections is a standard format + QC workflow.

If you share:

  • a sample episode
  • target platform requirements (format + any style constraints)
  • target languages

We can propose a delivery-ready subtitle workflow including translation, proofreading, and QC. Contact Sukudo Studios Today!


FAQ: SRT vs VTT vs TTML

1) Which subtitle format is best: SRT, VTT, or TTML?

There is no universal best. SRT is simplest and widely supported. VTT is better for web workflows with more cue features. TTML is best for spec-driven OTT/broadcast deliveries.

2) Why do OTT platforms often request TTML?

Because TTML supports structured styling, layout rules, and spec compliance in enterprise pipelines. It is more predictable for platform ingestion when done correctly.

3) Can I convert SRT to TTML automatically?

You can convert it, but you must QC after conversion. TTML is spec-sensitive and conversion can break timing, line breaks, or formatting.

4) What causes subtitles to be hidden behind player controls?

Safe area issues—subtitles are positioned too close to the bottom, and UI overlays cover them. This is common on mobile.

5) What is “spotting” in subtitling?

Spotting is the timing and segmentation process: deciding when subtitles appear/disappear and how text is split into readable lines.

6) What is CPS and why does it matter?

CPS means characters per second. If CPS is too high, viewers can’t read subtitles comfortably, especially on mobile.

7) Do YouTube subtitles require VTT?

YouTube supports common formats including SRT and VTT. What matters most is timing accuracy and readability.

8) What is SDH and when is it required?

SDH (Subtitles for the Deaf and Hard-of-hearing) includes sound cues and sometimes speaker IDs. It’s required for accessibility in many distribution workflows.

9) What is the biggest QC mistake teams make?

Skipping post-conversion QC. Conversion can change line breaks and timing and introduce errors that weren’t present in the source file.

10) What should I send a subtitle vendor to get started?

A sample video/episode, transcript (if available), target languages, platform delivery specs (format + style requirements), and any glossary/style rules.
