Scority

YouTube Transcript API Guide

Learn how to fetch YouTube transcripts and captions with an API, handle errors, choose languages and use transcript data in AI workflows.

Direct answer

Use a transcript API when you need spoken video as structured text

A YouTube Transcript API returns machine-readable transcript text and timestamped segments for public YouTube videos that have accessible captions. Scority focuses on that transcript contract instead of downloads, uploads, channel analytics or player embeds.

  • Use it when your workflow needs transcript text, not video downloads or media conversion.
  • Send a YouTube video ID or URL and receive JSON with language, source, text and segments.
  • Build summarization, search, RAG, compliance review and agent workflows on top of transcript text.

Request model

Video ID, video URL and language

The Scority endpoint accepts either video_id or video_url. Send exactly one. You can also request a caption language with language or lang.

  • video_id is the compact 11-character YouTube ID.
  • video_url accepts a canonical YouTube URL when your app stores full links.
  • language accepts values such as en, en-US, ru or ru-RU.
  • If the exact requested language is not available, a different caption track may be selected; check the language field in the response to see which track was actually used.
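
The "send exactly one" rule is easy to enforce before a request leaves your code. The sketch below is one possible client-side check, not part of the Scority API: build_transcript_params is a hypothetical helper, and the 11-character pattern is an assumption about which characters a YouTube ID may contain.

```python
import re

# Assumption: a YouTube video ID is 11 characters drawn from letters, digits, "-" and "_".
VIDEO_ID_PATTERN = re.compile(r"[A-Za-z0-9_-]{11}")


def build_transcript_params(video_id=None, video_url=None, language=None):
    """Return query parameters that contain exactly one video reference."""
    # Reject both "neither given" and "both given".
    if (video_id is None) == (video_url is None):
        raise ValueError("send exactly one of video_id or video_url")
    if video_id is not None and not VIDEO_ID_PATTERN.fullmatch(video_id):
        raise ValueError("video_id must be an 11-character YouTube ID")

    if video_id is not None:
        params = {"video_id": video_id}
    else:
        params = {"video_url": video_url}

    # The language is optional, for example "en", "en-US", "ru" or "ru-RU".
    if language is not None:
        params["language"] = language
    return params
```

The returned dictionary can be passed as the params argument of an HTTP client call, so a malformed request fails locally instead of costing a round trip.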

Authentication

Keep API keys server-side

Use the x-api-key header from trusted server-side code. Do not put API keys in frontend bundles or browser-only apps.

  • Store keys in environment variables.
  • Rotate keys if they are exposed.
  • Use separate keys when you need separate quota tracking.
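
A small helper can make the environment-variable rule hard to skip. This is a minimal sketch: auth_headers is a hypothetical function, and SCORITY_API_KEY is simply the variable name used in the examples on this page. It fails fast when the key is missing instead of sending an unauthenticated request.

```python
import os


def auth_headers(env=os.environ, var_name="SCORITY_API_KEY"):
    """Build the x-api-key header from an environment mapping."""
    key = env.get(var_name, "").strip()
    if not key:
        # Fail early so a missing key is caught at startup, not at request time.
        raise RuntimeError(f"{var_name} is not set")
    return {"x-api-key": key}
```

Passing a different var_name per workload is one way to keep separate keys for separate quota tracking.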

Response

Response shape

A successful response returns the selected language, transcript source, full text and timestamped segments.

  • language: selected caption track language when available.
  • source: transcript source such as caption_track_direct.
  • text: full transcript text.
  • segments: ordered transcript segments with text, start and duration.
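
To make the shape concrete, the fields above can be mapped onto small typed objects. This is a sketch under assumptions: the Segment and Transcript classes and parse_transcript are hypothetical names, the sample payload is invented for illustration, and start and duration are assumed to be numbers in seconds.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    text: str
    start: float     # assumed to be seconds from the start of the video
    duration: float  # assumed to be seconds

    @property
    def end(self) -> float:
        return self.start + self.duration


@dataclass
class Transcript:
    language: str
    source: str
    text: str
    segments: list


def parse_transcript(payload: dict) -> Transcript:
    """Convert a decoded JSON response into typed objects."""
    segments = [
        Segment(text=s["text"], start=float(s["start"]), duration=float(s["duration"]))
        for s in payload.get("segments", [])
    ]
    return Transcript(
        language=payload.get("language", ""),
        source=payload.get("source", ""),
        text=payload.get("text", ""),
        segments=segments,
    )


# Hypothetical payload for illustration only; not real API output.
sample = {
    "language": "en",
    "source": "caption_track_direct",
    "text": "Hello there. Welcome back.",
    "segments": [
        {"text": "Hello there.", "start": 0.0, "duration": 2.5},
        {"text": "Welcome back.", "start": 2.5, "duration": 3.0},
    ],
}
transcript = parse_transcript(sample)
```

Keeping segments as ordered objects with a computed end time makes it simpler to link search hits or summaries back to a position in the video.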

curl

curl "https://api.scority.ai/v1/youtube/transcript?video_id=dQw4w9WgXcQ&language=en" \
  -H "x-api-key: YOUR_API_KEY"

Server-side JavaScript

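// Build the request URL with the video ID and preferred caption language.
const url = new URL("https://api.scority.ai/v1/youtube/transcript")
url.searchParams.set("video_id", "dQw4w9WgXcQ")
url.searchParams.set("language", "en")

// Read the API key from the server environment; never ship it to the browser.
const response = await fetch(url, {
  headers: {
    "x-api-key": process.env.SCORITY_API_KEY
  }
})

if (!response.ok) {
  // Fall back to an empty object if the error body is not valid JSON.
  const error = await response.json().catch(() => ({}))
  throw new Error(error.error?.code ?? "transcript_request_failed")
}

const transcript = await response.json()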

Python

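import os
import requests

# Send the video ID and preferred caption language; the key comes from the environment.
response = requests.get(
    "https://api.scority.ai/v1/youtube/transcript",
    params={"video_id": "dQw4w9WgXcQ", "language": "en"},
    headers={"x-api-key": os.environ["SCORITY_API_KEY"]},
    timeout=30,
)

if not response.ok:
    # Surface the API error code, with a generic fallback if the body is not JSON.
    try:
        code = response.json().get("error", {}).get("code")
    except ValueError:
        code = None
    raise RuntimeError(code or "transcript_request_failed")

transcript = response.json()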

Errors

Handle transcript failures explicitly

Transcript availability depends on the video and its caption tracks. Your integration should treat errors as part of the normal API contract.

  • invalid_video_id and invalid_video_url mean the request should be corrected before retrying.
  • transcript_not_available means captions are unavailable for that video or request.
  • upstream_transcript_failed means the upstream caption fetch failed and may be worth retrying later.
  • rate_limited and quota_exceeded both return HTTP 429 but call for different handling, so branch on the error code rather than on the status alone.
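
One way to treat these codes as part of the contract is to map each one to a client-side action. The sketch below assumes the error body has the {"error": {"code": ...}} shape used by the request examples on this page. The action names and the classify_error helper are hypothetical, and the handling chosen for rate_limited and quota_exceeded is an assumption about sensible client behavior, not documented API guidance.

```python
FIX_REQUEST = "fix_request"        # correct the input before trying again
NO_TRANSCRIPT = "no_transcript"    # record that the video has no usable captions
RETRY_LATER = "retry_later"        # transient failure; retry with backoff
WAIT_FOR_QUOTA = "wait_for_quota"  # stop sending until quota is available again
UNKNOWN = "unknown"                # unrecognized code; log and investigate

ACTIONS = {
    "invalid_video_id": FIX_REQUEST,
    "invalid_video_url": FIX_REQUEST,
    "transcript_not_available": NO_TRANSCRIPT,
    "upstream_transcript_failed": RETRY_LATER,
    "rate_limited": RETRY_LATER,       # assumption: short backoff is enough
    "quota_exceeded": WAIT_FOR_QUOTA,  # assumption: retrying right away will not help
}


def classify_error(body):
    """Map a decoded error body to a client-side action."""
    error = body.get("error") if isinstance(body, dict) else None
    code = error.get("code") if isinstance(error, dict) else None
    return ACTIONS.get(code, UNKNOWN)
```

Because both 429 cases share a status, the decision is keyed on the code string; an unrecognized or missing code falls through to UNKNOWN rather than being retried blindly.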

AI workflows

Where transcript APIs fit

Transcript APIs are useful when video speech needs to become structured text for automated processing.

  • AI agents can inspect video content before answering questions.
  • RAG pipelines can index transcript text and segment timestamps.
  • Summarization tools can produce notes, outlines and chapter summaries.
  • Search workflows can match spoken content without storing video media.
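
For indexing in a RAG or search pipeline, segments are usually grouped into larger chunks that keep their time range. The sketch below shows one simple greedy approach. The chunk_segments helper and the max_chars limit are hypothetical, and segments are assumed to be dictionaries with text, start and duration, with times in seconds.

```python
def _finish(segs):
    """Merge a run of segments into one chunk with its time range."""
    return {
        "text": " ".join(s["text"] for s in segs),
        "start": segs[0]["start"],
        "end": segs[-1]["start"] + segs[-1]["duration"],
    }


def chunk_segments(segments, max_chars=500):
    """Greedily group ordered segments into chunks of at most max_chars.

    A single segment longer than max_chars becomes its own chunk.
    """
    chunks = []
    current = []
    size = 0  # length of the joined text of the current chunk
    for seg in segments:
        # Account for the joining space when the chunk already has text.
        new_size = size + len(seg["text"]) + (1 if current else 0)
        if current and new_size > max_chars:
            chunks.append(_finish(current))
            current = []
            new_size = len(seg["text"])
        current.append(seg)
        size = new_size
    if current:
        chunks.append(_finish(current))
    return chunks
```

Each chunk carries start and end, so an answer or search hit can point back to the moment in the video where the text was spoken.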

Limitations

Not every video has a transcript

Scority works with many public YouTube videos and has fallback infrastructure for harder cases, but some videos may still return transcript_not_available or upstream_transcript_failed. Do not design clients around a promise that every video will resolve.