Scority

YouTube API 429 Error and Rate Limits

Understand YouTube API 429 errors, rate limits, quota_exceeded responses, Retry-After headers and safe retry strategies.

Direct answer

429 means the request is throttled or the quota is exhausted

HTTP 429 (Too Many Requests) is not a single problem. It can signal a short-window rate limit, an exhausted monthly quota or another provider-specific throttling state. In Scority, check error.code before deciding what to do next.

  • rate_limited means slow down and respect Retry-After when present.
  • quota_exceeded means the API key reached its monthly product quota.
  • Do not retry 429 responses in a tight loop.
  • Use caching and queues when the same videos are requested repeatedly.

Concept

Rate limit vs monthly quota

Rate limits protect service reliability over a short window. Monthly quota controls product usage over a longer billing or access period.

  • A burst of requests can trigger rate_limited even when monthly quota remains.
  • A key can hit quota_exceeded even if the current request rate is low.
  • Both can return HTTP 429, so your app should branch on error.code.
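
The branching described above can be sketched in Python. This is a minimal illustration, not Scority's SDK: it assumes the 429 body is JSON shaped like {"error": {"code": "..."}}. The page only names error.code, so the exact nesting is an assumption, and classify_429 and Action are hypothetical names.

```python
import json
from enum import Enum


class Action(Enum):
    BACK_OFF = "back_off"  # short-window throttling: wait, then retry
    STOP = "stop"          # monthly quota exhausted: retrying will not help
    UNKNOWN = "unknown"    # 429 without a recognized code: handle conservatively


def classify_429(body_text: str) -> Action:
    """Map the body of an HTTP 429 response to a next step.

    Assumes a JSON body shaped like {"error": {"code": "..."}}.
    """
    try:
        payload = json.loads(body_text)
    except json.JSONDecodeError:
        return Action.UNKNOWN
    error = payload.get("error") if isinstance(payload, dict) else None
    code = error.get("code") if isinstance(error, dict) else None
    if code == "rate_limited":
        return Action.BACK_OFF
    if code == "quota_exceeded":
        return Action.STOP
    return Action.UNKNOWN
```

Returning UNKNOWN for anything unrecognized keeps the caller from guessing; a reasonable default is to treat it like throttling but with a low retry ceiling.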

Scority

Scority rate_limited vs quota_exceeded

Scority's transcript API uses normalized error codes so integrations can distinguish short-window throttling from monthly quota exhaustion.

  • rate_limited: wait, queue work or reduce concurrency.
  • quota_exceeded: reduce usage, wait for the monthly quota to reset or request a quota change.
  • The public response does not expose internal API key IDs or quota internals.

Rate-limit headers

Read these headers when building retry and backoff behavior.

X-RateLimit-Limit

Maximum requests allowed in the current short-window interval.

X-RateLimit-Remaining

Requests left before the current window is exhausted.

X-RateLimit-Reset

When the current short-window interval resets.

Retry-After

How long to wait after a 429 rate_limited response.
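
In standard HTTP, Retry-After may carry either a number of seconds or an HTTP date. The sketch below handles both forms, assuming Scority follows that convention; this page does not state the exact format. X-RateLimit-Reset is left unparsed because its format is not given here. retry_after_seconds is a hypothetical helper name.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Mapping, Optional


def retry_after_seconds(headers: Mapping[str, str],
                        now: Optional[datetime] = None) -> Optional[float]:
    """Return the Retry-After delay in seconds, or None if absent or unparseable."""
    # Header names are case-insensitive, so normalize before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    raw = lowered.get("retry-after")
    if raw is None:
        return None
    raw = raw.strip()
    if raw.isdigit():
        # Delay form, e.g. "30".
        return float(raw)
    try:
        # Date form, e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
        when = parsedate_to_datetime(raw)
    except (TypeError, ValueError):
        return None
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    # A date in the past means no further wait is required.
    return max(0.0, (when - now).total_seconds())
```

When the function returns None, fall back to your own backoff delay rather than retrying immediately.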

Retry

How to retry safely

Retries should reduce pressure, not multiply it. Use explicit delays, backoff and queueing instead of immediate loops.

  • If Retry-After is present, wait at least that long.
  • Use exponential backoff for transient upstream or short-window failures.
  • Do not retry requests rejected for invalid input; the same request will fail again.
  • Stop automatic retries when quota_exceeded is returned.
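
The rules above can be sketched as a small retry loop. This is an illustration under stated assumptions: the (status, error_code, retry_after) tuple is a hypothetical result shape, treating 5xx responses as transient is an assumption, and the base delay, cap and attempt count are placeholder values to tune for your workload.

```python
import random
import time
from typing import Callable, Optional, Tuple

# Hypothetical result shape for this sketch: (status, error_code, retry_after_seconds).
Result = Tuple[int, Optional[str], Optional[float]]


def backoff_delay(attempt: int, retry_after: Optional[float],
                  base: float = 1.0, cap: float = 60.0) -> float:
    """Exponential backoff with random jitter, never shorter than Retry-After."""
    exponential = min(cap, base * (2 ** attempt))
    jittered = random.uniform(0, exponential)
    return max(jittered, retry_after or 0.0)


def call_with_retries(send: Callable[[], Result], max_attempts: int = 5,
                      sleep: Callable[[float], None] = time.sleep) -> Result:
    """Call send() up to max_attempts times, pausing between attempts."""
    result = send()
    for attempt in range(max_attempts - 1):
        status, code, retry_after = result
        # Retry only short-window throttling and (by assumption) 5xx failures.
        # quota_exceeded and other 4xx responses are returned to the caller as-is.
        retryable = (status == 429 and code == "rate_limited") or status >= 500
        if not retryable:
            return result
        sleep(backoff_delay(attempt, retry_after))
        result = send()
    return result
```

The jitter spreads retries from many workers so they do not all return at the same instant, and the hard attempt limit keeps a persistent failure from turning into a loop.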

Cache

How caching reduces repeated requests

Transcript workflows often request the same video more than once. Caching successful responses in your own application can reduce both quota usage and rate-limit pressure.

  • Cache by video ID and requested language when possible.
  • Avoid fetching the same transcript repeatedly during a single user session.
  • Separate background processing from interactive page loads.
  • Use queues for bulk jobs instead of sending all requests at once.
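
A minimal in-memory sketch of caching by video ID and language. TranscriptCache, the fetch callable and the one-hour TTL are illustrative assumptions; a production setup would more likely use a shared store such as a database or Redis.

```python
import time
from typing import Callable, Dict, Optional, Tuple

CacheKey = Tuple[str, Optional[str]]  # (video_id, language)


class TranscriptCache:
    """Cache successful transcript responses for a fixed time-to-live."""

    def __init__(self, fetch: Callable[[str, Optional[str]], dict],
                 ttl_seconds: float = 3600.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[CacheKey, Tuple[float, dict]] = {}

    def get(self, video_id: str, language: Optional[str] = None) -> dict:
        key = (video_id, language)
        now = self._clock()
        cached = self._entries.get(key)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]  # fresh hit: no API request is made
        # If fetch raises (for example on a 429), nothing is stored,
        # so only successful responses ever reach the cache.
        transcript = self._fetch(video_id, language)
        self._entries[key] = (now, transcript)
        return transcript
```

Keying on both values matters because the same video can have transcripts in several languages; keying on the video ID alone would return the wrong language on a hit.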

Avoid

What not to do

  • Do not run tight retry loops after 429.
  • Do not expose x-api-key in browser code to work around throttling.
  • Do not send broad scraping jobs without limits or queueing.
  • Do not ignore quota_exceeded and keep retrying the same key.
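
As a contrast to an unbounded bulk job, here is a minimal sketch that deduplicates the input and caps concurrency. run_bulk and the fetch callable are hypothetical names, and the default of two workers is a placeholder, not a documented Scority limit.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable


def run_bulk(video_ids: Iterable[str], fetch: Callable[[str], dict],
             max_workers: int = 2) -> Dict[str, dict]:
    """Fetch each distinct video once, with at most max_workers requests in flight."""
    # dict.fromkeys drops duplicates while preserving the original order.
    unique_ids = list(dict.fromkeys(video_ids))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in input order; an exception raised by fetch
        # propagates here when its result is consumed.
        results = pool.map(fetch, unique_ids)
        return dict(zip(unique_ids, results))
```

The worker pool acts as a simple queue: extra jobs wait their turn instead of hitting the API all at once.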

Request example

The request shape is the same whether the response succeeds or returns a normalized 429 error.

curl "https://api.scority.ai/v1/youtube/transcript?video_id=dQw4w9WgXcQ" \
  -H "x-api-key: YOUR_API_KEY"
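
For server-side code, the same request can be sketched with Python's standard library. The URL and header name come from the curl example; build_request and send are hypothetical helper names, and the environment variable mentioned in the comment is an assumption.

```python
import urllib.error
import urllib.parse
import urllib.request
from typing import Dict, Tuple

BASE_URL = "https://api.scority.ai/v1/youtube/transcript"


def build_request(video_id: str, api_key: str) -> urllib.request.Request:
    """Build the same GET request as the curl example."""
    query = urllib.parse.urlencode({"video_id": video_id})
    return urllib.request.Request(f"{BASE_URL}?{query}", headers={"x-api-key": api_key})


def send(request: urllib.request.Request,
         timeout: float = 10.0) -> Tuple[int, Dict[str, str], str]:
    """Return (status, headers, body) for success and error responses alike."""
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, dict(response.headers.items()), response.read().decode("utf-8")
    except urllib.error.HTTPError as error:
        # urllib raises on 4xx/5xx; the error object still carries the status,
        # headers and body, so a 429 can be inspected like any other response.
        return error.code, dict(error.headers.items()), error.read().decode("utf-8")


# Example call (SCORITY_API_KEY is a hypothetical environment variable name):
#   import os
#   status, headers, body = send(build_request("dQw4w9WgXcQ", os.environ["SCORITY_API_KEY"]))
```

Keeping the key in server-side configuration rather than in source or browser code keeps it out of client reach.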

Checklist

Debug checklist

  • Read error.code to distinguish rate_limited from quota_exceeded.
  • Read Retry-After before retrying.
  • Check whether repeated jobs are requesting the same video and language.
  • Move bursty work into a queue.
  • Review monthly usage when quota_exceeded appears.

Docs

Rate limits and quotas

See the canonical header and quota reference.

Errors

Error codes

Map rate_limited and quota_exceeded to actions.

Guide

Quota guide

Understand monthly quota, short-window limits and caching.

Pricing

Pricing concepts

Estimate transcript usage and cost drivers.

Reference

API reference

Review the transcript endpoint and response shape.
