Running Tube Archivist: Self-Hosted YouTube Archive

YouTube videos disappear all the time. Channels get deleted, videos go private, content gets region-locked, and suddenly that tutorial or lecture you bookmarked six months ago is gone forever. Tube Archivist solves this by letting you build your own personal YouTube archive — complete with metadata, thumbnails, subtitles, and a slick search interface.

What is Tube Archivist?

Tube Archivist is an open-source, self-hosted YouTube media server. Think of it as Plex or Jellyfin, but specifically designed for YouTube content. It downloads videos using yt-dlp under the hood, indexes everything in Elasticsearch for fast full-text search, and presents it all through a clean web interface.

Why Archive YouTube Content?

Videos disappear: Channels get banned, videos go private, copyright strikes remove content
Offline access: Watch downloaded content without an internet connection
No ads: Archived videos play without interruptions
Full-text search: Search across video titles, descriptions, and subtitles
Organize by channel: Automatically sorts content by channel and playlist
Metadata preservation: Keeps thumbnails, descriptions, timestamps, and subtitles
Sponsor segments: Integrates with SponsorBlock to skip sponsored sections
Cast support: Stream archived videos to your TV

How It Compares

Feature	Tube Archivist	yt-dlp alone	Jellyfin + yt-dlp
Web UI	✅ Beautiful	❌ CLI only	✅ Generic media UI
Auto-download	✅ Scheduled	❌ Manual	❌ Manual
Search	✅ Full-text	❌ None	⚠️ Basic
Channel tracking	✅ Subscribe	❌ None	❌ None
Subtitles	✅ Indexed	✅ Downloaded	⚠️ Displayed only
SponsorBlock	✅ Built-in	✅ Via --sponsorblock-* flags	❌ Plugin

Prerequisites

Before starting, you’ll need:

A Linux server (Ubuntu 22.04+ or Debian 12+ recommended)
Docker and Docker Compose installed
At least 4GB RAM (Elasticsearch is memory-hungry)
Plenty of storage — video files add up fast (plan for 500GB+ if you’re serious)
Basic command line knowledge

Storage Planning

YouTube videos vary wildly in size. A rough guide:

720p video (10 min): ~150MB
1080p video (10 min): ~300MB
4K video (10 min): ~1GB+
Typical channel (100 videos): 15-50GB at 720p

Plan your storage accordingly. A 2TB drive gives you room for roughly 5,000-10,000 videos at 720p, which is a substantial archive.
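
The arithmetic behind that estimate can be sketched in a few lines of shell. The function name estimate_videos and the average sizes passed to it are illustrative assumptions; real averages depend on video length and bitrate.

```shell
#!/bin/sh
# Rough capacity estimate: how many videos of a given average size fit on a drive.
# Usage: estimate_videos <capacity_in_GB> <average_MB_per_video>
# (Hypothetical helper for illustration; the sizes are assumptions, not measurements.)
estimate_videos() {
    capacity_gb=$1
    avg_mb=$2
    # Treat 1 GB as 1000 MB, the decimal units drive vendors advertise.
    echo $(( capacity_gb * 1000 / avg_mb ))
}

estimate_videos 2000 200   # 2TB at an assumed ~200MB average per video
estimate_videos 2000 400   # 2TB at an assumed ~400MB average per video
```

With these inputs the two calls print 10000 and 5000, so a range of 5,000-10,000 videos corresponds to an average of roughly 200-400MB per video. Short ten-minute clips at ~150MB would fit in larger numbers.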

Step 1: Create the Project Directory

mkdir -p ~/tube-archivist
cd ~/tube-archivist

Create directories for persistent data:

mkdir -p data/media data/cache data/redis data/es

The media directory is where your downloaded videos will live. You can point this to a separate drive or NAS mount if you have one.

Step 2: Set Up Docker Compose

Create the docker-compose.yml file:

nano docker-compose.yml

Paste the following configuration:

version: "3.9"

services:
  tubearchivist:
    container_name: tubearchivist
    image: bbilly1/tubearchivist:latest
    restart: unless-stopped
    ports:
      - "8000:8000"
    volumes:
      - ./data/media:/youtube
      - ./data/cache:/cache
    environment:
      - ES_URL=http://archivist-es:9200
      - REDIS_HOST=archivist-redis
      - HOST_UID=1000
      - HOST_GID=1000
      - TA_HOST=tubearchivist.local
      - TA_USERNAME=admin
      - TA_PASSWORD=changeme123
      - ELASTIC_PASSWORD=verysecret
      - TZ=America/New_York
    depends_on:
      - archivist-es
      - archivist-redis
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
      interval: 2m
      timeout: 10s
      retries: 3
      start_period: 30s

  archivist-redis:
    container_name: archivist-redis
    image: redis/redis-stack-server:latest
    restart: unless-stopped
    volumes:
      - ./data/redis:/data
    depends_on:
      - archivist-es

  archivist-es:
    container_name: archivist-es
    image: bbilly1/tubearchivist-es:latest
    restart: unless-stopped
    environment:
      - "ELASTIC_PASSWORD=verysecret"
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
      - "xpack.security.enabled=true"
      - "discovery.type=single-node"
      - "path.repo=/usr/share/elasticsearch/data/snapshot"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    volumes:
      - ./data/es:/usr/share/elasticsearch/data

Important Configuration Notes

TA_HOST: Set this to the hostname or IP you’ll access the UI from. Use your server’s IP (e.g., 192.168.1.100) or a domain name if you’re using a reverse proxy.
TA_USERNAME / TA_PASSWORD: Change these from the defaults immediately.
ELASTIC_PASSWORD: Must match in both the tubearchivist and archivist-es services.
ES_JAVA_OPTS: -Xms512m -Xmx512m sets the Elasticsearch JVM heap to 512MB. Raise both values together to 1g or 2g if you have RAM to spare; a larger heap helps search and indexing as the archive grows.
HOST_UID / HOST_GID: Set these to match your user’s UID/GID so file permissions work correctly. Check with id -u and id -g.
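
Hardcoding passwords in docker-compose.yml makes it easy to leave the defaults in place. One hedged alternative is to generate them into a .env file, which Docker Compose reads automatically from the project directory for ${VARIABLE} substitution. In the sketch below, the helper names gen_secret and write_env_file are hypothetical, and using a .env file at all is an assumption of this example rather than something Tube Archivist requires.

```shell
#!/bin/sh
# Print a random 32-character hex string read from the kernel's random source.
gen_secret() {
    head -c 16 /dev/urandom | od -An -tx1 | tr -d ' \n'
}

# Write generated credentials and the current user's IDs to an env file.
# The variable names mirror the Compose file; the file path is the caller's choice.
write_env_file() {
    env_file=$1
    (
        umask 077   # a newly created file is readable by its owner only
        {
            echo "TA_PASSWORD=$(gen_secret)"
            echo "ELASTIC_PASSWORD=$(gen_secret)"
            echo "HOST_UID=$(id -u)"
            echo "HOST_GID=$(id -g)"
        } > "$env_file"
    )
}

# Example (run from ~/tube-archivist):
#   write_env_file .env
```

After running write_env_file .env, the environment entries in the Compose file could reference the values as TA_PASSWORD=${TA_PASSWORD} and ELASTIC_PASSWORD=${ELASTIC_PASSWORD}, so both services pick up the same Elasticsearch password from one place.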

Step 3: Prepare Elasticsearch

Elasticsearch needs a kernel parameter adjustment to run properly:

sudo sysctl -w vm.max_map_count=262144

Make it permanent:

echo "vm.max_map_count=262144" | sudo tee -a /etc/sysctl.conf

Set proper permissions on the Elasticsearch data directory:

sudo chown -R 1000:0 data/es
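
Appending to /etc/sysctl.conf with tee -a adds a duplicate line each time the command is rerun. A variant that stays idempotent is to compare the current value first and write a dedicated drop-in file. This is a sketch under assumptions: the helper name map_count_ok and the file name 99-elasticsearch.conf are made up for the example, and it assumes a distribution that reads /etc/sysctl.d/, as current Ubuntu and Debian releases do.

```shell
#!/bin/sh
# Return success when the given vm.max_map_count value meets the requirement.
# Usage: map_count_ok <current_value> [required_value]
map_count_ok() {
    current=$1
    required=${2:-262144}
    [ "$current" -ge "$required" ]
}

# Example use on the Docker host (needs sudo):
#   if ! map_count_ok "$(sysctl -n vm.max_map_count)"; then
#       echo "vm.max_map_count=262144" | sudo tee /etc/sysctl.d/99-elasticsearch.conf
#       sudo sysctl --system   # reload all sysctl configuration files
#   fi
```

Because tee is used without -a here, rerunning the snippet overwrites the drop-in file instead of growing it.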

Step 4: Start the Stack

docker compose up -d

Watch the logs to make sure everything starts cleanly:

docker compose logs -f

The first startup takes a minute or two while Elasticsearch initializes its indices. Wait until you see Tube Archivist’s web server reporting ready before trying to access the UI.

Check that all three containers are up:

docker compose ps

You should see tubearchivist, archivist-redis, and archivist-es all running.
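
If you would rather script the wait than watch the logs, a small retry loop can poll the web UI until it answers. The helper name wait_until is hypothetical, and probing http://localhost:8000/ with curl is an assumption based on the port published in the Compose file above.

```shell
#!/bin/sh
# Retry a command until it succeeds or the attempts run out.
# Usage: wait_until <max_attempts> <delay_seconds> <command> [args...]
wait_until() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@" >/dev/null 2>&1; then
            return 0        # the command succeeded
        fi
        i=$(( i + 1 ))
        sleep "$delay"
    done
    return 1                # gave up after max_attempts tries
}

# Example: poll the UI every 5 seconds, for up to 30 attempts.
#   wait_until 30 5 curl -fsS http://localhost:8000/ && echo "Tube Archivist is up"
```

Passing the probe as a command keeps the loop generic, so the same helper could wrap a different check if the curl probe does not suit your setup.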

Step 5: Access the Web Interface

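Open your browser and navigate to port 8000 on your server, for example http://192.168.1.100:8000 if that is the address you set in TA_HOST.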

Log in with the username and password you set in the Docker Compose file. You’ll be greeted by an empty dashboard — time to add some content.

Step 6: Configure Downloads

Subscribe to Channels

From the dashboard, click Channels in the navigation. Use the search bar or paste a YouTube channel URL directly.

Click Subscribe and Tube Archivist will index the channel’s metadata, including all video titles and thumbnails.

Subscribe to Playlists

You can also subscribe to specific playlists. Go to Playlists, paste a playlist URL, and subscribe. This is great for curating specific content without downloading an entire channel.

Download Settings

Navigate to Settings → Downloads to configure:

Video quality: Choose between 720p, 1080p, 1440p, or best available
Video format: mp4 is the safest choice for broad compatibility
Subtitle languages: Download subtitles in your preferred languages
Auto-delete watched: Optionally remove videos after you’ve watched them
Rate limiting: Add a delay between downloads to avoid getting throttled
SponsorBlock: Enable to automatically mark or skip sponsored segments
Cookies: Import browser cookies if you need access to age-restricted or members-only content

Recommended Settings for New Users

Start conservative:

Quality: 720p (saves massive storage, still looks good)
Rate limit: 1 download every 5 seconds
Subtitles: Enable for English (or your language)
SponsorBlock: Enable — it’s one of the best features

Step 7: Schedule Automatic Downloads

Tube Archivist can automatically check for new videos from your subscribed channels. Go to Settings → Scheduling and configure:

Check for new videos: How often to scan channels (daily recommended)
Download new videos: Run the download queue automatically
Check reindex: Periodically refresh metadata for existing videos

A typical schedule:

Check subscriptions: Daily at 3 AM
Run downloads: Daily at 4 AM
Reindex: Weekly on Sundays

This way, new videos from your subscribed channels appear in your archive automatically.

Step 8: Browser Extension (Optional)

Install the Tube Archivist Companion browser extension for Firefox or Chrome. It adds a download button directly on YouTube pages, so you can add videos to your queue while browsing.

Configure the extension with your Tube Archivist URL and API key (found in Settings → Application → API).

Integrating with Jellyfin

If you already run Jellyfin, you can connect it to your Tube Archivist library using the Jellyfin Plugin:

In Jellyfin, add the Tube Archivist plugin repository
Install the Tube Archivist plugin
Configure it with your Tube Archivist URL and API key
Add a new library in Jellyfin pointing to your /youtube media directory

This gives you the best of both worlds: Tube Archivist’s YouTube-specific features plus Jellyfin’s playback capabilities and client apps.

Setting Up a Reverse Proxy

To access Tube Archivist over HTTPS with a proper domain, add it to your reverse proxy. Here’s an Nginx Proxy Manager configuration:

Add a new proxy host
Domain: tube.yourdomain.com
Forward hostname: Your server’s local IP
Forward port: 8000
Enable WebSocket support: Yes
SSL: Request a new Let’s Encrypt certificate

Update the TA_HOST environment variable in your Docker Compose file to match your domain:

- TA_HOST=tube.yourdomain.com

Then restart the stack:

docker compose down && docker compose up -d

Backup Strategy

Your Tube Archivist data lives in two places:

data/media: The actual video files — this is the bulk of your storage
data/es: Elasticsearch indices containing metadata, search data, and settings

Elasticsearch Snapshots

Tube Archivist has a built-in snapshot feature. Go to Settings → Application → Snapshot to create and restore Elasticsearch backups. Schedule these regularly — losing your ES data means losing all your metadata, watch history, and settings.

Video File Backup

For the media files, use whatever backup solution you prefer — rsync to a NAS, rclone to cloud storage, or a dedicated backup tool like Restic:

restic -r /path/to/backup backup ~/tube-archivist/data/media
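
Restic needs an initialized repository and a password before that command will work, and old snapshots accumulate unless you prune them. The following sketch wraps the backup and a pruning step; the function names, the DRY_RUN switch, and the four-weekly retention are assumptions chosen for illustration, not recommendations from the Tube Archivist project.

```shell
#!/bin/sh
# Run a command, or just print it when DRY_RUN is set (useful for a first check).
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Back up the media directory, then drop snapshots outside the retention window.
# Usage: backup_media <restic_repo_path> <media_directory>
# Assumes the repo was created once with `restic -r <repo> init` and that
# RESTIC_PASSWORD or RESTIC_PASSWORD_FILE is set in the environment.
backup_media() {
    repo=$1
    media_dir=$2
    run restic -r "$repo" backup "$media_dir" &&
        run restic -r "$repo" forget --keep-weekly 4 --prune
}

# Example:
#   DRY_RUN=1 backup_media /path/to/backup ~/tube-archivist/data/media
```

Running it once with DRY_RUN=1 prints the two restic commands without touching the repository, which is a cheap way to confirm the paths before scheduling it.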

Troubleshooting

Elasticsearch Won’t Start

Symptom: The archivist-es container keeps restarting.

Fix: Check the vm.max_map_count setting:

sysctl vm.max_map_count

If it’s below 262144, set it:

sudo sysctl -w vm.max_map_count=262144

Also check permissions on the ES data directory:

sudo chown -R 1000:0 ~/tube-archivist/data/es

Downloads Failing

Symptom: Videos get stuck in the queue or fail to download.

Common causes:

Rate limiting: YouTube throttles aggressive downloaders. Increase the delay between downloads.
yt-dlp outdated: Changes on YouTube’s side regularly break older yt-dlp versions. yt-dlp ships inside the Tube Archivist image, so pull the latest image and recreate the container to pick up a newer version: docker compose pull && docker compose up -d
Geo-restricted content: Some videos are region-locked. Use a VPN on the server or configure a proxy in settings.
Age-restricted content: Import browser cookies via Settings → Downloads → Cookie.

High Memory Usage

Elasticsearch is the main memory consumer. If your server is struggling:

Reduce ES_JAVA_OPTS to -Xms256m -Xmx256m (minimum viable)
Don’t run other memory-heavy services on the same machine
Consider upgrading your server’s RAM — 4GB is the minimum, 8GB is comfortable

Videos Not Playing in Browser

Symptom: Videos show in the library but won’t play.

Fix: Check the video codec. Some browsers can’t play certain formats. In Settings → Downloads, set the format to mp4 with h264 video codec for maximum compatibility.

Performance Tips

Use an SSD for Elasticsearch: ES performance depends heavily on disk speed. Keep data/es on an SSD even if your media files are on spinning drives.
Separate media storage: Mount a large HDD or NAS share at data/media for cost-effective video storage.
Allocate more ES memory: If you have 16GB+ RAM, set -Xms2g -Xmx2g for noticeably faster search.
Use 720p as default: Unless you need 4K archival, 720p saves enormous amounts of storage with minimal quality loss for most content.

Conclusion

Tube Archivist is one of the most complete self-hosted YouTube archiving solutions available. It combines yt-dlp’s powerful downloading engine with Elasticsearch’s search capabilities and wraps it in a genuinely pleasant web interface. Once you start building your archive, you’ll wonder how you ever relied on YouTube’s mercy to keep your favorite content available.

Start small — subscribe to a few channels you care about, set conservative download settings, and let it run. As your archive grows, you’ll find yourself using YouTube less and your own server more.
