YouTube videos disappear all the time. Channels get deleted, videos go private, content gets region-locked, and suddenly that tutorial or lecture you bookmarked six months ago is gone forever. Tube Archivist solves this by letting you build your own personal YouTube archive — complete with metadata, thumbnails, subtitles, and a slick search interface.
What is Tube Archivist?
Tube Archivist is an open-source, self-hosted YouTube media server. Think of it as Plex or Jellyfin, but specifically designed for YouTube content. It downloads videos using yt-dlp under the hood, indexes everything in Elasticsearch for fast full-text search, and presents it all through a clean web interface.
Why Archive YouTube Content?
- Videos disappear: Channels get banned, videos go private, copyright strikes remove content
- Offline access: Watch downloaded content without an internet connection
- No ads: Archived videos play without interruptions
- Full-text search: Search across video titles, descriptions, and subtitles
- Organize by channel: Automatically sorts content by channel and playlist
- Metadata preservation: Keeps thumbnails, descriptions, timestamps, and subtitles
- Sponsor segments: Integrates with SponsorBlock to skip sponsored sections
- Cast support: Stream archived videos to your TV
How It Compares
| Feature | Tube Archivist | yt-dlp alone | Jellyfin + yt-dlp |
|---|---|---|---|
| Web UI | ✅ Beautiful | ❌ CLI only | ✅ Generic media UI |
| Auto-download | ✅ Scheduled | ❌ Manual | ❌ Manual |
| Search | ✅ Full-text | ❌ None | ⚠️ Basic |
| Channel tracking | ✅ Subscribe | ❌ None | ❌ None |
| Subtitles | ✅ Indexed | ✅ Downloaded | ⚠️ Displayed only |
| SponsorBlock | ✅ Built-in | ❌ Plugin | ❌ Plugin |
Prerequisites
Before starting, you’ll need:
- A Linux server (Ubuntu 22.04+ or Debian 12+ recommended)
- Docker and Docker Compose installed
- At least 4GB RAM (Elasticsearch is memory-hungry)
- Plenty of storage — video files add up fast (plan for 500GB+ if you’re serious)
- Basic command line knowledge
Storage Planning
YouTube videos vary wildly in size. A rough guide:
- 720p video (10 min): ~150MB
- 1080p video (10 min): ~300MB
- 4K video (10 min): ~1GB+
- Typical channel (100 videos): 15-50GB at 720p
Plan your storage accordingly. A 2TB drive gives you room for roughly 5,000-10,000 videos at 720p, which is a substantial archive.
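To sanity-check these numbers for your own drive, the arithmetic can be sketched in a few lines of shell. The function name estimate_videos is made up for this example, and the per-video sizes are the rough figures from the list above, not measurements.

```shell
#!/bin/sh
# estimate_videos CAPACITY_GB AVG_MB_PER_VIDEO
# Prints how many videos fit, using decimal units (1 GB = 1000 MB)
# and integer division, so the result is rounded down.
estimate_videos() {
  capacity_gb=$1
  avg_mb=$2
  echo $(( capacity_gb * 1000 / avg_mb ))
}

# A 2TB drive with the rough per-video sizes from the list above
estimate_videos 2000 150   # ~150MB per 10-minute 720p video
estimate_videos 2000 300   # ~300MB per 10-minute 1080p video
```

The 720p result is an upper bound for videos that are exactly ten minutes long. Real channels mix in longer uploads, which is why the estimate in the text is lower.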
Step 1: Create the Project Directory
mkdir -p ~/tube-archivist
cd ~/tube-archivist
Create directories for persistent data:
mkdir -p data/media data/cache data/redis data/es
The media directory is where your downloaded videos will live. You can point this to a separate drive or NAS mount if you have one.
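Before you commit to a location for the media directory, it may help to check how much room the underlying filesystem actually has. The sketch below is one way to do that with df; the free_gb helper and the MEDIA_DIR variable are hypothetical names used only for this example.

```shell
#!/bin/sh
# free_gb PATH
# Prints the free space in whole GiB (rounded down) on the filesystem holding PATH.
free_gb() {
  # -P forces the portable one-line-per-filesystem format; -k reports 1024-byte blocks.
  # Column 4 of the second line is the available space.
  df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024 / 1024) }'
}

# MEDIA_DIR is a hypothetical variable; it defaults to the current directory
media_dir="${MEDIA_DIR:-.}"
echo "Free space under $media_dir: $(free_gb "$media_dir") GiB"
```

Run it with MEDIA_DIR=~/tube-archivist/data/media, or with the mount point of the drive you plan to use.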
Step 2: Set Up Docker Compose
Create the docker-compose.yml file:
nano docker-compose.yml
Paste the following configuration:
version: "3.9"
services:
  tubearchivist:
    container_name: tubearchivist
    image: bbilly1/tubearchivist:latest
    restart: unless-stopped
    ports:
      - "8000:8000"
    volumes:
      - ./data/media:/youtube
      - ./data/cache:/cache
    environment:
      - ES_URL=http://archivist-es:9200
      - REDIS_HOST=archivist-redis
      - HOST_UID=1000
      - HOST_GID=1000
      - TA_HOST=tubearchivist.local
      - TA_USERNAME=admin
      - TA_PASSWORD=changeme123
      - ELASTIC_PASSWORD=verysecret
      - TZ=America/New_York
    depends_on:
      - archivist-es
      - archivist-redis
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
      interval: 2m
      timeout: 10s
      retries: 3
      start_period: 30s
  archivist-redis:
    container_name: archivist-redis
    image: redis/redis-stack-server:latest
    restart: unless-stopped
    volumes:
      - ./data/redis:/data
    depends_on:
      - archivist-es
  archivist-es:
    container_name: archivist-es
    image: bbilly1/tubearchivist-es:latest
    restart: unless-stopped
    environment:
      - "ELASTIC_PASSWORD=verysecret"
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
      - "xpack.security.enabled=true"
      - "discovery.type=single-node"
      - "path.repo=/usr/share/elasticsearch/data/snapshot"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    volumes:
      - ./data/es:/usr/share/elasticsearch/data
Important Configuration Notes
- TA_HOST: Set this to the hostname or IP you'll access the UI from. Use your server's IP (e.g., 192.168.1.100) or a domain name if you're using a reverse proxy.
- TA_USERNAME / TA_PASSWORD: Change these from the defaults immediately.
- ELASTIC_PASSWORD: Must match in both the tubearchivist and archivist-es services.
- ES_JAVA_OPTS: The -Xms512m -Xmx512m setting allocates 512MB to Elasticsearch. Increase to 1g or 2g if you have RAM to spare; search performance improves significantly.
- HOST_UID / HOST_GID: Set these to match your user's UID/GID so file permissions work correctly. Check with id -u and id -g.
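If you'd rather not invent passwords by hand, something like the following can print values to paste into the compose file. It is only a sketch: gen_password is a made-up helper, and reading /dev/urandom is just one of several reasonable ways to get random bytes on Linux.

```shell
#!/bin/sh
# gen_password
# Prints 32 hex characters (16 random bytes) read from /dev/urandom.
gen_password() {
  head -c 16 /dev/urandom | od -An -tx1 | tr -d ' \n'
}

# Values for the environment section of docker-compose.yml
echo "HOST_UID=$(id -u)"
echo "HOST_GID=$(id -g)"
echo "TA_PASSWORD=$(gen_password)"
echo "ELASTIC_PASSWORD=$(gen_password)"
```

Remember that the same ELASTIC_PASSWORD value has to go into both the tubearchivist and archivist-es services.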
Step 3: Prepare Elasticsearch
Elasticsearch needs a kernel parameter adjustment to run properly:
sudo sysctl -w vm.max_map_count=262144
Make it permanent:
echo "vm.max_map_count=262144" | sudo tee -a /etc/sysctl.conf
Set proper permissions on the Elasticsearch data directory:
sudo chown -R 1000:0 data/es
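To confirm the kernel setting took effect before starting the stack, a small check like this can help. The map_count_ok function is a hypothetical helper; the threshold is the 262144 value used above.

```shell
#!/bin/sh
REQUIRED_MAP_COUNT=262144

# map_count_ok VALUE
# Succeeds when VALUE is at least the minimum Elasticsearch needs.
map_count_ok() {
  [ "$1" -ge "$REQUIRED_MAP_COUNT" ]
}

# Read the live value; fall back to 0 if the file is missing (non-Linux host)
current=$(cat /proc/sys/vm/max_map_count 2>/dev/null || echo 0)
if map_count_ok "$current"; then
  echo "vm.max_map_count is $current: OK"
else
  echo "vm.max_map_count is $current: too low, rerun the sysctl command"
fi
```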
Step 4: Start the Stack
docker compose up -d
Watch the logs to make sure everything starts cleanly:
docker compose logs -f
The first startup takes a minute or two while Elasticsearch initializes its indices. Wait until you see Tube Archivist’s web server reporting ready before trying to access the UI.
Check that all three containers are healthy:
docker compose ps
You should see tubearchivist, archivist-redis, and archivist-es all running.
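If you are scripting the setup, you may want to wait for the web server instead of watching logs by hand. Here is a minimal retry loop, assuming the health endpoint from the compose healthcheck is reachable from the host on port 8000. The wait_until name is invented for this example.

```shell
#!/bin/sh
# wait_until TRIES DELAY COMMAND...
# Runs COMMAND up to TRIES times, sleeping DELAY seconds between attempts.
# Succeeds as soon as COMMAND succeeds; fails if every attempt fails.
wait_until() {
  tries=$1
  delay=$2
  shift 2
  i=0
  while [ "$i" -lt "$tries" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Example: poll for up to 5 minutes (60 tries, 5 seconds apart)
# wait_until 60 5 curl -fsS http://localhost:8000/health && echo "Tube Archivist is up"
```

Taking the probe as a command keeps the loop generic, so the same function works for any check you want to retry.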
Step 5: Access the Web Interface
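Open your browser and navigate to port 8000 on the host you set in TA_HOST, for example http://192.168.1.100:8000 if that is your server's IP.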
Log in with the username and password you set in the Docker Compose file. You’ll be greeted by an empty dashboard — time to add some content.
Step 6: Configure Downloads
Subscribe to Channels
From the dashboard, click Channels in the navigation. Use the search bar or paste a YouTube channel URL directly.
Click Subscribe and Tube Archivist will index the channel’s metadata, including all video titles and thumbnails.
Subscribe to Playlists
You can also subscribe to specific playlists. Go to Playlists, paste a playlist URL, and subscribe. This is great for curating specific content without downloading an entire channel.
Download Settings
Navigate to Settings → Downloads to configure:
- Video quality: Choose between 720p, 1080p, 1440p, or best available
- Video format: mp4 is the safest choice for broad compatibility
- Subtitle languages: Download subtitles in your preferred languages
- Auto-delete watched: Optionally remove videos after you’ve watched them
- Rate limiting: Add a delay between downloads to avoid getting throttled
- SponsorBlock: Enable to automatically mark or skip sponsored segments
- Cookies: Import browser cookies if you need access to age-restricted or members-only content
Recommended Settings for New Users
Start conservative:
- Quality: 720p (saves massive storage, still looks good)
- Rate limit: 1 download every 5 seconds
- Subtitles: Enable for English (or your language)
- SponsorBlock: Enable — it’s one of the best features
Step 7: Schedule Automatic Downloads
Tube Archivist can automatically check for new videos from your subscribed channels. Go to Settings → Scheduling and configure:
- Check for new videos: How often to scan channels (daily recommended)
- Download new videos: Run the download queue automatically
- Check reindex: Periodically refresh metadata for existing videos
A typical schedule:
- Check subscriptions: Daily at 3 AM
- Run downloads: Daily at 4 AM
- Reindex: Weekly on Sundays
This way, new videos from your subscribed channels appear in your archive automatically.
Step 8: Browser Extension (Optional)
Install the Tube Archivist Companion browser extension for Firefox or Chrome. It adds a download button directly on YouTube pages, so you can add videos to your queue while browsing.
Configure the extension with your Tube Archivist URL and API key (found in Settings → Application → API).
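The same API key can be used from the command line to check that the server accepts it. Treat the sketch below as an assumption-heavy example: it assumes the API takes the key in an "Authorization: Token" header and exposes a /api/ping/ endpoint, so verify both against the API documentation for your version. The ta_api_url helper, TA_URL, and TA_TOKEN are placeholder names.

```shell
#!/bin/sh
# ta_api_url BASE_URL ENDPOINT
# Joins the base URL and an endpoint path without doubling the slash.
ta_api_url() {
  printf '%s/%s\n' "${1%/}" "${2#/}"
}

TA_URL="${TA_URL:-http://192.168.1.100:8000}"        # placeholder address
TA_TOKEN="${TA_TOKEN:-replace-with-your-api-key}"    # placeholder key

# Assumed endpoint and header format; uncomment once the values above are real.
# curl -fsS -H "Authorization: Token $TA_TOKEN" "$(ta_api_url "$TA_URL" /api/ping/)"
```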
Integrating with Jellyfin
If you already run Jellyfin, you can connect it to your Tube Archivist library using the Jellyfin Plugin:
- In Jellyfin, add the Tube Archivist plugin repository
- Install the Tube Archivist plugin
- Configure it with your Tube Archivist URL and API key
- Add a new library in Jellyfin pointing to your /youtube media directory
This gives you the best of both worlds: Tube Archivist’s YouTube-specific features plus Jellyfin’s playback capabilities and client apps.
Setting Up a Reverse Proxy
To access Tube Archivist over HTTPS with a proper domain, add it to your reverse proxy. Here’s an Nginx Proxy Manager configuration:
- Add a new proxy host
- Domain: tube.yourdomain.com
- Forward hostname: Your server's local IP
- Forward port: 8000
- Enable WebSocket support: Yes
- SSL: Request a new Let’s Encrypt certificate
Update the TA_HOST environment variable in your Docker Compose file to match your domain:
- TA_HOST=tube.yourdomain.com
Then restart the stack:
docker compose down && docker compose up -d
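If you change hostnames often, the TA_HOST edit can be scripted rather than done in an editor. This is a rough sketch with a made-up set_ta_host function. It assumes the new host contains no "|" or "&" characters, since those are special in the sed expression.

```shell
#!/bin/sh
# set_ta_host FILE HOST
# Rewrites the value of every TA_HOST= entry in FILE to HOST.
set_ta_host() {
  file=$1
  host=$2
  tmp=$(mktemp) || return 1
  # Keep "TA_HOST=" and anything before it (indentation, "- "), swap only the value.
  # Writing back with cat preserves the original file's owner and permissions.
  sed "s|\(TA_HOST=\).*|\1$host|" "$file" > "$tmp" && cat "$tmp" > "$file"
  rm -f "$tmp"
}

# Example:
# set_ta_host docker-compose.yml tube.yourdomain.com
```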
Backup Strategy
Your Tube Archivist data lives in two places:
- data/media: The actual video files; this is the bulk of your storage
- data/es: Elasticsearch indices containing metadata, search data, and settings
Elasticsearch Snapshots
Tube Archivist has a built-in snapshot feature. Go to Settings → Application → Snapshot to create and restore Elasticsearch backups. Schedule these regularly — losing your ES data means losing all your metadata, watch history, and settings.
Video File Backup
For the media files, use whatever backup solution you prefer — rsync to a NAS, rclone to cloud storage, or a dedicated backup tool like Restic:
restic -r /path/to/backup backup ~/tube-archivist/data/media
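A one-off backup command is easy to forget, so you may want to wrap it in a script that cron can run. The sketch below adds a retention step with restic forget. The 7-daily and 4-weekly retention numbers, the backup_media and run helpers, and the DRY_RUN switch are all assumptions for illustration, and restic still needs an initialized repository and its password (for example via RESTIC_PASSWORD).

```shell
#!/bin/sh
# run COMMAND...
# Prints the command when DRY_RUN=1, otherwise executes it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# backup_media REPO SOURCE_DIR
# Backs up SOURCE_DIR into the restic repository at REPO, then trims old snapshots.
backup_media() {
  repo=$1
  src=$2
  run restic -r "$repo" backup "$src" || return 1
  # Retention policy is an example; adjust to taste
  run restic -r "$repo" forget --keep-daily 7 --keep-weekly 4 --prune
}

# Example:
# backup_media /path/to/backup ~/tube-archivist/data/media
```

Setting DRY_RUN=1 first prints the restic commands without touching the repository, which is a cheap way to check the paths.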
Troubleshooting
Elasticsearch Won’t Start
Symptom: The archivist-es container keeps restarting.
Fix: Check the vm.max_map_count setting:
sysctl vm.max_map_count
If it’s below 262144, set it:
sudo sysctl -w vm.max_map_count=262144
Also check permissions on the ES data directory:
sudo chown -R 1000:0 ~/tube-archivist/data/es
Downloads Failing
Symptom: Videos get stuck in the queue or fail to download.
Common causes:
- Rate limiting: YouTube throttles aggressive downloaders. Increase the delay between downloads.
- yt-dlp outdated: YouTube changes regularly break older yt-dlp releases. yt-dlp ships inside the Tube Archivist image, so pull the latest image and recreate the containers to update it:
docker compose pull && docker compose up -d
- Geo-restricted content: Some videos are region-locked. Use a VPN on the server or configure a proxy in settings.
- Age-restricted content: Import browser cookies via Settings → Downloads → Cookie.
High Memory Usage
Elasticsearch is the main memory consumer. If your server is struggling:
- Reduce ES_JAVA_OPTS to -Xms256m -Xmx256m (minimum viable)
- Don't run other memory-heavy services on the same machine
- Consider upgrading your server’s RAM — 4GB is the minimum, 8GB is comfortable
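As a starting point for picking a heap size, the guidance in this article can be turned into a small lookup. The tiers below are a rough reading of that guidance, and the 1g tier for 8GB machines is an assumption, so treat the output as a suggestion. The suggest_heap name is invented for this example.

```shell
#!/bin/sh
# suggest_heap RAM_GB
# Prints an ES_JAVA_OPTS value for a machine with RAM_GB of memory.
suggest_heap() {
  total_gb=$1
  if [ "$total_gb" -ge 16 ]; then
    echo "-Xms2g -Xmx2g"
  elif [ "$total_gb" -ge 8 ]; then
    echo "-Xms1g -Xmx1g"        # assumed middle tier
  elif [ "$total_gb" -ge 4 ]; then
    echo "-Xms512m -Xmx512m"    # the default used in the compose file
  else
    echo "-Xms256m -Xmx256m"    # minimum viable
  fi
}

# MemTotal is reported in kB; round to the nearest whole GB.
# Falls back to 4 if /proc/meminfo is unavailable.
ram_gb=$(awk '/^MemTotal:/ { printf "%d", $2 / 1024 / 1024 + 0.5 }' /proc/meminfo 2>/dev/null)
echo "ES_JAVA_OPTS=$(suggest_heap "${ram_gb:-4}")"
```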
Videos Not Playing in Browser
Symptom: Videos show in the library but won’t play.
Fix: Check the video codec. Some browsers can’t play certain formats. In Settings → Downloads, set the format to mp4 with h264 video codec for maximum compatibility.
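To see which codec a downloaded file actually uses, ffprobe (part of FFmpeg) can report it. This sketch assumes ffprobe is installed on the host and that your files sit under the media directory from Step 1. The video_codec and is_browser_safe helpers are hypothetical names.

```shell
#!/bin/sh
# video_codec FILE
# Prints the codec name of the first video stream (requires ffprobe).
video_codec() {
  ffprobe -v error -select_streams v:0 \
    -show_entries stream=codec_name \
    -of default=noprint_wrappers=1:nokey=1 "$1"
}

# is_browser_safe CODEC
# Succeeds for h264, the codec recommended above for maximum compatibility.
is_browser_safe() {
  [ "$1" = "h264" ]
}

# Example: list mp4 files that are not h264
# find ~/tube-archivist/data/media -name '*.mp4' | while read -r f; do
#   is_browser_safe "$(video_codec "$f")" || echo "$f"
# done
```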
Performance Tips
- Use an SSD for Elasticsearch: ES performance depends heavily on disk speed. Keep data/es on an SSD even if your media files are on spinning drives.
- Separate media storage: Mount a large HDD or NAS share at data/media for cost-effective video storage.
- Allocate more ES memory: If you have 16GB+ RAM, set -Xms2g -Xmx2g for noticeably faster search.
- Use 720p as default: Unless you need 4K archival, 720p saves enormous amounts of storage with minimal quality loss for most content.
Conclusion
Tube Archivist is one of the most complete self-hosted YouTube archiving solutions available. It combines yt-dlp's powerful downloading engine with Elasticsearch's search capabilities and wraps it in a genuinely pleasant web interface. Once you start building your archive, you'll wonder how you ever relied on YouTube's mercy to keep your favorite content available.
Start small — subscribe to a few channels you care about, set conservative download settings, and let it run. As your archive grows, you’ll find yourself using YouTube less and your own server more.