Self-hosted Telegram channel archiver with web interface.
Archive messages from Telegram channels with media storage, translation support, full-text search, and RSS feed generation.
- Real-time Archiving: Monitor channels via Telegram folders - add channels by drag-and-drop
- Media Storage: Photos, videos, documents with SHA-256 deduplication
- Translation: Auto-translate non-English content (Google Translate free tier, optional DeepL)
- Full-Text Search: PostgreSQL tsvector-powered search across content and translations
- RSS Feeds: Generate RSS/Atom/JSON feeds for archived channels
- Social Graph: Track forwards, replies, comments, and engagement metrics
- Forward Chain Tracking: Auto-discover channels via forwards, fetch original message context
- Topic Classification: Admin-defined topic taxonomy for message categorization
- Backfill: Automatically fetch historical messages when adding new channels
- Self-Hosted: Complete data sovereignty - your data stays on your infrastructure
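The SHA-256 deduplication listed above can be pictured as content-addressed storage: the digest of a file's bytes becomes its storage key, so identical media maps to a single object. The following is a minimal sketch under that assumption, with an in-memory dict standing in for MinIO; `media_key` and `store_media` are hypothetical names, not the project's actual functions.

```python
import hashlib


def media_key(data: bytes, extension: str = "") -> str:
    """Derive a storage key from the SHA-256 digest of the file content."""
    digest = hashlib.sha256(data).hexdigest()
    return f"{digest}{extension}"


def store_media(store: dict[str, bytes], data: bytes, extension: str = "") -> tuple[str, bool]:
    """Store data under its content hash and return (key, was_new)."""
    key = media_key(data, extension)
    if key in store:
        return key, False  # duplicate: the same bytes are already stored
    store[key] = data
    return key, True


# Hypothetical usage: the second upload of identical bytes is not stored again.
bucket: dict[str, bytes] = {}
first_key, first_new = store_media(bucket, b"example image bytes", ".jpg")
second_key, second_new = store_media(bucket, b"example image bytes", ".jpg")
print(first_key == second_key, first_new, second_new, len(bucket))
```

Keying by digest means the same photo forwarded across many channels would occupy storage only once, whatever its original filename.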
┌─────────────┐     ┌──────────┐     ┌───────────┐     ┌───────────┐
│  Telegram   │────▶│ Listener │────▶│   Redis   │────▶│ Processor │
│   Folders   │     │ Service  │     │  Streams  │     │  Service  │
└─────────────┘     └──────────┘     └───────────┘     └───────────┘
                                                             │
                    ┌──────────┐     ┌───────────┐           │
                    │ Frontend │◀────│    API    │◀──────────┤
                    │ Next.js  │     │  FastAPI  │           │
                    └──────────┘     └───────────┘           ▼
                                                      ┌──────────────┐
                                                      │  PostgreSQL  │
                                                      │    MinIO     │
                                                      └──────────────┘
| Service | Purpose |
|---|---|
| Listener | Connects to Telegram, monitors folder-based channels |
| Processor | Processes messages, downloads media, stores to DB |
| API | FastAPI REST API with search, RSS, and admin endpoints |
| Frontend | Next.js web interface for browsing and search |
| PostgreSQL | Message storage with full-text search |
| Redis | Message queue using Redis Streams |
| MinIO | S3-compatible media storage |
- Docker and Docker Compose (v2.0+)
- Python 3.11+ (for initial Telegram authentication only)
- Telegram API credentials from my.telegram.org/apps
git clone https://github.com/yourusername/tg-archiver.git
cd tg-archiver
# Copy example environment file
cp .env.example .env

Edit .env with your Telegram credentials:
# Required - get from https://my.telegram.org/apps
TELEGRAM_API_ID=your_api_id_here
TELEGRAM_API_HASH=your_api_hash_here
# Change these in production!
POSTGRES_PASSWORD=your_secure_password
JWT_SECRET_KEY=generate_a_64_char_secret_key
JWT_ADMIN_PASSWORD=your_admin_password

Before starting Docker, you need to authenticate with Telegram to create a session file:
# Install dependencies for auth script
pip install telethon python-dotenv
# Run the authentication script
python3 scripts/telegram_auth.py

The script will:
- Ask for your phone number (with country code, e.g., +1234567890)
- Send a verification code to your Telegram app
- Ask for 2FA password if enabled
- Create sessions/listener.session file
- Show your Telegram folders for verification
Example output:
🔐 tg-archiver Telegram Authentication
============================================================
API ID: 12345678
Session file: /path/to/tg-archiver/sessions/listener.session
============================================================
📱 Phone number authentication required
Enter your phone number (with country code, e.g., +1234567890): +1234567890
📤 Sending verification code to +1234567890...
🔑 Enter the verification code you received: 12345
🔐 Signing in...
============================================================
✅ Authentication successful!
============================================================
Logged in as: John Doe
Username: @johndoe
Phone: +1234567890
User ID: 123456789
📁 Telegram Folders on this account:
----------------------------------------
1. [All Chats]
2. Personal (15 chats)
3. Work (8 chats)
----------------------------------------
⚠️ Target folder 'tg-archiver' NOT FOUND
Create a folder named 'tg-archiver' in your Telegram app
and add channels to archive.
- Open Telegram (mobile or desktop)
- Go to Settings → Folders → Create New Folder
- Name it exactly: tg-archiver (or match your FOLDER_ARCHIVE_ALL_PATTERN in .env)
- Add channels you want to archive to this folder
# Build all container images (first time or after updates)
docker-compose build
# Start all services
docker-compose up -d
# Watch logs to verify startup
docker-compose logs -f

Note: The build step compiles the Python services and Next.js frontend. First build takes 3-5 minutes.
| Service | URL | Credentials |
|---|---|---|
| Frontend | http://localhost:3000 | JWT_ADMIN_EMAIL / JWT_ADMIN_PASSWORD |
| API Docs | http://localhost:8000/docs | - |
| MinIO Console | http://localhost:9001 | MINIO_ACCESS_KEY / MINIO_SECRET_KEY |
tg-archiver uses Telegram's native folder feature for channel management - no admin panel needed!
To add a channel:
- Find a channel in Telegram
- Long-press (mobile) or right-click (desktop) → Add to Folder
- Select your tg-archiver folder
- Done! The listener detects changes within 5 minutes

To remove a channel:
- Remove the channel from your tg-archiver folder
- The listener stops monitoring (existing messages remain archived)
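One way to picture how additions and removals could be detected is to snapshot the folder's channel IDs on each poll and compare them with the previous snapshot. This is only a sketch of that idea; the listener's actual mechanism is not described here beyond the 5-minute window, and `diff_folder` and the channel IDs are hypothetical.

```python
def diff_folder(previous: set[int], current: set[int]) -> tuple[set[int], set[int]]:
    """Return (added, removed) channel IDs between two folder snapshots."""
    added = current - previous      # in the folder now, not at the last poll
    removed = previous - current    # in the folder at the last poll, gone now
    return added, removed


# Example: one channel added to the folder and one removed since the last poll.
before = {-1001111111111, -1002222222222}
after = {-1002222222222, -1003333333333}
added, removed = diff_folder(before, after)
print("start monitoring:", sorted(added))
print("stop monitoring:", sorted(removed))
```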
The default folder name is tg-archiver. You can customize it in .env:
FOLDER_ARCHIVE_ALL_PATTERN=my-archive

Note: Telegram folder names are limited to 12 characters.
For importing many channels at once, use the CSV import feature in the admin panel.
- Go to Admin → Import Channels (/admin/import)
- Prepare a CSV file with your channels
- Drag-and-drop or click to upload
- Review the validation results
- Select channels and target folders
- Click Start Import
channel_id,channel_username,target_folder,notes
-1001234567890,channelname,tg-archiver,Optional notes
,anotherchannel,tg-archiver,Username only (ID will be resolved)
-1009876543210,,tg-archiver,ID only

| Column | Required | Description |
|---|---|---|
| channel_id | One of ID or username | Telegram channel ID (negative number) |
| channel_username | One of ID or username | Channel username (without @) |
| target_folder | Yes | Target Telegram folder name |
| notes | No | Optional notes for reference |
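The column rules above can be checked before uploading. The sketch below parses a CSV in this format with Python's standard `csv` module and applies only the rules stated in the table (ID or username required, target folder required); `parse_channel_csv` is a hypothetical helper, and the admin panel's own validation may differ.

```python
import csv
import io


def parse_channel_csv(text: str) -> tuple[list[dict], list[str]]:
    """Parse import rows and return (valid_rows, errors)."""
    rows: list[dict] = []
    errors: list[str] = []
    reader = csv.DictReader(io.StringIO(text))
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        channel_id = (row.get("channel_id") or "").strip()
        username = (row.get("channel_username") or "").strip().lstrip("@")
        folder = (row.get("target_folder") or "").strip()
        if not channel_id and not username:
            errors.append(f"line {line_no}: need channel_id or channel_username")
            continue
        if not folder:
            errors.append(f"line {line_no}: target_folder is required")
            continue
        parsed_id = None
        if channel_id:
            try:
                parsed_id = int(channel_id)
            except ValueError:
                errors.append(f"line {line_no}: channel_id must be an integer")
                continue
        rows.append({
            "channel_id": parsed_id,
            "channel_username": username or None,
            "target_folder": folder,
            "notes": (row.get("notes") or "").strip(),
        })
    return rows, errors


sample = """channel_id,channel_username,target_folder,notes
-1001234567890,channelname,tg-archiver,Optional notes
,anotherchannel,tg-archiver,Username only (ID will be resolved)
-1009876543210,,tg-archiver,ID only
,,tg-archiver,Missing both identifiers
"""
valid_rows, problems = parse_channel_csv(sample)
print(len(valid_rows), "valid rows;", problems)
```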
- Validation - Each channel is validated (exists, accessible, not duplicate)
- Folder Creation - Target folders are created in Telegram if they don't exist
- Channel Addition - Channels are added to the specified folders
- Monitoring Starts - Listener automatically picks up new channels
| Status | Meaning |
|---|---|
| pending | Waiting to be processed |
| validating | Checking channel accessibility |
| valid | Ready to import |
| invalid | Cannot be imported (see error) |
| importing | Being added to folder |
| completed | Successfully imported |
| failed | Import failed (see error) |
| skipped | Skipped (already monitored or deselected) |
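The statuses above suggest a simple lifecycle for each CSV row. The sketch below models them as an enum with a transition table; the status values come from the table, but the allowed transitions are an assumption inferred from the descriptions, not something the project confirms.

```python
from enum import Enum


class ImportStatus(str, Enum):
    PENDING = "pending"
    VALIDATING = "validating"
    VALID = "valid"
    INVALID = "invalid"
    IMPORTING = "importing"
    COMPLETED = "completed"
    FAILED = "failed"
    SKIPPED = "skipped"


# Assumed transitions, inferred from the status descriptions only.
ALLOWED_TRANSITIONS = {
    ImportStatus.PENDING: {ImportStatus.VALIDATING, ImportStatus.SKIPPED},
    ImportStatus.VALIDATING: {ImportStatus.VALID, ImportStatus.INVALID, ImportStatus.SKIPPED},
    ImportStatus.VALID: {ImportStatus.IMPORTING, ImportStatus.SKIPPED},
    ImportStatus.IMPORTING: {ImportStatus.COMPLETED, ImportStatus.FAILED},
}


def can_transition(current: ImportStatus, target: ImportStatus) -> bool:
    """Return True if moving from current to target is allowed in this sketch."""
    return target in ALLOWED_TRANSITIONS.get(current, set())


print(can_transition(ImportStatus.PENDING, ImportStatus.VALIDATING))
print(can_transition(ImportStatus.COMPLETED, ImportStatus.IMPORTING))
```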
tg-archiver automatically discovers new channels through message forwards and enriches the social graph with original message context.
Your Monitored Channel          Discovered Channel (auto-joined)
┌──────────────────────┐        ┌──────────────────────────────┐
│ Message A            │        │ Original Message             │
│ "Forwarded from X"   │───────▶│ - Content cached             │
│ - forward_from_id    │        │ - Reactions fetched          │
│ - propagation time   │        │ - Comments fetched           │
└──────────────────────┘        └──────────────────────────────┘
When a forwarded message arrives in your monitored channels:
- Discovery: The source channel is recorded in discovered_channels
- Auto-Join: Background worker joins the channel (for social data access only)
- Social Fetch: Original message content, reactions, and comments are retrieved
- Admin Review: Discovered channels appear in /admin/channels → Discovered tab
- No archiving of discovered channels - Only social context is fetched
- Propagation timing - Track how fast content spreads (seconds from original to forward)
- Complete social graph - Reactions and comments from the original post
- Admin control - Promote interesting channels to full archiving, or ignore them
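Propagation timing, as described above, is the number of seconds between the original post and its forward. A minimal sketch of that arithmetic, assuming both timestamps are timezone-aware; `propagation_seconds` is a hypothetical helper name.

```python
from datetime import datetime, timezone


def propagation_seconds(original_at: datetime, forwarded_at: datetime) -> float:
    """Seconds elapsed between the original post and its forward."""
    return (forwarded_at - original_at).total_seconds()


original = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
forward = datetime(2024, 1, 1, 12, 3, 30, tzinfo=timezone.utc)
print(propagation_seconds(original, forward))  # 210.0
```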
Navigate to Admin → Channels → Discovered tab to:
| Action | Description |
|---|---|
| Promote | Add channel to full archiving (select category, folder, rule) |
| Ignore | Hide from suggestions, stop social fetching |
| Retry | Retry joining a failed/private channel |
| Variable | Default | Description |
|---|---|---|
| CHANNEL_JOIN_ENABLED | true | Enable auto-joining discovered channels |
| CHANNEL_JOIN_INTERVAL_SECONDS | 60 | Seconds between join attempts |
| CHANNEL_JOIN_BATCH_SIZE | 5 | Max channels to join per cycle |
| CHANNEL_JOIN_MAX_RETRIES | 3 | Retries before marking as failed |
| CHANNEL_JOIN_RETRY_DELAY_HOURS | 24 | Hours between retry attempts |
Social data fetching uses the existing SOCIAL_FETCH_* settings.
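To make the interplay of the join settings concrete, the sketch below reads them from the environment with the documented defaults and decides whether a failed join is due for another attempt. The variable names and defaults come from the table; `JoinSettings`, `load_join_settings`, and `should_retry` are hypothetical, and the real worker's retry logic may differ.

```python
import os
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Mapping


@dataclass(frozen=True)
class JoinSettings:
    enabled: bool
    interval_seconds: int
    batch_size: int
    max_retries: int
    retry_delay_hours: int


def load_join_settings(env: Mapping[str, str] = os.environ) -> JoinSettings:
    """Read CHANNEL_JOIN_* variables, falling back to the documented defaults."""
    return JoinSettings(
        enabled=env.get("CHANNEL_JOIN_ENABLED", "true").lower() == "true",
        interval_seconds=int(env.get("CHANNEL_JOIN_INTERVAL_SECONDS", "60")),
        batch_size=int(env.get("CHANNEL_JOIN_BATCH_SIZE", "5")),
        max_retries=int(env.get("CHANNEL_JOIN_MAX_RETRIES", "3")),
        retry_delay_hours=int(env.get("CHANNEL_JOIN_RETRY_DELAY_HOURS", "24")),
    )


def should_retry(attempts: int, last_attempt: datetime, now: datetime,
                 settings: JoinSettings) -> bool:
    """Retry only while under the retry cap and after the delay has passed."""
    if attempts >= settings.max_retries:
        return False
    return now - last_attempt >= timedelta(hours=settings.retry_delay_hours)


settings = load_join_settings({})  # empty mapping: every default applies
now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
print(should_retry(1, now - timedelta(hours=25), now, settings))
```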
| Table | Purpose |
|---|---|
| discovered_channels | Channels found via forwards (status, metadata, admin actions) |
| message_forwards | Links archived messages to their original sources |
| original_messages | Cached content of original messages (graph leaf nodes) |
| forward_reactions | Reactions on original messages |
| forward_comments | Comments on original messages |
GET /api/admin/discovered # List discovered channels
GET /api/admin/discovered/stats # Statistics
GET /api/admin/discovered/{id} # Channel details + recent forwards
POST /api/admin/discovered/{id}/promote # Promote to full archiving
POST /api/admin/discovered/{id}/ignore # Mark as ignored
POST /api/admin/discovered/{id}/retry # Retry failed join
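The endpoints above can be called from a script. The sketch below builds a request with the standard library only; the base URL matches the local API address used elsewhere in this README, while the `Authorization: Bearer` header and the empty POST body are assumptions (the exact auth scheme and any required payload for promote are not specified here). `build_admin_request` is a hypothetical helper.

```python
import urllib.request


def build_admin_request(base_url: str, path: str, token: str,
                        method: str = "GET") -> urllib.request.Request:
    """Build a request for an admin endpoint without sending it."""
    url = f"{base_url.rstrip('/')}{path}"
    headers = {"Authorization": f"Bearer {token}"}  # assumed auth scheme
    data = b"" if method == "POST" else None        # assumed: no request body
    return urllib.request.Request(url, data=data, headers=headers, method=method)


request = build_admin_request(
    "http://localhost:8000", "/api/admin/discovered/42/ignore", "YOUR_TOKEN", "POST"
)
print(request.get_method(), request.full_url)
# urllib.request.urlopen(request) would send it; omitted so the sketch runs offline.
```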
| Variable | Description |
|---|---|
| TELEGRAM_API_ID | Telegram API ID from my.telegram.org |
| TELEGRAM_API_HASH | Telegram API Hash from my.telegram.org |
| Variable | Default | Description |
|---|---|---|
| POSTGRES_HOST | postgres | PostgreSQL hostname |
| POSTGRES_PORT | 5432 | PostgreSQL port |
| POSTGRES_DB | tg_archiver | Database name |
| POSTGRES_USER | archiver | Database user |
| POSTGRES_PASSWORD | - | Database password (change in production) |
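As an illustration of how these variables fit together, the sketch below assembles a PostgreSQL connection URL from them, using the documented defaults and treating the password as required. `postgres_dsn` is a hypothetical helper; how the services actually build their connection string is not shown in this README.

```python
import os
from typing import Mapping
from urllib.parse import quote


def postgres_dsn(env: Mapping[str, str] = os.environ) -> str:
    """Build a postgresql:// URL from POSTGRES_* variables."""
    host = env.get("POSTGRES_HOST", "postgres")
    port = env.get("POSTGRES_PORT", "5432")
    database = env.get("POSTGRES_DB", "tg_archiver")
    user = env.get("POSTGRES_USER", "archiver")
    password = env["POSTGRES_PASSWORD"]  # no default: raises KeyError if unset
    # Percent-encode credentials so characters like '@' or ':' stay unambiguous.
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{database}"
    )


print(postgres_dsn({"POSTGRES_PASSWORD": "p@ss"}))
```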
| Variable | Default | Description |
|---|---|---|
| AUTH_PROVIDER | jwt | jwt for local auth, none to disable |
| JWT_SECRET_KEY | - | Secret for JWT signing (change in production) |
| JWT_ADMIN_EMAIL | admin@tg-archiver.local | Admin login email |
| JWT_ADMIN_PASSWORD | - | Admin login password |
| JWT_EXPIRATION_MINUTES | 60 | Token expiration time |
| Variable | Default | Description |
|---|---|---|
| MINIO_ENDPOINT | minio:9000 | MinIO server address |
| MINIO_ACCESS_KEY | minioadmin | MinIO access key |
| MINIO_SECRET_KEY | minioadmin | MinIO secret key (change in production) |
| MINIO_BUCKET_NAME | tg-archive-media | Bucket for media files |
| Variable | Default | Description |
|---|---|---|
| TRANSLATION_ENABLED | true | Enable auto-translation |
| DEEPL_API_KEY | - | DeepL API key (uses Google Translate if not set) |
| Variable | Default | Description |
|---|---|---|
| BACKFILL_ENABLED | true | Enable historical message backfill |
| BACKFILL_START_DATE | 2024-01-01 | How far back to fetch |
| BACKFILL_MODE | on_discovery | manual, on_discovery, or scheduled |
| BACKFILL_BATCH_SIZE | 100 | Messages per batch |
| BACKFILL_DELAY_MS | 1000 | Delay between batches (rate limiting) |
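The batch size and delay settings describe a simple pacing pattern: fetch a fixed number of messages, pause, then continue. The generator below is a sketch of that pattern with the documented defaults; `backfill_batches` is a hypothetical name, message IDs stand in for real Telegram fetches, and the `sleep` parameter exists only so the pause can be replaced in tests.

```python
import time
from typing import Callable, Iterable, Iterator


def backfill_batches(message_ids: Iterable[int], batch_size: int = 100,
                     delay_ms: int = 1000,
                     sleep: Callable[[float], None] = time.sleep) -> Iterator[list[int]]:
    """Yield message IDs in batches, pausing between batches (not before the first)."""
    batch: list[int] = []
    yielded = False
    for message_id in message_ids:
        batch.append(message_id)
        if len(batch) == batch_size:
            if yielded:
                sleep(delay_ms / 1000)
            yield batch
            yielded = True
            batch = []
    if batch:  # final partial batch
        if yielded:
            sleep(delay_ms / 1000)
        yield batch


pauses: list[float] = []
sizes = [len(b) for b in backfill_batches(range(250), sleep=pauses.append)]
print(sizes, pauses)
```

With the defaults, 250 messages are split into batches of 100, 100, and 50, with a one-second pause before the second and third batch.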
| Variable | Default | Description |
|---|---|---|
| NEXT_PUBLIC_API_URL | (empty) | API URL for browser requests. Empty = relative URLs (use when behind proxy) |
# All services
docker-compose logs -f
# Specific service
docker-compose logs -f listener
docker-compose logs -f processor
docker-compose logs -f api

# Restart everything
docker-compose restart
# Restart specific service
docker-compose restart listener

docker-compose down

git pull
docker-compose build
docker-compose up -d

If your session expires or you need to switch accounts:
# Remove old session
rm sessions/listener.session
# Re-run authentication
python3 scripts/telegram_auth.py
# Restart listener
docker-compose restart listener

The listener can't find the Telegram session file.
# Check if session exists
ls -la sessions/
# If missing, create it
python3 scripts/telegram_auth.py

The listener can't find your archive folder in Telegram.
- Verify folder name matches FOLDER_ARCHIVE_ALL_PATTERN in .env
- Folder name matching is case-insensitive, but the name must otherwise match exactly
- Re-run python3 scripts/telegram_auth.py to see current folders
If you see CORS errors when accessing the frontend:
- Ensure NEXT_PUBLIC_API_URL is empty (for proxied setup) or set correctly
- Check that Caddy/nginx is properly routing /api/* to the API service
- Check listener logs: docker-compose logs -f listener
- Verify the channel is in your archive folder
- Check processor logs: docker-compose logs -f processor
- Verify Redis is running: docker-compose logs redis
Telegram rate limiting. The listener handles this automatically by waiting.
# Check listener logs for wait time
docker-compose logs listener | grep -i flood

tg-archiver/
├── services/
│   ├── listener/           # Telegram monitoring service
│   ├── processor/          # Message processing service
│   ├── api/                # FastAPI backend
│   └── frontend/           # Next.js frontend
├── shared/python/          # Shared Python modules (models, utils)
├── infrastructure/
│   └── postgres/           # Database schema (init.sql)
├── scripts/
│   └── telegram_auth.py    # Telegram authentication script
├── sessions/               # Telegram session files (gitignored)
├── docker-compose.yml
├── .env.example
└── .env                    # Your configuration (gitignored)
# Start infrastructure only
docker-compose up -d postgres redis minio minio-init
# Install Python dependencies
cd services/listener
pip install -r requirements.txt
# Run listener locally (for debugging)
POSTGRES_HOST=localhost python -m src.main

cd services/frontend
npm install
npm run dev

Before deploying to production, ensure you've configured proper security:
# Generate JWT secret (64 characters minimum)
openssl rand -hex 64
# Generate Redis password
openssl rand -base64 32
# Generate PostgreSQL password
openssl rand -base64 24

Create or update your .env file:
# Environment mode
ENVIRONMENT=production
# Strong passwords (use generated values above)
POSTGRES_PASSWORD=your_generated_postgres_password
JWT_SECRET_KEY=your_generated_64_char_jwt_secret
JWT_ADMIN_PASSWORD=your_strong_admin_password
REDIS_PASSWORD=your_generated_redis_password
MINIO_SECRET_KEY=your_minio_secret_at_least_32_chars
# Security features
CSRF_ENABLED=true
TOKEN_BLACKLIST_FAIL_MODE=closed

Use the production Caddyfile with automatic HTTPS:
# Set your domain
export DOMAIN=archive.yourdomain.com
export ACME_EMAIL=admin@yourdomain.com
# Use production Caddy config
cp infrastructure/caddy/Caddyfile.production infrastructure/caddy/Caddyfile

# Build with production settings
docker-compose build
# Start services
docker-compose up -d
# Verify all services are healthy
docker-compose ps

| Feature | Description | Config Variable |
|---|---|---|
| HTTPS | TLS 1.2+ with automatic Let's Encrypt | Caddyfile.production |
| Rate Limiting | 5 login attempts per minute | Built-in |
| CSRF Protection | Double-submit cookie pattern | CSRF_ENABLED=true |
| Token Invalidation | Logout blacklists JWT tokens | Built-in |
| Secure Cookies | HttpOnly, SameSite=Strict, Secure | ENVIRONMENT=production |
| Password Policy | Minimum 8 characters | Built-in |
| Non-root Containers | All services run as non-root | Built-in |
| Redis Auth | Password-protected Redis | REDIS_PASSWORD |
Internet → Caddy (HTTPS:443) → Internal Docker Network
                               │
                               ├── /api/*   → api:8000
                               ├── /media/* → minio:9000
                               └── /*       → frontend:3000
Only Caddy is exposed to the internet. All other services communicate internally.
# Database backup
docker-compose exec postgres pg_dump -U archiver tg_archiver > backup.sql
# Media backup (MinIO data)
docker run --rm -v tg-archiver_minio_data:/data -v $(pwd):/backup \
  alpine tar czf /backup/minio-backup.tar.gz /data

AGPL-3.0 License - See LICENSE file for details.