- Add downloadStarted flag to prevent false "stopped unexpectedly" errors
when TDLib emits initial updateFile before download is active
- Add 5-minute stall detection for both downloads and uploads
- Reduce max split part size from 2GiB to 1950MiB to stay under
Telegram's internal upload part count limits
- Increase timeouts from max(10min, 15min/GB) to max(15min, 20min/GB)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
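The timeout and stall rules above can be sketched roughly as follows. This
is an illustration only: the helper names are hypothetical, and "GB" is
assumed to mean GiB (1024^3 bytes), which the commit does not specify.

```typescript
const MINUTE_MS = 60 * 1000;
const GIB = 1024 ** 3; // assumption: "GB" in the commit means GiB

// Hypothetical helper: timeout = max(15 min, 20 min per GB of file size).
function transferTimeoutMs(sizeBytes: number): number {
  const sizeBased = Math.ceil((sizeBytes / GIB) * 20 * MINUTE_MS);
  return Math.max(15 * MINUTE_MS, sizeBased);
}

// Hypothetical stall check: true once no progress update has been seen
// for the stall window (5 minutes by default).
function isStalled(
  lastProgressAtMs: number,
  nowMs: number,
  stallWindowMs: number = 5 * MINUTE_MS,
): boolean {
  return nowMs - lastProgressAtMs >= stallWindowMs;
}
```

Small files fall back to the 15-minute floor; the per-GB term only takes
over above 0.75 GB.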
1. Distinguish failure reasons: inspect error messages to label skipped
packages as DOWNLOAD_FAILED, UPLOAD_FAILED, or EXTRACT_FAILED
instead of the catch-all DOWNLOAD_FAILED.
2. Detect orphaned uploads: before uploading, check if the same content
hash already has a successful upload on the destination channel. Reuse
the existing message ID instead of re-uploading (prevents duplicates
when the worker crashes between upload and DB write).
3. Increase timeouts: download from max(5min, GB*10min) to
max(10min, GB*15min), upload from GB*10min to GB*15min.
Prevents premature timeouts on slow connections.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
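A minimal sketch of the failure labelling in item 1, assuming the worker
matches on keywords in the error message. The function name and the keyword
patterns are hypothetical; the actual messages inspected are not listed in
this commit.

```typescript
type SkipReason = "DOWNLOAD_FAILED" | "UPLOAD_FAILED" | "EXTRACT_FAILED";

// Hypothetical classifier: the keyword patterns are assumptions made for
// illustration, not the strings the worker really checks.
function classifyFailure(err: unknown): SkipReason {
  const message = (err instanceof Error ? err.message : String(err)).toLowerCase();
  if (/upload/.test(message)) return "UPLOAD_FAILED";
  if (/extract|unzip|corrupt/.test(message)) return "EXTRACT_FAILED";
  // Anything unrecognised keeps the previous catch-all label.
  return "DOWNLOAD_FAILED";
}
```

Keeping DOWNLOAD_FAILED as the default means unrecognised errors are
labelled the same way they were before this change.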
getChatHistory fails silently in supergroups that hide history from new
members: it returns only system messages. searchChatMessages with document
and photo filters works regardless of history visibility settings.
Also adds a getChats call after TDLib client creation to populate the chat
list, preventing 'Chat not found' errors.
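
For illustration, the filtered search request might be built as below. This
assumes a tdl-style client where `_` carries the TDLib type name; the
builder function is hypothetical, and only commonly used fields are shown
since the full parameter list of searchChatMessages depends on the TDLib
version.

```typescript
type ArchiveFilter = "searchMessagesFilterDocument" | "searchMessagesFilterPhoto";

interface SearchChatMessagesRequest {
  _: "searchChatMessages";
  chat_id: number;
  query: string;
  from_message_id: number;
  offset: number;
  limit: number;
  filter: { _: ArchiveFilter };
}

// Hypothetical builder: an empty query combined with a filter asks for all
// messages of that kind, paging backwards from fromMessageId (0 = newest).
function buildSearchRequest(
  chatId: number,
  fromMessageId: number,
  filter: ArchiveFilter,
  limit: number = 100,
): SearchChatMessagesRequest {
  return {
    _: "searchChatMessages",
    chat_id: chatId,
    query: "",
    from_message_id: fromMessageId,
    offset: 0,
    limit,
    filter: { _: filter },
  };
}
```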
- Add settled flag to invokeWithTimeout to prevent double-settling
- Create mutex queue entry with wrapped resolve before pushing to queue
Co-authored-by: xCyanGrizzly <53275238+xCyanGrizzly@users.noreply.github.com>
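A rough sketch of the settled-flag idea, not the actual implementation:
whichever of "call finished" or "timer fired" happens first wins, and the
later one becomes a no-op. `createSettleGuard` is a hypothetical helper, and
the `invokeWithTimeout` signature shown here is an assumption.

```typescript
// Hypothetical helper: wraps a resolve/reject pair so that only the first
// call to either one has any effect.
function createSettleGuard<T>(
  onResolve: (value: T) => void,
  onReject: (error: Error) => void,
) {
  let settled = false;
  return {
    resolve(value: T): void {
      if (settled) return;
      settled = true;
      onResolve(value);
    },
    reject(error: Error): void {
      if (settled) return;
      settled = true;
      onReject(error);
    },
  };
}

// Assumed shape of the wrapper: races the call against a timer
// (2 minutes by default) and settles exactly once.
function invokeWithTimeout<T>(
  call: () => Promise<T>,
  timeoutMs: number = 2 * 60 * 1000,
): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const guard = createSettleGuard<T>(resolve, reject);
    const timer = setTimeout(
      () => guard.reject(new Error(`TDLib call timed out after ${timeoutMs} ms`)),
      timeoutMs,
    );
    call().then(
      (value) => { clearTimeout(timer); guard.resolve(value); },
      (err) => { clearTimeout(timer); guard.reject(err instanceof Error ? err : new Error(String(err))); },
    );
  });
}
```

A native Promise already ignores a second settle, so the explicit flag
matters mostly when settling carries side effects, such as releasing a
mutex slot.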
- Add invokeWithTimeout wrapper for TDLib API calls (2min timeout per call)
- Add stuck detection to getChannelMessages: break if from_message_id doesn't advance
- Add stuck detection to getTopicMessages: same protection for topic scanning
- Add stuck detection to getForumTopicList: break if pagination offsets don't advance
- Add max page limit (5000) to all scanning loops to prevent infinite pagination
- Add mutex wait timeout (30min) to prevent indefinite blocking when holder hangs
- Add cycle timeout (4h default, configurable via WORKER_CYCLE_TIMEOUT_MINUTES)
- Fix end-of-page detection to use actual limit value instead of hardcoded 100
Co-authored-by: xCyanGrizzly <53275238+xCyanGrizzly@users.noreply.github.com>
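The loop guards above can be sketched as one pagination helper. This is a
simplified, synchronous illustration: the real TDLib calls are asynchronous,
and `scanAll`, `fetchPage`, and the `StopReason` labels are hypothetical
names.

```typescript
type StopReason = "end" | "stuck" | "max_pages";

// fetchPage is assumed to return message IDs older than fromId
// (0 = start from the newest message), at most `limit` per page.
function scanAll(
  fetchPage: (fromId: number, limit: number) => number[],
  limit: number,
  maxPages: number = 5000,
): { ids: number[]; reason: StopReason } {
  const ids: number[] = [];
  let fromId = 0;
  for (let page = 0; page < maxPages; page++) {
    const batch = fetchPage(fromId, limit);
    if (batch.length === 0) return { ids, reason: "end" };
    const nextFromId = batch[batch.length - 1];
    // Stuck detection: the cursor did not advance, so stop before
    // recording the same page again.
    if (nextFromId === fromId) return { ids, reason: "stuck" };
    ids.push(...batch);
    // End-of-page detection compares against the requested limit,
    // not a hardcoded 100.
    if (batch.length < limit) return { ids, reason: "end" };
    fromId = nextFromId;
  }
  return { ids, reason: "max_pages" };
}
```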
- Add progress callbacks to getChannelMessages and getTopicMessages that
fire after each page of messages is fetched
- Worker now shows channel progress (e.g. "[2/5] Channel Name") when
processing multiple source channels
- Worker now shows topic progress (e.g. "topic 3/12") when scanning forums
- Worker now shows live message scanning count during channel/topic scans
(e.g. "Scanning Channel — 300 messages scanned")
- UI stats line now always shows messagesScanned count
- messagesScanned counter now increments during the scanning phase, not
just during archive processing
Co-authored-by: xCyanGrizzly <53275238+xCyanGrizzly@users.noreply.github.com>
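As a sketch of how the per-page callback and progress labels could fit
together (all function names here are hypothetical, and pages are passed in
directly rather than fetched from TDLib):

```typescript
// Label formats follow the examples in the commit message.
function channelProgress(index: number, total: number, name: string): string {
  return `[${index}/${total}] ${name}`;
}

function topicProgress(index: number, total: number): string {
  return `topic ${index}/${total}`;
}

// Calls onProgress once after each page with the running message count,
// so the counter moves during scanning rather than only afterwards.
function scanPages(
  pages: number[][],
  onProgress?: (messagesScanned: number) => void,
): number {
  let messagesScanned = 0;
  for (const page of pages) {
    messagesScanned += page.length;
    if (onProgress) onProgress(messagesScanned);
  }
  return messagesScanned;
}
```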
Adds full Telegram ZIP ingestion pipeline: TDLib worker service scans source
channels for archive files, deduplicates by content hash, extracts metadata,
uploads to archive channel, and indexes in Postgres. Forum supergroups are
scanned per-topic, with the topic name used as the creator. Filename-based
creator extraction (e.g. "Mammoth Factory - 2026-01.zip") serves as fallback.
Includes admin UI for managing accounts/channels, simplified account setup
(API credentials via env vars), auth code/password submission dialog,
package browser with creator column, and live ingestion activity tracking.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
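The creator fallback described above could look roughly like this. The
regular expression and both function names are assumptions based on the
single "Creator - suffix.zip" example; the real pattern may differ.

```typescript
// Hypothetical parser: takes everything before the first " - " separator
// in a .zip filename as the creator name.
function creatorFromFilename(filename: string): string | null {
  const match = /^(.+?)\s+-\s+.+\.zip$/i.exec(filename);
  return match ? match[1].trim() : null;
}

// The forum topic name wins when present; the filename is only a fallback.
function resolveCreator(topicName: string | undefined, filename: string): string | null {
  if (topicName && topicName.trim() !== "") return topicName.trim();
  return creatorFromFilename(filename);
}
```

Filenames without a " - " separator yield null here, leaving the creator
unset rather than guessed.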