Previously, channels and topics with no new archives never had their
watermark updated, so every cycle re-scanned all of their messages from
scratch only to find nothing new. This was especially costly for the
1079-topic Model Printing Emporium forum.
- Add maxScannedMessageId to ChannelScanResult (highest msg ID seen)
- Set channel watermark to scan boundary when no archives are found
- Set topic watermark to scan boundary when no archives are found
- Fall back to scan watermark when archive processing doesn't advance it
After one full cycle, subsequent cycles will skip already-scanned
messages via the early-exit boundary check, dramatically reducing
TDLib API calls on channels with mostly non-archive content.
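The fallback rule in the bullets above can be sketched roughly as follows. This is an illustration under assumptions, not the worker's actual code: only maxScannedMessageId is named in this commit, while nextWatermark and its parameters are hypothetical, and message IDs are modeled as plain numbers.

```typescript
// Hypothetical helper: decide which watermark to persist after a scan.
//  previous            - watermark stored before this cycle (null if never scanned)
//  archiveWatermark    - highest message ID reached by archive processing (null if none)
//  maxScannedMessageId - highest message ID seen by the scan (null if nothing was scanned)
function nextWatermark(
  previous: number | null,
  archiveWatermark: number | null,
  maxScannedMessageId: number | null,
): number | null {
  const base = previous ?? 0;
  // Archive processing advanced the watermark: keep its value.
  if (archiveWatermark !== null && archiveWatermark > base) {
    return archiveWatermark;
  }
  // Otherwise fall back to the scan boundary so the next cycle can exit early.
  if (maxScannedMessageId !== null && maxScannedMessageId > base) {
    return maxScannedMessageId;
  }
  // Nothing newer was seen: leave the stored watermark untouched.
  return previous;
}
```

With this shape, a channel that only ever contains non-archive messages still moves its watermark forward once per cycle.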
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
searchChatMessages returns newest-first. Once the oldest message on a
page is at or below the lastProcessedMessageId boundary, all remaining
pages are even older. Stop scanning immediately instead of reading every
message in the channel.
This was already implemented for topic scans but missing from channel
scans. On a test run, total messages scanned dropped from 3805 to 1615
(a reduction of about 58%) for an account with no new archives.
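A minimal sketch of the early-exit check described above, assuming pages arrive newest-first. The synchronous FetchPage callback and the scanNewMessages name are hypothetical stand-ins for the real paginated searchChatMessages calls.

```typescript
interface ScannedMessage {
  id: number;
}

// Hypothetical page source: returns page N newest-first, or [] when exhausted.
type FetchPage = (pageIndex: number) => ScannedMessage[];

function scanNewMessages(
  fetchPage: FetchPage,
  lastProcessedMessageId: number,
): { messages: ScannedMessage[]; pagesFetched: number } {
  const messages: ScannedMessage[] = [];
  let pagesFetched = 0;
  for (let page = 0; ; page++) {
    const batch = fetchPage(page);
    pagesFetched++;
    if (batch.length === 0) break;
    // Keep only messages newer than the boundary.
    for (const message of batch) {
      if (message.id > lastProcessedMessageId) messages.push(message);
    }
    // Newest-first ordering: if the oldest message on this page is at or
    // below the boundary, every later page is older still, so stop here.
    const oldest = batch[batch.length - 1];
    if (oldest !== undefined && oldest.id <= lastProcessedMessageId) break;
  }
  return { messages, pagesFetched };
}
```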
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
After TDLib login completes, createTdlibClient calls getMe() to detect
isPremium, persists it to the DB via updateAccountPremiumStatus, and
returns { client, isPremium }. All callers are updated to destructure
the result accordingly.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Group merge UI:
- Add mergeGroups query and mergeGroupsAction server action
- Add "Start Merge" / "Merge Here" buttons to group row actions
- Two-step UX: click Start on source, click Merge Here on target
ZIP path prefix grouping (Signal 7):
- Compare PackageFile.path root folders across ungrouped packages
- Auto-group if 2+ packages share the same dominant root folder
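One way the root-folder comparison could look is sketched below. The commit does not define "dominant", so this assumes it means the top-level folder that holds more than half of a package's file paths. The PackageFiles shape and both function names are hypothetical.

```typescript
interface PackageFiles {
  id: string;
  paths: string[]; // PackageFile.path values, "/"-separated
}

// Assumption: a root folder is "dominant" if it contains more than half of the paths.
function dominantRoot(paths: string[]): string | null {
  const counts = new Map<string, number>();
  let best: string | null = null;
  let bestCount = 0;
  for (const path of paths) {
    const parts = path.split("/").filter((segment) => segment.length > 0);
    const root = parts[0];
    // Files sitting at the archive root have no root folder.
    if (root === undefined || parts.length < 2) continue;
    const count = (counts.get(root) ?? 0) + 1;
    counts.set(root, count);
    if (count > bestCount) {
      best = root;
      bestCount = count;
    }
  }
  return bestCount * 2 > paths.length ? best : null;
}

// Collect ungrouped packages that share a dominant root; keep groups of 2 or more.
function groupByRootFolder(
  packages: PackageFiles[],
): { root: string; packageIds: string[] }[] {
  const byRoot = new Map<string, string[]>();
  for (const pkg of packages) {
    const root = dominantRoot(pkg.paths);
    if (root === null) continue;
    const ids = byRoot.get(root) ?? [];
    ids.push(pkg.id);
    byRoot.set(root, ids);
  }
  const groups: { root: string; packageIds: string[] }[] = [];
  byRoot.forEach((packageIds, root) => {
    if (packageIds.length >= 2) groups.push({ root, packageIds });
  });
  return groups;
}
```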
Reply chain grouping (Signal 6):
- Capture reply_to_message_id during channel scanning
- Group archives that reply to the same root message
- Add replyToMessageId field to Package schema
Caption fuzzy match grouping (Signal 8):
- Capture source caption during channel scanning
- Normalize captions (strip extensions, extract significant words)
- Group packages with matching normalized caption keys
- Add sourceCaption field to Package schema
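The caption normalization step might be sketched like this. The stop-word list, the minimum word length, and the captionKey name are all assumptions made for illustration; the commit only states that extensions are stripped and significant words are extracted.

```typescript
// Hypothetical stop words; the real list is not given in the commit.
const CAPTION_STOP_WORDS = ["the", "and", "for", "with", "part", "pack"];

function captionKey(caption: string): string {
  // Strip common archive extensions, including split suffixes like ".7z.001".
  const withoutExtensions = caption
    .toLowerCase()
    .replace(/\.(zip|rar|7z)(\.\d+)?\b/g, " ");
  // Split on anything that is not a letter or digit.
  const words = withoutExtensions.split(/[^a-z0-9]+/).filter(
    (word) =>
      word.length >= 3 &&
      CAPTION_STOP_WORDS.indexOf(word) === -1 &&
      !/^\d+$/.test(word),
  );
  // Deduplicate and sort so word order does not affect the key.
  const unique = words.filter((word, index) => words.indexOf(word) === index);
  return unique.sort().join(" ");
}
```

Packages whose captions produce the same non-empty key would then be candidates for one group.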
Periodic integrity audit:
- Check multipart packages for completeness (parts vs destMessageIds)
- Detect orphaned indexes (destChannelId set but no destMessageId)
- Runs after each ingestion cycle, deduplicates notifications
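The two audit checks can be sketched as a pure function over package records. The field names destMessageIds, destChannelId, and destMessageId come from the commits; partCount, the issue labels, and auditPackages itself are hypothetical.

```typescript
interface AuditedPackage {
  id: string;
  partCount: number; // hypothetical: number of parts the archive was split into
  destChannelId: number | null;
  destMessageId: number | null;
  destMessageIds: number[];
}

interface AuditIssue {
  packageId: string;
  kind: "ORPHANED_INDEX" | "INCOMPLETE_MULTIPART";
}

function auditPackages(packages: AuditedPackage[]): AuditIssue[] {
  const issues: AuditIssue[] = [];
  for (const pkg of packages) {
    // Orphaned index: a destination channel is recorded but no message ID.
    if (pkg.destChannelId !== null && pkg.destMessageId === null) {
      issues.push({ packageId: pkg.id, kind: "ORPHANED_INDEX" });
    }
    // Multipart completeness: every part should have a destination message ID.
    if (pkg.partCount > 1 && pkg.destMessageIds.length < pkg.partCount) {
      issues.push({ packageId: pkg.id, kind: "INCOMPLETE_MULTIPART" });
    }
  }
  return issues;
}
```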
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Multi-part send fix:
- Add destMessageIds BigInt[] to Package schema with backfill migration
- Worker uploadToChannel now returns all message IDs, stored in DB
- Bot forwards all parts of multi-part archives (not just the first)
- Add retry logic for upload rate limits (429) and download stalls
Kickstarter package linking:
- Add package search/linking queries and API routes
- Add PackageLinkerDialog with search + checkbox selection
- Add "Link Packages" and "Send All" actions to kickstarter table
- Add sendAllKickstarterPackages server action
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add downloadStarted flag to prevent false "stopped unexpectedly" errors
  when TDLib emits initial updateFile before download is active
- Add 5-minute stall detection for both downloads and uploads
- Reduce max split part size from 2GiB to 1950MiB to stay under
  Telegram's internal upload part count limits
- Increase timeouts from max(10min, 15min/GB) to max(15min, 20min/GB)
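As a worked illustration of the new limits, the timeout formula and the part size cap could be computed as below. Function and constant names are hypothetical, and "GB" is treated as GiB here, which is an assumption.

```typescript
const MINUTE_MS = 60 * 1000;
const MIB = 1024 * 1024;
const GIB = 1024 * MIB;
const MAX_PART_BYTES = 1950 * MIB; // new cap, down from 2 GiB

// max(15min, 20min per GB) expressed in milliseconds.
function transferTimeoutMs(sizeBytes: number): number {
  const scaled = (sizeBytes / GIB) * 20 * MINUTE_MS;
  return Math.max(15 * MINUTE_MS, scaled);
}

// Number of parts needed so that no part exceeds the cap.
function splitPartCount(sizeBytes: number): number {
  return Math.max(1, Math.ceil(sizeBytes / MAX_PART_BYTES));
}
```

For example, a 2 GiB file gets a 40-minute timeout, and a 4 GiB archive is split into three parts under the 1950MiB cap.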
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1. Distinguish failure reasons: inspect error messages to label skipped
   packages as DOWNLOAD_FAILED, UPLOAD_FAILED, or EXTRACT_FAILED
   instead of the catch-all DOWNLOAD_FAILED.
2. Detect orphaned uploads: before uploading, check whether the same
   content hash already has a successful upload on the destination
   channel. If so, reuse the existing message ID instead of re-uploading
   (prevents duplicates when the worker crashes between upload and DB
   write).
3. Increase timeouts: download from max(5min, GB*10min) to
   max(10min, GB*15min), upload from GB*10min to GB*15min.
   Prevents premature timeouts on slow connections.
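The failure labeling in item 1 might look roughly like the sketch below. The three labels are from this commit; the keyword patterns and the classifyFailure name are assumptions, and the real worker may match different message text.

```typescript
type SkipReason = "DOWNLOAD_FAILED" | "UPLOAD_FAILED" | "EXTRACT_FAILED";

// Hypothetical classifier: inspects the error message text.
function classifyFailure(error: unknown): SkipReason {
  const message = (error instanceof Error ? error.message : String(error)).toLowerCase();
  if (/upload/.test(message)) return "UPLOAD_FAILED";
  if (/extract|unzip|unrar|corrupt/.test(message)) return "EXTRACT_FAILED";
  // Assumed default: anything unrecognized keeps the previous catch-all label.
  return "DOWNLOAD_FAILED";
}
```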
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Switch from getChats pagination to loadChats (the TDLib-recommended
  API) which properly loads all chats into TDLib's cache and signals
  completion with a 404 error
- Discover and load chat folders via getChatFolders so chats in
  user-created folders are included
- Load from main + archive + all folders in both worker startup and
  getAccountChats channel discovery
- After loading, use getChats with high limit to retrieve all cached IDs
- This ensures private chats, 1-on-1 conversations, Saved Messages,
  basic groups, and archived/folder chats are all discoverable
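The load-until-404 loop described above could be sketched as follows. The TdClient interface is a hypothetical minimal stand-in for the real TDLib client wrapper, and the shape of the rejected error (an object carrying a numeric code) is an assumption about how that wrapper surfaces TDLib errors.

```typescript
// Hypothetical minimal client surface; the real client has many more methods.
interface TdClient {
  invoke(request: Record<string, unknown>): Promise<unknown>;
}

function hasErrorCode(error: unknown, code: number): boolean {
  return (
    typeof error === "object" &&
    error !== null &&
    (error as { code?: unknown }).code === code
  );
}

// Calls loadChats repeatedly until TDLib answers 404 (list fully loaded).
// Returns the number of calls made; maxCalls guards against endless loops.
async function loadAllChats(
  client: TdClient,
  chatList: Record<string, unknown>,
  maxCalls = 500,
): Promise<number> {
  let calls = 0;
  while (calls < maxCalls) {
    calls++;
    try {
      await client.invoke({ _: "loadChats", chat_list: chatList, limit: 100 });
    } catch (error) {
      if (hasErrorCode(error, 404)) return calls;
      throw error; // any other failure is a real error
    }
  }
  return calls;
}
```

The same helper would be called once per list, for example with { _: "chatListMain" } and { _: "chatListArchive" }, before reading back the cached IDs with getChats.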
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Increase getChats pagination from 50 pages (5K chats) to 500 pages
  (50K chats) to support accounts with many channels/groups
- Load from both chatListMain AND chatListArchive so older/archived
  chats are discovered and scannable
- Deduplicate chat IDs across both lists
- Worker startup also loads both lists before scanning
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Channel Discovery:
- Remove channel/supergroup filter from getAccountChats so that all chat
  types (private, groups, Saved Messages, etc.) are discoverable as sources
- Detect and label the self-chat as "Saved Messages" via getMe
- Update channel picker dialog to accept any chat type string
Bot Rich Messages:
- Enhance package send preview with creator, file count, tags, and source
channel info in MarkdownV2 caption
- Include tags in new_package subscription notifications
- Expand getPendingSendRequest to fetch richer package data
Performance:
- Reviewed pipeline for many-channel load: the getChats pagination fix
  and per-channel getChat pre-load from the prior commit address the
  main concerns
- Channels with no new messages skip in 2-3 API calls
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Bug fixes:
- Fix channels not being scanned by paginating TDLib getChats (was only
  loading first batch, additional channels were unknown to TDLib)
- Add per-channel getChat pre-load as safety net before scanning
- Fix preview pictures not loading by checking previewData instead of
  previewMsgId for hasPreview flag
- Prevent previewMsgId from being set when preview download fails
Package Tags:
- Add tags Text[] column to Package with migration backfilling from
  channel categories
- Worker auto-inherits source channel category as initial tag
- Tag filter dropdown and Tags column in STL Files table
- Server actions for individual and bulk tag editing
Kickstarters Tab:
- New KickstarterHost, Kickstarter, and KickstarterPackage models
- Full CRUD with delivery status, payment status, host management
- Package linking (many-to-many with existing packages)
- Sidebar entry with Gift icon
- Table with search, filters, modal forms
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Auto-extract preview images from ZIP/RAR/7z archives during ingestion
- Upload custom preview images via package drawer
- Select preview from archive contents with on-demand extraction UI
- Manually add Telegram channels by t.me link, username, or invite link
- Invite code UX: bulk create, copy link, usage tracking, delete confirm
- Incomplete upload recovery: verify dest messages on worker startup
- Rebuild package DB by scanning destination channel with live progress
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
getChatHistory fails silently in supergroups with hidden history for new
members, returning only system messages. searchChatMessages with document
and photo filters works regardless of history visibility settings.
Also adds getChats call after TDLib client creation to populate the chat
list, preventing 'Chat not found' errors.
- Add settled flag to invokeWithTimeout to prevent double-settling
- Create mutex queue entry with wrapped resolve before pushing to queue
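A minimal sketch of the settled-flag pattern, assuming a simplified signature. The real invokeWithTimeout presumably takes the TDLib client and a request, so the callback form here is a stand-in.

```typescript
// Simplified stand-in: wraps any promise-returning call with a timeout.
function invokeWithTimeout<T>(
  call: () => Promise<T>,
  timeoutMs: number,
): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    // Guards against settling twice when the call and the timer race.
    let settled = false;

    const timer = setTimeout(() => {
      if (settled) return;
      settled = true;
      reject(new Error("TDLib call timed out after " + timeoutMs + "ms"));
    }, timeoutMs);

    call().then(
      (value) => {
        if (settled) return;
        settled = true;
        clearTimeout(timer);
        resolve(value);
      },
      (error) => {
        if (settled) return;
        settled = true;
        clearTimeout(timer);
        reject(error);
      },
    );
  });
}
```

A native Promise already ignores a second resolve or reject, so the flag mainly keeps side effects such as clearing the timer or logging from running twice.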
Co-authored-by: xCyanGrizzly <53275238+xCyanGrizzly@users.noreply.github.com>
- Add invokeWithTimeout wrapper for TDLib API calls (2min timeout per call)
- Add stuck detection to getChannelMessages: break if from_message_id doesn't advance
- Add stuck detection to getTopicMessages: same protection for topic scanning
- Add stuck detection to getForumTopicList: break if pagination offsets don't advance
- Add max page limit (5000) to all scanning loops to prevent infinite pagination
- Add mutex wait timeout (30min) to prevent indefinite blocking when holder hangs
- Add cycle timeout (4h default, configurable via WORKER_CYCLE_TIMEOUT_MINUTES)
- Fix end-of-page detection to use actual limit value instead of hardcoded 100
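The pagination guards listed above (stuck detection, page cap, limit-based end-of-page check) can be combined in one loop, sketched here. The synchronous FetchOlder callback and the collectMessageIds name are hypothetical; the real loops call TDLib asynchronously.

```typescript
// Hypothetical fetcher: returns up to `limit` message IDs older than
// fromMessageId, newest-first. fromMessageId 0 means "start from the newest".
type FetchOlder = (fromMessageId: number, limit: number) => number[];

type StopReason = "end" | "stuck" | "max_pages";

function collectMessageIds(
  fetchOlder: FetchOlder,
  limit = 100,
  maxPages = 5000,
): { ids: number[]; stopReason: StopReason } {
  const ids: number[] = [];
  let fromMessageId = 0;
  for (let page = 0; page < maxPages; page++) {
    const batch = fetchOlder(fromMessageId, limit);
    if (batch.length === 0) return { ids, stopReason: "end" };
    const next = batch[batch.length - 1];
    // Stuck detection: the cursor did not advance, so the same page would repeat.
    if (next === undefined || next === fromMessageId) {
      return { ids, stopReason: "stuck" };
    }
    for (const id of batch) ids.push(id);
    fromMessageId = next;
    // End-of-page detection uses the actual limit, not a hardcoded 100.
    if (batch.length < limit) return { ids, stopReason: "end" };
  }
  return { ids, stopReason: "max_pages" };
}
```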
Co-authored-by: xCyanGrizzly <53275238+xCyanGrizzly@users.noreply.github.com>
- Add progress callbacks to getChannelMessages and getTopicMessages that
fire after each page of messages is fetched
- Worker now shows channel progress (e.g. "[2/5] Channel Name") when
processing multiple source channels
- Worker now shows topic progress (e.g. "topic 3/12") when scanning forums
- Worker now shows live message scanning count during channel/topic scans
(e.g. "Scanning Channel — 300 messages scanned")
- UI stats line now always shows messagesScanned count
- messagesScanned counter now increments during the scanning phase, not
just during archive processing
Co-authored-by: xCyanGrizzly <53275238+xCyanGrizzly@users.noreply.github.com>
Adds full Telegram ZIP ingestion pipeline: TDLib worker service scans source
channels for archive files, deduplicates by content hash, extracts metadata,
uploads to archive channel, and indexes in Postgres. Forum supergroups are
scanned per-topic with topic names used as creator. Filename-based creator
extraction (e.g. "Mammoth Factory - 2026-01.zip") serves as fallback.
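The filename fallback might be sketched as below, assuming the creator is whatever precedes the first " - " separator, as in the example filename. Both function names are hypothetical.

```typescript
// Assumption: filenames follow "<Creator> - <rest>.<ext>".
function creatorFromFilename(filename: string): string | null {
  const base = filename.replace(/\.(zip|rar|7z)$/i, "");
  const separatorIndex = base.indexOf(" - ");
  if (separatorIndex <= 0) return null;
  const creator = base.slice(0, separatorIndex).trim();
  return creator.length > 0 ? creator : null;
}

// Topic name wins when present; the filename is only a fallback.
function resolveCreator(topicName: string | null, filename: string): string | null {
  return topicName ?? creatorFromFilename(filename);
}
```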
Includes admin UI for managing accounts/channels, simplified account setup
(API credentials via env vars), auth code/password submission dialog,
package browser with creator column, and live ingestion activity tracking.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>