Files
dragonsstash/worker
xCyanGrizzly 77aeb4cc00
All checks were successful
continuous-integration/drone/push Build is passing
fix: advance channel/topic watermark incrementally per successful set
Diagnosed from production logs for the main (Premium) account:

  RUNNING   2026-05-21 → in progress, 22h     ingested: 0
  FAILED    2026-05-14 → 2026-05-21 (7.4d)    ingested: 5,426 (killed by restart)
  FAILED    2026-05-06 → 2026-05-14 (7.7d)    ingested: 8,300 (killed by restart)

Main's two source channels have 378k+ messages each. A full scan takes
days, but the worker gets restarted (container update, cycle timeout,
etc.) every few days. updateLastProcessedMessage was only called at the
END of a channel's scan — so the watermark on AccountChannelMap stayed
NULL through restart after restart, and every new run re-scanned from
message 0.

That explains the user's symptom: "main wasn't uploading although it
said it did". The dashboard showed currentStep alternating through
downloading / hashing / deduplicating, but zipsIngested stayed at 0
because every archive the run encountered was already a hash-duplicate
of something uploaded by a previous run.

Fix: processArchiveSets now accepts an onWatermarkAdvance callback.
After each successful set (ingested OR confirmed duplicate), the callback
fires with a watermark capped below the current minFailedId. Both call
sites (forum/topic and non-forum) wire it to upsertTopicProgress /
updateLastProcessedMessage. The end-of-scan write is retained for the
no-archives and all-failures-with-fallback cases.

Worst-case progress loss on restart now is one in-flight archive set,
not the entire scan.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-22 23:20:20 +02:00
..