mirror of
https://github.com/xCyanGrizzly/DragonsStash.git
synced 2026-05-11 06:11:15 +00:00
feat: grouping phase 1 — schema, ungrouped tab, time-window grouping, hash verification
Schema:
- Add GroupingSource enum (ALBUM, MANUAL, AUTO_TIME, AUTO_PATTERN, etc.)
- Add groupingSource field to PackageGroup with backfill
- Add SystemNotification model for persistent alerts
- Add NotificationType and NotificationSeverity enums

Ungrouped staging tab:
- Add listUngroupedPackages/countUngroupedPackages queries
- Add "Ungrouped" tab to STL page showing packages without a group

Time-window auto-grouping:
- After album grouping, cluster ungrouped packages within configurable time window (default 5 min, AUTO_GROUP_TIME_WINDOW_MINUTES env var)
- Groups named from common filename prefix
- Groups created with groupingSource=AUTO_TIME

Hash verification after split:
- Re-hash split parts and compare to original contentHash
- Log error and create SystemNotification on mismatch
- Prevents silently corrupted split uploads

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@@ -0,0 +1,67 @@
# Grouping Phase 1: Foundation + Time-Window Grouping

> **For agentic workers:** Use superpowers:subagent-driven-development to implement this plan.

**Goal:** Add grouping infrastructure (schema, enums, notifications model), an ungrouped staging queue in the UI, and time-window auto-grouping as the first automatic signal beyond album grouping.

**Architecture:** Schema changes lay the foundation. The ungrouped tab is a query filter. Time-window grouping runs as a post-processing pass after album grouping in the worker pipeline.

**Tech Stack:** Prisma schema + migration, worker TypeScript, Next.js App Router.

---

## Task 1: Schema Migration

**Files:**

- Modify: `prisma/schema.prisma`
- Create: migration SQL

Add:

1. `GroupingSource` enum: `ALBUM`, `MANUAL`, `AUTO_TIME`, `AUTO_PATTERN`, `AUTO_REPLY`, `AUTO_ZIP`, `AUTO_CAPTION`
2. `groupingSource GroupingSource @default(MANUAL)` on `PackageGroup`
3. `SystemNotification` model with `type`, `severity`, `title`, `message`, `context` (Json), `isRead`
4. `NotificationType` enum: `HASH_MISMATCH`, `MISSING_PART`, `UPLOAD_FAILED`, `DOWNLOAD_FAILED`, `GROUPING_CONFLICT`, `INTEGRITY_AUDIT`
5. `NotificationSeverity` enum: `INFO`, `WARNING`, `ERROR`

Backfill: `UPDATE package_groups SET "groupingSource" = 'ALBUM' WHERE "mediaAlbumId" IS NOT NULL`

---

## Task 2: Ungrouped Staging Tab in STL Page

**Files:**

- Modify: `src/lib/telegram/queries.ts` — add `listUngroupedPackages()` query
- Modify: `src/app/(app)/stls/page.tsx` — add tab parameter support
- Modify: `src/app/(app)/stls/_components/stl-table.tsx` — add "Ungrouped" tab
Add a tab next to the existing "Skipped" tab that shows packages where `packageGroupId IS NULL`. Uses the existing `PackageListItem` type and table rendering. This gives users a clear view of files that need manual grouping.
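The tab itself reduces to a simple filter. A minimal sketch of the predicate behind it, written as a plain function (in the real query this is a Prisma `where` clause; the `destMessageId` condition is an assumption here, added so packages still mid-upload do not surface in the staging tab):

```typescript
// Sketch of the "ungrouped" filter as a plain predicate over a row shape
// assumed for illustration; the real implementation queries the database.
interface PackageRow {
  id: string;
  packageGroupId: string | null; // null = not yet assigned to a group
  destMessageId: number | null;  // null = not yet uploaded to the dest channel
}

function isUngrouped(p: PackageRow): boolean {
  return p.packageGroupId === null && p.destMessageId !== null;
}
```

The equivalent Prisma filter would be `{ packageGroupId: null, destMessageId: { not: null } }`.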

---

## Task 3: Time-Window Auto-Grouping in Worker

**Files:**

- Modify: `worker/src/grouping.ts` — add `processTimeWindowGroups()` after existing `processAlbumGroups()`
- Modify: `worker/src/worker.ts` — call time-window grouping after album grouping
- Modify: `worker/src/util/config.ts` — add `autoGroupTimeWindowMinutes` config
After album grouping completes, find remaining ungrouped packages from the same channel scan. Cluster packages whose `sourceMessageId` timestamps are within the configured window (default 5 minutes). Create groups for clusters of 2+ with `groupingSource = AUTO_TIME` and name derived from the common filename prefix or first file's base name.
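The clustering step can be sketched as one walk over the timestamp-sorted list, starting a new cluster whenever the gap to the previous package exceeds the window. This is a simplified sketch with assumed shapes, not the final worker code (the real pass orders by `sourceMessageId` and reads the window from config):

```typescript
// Simplified sketch of time-window clustering. Timestamps and the window
// are passed in directly here for clarity.
interface UngroupedPkg {
  id: string;
  fileName: string;
  postedAt: Date;
}

function clusterByTimeWindow(packages: UngroupedPkg[], windowMs: number): UngroupedPkg[][] {
  if (packages.length === 0) return [];
  const sorted = [...packages].sort((a, b) => a.postedAt.getTime() - b.postedAt.getTime());
  const clusters: UngroupedPkg[][] = [[sorted[0]]];
  for (let i = 1; i < sorted.length; i++) {
    const cluster = clusters[clusters.length - 1];
    const gap = sorted[i].postedAt.getTime() - cluster[cluster.length - 1].postedAt.getTime();
    if (gap <= windowMs) cluster.push(sorted[i]);
    else clusters.push([sorted[i]]);
  }
  // Singletons stay ungrouped; only clusters of 2+ become groups
  return clusters.filter((c) => c.length >= 2);
}
```

With the default 5-minute window, files posted at 0 and 2 minutes cluster together, while a file posted at 20 minutes stays a singleton and remains in the ungrouped tab.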

---

## Task 4: Hash Verification After Split

**Files:**

- Modify: `worker/src/worker.ts` — add hash re-check after concat+split
- Modify: `worker/src/archive/hash.ts` — no changes needed; reuse `hashParts`
After `concatenateFiles()` + `byteLevelSplit()`, re-hash the split parts and compare to the original `contentHash`. If mismatch, log error and create a `SystemNotification` (once that table exists). This closes the integrity gap identified in the audit.
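Conceptually the check is: hash the parts in order, compare to the stored digest. A minimal sketch over in-memory buffers, assuming a SHA-256 digest (the worker reuses `hashParts` over the actual part files on disk):

```typescript
import { createHash } from "node:crypto";

// Hash a sequence of byte chunks as a single stream. Hashing the original file
// and hashing its split parts in order must produce the same digest.
function hashChunks(chunks: Buffer[]): string {
  const h = createHash("sha256");
  for (const c of chunks) h.update(c);
  return h.digest("hex");
}

// A split is valid only if the parts, concatenated in order, hash back to the
// original contentHash; otherwise the upload must abort.
function verifySplit(parts: Buffer[], originalHash: string): boolean {
  return hashChunks(parts) === originalHash;
}
```

Because the hash is computed incrementally, hashing the parts in sequence equals hashing the original file whenever their concatenation reproduces its bytes exactly; any dropped, altered, or reordered byte changes the digest.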

---

## Task 5: Build & Deploy

Rebuild worker and app images. Deploy. Verify:

- Worker logs show `maxPartSizeMB` and the new `autoGroupTimeWindowMinutes` in config
- Ungrouped tab visible in STL page
- Previously-skipped large archives begin processing
@@ -0,0 +1,32 @@
-- CreateEnum GroupingSource
CREATE TYPE "GroupingSource" AS ENUM ('ALBUM', 'MANUAL', 'AUTO_TIME', 'AUTO_PATTERN', 'AUTO_REPLY', 'AUTO_ZIP', 'AUTO_CAPTION');

-- CreateEnum NotificationType
CREATE TYPE "NotificationType" AS ENUM ('HASH_MISMATCH', 'MISSING_PART', 'UPLOAD_FAILED', 'DOWNLOAD_FAILED', 'GROUPING_CONFLICT', 'INTEGRITY_AUDIT');

-- CreateEnum NotificationSeverity
CREATE TYPE "NotificationSeverity" AS ENUM ('INFO', 'WARNING', 'ERROR');

-- AlterTable: add groupingSource to package_groups
ALTER TABLE "package_groups" ADD COLUMN "groupingSource" "GroupingSource" NOT NULL DEFAULT 'MANUAL';

-- Backfill: mark album-based groups
UPDATE "package_groups" SET "groupingSource" = 'ALBUM' WHERE "mediaAlbumId" IS NOT NULL;

-- CreateTable: system_notifications
CREATE TABLE "system_notifications" (
    "id" TEXT NOT NULL,
    "type" "NotificationType" NOT NULL,
    "severity" "NotificationSeverity" NOT NULL DEFAULT 'INFO',
    "title" TEXT NOT NULL,
    "message" TEXT NOT NULL,
    "context" JSONB,
    "isRead" BOOLEAN NOT NULL DEFAULT false,
    "createdAt" TIMESTAMP(3) NOT NULL DEFAULT CURRENT_TIMESTAMP,

    CONSTRAINT "system_notifications_pkey" PRIMARY KEY ("id")
);

-- CreateIndex
CREATE INDEX "system_notifications_isRead_createdAt_idx" ON "system_notifications"("isRead", "createdAt");
CREATE INDEX "system_notifications_type_idx" ON "system_notifications"("type");
@@ -522,6 +522,7 @@ model PackageGroup {
   name String
   mediaAlbumId String?
   sourceChannelId String
+  groupingSource GroupingSource @default(MANUAL)
   previewData Bytes?
   createdAt DateTime @default(now())
   updatedAt DateTime @updatedAt
@@ -802,3 +803,45 @@ model KickstarterPackage {
   @@id([kickstarterId, packageId])
   @@map("kickstarter_packages")
 }
+
+// ── Grouping & Notifications ──
+
+enum GroupingSource {
+  ALBUM
+  MANUAL
+  AUTO_TIME
+  AUTO_PATTERN
+  AUTO_REPLY
+  AUTO_ZIP
+  AUTO_CAPTION
+}
+
+enum NotificationType {
+  HASH_MISMATCH
+  MISSING_PART
+  UPLOAD_FAILED
+  DOWNLOAD_FAILED
+  GROUPING_CONFLICT
+  INTEGRITY_AUDIT
+}
+
+enum NotificationSeverity {
+  INFO
+  WARNING
+  ERROR
+}
+
+model SystemNotification {
+  id String @id @default(cuid())
+  type NotificationType
+  severity NotificationSeverity @default(INFO)
+  title String
+  message String
+  context Json?
+  isRead Boolean @default(false)
+  createdAt DateTime @default(now())
+
+  @@index([isRead, createdAt])
+  @@index([type])
+  @@map("system_notifications")
+}
@@ -38,7 +38,7 @@ import {
 } from "@/components/ui/dialog";
 import { Tabs, TabsContent, TabsList, TabsTrigger } from "@/components/ui/tabs";
 import { Badge } from "@/components/ui/badge";
-import type { DisplayItem, IngestionAccountStatus } from "@/lib/telegram/types";
+import type { DisplayItem, IngestionAccountStatus, PackageListItem } from "@/lib/telegram/types";
 import type { SkippedRow } from "./skipped-columns";
 import {
   updatePackageCreator,
@@ -61,6 +61,9 @@ interface StlTableProps {
   skippedData: SkippedRow[];
   skippedPageCount: number;
   skippedTotalCount: number;
+  ungroupedData: PackageListItem[];
+  ungroupedPageCount: number;
+  ungroupedTotalCount: number;
 }

 export function StlTable({
@@ -73,6 +76,9 @@ export function StlTable({
   skippedData,
   skippedPageCount,
   skippedTotalCount,
+  ungroupedData,
+  ungroupedPageCount,
+  ungroupedTotalCount,
 }: StlTableProps) {
   const router = useRouter();
   const pathname = usePathname();
@@ -379,6 +385,23 @@ export function StlTable({

   const { table } = useDataTable({ data: tableRows, columns, pageCount });

+  const ungroupedRows: StlTableRow[] = useMemo(
+    () =>
+      ungroupedData.map((pkg) => ({
+        ...pkg,
+        _rowType: "package" as const,
+        _groupId: null,
+        _isGroupMember: false,
+      })),
+    [ungroupedData]
+  );
+
+  const { table: ungroupedTable } = useDataTable({
+    data: ungroupedRows,
+    columns,
+    pageCount: ungroupedPageCount,
+  });
+
   const activeTag = searchParams.get("tag") ?? "";

   return (
@@ -401,6 +424,14 @@ export function StlTable({
             </Badge>
           )}
         </TabsTrigger>
+        <TabsTrigger value="ungrouped" className="gap-1.5">
+          Ungrouped
+          {ungroupedTotalCount > 0 && (
+            <Badge variant="secondary" className="h-5 px-1.5 text-[10px]">
+              {ungroupedTotalCount}
+            </Badge>
+          )}
+        </TabsTrigger>
       </TabsList>

       <TabsContent value="packages" className="space-y-4">
@@ -472,6 +503,11 @@ export function StlTable({
           totalCount={skippedTotalCount}
         />
       </TabsContent>
+
+      <TabsContent value="ungrouped" className="space-y-4">
+        <DataTable table={ungroupedTable} emptyMessage="All packages are grouped!" />
+        <DataTablePagination table={ungroupedTable} totalCount={ungroupedTotalCount} />
+      </TabsContent>
     </Tabs>

     <PackageFilesDrawer
@@ -1,6 +1,6 @@
 import { auth } from "@/lib/auth";
 import { redirect } from "next/navigation";
-import { listDisplayItems, searchPackages, getIngestionStatus, getAllPackageTags, listSkippedPackages, countSkippedPackages } from "@/lib/telegram/queries";
+import { listDisplayItems, searchPackages, getIngestionStatus, getAllPackageTags, listSkippedPackages, countSkippedPackages, listUngroupedPackages, countUngroupedPackages } from "@/lib/telegram/queries";
 import { StlTable } from "./_components/stl-table";
 import type { DisplayItem, PackageListItem } from "@/lib/telegram/types";

@@ -24,7 +24,7 @@ export default async function StlFilesPage({ searchParams }: Props) {
   const tab = (params.tab as string) ?? "packages";

   // Fetch packages, ingestion status, tags, and skipped count in parallel
-  const [result, ingestionStatus, availableTags, skippedCount] = await Promise.all([
+  const [result, ingestionStatus, availableTags, skippedCount, ungroupedCount] = await Promise.all([
     search
       ? searchPackages({
           query: search,
@@ -43,6 +43,7 @@ export default async function StlFilesPage({ searchParams }: Props) {
     getIngestionStatus(),
     getAllPackageTags(),
     countSkippedPackages(),
+    countUngroupedPackages(),
   ]);

   // For search results, wrap as DisplayItem[]; for non-search, already DisplayItem[]
@@ -55,6 +56,11 @@ export default async function StlFilesPage({ searchParams }: Props) {
     ? await listSkippedPackages({ page, limit: perPage })
     : null;

+  // Fetch ungrouped packages only if on that tab
+  const ungroupedResult = tab === "ungrouped"
+    ? await listUngroupedPackages({ page, limit: perPage })
+    : null;
+
   return (
     <StlTable
       data={displayItems}
@@ -66,6 +72,9 @@ export default async function StlFilesPage({ searchParams }: Props) {
       skippedData={skippedResult?.items ?? []}
       skippedPageCount={skippedResult?.pagination.totalPages ?? 0}
       skippedTotalCount={skippedCount}
+      ungroupedData={ungroupedResult?.items ?? []}
+      ungroupedPageCount={ungroupedResult?.pagination.totalPages ?? 0}
+      ungroupedTotalCount={ungroupedCount}
     />
   );
 }
@@ -571,6 +571,72 @@ export async function countSkippedPackages(): Promise<number> {
   return prisma.skippedPackage.count();
 }

+export async function listUngroupedPackages(options: {
+  page: number;
+  limit: number;
+}) {
+  const { page, limit } = options;
+  const skip = (page - 1) * limit;
+
+  const where = { packageGroupId: null, destMessageId: { not: null } };
+
+  const [items, total] = await Promise.all([
+    prisma.package.findMany({
+      where,
+      orderBy: { indexedAt: "desc" },
+      skip,
+      take: limit,
+      select: {
+        id: true,
+        fileName: true,
+        fileSize: true,
+        archiveType: true,
+        creator: true,
+        fileCount: true,
+        isMultipart: true,
+        partCount: true,
+        tags: true,
+        indexedAt: true,
+        previewData: true,
+        sourceChannel: { select: { id: true, title: true } },
+      },
+    }),
+    prisma.package.count({ where }),
+  ]);
+
+  return {
+    items: items.map((p) => ({
+      id: p.id,
+      fileName: p.fileName,
+      fileSize: p.fileSize.toString(),
+      contentHash: "",
+      archiveType: p.archiveType,
+      creator: p.creator,
+      fileCount: p.fileCount,
+      isMultipart: p.isMultipart,
+      partCount: p.partCount,
+      tags: p.tags,
+      indexedAt: p.indexedAt.toISOString(),
+      hasPreview: !!p.previewData,
+      sourceChannel: p.sourceChannel,
+      matchedFileCount: 0,
+      matchedByContent: false,
+    })),
+    pagination: {
+      total,
+      totalPages: Math.ceil(total / limit),
+      page,
+      limit,
+    },
+  };
+}
+
+export async function countUngroupedPackages(): Promise<number> {
+  return prisma.package.count({
+    where: { packageGroupId: null, destMessageId: { not: null } },
+  });
+}
+
 export async function getPackageGroup(groupId: string) {
   return prisma.packageGroup.findUnique({
     where: { id: groupId },
@@ -587,3 +587,24 @@ export async function linkPackagesToGroup(
     data: { packageGroupId: groupId },
   });
 }
+
+export async function createTimeWindowGroup(input: {
+  sourceChannelId: string;
+  name: string;
+  packageIds: string[];
+}): Promise<string> {
+  const group = await db.packageGroup.create({
+    data: {
+      sourceChannelId: input.sourceChannelId,
+      name: input.name,
+      groupingSource: "AUTO_TIME",
+    },
+  });
+
+  await db.package.updateMany({
+    where: { id: { in: input.packageIds } },
+    data: { packageGroupId: group.id },
+  });
+
+  return group.id;
+}
@@ -1,7 +1,8 @@
 import type { Client } from "tdl";
 import type { TelegramPhoto } from "./preview/match.js";
 import { downloadPhotoThumbnail } from "./tdlib/download.js";
-import { createOrFindPackageGroup, linkPackagesToGroup } from "./db/queries.js";
+import { createOrFindPackageGroup, linkPackagesToGroup, createTimeWindowGroup } from "./db/queries.js";
+import { config } from "./util/config.js";
 import { childLogger } from "./util/logger.js";
 import { db } from "./db/client.js";

@@ -77,3 +78,95 @@ export async function processAlbumGroups(
     }
   }
 }
+
+/**
+ * After album grouping, cluster remaining ungrouped packages from the same channel
+ * that were posted within a configurable time window.
+ * Only groups packages that were just indexed in this scan cycle (the `indexedPackages` list).
+ */
+export async function processTimeWindowGroups(
+  sourceChannelId: string,
+  indexedPackages: IndexedPackageRef[]
+): Promise<void> {
+  if (config.autoGroupTimeWindowMinutes <= 0) return;
+
+  // Find which of the just-indexed packages are still ungrouped
+  const ungrouped = await db.package.findMany({
+    where: {
+      id: { in: indexedPackages.map((p) => p.packageId) },
+      packageGroupId: null,
+    },
+    orderBy: { sourceMessageId: "asc" },
+    select: {
+      id: true,
+      fileName: true,
+      sourceMessageId: true,
+      indexedAt: true,
+    },
+  });
+
+  if (ungrouped.length < 2) return;
+
+  const windowMs = config.autoGroupTimeWindowMinutes * 60 * 1000;
+
+  // Cluster by time proximity: walk through sorted list, start new cluster when gap > window
+  const clusters: typeof ungrouped[] = [];
+  let current: typeof ungrouped = [ungrouped[0]];
+
+  for (let i = 1; i < ungrouped.length; i++) {
+    const prev = current[current.length - 1];
+    const gap = Math.abs(ungrouped[i].indexedAt.getTime() - prev.indexedAt.getTime());
+
+    if (gap <= windowMs) {
+      current.push(ungrouped[i]);
+    } else {
+      clusters.push(current);
+      current = [ungrouped[i]];
+    }
+  }
+  clusters.push(current);
+
+  // Create groups for clusters with 2+ packages
+  for (const cluster of clusters) {
+    if (cluster.length < 2) continue;
+
+    // Derive group name from common filename prefix
+    const name = findCommonPrefix(cluster.map((p) => p.fileName)) || cluster[0].fileName;
+
+    try {
+      const groupId = await createTimeWindowGroup({
+        sourceChannelId,
+        name,
+        packageIds: cluster.map((p) => p.id),
+      });
+
+      log.info(
+        { groupId, name, memberCount: cluster.length },
+        "Created time-window group"
+      );
+    } catch (err) {
+      log.warn({ err, clusterSize: cluster.length }, "Failed to create time-window group");
+    }
+  }
+}
+
+/**
+ * Find the longest common prefix among a list of filenames,
+ * trimming trailing separators and partial words.
+ */
+function findCommonPrefix(names: string[]): string {
+  if (names.length === 0) return "";
+  if (names.length === 1) return names[0];
+
+  let prefix = names[0];
+  for (let i = 1; i < names.length; i++) {
+    while (!names[i].startsWith(prefix)) {
+      prefix = prefix.slice(0, -1);
+      if (prefix.length === 0) return "";
+    }
+  }
+
+  // Trim trailing separators and partial words
+  const trimmed = prefix.replace(/[\s\-_.(]+$/, "");
+  return trimmed.length >= 3 ? trimmed : "";
+}
@@ -10,6 +10,8 @@ export const config = {
   /** Maximum file part size for Telegram upload (in MiB). Default 1950 (under 2GB non-Premium limit).
    * Set to 3900 for Premium accounts (under 4GB limit). */
   maxPartSizeMB: parseInt(process.env.MAX_PART_SIZE_MB ?? "1950", 10),
+  /** Time window for auto-grouping ungrouped packages from the same channel (minutes). 0 = disabled. */
+  autoGroupTimeWindowMinutes: parseInt(process.env.AUTO_GROUP_TIME_WINDOW_MINUTES ?? "5", 10),
   /** Maximum jitter added to scheduler interval (in minutes) */
   jitterMinutes: 5,
   /** Maximum time span for multipart archive parts (in hours). 0 = no limit. */
@@ -47,7 +47,8 @@ import { readRarContents } from "./archive/rar-reader.js";
 import { read7zContents } from "./archive/sevenz-reader.js";
 import { byteLevelSplit, concatenateFiles } from "./archive/split.js";
 import { uploadToChannel } from "./upload/channel.js";
-import { processAlbumGroups, type IndexedPackageRef } from "./grouping.js";
+import { processAlbumGroups, processTimeWindowGroups, type IndexedPackageRef } from "./grouping.js";
+import { db } from "./db/client.js";
 import type { TelegramAccount, TelegramChannel } from "@prisma/client";
 import type { Client } from "tdl";

@@ -790,6 +791,9 @@ async function processArchiveSets(
       indexedPackageRefs,
       scanResult.photos
     );
+
+    // Time-window grouping for remaining ungrouped packages
+    await processTimeWindowGroups(channel.id, indexedPackageRefs);
   }

   return maxProcessedId;
@@ -1053,6 +1057,43 @@ async function processOneArchiveSet(
     uploadPaths = splitPaths;
   }

+  // ── Hash verification after split ──
+  // If we split/repacked, verify the split parts hash matches the original
+  if (splitPaths.length > 0) {
+    const splitHash = await hashParts(splitPaths);
+    if (splitHash !== contentHash) {
+      accountLog.error(
+        { fileName: archiveName, originalHash: contentHash, splitHash, parts: splitPaths.length },
+        "Hash mismatch after split — file may be corrupted"
+      );
+      // Record notification for visibility
+      try {
+        await db.systemNotification.create({
+          data: {
+            type: "HASH_MISMATCH",
+            severity: "ERROR",
+            title: `Hash mismatch after splitting ${archiveName}`,
+            message: `Expected ${contentHash.slice(0, 16)}… but got ${splitHash.slice(0, 16)}… after splitting into ${splitPaths.length} parts`,
+            context: {
+              fileName: archiveName,
+              originalHash: contentHash,
+              splitHash,
+              partCount: splitPaths.length,
+              sourceChannelId: channel.id,
+            },
+          },
+        });
+      } catch {
+        // Best-effort notification
+      }
+      throw new Error(`Hash mismatch after split for ${archiveName}: expected ${contentHash}, got ${splitHash}`);
+    }
+    accountLog.debug(
+      { fileName: archiveName, hash: contentHash.slice(0, 16), parts: splitPaths.length },
+      "Split hash verified — matches original"
+    );
+  }
+
   // ── Uploading ──
   // Check if a prior run already uploaded this file (orphaned upload scenario:
   // file reached Telegram but DB write failed or worker crashed before indexing)