mirror of
https://github.com/xCyanGrizzly/DragonsStash.git
synced 2026-06-09 18:51:16 +00:00
fix(worker): skip integrity test for multipart ZIPs — unzip -t can't span them
All checks were successful
continuous-integration/drone/push Build is passing
Diagnosed from production: the main account downloaded several 28 GB ZIP sets
(CA 3D STUDIOS 2023-07.zip.001..007, 2023-08.zip.001..006, ...) and
rejected every one of them with:
"Archive integrity check failed: Command failed:
unzip -tqq /tmp/zips/.../CA 3D STUDIOS 2023-07.zip.001"
Root cause: the integrity test I added in 04effed passed `uploadPaths[0]`
to the archive tester. For byte-split multipart ZIPs (`.zip.001`,
`.zip.002`, ...), the first chunk isn't a valid ZIP on its own — the
central directory only exists at the END of the assembled archive.
unzip's spanned-ZIP support expects `.z01/.z02/.../.zip` naming, not
`.zip.001/.002`, so it cannot stitch these parts together no matter
which part it is pointed at.
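
To illustrate the root cause, here is a minimal sketch (not code from this repository) that looks for the ZIP end-of-central-directory (EOCD) signature in the tail of a byte range. The function name `hasEndOfCentralDirectory` and the synthetic demo buffers are assumptions made for illustration. It is only a heuristic: finding the signature does not make a chunk a valid archive on its own, but a first chunk that lacks it cannot be one.

```typescript
// EOCD record signature "PK\x05\x06", as defined by the ZIP format.
const EOCD_SIGNATURE = [0x50, 0x4b, 0x05, 0x06];
// The EOCD record is 22 bytes plus an optional comment of up to 65535 bytes,
// so the signature must start within this many bytes of the end.
const MAX_EOCD_SEARCH = 22 + 0xffff;

// Hypothetical helper: true if the tail of `data` contains an EOCD signature.
function hasEndOfCentralDirectory(data: Uint8Array): boolean {
  const start = Math.max(0, data.length - MAX_EOCD_SEARCH);
  for (let i = data.length - EOCD_SIGNATURE.length; i >= start; i--) {
    if (EOCD_SIGNATURE.every((byte, k) => data[i + k] === byte)) {
      return true;
    }
  }
  return false;
}

// Synthetic stand-in for an archive: 64 filler bytes followed by a bare EOCD.
const eocd = new Uint8Array(22);
eocd.set(EOCD_SIGNATURE, 0);
const assembled = new Uint8Array(64 + 22).fill(0x41);
assembled.set(eocd, 64);

// Byte-split it the way `.zip.001` / `.zip.002` chunks are produced.
const firstChunk = assembled.subarray(0, 48);
const lastChunk = assembled.subarray(48);

console.log(hasEndOfCentralDirectory(firstChunk)); // false
console.log(hasEndOfCentralDirectory(lastChunk)); // true
```

The first chunk carries no end-of-central-directory record at all, which is why a tester pointed at `.zip.001` alone has nothing to read.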
Three correctness changes:
1. Test runs on `tempPaths[0]` (the original downloaded file) instead
of `uploadPaths[0]` (which may be byte-split chunks we created).
For single-file ZIPs we re-split, this still tests the unsplit
original.
2. Skip the test entirely when archiveType=ZIP AND tempPaths.length>1
— these are source multipart ZIPs we can't validate without
concatenating, and the hash check + central-directory parse we
already do are sufficient structural signals.
3. RAR and 7Z multipart still ARE tested — `unrar t` and `7z t` both
auto-discover sibling parts when pointed at the first one.
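
The three rules above can be sketched as a single pure planning function. This is an illustration under assumptions, not the worker's actual `testArchiveIntegrity`: the name `planIntegrityTest`, the `IntegrityPlan` shape, and the `"RAR"` type value are hypothetical. The CLI invocations (`unzip -tqq`, `unrar t`, `7z t`) are the ones named in this commit message.

```typescript
// Assumed archive type values; only "ZIP" and "SEVEN_Z" appear in the diff.
type ArchiveType = "ZIP" | "RAR" | "SEVEN_Z";

// Hypothetical description of the command that would be run.
interface IntegrityPlan {
  command: string;
  args: string[];
}

// Test commands taken from the commit message and the production error output.
const TEST_COMMANDS: Record<ArchiveType, { command: string; flags: string[] }> = {
  ZIP: { command: "unzip", flags: ["-tqq"] },
  RAR: { command: "unrar", flags: ["t"] },
  SEVEN_Z: { command: "7z", flags: ["t"] },
};

// Returns the command to run against the original download, or null when
// the integrity test should be skipped.
function planIntegrityTest(
  type: ArchiveType,
  tempPaths: string[],
): IntegrityPlan | null {
  const [first] = tempPaths;
  if (first === undefined) {
    return null; // nothing was downloaded, so there is nothing to test
  }
  // Rule 2: source multipart ZIPs cannot be tested without concatenation.
  if (type === "ZIP" && tempPaths.length > 1) {
    return null;
  }
  // Rules 1 and 3: always test the first original file; the RAR and 7z
  // tools locate sibling parts themselves.
  const { command, flags } = TEST_COMMANDS[type];
  return { command, args: [...flags, first] };
}

console.log(planIntegrityTest("ZIP", ["set.zip.001", "set.zip.002"]));
console.log(planIntegrityTest("RAR", ["set.part1.rar", "set.part2.rar"]));
```

Keeping the decision separate from process spawning makes the skip rule easy to unit test without the CLI tools installed.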
This unblocks all multipart-ZIP ingestion for the main account. Hours
of downloaded archives that were being rejected will now pass through.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@@ -1682,16 +1682,28 @@ async function processOneArchiveSet(
     }
 
     // ── Pre-upload integrity test ──
-    // Catch broken/encrypted archives before we burn upload bandwidth on
-    // them. Cheap (unzip -t / unrar t / 7z t) compared to a multi-GB upload.
-    // Skipped when we're reusing an existing upload — no point testing the
-    // file again.
-    const integrity = await testArchiveIntegrity(
-      archiveSet.type === "7Z" ? "SEVEN_Z" : archiveSet.type,
-      uploadPaths[0]
-    );
-    if (!integrity.ok) {
-      throw new Error(`Archive integrity check failed: ${integrity.reason}`);
+    // Catch broken/encrypted archives before we burn upload bandwidth.
+    //
+    // Important nuance: ZIP multipart archives use byte-level chunk naming
+    // (`.zip.001`, `.zip.002`, ...). Individual chunks aren't valid ZIPs
+    // — the central directory only exists in the last chunk and unzip can't
+    // span the `.zip.001` naming convention. Testing the first chunk alone
+    // always fails with "no central directory found". Skip the test for
+    // those.
+    //
+    // RAR and 7z CLI tools auto-discover sibling parts when pointed at the
+    // first part, so `unrar t` / `7z t` work for multipart RAR/7z.
+    //
+    // Single-file archives (regardless of whether WE re-split them for
+    // upload size limits) are always testable on the original tempPaths[0]
+    // since that's the unsplit downloaded file.
+    const archType = archiveSet.type === "7Z" ? "SEVEN_Z" : archiveSet.type;
+    const isMultipartZip = archType === "ZIP" && tempPaths.length > 1;
+    if (!isMultipartZip) {
+      const integrity = await testArchiveIntegrity(archType, tempPaths[0]);
+      if (!integrity.ok) {
+        throw new Error(`Archive integrity check failed: ${integrity.reason}`);
+      }
     }
 
     // ── Uploading ──