Use numeric indices for delta filenames, document limitations

- Delta files now named 0.zst, 1.zst etc — avoids path length issues
  and ambiguous separator substitution; manifest maps index to path
- PLAN.md: document delta naming rationale
- PLAN.md: document cross-file deduplication limitation and possible
  future approaches (zstd dictionary training, content-addressing, tar stream)
2026-03-07 01:47:31 +00:00
parent f1faa992c9
commit ba67366cd6
2 changed files with 24 additions and 3 deletions

PLAN.md

@@ -161,6 +161,25 @@ rsync meaningful exit codes:
Currently basic: any non-zero exit code throws. Finer-grained handling planned as part of the
operation abstraction refactor.
## Known Limitations
### Delta file naming
Delta files are named by numeric index (e.g. `0.zst`, `1.zst`) rather than by path. The manifest
maps each index to its source path. Path-based naming was considered but rejected because:
- Flattening a deep directory path into a single filename can exceed filesystem filename length limits
- Path separator substitution (e.g. `/` → `__`) is ambiguous: a filename that already contains `__` cannot be told apart from a substituted separator
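A minimal sketch of both problems and of the index-based alternative. `flattenPath` and `assignDeltaNames` are hypothetical helpers written for illustration, not functions from the codebase, and the 255-byte figure is the per-filename limit on many common filesystems rather than a value taken from this project.

```javascript
// Hypothetical helper mirroring the rejected scheme: flatten a path into one filename.
function flattenPath(path, ext) {
  return path.replaceAll('/', '__') + ext;
}

// Ambiguity: two distinct source paths flatten to the same delta filename.
const a = flattenPath('logs/app__old.txt', '.zst');
const b = flattenPath('logs/app/old.txt', '.zst');
console.log(a === b); // true, so one delta would overwrite the other

// Length: a deep tree produces a single filename longer than 255 characters.
const deep = Array(60).fill('subdir').join('/') + '/file.txt';
console.log(flattenPath(deep, '.zst').length > 255); // true

// Hypothetical helper for index-based naming: filenames stay short and unique,
// and the manifest entry records which source path each index belongs to.
function assignDeltaNames(paths, ext) {
  return paths.map((path, index) => ({ path, delta: `files/${index}${ext}` }));
}

const manifest = assignDeltaNames(['logs/app__old.txt', 'logs/app/old.txt'], '.zst');
console.log(manifest);
```

Restore then looks up the source path in the manifest instead of trying to reverse the filename.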
### Cross-file deduplication
Per-file deltas cannot exploit similarity between different files — each file is compressed/diffed
in isolation. Identical or near-identical files in different locations get no benefit from each
other. Approaches that could address this:
- `zstd --train` to build a shared dictionary from the corpus, then compress all deltas against it
- Content-addressed storage (deduplicate at the block or file level before delta generation)
- Tar the entire PEND tree and delta against the previous tar (single-stream, cross-file repetition
is visible to the compressor — but random access for restore becomes harder)
These are significant complexity increases and out of scope for now.
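To illustrate the content-addressing idea at file level, here is a minimal sketch. `dedupeByContent` is a hypothetical helper, the in-memory `Map` stands in for whatever store would actually be used, and SHA-256 is an assumed choice of hash. It only catches byte-identical files; near-identical files would need block-level chunking or a shared dictionary. The snippet uses CommonJS `require` so it runs standalone.

```javascript
const { createHash } = require('node:crypto');

// Hypothetical helper: key each file by the SHA-256 of its content so that
// identical files in different locations are stored only once.
// `files` is an object mapping path -> Buffer.
function dedupeByContent(files) {
  const blobs = new Map(); // digest -> content, stored once per distinct content
  const index = {};        // path -> digest, used to find the blob on restore
  for (const [path, content] of Object.entries(files)) {
    const digest = createHash('sha256').update(content).digest('hex');
    if (!blobs.has(digest)) {
      blobs.set(digest, content);
    }
    index[path] = digest;
  }
  return { blobs, index };
}

const { blobs, index } = dedupeByContent({
  'a/config.json': Buffer.from('{"debug":false}'),
  'b/config.json': Buffer.from('{"debug":false}'),
  'b/notes.txt': Buffer.from('hello'),
});

console.log(blobs.size); // 2: the two config.json files share one blob
console.log(index['a/config.json'] === index['b/config.json']); // true
```

Delta generation would then run once per distinct blob rather than once per path, at the cost of an extra indirection in the manifest.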
## Occasional Snapshots
Delta chains are efficient but grow fragile as they lengthen. Periodic full snapshots (every N deltas,


@@ -65,16 +65,16 @@ export async function runCommand(config) {
   }
   const manifestChanges = [];
+  let fileIndex = 0;
   for (const change of changes) {
-    const deltaFilename = change.path.replaceAll('/', '__') + backend.ext;
-    const outFile = join(filesDir, deltaFilename);
     if (change.status === 'deleted') {
       manifestChanges.push({ path: change.path, status: 'deleted' });
       continue;
     }
+    const deltaFilename = `${fileIndex}${backend.ext}`;
+    const outFile = join(filesDir, deltaFilename);
     const prevFile = join(prev, change.path);
     const newFile = join(pend, change.path);
@@ -97,6 +97,8 @@ export async function runCommand(config) {
       status: change.status,
       delta: join('files', deltaFilename),
     });
+    fileIndex++;
   }
   // ── Phase 5: Write manifest + atomic commit ──────────────────