Sync Mechanism

This section describes how file synchronization works internally.


Synchronization Flow

1. SyncValidator::canSync() — Pre-validation
2. Status → 'running'
3. For each execution step (one step, or two for bidirectional sync):
   a. resolveFilesToSync() — Determine files
   b. getFileSizeEstimates() — Estimate sizes
   c. splitIntoBatches() — Split into batches
   d. For each batch:
      i.   compressFiles() on source server → .tar.gz
      ii.  Create JWT (valid for 15 min)
      iii. Download URL: {node}/download/file?token={jwt}
      iv.  HTTP POST /api/servers/{uuid}/files/pull
      v.   decompressFile() on target server
      vi.  Delete archives on both servers
4. Status → success | partial | failed
5. Create SyncLog
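
Steps 3.d.ii, 3.d.iii and 4 can be sketched as follows. This is a minimal illustration in Python, not the project's own code. The HS256 signing scheme, the claim names other than the 15-minute expiry, the function names make_download_url and final_status, and the rule that maps batch outcomes to a final status are all assumptions made for the example.

```python
import base64
import hashlib
import hmac
import json
import time

TOKEN_TTL_SECONDS = 15 * 60  # the token is valid for 15 minutes


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as used in JWTs."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def make_download_url(node: str, secret: bytes, file_path: str, now=None) -> str:
    """Build {node}/download/file?token={jwt} with an HS256-signed token."""
    issued_at = int(time.time() if now is None else now)
    header = {"alg": "HS256", "typ": "JWT"}
    # "file_path" is a hypothetical claim name; "iat" and "exp" are standard.
    payload = {
        "file_path": file_path,
        "iat": issued_at,
        "exp": issued_at + TOKEN_TTL_SECONDS,
    }
    signing_input = (
        _b64url(json.dumps(header).encode())
        + "."
        + _b64url(json.dumps(payload).encode())
    )
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256)
    token = signing_input + "." + _b64url(signature.digest())
    return f"{node.rstrip('/')}/download/file?token={token}"


def final_status(batch_results: list[bool]) -> str:
    """Assumed mapping: all batches ok, some ok, or none ok."""
    if all(batch_results):
        return "success"
    if any(batch_results):
        return "partial"
    return "failed"
```

The target server would then receive this URL in the body of the files/pull request and fetch the archive itself, so the archive never passes through the caller.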

Incremental Sync

  • First Sync (last_sync_at = null): All files are synchronized
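  • Subsequent Syncs: Only files whose modification time is later than last_sync_at are synchronized
  • Directories: Searched recursively for changed files, up to 10 levels deep
  • On errors: If a file's modification time cannot be determined, the file is included as a precaution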
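
The selection rules above can be sketched as a small filter. This is an illustrative Python version under stated assumptions: select_changed and the get_modified callback are hypothetical names, and the recursive directory search (max 10 levels) is left out for brevity.

```python
from datetime import datetime, timezone
from typing import Callable, Iterable, Optional


def select_changed(
    paths: Iterable[str],
    get_modified: Callable[[str], datetime],
    last_sync_at: Optional[datetime],
) -> list[str]:
    """Return the paths that need syncing under the incremental rules."""
    if last_sync_at is None:
        # First sync: everything is synchronized.
        return list(paths)
    selected = []
    for path in paths:
        try:
            changed = get_modified(path) > last_sync_at
        except Exception:
            # Lookup failed: include the file as a precaution.
            changed = True
        if changed:
            selected.append(path)
    return selected
```

Including files on error trades some redundant transfer for the guarantee that a change is never silently skipped.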

Batch Splitting

When the estimated total file size exceeds max_file_size_mb, the files are split into batches:

  1. Sort files by size (descending)
  2. First-Fit-Decreasing algorithm: A new batch is started when the current batch size plus the next file's size would exceed the limit
  3. Individual files larger than the limit get their own batch
  4. Archive name: .server_sync_{pairId}_{timestamp}_b{batchNum}.tar.gz
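
The splitting rule can be sketched as follows, in Python for illustration. The sketch implements the rule exactly as written above, which only ever compares against the current batch (the variant usually called next-fit); a strict first-fit pass would also retry earlier batches, and the source does not say whether the real code does that. The function names, the byte-based limit (max_file_size_mb converted by the caller), and the Unix-seconds timestamp are assumptions.

```python
def split_into_batches(sizes: dict[str, int], limit: int) -> list[list[str]]:
    """Group files into batches using the rule described above."""
    # Step 1: largest files first.
    ordered = sorted(sizes.items(), key=lambda item: item[1], reverse=True)
    batches: list[list[str]] = []
    current: list[str] = []
    current_size = 0
    for path, size in ordered:
        # Step 2: start a new batch when the next file would exceed the limit.
        if current and current_size + size > limit:
            batches.append(current)
            current, current_size = [], 0
        # Step 3: an oversized file lands in an empty batch and stays alone,
        # because any following file would push the batch over the limit.
        current.append(path)
        current_size += size
    if current:
        batches.append(current)
    return batches


def archive_name(pair_id: int, timestamp: int, batch_num: int) -> str:
    """Step 4: hidden archive name for one batch."""
    return f".server_sync_{pair_id}_{timestamp}_b{batch_num}.tar.gz"
```

With sizes big=120, a=60, b=50, c=40, d=10 and a limit of 100, this yields three batches: [big], [a], and [b, c, d].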

Exclude Paths Matching

A path is considered excluded when:

  • Exact match: trim(path) === trim(exclude)
  • Prefix match: str_starts_with(path, exclude + '/')

Example: The exclude entry logs matches logs, logs/server.log, and logs/2026/error.log. It does not match logs_old, because the prefix match requires the trailing '/'.
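
The two matching rules translate into a short predicate. The Python version below is a sketch, not the project's code: is_excluded is a hypothetical name, and trimming both sides for the prefix check and ignoring empty exclude entries are assumptions that the rules above do not spell out.

```python
def is_excluded(path: str, excludes: list[str]) -> bool:
    """Return True when path equals an exclude entry or lies beneath it."""
    path = path.strip()
    for exclude in excludes:
        exclude = exclude.strip()
        if not exclude:
            # Assumption: an empty entry excludes nothing.
            continue
        # Exact match, or prefix match on a directory boundary.
        if path == exclude or path.startswith(exclude + "/"):
            return True
    return False
```

Appending '/' before the prefix comparison is what keeps a sibling such as logs_old from being excluded by the entry logs.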


Archive Cleanup

  • Temporary archives are deleted on both servers
  • On errors: Cleanup is best-effort; a failed deletion is only logged and does not fail the sync
  • Archive names start with .server_sync_ (hidden)
  • In a full-scope sync, the mechanism's own archives (.server_sync_*) are skipped automatically
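
A best-effort cleanup along these lines could look like the Python sketch below. The names is_own_archive and cleanup_archive, the logger name, and the idea of passing one delete callback per server are assumptions made for illustration.

```python
import logging
from typing import Callable

logger = logging.getLogger("sync.cleanup")
ARCHIVE_PREFIX = ".server_sync_"


def is_own_archive(path: str) -> bool:
    """True when the file name carries the temporary-archive prefix."""
    return path.rsplit("/", 1)[-1].startswith(ARCHIVE_PREFIX)


def cleanup_archive(archive: str, deleters: list[Callable[[str], None]]) -> int:
    """Try to delete the archive on every server; return the failure count."""
    failures = 0
    for delete in deleters:
        try:
            delete(archive)
        except Exception as exc:
            # Best effort: log and continue with the next server.
            logger.warning("cleanup of %s failed: %s", archive, exc)
            failures += 1
    return failures
```

A full-scope sync would filter its file list with is_own_archive so that leftovers from an earlier failed cleanup are not copied to the other server.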

File Detail Expansion

Top-level directories are resolved into individual file paths:

  • Maximum 500 entries
  • Recursive listing up to depth 5
  • Empty directories: Listed as a single entry with a trailing slash ("dirname/")
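
The expansion limits can be sketched over an in-memory listing. In this Python illustration a directory is a dict and a file is None. The function name expand, counting top-level entries as depth 1, and emitting a directory at the depth limit as a single "dirname/" entry are assumptions; the source states only the limits themselves.

```python
MAX_ENTRIES = 500
MAX_DEPTH = 5


def expand(tree: dict, prefix: str = "", depth: int = 1, out=None) -> list[str]:
    """Flatten a nested {name: subtree-or-None} listing into file paths."""
    out = [] if out is None else out
    for name, child in tree.items():
        if len(out) >= MAX_ENTRIES:
            break  # entry cap reached, stop listing
        path = prefix + name
        if child is None:
            out.append(path)  # regular file
        elif not child or depth >= MAX_DEPTH:
            # Empty directory, or (assumption) depth limit reached.
            out.append(path + "/")
        else:
            expand(child, path + "/", depth + 1, out)
    return out
```

For example, a listing with world/level.dat, world/region/r.0.0.mca, an empty world/empty directory, and server.properties expands to those four entries, with the empty directory shown as world/empty/.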