15 Feb

Incubaid Research – Rediscovering the RSync Algorithm

Don’t walk the folder and ‘rsync’ each file you encounter. A small calculation will show you how bad it really is.

Suppose you have 20,000 files of 1 KB each, and a single rsync costs you about 0.1 s (reading the file, sending over the signature, building the stream of updates, applying them). Syncing file by file then costs you about 2,000 s, or more than half an hour.

System administrators know better and would not hesitate: “tar the tree, sync the tars, and untar the synced tar”.
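The three steps can be sketched in Python with the standard `tarfile` module. This is only an illustration under loud assumptions: the function names (`pack_tree`, `sync_archive`, `unpack_tree`, `sync_tree`) are made up for this example, and the "sync" step is a plain local file copy standing in for a single rsync of the tar file to the remote side.

```python
import shutil
import tarfile
import tempfile
from pathlib import Path


def pack_tree(src_dir: Path, archive: Path) -> None:
    """Step 1: bundle the whole tree into a single tar file."""
    with tarfile.open(archive, "w") as tar:
        # arcname="." stores paths relative to the tree root.
        tar.add(src_dir, arcname=".")


def sync_archive(local_tar: Path, remote_tar: Path) -> None:
    """Step 2 (stand-in): a local copy where a real setup would rsync one file."""
    shutil.copyfile(local_tar, remote_tar)


def unpack_tree(archive: Path, dest_dir: Path) -> None:
    """Step 3: unpack the synced tar on the receiving side."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r") as tar:
        tar.extractall(dest_dir)


def sync_tree(src_dir: Path, dest_dir: Path, work_dir: Path) -> None:
    """Run the three steps in order; work_dir holds both tar files (hypothetical layout)."""
    local_tar = work_dir / "local.tar"
    remote_tar = work_dir / "remote.tar"
    pack_tree(src_dir, local_tar)
    sync_archive(local_tar, remote_tar)
    unpack_tree(remote_tar, dest_dir)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        src = root / "src"
        (src / "sub").mkdir(parents=True)
        (src / "a.txt").write_text("hello")
        (src / "sub" / "b.txt").write_text("world")
        sync_tree(src, root / "dst", root)
        print((root / "dst" / "sub" / "b.txt").read_text())
```

The point of the design is that the per-invocation overhead is paid once for the whole tree instead of once per file; only `sync_archive` would need to change to talk to a real remote machine.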

Suppose each of the three actions takes 5 s (an overestimate); you’re still synced in 15 s.
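The two estimates can be put side by side in a few lines of Python. The figures are the ones quoted in the text (20,000 files, 0.1 s per rsync, three steps of 5 s each); the function names are hypothetical and the model ignores everything except those per-operation costs.

```python
def per_file_cost(file_count: int, seconds_per_rsync: float) -> float:
    """Total time when every file gets its own rsync run."""
    return file_count * seconds_per_rsync


def tar_cost(seconds_per_step: float, steps: int = 3) -> float:
    """Total time for tar, sync, untar, each step taking the same time."""
    return steps * seconds_per_step


per_file = per_file_cost(20_000, 0.1)
bundled = tar_cost(5.0)

print(f"per-file: {per_file:.0f} s ({per_file / 60:.0f} min)")  # per-file: 2000 s (33 min)
print(f"tar:      {bundled:.0f} s")                              # tar:      15 s
print(f"ratio:    {per_file / bundled:.0f}x")                    # ratio:    133x
```

Even with the deliberately pessimistic 5 s per step, the bundled approach comes out roughly two orders of magnitude faster under these assumptions.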

via Incubaid Research – Rediscovering the RSync Algorithm. The right way to sync two remote file systems.