When block-level backup is enabled, the backup job uploads only the "delta" of each modified file (the difference between the previous version and the current version) rather than the entire file. This has the following consequence:
Each version depends on the previous one, so versions in the middle of a chain cannot be purged (deleted according to the retention policy). A chain is purged as a whole, and only once its latest version can be purged, that is, once there are enough subsequent independent versions.
This prevents the retention policy from working, since every version except the latest one has versions depending on it. To resolve this, a Full backup needs to be scheduled. A Full backup uploads modified files entirely, and all block-level uploads that happen afterwards depend on the new Full.
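The dependency rule described above can be sketched in a few lines of Python. This is only an illustrative model, not the software's actual implementation: versions are represented as a list of "F" and "B" markers (oldest first), and the function names are hypothetical.

```python
def versions_needed_to_restore(versions, index):
    """Return the indices of all versions required to restore versions[index].

    Assumption: a block-level version 'B' needs every earlier version
    back to (and including) the nearest full backup 'F'.
    """
    start = index
    while start > 0 and versions[start] != "F":
        start -= 1
    return list(range(start, index + 1))


def has_dependents(versions, index):
    """Return True if a later version depends on versions[index].

    A version has dependents when the next version is a block-level 'B',
    because that 'B' stores only a delta against it.
    """
    nxt = index + 1
    return nxt < len(versions) and versions[nxt] == "B"
```

In this model, a version with dependents cannot be deleted on its own, which is why only a new Full backup lets an older chain be removed.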
Here is an illustration of this process:
- Let F be a full backup, B be an incremental block-level backup, and -> be a connection between backups. When the backup job runs for the first time, an initial Full backup is uploaded.
- Let d mark a version that is flagged for deletion by the retention policy.
- Assume that the backup is scheduled to run daily and the retention policy is set to keep 3 versions, and consider two cases:
a) Full backup is not scheduled
b) Full backup is scheduled once a week (for example, on Sunday)
In both cases an initial full backup is uploaded when the backup job runs for the first time (let's assume this happens on Monday). Here is what the "version chain" looks like at that point:
F
The next time the backup job runs (on Tuesday), it uploads only the differences. The same happens on Wednesday and Thursday:
F -> B
F -> B -> B
Fd -> B -> B -> B
Now we have 4 versions backed up, and according to our retention policy we want to keep only 3 of them, but F cannot be purged since all subsequent increments depend on it. So all 4 versions remain, and the chain looks as follows on Friday and Saturday:
Fd -> Bd -> B -> B -> B
Fd -> Bd -> Bd -> B -> B -> B
Now we come to the point where the two cases diverge:
a) Full backup is not scheduled, so a new increment is uploaded on Sunday:
Fd -> Bd -> Bd -> Bd -> B -> B -> B
and so it goes on until you run out of storage, because the chain of versions never ends.
b) Full backup is scheduled for Sunday:
Fd -> Bd -> Bd -> Bd -> B -> B -> F
Now all subsequent increments will depend on the new Full, but we still cannot purge anything, since we need to keep the 3 latest versions and two of them depend on the first Full. We need to upload two more increments before retention can work:
Fd -> Bd -> Bd -> Bd -> Bd -> B -> F -> B
Fd -> Bd -> Bd -> Bd -> Bd -> Bd -> F -> B -> B
At this point the whole "first" chain is purged, and we have 3 versions as desired:
F -> B -> B
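The walkthrough above can be reproduced with a small Python simulation. This is a simplified sketch under stated assumptions, not the software's real purge logic: it assumes that everything except the newest `keep` versions is marked for deletion, and that a chain is removed only when every version in it is marked. The names `render`, `purge` and `simulate` are hypothetical.

```python
def render(versions, keep=3):
    """Format a chain the way the article does, adding 'd' to marked versions."""
    marked = max(0, len(versions) - keep)
    return " -> ".join(
        kind + "d" if i < marked else kind for i, kind in enumerate(versions)
    )


def purge(versions, keep=3):
    """Drop every chain whose versions are all marked for deletion."""
    marked = max(0, len(versions) - keep)

    # Split the version list into chains, each starting with a full backup.
    chains = []
    for kind in versions:
        if kind == "F" or not chains:
            chains.append([kind])
        else:
            chains[-1].append(kind)

    kept = []
    position = 0
    for chain in chains:
        end = position + len(chain)
        if end > marked:
            # At least one version in this chain must be kept,
            # so the whole chain stays in storage.
            kept.extend(chain)
        position = end
    return kept


def simulate(schedule, keep=3):
    """Run one backup per schedule entry, purging at the end of each run."""
    versions = []
    for kind in schedule:
        versions.append(kind)
        versions = purge(versions, keep)
    return versions


# Case b): Full on Monday, increments Tuesday to Saturday, Full on Sunday,
# then two more increments.
with_weekly_full = ["F"] + ["B"] * 5 + ["F", "B", "B"]
# Case a): no Full backup is ever scheduled again.
without_full = ["F"] + ["B"] * 8
```

Under these assumptions, `simulate(with_weekly_full)` ends with three versions, while `simulate(without_full)` keeps all nine, matching cases b) and a) respectively.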
Two more things to note:
- When a file is backed up, its "date modified" is recorded in the Backup database. When the same file is to be backed up again, the software compares the file's current "date modified" with the value recorded in the database. Thus, if the file contents change but "date modified" stays the same (for example, robocopy can do this), the file is not considered changed and is not backed up.
- Data purge is performed at the end of a backup plan. So if a plan fails before the data purge, the data stays in storage until the next successful execution of the backup plan. This is how our Backup software prevents data loss that could be caused by early data deletion followed by a backup failure.
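The "date modified" comparison from the first note can be illustrated with a minimal Python sketch. It assumes a plain dictionary stands in for the Backup database and that only the modification time is compared. The function names are hypothetical and do not reflect the product's internals.

```python
import os


def needs_backup(path, database):
    """Return True if the file's current 'date modified' differs from the recorded one.

    Assumption: 'database' is a dict mapping file paths to the modification
    time recorded at the last backup. File contents are never inspected.
    """
    return database.get(path) != os.path.getmtime(path)


def record_backup(path, database):
    """Store the file's current 'date modified' after it has been backed up."""
    database[path] = os.path.getmtime(path)
```

With a check like this, a file whose contents are rewritten but whose modification time is then reset to the recorded value (for example with `os.utime`) is skipped, which is the behavior the note warns about.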