Important note on file provenance and GDG

Post by Steve Cranage

Generation Data Groups (GDG), also known as versioning, are necessary for data protection, specifically the ability to "rewind" a file back to a prior state. It's important to know that the concept is based on a file's kernel handle, which is assigned when the file is first created. As long as the kernel handle remains the same, there will be an unbroken chain of children as the file is modified over time.

The problem is that some applications break a file's provenance by deleting the file unexpectedly. 'tar' is one example: on an extract, it deletes any existing file that would collide before writing out the new one. We created ds-tar, which does not do this, and it is now included in the SMS rpm at /usr/bin/ds-tar. I'd suggest creating a tar alias that points to our version, for example alias tar=/usr/bin/ds-tar.

'vi' is another issue. When you open an existing file, vi copies the original to a swap file, and a ':wq' then relinks the swap file back to the original name. The result is that you get no provenance for prior copies: each save is recorded in the CDS as a separate file with a different kernel handle, so every generation shows up as a different file with a generation of 0. We don't have the spare resources to fix vi right now, so I'd recommend using another editor that doesn't behave this way.
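
If you want to check whether a particular editor or tool keeps a file's identity across a save, one rough approach is to compare the file before and after. The sketch below is only an illustration. It assumes the kernel handle behaves like a POSIX (device, inode) pair, which is my assumption and not something stated above, and the helper names (file_id, edit_in_place, edit_by_replace, demo) are made up for the example and are not part of any DeepSpace tool.

```python
import os
import tempfile


def file_id(path):
    """Return (device, inode) as a stand-in for the kernel handle (assumption)."""
    st = os.stat(path)
    return (st.st_dev, st.st_ino)


def edit_in_place(path, text):
    """Append to the existing file, so the same underlying file is modified."""
    with open(path, "a") as f:
        f.write(text)


def edit_by_replace(path, text):
    """Write a new file beside the original, then rename it over the original name."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        f.write(text)
    os.replace(tmp, path)  # the name now points at a different underlying file


def demo():
    """Apply both edit styles to a scratch file and return the three identities."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "demo.txt")
        with open(path, "w") as f:
            f.write("v1\n")
        original = file_id(path)

        edit_in_place(path, "v2\n")
        after_in_place = file_id(path)

        edit_by_replace(path, "v3\n")
        after_replace = file_id(path)
        return original, after_in_place, after_replace


if __name__ == "__main__":
    original, after_in_place, after_replace = demo()
    print("in-place edit kept identity:", original == after_in_place)
    print("replace-by-rename kept identity:", original == after_replace)
```

On a POSIX filesystem the in-place edit should keep the same identity and the replace-by-rename edit should not, which is the pattern that would break the chain of generations. To test a real editor, you could record file_id for a file, save it from that editor, and compare again.
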
Steve Cranage
Principal Architect, Co-Founder
DeepSpace Storage
