make renameData() more defensive during overwrites (#19548)

Upon any error in renameData(), we now preserve
the existing dataDir in some form, for
recoverability in strange situations such as
out-of-disk-space errors.

Bonus: avoid running list and heal(); instead, allow
the versions disparity check to return the actual
versions (uuids) to heal. This is currently limited to
disparate objects with 100 versions or fewer.

An undo now restores xl.meta from xl.meta.bkp
during overwrites on such flaky setups.

Bonus: save N-depth syscalls by skipping the parents
upon overwrites and versioned updates.

Examples of flaky setups are stretch clusters with
regular packet drops; we need some defensive code
around these to avoid dangling objects.
This commit is contained in:
Harshavardhana
2024-04-23 10:15:52 -07:00
committed by GitHub
parent ee1047bd52
commit 9693c382a8
22 changed files with 460 additions and 282 deletions

@@ -33,6 +33,8 @@ type DeleteOptions struct {
Recursive bool `msg:"r"`
Immediate bool `msg:"i"`
UndoWrite bool `msg:"u"`
// OldDataDir of the previous object
OldDataDir string `msg:"o,omitempty"` // old data dir used only when to revert a rename()
}
// BaseOptions represents common options for all Storage API calls
@@ -490,8 +492,14 @@ type WriteAllHandlerParams struct {
}
// RenameDataResp - RenameData()'s response.
// Provides information about the final state of Rename()
// - on xl.meta (array of versions) on disk to check for version disparity
// - on rewrite dataDir on disk that must be additionally purged
// only after as a 2-phase call, allowing the older dataDir to
// hang-around in-case we need some form of recovery.
type RenameDataResp struct {
Signature uint64 `msg:"sig"`
Sign []byte
OldDataDir string // contains '<uuid>', it is designed to be passed as value to Delete(bucket, pathJoin(object, dataDir))
}
// LocalDiskIDs - GetLocalIDs response.