Browse Source

Fixed block-boundary truncate issue

There has been a bug in the filesystem for a while where truncating to a
block boundary suffers from an off-by-one mistake that corrupts the
internal representation of the CTZ skip-list.

This mostly appears when the file_size == block_size, as file_size >
block_size includes CTZ skip-list metadata, so the underlying block
boundaries appear at slightly different offsets.

---

The reason for off-by-one issue is a nuance in lfs_ctz_find that we sort
of abuse to get two different behaviors.

Consider the situation where this bug occurs:

   block 0     block 1
  .--------.  .--------.
  | abcdef |<-| {ptr0} |
  | ghijkl |  | yzabcd |
  | mnopqr |  |        |
  | stuvwx |  |        |
  '--------'  '--------'

With these 24-byte blocks, there's an ambiguity if we wanted to point to
offset 24. We could point before the block boundary, or we could point
after the block boundary

Before:

   block 0     block 1
  .--------.  .--------.
  | abcdef |<-| {ptr0} |
  | ghijkl |  | yzabcd |
  | mnopqr |  |        |
  | stuvwx |  |        |
  '-------^'  '--------'
          '-- off=24 is here

After:

   block 0     block 1
  .--------.  .--------.
  | abcdef |<-| {ptr0} |
  | ghijkl |  | yzabcd |
  | mnopqr |  | ^      |
  | stuvwx |  | |      |
  '--------'  '-|------'
                '-- off=24 is here

When we want these two offsets depends on the context. We want the
offset to be conservative if it represents a size, but eager if it is
being used to prepare a block for writing.

The workaround/hack is to prefer the eager offset, after the block boundary,
but use `size-1` as the argument if we need the conservative offset.

This finds the correct block, but is off-by-one in the calculated
block-offset. Fortunately we happen to not use the block-offset in the
places we need this workaround/hack.

---

To get back to the bug, the wrong mode of lfs_ctz_find was used in
lfs_file_truncate, leading to internal corruption of the CTZ skip-list.

The correct behavior is size-1, with care to avoid underflow.

Also I've tweaked the code to make it clear the calculated block-offset
goes unused in these situations.

Thanks to ghost, ajaybhargav, and others for reporting the issue,
colin-foster-advantage for a reproducible test case, and rvanschoren,
hgspbs for the initial solution.
Christopher Haster 2 years ago
parent
commit
6dc18c38c1
1 changed files with 2 additions and 2 deletions
  1. 2 2
      lfs.c

+ 2 - 2
lfs.c

@@ -3348,7 +3348,7 @@ static lfs_ssize_t lfs_file_flushedwrite(lfs_t *lfs, lfs_file_t *file,
                     // find out which block we're extending from
                     int err = lfs_ctz_find(lfs, NULL, &file->cache,
                             file->ctz.head, file->ctz.size,
-                            file->pos-1, &file->block, &file->off);
+                            file->pos-1, &file->block, &(lfs_off_t){0});
                     if (err) {
                         file->flags |= LFS_F_ERRED;
                         return err;
@@ -3535,7 +3535,7 @@ static int lfs_file_rawtruncate(lfs_t *lfs, lfs_file_t *file, lfs_off_t size) {
         // lookup new head in ctz skip list
         err = lfs_ctz_find(lfs, NULL, &file->cache,
                 file->ctz.head, file->ctz.size,
-                size, &file->block, &file->off);
+                size-lfs_min(1, size), &file->block, &(lfs_off_t){0});
         if (err) {
             return err;
         }