@@ -396,42 +396,23 @@ for a file size:

Unfortunately, we're not quite done. The popcount function is non-injective,
so we can only find the file size from the block index, not the other way
-around. However, we can guess and correct. Consider an n' block index that
-is greater than n, we can find one pretty easily:
+around. However, we can solve for an n' block index that is greater than n
+with an error bounded by the range of the popcount function. We can then
+repeatedly substitute this n' into the original equation until the error
+is smaller than the integer division. As it turns out, we only need to
+perform this substitution once. Now we directly calculate our block index:

-
+

-where:
-n' >= n
-
-We can plug n' back into our popcount equation to find an N' file size that
-is greater than N. However, we need to rearrange our terms a bit to avoid
-integer overflow:
-
-
-
-where:
-N' >= N
-
-Now that we have N', we can find our block offset:
-
-
-
-where:
-off' >= off, our byte offset in the block
-
-Now we're getting somewhere. N' is greater than or equal to N, and as long as
-the number of pointers per block is bounded by the block size, it can only be
-different by at most one block. So we have two cases that can be determined by
-the sign of off'. If off' is negative, we correct n' and add a block to off'.
-Note that we also need to incorporate the overhead of the last block to get
-the right offset.
+Now that we have our block index n, we can just plug it back into the above
+equation to find the offset. However, we do need to rearrange the equation
+a bit to avoid integer overflow:

-
+

-It's a lot of math, but computers are very good at math. With these equations
-we can solve for the block index + offset while only needed to store the file
-size in O(1).
+The solution involves quite a bit of math, but computers are very good at math.
+We can now solve for the block index + offset while only needing to store the
+file size in O(1).
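
As a rough illustration of the added text, here is a minimal C sketch of the single-substitution lookup. It is not taken from this diff: the names `ctz_index` and `popcount32` are hypothetical, pointers are assumed to be 32 bits wide (4 bytes each), and the exact form of the two expressions is an assumption, since the equations themselves do not appear in the hunk.

```c
#include <assert.h>
#include <stdint.h>

/* Portable population count: number of set bits in x. */
static uint32_t popcount32(uint32_t x) {
    uint32_t count = 0;
    while (x) {
        x &= x - 1; /* clear the lowest set bit */
        count++;
    }
    return count;
}

/*
 * Map a file position (in bytes of file data) to a block index and an
 * offset inside that block. Assumes 4-byte pointers, that block 0 holds
 * no pointers, and that block n > 0 holds ctz(n)+1 pointers. The offset
 * is counted from the start of the block, so it includes the pointer
 * area at the front of the block.
 */
static uint32_t ctz_index(uint32_t block_size, uint32_t size, uint32_t *off) {
    /* Guards against underflow in the subtraction below. */
    assert(block_size >= 16);

    /* Block size minus the average overhead of two pointers per block. */
    uint32_t b = block_size - 2*4;

    /* First guess n' >= n, ignoring the popcount term entirely. */
    uint32_t n = size / b;
    if (n == 0) {
        *off = size;
        return 0;
    }

    /* Substitute the guess back in once to get the block index. */
    n = (size - 4*(popcount32(n - 1) + 2)) / b;

    /* Rearranged so each intermediate value stays within 32 bits. */
    *off = size - b*n - 4*popcount32(n);
    return n;
}
```

With 512-byte blocks, position 512 lands in block 1 at offset 4, just past that block's single pointer. Only the file size and the block size are needed as inputs, which is the O(1) storage the text refers to.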

Here is what it might look like to update a file stored with a CTZ skip-list:
```