[Week 9] Now Make It Dynamic!

When I was originally reading the Demertzis et al. paper, I thought writing an update scheme for the index would be the hardest part. After the storms I've conquered, this was a middling gale.

Summary

  • Implemented a merging scheme for encrypted indices
  • Added documentation and tests for IndexMerge
  • Discovered and analyzed a major bug

Details

This week was a return to normal for me. I had a precise idea of what the merging scheme would look like and which methods were to be exposed to the user. Some code from previous classes like IndexStorage and EncryptedSearch came in handy when designing the skeleton of IndexMerge, which felt like a nice touch.

Writing the tests was a breeze, but the very first time I ran them I hit a problem: three keywords were each missing one document id. To be concrete, the results for "open", "matrix" and "communication" were missing the document id "$e4uS...HGHg" even though all three keywords appear in that document.
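For readers who want to picture this kind of failure, here is a minimal sketch of a check that would surface it. It assumes a plaintext copy of the test messages is available and that search results can be collected as a mapping from keyword to document ids. The names build_expected_index and find_missing, and the sample data, are hypothetical and are not part of IndexMerge or its test suite.

```python
from typing import Dict, Set

def build_expected_index(documents: Dict[str, str]) -> Dict[str, Set[str]]:
    """Build a plaintext inverted index: keyword -> ids of documents containing it."""
    expected: Dict[str, Set[str]] = {}
    for doc_id, text in documents.items():
        for word in set(text.lower().split()):
            expected.setdefault(word, set()).add(doc_id)
    return expected

def find_missing(expected: Dict[str, Set[str]],
                 actual: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Return, per keyword, the document ids that the search failed to return."""
    missing: Dict[str, Set[str]] = {}
    for keyword, doc_ids in expected.items():
        diff = doc_ids - actual.get(keyword, set())
        if diff:
            missing[keyword] = diff
    return missing

# Hypothetical sample data: two tiny documents and a search result that lost one id.
documents = {"d1": "open matrix", "d2": "open communication"}
actual_results = {"open": {"d1"}, "matrix": {"d1"}, "communication": {"d2"}}

missing = find_missing(build_expected_index(documents), actual_results)
print(missing)
```

Reporting the difference per keyword, rather than a bare pass/fail, is what makes a pattern like "only the most frequent keywords are affected" visible.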

Now, there are multiple systems at play here: three indices being built, two indices being partitioned into "blocs", and one merge operation. One thing I noticed pretty early on was that the three keywords were the most frequent in the messages I was running the test over. Next: when I repeated the test, the missing document id was different.

Then I started backtracking through the process to find where it could have faltered. This step took a while because of the sheer volume of output from the test and the number of suspicious lines of code.

But eventually I narrowed my eyes at one method that seemed to be the culprit. IndexStorage.__convert_location was converting each local chunk's location to a location in a remote file. Notice the "a remote file" there, because several chunks had, in fact, been split across multiple files before reaching this method, and those extra files were simply ignored.
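The shape of that bug can be illustrated with a small sketch. This is not the actual IndexStorage code: the data layout (a chunk id mapped to a list of remote file parts) and both function names are assumptions made purely for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: (remote_file_name, offset, length) for each part of a chunk.
RemoteLocation = Tuple[str, int, int]

def convert_location_first_only(
    chunk_id: str, remote_parts: Dict[str, List[RemoteLocation]]
) -> RemoteLocation:
    """Mimics the bug: assumes one remote file per chunk and returns only the first."""
    return remote_parts[chunk_id][0]

def convert_location_all(
    chunk_id: str, remote_parts: Dict[str, List[RemoteLocation]]
) -> List[RemoteLocation]:
    """One possible fix: return every remote file the chunk was split into."""
    return list(remote_parts[chunk_id])

# Hypothetical data: chunk-1 was large enough to spill into a second file.
remote_parts = {
    "chunk-0": [("file-a", 0, 512)],
    "chunk-1": [("file-b", 0, 512), ("file-c", 0, 128)],
}

buggy = convert_location_first_only("chunk-1", remote_parts)
fixed = convert_location_all("chunk-1", remote_parts)
print(buggy)  # only the first part; anything stored in file-c is lost
print(fixed)  # both parts
```

In this sketch, any document id stored in the second file of a chunk silently disappears from the results, which matches the symptom of a single missing id per keyword.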

(I realize that I'm blaming the method as if it is a being with agency when in fact this is entirely Past Me's fault.)

Anyway, I've raised an issue for this and will sort it out once I've finished up with IndexMerge.
