[Week 5] Choppy Waters

Wrangling naughty data structures. That was my job this week.

Summary

  • Implemented code to blob-ify the encrypted index's datastore
  • Wrote an iterator-style interface for the client to store each blob in the Matrix Content Repository
  • Added tests for the above functionality

Details

My idea for storing extra-large buckets was greenlit by my mentor, but (quite contrary to my prediction) it turned out to be very difficult to implement properly. I'll summarize the basic idea here:

  1. We have a structure called a blob. Every blob is smaller than a defined size limit. Each blob corresponds to one file in the Matrix Content Repository and thus has one MXC URI.
  2. We will try to fit each level of our encrypted index into a separate blob.
  3. For the levels which are too big, we will try to divide them into sub-levels. Each sub-level is an array of one or more whole buckets. We will try to fit each sub-level into a separate blob.
  4. Some buckets may be too large to fit in a blob even on their own, so they can't be saved as part of a sub-level. In this case we split the bucket into equal-sized sub-buckets. Each sub-bucket is an array of strings. We will save each sub-bucket as a separate blob.
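The four steps above can be sketched roughly as follows. This is a minimal illustration, not the actual implementation: the names (`blobify`, `split_bucket`), the JSON-based size measurement, and the tiny size limit are all my assumptions for the example.

```python
import json

BLOB_LIMIT = 64  # hypothetical size limit in bytes, for illustration only


def size_of(obj) -> int:
    """Approximate serialized size of a structure (assumed JSON encoding)."""
    return len(json.dumps(obj).encode())


def split_bucket(bucket, limit):
    """Step 4: split one oversized bucket into equal-sized sub-buckets of strings."""
    parts = 2
    while True:
        n = -(-len(bucket) // parts)  # ceiling division: strings per sub-bucket
        subs = [bucket[i:i + n] for i in range(0, len(bucket), n)]
        # Stop once every sub-bucket fits, or when sub-buckets hold a single
        # string (a lone string larger than the limit cannot be split here).
        if all(size_of(s) <= limit for s in subs) or n == 1:
            return subs
        parts *= 2


def blobify(levels, limit=BLOB_LIMIT):
    """Turn a list of levels (each a list of buckets) into blobs under `limit`."""
    blobs = []
    for level in levels:
        if size_of(level) <= limit:
            blobs.append(level)  # step 2: the whole level fits in one blob
            continue
        sub_level = []  # step 3: pack whole buckets into sub-levels
        for bucket in level:
            if size_of(bucket) > limit:  # step 4: the bucket alone is too big
                if sub_level:
                    blobs.append(sub_level)
                    sub_level = []
                blobs.extend(split_bucket(bucket, limit))
            elif size_of(sub_level + [bucket]) > limit:
                blobs.append(sub_level)  # flush the current sub-level
                sub_level = [bucket]
            else:
                sub_level.append(bucket)
        if sub_level:
            blobs.append(sub_level)
    return blobs
```

Note that the resulting blobs are deliberately not all the same shape: some are whole levels or sub-levels (lists of buckets) and some are sub-buckets (lists of strings), which mirrors why the real code ends up so ugly.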

(The code for these steps is unavoidably ugly and I've rewritten it about eight times now 😭)

Once the blob-ification process is complete, we can provide each blob to the client to upload to the homeservers and save the MXC URIs in our IndexStorage object.
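The iterator-style interface can be pictured like this; a hedged sketch where `iter_blobs`, `upload_all`, and the upload callback are hypothetical stand-ins for the real client code:

```python
import json


def iter_blobs(blobs):
    """Yield each blob serialized as bytes, ready for upload (one at a time)."""
    for blob in blobs:
        yield json.dumps(blob).encode()


def upload_all(blobs, upload):
    """`upload` stands in for the client's Content Repository upload call;
    it takes bytes and returns the resulting MXC URI. The returned list of
    URIs is what would be saved in the IndexStorage object."""
    return [upload(data) for data in iter_blobs(blobs)]


# Example with a fake uploader standing in for the real client call:
uris = upload_all([["bucket"]], lambda data: "mxc://example.org/abc")
```

The point of the iterator is that the client never needs the whole serialized index in memory at once; it pulls one blob, uploads it, records the URI, and moves on.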

The next step is to update the Location objects in the encrypted index's lookup_table so that when a search is performed, we fetch only the required files rather than every file in the level. This is what I'm working on as of writing this post.

And it goes without saying that these methods have been thoroughly tested. Documentation, though, will have to wait until I'm done with the class.
