Research

Project 1: The Design Engine

Designing new indexing systems to meet application demands and hardware constraints is currently a manual process. In the Design Engine, we identify the primitive components from which data structures are built and describe them formally. From these primitives, we can synthesize complex data structures and model their behavior, which allows accurate predictions of how new indexing systems will perform. Finally, this formal specification yields a well-defined design space that search algorithms can explore to find optimal indexing structures.
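
As a rough illustration of the pipeline described above (primitives, a cost model over their compositions, and a search over the resulting space), here is a toy sketch in Python. Everything in it is an assumption made for illustration: the three primitives, their cost formulas, the two-level layout, and the names PRIMITIVES, design_cost, and best_design are hypothetical and do not come from the Design Engine itself.

```python
import math
from itertools import product

# Hypothetical primitives. Each maps a number of entries n to an estimated
# lookup and insert cost in abstract units. The formulas are toy assumptions,
# not measurements and not the project's cost model.
PRIMITIVES = {
    "sorted_array": {"lookup": lambda n: math.log2(n + 1), "insert": lambda n: n / 2},
    "unsorted_array": {"lookup": lambda n: n / 2, "insert": lambda n: 1.0},
    "hash_table": {"lookup": lambda n: 1.0, "insert": lambda n: 1.5},
}


def design_cost(design, num_keys, fanout, lookups, inserts):
    """Estimate the cost of a two-level design for a given workload.

    `design` is a pair (internal, leaf): one internal node routes each
    operation to one of `fanout` leaves that split the keys evenly.
    """
    internal, leaf = design
    leaf_size = num_keys / fanout
    route = PRIMITIVES[internal]["lookup"](fanout)
    lookup_cost = route + PRIMITIVES[leaf]["lookup"](leaf_size)
    insert_cost = route + PRIMITIVES[leaf]["insert"](leaf_size)
    return lookups * lookup_cost + inserts * insert_cost


def best_design(num_keys, fanout, lookups, inserts):
    """Exhaustively search the (tiny) design space for the cheapest design."""
    candidates = product(PRIMITIVES, repeat=2)
    return min(
        candidates,
        key=lambda d: design_cost(d, num_keys, fanout, lookups, inserts),
    )
```

Under these toy costs, a lookup-only workload and an insert-only workload select different leaf primitives, which is the kind of workload-dependent choice a search over the design space is meant to automate. A real design space is far larger, so exhaustive enumeration would give way to a smarter search.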

Project 2: Lossy Compression as an Accelerator in Column Stores

This project uses lossy compression to speed up predicate evaluation. The idea is to strategically trade determinism in value representation for more compact code values. The more compact representation improves I/O performance and allows more values to be packed into each SIMD register, which improves CPU performance. The result is an index that can be built quickly and provides a 3x-6x performance improvement in the select operator, regardless of query selectivity.
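
To make the idea concrete, the following is a minimal sketch of evaluating a less-than predicate over lossy codes. It assumes an order-preserving mapping from values to a small number of codes, built from roughly equal-sized bins; the functions build_boundaries, encode, and select_less_than are hypothetical names chosen for this example, and the scalar Python loop stands in for what would be a SIMD scan over packed codes in a real system. It is not the project's implementation.

```python
from bisect import bisect_right


def build_boundaries(values, num_codes):
    """Pick split points so that each code covers roughly the same number of values."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        return []
    cuts = {ordered[(i * n) // num_codes] for i in range(1, num_codes)}
    return sorted(cuts)


def encode(value, boundaries):
    """Map a value to its code: the number of boundaries less than or equal to it.

    The mapping is order-preserving but lossy, since many values share a code.
    """
    return bisect_right(boundaries, value)


def select_less_than(values, codes, boundaries, threshold):
    """Return positions whose value is < threshold, consulting codes first."""
    t_code = encode(threshold, boundaries)
    result = []
    for pos, code in enumerate(codes):
        if code < t_code:
            # A smaller code guarantees a smaller value: qualifies outright.
            result.append(pos)
        elif code == t_code and values[pos] < threshold:
            # Same code is ambiguous, so fall back to the base data.
            result.append(pos)
        # A larger code guarantees a larger value: skipped without a lookup.
    return result
```

Only positions whose code equals the threshold's code need a lookup in the base data, so with many codes most of the scan touches the compact representation alone.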

Project 3: Temporally Biased Sampling for Online Model Management

In this project, we consider time-biased sampling as a technique for online machine learning. We extend previously known methods for time-biased sampling, providing the first algorithm that gives complete control over the decay rate while bounding the maximum sample size. The full work is available at https://arxiv.org/abs/1801.09709
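
For readers unfamiliar with time-biased sampling, here is a minimal sketch of a simple exponential-decay scheme. It is a baseline only and not the algorithm from the paper: it controls the decay rate but does not bound the sample size, which is the gap the project addresses. The function decay_step and its parameters are hypothetical names for this example.

```python
import math
import random


def decay_step(sample, new_batch, decay_rate, rng):
    """Advance the sample by one time step.

    Each existing item survives independently with probability
    exp(-decay_rate), so an item that arrived k steps ago is still present
    with probability exp(-decay_rate * k). New items are always accepted.
    """
    keep_prob = math.exp(-decay_rate)
    survivors = [item for item in sample if rng.random() < keep_prob]
    return survivors + list(new_batch)


def run(batches, decay_rate, seed=0):
    """Feed a sequence of batches through decay_step and return the final sample."""
    rng = random.Random(seed)
    sample = []
    for batch in batches:
        sample = decay_step(sample, batch, decay_rate, rng)
    return sample
```

With a constant batch size b, the expected sample size under this scheme approaches b / (1 - exp(-decay_rate)), so the sample grows whenever batches grow. Bounding the sample size without giving up control of the decay rate requires a more careful scheme.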