Utah Scalable Computer Systems: Large-scale Systems Atop Scale-out In-memory Storage

Large-scale Systems Atop Scale-out In-memory Storage

Splinter team at OSDI'18 in San Diego (Sara Moore, Ryan Stutsman, Chinmay Kulkarni, and Mazhar Naqvi).

Research into kernel-bypass networking, RDMA, and in-memory storage has produced distributed storage systems that can provide on-demand access to billions of pieces of information per second, but the benefit of these fast systems has not yet trickled down to applications. A key problem is that these systems are fast in part because they are simplistic and stripped down, which limits how applications can interact with data.

The goal of this project has been to generalize the benefits of fast networking to support full, realistic applications. The work has spanned three thrusts.

Foremost is a new model of stored procedures for μs-scale in-memory storage that overcomes the simplistic data models of today's fast in-memory stores. The goal is to eliminate the inefficiency and forced data movement that come from simplistic get() and put() storage interfaces without sacrificing the performance of running native code on stored data (Splinter and Sandstorm).
For low-latency in-memory storage to be efficient, it must also scale in response to workload changes with minimal impact on applications, especially on storage tail latency, which hurts applications that collect data in real time at scale. Hence, another goal has been minimizing the impact of data migration (Rocksteady) and replication (Tailwind) for in-memory storage, along with dynamically reapportioning memory among application caches (Memshare).
Finally, we have built several applications on top of scale-out in-memory storage, including real-time distributed aggregation, inference serving for machine learning models, real-time graph querying with state-of-the-art performance, and a fault-tolerant 4G mobile control plane (ECHO).
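
To make the first thrust concrete, the following toy sketch contrasts a client that walks data with get() calls against one that ships a procedure to the store. It is only an illustration under stated assumptions: ToyStore, invoke(), and the friend/age data are hypothetical names invented here, not the API of Splinter, Sandstorm, or any real system, and "round trips" are simply counted rather than sent over a network.

```python
class ToyStore:
    """Hypothetical in-memory key-value store that counts client round trips."""

    def __init__(self):
        self._data = {}
        self.round_trips = 0

    def reset_counters(self):
        self.round_trips = 0

    def put(self, key, value):
        # One simulated network round trip per put().
        self.round_trips += 1
        self._data[key] = value

    def get(self, key):
        # One simulated network round trip per get(); the value is shipped out.
        self.round_trips += 1
        return self._data[key]

    def invoke(self, procedure, *args):
        # Assumed stored-procedure entry point: the procedure runs next to the
        # data, so only its result crosses the (simulated) network.
        self.round_trips += 1
        return procedure(self._data, *args)


def sum_friend_ages_client_side(store, user):
    """Dependent lookups: fetch the friend list, then fetch each friend's age."""
    friends = store.get(("friends", user))
    return sum(store.get(("age", friend)) for friend in friends)


def sum_friend_ages_procedure(data, user):
    """The same logic written to run inside the store, directly on its data."""
    return sum(data[("age", friend)] for friend in data[("friends", user)])


def build_example_store():
    """Populate a store with a tiny, made-up data set."""
    store = ToyStore()
    store.put(("friends", "ann"), ["bob", "cy"])
    store.put(("age", "bob"), 30)
    store.put(("age", "cy"), 41)
    return store


if __name__ == "__main__":
    store = build_example_store()

    store.reset_counters()
    total = sum_friend_ages_client_side(store, "ann")
    print("get()-based:", total, "round trips:", store.round_trips)

    store.reset_counters()
    total = store.invoke(sum_friend_ages_procedure, "ann")
    print("procedure:  ", total, "round trips:", store.round_trips)
```

In this sketch the get()-based version needs one round trip for the friend list plus one per friend, while the procedure version needs a single round trip regardless of how many friends there are. The sketch ignores isolation, multi-tenancy, and native-code execution, which are the hard parts the project addresses.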

Publications

μs-scale Stored Procedures

Splinter: Bare-Metal Extensions for Multi-Tenant Low-Latency Storage

Chinmay Kulkarni, Sara Moore, Mazhar Naqvi, Tian Zhang, Robert Ricci, and Ryan Stutsman

OSDI '18

Application-specific native-code storage-level functions with 10 μs response times.

JavaScript for Extending Low-latency In-memory Key-value Stores

Tian Zhang and Ryan Stutsman

HotCloud '17

JIT JavaScript/WASM runtimes for embedding data-intensive operations within μs-scale storage systems.

Distribution

Rocksteady: Fast Data Migration for Low-latency In-memory Storage

Chinmay Kulkarni, Aniraj Kesavan, Tian Zhang, Robert Ricci, and Ryan Stutsman

SOSP '17

Tail-latency focused data migration for μs-scale in-memory storage that exploits workload skew.

Tailwind: Fast and Atomic RDMA-based Replication

Yacine Taleb, Ryan Stutsman, Gabriel Antoniu, and Toni Cortes

USENIX ATC '18

RDMA-based replication that accelerates both replication and normal-case request processing for μs-scale in-memory storage.

Memshare: Memory Resource Sharing in Multi-tenant Web Caches

Asaf Cidon, Daniel Rushton, Stephen M. Rumble, and Ryan Stutsman

USENIX ATC '17

Dynamic memory partitioning for multi-tenant web caches that improves hit rates while providing performance isolation.

Applications

ECHO: A Reliable Distributed Cellular Core Network for Hyper-scale Public Clouds

Binh Nguyen, Tian Zhang, Bozidar Radunovic, Ryan Stutsman, Thomas Karagiannis, Jakub Kocur, and Jacobus Van der Merwe

MobiCom '18

A mobile control plane that externalizes state in fast storage for fault tolerance and high availability.

Impact

Beyond improvements to scale-out μs-scale storage, the work under this project has led to collaborations with Microsoft Research. In addition to supporting graduate students, the project has involved several undergraduates, including two who coauthored publications listed above and two who have completed undergraduate theses.

Team

Generously Sponsored By

Facebook and VMware