Performant and Resilient Service Composition for Modern Cloud Applications
Author(s)
Li, Tianyu
Advisor
Madden, Samuel R.
Abstract
Modern cloud applications are often distributed systems composed from vendor-provided building blocks (e.g., object storage services, container orchestration services). Consequently, distributed fault-tolerance is a central concern for application correctness. Although each building block may offer individual fault-tolerance, the end-to-end application is still susceptible to failures, because the composition logic that orchestrates them may itself fail. This thesis explores resilient composition, a systematic way to assemble fault-tolerant components into resilient end-to-end distributed applications. We begin by presenting the fail-restart system model, which captures the unique fault-tolerance challenges that arise when composing services. Based on this model, we define Composable Resilient Steps (CReSt), an atomic programming abstraction that guarantees fault-tolerance across the assembled application. We then detail efficient methods for implementing CReSt using a range of database techniques, and a novel distributed protocol that allows optimistic, speculative execution ahead of slower fault-tolerance safeguards. Together, these pieces allow developers to assemble fault-tolerant distributed systems that are correct by construction and often more performant than existing solutions.
Date issued
2025-05
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology