DSpace@MIT (MIT Libraries)


Optimizing SigmaOS for Efficient Orchestration of Fault-Tolerant, Burst-Parallel Workloads

Author(s)
Chang, Ryan
Advisor
Szekely, Ariel
Kaashoek, M. Frans
Terms of use
In Copyright - Educational Use Permitted Copyright retained by author(s) https://rightsstatements.org/page/InC-EDU/1.0/
Abstract
SigmaOS is a multi-tenant cloud operating system designed for efficient orchestration of fault-tolerant, burst-parallel workloads. It provides users with isolated cloud environments called realms, in which resources are accessed through a Unix-like filesystem interface, and it supports applications built from procs: lightweight, rapidly spawnable programs that can be either short-lived for bursty tasks or long-running and stateful for persistent services. However, the current prototype exhibits performance bottlenecks that hinder its scalability for larger, more demanding applications. This thesis addresses these limitations with two key optimizations: (1) a rearchitected watch API that monitors directory changes more efficiently and scalably, which is crucial for inter-proc coordination and event notification, and (2) a new ft/task server that provides a robust, high-performance mechanism for managing fault-tolerant bags of tasks, which are essential for applications like MapReduce. With these enhancements, SigmaOS performs significantly better on the MapReduce benchmark and scales better to larger cluster deployments, larger inputs, and more granular tasks. These optimizations are crucial steps toward enabling SigmaOS to realize its vision as a scalable and performant platform for complex cloud workloads.
Date issued
2025-05
URI
https://hdl.handle.net/1721.1/162546
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology

Collections
  • Graduate Theses
