Accelerating Burst Parallelism of SigmaOS processes with CRIU
Author(s)
Tang, Frederick
Advisor
Szekely, Ariel
Kaashoek, Frans
Abstract
σOS is a multi-tenant cloud operating system designed to integrate the agility of serverless environments with the interactivity of microservices. A goal of this integration is the ability to start new instances of server processes quickly. However, σOS only handles σcontainer initialization and does not assist with runtime and app initialization costs. One approach to overcome this challenge is to checkpoint processes using Checkpoint/Restore in Userspace (CRIU). CRIU is a Linux toolset that can start new server instances by restoring them from a saved checkpointed state, avoiding the full cost of reinitialization and setup. This thesis introduces σCRIU, which adapts CRIU for burst-parallel spawning of microservices in σOS. σCRIU implements a number of optimizations: compressing checkpointed proc metadata to reduce network communication costs, implementing demand paging using a lazy page service, and caching kernel metadata to reduce the latency of CRIU’s restore operation. These optimizations allow σCRIU to start new microservices on remote machines quickly while still making use of CRIU’s existing, proven checkpoint and restore technology.
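The demand-paging idea mentioned in the abstract can be illustrated with a small, self-contained sketch. This is a toy model only, not σCRIU's or CRIU's implementation: the names `LazyPageStore`, `fetch_page`, and `make_checkpoint` are hypothetical, the "page service" is a plain in-memory dictionary rather than a network service, and the 4096-byte page size is an assumption. It shows only the general principle that a restored process can start with no pages resident and pull each page in on first access.

```python
from typing import Callable, Dict

PAGE_SIZE = 4096  # assumed page size in bytes (illustrative)


class LazyPageStore:
    """Toy model of demand paging: a page is fetched only on first access."""

    def __init__(self, fetch_page: Callable[[int], bytes]) -> None:
        # fetch_page stands in for a request to a remote lazy page service.
        self._fetch_page = fetch_page
        self._resident: Dict[int, bytes] = {}  # pages already brought in
        self.fetches = 0  # count of simulated remote fetches

    def read(self, addr: int, length: int) -> bytes:
        """Read `length` bytes starting at `addr`, faulting in missing pages."""
        out = bytearray()
        end = addr + length
        while addr < end:
            page_no, offset = divmod(addr, PAGE_SIZE)
            if page_no not in self._resident:
                # Simulated page fault: fetch the page, then keep it resident.
                self._resident[page_no] = self._fetch_page(page_no)
                self.fetches += 1
            chunk = min(PAGE_SIZE - offset, end - addr)
            out += self._resident[page_no][offset:offset + chunk]
            addr += chunk
        return bytes(out)


def make_checkpoint(num_pages: int) -> Dict[int, bytes]:
    """Build a fake checkpoint image in which page n is filled with byte n % 256."""
    return {n: bytes([n % 256]) * PAGE_SIZE for n in range(num_pages)}


if __name__ == "__main__":
    image = make_checkpoint(8)
    store = LazyPageStore(lambda n: image[n])
    store.read(PAGE_SIZE - 2, 4)  # spans pages 0 and 1
    print("pages fetched:", store.fetches, "of", len(image))
```

In this sketch a read that touches two pages triggers two fetches, and the remaining pages of the image are never transferred unless they are accessed. That is the property that makes lazy restore attractive when a process uses only part of its checkpointed memory soon after starting.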
Date issued
2025-09
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology