dc.contributor.author: Chaudhry, Gohar Irfan
dc.contributor.author: Choukse, Esha
dc.contributor.author: Goiri, Íñigo
dc.contributor.author: Fonseca, Rodrigo
dc.contributor.author: Belay, Adam
dc.contributor.author: Bianchini, Ricardo
dc.date.accessioned: 2025-12-17T16:42:21Z
dc.date.available: 2025-12-17T16:42:21Z
dc.date.issued: 2025-06-06
dc.identifier.isbn: 979-8-4007-1475-7
dc.identifier.uri: https://hdl.handle.net/1721.1/164381
dc.description: HOTOS 25, May 14–16, 2025, Banff, AB, Canada [en_US]
dc.description.abstract: Compound AI Systems, integrating multiple interacting components like models, retrievers, and external tools, have emerged as essential for addressing complex AI tasks. However, current implementations suffer from inefficient resource utilization due to tight coupling between application logic and execution details, a disconnect between orchestration and resource management layers, and the perceived exclusiveness between efficiency and quality. We propose a vision for resource-efficient Compound AI Systems through a declarative workflow programming model and an adaptive runtime system for dynamic scheduling and resource-aware decision-making. Decoupling application logic from low-level details exposes levers for the runtime to flexibly configure the execution environment and resources, without compromising on quality. Enabling collaboration between the workflow orchestration and cluster manager enables higher efficiency through better scheduling and resource management. We are building a prototype system, called Murakkab, to realize this vision. Our preliminary evaluation demonstrates speedups up to ~3.4× in workflow completion times while delivering ~4.5× higher energy efficiency, showing promise in optimizing resources and advancing AI system design. [en_US]
dc.publisher: ACM|Workshop on Hot Topics in Operating Systems [en_US]
dc.relation.isversionof: https://doi.org/10.1145/3713082.3730377 [en_US]
dc.rights: Creative Commons Attribution [en_US]
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/ [en_US]
dc.source: Association for Computing Machinery [en_US]
dc.title: Towards Resource-Efficient Compound AI Systems [en_US]
dc.type: Article [en_US]
dc.identifier.citation: Gohar Irfan Chaudhry, Esha Choukse, Íñigo Goiri, Rodrigo Fonseca, Adam Belay, and Ricardo Bianchini. 2025. Towards Resource-Efficient Compound AI Systems. In Proceedings of the 2025 Workshop on Hot Topics in Operating Systems (HotOS '25). Association for Computing Machinery, New York, NY, USA, 218–224. [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory [en_US]
dc.identifier.mitlicense: PUBLISHER_POLICY
dc.eprint.version: Final published version [en_US]
dc.type.uri: http://purl.org/eprint/type/ConferencePaper [en_US]
eprint.status: http://purl.org/eprint/status/NonPeerReviewed [en_US]
dc.date.updated: 2025-08-01T08:32:24Z
dc.language.rfc3066: en
dc.rights.holder: The author(s)
dspace.date.submission: 2025-08-01T08:32:24Z
mit.license: PUBLISHER_CC
mit.metadata.status: Authority Work and Publication Information Needed [en_US]

