Show simple item record

dc.contributor.authorDu, Xingran
dc.contributor.authorEmer, Joel
dc.contributor.authorSanchez, Daniel
dc.date.accessioned2025-09-02T19:16:04Z
dc.date.available2025-09-02T19:16:04Z
dc.date.issued2025-08-06
dc.identifier.isbn979-8-4007-1080-3
dc.identifier.urihttps://hdl.handle.net/1721.1/162597
dc.descriptionASPLOS ’25, Rotterdam, Netherlandsen_US
dc.description.abstractAutomata processing (AP) is a key kernel in data analytics and scientific computing. AP workloads process a stream of symbols with many automata (FSMs) in parallel, e.g., pattern-matching network traffic against many malicious strings. The need for high-performance AP has sparked the design of specialized accelerators. But prior AP accelerators are inefficient: AP workloads have substantial sparsity, but accelerators exploit no or limited sparsity. Specifically, each AP workload can be expressed as the concurrent traversal of all automata, which are encoded as graphs. But state-of-the-art accelerators store these graphs uncompressed, using bitsets. This allows the use of specialized memory crossbars that provide high parallelism and efficiency when graphs are dense. But many graphs are highly sparse, making crossbar-based accelerators inefficient. We present Hopps, the first automata processing accelerator that exploits sparse data representations. Hopps combines two types of processing units: one represents data uncompressed, which achieves high throughput but is space-inefficient, while the other uses a compressed-sparse representation, which achieves high space efficiency but lower and more variable throughput. To use Hopps well, we present a novel automata mapping algorithm that maps most work to high-throughput units, while keeping a large fraction of state in space-efficient units. Hopps's hybrid design relaxes several constraints in crossbar-based designs, allowing for more efficient high-throughput units (e.g., by using a large number of smaller crossbars). Thus, by making the uncommon case cheap, Hopps makes the common case even faster. We evaluate Hopps on AutomataZoo benchmarks. Hopps outperforms prior state-of-the-art accelerators Impala and SpAP by gmean 2.5x and 2.2x when using equal area.en_US
dc.publisherACM|Proceedings of the 30th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 3en_US
dc.relation.isversionofhttps://doi.org/10.1145/3676642.3736126en_US
dc.rightsCreative Commons Attribution-Noncommercial-ShareAlikeen_US
dc.rights.urihttp://creativecommons.org/licenses/by-nc-sa/4.0/en_US
dc.sourceAssociation for Computing Machineryen_US
dc.titleHopps: Leveraging Sparsity to Accelerate Automata Processingen_US
dc.typeArticleen_US
dc.identifier.citationXingran Du, Joel S. Emer, and Daniel Sanchez. 2025. Hopps: Leveraging Sparsity to Accelerate Automata Processing. In Proceedings of the 30th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 3 (ASPLOS '25). Association for Computing Machinery, New York, NY, USA, 96–111.en_US
dc.contributor.departmentMassachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratoryen_US
dc.identifier.mitlicensePUBLISHER_POLICY
dc.eprint.versionFinal published versionen_US
dc.type.urihttp://purl.org/eprint/type/ConferencePaperen_US
eprint.statushttp://purl.org/eprint/status/NonPeerRevieweden_US
dc.date.updated2025-09-01T07:49:37Z
dc.language.rfc3066en
dc.rights.holderThe author(s)
dspace.date.submission2025-09-01T07:49:38Z
mit.licensePUBLISHER_CC
mit.metadata.statusAuthority Work and Publication Information Neededen_US


Files in this item

Thumbnail

This item appears in the following Collection(s)

Show simple item record