
dc.contributor.author: Makni, Mehdi
dc.contributor.author: Behdin, Kayhan
dc.contributor.author: Afriat, Gabriel
dc.contributor.author: Xu, Zheng
dc.contributor.author: Vassilvitskii, Sergei
dc.contributor.author: Ponomareva, Natalia
dc.contributor.author: Mazumder, Rahul
dc.contributor.author: Hazimeh, Hussein
dc.date.accessioned: 2025-09-09T19:57:11Z
dc.date.available: 2025-09-09T19:57:11Z
dc.date.issued: 2025-08-03
dc.identifier.isbn: 979-8-4007-1454-2
dc.identifier.uri: https://hdl.handle.net/1721.1/162621
dc.description: KDD ’25, Toronto, ON, Canada [en_US]
dc.description.abstract: Differentially private stochastic gradient descent (DP-SGD) is broadly considered the gold standard for training and fine-tuning neural networks under differential privacy (DP). With the increasing availability of high-quality pre-trained model checkpoints (e.g., vision and language models), fine-tuning has become a popular strategy. However, despite recent progress in understanding and applying DP-SGD to private transfer learning, significant challenges remain, most notably the performance gap between models fine-tuned with DP-SGD and their non-private counterparts. Sparse fine-tuning on private data has emerged as an alternative to full-model fine-tuning: recent work has shown that privately fine-tuning only a small subset of model weights, while keeping the rest fixed, can lead to better performance. In this work, we propose a new approach for sparse fine-tuning of neural networks under DP. Existing work on private sparse fine-tuning often uses a fixed choice of trainable weights (e.g., updating only the last layer) or relies on the public model's weights to choose the subset of weights to modify; such choices remain suboptimal. In contrast, we explore an optimization-based approach in which the selection method uses private gradient information together with off-the-shelf privacy accounting techniques. Our numerical experiments on several computer vision models and datasets show that our parameter selection method yields better prediction accuracy than full-model private fine-tuning or existing private sparse fine-tuning approaches. Our code is available at https://github.com/mazumder-lab/SPARTA/tree/main [en_US] (An illustrative sketch of the gradient-based selection idea follows this record.)
dc.publisher: ACM|Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining V.2 [en_US]
dc.relation.isversionof: https://doi.org/10.1145/3711896.3736842 [en_US]
dc.rights: Creative Commons Attribution [en_US]
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/ [en_US]
dc.source: Association for Computing Machinery [en_US]
dc.title: SPARTA: An Optimization Framework for Differentially Private Sparse Fine-Tuning [en_US]
dc.type: Article [en_US]
dc.identifier.citation: Mehdi Makni, Kayhan Behdin, Gabriel Afriat, Zheng Xu, Sergei Vassilvitskii, Natalia Ponomareva, Rahul Mazumder, and Hussein Hazimeh. 2025. SPARTA: An Optimization Framework for Differentially Private Sparse Fine-Tuning. In Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining V.2 (KDD '25). Association for Computing Machinery, New York, NY, USA, 2090–2101. [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Operations Research Center [en_US]
dc.contributor.department: Sloan School of Management [en_US]
dc.identifier.mitlicense: PUBLISHER_POLICY
dc.eprint.version: Final published version [en_US]
dc.type.uri: http://purl.org/eprint/type/ConferencePaper [en_US]
eprint.status: http://purl.org/eprint/status/NonPeerReviewed [en_US]
dc.date.updated: 2025-09-01T07:50:30Z
dc.language.rfc3066: en
dc.rights.holder: The author(s)
dspace.date.submission: 2025-09-01T07:50:31Z
mit.license: PUBLISHER_CC
mit.metadata.status: Authority Work and Publication Information Needed [en_US]
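
The abstract above describes an optimization-based selection of trainable weights driven by private gradient information. The following is a minimal illustrative sketch, not the paper's SPARTA implementation (see the linked repository for that): a generic PyTorch DP-SGD step in which the trainable coordinates are the largest-magnitude entries of a clipped, noised gradient. The hyperparameters (clip_norm, noise_multiplier, sparsity) and the toy linear model in the usage lines are hypothetical placeholders, and privacy accounting is omitted.

import torch

def private_grad(model, loss_fn, xb, yb, clip_norm, noise_multiplier):
    """Clipped, noised mean gradient over a batch (Gaussian mechanism), flattened."""
    params = list(model.parameters())
    total = torch.zeros(sum(p.numel() for p in params))
    for x, y in zip(xb, yb):  # per-example gradients via a simple microbatch loop
        model.zero_grad()
        loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0)).backward()
        g = torch.cat([p.grad.flatten() for p in params])
        g = g * min(1.0, clip_norm / (g.norm().item() + 1e-12))  # clip per-example norm
        total += g
    noise = noise_multiplier * clip_norm * torch.randn_like(total)
    return (total + noise) / len(xb)

def select_mask(noisy_grad, sparsity):
    """Keep only the top fraction of coordinates by privatized gradient magnitude."""
    k = max(1, int(sparsity * noisy_grad.numel()))
    mask = torch.zeros_like(noisy_grad)
    mask[noisy_grad.abs().topk(k).indices] = 1.0
    return mask

def masked_dp_sgd_step(model, noisy_grad, mask, lr):
    """Update only the selected coordinates; all other weights stay frozen."""
    offset = 0
    with torch.no_grad():
        for p in model.parameters():
            n = p.numel()
            p -= lr * (noisy_grad[offset:offset + n] * mask[offset:offset + n]).view_as(p)
            offset += n

# Usage on a toy model (hypothetical shapes and hyperparameters).
model = torch.nn.Linear(10, 2)
xb, yb = torch.randn(8, 10), torch.randint(0, 2, (8,))
g = private_grad(model, torch.nn.functional.cross_entropy, xb, yb,
                 clip_norm=1.0, noise_multiplier=1.1)
mask = select_mask(g, sparsity=0.05)
masked_dp_sgd_step(model, g, mask, lr=0.1)

The magnitude-based mask above is only a stand-in for the paper's optimization-based selection; the actual method, privacy accounting, and experiments are in the repository cited in the abstract.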

