Greedy Algorithm Almost Dominates in Smoothed Contextual Bandits

Raghavan, Manish; Slivkins, Aleksandrs; Vaughan, Jennifer Wortman; Wu, Zhiwei Steven

dc.contributor.author	Raghavan, Manish
dc.contributor.author	Slivkins, Aleksandrs
dc.contributor.author	Vaughan, Jennifer Wortman
dc.contributor.author	Wu, Zhiwei Steven
dc.date.accessioned	2026-04-08T16:56:24Z
dc.date.available	2026-04-08T16:56:24Z
dc.date.issued	2023-04-30
dc.identifier.uri	https://hdl.handle.net/1721.1/165369
dc.description.abstract	Online learning algorithms, widely used to power search and content optimization on the web, must balance exploration and exploitation, potentially sacrificing the experience of current users in order to gain information that will lead to better decisions in the future. While necessary in the worst case, explicit exploration has a number of disadvantages compared to the greedy algorithm that always "exploits" by choosing an action that currently looks optimal. We determine under what conditions inherent diversity in the data makes explicit exploration unnecessary. We build on a recent line of work on the smoothed analysis of the greedy algorithm in the linear contextual bandits model. We improve on prior results to show that the greedy algorithm almost matches the best possible Bayesian regret rate of any other algorithm on the same problem instance whenever the diversity conditions hold. The key technical finding is that data collected by the greedy algorithm suffices to simulate a run of any other algorithm. Further, we prove that under a particular smoothness assumption, the Bayesian regret of the greedy algorithm is at most $\tilde{O}(T^{1/3})$ in the worst case, where $T$ is the time horizon.	en_US
dc.language.iso	en
dc.publisher	Society for Industrial & Applied Mathematics (SIAM)	en_US
dc.relation.isversionof	https://doi.org/10.1137/19M1247115	en_US
dc.rights	Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use.	en_US
dc.source	Society for Industrial & Applied Mathematics (SIAM)	en_US
dc.title	Greedy Algorithm Almost Dominates in Smoothed Contextual Bandits	en_US
dc.type	Article	en_US
dc.identifier.citation	Raghavan, Manish, Slivkins, Aleksandrs, Vaughan, Jennifer Wortman and Wu, Zhiwei Steven. 2023. "Greedy Algorithm Almost Dominates in Smoothed Contextual Bandits." SIAM Journal on Computing, 52 (2).
dc.contributor.department	Sloan School of Management	en_US
dc.relation.journal	SIAM Journal on Computing	en_US
dc.eprint.version	Final published version	en_US
dc.type.uri	http://purl.org/eprint/type/JournalArticle	en_US
eprint.status	http://purl.org/eprint/status/PeerReviewed	en_US
dc.date.updated	2026-04-08T14:56:18Z
dspace.orderedauthors	Raghavan, M; Slivkins, A; Vaughan, JW; Wu, ZS	en_US
dspace.date.submission	2026-04-08T14:56:19Z
mit.journal.volume	52	en_US
mit.journal.issue	2	en_US
mit.license	PUBLISHER_POLICY
mit.metadata.status	Authority Work and Publication Information Needed	en_US

Files in this item

Name: 19m1247115 (1).pdf
Size: 549.4Kb
Format: PDF
Description: Published version

This item appears in the following Collection(s)

MIT Open Access Articles