| dc.contributor.author | Raghavan, Manish | |
| dc.contributor.author | Slivkins, Aleksandrs | |
| dc.contributor.author | Vaughan, Jennifer Wortman | |
| dc.contributor.author | Wu, Zhiwei Steven | |
| dc.date.accessioned | 2026-04-08T16:56:24Z | |
| dc.date.available | 2026-04-08T16:56:24Z | |
| dc.date.issued | 2023-04-30 | |
| dc.identifier.uri | https://hdl.handle.net/1721.1/165369 | |
| dc.description.abstract | Online learning algorithms, widely used to power search and content optimization on the web, must balance exploration and exploitation, potentially sacrificing the experience of current users in order to gain information that will lead to better decisions in the future. While necessary in the worst case, explicit exploration has a number of disadvantages compared to the greedy algorithm that always "exploits" by choosing an action that currently looks optimal. We determine under what conditions inherent diversity in the data makes explicit exploration unnecessary. We build on a recent line of work on the smoothed analysis of the greedy algorithm in the linear contextual bandits model. We improve on prior results to show that the greedy algorithm almost matches the best possible Bayesian regret rate of any other algorithm on the same problem instance whenever the diversity conditions hold. The key technical finding is that data collected by the greedy algorithm suffices to simulate a run of any other algorithm. Further, we prove that under a particular smoothness assumption, the Bayesian regret of the greedy algorithm is at most $\tilde{O}(T^{1/3})$ in the worst case, where $T$ is the time horizon. | en_US |
| dc.language.iso | en | |
| dc.publisher | Society for Industrial & Applied Mathematics (SIAM) | en_US |
| dc.relation.isversionof | https://doi.org/10.1137/19M1247115 | en_US |
| dc.rights | Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use. | en_US |
| dc.source | Society for Industrial & Applied Mathematics (SIAM) | en_US |
| dc.title | Greedy Algorithm Almost Dominates in Smoothed Contextual Bandits | en_US |
| dc.type | Article | en_US |
| dc.identifier.citation | Raghavan, Manish, Slivkins, Aleksandrs, Vaughan, Jennifer Wortman and Wu, Zhiwei Steven. 2023. "Greedy Algorithm Almost Dominates in Smoothed Contextual Bandits." SIAM Journal on Computing, 52 (2). | |
| dc.contributor.department | Sloan School of Management | en_US |
| dc.relation.journal | SIAM Journal on Computing | en_US |
| dc.eprint.version | Final published version | en_US |
| dc.type.uri | http://purl.org/eprint/type/JournalArticle | en_US |
| eprint.status | http://purl.org/eprint/status/PeerReviewed | en_US |
| dc.date.updated | 2026-04-08T14:56:18Z | |
| dspace.orderedauthors | Raghavan, M; Slivkins, A; Vaughan, JW; Wu, ZS | en_US |
| dspace.date.submission | 2026-04-08T14:56:19Z | |
| mit.journal.volume | 52 | en_US |
| mit.journal.issue | 2 | en_US |
| mit.license | PUBLISHER_POLICY | |
| mit.metadata.status | Authority Work and Publication Information Needed | en_US |