
dc.contributor.author: Yu, Miao
dc.contributor.author: Meng, Fanci
dc.contributor.author: Zhou, Xinyun
dc.contributor.author: Wang, Shilong
dc.contributor.author: Mao, Junyuan
dc.contributor.author: Pan, Linsey
dc.contributor.author: Chen, Tianlong
dc.contributor.author: Wang, Kun
dc.contributor.author: Li, Xinfeng
dc.contributor.author: Zhang, Yongfeng
dc.contributor.author: An, Bo
dc.contributor.author: Wen, Qingsong
dc.date.accessioned: 2025-09-02T19:28:29Z
dc.date.available: 2025-09-02T19:28:29Z
dc.date.issued: 2025-08-03
dc.identifier.isbn: 979-8-4007-1454-2
dc.identifier.uri: https://hdl.handle.net/1721.1/162598
dc.description: KDD ’25, Toronto, ON, Canada [en_US]
dc.description.abstract: With the rapid evolution of Large Language Models (LLMs), LLM-based agents and Multi-agent Systems (MAS) have significantly expanded the capabilities of LLM ecosystems. This evolution stems from empowering LLMs with additional modules such as memory, tools, environment, and even other agents. However, this advancement has also introduced more complex issues of trustworthiness, which previous research focusing solely on LLMs could not cover. In this survey, we propose the TrustAgent framework, a comprehensive study on the trustworthiness of agents, characterized by modular taxonomy, multi-dimensional connotations, and technical implementation. By thoroughly investigating and summarizing newly emerged attacks, defenses, and evaluation methods for agents and MAS, we extend the concept of Trustworthy LLM to the emerging paradigm of Trustworthy Agent. In TrustAgent, we begin by deconstructing and introducing various components of the Agent and MAS. Then, we categorize their trustworthiness into intrinsic (brain, memory, and tool) and extrinsic (user, agent, and environment) aspects. Subsequently, we delineate the multifaceted meanings of trustworthiness and elaborate on the implementation techniques of existing research related to these internal and external modules. Finally, we present our insights and outlook on this domain, aiming to provide guidance for future endeavors. For easy reference, we categorize all the studies mentioned in this survey according to our taxonomy, available at: https://github.com/Ymm-cll/TrustAgent. [en_US]
dc.publisher: ACM | Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining V.2 [en_US]
dc.relation.isversionof: https://doi.org/10.1145/3711896.3736561 [en_US]
dc.rights: Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use. [en_US]
dc.source: Association for Computing Machinery [en_US]
dc.title: A Survey on Trustworthy LLM Agents: Threats and Countermeasures [en_US]
dc.type: Article [en_US]
dc.identifier.citation: Miao Yu, Fanci Meng, Xinyun Zhou, Shilong Wang, Junyuan Mao, Linsey Pan, Tianlong Chen, Kun Wang, Xinfeng Li, Yongfeng Zhang, Bo An, and Qingsong Wen. 2025. A Survey on Trustworthy LLM Agents: Threats and Countermeasures. In Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining V.2 (KDD '25). Association for Computing Machinery, New York, NY, USA, 6216–6226. [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory [en_US]
dc.identifier.mitlicense: PUBLISHER_POLICY
dc.eprint.version: Final published version [en_US]
dc.type.uri: http://purl.org/eprint/type/ConferencePaper [en_US]
eprint.status: http://purl.org/eprint/status/NonPeerReviewed [en_US]
dc.date.updated: 2025-09-01T07:49:54Z
dc.language.rfc3066: en
dc.rights.holder: The author(s)
dspace.date.submission: 2025-09-01T07:49:55Z
mit.license: PUBLISHER_POLICY
mit.metadata.status: Authority Work and Publication Information Needed [en_US]

