Multi-Stage LLM Reasoning for Automated Detection and Classification of High-Impact Misinformation
Author(s)
Nair, Anushka Manchanda
Advisor
Malone, Thomas
Abstract
As of 2025, social platforms have become a primary news source, magnifying the reach of misleading content [1]. Exposure to misinformation has been linked to shifts in public attitudes and behavior, including vaccine uptake [2] and voting behavior [3]. Yet current misinformation detection approaches often focus on a narrow definition of misinformation: factual claims that can be clearly judged as true or false. Recent research suggests the problem lies elsewhere: overt falsehoods (“vaccines contain microchips”) can carry little harm, while technically accurate but decontextualized narratives can be more influential. Allen et al. (2024) [4] found that factually accurate “vaccine-skeptical” content had a much greater impact on vaccine hesitancy than misinformation flagged by fact-checkers. These narratives can work by omitting information, using misleading framing, or cherry-picking evidence, forms of manipulation that can elude traditional fact-checking. Though professional fact-checkers are often able to recognize these tactics and the broader context of a piece of information, they cannot keep pace with the volume of online content. This thesis designs a Large Language Model (LLM)-based pipeline meant to partner with, rather than replace, human fact-checkers. The system decomposes content into its explicit and implicit claims, rhetorical tactics, and the “missing context” questions it raises; retrieves evidence from fact-check databases and reliable sources; and synthesizes grounded explanations while assigning calibrated harm scores to guide triage. Evaluated on fact-checked tweets, the pipeline matched expert judgments in 92.6% of cases where experts agreed, and it flagged for review the posts on which experts disagreed, a gray zone that requires human judgment. The system’s explanations ranked higher than crowdsourced Community Notes in helpfulness, clarity, and trustworthiness when assessed by an LLM, and its harm evaluations aligned with human reviewers in 87.5% of cases, enabling prioritization of the content with the greatest potential impact. Despite constraints of sample size and processing latency, the results demonstrate the feasibility of a human–AI workflow that treats disagreement as a signal and directs scarce attention towards high-impact misinformation that current automated systems can miss.
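To make the multi-stage structure concrete, the Python sketch below shows one way such a pipeline could be wired together. It is not the thesis implementation: every name in it (run_pipeline, Decomposition, Assessment, and the llm and search_evidence callables) is a hypothetical placeholder, and it assumes only a generic text-in/text-out LLM call and an evidence-lookup function. It illustrates the control flow the abstract describes: decompose, retrieve, then synthesize an explanation with a harm score.

    # Minimal pipeline sketch; all names here are hypothetical, not from the thesis.
    from dataclasses import dataclass, field
    from typing import Callable, List

    @dataclass
    class Decomposition:
        explicit_claims: List[str] = field(default_factory=list)
        implicit_claims: List[str] = field(default_factory=list)
        rhetorical_tactics: List[str] = field(default_factory=list)
        missing_context_questions: List[str] = field(default_factory=list)

    @dataclass
    class Assessment:
        explanation: str
        harm_score: float  # score in [0, 1] used to triage review queues

    def run_pipeline(
        post_text: str,
        llm: Callable[[str], str],                    # any text-in/text-out LLM call
        search_evidence: Callable[[str], List[str]],  # fact-check database / reliable-source lookup
    ) -> Assessment:
        # Stage 1: decompose the post into claims, tactics, and missing-context questions.
        # (A real system would parse the model output into separate items; one string each here.)
        decomposition = Decomposition(
            explicit_claims=[llm(f"List the explicit factual claims in: {post_text}")],
            implicit_claims=[llm(f"List the implied claims in: {post_text}")],
            rhetorical_tactics=[llm(f"Name rhetorical tactics (omission, framing, cherry-picking) in: {post_text}")],
            missing_context_questions=[llm(f"What context would a reader need to evaluate: {post_text}")],
        )

        # Stage 2: retrieve evidence for every claim and open question.
        queries = (decomposition.explicit_claims
                   + decomposition.implicit_claims
                   + decomposition.missing_context_questions)
        evidence = [doc for query in queries for doc in search_evidence(query)]

        # Stage 3: synthesize a grounded explanation and assign a harm score.
        explanation = llm(
            "Using only the evidence below, explain what is misleading or missing in the post.\n"
            "Evidence:\n" + "\n".join(evidence) + "\nPost: " + post_text
        )
        raw_score = llm(f"Rate the potential real-world harm of this post from 0 to 1: {post_text}")
        try:
            harm_score = min(max(float(raw_score.strip()), 0.0), 1.0)
        except ValueError:
            harm_score = 0.0  # fall back when the model does not return a bare number
        return Assessment(explanation=explanation, harm_score=harm_score)

In this sketch the harm score is simply clamped to [0, 1]; the calibration and triage logic described in the abstract would sit on top of a structure like this.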
Date issued
2025-09
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology