Correcting Misconceptions with Crowdsourced Narratives
Behavioral Science, Belief Change, Belief Modeling, NLP
Interactive visualization communicating causes of racial inequity in Chicago, based on an
internet discussion mined from the Reddit forum Change My View. Participants who interacted
with this data narrative showed a significant shift in their post-interaction beliefs about
racial inequality, illustrating how crowdsourced data narratives can be used to educate
randomly sampled populations about social issues.
Quick Summary
Apply data science methods to extract the causal language of internet arguments and build persuasive educational interventions that target misconceptions
Test argument efficacy in behavioral experiments with thousands of participants randomly sampled from the U.S. population
The crowdsourcing methodology combines NLP, data visualization, and Bayesian modeling to develop scalable, generalizable educational interventions
The internet has given rise to conspiracy thinking, AI-generated media, and identity politics,
which all have the potential to severely limit democratic decision-making and national consensus.
However, not all corners of the internet promote this kind of content. This project explores
how naturally occurring arguments on the internet, including those sourced from
communities like Reddit's Change My View, can be used as data narratives
to correct public misconceptions.
Using data science methods, I mine and parse online arguments, identify belief-shifting
conversations, extract the narrative and evidentiary structure of persuasive
responses, and repurpose them into educational interventions tested on randomly sampled
populations of Americans.
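The "identify belief-shifting conversations" step can be illustrated with a minimal Python sketch. It is not the project's actual pipeline: the `Comment` record, its field names, and the `find_delta_awarded` helper are hypothetical, and the sketch assumes that a belief change is signaled when the original poster replies to a comment with a delta marker (`Δ` or `!delta`). It also assumes comments have already been downloaded, so no Reddit API calls are shown.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Assumed delta markers: the Greek letter, the increment sign, or "!delta".
DELTA_MARKER = re.compile(r"Δ|∆|![Dd]elta")


@dataclass
class Comment:
    """Hypothetical comment record; not the Reddit API schema."""
    comment_id: str
    author: str
    body: str
    parent_id: Optional[str] = None  # None for top-level replies to the post


def find_delta_awarded(comments: List[Comment], op_author: str) -> List[str]:
    """Return ids of comments that the original poster answered with a delta marker."""
    by_id = {c.comment_id: c for c in comments}
    awarded: List[str] = []
    for reply in comments:
        # Simplification: only replies written by the original poster count.
        if reply.author != op_author or reply.parent_id not in by_id:
            continue
        if not DELTA_MARKER.search(reply.body):
            continue
        parent = by_id[reply.parent_id]
        # Ignore self-replies and avoid listing the same comment twice.
        if parent.author != op_author and parent.comment_id not in awarded:
            awarded.append(parent.comment_id)
    return awarded


# Toy thread with placeholder text, for illustration only.
thread = [
    Comment("c1", "alice", "Argument with supporting evidence."),
    Comment("c2", "op_user", "Δ That changed my view.", parent_id="c1"),
    Comment("c3", "bob", "A different argument."),
    Comment("c4", "op_user", "I am not convinced.", parent_id="c3"),
]
persuasive_ids = find_delta_awarded(thread, op_author="op_user")
```

In this toy thread only `c1` is flagged, because it is the only comment the original poster answered with a delta marker. A fuller pipeline would also need to handle deltas awarded by other users and quoted or sarcastic uses of the marker.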
Rather than designing corrective messages from scratch, this crowdsourcing method offers a
scalable, bottom-up way to curate corrective information from readily available,
dynamic online data.
Key Findings
On forums where deliberative norms are embedded in communication channels, users are incentivized to provide evidence for their claims. This evidence can revise people's beliefs about topics spanning gun control, immigration, and racial equality.
Causal evidence extracted from these conversations can shift beliefs among randomly sampled populations.
This evidence can be transformed into data narratives that facilitate hands-on learning about socially polarizing topics.
Why It Matters
Traditional approaches to correcting public misconceptions use top-down intervention
tactics. These methods develop slowly, depend heavily on intervention designers'
expertise, and often struggle to reach diverse populations with varying beliefs.
A better approach sources naturalistically validated information to create educational content.
This approach enables scalable, AI-assisted designs that can both accelerate misconception
correction and uncover novel pathways not yet explored in research.
Business use case: Educational organizations and health communication teams can leverage this framework to create messaging based on real, user-generated reasoning observed on the internet.
Research use case: Cognitive scientists and communication scholars can model beliefs and predict change mechanisms in natural settings, using platforms like Reddit as real-world laboratories for studying epistemic reasoning and evidence framing.
How It Works
Source Argument Data: Collected 30,000+ argument threads from Reddit's Change My View using the Reddit API and custom NLP pipelines.
Identify Belief Change Messages: Developed NLP methods to extract features of corrective content and identify messages marked with a Δ (delta), indicating belief change.
Recompose as Educational Interventions: Converted these arguments into persuasive vignettes and interactive data narratives.
Deploy and Evaluate Interventions: Tested them with randomized groups (N = 3,000) using pre/post belief assessments.
Model Belief Updating: Used multilevel ordinal models to analyze post-intervention belief shifts.
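To make the final modeling step more concrete, the sketch below shows how an ordinal model turns a shift in a latent belief score into probabilities over rating categories. The source states only that multilevel ordinal models were used. The cumulative-logit link, the per-participant random intercept `u_j`, the 5-point scale, and every numeric value here are assumptions chosen for illustration, not estimates from the study. The code evaluates the model's probabilities and does not fit anything.

```python
import math
from typing import List


def ordinal_probs(eta: float, thresholds: List[float]) -> List[float]:
    """Category probabilities under a cumulative-logit model.

    Uses P(Y <= k) = logistic(tau_k - eta), with increasing thresholds tau_k.
    """
    cdf = [1.0 / (1.0 + math.exp(-(tau - eta))) for tau in thresholds] + [1.0]
    probs = []
    previous = 0.0
    for value in cdf:
        probs.append(value - previous)  # P(Y = k) = P(Y <= k) - P(Y <= k - 1)
        previous = value
    return probs


def expected_rating(probs: List[float]) -> float:
    """Mean rating when categories are scored 1..K."""
    return sum((k + 1) * p for k, p in enumerate(probs))


# Hypothetical values, for illustration only.
thresholds = [-1.5, -0.5, 0.5, 1.5]  # four cut points give a 5-point scale
u_j = -0.3                           # random intercept for participant j
beta_post = 0.8                      # assumed effect of the intervention

pre = ordinal_probs(u_j, thresholds)               # before the intervention
post = ordinal_probs(u_j + beta_post, thresholds)  # after the intervention
shift = expected_rating(post) - expected_rating(pre)
```

With a positive `beta_post`, probability mass moves toward the higher rating categories, so `shift` is positive. In a multilevel fit, the thresholds, `beta_post`, and the variance of `u_j` would be estimated jointly from the pre/post responses.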