labouz/stigma_character

Phenotyping Stigma Experiences from Reddit: A Computational Analysis of PWUD Lived Experiences

This repository contains the code and resources for characterizing the lived experiences of stigma faced by people who use drugs (PWUD) as disclosed on Reddit.

🎯 Project Overview

Stigma represents a major barrier to treatment seeking among PWUD, contributing to a vast treatment gap where approximately 95% of individuals meeting DSM-5 criteria for substance use disorders receive no formal treatment. This project analyzes over 1 million Reddit posts to systematically identify, classify, and characterize stigma expressions across multiple substance-related communities.

Key Research Objectives

  • Develop and validate a comprehensive framework for identifying and classifying stigma expressions in social media discourse about substance use
  • Identify distinct patterns of stigma expression using unsupervised computational methods
  • Examine how these empirically derived patterns relate to established theoretical frameworks of stigma

📊 Dataset

Data Collection

  • Source: Reddit posts from 6 substance-related subreddits
  • Communities: r/Drugs, r/opiates, r/kratom, r/LSD, r/Stims, r/benzodiazepines
  • Total Posts: 1,033,619 (retained from 2.25M collected)
  • Time Range: 2008-2022
  • Data Source: Pushshift Reddit archive
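
As a rough illustration of the collection step, the sketch below picks submissions from the six communities out of newline-delimited JSON records. It assumes each record carries a `subreddit` field, as Pushshift submission dumps do. The function name `select_submissions` and the sample lines are hypothetical and are not taken from this repository.

```python
import json

# The six communities listed above, lowercased for case-insensitive matching.
TARGET_SUBREDDITS = {"drugs", "opiates", "kratom", "lsd", "stims", "benzodiazepines"}


def select_submissions(lines):
    """Yield parsed records whose subreddit is one of the six targets.

    `lines` is any iterable of JSON strings, one submission per line.
    Blank or malformed lines are skipped rather than raising.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue
        if not isinstance(record, dict):
            continue
        if str(record.get("subreddit", "")).lower() in TARGET_SUBREDDITS:
            yield record


# Invented sample input, for illustration only.
sample_lines = [
    '{"subreddit": "opiates", "title": "Day 3", "selftext": "..."}',
    '{"subreddit": "aww", "title": "My cat", "selftext": ""}',
    "not json",
]
selected = list(select_submissions(sample_lines))
```

Because the function accepts any iterable of strings, it can be fed from an open file handle or a decompression stream without loading a full archive into memory.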

Data Characteristics

  • Posts filtered for quality (minimum of 10 words across title and body combined)
  • Deleted or removed posts and throwaway accounts excluded
  • Focus on substantive user disclosures of stigma experiences
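
The filters above could be sketched roughly as follows. The field names `title`, `selftext`, and `author` follow Pushshift submission records. The `[deleted]`/`[removed]` markers and the username-based throwaway check are assumptions made for illustration, and `keep_post` is a hypothetical helper, not the project's actual filtering code.

```python
# Placeholder bodies Reddit shows for deleted or moderator-removed posts.
DELETED_MARKERS = {"[deleted]", "[removed]"}
MIN_WORDS = 10  # threshold stated in the list above


def keep_post(post):
    """Return True if a post passes the three quality filters."""
    title = (post.get("title") or "").strip()
    body = (post.get("selftext") or "").strip()
    author = (post.get("author") or "").strip()

    # Drop posts whose body was deleted or removed.
    if body in DELETED_MARKERS:
        return False
    # Assumption: throwaway accounts are recognized by their username.
    if "throwaway" in author.lower():
        return False
    # Word count over title and body combined, split on whitespace.
    return len(f"{title} {body}".split()) >= MIN_WORDS


# Invented example posts.
posts = [
    {"title": "Afraid to tell my doctor",
     "selftext": "I keep hiding my use because I know how they will look at me",
     "author": "user_a"},
    {"title": "Question", "selftext": "[removed]", "author": "user_b"},
    {"title": "Need advice about telling my family the truth",
     "selftext": "They have no idea and I am scared",
     "author": "throwaway12345"},
    {"title": "Hi", "selftext": "short post", "author": "user_c"},
]
kept = [p for p in posts if keep_post(p)]
```

Only the first post survives here: the second was removed, the third comes from a throwaway-style account, and the fourth has three words.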

🔬 Methodology

1. Stigma Type Classification Framework

We employ a mixed-methods approach combining:

  • Large Language Models (LLMs) with qualitative analysis
  • Human annotation for ground truth (500 annotated posts)
  • GPT-4o labeling of the full dataset, with human evaluation
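
One common way to compare model labels against human annotations is chance-corrected agreement. The sketch below computes Cohen's kappa in plain Python. This README does not say which agreement metric the project reports, so the choice of kappa, the function name, and the toy labels are all assumptions.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and equal in length")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement under independence, from each rater's label frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Invented toy labels, not drawn from the annotated set.
human = ["internalized", "anticipated", "enacted",
         "internalized", "structural", "anticipated"]
model = ["internalized", "anticipated", "internalized",
         "internalized", "structural", "enacted"]
kappa = cohens_kappa(human, model)
```

For these six toy items the raters agree on four, and kappa works out to 7/13, or about 0.54.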

2. Unsupervised Clustering Analysis

  • Identification of distinct stigma expression patterns
  • Three primary phenotypes discovered:
    • Internalized Stigma
    • Public Stigma
    • Righteous Indignation
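
The README does not state which clustering algorithm or features were used. Purely to illustrate what grouping posts into three clusters can look like, the sketch below swaps in plain k-means (Lloyd's algorithm) over hypothetical two-dimensional feature vectors. The feature values, the starting centroids, and the function name are assumptions.

```python
import math


def kmeans(points, centroids, iterations=20):
    """Plain Lloyd's algorithm; returns (centroids, assignments)."""
    centroids = [list(c) for c in centroids]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: index of the nearest centroid by Euclidean distance.
        assignments = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == k]
            if members:  # an empty cluster keeps its previous centroid
                centroids[k] = [sum(dim) / len(members) for dim in zip(*members)]
    return centroids, assignments


# Hypothetical per-post features, e.g. (self-directed score, other-directed score).
points = [(0.9, 0.1), (0.8, 0.2), (0.1, 0.9), (0.2, 0.8), (0.5, 0.5), (0.55, 0.45)]
# Fixed starting centroids keep the toy run deterministic.
centroids, assignments = kmeans(points, [points[0], points[2], points[4]])
```

With these starting centroids the six toy points settle into three pairs. Real text features would be high-dimensional, and the number of clusters would need to be chosen and validated rather than fixed in advance.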

3. Narrative and Linguistic Analysis

  • Examination of narrative strategies across stigma types
  • Analysis of subject-verb pairs, pronoun use, and discourse markers
  • Assessment of emotional framing and identity concealment patterns
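
A minimal sketch of one such linguistic feature, pronoun use, is shown below. The pronoun lists, the tokenizer, and the function name `pronoun_profile` are assumptions for illustration, and the example sentence is invented rather than quoted from the data.

```python
import re
from collections import Counter

FIRST_PERSON = {"i", "me", "my", "mine", "myself"}
THIRD_PERSON_PLURAL = {"they", "them", "their", "theirs", "themselves"}


def pronoun_profile(text):
    """Count first-person and third-person-plural pronouns in a text."""
    # Lowercase word tokens; contractions such as "i'm" stay as one token.
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    counts = Counter()
    for token in tokens:
        stem = token.split("'")[0]  # "i'm" -> "i", "they're" -> "they"
        if stem in FIRST_PERSON:
            counts["first_person"] += 1
        elif stem in THIRD_PERSON_PLURAL:
            counts["third_person"] += 1
    return {
        "first_person": counts["first_person"],
        "third_person": counts["third_person"],
        "tokens": len(tokens),
    }


# Invented example sentence.
profile = pronoun_profile("I hate myself for this. They look at me like I'm nothing.")
```

Dividing the counts by `tokens` gives per-post rates that can be compared across stigma types regardless of post length.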

📈 Key Findings

Stigma Type Distribution

  • Internalized Stigma: 36.0% (n=20,348)
  • Stigma Perceptions & Commentary: 25.3% (n=14,288)
  • Anticipated Stigma: 21.3% (n=12,056)
  • Enacted/Experienced Stigma: 13.7% (n=7,715)
  • Structural Stigma: 3.5% (n=1,985)
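
The shares above can be recomputed from the listed counts as a quick sanity check. The sketch assumes the five categories are mutually exclusive and that the denominator is the sum of the listed counts; the README does not state the denominator. Under that assumption, recomputed shares can differ from the reported percentages by about 0.1 percentage point.

```python
# Counts copied from the list above.
counts = {
    "Internalized Stigma": 20348,
    "Stigma Perceptions & Commentary": 14288,
    "Anticipated Stigma": 12056,
    "Enacted/Experienced Stigma": 7715,
    "Structural Stigma": 1985,
}

# Assumption: the denominator is the sum of the five listed counts.
total = sum(counts.values())

# Share of each stigma type as a percentage, rounded to one decimal place.
shares = {label: round(100 * n / total, 1) for label, n in counts.items()}

for label, share in shares.items():
    print(f"{label}: {share}% (n={counts[label]:,})")
```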

Temporal Patterns

  • Notable spike in stigma expressions around mid-2016
  • Consistent increase across all stigma types through 2020
  • The r/Drugs subreddit shows the highest concentration of posts
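
Temporal patterns like these are typically read off monthly post counts. The sketch below buckets posts by calendar month, assuming each record has a `created_utc` Unix timestamp as in Pushshift data. The function name and the sample timestamps are hypothetical.

```python
from collections import Counter
from datetime import datetime, timezone


def monthly_counts(posts):
    """Return {"YYYY-MM": number_of_posts}, ordered chronologically."""
    counts = Counter()
    for post in posts:
        # created_utc may arrive as an int or a numeric string.
        ts = datetime.fromtimestamp(int(post["created_utc"]), tz=timezone.utc)
        counts[ts.strftime("%Y-%m")] += 1
    return dict(sorted(counts.items()))


# Invented timestamps: two posts in June 2016 and one in July 2016 (UTC).
sample_posts = [
    {"created_utc": 1464739200},
    {"created_utc": "1464825600"},
    {"created_utc": 1467331200},
]
by_month = monthly_counts(sample_posts)
```

Running the same function separately per stigma type or per subreddit would give the series needed to compare trends across groups.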

Three Stigma Phenotypes Identified

  1. Internalized Stigma: Self-blame and shame narratives
  2. Public Stigma: External discrimination experiences
  3. Righteous Indignation: Critical analysis and resistance to stigma

📚 Citation

If you use this code or data in your research, please cite:

@article{bouzoubaa2025phenotypes,
  title={Phenotypes of stigma expressed by people who use drugs on Reddit},
  author={Bouzoubaa, Layla and Aghakhani, Elham and Rezapour, Rezvaneh Shadi},
  journal={Social Science \& Medicine},
  pages={118889},
  year={2025},
  publisher={Elsevier}
}

📋 Ethics and Privacy

This research follows strict ethical guidelines:

  • IRB exemption obtained for analysis of publicly available, anonymized data
  • No usernames or identifiable metadata reported
  • All quoted posts paraphrased to prevent re-identification
  • Focus on aggregate patterns rather than individual cases

📞 Contact

Data is available upon reasonable request:

  • Corresponding Author: [Layla Bouzoubaa] - [email]
  • Principal Investigator: [Shadi Rezapour] - [email]

🙏 Acknowledgments

  • Reddit community members who shared their experiences
  • Research team members and annotators
  • Pushshift for providing Reddit data access

Note: This research aims to center the voices and experiences of PWUD, particularly those outside formal treatment systems, responding to calls for more comprehensive, context-specific approaches to addressing substance use stigma.

About

Exploring 1M+ disclosures on Reddit to characterize expressions of felt stigma. Code for the paper published in Social Science & Medicine (2025).
