Personalizing Content Moderation Through Pluralism

June 17, 2026

Farnaz Jahanbakhsh, Assistant Professor of Electrical Engineering and Computer Science, College of Engineering

Farnaz Jahanbakhsh’s research begins from a conviction that has shaped everything she has built: people are fundamentally different from one another, and systems that ignore that fact — averaging across users, flattening difference into a single policy — are not just less effective, they are doing something morally questionable. The concept she works from is pluralism: the idea that difference across people and communities is not noise to be cleaned away but a signal that deserves to be taken seriously in its own right. That commitment runs through her work on educational AI, recommender systems, and generative AI. But it shows up most vividly in a project on content moderation, led by her PhD student Rayan Rashed, that won a best paper honorable mention at CHI.

Content moderation on social media works through universal suppression. Platforms identify content that is deemed harmful according to a platform-wide policy and remove it or downrank it for everyone. Jahanbakhsh’s research began by questioning both halves of that approach: the universality of the harm definition, and the bluntness of suppression as a response.

The universality problem is real and underappreciated. A post joyfully announcing a pregnancy is entirely innocuous to most people and potentially devastating to someone who has recently experienced a miscarriage. A food photograph is background noise for most users and a genuine trigger — obsessive thoughts, intense cravings — for someone in recovery from a binge eating disorder. Both of these examples came not from hypotheticals but from research participants. No centralized platform policy can anticipate the specificity of those individual experiences. “There is no universal boundary that captures everybody’s experience of harm,” Jahanbakhsh said. “What is innocuous to one person can genuinely be damaging to another.”

The suppression problem is equally real. Even when harmful and valuable elements coexist in the same piece of content — a news article that a PTSD survivor wants to engage with, except for the descriptions of civilian deaths — the platform response is binary: show it or hide it. The participant recovering from an eating disorder might want to see life updates from friends; she does not want the food photograph dominating every post. Suppression cannot make that distinction.

Before building anything, Jahanbakhsh’s team went and talked to people. They recruited participants who had lived experience of exactly this problem: people with phobias or PTSD, people navigating difficult life events, and people whose values simply diverged from the mainstream. The conversations were designed to surface not just what harm looked like for each person but also what they had already tried on existing platforms, and where those attempts had failed. What came back was striking in its specificity. A person with arachnophobia did not want to avoid pictures of spiders in general, only close-up images. A participant with PTSD wanted to follow the news but could not read anything about civilian deaths without flashbacks. Sensitivities also varied over time: a person might not be in a state to encounter certain content today and feel entirely differently next week. And participants were consistent on one point: they wanted to define their own boundaries without silencing the people posting the content. Any intervention would have to be on the recipient’s side.

From those conversations emerged the design framework. Rather than suppression, Jahanbakhsh’s team built toward personalized content transformation: intercepting content before a user sees it, analyzing it against that user’s stated sensitivity, and rendering a modified version that attenuates the triggering element while preserving as much of the original meaning and context as possible. The mechanism is a browser extension, deployed on Reddit feeds, that allows users to define their sensitivities in their own words. A user who says “I have an eating disorder that I’m managing — food pictures trigger cravings and obsessive thoughts” sets a sensitivity that the system applies in real time, invisibly, to every post in their feed. The transformed content is marked clearly as modified.

What the transformation actually looks like depends on what is triggering and for whom. In some cases the distressing element is removed from an image entirely. In others the image is rendered in an artistic style — e.g., impressionism, which can convey a sense of calm — that reduces its felt emotional impact without removing its informational content. Text transformations follow a parallel logic. The system decides what transformation to apply through an AI pipeline that weighs four goals simultaneously: preserve meaning, push the trigger as far as possible from its triggering form, keep the result natural and non-jarring, and remain ethically and culturally appropriate. Those goals trade off against one another, and the pipeline navigates that tradeoff for each specific user and each specific piece of content.
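One way to picture that tradeoff is as a weighted score over candidate transformations. The Python sketch below is purely illustrative and is not the team's pipeline: the article does not describe how the goals are measured or combined, so the weighted sum, the weights, the candidate names, and every score here are assumptions made up for the example.

```python
from dataclasses import dataclass

# Hypothetical weights for the four goals named in the article.
# How the real pipeline balances them is not described; these are assumptions.
DEFAULT_WEIGHTS = {
    "meaning": 0.35,          # preserve the original meaning
    "distance": 0.35,         # move the trigger away from its triggering form
    "naturalness": 0.15,      # keep the result natural and non-jarring
    "appropriateness": 0.15,  # stay ethically and culturally appropriate
}


@dataclass(frozen=True)
class Candidate:
    """A possible transformation with an assumed score in [0, 1] per goal."""
    name: str
    scores: dict


def overall(candidate, weights=DEFAULT_WEIGHTS):
    """Weighted sum of the per-goal scores (an assumed way to combine them)."""
    return sum(weights[goal] * candidate.scores[goal] for goal in weights)


def choose(candidates, weights=DEFAULT_WEIGHTS):
    """Return the candidate with the highest weighted score."""
    return max(candidates, key=lambda c: overall(c, weights))


# Made-up scores for one post and one user's stated sensitivity.
candidates = [
    Candidate("hide_post", {"meaning": 0.0, "distance": 1.0,
                            "naturalness": 0.5, "appropriateness": 1.0}),
    Candidate("remove_element", {"meaning": 0.7, "distance": 0.9,
                                 "naturalness": 0.6, "appropriateness": 1.0}),
    Candidate("impressionist_style", {"meaning": 0.8, "distance": 0.6,
                                      "naturalness": 0.9, "appropriateness": 1.0}),
]

# A user who cares more about keeping the original meaning intact.
meaning_first = {"meaning": 0.5, "distance": 0.2,
                 "naturalness": 0.15, "appropriateness": 0.15}

print(choose(candidates).name)                 # remove_element
print(choose(candidates, meaning_first).name)  # impressionist_style
```

In this toy setup, hiding the post outright scores poorly because it preserves none of the meaning, and shifting the weights toward meaning changes which transformation wins. That mirrors the article's point that the balance is struck for each specific user and each specific piece of content.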

The user study results were strong, and one participant response in particular stayed with Jahanbakhsh: “I think what you have done is life-impacting for people. Even if it’s not perfect, I’m still thankful for it and I think other people are too.” The study also produced an unexpected finding about avoidance. Before having the tool, participants had been severing connections (unfollowing people, leaving communities) to escape content they couldn’t handle. With the tool, they were re-engaging: following those connections again, reading that content in a form they could digest. Personalized transformation, counterintuitively, was enabling engagement rather than encouraging withdrawal.

The research raises a question that Jahanbakhsh is now actively pursuing: if this kind of transformation works for individual harm reduction, could it work at the level of civic discourse? Could AI help participants in polarized conversations engage with each other’s actual ideas and intentions, rather than the most inflammatory surface reading of what someone said? The content moderation project was, in that sense, a proof of concept for something larger — that pluralism, built into the architecture of an AI system from the start, can produce interventions that are both more effective and more respectful of the people they serve.