AI raters face hidden struggles in Google’s content moderation workforce

Story Highlight

– Rachael Sawyer’s role turned into AI content moderation.
– Workers face stress and anxiety from high demands.
– Google employs thousands through contractors like GlobalLogic.
– AI raters feel essential yet undervalued and invisible.
– Safety measures in AI content have reportedly loosened.

Full Story

In spring 2024, Rachael Sawyer, a technical writer based in Texas, received a LinkedIn message from a recruiter for a position labeled as a writing analyst. She anticipated that the role would mirror her previous content creation experiences. However, upon commencing work a week later, she discovered the position diverged significantly from her expectations. Instead of engaging in writing, Sawyer’s responsibilities involved assessing and managing content produced by artificial intelligence.

Sawyer’s role initially encompassed reviewing meeting notes and chat summaries generated by Google’s AI system, Gemini, as well as evaluating short films created by the AI. Gradually, her duties shifted towards managing distressing material, including violent and sexually explicit content, predominantly in textual form, which she had to flag for removal. Her comments reflect her frustration: “I was shocked that my job involved working with such distressing content. Not only was there no warning given during onboarding, but the job title and description never hinted at any content moderation requirement.”

The work has caused her anxiety and panic attacks, exacerbated by the absence of mental health resources from her employer. Sawyer is one of thousands employed as AI raters for Google through GlobalLogic, a subsidiary of the Japanese conglomerate Hitachi. This workforce assesses and moderates content for Google’s AI services, including its flagship chatbot, Gemini, which launched in early 2024. Other companies, such as Accenture and Appen, also provide Google with similar AI rating services.

Google has reasserted itself in the AI sector over the past year, launching several products designed to compete with offerings like OpenAI’s ChatGPT. Its latest model, Gemini 2.5 Pro, outperforms OpenAI’s o3 according to LMArena, which evaluates AI model performance. The accuracy of these AI models is critical, prompting thousands of raters to work diligently to ensure that the responses generated are suitable for users.

Outside the spotlight that often shines on data labelers, there exists a significant, yet largely invisible, workforce dedicated to moderating AI outputs. This includes individuals like Sawyer, who strive to guarantee that countless AI users receive only appropriate and safe responses.

AI models are trained on extensive datasets sourced from the internet. Raters form a crucial layer within the broader AI development framework and, while earning better wages than data annotators in regions like Nairobi or Bogotá, they receive significantly less compensation than the engineers behind the scenes in Mountain View. Despite their fundamental role in maintaining model quality, these workers often feel overlooked.

Adio Dinika, a researcher at the Distributed AI Research Institute in Bremen, Germany, articulated this sentiment, pointing out, “AI isn’t magic; it’s a pyramid scheme of human labor. These raters are the middle rung: invisible, essential and expendable.” Commenting on the situation, Google stated, “Quality raters are employed by our suppliers and are temporarily assigned to provide external feedback on our products. Their ratings are one of many aggregated data points that help us measure how well our systems are working, but do not directly impact our algorithms or models.” GlobalLogic did not respond to inquiries for this article.

GlobalLogic is among the contractors that hire this AI workforce, and it categorizes its raters as generalist raters or super raters. In 2023, GlobalLogic employed just 25 super raters. As competition in AI intensified, that number surged to nearly 2,000 raters, predominantly based in the United States and focused on English content moderation.

Pay for GlobalLogic raters starts at $16 per hour for generalist roles and $21 per hour for super raters, though their significant qualifications are often undervalued. Many see the work as a reprieve in a shaky job market, yet disillusionment is growing, particularly over tightening deadlines and pressure around content quality. One rater said, “They are people with expertise who are doing a lot of great writing work, who are being paid below what they’re worth to make an AI model that, in my opinion, the world doesn’t need.”

A former employee described initial enthusiasm about working with the evolving AI models, but soon felt the strain of a faster pace. The time allotted per task fell from 30 minutes to 15, requiring rapid assessment of long, detailed responses. Concerns about the product’s safety have also been raised: in a letter submitted to Congress, a contract worker suggested that the pace was making Google Bard, the predecessor to Gemini, flawed and potentially dangerous.

Another employee who joined in 2024 revealed their experience of working under severe constraints, often receiving minimal guidance. “We had no idea where it was going, how it was being used or to what end,” they shared. The inconsistency of guidelines further complicated their ability to uphold quality standards amidst pressures to expedite ratings.

In May 2024, Google unveiled AI Overviews, AI-generated summaries that aggregate web results. The feature drew public scrutiny after a user received absurd suggestions, including advice to put glue on pizza dough and to eat rocks. While the public was shocked, the GlobalLogic workforce was not surprised by these outputs, which echoed what they had repeatedly seen in their work with the AI systems.

Rebecca Jackson-Artis, a writer from North Carolina who began working with GlobalLogic in late 2024, faced similar challenges. Initially told to prioritize quality over speed, she soon found herself scrutinized for taking too long to complete ratings. Reflecting on her work, she described a personal moral struggle when tasked with entering sensitive healthcare data without the necessary expertise.

The regulatory landscape for AI content continues to evolve, with recent guidelines appearing to relax restrictions on hate speech and other harmful content, particularly when that content is user-generated. Sawyer noted that previous bans on certain types of content have softened, suggesting that such content is now acceptable under specific conditions.

Despite the growing AI sector, raters’ job stability remains precarious: GlobalLogic has carried out ongoing layoffs in 2025, reducing its rater workforce to around 1,500. Workers have expressed dwindling trust in the products they help develop, with many choosing to restrict or completely avoid using AI themselves, aware of the operational realities behind its development.

Sawyer concluded, “I just want people to know that AI is being sold as this tech magic – but it’s not. It’s built on the backs of overworked, underpaid human beings.”
