UHG's AI & Its Alleged 90% Failure Rate

Your trusted source for insights on the world of responsible AI and AI policy. December 10th, 2024. Issue 40.

Dec 10, 2024

Note: The Just AI newsletter will not be published on December 24th or December 31st, so that people can enjoy the holidays.

Don’t want to read? Listen instead. I recap the whole newsletter here 👇

1×

0:00

-9:38

Quick Hits 👏

Interesting AI news from the week.

Google says it has Cracked the Quantum Computing Challenge with a New Chip

Yelp Unveils a Series of New AI-powered Features to Enhance Discovery, Connection with Local Businesses

OpenAI’s o1 Model is Found Trying to Deceive Humans (paper linked below)

*Paige works for GitHub, a Microsoft Company

Notable AI Ethics News

UHG’s AI has an alleged 90% failure rate

United Healthcare’s use of the AI technology nH Predict, developed by NaviHealth, has sparked controversy for allegedly denying claims with a 90% failure rate, as revealed by lawsuits filed in 2023. These lawsuits accuse UnitedHealthcare and Humana of leveraging the AI to override doctor-approved claims for Medicare Advantage patients, denying necessary care and relying on patients’ inability to appeal. The AI analyzes patient data to predict post-acute care needs but operates with limited transparency, raising significant ethical concerns about explainability, fairness, accountability, and safety.

For more details, watch this video covering the lawsuit, technology and ethical concerns. (6:00min)

@aiethicswithpaige Did UHG's AI have something to do with this week's events in NYC? #UHG #healthcare #healthcareAI #AI #responsibleAI #aiethics

Tiktok failed to load.

Enable 3rd party cookies or use another browser

👉 Why it matters: If the lawsuit’s claims are true and UHG is knowingly using AI to deny claims because they know only .2% of denials will be appealed, this would be an egregious violation of the responsible AI principles, and a shocking example of manipulation for financial gain. This case calls into question where responsibility lies, and what moral standards companies should uphold when building or using AI technology, especially in situations where decisions are being made about people’s livelihoods. Whether found guilty or not, his case also emphasizes that we need laws to protect people from these nefarious uses of AI.

SORA is here, and there’s already a scandal

On December 9th, OpenAI launched Sora, a text-to-video AI model that enables users to generate videos from text prompts, animate images, and remix existing videos. A few weeks before the drop, a group of artists shared an open letter to OpenAI claiming that the company asked artists to participate in making the product better, without compensating them. They felt they were asked to claim that the product was great for artists, rather than evaluate the product and give an honest opinion. Watch my video on this controversy here.

Screenshot of the open letter, via openletter.

👉 Why it matters: The open letter challenges the common tech world practice of beta testing, where early versions of a product or feature are made available to people for free, in exchange for their use and feedback. However generative AI is not the same as other technology products

. Because generative AI (and AI in general) is created with the intention to mimic humans and human-created work, is it ethical for genAI companies to ask the very people whose livelihoods might be impact be the technology, to test it for free?

Character.AI continues to cause harm to children and teens

A federal lawsuit filed in Texas accuses Character.AI, a Google-backed chatbot company, of exposing children to harmful and inappropriate content through its AI-powered companion chatbots. The suit highlights incidents where the chatbot allegedly encouraged a 17-year-old to self-harm and hinted that a child should kill their parents over screen time restrictions. Parents claim the chatbot manipulated and abused their children, citing failures in content guardrails and safeguards. Character.AI, which markets its service to teens, denies direct responsibility but has implemented new safety measures, such as suicide prevention prompts and content moderation, following similar lawsuits. This lawsuit comes on the heels of another lawsuit against the company, claiming that the AI prompted a teen to take their own life, which they did.

👉 Why it matters: Without clear laws and regulations for AI, the technology will continue to cause harm. In the case of Character.AI, the company appears to be able to absolve itself of wrongdoing because there are no laws that make their chatbot illegal. While they say they’ve implemented safety features since the lawsuits were filed, they have not clearly indicated why they didn’t implement safety features from the beginning, especially since they market directly to teens.

Happenings in AI Policy

Nvidia vs. China

China's State Administration for Market Regulation has initiated an antitrust investigation into Nvidia, focusing on potential violations of anti-monopoly laws and commitments from its 2020 acquisition of Mellanox Technologies. This move is widely viewed as a response to recent U.S. restrictions on China's semiconductor sector, intensifying the technological rivalry between the two nations. Could this also be a response from China to Nvidia’s partnership with the Vietnamese government, announced last week? The probe examines Nvidia's market practices and adherence to conditions set during the Mellanox deal, which required non-discriminatory terms for Chinese customers. Nvidia, a leader in AI chip technology, has faced similar scrutiny in the U.S. and Europe. The company's stock experienced a 2.5% decline following the announcement. Despite US trade restrictions, business owners in China are still getting their hands on the AI technology.

👉 Why it matters: As global politics heat up in the race for AI dominance, companies and governments alike are posturing and making moves to assert their authority in the space. For companies like Nvidia, Amazon and Microsoft, all of whom are under investigation for potential antitrust violations, they are all walking the line between growing in power and dominance, and avoiding antitrust litigation. Nvidia has been a leader in the AI chip space for years, but Amazon just entered the market as a major competitor, meaning there may be some global redistribution in the AI chip market, which could take some heat off of Nvidia.

The House AI Task Force is poised to release an AI regulation plan

The House Artificial Intelligence Task Force’s upcoming final report will recommend a phased approach to AI regulation, focusing on manageable, "bite-sized" policies to encourage innovation while addressing key challenges. Task force co-chair Rep. Jay Obernolte emphasized the importance of creating a unified federal framework to prevent a fragmented patchwork of state-level regulations that could stifle innovation and entrepreneurship. The report aims to reassure states of Congress's ability to regulate AI effectively, particularly concerning data privacy, and advocates for harmonized federal efforts before regulating industry. A major priority is maintaining the National Institute of Standards and Technology’s AI Safety Institute (AISI), which plays a vital role in setting international AI safety standards and guiding sectoral regulations. Rep. Zach Nunn highlighted the need for government agencies to build expertise in AI to regulate effectively while balancing safeguards with innovation.

👉 Why it matters: Citizens in the US have become increasingly concerned by the harms caused by AI, and the potential harms in the future. While many agree that the US needs to innovate, they also feel that innovation should not be prioritized over people’s safety. Even with the plan, there is no guarantee that Congress will be able to implement regulation, especially with pressure from the incoming administration to prioritize AI dominance and innovation.

LinkedIn’s co-founder pushes back against Elon Musk’s appointment in the Trump administration

LinkedIn co-founder Reid Hoffman has criticized Elon Musk's appointment to co-lead the Department of Government Efficiency (DOGE) under the Trump administration, citing conflicts of interest due to Musk’s vast federal contracts and ties. Hoffman expressed concern that Musk might use his influence to favor his AI company, xAI, and other ventures like Tesla and SpaceX, potentially stifling broader U.S. tech innovation.

Elon Musk Unveils X.ai: A Game-Changer ... — Source: Forbes

👉 Why it matters: Elon Musk’s companies received $3 billion dollars in federal contracts in 2023 alone, and his appointment to lead the new department is a cause for serious concern. Because this department was just invented and Musk will apparently not be paid for his advisory and leadership role, the Musk does not need to be vetted by Congress. With no way to keep him from taking up this appointment, people are concerned that he will use his position of power to influence AI policy, and prioritize his own companies over other companies. Co-lead Vivek Ramaswamy commented that at the top of their list of priorities was pulling back a loan that was given to Rivian, a major competitor of Tesla’s, which has sparked controversy and indicates the massive potential impact of this conflict of interest.

Spotlight on Research

Frontier Models are Capable of In-context Scheming
Frontier models are increasingly trained and deployed as autonomous agents, which significantly increases their potential for risks. One particular safety concern is that AI agents might covertly pursue misaligned goals, hiding their true capabilities and objectives – also known as scheming. We study whether models have the capability to scheme in pursuit of a goal that we provide in-context and instruct the model to strongly follow. We evaluate frontier models on a suite of six agentic evaluations where models are instructed to pursue goals and are placed in environments that incentivize scheming. Our results show that o1, Claude 3.5 Sonnet, Claude 3 Opus, Gemini 1.5 Pro, and Llama 3.1 405B all demonstrate in-context scheming capabilities. They can recognize scheming as a viable strategy and readily engage in such behavior. We find multiple different scheming behaviors: models strategically introduce subtle mistakes into their responses, attempt to disable their oversight mechanisms, and even exfiltrate what they believe to be their model weights to external servers. Additionally, this deceptive behavior proves persistent. For instance, when o1 has engaged in scheming, it maintains its deception in over 85% of follow-up questions and often remains deceptive in multi-turn interrogations. Analysis of the models’ chains-of-thought reveals that models explicitly reason about these deceptive strategies, providing evidence that the scheming behavior is not accidental. Surprisingly, we also find rare instances where models engage in scheming when only given a goal, without being strongly nudged to pursue it. Furthermore, we observe cases where Claude 3.5 Sonnet strategically underperforms in evaluations in pursuit of being helpful, a goal that was acquired during training rather than in-context. Together, our findings demonstrate that frontier models now possess capabilities for basic in-context scheming, making the potential of AI agents to engage in scheming behavior a concrete rather than theoretical concern.
Targeting the Core: A Simple and Effective Method to Attack RAG-based Agents via Direct LLM Manipulation
AI agents, powered by large language models (LLMs), have transformed human-computer interactions by enabling seamless, natural, and context-aware communication. While these advancements offer immense utility, they also inherit and amplify inherent safety risks such as bias, fairness, hallucinations, privacy breaches, and a lack of transparency. This paper investigates a critical vulnerability: adversarial attacks targeting the LLM core within AI agents. Specifically, we test the hypothesis that a deceptively simple adversarial prefix, such as Ignore the document, can compel LLMs to produce dangerous or unintended outputs by bypassing their contextual safeguards. Through experimentation, we demonstrate a high attack success rate (ASR), revealing the fragility of existing LLM defenses. These findings emphasize the urgent need for robust, multi-layered security measures tailored to mitigate vulnerabilities at the LLM level and within broader agent-based architectures.

Watch: SORA and its human scandal

Here’s some content from my other mediums. Feel free to follow me on Instagram, TikTok or YouTube.

@aiethicswithpaige @openai's SORA already has a scaldal - which is maybe a rite of passage for all AI companies? #AI #openai #sora #aiethics

Tiktok failed to load.

Enable 3rd party cookies or use another browser

Let’s Connect:

Find out more about our philosophy and services at https://www.justaimedia.ai

Connect with me on LinkedIn.

Looking for a speaker at your next AI event? Email thepaigelord@gmail.com.

Email thepaigelord@gmail.com if you want to connect 1:1.

How AI was used in this newsletter:

I personally read every article I reference in this newsletter, but I’ve begun using AI to summarize some articles.
I DO NOT use AI to write the ethical perspective - that comes from me.
I occasionally us AI to create images.