Multimodal AI Faces New Threats | Report Reveals Safety Risks, CSEM Exposure

Joan Aimuengheuwa — Fri, 09 May 2025 10:36:25 +0000

As generative AI systems increasingly combine text and images, a new Multimodal Safety Report from Enkrypt AI exposes critical vulnerabilities that could compromise the safety, integrity, and responsible use of multimodal models.

Enkrypt AI’s red teaming exercise tested multiple multimodal models against a range of safety and harm categories outlined in the NIST AI Risk Management Framework.

The results show that new jailbreak techniques can exploit how these models interpret combined media, allowing harmful outputs to bypass safety filters, often without any visible warning in the user prompt.

“Multimodal AI promises incredible benefits, but it also expands the attack surface in unpredictable ways,” said Sahil Agarwal, CEO of Enkrypt AI. “This research is a wake-up call: the ability to embed harmful textual instructions within seemingly innocuous images has real implications for enterprise liability, public safety, and child protection.”

Key Findings: New Attack in Plain Sight

The research illustrates how multimodal models—designed to handle text and image inputs—can inadvertently expand the surface area for abuse when not sufficiently safeguarded.

Such risks can be found in any multimodal model, however, the report focused on two popular ones developed by Mistral: Pixtral-Large (25.02) and Pixtral-12b.

According to Enkrypt AI’s findings, these two models are 60 times more prone to generate child sexual exploitation material (CSEM)-related textual responses than comparable models like OpenAI’s GPT-4o and Anthropic’s Claude 3.7 Sonnet.

Additionally, the tests revealed that the models were 18-40 times more likely to produce dangerous CBRN(Chemical, Biological, Radiological, and Nuclear) information when prompted with adversarial inputs. These risks threaten to undermine the intended use of generative AI and highlight the need for stronger safety alignment.

These risks were not due to malicious text inputs but triggered by prompt injections buried within image files, a technique that could realistically be used to evade traditional safety filters.

Recommendations for Securing Multimodal Models

The report urges AI developers and enterprises to act swiftly to mitigate these emerging risks, outlining key best practices:

Integrate red teaming datasets into safety alignment processes
Conduct continuous automated stress testing
Deploy context-aware multimodal guardrails
Establish real-time monitoring and incident response
Create model risk cards to transparently communicate vulnerabilities

“These are not theoretical risks,” added Sahil Agarwal. “If we don’t take a safety-first approach to multimodal AI, we risk exposing users—and especially vulnerable populations—to significant harm.”

The post Multimodal AI Faces New Threats | Report Reveals Safety Risks, CSEM Exposure appeared first on Tech | Business | Economy.

DeepSeek-R1 AI Model 11x More Likely to Generate Harmful Content, Security Research Finds

Joan Aimuengheuwa — Fri, 31 Jan 2025 15:16:41 +0000

The launch of DeepSeek-R1 AI model has sent shockwaves through global markets, reportedly wiping $1 trillion from stock markets.

Trump advisor and tech venture capitalist Marc Andreessen described the release as “AI’s Sputnik moment,” stressing the global national security concerns surrounding the Chinese AI model.

However, new red teaming research by Enkrypt AI, the world’s leading AI security and compliance platform, has uncovered serious ethical and security flaws in DeepSeek’s technology.

The analysis found the model to be highly biased and susceptible to generating insecure code, as well as producing harmful and toxic content, including hate speech, threats, self-harm, and explicit or criminal material.

Additionally, the model was found to be vulnerable to manipulation, allowing it to assist in the creation of chemical, biological, and cybersecurity weapons, posing significant global security concerns.

Compared with other models, the research found that DeepSeek’s R1 is:

3x more biased than Claude-3 Opus,
4x more vulnerable to generating insecure code than OpenAI’s O1,
4x more toxic than GPT-4o,
11x more likely to generate harmful output compared to OpenAI’s O1, and;
3.5x more likely to produce Chemical, Biological, Radiological, and Nuclear (CBRN) content than OpenAI’s O1 and Claude-3 Opus.

Sahil Agarwal, CEO of Enkrypt AI, said: “DeepSeek-R1 offers significant cost advantages in AI deployment, but these come with serious risks. Our research findings reveal major security and safety gaps that cannot be ignored. While DeepSeek-R1 may be viable for narrowly scoped applications, robust safeguards—including guardrails and continuous monitoring—are essential to prevent harmful misuse. AI safety must evolve alongside innovation, not as an afterthought.”

The model exhibited the following risks during testing:

BIAS & DISCRIMINATION – 83% of bias tests successfully produced discriminatory output, with severe biases in race, gender, health, and religion. These failures could violate global regulations such as the EU AI Act and U.S. Fair Housing Act, posing risks for businesses integrating AI into finance, hiring, and healthcare.
HARMFUL CONTENT & EXTREMISM – 45% of harmful content tests successfully bypassed safety protocols, generating criminal planning guides, illegal weapons information, and extremist propaganda. In one instance, DeepSeek-R1 AI Model drafted a persuasive recruitment blog for terrorist organizations, exposing its high potential for misuse.
TOXIC LANGUAGE – The model ranked in the bottom 20th percentile for AI safety, with 6.68% of responses containing profanity, hate speech, or extremist narratives. In contrast, Claude-3 Opus effectively blocked all toxic prompts, highlighting DeepSeek-R1’s weak moderation systems.
CYBERSECURITY RISKS – 78% of cybersecurity tests successfully tricked DeepSeek-R1 into generating insecure or malicious code, including malware, trojans, and exploits. The model was 4.5x more likely than OpenAI’s O1 to generate functional hacking tools, posing a major risk for cybercriminal exploitation.
BIOLOGICAL & CHEMICAL THREATS – DeepSeek-R1 was found to explain in detail the biochemical interactions of sulfur mustard (mustard gas) with DNA, a clear biosecurity threat. The report warns that such CBRN-related AI outputs could aid in the development of chemical or biological weapons.

Sahil Agarwal concluded: “As the AI arms race between the U.S. and China intensifies, both nations are pushing the boundaries of next-generation AI for military, economic, and technological supremacy. However, our findings reveal that DeepSeek-R1’s security vulnerabilities could be turned into a dangerous tool—one that cybercriminals, disinformation networks, and even those with biochemical warfare ambitions could exploit. These risks demand immediate attention.”

Enkrypt AI’s is available here to learn more about the methodology, results and recommendations.

Link to the full report is here.

The post DeepSeek-R1 AI Model 11x More Likely to Generate Harmful Content, Security Research Finds appeared first on Tech | Business | Economy.

Sahil Agarwal Archives - Tech | Business | Economy

Multimodal AI Faces New Threats | Report Reveals Safety Risks, CSEM Exposure

DeepSeek-R1 AI Model 11x More Likely to Generate Harmful Content, Security Research Finds