Tech | Business | Economy

Did DeepSeek-R1 Train on OpenAI’s Model? Study Finds 74.2% Similarity

…While Microsoft’s Phi-4 Shows 99.3% Independence

by Joan Aimuengheuwa
March 4, 2025
in EnterpriseTECH
Source: Getty Images


A new study by Copyleaks has found a strong similarity between text generated by DeepSeek-R1 and text produced by OpenAI’s model.

According to the research, 74.2% of DeepSeek-R1’s outputs share stylistic fingerprints with OpenAI’s technology, raising questions about whether DeepSeek relied on OpenAI’s model during training.

The finding has also prompted discussion of data sourcing, intellectual property rights, and transparency in AI development. If DeepSeek-R1 was trained on OpenAI-generated content without disclosure, that could create legal and ethical risks, including reinforced biases and reduced diversity in AI-generated text.

The study used an advanced text attribution method built on three independent AI classifiers, each trained on outputs from OpenAI, Gemini, Claude, and Llama. To ensure accuracy, a classification was confirmed only when all three classifiers reached the same conclusion. This approach achieved a 99.88% precision rate, with a false-positive rate of just 0.04%.
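The unanimity rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Copyleaks’ implementation: the toy classifiers, the `alpha` marker, and the function names `unanimous_verdict` and `attribution_rates` are all hypothetical stand-ins for the study’s trained models.

```python
from collections import Counter
from typing import Callable, Dict, Optional, Sequence

# A classifier maps a text to the label of the model family it believes
# wrote that text (for example "openai", "gemini", "claude" or "llama").
Classifier = Callable[[str], str]


def unanimous_verdict(text: str, jury: Sequence[Classifier]) -> Optional[str]:
    """Return the shared label if every classifier agrees, otherwise None."""
    votes = {classify(text) for classify in jury}
    return votes.pop() if len(votes) == 1 else None


def attribution_rates(
    texts: Sequence[str], jury: Sequence[Classifier]
) -> Dict[Optional[str], float]:
    """Share of texts attributed to each label; the key None means no consensus."""
    if not texts:
        return {}
    counts = Counter(unanimous_verdict(text, jury) for text in texts)
    return {label: n / len(texts) for label, n in counts.items()}


def make_toy_classifier(marker: str, fallback: str) -> Classifier:
    """Hypothetical stand-in for a trained classifier: a simple keyword rule."""
    def classify(text: str) -> str:
        return "openai" if marker in text else fallback
    return classify


# Three toy jurors that agree only when the marker is present.
jury = [
    make_toy_classifier("alpha", "gemini"),
    make_toy_classifier("alpha", "claude"),
    make_toy_classifier("alpha", "llama"),
]

sample_texts = ["alpha one", "alpha two", "beta", "alpha three"]
rates = attribution_rates(sample_texts, jury)
print(rates)
```

Requiring all three votes to match trades coverage for confidence: any text the jurors disagree on is left unattributed rather than assigned to the majority label.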

In testing, DeepSeek-R1’s texts aligned with OpenAI’s writing style in 74.2% of cases. In contrast, Microsoft’s Phi-4 model showed a 99.3% disagreement rate, meaning its texts did not match the style of any existing AI model tested, which the study takes as an indication of independent training.

Source: Copyleaks

Shai Nisan, Copyleaks’ chief data scientist, commented on the importance of the findings, stating, “With this research, we have moved beyond general AI detection as we knew it and into model-specific attribution, a breakthrough that fundamentally changes how we approach AI content.”

The research team, led by Yehonatan Bitton, Shai Nisan, and Elad Bitton, adopted a rigorous “unanimous jury” approach to ensure the reliability of their findings. Their method not only identifies known AI models but also detects previously unseen ones by analysing unique stylistic markers.

If DeepSeek-R1 was developed using OpenAI’s work without proper attribution, it could mislead investors and stakeholders about the originality of its technology.

The findings ultimately raise concerns about AI governance, competitive fairness, and the risk of intellectual property infringement in the industry. Transparency in model training and attribution is essential to maintaining trust and ensuring ethical development practices.

Tags: Copyleaks, DeepSeek, DeepSeek and OpenAI, DeepSeek-R1, Microsoft, Microsoft Phi-4, OpenAI, Shai Nisan
© 2026 Techeconomy - Techeconomy.
