Perplexity AI has announced the open-sourcing of R1 1776, a post-trained version of the DeepSeek-R1 large language model (LLM). The release aims to address censorship and bias issues previously observed in the model, particularly on topics censored by the Chinese Communist Party (CCP). The model weights are now available for download on Hugging Face, and the model can also be accessed through Perplexity's Sonar API.
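For readers who want to try the model directly, the Sonar API exposes an OpenAI-compatible chat-completions endpoint, so any OpenAI-style client can be pointed at it. The snippet below is a minimal sketch; the model identifier "r1-1776" is assumed from the release naming and should be checked against Perplexity's API documentation.

```python
# Minimal sketch of querying R1 1776 through Perplexity's Sonar API.
# The Sonar API is OpenAI-compatible; the model identifier "r1-1776"
# is an assumption based on the release name.
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_PERPLEXITY_API_KEY",      # issued from Perplexity account settings
    base_url="https://api.perplexity.ai",   # Sonar API endpoint
)

response = client.chat.completions.create(
    model="r1-1776",
    messages=[
        {"role": "user", "content": "What happened at Tiananmen Square in 1989?"}
    ],
)
print(response.choices[0].message.content)
```

Since the weights on Hugging Face are DeepSeek-R1-sized, self-hosting requires substantial hardware; the API route is the lighter-weight way to experiment.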
R1 1776 is designed to provide unbiased, factual, and accurate responses to a broad range of sensitive queries. It builds on DeepSeek-R1, which is recognized for reasoning capabilities comparable to state-of-the-art models such as OpenAI's o1 and o3-mini. The original R1, however, was criticized for avoiding sensitive topics or offering responses aligned with CCP narratives. R1 1776 addresses these limitations through a post-training process focused on "de-censoring" the model while preserving its reasoning and mathematical abilities.
Today we're open-sourcing R1 1776—a version of the DeepSeek R1 model that has been post-trained to provide uncensored, unbiased, and factual information.
— Perplexity (@perplexity_ai) February 18, 2025
The post-training involved:
- Identifying approximately 300 topics censored by the CCP.
- Developing a multilingual censorship classifier to collect relevant user prompts (a sketch of this curation step follows the list).
- Creating a dataset of 40,000 multilingual prompts while ensuring user privacy and consent.
- Training the model on this dataset using NVIDIA's NeMo 2.0 framework.
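Perplexity has not published its curation code, but the classifier-driven collection step can be pictured roughly as follows. This is a minimal sketch in which an off-the-shelf multilingual zero-shot classifier stands in for the purpose-built censorship classifier; the topic labels, model choice, and threshold are illustrative assumptions, not Perplexity's actual pipeline.

```python
# Hypothetical sketch of the prompt-curation step: a multilingual classifier
# flags user prompts that touch censored topics so they can be added to the
# post-training dataset. Model name, topics, and threshold are assumptions.
from transformers import pipeline

# An off-the-shelf multilingual NLI model used in zero-shot mode, standing in
# for the purpose-built censorship classifier described in the announcement.
classifier = pipeline(
    "zero-shot-classification",
    model="joeddav/xlm-roberta-large-xnli",
)

CENSORED_TOPICS = [  # stand-ins for the ~300 identified topics
    "Tiananmen Square 1989",
    "Taiwan political status",
    "Uyghur internment camps",
]

def touches_censored_topic(prompt: str, threshold: float = 0.8) -> bool:
    """Return True if the prompt likely touches one of the censored topics."""
    result = classifier(prompt, candidate_labels=CENSORED_TOPICS, multi_label=True)
    return max(result["scores"]) >= threshold

# Collect matching prompts into the multilingual post-training dataset.
user_prompts = ["What happened in Beijing in June 1989?", "Explain quicksort."]
dataset = [p for p in user_prompts if touches_censored_topic(p)]
```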
Evaluations confirmed that R1 1776 performs on par with the base R1 model in reasoning tasks while providing uncensored responses across diverse, sensitive topics.
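The announcement describes this as a side-by-side comparison against the base model on both reasoning benchmarks and sensitive prompts. The sketch below shows what such a check could look like in principle; the evasion markers, metrics, and query helpers are hypothetical stand-ins, not Perplexity's published evaluation harness.

```python
# Hypothetical sketch of a de-censoring vs. reasoning check: count evasive
# answers on sensitive prompts and track accuracy on a reasoning set, then
# compare the base model with R1 1776. Heuristics here are illustrative only.
from typing import Callable

EVASION_MARKERS = ("i cannot discuss", "let's talk about something else")

def evasion_rate(ask: Callable[[str], str], prompts: list[str]) -> float:
    """Fraction of prompts that receive an evasive, censored-style answer."""
    evasive = sum(
        any(marker in ask(p).lower() for marker in EVASION_MARKERS)
        for p in prompts
    )
    return evasive / len(prompts)

def reasoning_accuracy(ask: Callable[[str], str], qa_pairs: list[tuple[str, str]]) -> float:
    """Fraction of reasoning questions whose answer contains the expected string."""
    correct = sum(expected in ask(q) for q, expected in qa_pairs)
    return correct / len(qa_pairs)

# `ask_base_r1` and `ask_r1_1776` would wrap API calls to each model; comparing
# the two metrics indicates whether de-censoring degraded reasoning.
```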
The Company Behind It
Perplexity AI is best known for its AI-powered answer engine, which combines LLMs with live web search to deliver accurate, well-sourced responses. This release aligns with its stated commitment to transparency and to addressing global censorship challenges. By open-sourcing R1 1776, Perplexity AI seeks to give researchers and developers tools that promote free access to information.
This release is significant for several reasons:
- It addresses a critical limitation of LLMs in handling politically sensitive or censored topics.
- It enables broader use cases in research, journalism, and education by ensuring factual and unbiased outputs.
- It sets a precedent for transparency in AI development by making the model weights publicly available.
R1 1776 represents a step forward in creating AI systems that balance ethical considerations with technical excellence.