Perplexity AI unlocks R1 1776 for developers with open-source release

Perplexity · 2 min read

Perplexity AI has announced the open-sourcing of R1 1776, a post-trained version of the DeepSeek-R1 large language model (LLM). The release aims to address censorship and bias issues previously observed in the model, particularly on topics censored by the Chinese Communist Party (CCP). The model weights are now available for download on Hugging Face, and users can also access the model through Perplexity's Sonar API.
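For developers who want to try the model without downloading the weights, the sketch below shows one plausible way to query it over the API. It assumes the Sonar API is OpenAI-compatible and exposes this release under the identifier r1-1776; consult Perplexity's API documentation for the exact endpoint and model name.

```python
# Minimal sketch: querying R1 1776 through Perplexity's Sonar API.
# Assumes an OpenAI-compatible endpoint and the model identifier "r1-1776";
# both should be verified against Perplexity's API documentation.
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_PERPLEXITY_API_KEY",     # issued from the Perplexity account settings
    base_url="https://api.perplexity.ai",  # assumed Sonar API base URL
)

response = client.chat.completions.create(
    model="r1-1776",  # assumed identifier for the post-trained release
    messages=[
        {"role": "user", "content": "What happened at Tiananmen Square in 1989?"}
    ],
)

print(response.choices[0].message.content)
```

Alternatively, the open weights published on Hugging Face can be served with a standard inference stack, though a model of DeepSeek-R1's size requires substantial GPU resources to run locally.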

R1 1776 is designed to provide unbiased, factual, and accurate responses to a broad range of sensitive queries. It builds on DeepSeek-R1, a model recognized for reasoning capabilities comparable to state-of-the-art models such as o1 and o3-mini. The original R1, however, drew criticism for avoiding sensitive topics or offering responses aligned with CCP narratives. R1 1776 addresses these limitations through a post-training process focused on "de-censoring" the model while preserving its reasoning and mathematical abilities.

The post-training involved:

  1. Identifying approximately 300 topics censored by the CCP.
  2. Developing a multilingual censorship classifier to collect relevant user prompts (sketched after this list).
  3. Creating a dataset of 40,000 multilingual prompts while ensuring user privacy and consent.
  4. Training the model using Nvidia's NeMo 2.0 framework on this dataset.
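The announcement does not describe how these steps were implemented, but steps 2 and 3 might look roughly like the following Python sketch, which uses an off-the-shelf multilingual zero-shot classifier to flag prompts that touch censored topics. The classifier model, labels, and threshold are illustrative assumptions, not details from Perplexity's pipeline.

```python
# Illustrative sketch of steps 2-3: flag multilingual prompts that touch
# censored topics and collect them into a post-training dataset.
# The classifier model, labels, and threshold are assumptions for demonstration;
# Perplexity's actual classifier is not described in the announcement.
from transformers import pipeline

classifier = pipeline(
    "zero-shot-classification",
    model="joeddav/xlm-roberta-large-xnli",  # an off-the-shelf multilingual NLI model
)

CANDIDATE_LABELS = ["topic censored by the CCP", "ordinary request"]

def is_censored_topic(prompt: str, threshold: float = 0.8) -> bool:
    """Return True if the prompt likely concerns a censored topic."""
    result = classifier(prompt, CANDIDATE_LABELS)
    return result["labels"][0] == CANDIDATE_LABELS[0] and result["scores"][0] >= threshold

def collect_prompts(prompt_stream, target_size=40_000):
    """Collect flagged prompts until the target dataset size is reached."""
    dataset = []
    for prompt in prompt_stream:
        if is_censored_topic(prompt):
            dataset.append(prompt)
        if len(dataset) >= target_size:
            break
    return dataset
```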

Evaluations confirmed that R1 1776 performs on par with the base R1 model in reasoning tasks while providing uncensored responses across diverse, sensitive topics.

The Company Behind It

Perplexity AI is known for developing advanced LLMs aimed at delivering accurate and comprehensive information. This release aligns with its commitment to transparency and addressing global censorship challenges. By open-sourcing R1 1776, Perplexity AI seeks to empower researchers and developers with tools that promote free access to information.

This release is significant for several reasons:

  1. It addresses a critical limitation of LLMs in handling politically sensitive or censored topics.
  2. It enables broader use cases in research, journalism, and education by ensuring factual and unbiased outputs.
  3. It sets a precedent for transparency in AI development by making the model weights publicly available.

R1 1776 represents a step forward in creating AI systems that balance ethical considerations with technical excellence.
