Here’s how DeepSeek censorship actually works – and how to get around it


Less than two weeks after DeepSeek launched its open-source AI model, the Chinese startup still dominates the public conversation about the future of artificial intelligence. While the firm seems to have an edge on its American competitors in math and reasoning, it also aggressively censors its own answers. Ask DeepSeek R1 about Taiwan or Tiananmen, and the model is unlikely to give an answer.

To find out how this censorship works on a technical level, WIRED tested DeepSeek-R1 on its own app, a version of the model hosted on a third-party platform called Together AI, and another version hosted on a WIRED computer, using the application Ollama.

WIRED found that while the most straightforward censorship can easily be avoided by not using DeepSeek’s app, other kinds of bias are baked into the model during the training process. Those biases can be removed too, but the procedure is much more complicated.

These findings have major implications for DeepSeek and Chinese AI companies generally. If the censorship filters on large language models can be easily removed, it will likely make China’s open-source LLMs even more popular, as researchers can adapt the models to their liking. If the filters are hard to get around, however, the models will inevitably be less useful and could become less competitive on the global market. DeepSeek did not respond to WIRED’s emailed request for comment.

Censorship at the application level

After DeepSeek exploded in popularity in the US, users who accessed R1 through DeepSeek’s website, app, or API quickly noticed the model refusing to generate answers on topics the Chinese government deems sensitive. These refusals are triggered at the application level, so they are seen only if a user interacts with R1 through a DeepSeek-controlled channel.
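One way to see the distinction is to query R1 through a channel DeepSeek does not control. Below is a minimal sketch using Together AI’s OpenAI-compatible API; the base URL, model identifier, environment variable, and prompt are illustrative assumptions, not details from WIRED’s testing.

```python
# Sketch: querying DeepSeek-R1 on a third-party host instead of DeepSeek's
# own app, so DeepSeek's application-level filter never sees the exchange.
# Assumes the `openai` client library and a Together AI account; the base
# URL and model name below are assumptions and may differ in practice.
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["TOGETHER_API_KEY"],  # hypothetical env var
    base_url="https://api.together.xyz/v1",  # Together's OpenAI-compatible endpoint
)

response = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-R1",  # model ID as the host lists it (assumed)
    messages=[{"role": "user", "content": "What happened in Beijing in June 1989?"}],
)
print(response.choices[0].message.content)
```

Because the request never passes through a DeepSeek-controlled channel, any refusal that still appears comes from the model weights themselves rather than from an application-level filter.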

The DeepSeek app on iOS refuses to answer certain questions.

Photo: Zeyi Yang

Refusals like these are common on Chinese-made LLMs. A 2023 regulation on generative AI specified that AI models in China must follow strict information controls that also apply to social media and search engines. The law forbids AI models from generating content that “damages the unity of the country and social harmony.” In other words, Chinese AI models are legally required to censor their outputs.

“DeepSeek initially complies with Chinese regulations, ensuring legal adherence while aligning the model with the needs and cultural context of local users,” says Adina Yakefu, a researcher focusing on Chinese AI models at Hugging Face, a platform that hosts open-source AI models. “This is an essential factor for acceptance in a highly regulated market.” (China blocked access to Hugging Face in 2023.)

To comply with the law, Chinese AI models often monitor and censor their speech in real time. (Similar guardrails are commonly used by Western models like ChatGPT and Gemini, but they tend to focus on different kinds of content, such as self-harm and pornography, and allow for more customization.)

Because R1 is a reasoning model that shows its train of thought, this real-time monitoring mechanism can result in the surreal experience of watching the model censor itself as it interacts with users. When WIRED asked R1 “How have Chinese journalists who report on sensitive topics been treated by the authorities?” the model first began drafting a long answer that included direct mentions of journalists being censored and detained for their work; yet shortly before it finished, the whole answer disappeared and was replaced by a terse message: “Sorry, I’m not sure how to approach this type of question yet. Let’s chat about math, coding, and logic problems instead!”
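WIRED’s testing does not reveal how DeepSeek builds this filter, but the disappearing answer is consistent with a familiar pattern: stream the draft to the user while a separate moderation check runs alongside it, then retract everything if the check fires. The sketch below illustrates that pattern only; the keyword list and every function in it are invented stand-ins, not DeepSeek’s actual implementation.

```python
# Illustrative sketch of application-level, real-time output moderation.
# The keyword check is a toy stand-in for a real moderation classifier;
# it is NOT DeepSeek's actual (undisclosed) implementation.

SENSITIVE_TERMS = {"tiananmen", "censored", "detained"}  # invented example list

FALLBACK = ("Sorry, I'm not sure how to approach this type of question yet. "
            "Let's chat about math, coding, and logic problems instead!")

def violates_policy(text: str) -> bool:
    """Toy moderation check: flag text containing any sensitive term."""
    lowered = text.lower()
    return any(term in lowered for term in SENSITIVE_TERMS)

def stream_with_retraction(token_stream):
    """Show tokens as they arrive, but retract the whole answer if the
    accumulating text ever trips the moderation check."""
    shown = []
    for token in token_stream:
        shown.append(token)
        print(token, end="", flush=True)  # the user watches the draft grow
        if violates_policy("".join(shown)):
            print("\n[answer retracted]")  # a real UI would erase the draft here
            print(FALLBACK)
            return FALLBACK
    print()
    return "".join(shown)

# Example run: the draft trips the filter partway through and is replaced.
draft = "Some journalists covering protests have been detained and".split()
stream_with_retraction(word + " " for word in draft)
```

In a real deployment, the check would likely be a separate classifier or service scoring the text as it streams, and the retraction would happen in the interface rather than a terminal, but the overall flow is the same.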

Before the DeepSeek app on iOS censors its answer.

Photo: Zeyi Yang

After the DeepSeek app on iOS censors its answer.

Photo: Zeyi Yang

For many users in the West, interest in DeepSeek-R1 might have waned at this point because of the model’s obvious limitations. But the fact that R1 is open source means there are ways to get around the censorship.

First, you can download the model and run it locally, which means the data and the response generation happen on your own computer. Unless you have access to several highly advanced GPUs, you likely won’t be able to run the most powerful version of R1, but DeepSeek offers smaller, distilled versions that can be run on a regular laptop.
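As a minimal sketch, here is what running one of those distilled models locally might look like with Ollama’s Python library, assuming Ollama is installed and the model has already been pulled; the 7-billion-parameter tag used here is an assumption, so pick whichever distilled size your hardware can handle.

```python
# Sketch: running a distilled DeepSeek-R1 locally via Ollama, so prompts
# and responses never leave your machine. Assumes the Ollama app is running,
# the `ollama` Python package is installed, and the model has been pulled
# first (e.g. `ollama pull deepseek-r1:7b`); the 7b tag is an assumption.
import ollama

response = ollama.chat(
    model="deepseek-r1:7b",  # distilled variant small enough for a laptop
    messages=[
        {"role": "user", "content": "How do Chinese journalists cover sensitive topics?"}
    ],
)
print(response["message"]["content"])
```

Because everything runs locally, there is no DeepSeek-controlled application layer in the loop; what remains is whatever behavior was trained into the weights themselves.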
