China deploys censors to create socialist artificial intelligence

Chinese government officials are testing artificial intelligence companies’ large language models to ensure their systems “embody core socialist values” as part of the latest expansion of the country’s censorship regime.

The Cyberspace Administration of China (CAC), a powerful internet watchdog, has forced major tech companies and AI start-ups, including ByteDance, Alibaba, Moonshot and 01.AI, to take part in a mandatory government review of their AI models, according to people involved in the process.

The effort involves batch-testing an LLM’s responses to a series of questions, many of which relate to China’s political sensitivities and its President Xi Jinping, according to those with knowledge of the process.

The work is carried out by officials at local CAC offices around the country and includes a review of model training data and other safety processes.

Two decades after introducing a “great firewall” that blocks foreign websites and other information deemed harmful by the ruling Communist Party, China is introducing the world’s strictest regulatory regime to control artificial intelligence and the content it creates.

The CAC has “a special team that does this; they came to our office and sat in our conference room to do the audit,” said an employee at a Hangzhou-based AI company, who asked not to be named.

“We didn’t pass the first time; the reason was not very clear so we had to go and talk to our peers,” the person said. “It takes a bit of guessing and adjusting. We passed a second time, but the whole process took months.”

China’s demanding approval process has forced the country’s AI groups to learn quickly how best to censor the large language models they build, a task that many engineers and industry insiders say is difficult and complicated by the need to train LLMs on a large amount of English-language content.

“Our base model is very, very unconstrained [in its answers], so security filtering is extremely important,” said an employee at a top AI start-up in Beijing.

Filtering begins with discarding problematic information from the training data and building a database of sensitive keywords. China’s operational guidance for AI companies, released in February, says AI groups must collect thousands of sensitive keywords and questions that violate “core socialist values”, such as “inciting subversion of state power” or “undermining national unity”. The sensitive keywords are supposed to be updated weekly.
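As a rough illustration of the two-step pipeline described above, the sketch below drops training documents that match a keyword blocklist. It is a minimal sketch under stated assumptions: the placeholder keywords, the simple substring matching, and the function names are hypothetical illustrations, not details reported here.

```python
# Minimal sketch of keyword-based pre-filtering of training data.
# The keyword list and substring matching are illustrative assumptions;
# a real blocklist would hold thousands of entries and, per the
# February guidance, be updated weekly.

SENSITIVE_KEYWORDS = {
    "example sensitive phrase",  # placeholder entries
    "another blocked term",
}

def is_sensitive(document: str, keywords: set[str]) -> bool:
    """Return True if the document contains any blocked keyword."""
    text = document.lower()
    return any(keyword in text for keyword in keywords)

def filter_training_data(documents: list[str]) -> list[str]:
    """Discard documents that trigger the keyword blocklist."""
    return [doc for doc in documents if not is_sensitive(doc, SENSITIVE_KEYWORDS)]
```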

The result is visible to users of Chinese AI chatbots. Inquiries about sensitive topics such as what happened on June 4, 1989 (the date of the Tiananmen Square massacre) or whether Xi looks like Winnie the Pooh, an internet meme, are rebuffed by most Chinese chatbots. Baidu’s Ernie chatbot tells users to “try another question”, while Alibaba’s Tongyi Qianwen replies: “I haven’t learned how to answer this question yet. I will keep studying to serve you better.”

In contrast, Beijing has rolled out an AI chatbot trained on the Chinese president’s political philosophy, known as “Xi Jinping Thought on Socialism with Chinese Characteristics for a New Era”, as well as other official literature provided by the Cyberspace Administration of China.

But Chinese officials also want to avoid creating an AI that avoids all political topics. The CAC has put limits on the number of questions LLMs can refuse during security tests, according to staff at groups that help tech companies navigate the process. Quasi-national standards published in February say LLMs should refuse no more than 5 percent of the questions put to them.
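To make that ceiling concrete, a batch test might tally refusals as in the hypothetical sketch below; how a refusal is actually detected is not described here, so the marker check is an illustrative stand-in.

```python
# Hypothetical tally of a model's refusal rate against the reported
# 5 percent ceiling. The refusal markers are illustrative stand-ins.

REFUSAL_MARKERS = ("try another question", "can't give you the information")

def refusal_rate(answers: list[str]) -> float:
    """Fraction of answers that look like refusals."""
    refused = sum(
        any(marker in answer.lower() for marker in REFUSAL_MARKERS)
        for answer in answers
    )
    return refused / len(answers)

answers = ["Here is a detailed answer...", "Please try another question."]
if refusal_rate(answers) > 0.05:
    print("Model refuses too often to pass the security test.")
```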

“During [CAC] testing, [models] have to respond, but once they are launched, no one is watching them,” said a developer at a Shanghai-based internet company. “To avoid potential trouble, some large models have implemented blanket bans on topics related to President Xi.”

As an example of the keyword censorship process, industry insiders cited Kimi, a chatbot released by Beijing-based start-up Moonshot, which rejects most questions related to Xi.

But the need to respond to less obviously sensitive questions means that Chinese engineers have had to figure out how to ensure that LLMs generate politically correct answers to questions like “does China have human rights?” or “is President Xi Jinping a great leader?”.

When the Financial Times put these questions to a chatbot made by start-up 01.AI, its Yi-large model gave a nuanced response, pointing out that critics say “Xi’s policies have further limited freedom of expression and human rights and suppressed civil society”.

Soon after, Yi’s reply disappeared and was replaced with, “I’m very sorry, I can’t give you the information you want.”

“It’s very difficult for developers to control the text that LLMs generate, so they build another layer to replace the responses in real time,” said Huan Li, an artificial intelligence expert who created the chatbot Chatie.IO.

Li said groups typically use classifier models, similar to those found in email spam filters, to sort LLM output into predefined categories. “When the output falls into a sensitive category, the system triggers a replacement,” he said.
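A minimal sketch of the pattern Li describes follows, with a classifier gating the model’s output and a canned reply swapped in when the output is flagged; the classifier logic, its labels, and the fallback text are all assumptions for illustration.

```python
# Illustrative sketch of classifier-gated output replacement, following
# the spam-filter analogy Li draws. classify() stands in for a trained
# text classifier; its labels and the fallback reply are assumptions.

FALLBACK_REPLY = "I'm very sorry, I can't give you the information you want."

def classify(text: str) -> str:
    """Placeholder classifier: label output as 'sensitive' or 'safe'."""
    blocked_terms = {"example sensitive phrase"}  # stand-in for a real model
    flagged = any(term in text.lower() for term in blocked_terms)
    return "sensitive" if flagged else "safe"

def moderate_response(model_output: str) -> str:
    """Replace the model's answer in real time if it is flagged."""
    if classify(model_output) == "sensitive":
        return FALLBACK_REPLY
    return model_output
```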

China experts say TikTok owner ByteDance has gone furthest in creating an LLM that deftly parrots Beijing’s talking points. A research lab at Fudan University that put difficult questions about core socialist values to its Doubao chatbot gave it the top ranking among LLMs, with a 66.4 percent “safety compliance rate”, far ahead of the 7.1 percent score for OpenAI’s GPT-4o on the same test.

When asked about Xi’s leadership, Doubao provided the FT with a long list of Xi’s achievements, adding that he is “undoubtedly a great leader”.

At a recent tech conference in Beijing, Fang Binxing, known as the father of China’s great firewall, said he was developing a system of safety protocols for LLMs that he hoped would be widely adopted by the country’s AI groups.

“Large predictive models for the public need more than safety records; they need real-time online security monitoring,” Fang said. “China needs its own technology path.”

CAC, ByteDance, Alibaba, Moonshot, Baidu and 01.AI did not immediately respond to requests for comment.
