The Chinese quant fund turned AI pioneer

One of the pack of Chinese AI hopefuls trying to take on the likes of OpenAI comes from an unusual source: the quant fund that dominates the country’s financial sector.

High-Flyer Capital Management, a Chinese quantitative hedge fund that has grown to roughly Rmb60 billion ($8 billion) since launching in 2015, uses AI and algorithms in part to identify patterns or variables that could affect stock prices.

Now it has turned that expertise and infrastructure into a powerful artificial intelligence model that is already on the market and that experts say is on par with leading Western efforts. DeepSeek-V2 can answer questions, write code and reason.

DeepSeek costs significantly less than competitors, at around Rmb2 for every million output tokens – or words returned for a query – which has sparked a price war among Chinese AI providers.

A week after DeepSeek-V2’s launch in May, tech giant ByteDance slashed prices to as low as Rmb0.60 per million output tokens. Rival Alibaba then cut usage prices for some of its models by up to 97 percent, and Baidu made two of its Ernie models free.

The launch of the new model, which quickly attracted thousands of Chinese developers, shows that tech giants such as Baidu and Alibaba face stiff competition from nimbler upstarts, despite their early lead in generative artificial intelligence. It has also put a spotlight on China’s highly competitive generative AI sector.

“The gap between the US and China is not as big as everyone thinks,” Liu Qingfeng, founder of Chinese AI group iFlytek, said at a recent technology meeting in Macau. “In many verticals our [models] are better than theirs.”

DeepSeek’s development is fueled by funding from sister hedge fund High-Flyer. Its funds have returned 151 percent since 2017, or 13 percent annualized, achieved in China’s battered domestic stock market. The country’s benchmark CSI 300 index, which tracks China’s top 300 stocks, rose 8 percent over the same period, according to research provider Simu Paipai.
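The link between the cumulative and annualized figures above can be checked with compound-growth arithmetic. The sketch below is only an illustration: the holding period of about seven and a half years is an assumption, since the article says only "since 2017", and the function name is made up for this example.

```python
def annualized_return(cumulative_return: float, years: float) -> float:
    """Convert a cumulative return (1.51 means 151%) into a compound annual rate."""
    if years <= 0:
        raise ValueError("years must be positive")
    return (1.0 + cumulative_return) ** (1.0 / years) - 1.0


# Assumption: roughly 7.5 years of compounding; the article does not give exact dates.
HOLDING_YEARS = 7.5

fund_rate = annualized_return(1.51, HOLDING_YEARS)   # High-Flyer funds, +151 percent
index_rate = annualized_return(0.08, HOLDING_YEARS)  # CSI 300, +8 percent

print(f"fund: {fund_rate:.1%} a year, index: {index_rate:.1%} a year")
```

Under that assumed period, the fund figure works out to about 13 percent a year, in line with the number quoted, while the index gain amounts to roughly 1 percent a year.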

In February, Beijing cracked down on quant funds, blaming their high-speed algorithmic trading for the stock market sell-off at the start of the year. Since then, High-Flyer funds have underperformed the CSI 300 by four percentage points.

High-Flyer and DeepSeek did not respond to requests for comment.

The quant fund began in a Chengdu apartment where founder Liang Wenfeng, a computer science graduate from Zhejiang University, experimented with automated stock trading, according to local media reports. His profile on the registry of the Asset Management Association of China states that he was a freelancer until 2013, when he founded his first investment firm.

By 2021, all of High-Flyer’s strategies used AI, according to manager Cai Liyu, following approaches similar to those pioneered by the hugely profitable hedge fund Renaissance Technologies. “AI helps extract valuable data from massive data sets that can be useful for predicting stock prices and making investment decisions,” he said during a roadshow that was broadcast online that year.

Cai said the company’s first computing cluster cost nearly Rmb200 million and that High-Flyer invested around Rmb1 billion to build a second supercomputer cluster that would span an area roughly the size of a football field. Most of its profits went back into the AI infrastructure, he added.

The second cluster, now complete, connects more than 10,000 high-end Nvidia processors to servers and storage, giving DeepSeek the computing power to train a large model, according to archived versions of the company’s website. The group acquired Nvidia A100 chips before Washington restricted their supply to China in mid-2022.

“We always wanted to do experiments on a larger scale, so we always tried to deploy as much computing power as possible,” founder Liang told Chinese technology website 36Kr last year. “We wanted to find a paradigm that could fully describe the entire financial market.”

The company is one of six Chinese groups with more than 10,000 A100 processors, a number commonly considered the computing threshold for training large models independently, according to Guosheng Securities. The other five are Chinese tech giants, though their collective computing power pales in comparison to that of American companies. Meta said it will have computing power equal to nearly 600,000 of Nvidia’s more advanced H100 chips by the end of the year.

Tests conducted by research groups rank DeepSeek-V2 among the best LLMs in the world. Researchers at the University of Waterloo in Canada ranked it among the top 10 models behind OpenAI’s GPT-4, Anthropic’s Claude and Chinese rival 01.AI.

The DeepSeek model is also open source, allowing AI researchers to inspect its structure and copy it.

“The architecture of the model is very unique,” said Andrew Carr, chief scientist at US-based AI animation startup Cartwheel. “DeepSeek took this idea called mixture of experts, where you break the model down into smaller parts, to an extreme, with hundreds of little experts.”
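The general idea Carr describes can be sketched in a few lines: a router scores every expert for each input token, only the few highest-scoring experts run, and their outputs are blended. The code below is a toy illustration of that routing pattern, not DeepSeek's actual architecture. The sizes, the random weights and every function name are assumptions made for this example.

```python
import math
import random


def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def make_expert(rng, dim):
    """A toy 'expert': a single random dim x dim linear map."""
    weights = [[rng.gauss(0, 0.1) for _ in range(dim)] for _ in range(dim)]

    def expert(x):
        return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

    return expert


def moe_forward(x, experts, router, top_k):
    """Send one token vector to its top_k experts and blend their outputs."""
    # One router score per expert (dot product of the router row and the token).
    scores = [sum(w * xi for w, xi in zip(row, x)) for row in router]
    chosen = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)[:top_k]
    gates = softmax([scores[i] for i in chosen])
    out = [0.0] * len(x)
    for gate, idx in zip(gates, chosen):
        y = experts[idx](x)  # only the chosen experts do any work
        out = [o + gate * yi for o, yi in zip(out, y)]
    return out, chosen


rng = random.Random(0)
DIM, NUM_EXPERTS, TOP_K = 8, 128, 4  # illustrative sizes chosen for this sketch
experts = [make_expert(rng, DIM) for _ in range(NUM_EXPERTS)]
router = [[rng.gauss(0, 1) for _ in range(DIM)] for _ in range(NUM_EXPERTS)]
token = [rng.gauss(0, 1) for _ in range(DIM)]

output, used = moe_forward(token, experts, router, TOP_K)
print(f"ran {len(used)} of {NUM_EXPERTS} experts: {sorted(used)}")
```

Because only a handful of the experts run for any given token, the compute spent per token stays small even as the total number of experts grows, which is one reason the approach is associated with lower serving costs.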

Carr said the model came close to Meta’s latest Llama 3, but at a lower price. It costs about one-hundredth as much as OpenAI’s GPT-4 and a fifth as much as Anthropic’s Claude 3 Haiku.

Tiezhen Wang, an engineer at New York-based AI research center Hugging Face, said the DeepSeek team reduced what the model needed to remember while allowing it to “handle multiple tasks simultaneously without slowing down.”

In China, the pricing strategy helped sign up developers. Wang Zixu, a programmer based in northern China, said he switched from using OpenAI GPT-4 for coding assistance to DeepSeek because of the lower prices.

Despite the price advantage, some industry experts said DeepSeek could be losing money at its low price. Its computing power may also lag behind its competitors as Nvidia releases new chips that are banned from being exported to China.

Still, High-Flyer’s AI arm is trying to be the first to reach artificial general intelligence, the point where machines have greater cognitive abilities than humans.

“We believe AGI is the violent beauty of model x data x computing power,” said one job posting for DeepSeek. “Go on a ‘deep quest’ with us on the way to AGI!”

Additional reporting by Nian Liu in Beijing
