DeepSeek is the buzzy new AI model taking the world by storm. The Chinese startup has impressed the tech sector with its robust large language model, built on open-source technology.
DeepSeek has also sent shockwaves through the AI industry, showing that it's possible to develop a powerful AI for millions in hardware and training, when American companies like OpenAI, Google, and Microsoft have invested billions.
What is DeepSeek?
DeepSeek is the brainchild of investor and entrepreneur Liang Wenfeng, a Chinese national who studied electronic information and communication engineering at Zhejiang University. Liang began his career in AI by using it for quantitative trading, co-founding the Hangzhou, China-based hedge fund High-Flyer Quantitative Investment Management in 2015. In 2023, Liang launched DeepSeek, focusing on advancing artificial general intelligence.
Image: DeepSeek
DeepSeek launched its first large language model, DeepSeek-Coder, on November 29, 2023.
But it wasn't until January 20, 2025, with the release of DeepSeek-R1, that the company upended the AI industry.
With a team of just 200 people and a budget of $6 million, DeepSeek released its free, open-source model, which was on par with OpenAI's much-ballyhooed o1 model, a project that cost as much as $600 million and took an estimated 3,500 people two years to build.
🚀 DeepSeek-R1 is here!
⚡ Performance on par with OpenAI-o1
📖 Fully open-source model & technical report
🏆 MIT licensed: Distill & commercialize freely!
Unlike big tech companies with big payrolls in the West, DeepSeek optimized its hiring to focus on recently graduated students: "Three to five years of work experience is the maximum, and those with more than eight years of work experience are basically rejected," a headhunter told 36kr, a popular Chinese tech site.
And whereas dominant AI models from OpenAI and others were mainly available as subscription products, DeepSeek's code is open source: it is available for public scrutiny and can be downloaded to a local computer via the AI platform Hugging Face, or run as a phone app, for free.
DeepSeek’s underlying technology was considered a massive breakthrough in AI and its release sent shockwaves through the US tech sector, wiping out $1 trillion in value in one day.
Image: DeepSeek
What’s so special about DeepSeek?
DeepSeek's success comes from its approach to model design and training. Like a massively parallel supercomputer that divides tasks among many processors to work on them simultaneously, DeepSeek’s Mixture-of-Experts system selectively activates only about 37 billion of its 671 billion parameters for each task. This approach significantly improves efficiency, reducing computational costs while still delivering top-tier performance across applications.
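The selective-activation idea can be sketched in a few lines of Python. This is an illustrative toy router, not DeepSeek's actual implementation; the dimensions, the number of experts, and the top-2 routing are assumptions chosen for demonstration.

```python
import numpy as np

def moe_forward(x, expert_weights, gate_weights, top_k=2):
    """Toy Mixture-of-Experts layer: route input x only to the top_k
    experts with the highest gate scores; all other experts are skipped,
    so most parameters contribute no compute for this input."""
    scores = x @ gate_weights                      # one score per expert
    top = np.argsort(scores)[-top_k:]              # indices of the chosen experts
    e = np.exp(scores[top] - scores[top].max())
    probs = e / e.sum()                            # softmax over chosen experts only
    # Only the selected experts actually run.
    return sum(p * (x @ expert_weights[i]) for p, i in zip(probs, top))

rng = np.random.default_rng(0)
d, n_experts = 8, 4
x = rng.normal(size=d)
experts = rng.normal(size=(n_experts, d, d))
gates = rng.normal(size=(d, n_experts))
y = moe_forward(x, experts, gates, top_k=2)   # 2 of 4 experts activated
```

In a real model the same principle applies at far larger scale: a router picks a handful of experts per token, which is how only about 37 billion of 671 billion parameters are active for any given task.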
DeepSeek enhances its training process using Group Relative Policy Optimization, a reinforcement learning technique that improves decision-making by comparing groups of the model's own sampled responses against one another, rather than relying on a separate value model. This allows the AI to refine its reasoning more efficiently, producing higher-quality training data.
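The core of that group-relative comparison can be shown in a short sketch. This is a simplified illustration of the idea, not DeepSeek's training code; the reward values are made up for demonstration.

```python
import numpy as np

def group_relative_advantages(rewards):
    """Group-relative scoring as used in GRPO: score several sampled
    responses to the same prompt, then normalize each reward against
    the group's mean and standard deviation. No learned value network
    is needed to estimate a baseline."""
    r = np.asarray(rewards, dtype=float)
    return (r - r.mean()) / (r.std() + 1e-8)  # positive = better than peers

# Four sampled answers to one prompt, scored by some reward signal (toy values).
adv = group_relative_advantages([0.9, 0.2, 0.5, 0.4])
# Answers above the group average get positive advantages and are reinforced;
# answers below it get negative advantages and are discouraged.
```

Because the baseline comes from the group itself, the method avoids training a second, costly value model, which is one reason it is attractive for budget-constrained training.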
A Chinese artificial intelligence lab has done more than just build a cheaper AI model—it's exposed the inefficiency of the entire industry's approach.
DeepSeek's breakthrough showed how a small team, in an effort to save money, was able to rethink how AI models are built. While tech giants like OpenAI and Anthropic spend billions of dollars on compute power alone, DeepSeek purportedly achieved similar results for just over $5 million.
The company's model reportedly matches or beats OpenAI's GPT-4o.
DeepSeek has also demonstrated a commitment to open-source accessibility by releasing its models under the MIT license, which allows users to download, deploy, and customize the AI model, distinguishing it from competitors that maintain closed, proprietary systems. The open-source approach also lets developers improve upon and share their work with others, who can then build on that work in an ongoing cycle of evolution and improvement.
DeepSeek's development is helped by a stockpile of Nvidia A100 chips combined with less expensive hardware. Some estimates put the number of Nvidia chips DeepSeek has access to at around 50,000 GPUs, compared to the 500,000 OpenAI used to train ChatGPT.
Reactions to DeepSeek
Many AI technologists have lauded DeepSeek's powerful, efficient, and low-cost model, while critics have raised concerns about data privacy and security.
“We are living in a timeline where a non-US company is keeping the original mission of OpenAI alive—truly open, frontier research that empowers all. It makes no sense,” Nvidia Senior Research Manager Dr. Jim Fan wrote on X (formerly Twitter). “The most entertaining outcome is the most likely.”
This is the DeepSeek R1 Reasoning Engine running Grok-1 Open Source.
The Reasoning Engine allows for new life to be given to older models.
Even OpenAI CEO Sam Altman acknowledged that DeepSeek is impressive.
“We will obviously deliver much better models and also it's legit invigorating to have a new competitor!” Altman said on X.
Days later, though, OpenAI claimed to have found evidence that DeepSeek used its proprietary models to train its own rival model.
Critics have also raised questions about DeepSeek's terms of service, cybersecurity practices, and potential ties to the Chinese government. Others have highlighted the extensive amount of user data collected by DeepSeek, including device models, operating systems, keystroke patterns, and IP addresses—data that’s stored on DeepSeek’s China-based servers, according to the firm’s privacy policy.
As a general news and also security awareness: Deepseek is a new LLM and it's powerful, but there is a caveat, they collect keystroke patterns, this is not common and can be used to identify yourself in the future in any device or website as keystroke patterns are like individual…
“Privacy is an issue because it's China. It’s always about collecting data from users. So user beware,” Kevin Surace, CEO at AI software developer Appvance, told Decrypt. “It will force everyone to rethink how we train models and how much power is required for inference.”
What does the future hold for DeepSeek?
DeepSeek's rapid rise challenges the dominance of Western tech giants and raises significant questions about the future of AI: who builds it, who controls it, and how open and affordable it should be.
But questions remain about the long-term implications of DeepSeek and whether U.S. President Trump will respond to China's apparent overnight dominance in the AI sector with a TikTok-style ban. Did High-Flyer misrepresent its use of GPUs to make DeepSeek seem more efficient than it actually is? Was DeepSeek’s sudden public launch timed to drive down Nvidia’s stock for the benefit of well-positioned investors?
A small Chinese startup just forced America's biggest tech companies to rethink how they build artificial intelligence.
DeepSeek's release of its R1 model, which reportedly matches or exceeds the capabilities of U.S.-built AI systems at a fraction of the cost, triggered a massive sell-off in tech stocks that erased nearly $600 billion from Nvidia's market value alone.
The shockwaves hit the US tech sector in the gut, with industry leaders hurrying to analyze how DeepSeek achieved such results.
As competitors, including Meta and Perplexity AI, scramble to adapt to DeepSeek’s methodology, the full impact of this AI breakthrough remains uncertain. But one thing is clear: DeepSeek shook up the tech industry by proving yet again that sometimes, resource constraints force innovative breakthroughs and that powerful technology can be built without multi-billion-dollar price tags.