Welcome to 9to5Neural. AI evolves rapidly, and we’re here to help you stay informed. Recently, we reported that U.S. AI companies are facing stiff competition from China’s DeepSeek R1. Today, DeepSeek’s influence has extended to Wall Street, resulting in a 17% drop in NVIDIA shares. Let’s delve into the story of DeepSeek, NVIDIA’s counteraction, and the broader implications for AI advancement.
What is DeepSeek?
DeepSeek is a Chinese AI company that emerged from a hedge fund named High-Flyer. Founded in 2023 by Liang Wenfeng, the firm is based in Hangzhou, Zhejiang, China. Before DeepSeek, Liang co-founded High-Flyer, which centered its efforts on AI investments.
DeepSeek initiated the training of its models before the U.S. government curtailed China’s access to American AI chips. As a result, the company possesses a considerable quantity of NVIDIA GPUs sourced prior to these restrictions.
Even so, DeepSeek has had to operate with restricted access to additional NVIDIA hardware. This limitation may have pushed the firm to concentrate on the innovations found in its V3 model.
DeepSeek has shown that it can compete with OpenAI’s latest o3 model. ChatGPT o3 follows o1; the o2 name was possibly skipped because O2 is already an established UK telecom provider.
DeepSeek’s model is nearly as competitive, yet it requires significantly fewer resources and operates at a fraction of the cost of OpenAI’s offering.
DeepSeek’s current position can be credited to its focus on refining existing models instead of adopting the same methods employed by American firms. It’s fair to acknowledge that DeepSeek leverages the foundational work accomplished by recognized AI companies, while simultaneously adapting to U.S. restrictions on exporting AI chips to China by optimizing existing models through distillation.
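For readers unfamiliar with distillation, here is a minimal sketch of the textbook soft-label version of the idea: a small "student" model is trained to match the output distribution of a larger "teacher". This is not DeepSeek's actual recipe, which this article does not detail. The function names, the temperature value, and the toy logits are all assumptions made for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; a higher temperature flattens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence from the teacher's softened distribution to the student's.

    Hypothetical helper: the temperature**2 factor is the conventional
    scaling used in classic soft-label distillation, assumed here.
    """
    p = softmax(teacher_logits, temperature)  # teacher probabilities
    q = softmax(student_logits, temperature)  # student probabilities
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2


# Toy logits over three possible answers (made-up numbers).
teacher = [4.0, 1.0, 0.5]
close_student = [3.5, 1.2, 0.4]  # roughly agrees with the teacher
far_student = [0.5, 1.0, 4.0]    # prefers a different answer

print(distillation_loss(teacher, close_student))
print(distillation_loss(teacher, far_student))
```

A student that agrees with the teacher gets a loss near zero, and one that disagrees gets a larger loss. Training pushes the student toward the teacher's behavior without repeating the teacher's full training run, which is where the cost savings come from.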
DeepSeek training methodology
This narrative is only partway through. The future remains uncertain, yet it seems probable that OpenAI and other U.S. AI companies will prioritize model distillation to reduce operational costs and maintain competitiveness. Essentially, DeepSeek’s achievements appear replicable by American firms; it’s now a question of focusing on model efficiency in light of emerging competition.
However, model distillation isn’t the only strategy that has propelled DeepSeek in the AI race. The company has also used AI models to train other AI models, departing from the traditional human-in-the-loop method that relies on human-annotated datasets.
The advantage of the AI-training-AI approach lies in its scalability, since it requires less human involvement. The drawback, however, is the risk of magnifying errors, which complicates alignment checks: verifying that AI models align with our values and operate as intended.
Supervised fine-tuning and reinforcement learning from human feedback, by contrast, keep people in the loop, which helps limit bias in AI responses and maintain the quality of the training data.
While a drastic change in how U.S. AI firms uphold data quality isn’t anticipated, a notable shift toward AI training AI is forthcoming. This transition has always been a goal for OpenAI and similar entities; DeepSeek may have accelerated the need to reach this milestone.
$6 million tanks $600 billion
If you have been following DeepSeek, you might have encountered the $6 million figure from the research paper detailing its latest model. The assertion is that V3 was developed for under $6 million, using less capable NVIDIA H800 hardware. That claim could be accurate and still overlook the earlier cost of training prior models, not to mention the NVIDIA inventory secured before the U.S. export restrictions on AI chips.
Another figure worth noting is $600 billion, the market cap NVIDIA lost today alone. Investor anxiety over DeepSeek’s cost-effective models has lowered NVIDIA’s perceived growth potential.
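To put the two headline numbers side by side, the short calculation below backs out the market value implied by a 17% share drop that erased roughly $600 billion. Both inputs are the rounded figures quoted in this article, so the result is only a rough estimate; the function name is made up for illustration.

```python
def implied_value_before_drop(loss, drop_fraction):
    """Value before a decline, given the absolute loss and the fractional drop."""
    return loss / drop_fraction


loss_billion = 600.0   # market cap lost, in billions of dollars (rounded)
drop_fraction = 0.17   # 17% one-day share decline (rounded)

before = implied_value_before_drop(loss_billion, drop_fraction)
after = before - loss_billion

print(f"Implied market cap before the drop: ${before:,.0f} billion")
print(f"Implied market cap after the drop:  ${after:,.0f} billion")
```

On these rounded inputs, NVIDIA was worth roughly $3.5 trillion before the sell-off and a little over $2.9 trillion after it.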
This perspective seems overly reactionary and shortsighted. My view is that DeepSeek demonstrates a new level of efficiency in developing existing AI models. This could mean a faster progression towards major AI advancements.
In essence, deploying more NVIDIA GPUs may still be crucial for advancing AI technology; progress may simply come more quickly now. Remember, the AI race is about future achievements, not current standings.
AI isn’t a solved problem
This leads us to OpenAI’s ambitious Stargate Project. Essentially, Stargate is a Texas facility designed to maximize computational power. If future AI models can achieve more with less processing power, that means they could accomplish even greater feats with the current compute capacity Stargate aims to optimize.
There’s a substantial gap between the aspirations of these firms and our present capabilities. DeepSeek’s influence might compel other AI companies to reconsider their current objectives. We’ll need to observe DeepSeek’s forthcoming developments to gauge whether it is truly a pioneering firm.
In closing, a few additional notes.
NVIDIA has found a silver lining in DeepSeek’s advancements, stating:
DeepSeek represents a significant AI progression and serves as an excellent example of Test Time Scaling. DeepSeek’s research demonstrates how novel models can be generated through this technique, leveraging widely-available models and compute that comply with export regulations. Inference necessitates a substantial number of NVIDIA GPUs and high-performance networking. We now recognize three scaling laws: pre-training, post-training, and the new test-time scaling.
In simpler terms, we’re improving our systems while they’re operational, but we still require the necessary resources to succeed.
NVIDIA remains up 93% year-over-year and 1,782% over the past five years.
OpenAI is also expected to expand offerings for ChatGPT o3-mini, partly as a reaction to DeepSeek’s emerging competition.
President Trump addressed the implications of DeepSeek on Monday, according to Reuters:
The emergence of DeepSeek, an AI developed by a Chinese company, should serve as a wake-up call for our industries, underlining the necessity of an intense focus on competitive advantage.
I’ve researched Chinese technology companies, particularly one that presents a more rapid and less costly method for AI development. This can be advantageous as it reduces expenses without sacrificing results, which I regard positively.
We are known for innovation and creativity. Thus, this could be seen as an encouraging sign, suggesting you can achieve desired outcomes at lower costs moving forward.
The AI race is heating up, and the AI sector is the next frontier akin to NASA’s journey.
DeepSeek has temporarily suspended new account registrations due to a significant cyber attack affecting its services. Visitors to chat.deepseek.com are currently met with the following notification:
Due to widespread malicious attacks on DeepSeek’s services, account registration may be intermittent. Please be patient and try again later. Registered users can log in as usual. We appreciate your understanding and support.
Despite this, we successfully created a new account after several hours of attempting on Monday.
Additionally, a viral social media post has claimed that downloading DeepSeek for iOS gives the Chinese AI company extensive access to personal data, including emails and messages. Thankfully, this assertion is unfounded: iOS sandboxes apps, so one app cannot read another app’s emails or messages. Users can also opt to register with Sign in with Apple, which generates a disposable email address for extra privacy. However, it’s important to note that DeepSeek does have access to whatever users type into the chatbot.
Furthermore, DeepSeek has encountered issues responding to queries about the events of 1989 at Tiananmen Square, instead advising users to engage in discussions regarding math, coding, and logic problems.
Stay tuned for further updates on the latest AI advancements in the upcoming edition of 9to5Neural—exclusively on DMN! You can read the previous issue here.