However, the true cost of developing DeepSeek's new models remains unknown, since a single figure quoted in one research paper may not capture the full picture of its expenses. "I don't believe it's $6 million, but even if it's $60 million, it's a game changer," says Umesh Padval, managing director of Thomvest Ventures, a firm that has invested in Cohere and other AI companies. "It will put pressure on the profitability of companies that are focused on consumer AI."
Shortly after DeepSeek revealed the details of its latest model, Ghodsi of Databricks says customers began asking whether they could use it, as well as DeepSeek's underlying techniques, to cut costs at their own organizations. He adds that one approach used by DeepSeek's engineers, known as distillation, which involves using the output of one large language model to train another model, is relatively cheap and straightforward.
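The distillation idea mentioned above can be sketched in a few lines. This is a toy illustration, not DeepSeek's actual pipeline: the "teacher" is a placeholder function standing in for a large model, and the "student" memorizes rather than learns by gradient descent; all names here are hypothetical.

```python
# Hypothetical sketch of sequence-level distillation. The teacher is a
# stand-in for a large language model; the student is deliberately trivial.

def teacher_generate(prompt: str) -> str:
    """Placeholder for an expensive teacher model's completion."""
    canned = {
        "2+2": "4",
        "capital of France": "Paris",
    }
    return canned.get(prompt, "unknown")

def build_distillation_set(prompts):
    # Step 1: collect (prompt, teacher output) pairs. In practice this means
    # sampling completions from the large model, which dominates the cost.
    return [(p, teacher_generate(p)) for p in prompts]

class TinyStudent:
    # Step 2: train a smaller model on the teacher's outputs. Here "training"
    # is a lookup table; a real student would be fine-tuned on the pairs.
    def __init__(self):
        self.memory = {}

    def train(self, dataset):
        for prompt, completion in dataset:
            self.memory[prompt] = completion

    def generate(self, prompt: str) -> str:
        return self.memory.get(prompt, "unknown")

dataset = build_distillation_set(["2+2", "capital of France"])
student = TinyStudent()
student.train(dataset)
print(student.generate("2+2"))  # the student now mimics the teacher
```

The cheapness comes from step 2: fine-tuning a small model on teacher outputs is far less expensive than training a large model from scratch.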
Padval says that the existence of models like DeepSeek's will ultimately benefit companies looking to spend less on AI, but he says that many firms may have reservations about relying on a Chinese model for sensitive tasks. So far, at least one prominent AI firm, Perplexity, has publicly announced that it is using DeepSeek's R1 model, which it says is hosted "completely independent of China."
Amjad Masad, CEO of Replit, a startup that offers AI coding tools, told WIRED that he thinks DeepSeek's latest models are impressive. While he still finds Anthropic's Sonnet model better at many software engineering tasks, he has found that R1 is especially good at turning text instructions into code that can be executed on a computer. "We are exploring it specifically for agents," he adds.
DeepSeek's two latest offerings, DeepSeek R1 and DeepSeek R1-Zero, are capable of the same kind of simulated reasoning as the most advanced systems from OpenAI and Google. They all work by breaking problems into constituent parts in order to tackle them more effectively, a process that requires a considerable amount of additional training to ensure that the AI reliably reaches the correct answer.
A paper posted online last week by DeepSeek's researchers gave an account of the approach the company used to create its R1 models, which it claims perform on some benchmarks about as well as OpenAI's groundbreaking reasoning model known as o1. The tactics DeepSeek used include a more automated method for learning to problem-solve correctly as well as a strategy for transferring skills from larger models to smaller ones.
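The "more automated method" described above can be illustrated with a toy example of automatically checked rewards, where a program, rather than a human grader, scores candidate answers. This is a hedged sketch of the general idea, not DeepSeek's published training code; the functions and data are invented for illustration.

```python
# Toy illustration of automated correctness rewards: a program scores
# candidate answers against a known ground truth, so no human labeling
# is needed for each example. Names and data here are hypothetical.

def reward(answer: str, ground_truth: str) -> float:
    # An automated check replaces human grading. Exact match is used here;
    # real pipelines may also verify answer formats or run generated code.
    return 1.0 if answer.strip() == ground_truth else 0.0

def best_of_n(candidates, ground_truth):
    # Score several sampled solutions and keep the best one. A real trainer
    # would update model weights toward the high-reward samples.
    scored = [(reward(c, ground_truth), c) for c in candidates]
    return max(scored, key=lambda pair: pair[0])

top = best_of_n(["5", " 4 ", "four"], "4")
print(top)  # the highest-reward candidate and its score
```

Because the reward is computed mechanically, this loop can run over huge numbers of problems without human feedback, which is what makes the approach comparatively automated.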
One of the hottest topics of speculation about DeepSeek is the hardware it may have used. The question is especially noteworthy because the US government has introduced a series of export controls and other trade restrictions over the past few years aimed at limiting China's ability to obtain and manufacture the cutting-edge chips needed to build advanced AI.
In a research paper from August 2024, DeepSeek indicated that it had access to a cluster of 10,000 Nvidia A100 chips, which were placed under US restrictions announced in October 2022. In a separate paper from June of that year, DeepSeek said that an earlier model it created, called DeepSeek-V2, was developed using clusters of Nvidia H800 computer chips, a less capable component developed by Nvidia to comply with US export controls.
A source at one AI company that trains large AI models, who asked to remain anonymous to protect their professional relationships, estimates that DeepSeek likely used around 50,000 Nvidia chips to build its technology.
Nvidia declined to comment directly on which of its chips DeepSeek may have relied on. "DeepSeek is an excellent AI advancement," a spokesperson for Nvidia said in a statement, adding that the startup's reasoning approach "requires significant numbers of Nvidia GPUs and high-performance networking."
However DeepSeek's models were built, they appear to show that a less closed approach to developing AI is gaining momentum. In December, Clem Delangue, the CEO of Hugging Face, a platform that hosts artificial intelligence models, predicted that a Chinese company would take the lead in AI because of the speed of innovation happening in open source models, which China has largely embraced. "This went faster than I thought," he says.