With the advent of generative artificial intelligence (GenAI), the amount of data being generated and stored will continue to increase, and the demand for cloud storage will increase accordingly. Cloud service providers will also start implementing GenAI in their offerings, making it easier for users to ask questions in natural language. Knowledge of coding will become a "good to have" instead of a "must have," democratizing the use of applications. None of these developments in the AI space would be possible without chips, which are essential to the development of large language models (LLMs).
Just as an engine determines how fast a car can travel, graphics processing units (GPUs) are the driving force behind GenAI tools and applications. The problem today is that everyone wants GPUs—especially those designed to train LLMs. Just as there was a post-pandemic shortage of cars due to high demand, there is now a shortage of the GPUs that train AI models. The result is that enterprises will need to wait longer for hardware, which may impact their ability to run larger workloads.
Microsoft listed the shortage of GPUs as a potential risk factor in last year’s annual report, stating, “We continue to identify and evaluate opportunities to expand our datacentre locations and increase our server capacity to meet the evolving needs of our customers, particularly given the growing demand for AI services. Our datacentres depend on the availability of permitted and buildable land, predictable energy, networking supplies, and servers, including graphics processing units (‘GPUs’) and other components.” OpenAI CEO Sam Altman has also testified to Congress, highlighting the acute shortage of GPUs.
The demand for GPUs to train AI models has surged since the advent of GenAI, but very few companies make the chips. The most prominent is Nvidia, which has an estimated 80%–95% share of the market for GPU chips that train AI models. The GenAI wave enabled Nvidia to almost triple its stock price and cross $1 trillion in market cap in the past year (see Exhibit 1). Nvidia has promised to increase its production capacity; however, the lead times involved in increasing capacity mean that demand will continue to outstrip supply in the near future.
Exhibit 1. Data source: companiesmarketcap.com, January 2024; HFS Research, 2024
In the last few months, AWS, Microsoft, and Google have each announced their own chips. Investing in chip development reduces their reliance on externally sourced hardware. Selling these chips to other companies could also open up an alternate revenue stream, though it's too early to say whether that will happen. The impact of these moves is more likely to be felt in the long term and won't ease the current chip shortage.
AWS launches chips that reduce the power needed to train AI models
Microsoft to use its chips to power its subscription software offerings
Google aims to train LLMs faster with a new tensor processing unit
With the rise of GenAI and the competition between LLMs, the demand for GPUs to train models will continue to rise. Given the near-term shortage of GPUs, the focus will be on optimizing resources and getting more done with existing capacity. Given the scale of demand for chips, the hyperscalers' entry into AI chips won't eat into the business of established chip makers in the short term; the supply chain expertise that traditional chip developers have is also difficult to replicate. It will, however, spread out the dependency on chips as more companies enter the space.
Using custom in-house chips will help hyperscalers control costs and give them greater flexibility in advancing their LLMs. It may also open a future revenue stream. For smaller companies and providers, it will be important to partner with the major players in the space rather than enter the chip business themselves, given the high cost of entry and the expertise required.
Even before the GenAI boom, Nvidia was the market leader in AI chips; however, demand has spurred a shortage that no single company can resolve alone. The hyperscalers' move to invest in chips is welcome and will help reduce chip dependency, spur innovation, and advance their LLMs.