With the advent of generative artificial intelligence (GenAI), the amount of data being generated and stored will continue to increase, and the demand for cloud storage will increase accordingly. Cloud service providers will also start implementing GenAI in their offerings, making it easier for users to ask questions in natural language. Knowledge of coding will become a "good to have" instead of a "must have," democratizing the use of applications. None of these developments in the AI space would be possible without chips, which are essential to the development of large language models (LLMs).
Just as an engine determines how fast a car can travel, graphics processing units (GPUs) are the driving force behind GenAI tools and applications. The problem today is that everyone wants GPUs—especially those designed to train LLMs. Just as there was a post-pandemic shortage of cars due to high demand, there is now a shortage of the GPUs that train AI models. The result is that enterprises will need to wait longer for hardware, which may impact their ability to run larger workloads.
Microsoft listed the shortage of GPUs as a potential risk factor in last year’s annual report, stating, “We continue to identify and evaluate opportunities to expand our datacentre locations and increase our server capacity to meet the evolving needs of our customers, particularly given the growing demand for AI services. Our datacentres depend on the availability of permitted and buildable land, predictable energy, networking supplies, and servers, including graphics processing units (‘GPUs’) and other components.” OpenAI CEO Sam Altman has also testified to Congress, highlighting the acute shortage of GPUs.
The demand for GPUs to train AI models has surged since the advent of GenAI, but very few companies make the chips. The most prominent is Nvidia, which has an estimated 80%–95% share of the market for GPU chips that train AI models. The GenAI wave enabled Nvidia to almost triple its stock price and cross $1 trillion in market cap in the past year (see Exhibit 1). Nvidia has promised to increase its production capacity; however, the lead times involved in increasing capacity mean that demand will continue to outstrip supply in the near future.
Exhibit 1. Data source: companiesmarketcap.com, January 2024; HFS Research, 2024
In the last few months, AWS, Microsoft, and Google have each announced their own chips. Investing in chip development reduces their reliance on externally sourced hardware. Selling these chips to other companies could also open up an alternate revenue stream, though it's too early to say whether that will happen. The impact of these moves is more likely to be felt in the long term and won't ease the current chip shortage.
AWS launches chips that reduce the power needed to train AI models
Microsoft to use its chips to power its subscription software offerings
Google aims to train LLMs faster with a new tensor processing unit
With the rise of GenAI and the competition between LLMs, the demand for GPUs to train models will continue to rise. Given the near-term shortage of GPUs, the focus will be on optimizing resources and getting more done with existing capacity. Given the scale of demand for chips, the hyperscalers' entry into AI chips won't eat into the business of established chip makers in the short term; the supply chain expertise that traditional chip developers have is also difficult to replicate. It will, however, spread out the dependency on chips as more companies enter the space.
Using custom in-house chips will help hyperscalers control costs and give them greater flexibility in advancing their LLMs. It may also open a future revenue stream. For smaller companies and providers, it will be important to partner with the major players in the space rather than enter the chip business themselves, given the high cost of entry and the expertise required.
Even before the GenAI boom, Nvidia was the market leader in AI chips; however, demand has spurred a shortage that no single company can resolve alone. The hyperscalers' move to invest in chips is welcome and will help reduce chip dependency, spur innovation, and advance their LLMs.