Foundations & Frontiers is launching a new series: Insights. These shorter, issue-based posts will explore dedicated topics across energy, manufacturing, biology, chemistry, transportation, the digital world, and more. Stay tuned as we continue to explore bottlenecks and breakthroughs across the technological frontier! We hope you enjoy this first installment on the energy costs of artificial intelligence.
There is a lot of speculation about how much energy the growing demand for AI will require. Some estimates suggest that by 2026, global electricity demand from data centers will double, consuming as much electricity as all of Japan. These estimates match projections from analysts at Goldman Sachs, who believe that by 2030, data center energy demand will grow by 160% and comprise nearly half of all new electricity demand in that period.
Training foundation AI models can be quite energy-intensive. GPT-3, OpenAI’s 175 billion parameter model, reportedly used 1,287 MWh to train, while Gopher, DeepMind’s 280 billion parameter model, used 1,066 MWh. That is roughly 100 times the energy used by the average US household in a year. However, training a model is a one-time fixed energy cost, which can be amortized over the lifetime of the model.
Where energy costs can start to add up even further is in the use of the model over its lifetime. Running a query through an AI model is called inference, where the query is passed through the model’s parameters to deliver an output. Depending on how long the model is in use, which can be years, and how many queries it is fielding per day, the vast bulk of an AI model’s energy usage will come from inference.
Estimating exactly how much energy is devoted to inference in some of the largest and most popular models, like ChatGPT, is a highly speculative exercise as it depends on so many variables including the size of the average query, daily active user counts, the efficiency of the model, nature of the hardware stack, and more. In most cases, this is private information. However, some back-of-the-envelope estimates can help us understand the scale of energy that will be needed.
One approach, which Alex de Vries pursued in a paper called “The growing energy footprint of artificial intelligence”, is to estimate the energy required for Google to run all its current search traffic through an AI model the size of ChatGPT. According to Google, a single search query uses about 0.0003 kilowatt-hours. On average, Google fields nearly 9 billion of these queries per day. This means that Google search consumes something on the order of 2,700,000 kilowatt-hours per day. This amounts to a total energy consumption of 985,500,000 kilowatt-hours, or just under one terawatt-hour, per year.
All things considered, running Google search is very energy efficient. But what if instead, each of those queries were fielded through something like ChatGPT?
We don’t know for certain exactly how much energy a single ChatGPT query takes up, but we can use assessments by analysts and some napkin math to get a good guess.
SemiAnalysis has suggested that OpenAI requires 3,617 HGX A100 servers to run ChatGPT. According to NVIDIA, the maximum power consumption of each server is 6.5 kilowatts, so assuming that each server is running all day at full capacity, we’re looking at a total energy consumption of 564,252 kilowatt-hours per day.
SemiAnalysis further assumed that ChatGPT has about 13 million daily active users, each asking an average of 15 queries per day. That works out to roughly 0.00289 kilowatt-hours per query.
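The implied per-query figure follows directly from SemiAnalysis’s assumptions; a quick sketch (all inputs are their estimates, and the full-power assumption is an upper bound):

```python
# Implied energy per ChatGPT query under SemiAnalysis's assumptions:
# 3,617 HGX A100 servers at a maximum draw of 6.5 kW each, running
# around the clock, serving 13M daily users at 15 queries each.
SERVERS = 3_617
KW_PER_SERVER = 6.5
DAILY_USERS = 13e6
QUERIES_PER_USER = 15

daily_kwh = SERVERS * KW_PER_SERVER * 24        # 564,252 kWh/day
daily_queries = DAILY_USERS * QUERIES_PER_USER  # 195,000,000 queries/day
kwh_per_query = daily_kwh / daily_queries       # ~0.00289 kWh/query

print(f"{daily_kwh:,.0f} kWh/day -> {kwh_per_query:.5f} kWh/query")
```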
Now, what if each of Google’s 9 billion daily search requests were run through ChatGPT at an average energy consumption of 0.00289 kilowatt-hours per request? That would require 26,010,000 kilowatt-hours of energy per day, and demand a total of 166,730 HGX A100 servers, or 163,113 more than OpenAI is already using.
This suggests that a ChatGPT query uses roughly 10 times more energy than a Google search query, and running all Google searches through a model like ChatGPT would take up 10 times more energy than Google search uses today.
This aligns with what John Hennessy, the chairman of Alphabet, said in 2023 when he told Reuters that “having an exchange with AI known as a large language model likely costs 10 times more than a standard keyword search.”
If we scaled ChatGPT to match Google’s demand, or if Google converted all of its search requests into large language model queries, total annual energy consumption would come to roughly 9.5 terawatt-hours. Given that all data centers in the US consume roughly 220 terawatt-hours per year, this alone would amount to over 4% of current data center energy use.
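Annualizing the daily figure and comparing it against US data center consumption (exact multiplication gives about 9.5 TWh, which the article rounds):

```python
# Annualize the "all Google searches through ChatGPT" scenario and
# compare against total US data center consumption (~220 TWh/year).
DAILY_KWH = 26_010_000       # hypothetical daily demand from the estimate above
US_DATACENTER_TWH = 220

annual_twh = DAILY_KWH * 365 / 1e9       # ~9.5 TWh/year
share = annual_twh / US_DATACENTER_TWH   # ~4.3% of US data center energy

print(f"{annual_twh:.1f} TWh/year, {share:.1%} of US data center energy")
```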
And that’s just from running one AI-powered search. It’s difficult to estimate the total penetration of AI-enabled services over time, but if they require an average of 10 times more energy than traditional queries, the extra energy required in the future could be significant.
What’s more, the bigger LLMs become, the more energy they consume. In recent years, large language models have been getting ten times bigger roughly every two years, and querying larger models requires more energy on average. A paper published in 2023, titled “From Words to Watts: Benchmarking the Energy Costs of Large Language Model Inference”, analyzed the energy required to query increasingly large LLaMA models. It examined the energy needed to query the 7 billion, 13 billion, and 65 billion parameter LLaMA models, and found a relationship suggesting that for every 10x increase in model size, the energy required to run it increases 2-fold.
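If that scaling relationship held exactly, per-query energy would grow as a power law in model size with exponent log10(2) ≈ 0.301. A minimal sketch of the implied curve (an illustration of the claimed trend, not a measured benchmark):

```python
import math

# Illustrative power law: if a 10x increase in model size doubles
# per-query inference energy, energy scales as size ** log10(2).
EXPONENT = math.log10(2)  # ~0.301

def relative_energy(size_ratio: float) -> float:
    """Energy multiplier for a model `size_ratio` times larger."""
    return size_ratio ** EXPONENT

print(f"{relative_energy(10):.2f}")   # 10x larger  -> ~2x the energy
print(f"{relative_energy(100):.2f}")  # 100x larger -> ~4x the energy
```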
Performing a rigorous analysis of AI energy demand is effectively impossible, as so many factors affect performance, from model size and efficiency to the hardware stack used, and much more. Because the most prominent AI models are closed-source, these details remain private, so no claim can be made with certainty: each added assumption carries a large margin of error.
Fundamentally, growth in AI energy consumption is bottlenecked by the availability of new hardware and the annual shipments of GPUs, which tend to double in performance every two years. Furthermore, AI companies are working to refine the efficiency of their models while maintaining performance. This seems to have been the goal of OpenAI’s GPT-4 Turbo, a compressed version of the model that can deliver the same level of performance at a lower energy cost.
Rising efficiency, however, doesn’t always have the effect of tempering growth, as the added gains can amplify demand rather than just cut costs. This is a phenomenon known as the Jevons paradox, which occurs when increased efficiency actually increases overall resource utilization rather than decreasing it.
A growing need for energy can also have a number of positive externalities, however, as it opens up opportunities for companies like Google, Amazon, Microsoft, and others to pursue creative strategies to supply their excess energy needs. Google’s support of novel geothermal companies like Fervo, to supply power to its data centers in Nevada, is just one example of a trend that is likely to reshape not only our digital activities but our entire energy landscape.
Disclosure: Nothing presented within this article is intended to constitute legal, business, investment or tax advice, and under no circumstances should any information provided herein be used or considered as an offer to sell or a solicitation of an offer to buy an interest in any investment fund managed by Contrary LLC (“Contrary”) nor does such information constitute an offer to provide investment advisory services. Information provided reflects Contrary’s views as of a time, whereby such views are subject to change at any point and Contrary shall not be obligated to provide notice of any change. Companies mentioned in this article may be a representative sample of portfolio companies in which Contrary has invested in which the author believes such companies fit the objective criteria stated in commentary, which do not reflect all investments made by Contrary. No assumptions should be made that investments listed above were or will be profitable. Due to various risks and uncertainties, actual events, results or the actual experience may differ materially from those reflected or contemplated in these statements. Nothing contained in this article may be relied upon as a guarantee or assurance as to the future success of any particular company. Past performance is not indicative of future results. A list of investments made by Contrary (excluding investments for which the issuer has not provided permission for Contrary to disclose publicly, Fund of Fund investments and investments in which total invested capital is no more than $50,000) is available at www.contrary.com/investments.
Certain information contained in here has been obtained from third-party sources, including from portfolio companies of funds managed by Contrary. While taken from sources believed to be reliable, Contrary has not independently verified such information and makes no representations about the enduring accuracy of the information or its appropriateness for a given situation. Charts and graphs provided within are for informational purposes solely and should not be relied upon when making any investment decision. Please see www.contrary.com/legal for additional important information.