Where should carbon-aware computing go from here?

October 15, 2024

Carbon aware computing refers to shifting computing workloads around locations and times to target cleaner energy. Typically, the metric used to identify "clean" places and times is the carbon intensity of the grid (the amount of CO2e emitted per unit energy). At face value, this seems like a no-brainer - if you run your software application using lower carbon intensity energy, then you're responsible for less carbon being released into the atmosphere.
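
To make the premise concrete, here's a minimal sketch of the canonical carbon-aware move: pick the cleanest window from a grid-intensity forecast and defer your job to it. The forecast values and the job's energy use below are invented for illustration; real numbers would come from a grid-intensity API or a day-ahead forecast.

```python
# Minimal sketch of carbon-aware time-shifting.
# Hourly intensity forecast (gCO2e per kWh) is invented for illustration.
forecast = {
    "00:00": 210, "03:00": 180, "06:00": 230,
    "09:00": 320, "12:00": 150, "15:00": 170,
    "18:00": 390, "21:00": 280,
}

def pick_cleanest_window(forecast: dict[str, float]) -> str:
    """Return the start time with the lowest forecast carbon intensity."""
    return min(forecast, key=forecast.get)

job_energy_kwh = 5.0  # hypothetical energy use of the deferrable job
start = pick_cleanest_window(forecast)
estimated_g = job_energy_kwh * forecast[start]
worst_g = job_energy_kwh * max(forecast.values())
print(f"Run at {start}: ~{estimated_g:.0f} gCO2e vs ~{worst_g:.0f} gCO2e at the dirtiest hour")
```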

However, the more I've thought and read about carbon-aware computing, the more sceptical I've become that it will lead to meaningful emissions reductions unless a much more nuanced version of carbon awareness becomes mainstream. I've identified four high-level concerns that I'll dig into here:

  • You might inadvertently increase overall emissions
  • In a supply-limited market for compute, green computing is only available to those who can afford it
  • It's impractical to move compute around in space and time, especially for large energy consumers
  • It's hard to quantify the impact of your carbon-aware behaviours

I'll unpack my thinking about each of these concerns in the first half of this post. In the second half, I'll explore some alternative (or maybe complementary) avenues that could contribute to a more impactful version of carbon awareness.

Issues with grid intensity-based carbon awareness

You might inadvertently increase overall emissions

When unexpected demand hits the grid, it's usually the fast fossil fuel sources - especially gas - that spin up to meet it. This is because they are more dispatchable: it is easier to ramp fossil fuel generation to meet unexpected demand than it is to ramp renewables. If your workload is large enough to matter, you'll also affect the grid you leave behind. Sudden drops in demand don't generally lead to less energy being produced; they just create an oversupply. Sometimes it's possible to store some of the surplus, but more likely the surplus energy is curtailed. This typically involves dropping wholesale prices (potentially even below zero) to incentivize consumers to use more, creating artificial demand to rebalance supply and demand. What this boils down to is that a well-intentioned migration of data or compute across grids can raise the real demand on the target grid, leading to more fossil fuel combustion, while forcing artificial demand onto the grid that was anticipating your consumption.

In either case, the outcome is the opposite of what you intended - you've added to the overall carbon emissions of your software.

knocking a grid out of balance typically leads to more carbon emissions
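
To see how the bookkeeping can diverge from reality, here's a toy calculation with invented average and marginal intensities: the migrated job looks cleaner on paper, but once the marginal generators on both grids are accounted for, system-level emissions rise.

```python
# Toy illustration of the load-balancing concern. All numbers are invented.
job_energy_mwh = 10.0

# Average vs marginal intensity (tCO2e/MWh) for the grid you leave and the one you join.
home_avg, home_marginal = 0.40, 0.0       # home grid backfills your slot with curtailed wind
target_avg, target_marginal = 0.15, 0.45  # target grid meets new demand with gas peakers

# What your balance sheet says: you moved to the "cleaner" grid.
attributed = job_energy_mwh * (target_avg - home_avg)

# What the atmosphere sees: marginal generation on the target grid went up,
# while the home grid's surplus was simply curtailed or sold off cheaply.
system_level = job_energy_mwh * (target_marginal - home_marginal)

print(f"Attributed change:   {attributed:+.1f} tCO2e (looks like a saving)")
print(f"System-level change: {system_level:+.1f} tCO2e (emissions actually rose)")
```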

You might create some perverse market incentives

Let's forget about the load balancing issues for a moment and assume that if you can target the lowest intensity moments to do your work, the carbon emissions you are responsible for are truly reduced. It doesn't necessarily follow that the overall emissions per unit time are reduced - only that you get to account for less on your personal balance sheet. In a supply-limited market, there will always be a buyer for unused compute, regardless of the grid intensity. This is especially true for GPUs, because of the high cost of buying a GPU and then leaving it idle.

So, maybe you were able to snag a low-intensity moment to do your work, but someone else has stepped into the empty timeslot you left behind. At the end of some time period, the overall carbon emitted has not changed. So it feels a bit tenuous to claim that one time slot is ethically superior to another.

In the limit, you could imagine a market emerging where low intensity time slots are priced at a premium (maybe even speculated on and traded). Eventually wealthier users would be able to afford to put less carbon on their own balance sheets and less wealthy users would be left with the "dirtier" slots, even though there's no change to the overall carbon emitted per unit time.

There are two conditions that could circumvent these issues:

  1. the market for compute is demand limited, meaning unused slots stay empty.

  2. the reduction in demand is met by a real-time reduction in fossil fuel combustion, and not renewable energy curtailment.

Neither of these conditions seems realistic today.

It's impractical to move compute around in space and time

Shifting compute around is probably harder than it seems, especially for organisations that do enough work to move the needle on carbon emissions. As an individual user, even a power user, you might be able to time-shift your way to lower grid intensities, but you are also probably too small to matter. If you are large enough to be a problematic energy consumer, you likely can't time-shift away much of your carbon, because you are running heavy tasks around the clock and your compute timing is driven primarily by temporal patterns in customer demand.

If not time, then what about space? Organisations typically can't up and move their workloads around different grids, either because they have substantial physical infrastructure in a particular location and are bound to the local grid, or because there are legal and regulatory restrictions on where their data can be processed. It is not hard to imagine geopolitical sensitivities and security concerns restricting the specific physical hardware that supports certain workloads. There are also pragmatic and security reasons not to concentrate data in a region just because the carbon intensity is low.

On the other hand, since first drafting this post, I've had my thinking on this point challenged by this paper by Hall et al. (2024), which demonstrates a significant reduction in carbon cost for flexible jobs using a scheme that combines day-ahead and real-time scheduling: a base plan is built from day-ahead forecasts, then refined in real time, to optimize the place and time each job runs for minimal carbon emissions. This suggests that large entities with data centres in several locations could intelligently shift demand between them - for example, moving jobs between data centres within the US, where carbon intensities vary regionally and trusted infrastructure already exists in several places.
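
A loose sketch of that two-stage idea (not the authors' actual algorithm, and with invented numbers): build a day-ahead plan from forecast intensities, then only deviate from it when real-time readings are clearly better.

```python
# Loose sketch of two-stage (day-ahead + real-time) carbon-aware scheduling,
# in the spirit of Hall et al. (2024) but not their actual algorithm.
# Intensities are in gCO2e/kWh and are invented for illustration.

day_ahead_forecast = {  # (site, hour) -> forecast intensity
    ("us-west", 2): 120, ("us-west", 14): 90,
    ("us-east", 2): 300, ("us-east", 14): 260,
}

def plan_day_ahead(forecast):
    """Stage 1: pick the (site, hour) slot with the lowest forecast intensity."""
    return min(forecast, key=forecast.get)

def refine_in_real_time(planned_slot, live_intensity, tolerance=20):
    """Stage 2: keep the plan unless live data says another slot is clearly better."""
    best_live = min(live_intensity, key=live_intensity.get)
    if live_intensity[best_live] + tolerance < live_intensity.get(planned_slot, float("inf")):
        return best_live
    return planned_slot

planned = plan_day_ahead(day_ahead_forecast)        # e.g. ("us-west", 14)
live = {("us-west", 14): 140, ("us-east", 14): 95}  # real-time readings differ from forecast
final = refine_in_real_time(planned, live)
print(f"Planned: {planned}, final: {final}")
```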

It's hard to measure your impact

Forget all those other issues and assume that if you can wipe some carbon off your own balance sheet, you've done some good for the world. You use an API to target the lowest possible grid-intensity location and time to run your workload and achieve some low carbon emission value for your work. Now you want to tell the world how much carbon you saved. The problem is that there's no well-defined counterfactual. If you ran your workflow at midnight in France and emitted 10 kg CO2e, do you difference that against midday in France, tea-time in the UK, or peak time in Australia or the UAE? One of the assumptions of carbon-aware computing is that your work is geographically mobile, so what's the natural choice of baseline? Is it your home grid? If so, some users will "save" more carbon just by having a dirtier baseline. And is the hardware powering the data centres across all these locations equivalent? What about data centre PUE and embodied carbon?
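
A quick illustration of how arbitrary this gets: the same job, the same actual emissions, but the claimed "saving" swings wildly depending on which baseline you difference against. All the intensity figures here are placeholders, not real grid data.

```python
# The same job and the same 10 kg CO2e, but very different "savings" claims
# depending on the counterfactual baseline. Intensities are illustrative only.
actual_emissions_kg = 10.0
job_energy_kwh = 200.0

baselines_g_per_kwh = {
    "midday France": 60,
    "tea-time UK": 220,
    "peak Australia": 600,
    "your home grid": 400,
}

for name, intensity in baselines_g_per_kwh.items():
    counterfactual_kg = job_energy_kwh * intensity / 1000
    saving = counterfactual_kg - actual_emissions_kg
    print(f"vs {name:15s}: claimed saving {saving:+.1f} kg CO2e")
```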

Other ways to think about carbon-awareness?

Carbon awareness based on grid intensity is hard to quantify and will often have side effects elsewhere on the grid. But there are other ways to think about carbon awareness that might be more clearly environmentally positive.

Targeting stranded/waste energy

If you are running a sufficiently mobile and flexible workload, you might be able to target renewable energy that would otherwise be curtailed. In these cases you would not be competing with other users for clean energy, nor would you be triggering new fossil fuel production, since you are simply tapping an energy surplus: you step in to provide demand that meets (some of) an oversupply. This benefits the provider and the consumer at the same time, because the price for surplus energy is low (sometimes even negative) - the provider gets to sell energy that would otherwise be wasted, and you get a low price. Bitcoin miners are a good example because they can flexibly and rapidly adjust their energy demand, and they are unusually geographically mobile, able to move to wherever energy is cheapest. The positive environmental externalities of targeting this stranded energy rest on the caveat that the miners also power down during periods of energy scarcity, and that all miners are powered by this stranded energy, not just a small subset. Regardless, the principle of targeting energy surpluses generalizes to any application that can quickly and opportunistically scale its operations during periods of low demand.
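
As a rough sketch of what that kind of flexibility looks like in practice, here's a hypothetical workload that scales its capacity with a wholesale price signal, soaking up oversupply and powering down during scarcity. The thresholds and the price series are made up; a real operator would follow wholesale price feeds or a curtailment signal from the grid operator.

```python
# Sketch of a flexible workload that follows a price/curtailment signal.
# Thresholds and prices are invented for illustration.
def target_capacity(price_per_mwh: float) -> float:
    """Fraction of the fleet to run, given the current wholesale price."""
    if price_per_mwh < 0:    # oversupply: grid is paying consumers to take energy
        return 1.0
    if price_per_mwh < 20:   # cheap, likely surplus renewables
        return 0.6
    if price_per_mwh < 80:   # normal conditions
        return 0.2
    return 0.0               # scarcity: power down entirely

hourly_prices = [-5, 12, 35, 95, 150, 40, 8, -2]
for hour, price in enumerate(hourly_prices):
    print(f"hour {hour}: price {price:>4} $/MWh -> run at {target_capacity(price):.0%}")
```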

There are also other, more pernicious forms of waste energy that can be targeted. For example, methane is routinely released into the atmosphere from landfills. This methane could usefully be captured and used to generate electricity, with a resulting carbon dioxide emission whose warming potential is roughly 25x smaller than that of the methane. Targeting landfill methane as an energy source is a carbon-aware strategy because it makes productive use of a resource that is otherwise wastefully emitted, and in the process dramatically reduces the warming potential of the emitted gases, all using decomposing waste that already exists. Landfill methane accounts for almost 10% of all anthropogenic methane emissions. There are companies generating electricity from landfill gas and using it to power various applications today (e.g. Vespene, Viridi).

Natural gas is also released into the atmosphere as a side effect of extracting petroleum products from the ground. Sometimes it can be economically viable to build out the infrastructure to collect and transport this leaked gas, but often there's enough gas to be environmentally harmful yet not enough to justify the expense of capturing it. In those cases the gas either simply escapes into the atmosphere, or it is burned ('flared'). There are strong environmental and economic motivations to capture this wasted gas and burn it in the service of some productive work, rather than burning it just to get rid of it. As with landfill gas, burning the methane to release carbon dioxide swaps an extremely potent greenhouse gas for one with ~25x less warming potential, and if the burning can be done in the service of productive work, all the better. Admittedly, this is not a strategy that generalises well across the software space. It is really limited to applications that are not critically dependent on uptime or on sending a lot of data over the internet, because they need to be sited close to the source of the gas, which is often remote, and the supply can be somewhat intermittent. There are several companies powering mobile data centres with flared methane. Bitcoin mining seems to be the most commonly reported activity in those mobile data centres, but company websites and press releases also report other computationally heavy tasks being done there, such as generative AI, image processing, bioinformatics and graphics rendering, with an unsurprising recent shift towards AI workloads.
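
The arithmetic behind the "swap methane for CO2" argument is simple: burning a tonne of methane yields roughly 2.75 tonnes of CO2 (44/16 by molar mass), versus ~25 tonnes CO2e if the methane is vented, using the GWP figure above.

```python
# Back-of-the-envelope: vent vs burn one tonne of methane.
# Uses the ~25x GWP-100 figure mentioned above and CH4 -> CO2 stoichiometry (44/16 ≈ 2.75).
methane_tonnes = 1.0
gwp_methane = 25.0                         # 100-year warming potential relative to CO2
co2_from_combustion = methane_tonnes * 44.0 / 16.0

vented_co2e = methane_tonnes * gwp_methane  # ~25 tCO2e if released directly
burned_co2e = co2_from_combustion           # ~2.75 tCO2e if combusted

print(f"Vented: {vented_co2e:.1f} tCO2e")
print(f"Burned: {burned_co2e:.2f} tCO2e (~{vented_co2e / burned_co2e:.0f}x less warming impact)")
```

So even before counting any useful work done with the electricity, burning the gas cuts the warming impact of that methane by roughly a factor of nine.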

Shifting carbon awareness left

While dynamically responding to short-term changes in grid carbon intensity is probably impractical for larger entities, there are some design-stage carbon-awareness principles that could be impactful. A multinational company with access to multiple sites across the EU might decide to concentrate its most energy-hungry operations in a country like France, where nuclear power makes up a very substantial proportion of the energy mix and the overall carbon intensity is low.

Even better, if you happen to be a sufficiently large entity that your operations individually move the needle on software carbon emissions, you might be able to add renewable energy to the grid to meet your own demand, or operate off the grid entirely. You might, for example, restart the Three Mile Island nuclear power plant, or install dedicated small modular reactors to power your work. Smaller consumers can probably generate clean energy with solar panels to meet at least some of their own demand. Either way, if you can add clean energy to the grid to satisfy (some or all of) your own demand, that is overall better than moving your workload around to target existing energy on different grids.

These solutions don't exactly lend themselves well to very small consumers, but then I might argue neither does carbon awareness in its canonical form. What is universally applicable is integrating carbon awareness into software design, treating it as something as fundamental as UX or security, rather than retroactively "fixing" sustainability after some system is already up and running.

Carbon-intelligent computing

Google refer to their carbon aware strategy as "carbon-intelligent". Instead of just targeting low grid intensity, they shift jobs around in time to ensure more work is done when renewable energy is more plentiful. The idea is that heavy processing jobs are run at times when they won't trigger additional fossil fuel combustion to meet the demand. This blog post suggests that location shifting between data centres can also contribute to ensuring jobs do not raise the carbon intensity of the grid. This makes sense, and I can imagine more sophisticated strategies being incorporated over time, especially if this carbon intelligent scheduling can be done in concert with additive renewable energy production from dedicated wind and solar infrastructure, for example.

I'm speculating a bit, but this does seem to foreshadow a new era of carbon awareness that uses all kinds of signals, such as weather forecasts, ambient temperatures, clean energy availability, and compute density (i.e. intelligently avoiding over-provisioning/under-utilizing servers) to minimize the operational carbon of computing.
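
To make that concrete, here's a speculative sketch of what a multi-signal placement score might look like; the signals, weights, and candidate sites are all invented.

```python
# Hypothetical multi-signal placement score; weights and data are invented.
candidates = {
    "site-a": {"intensity": 120, "curtailment_mw": 50, "utilization": 0.55, "ambient_c": 12},
    "site-b": {"intensity": 300, "curtailment_mw": 0,  "utilization": 0.35, "ambient_c": 30},
}

def placement_score(s: dict) -> float:
    """Lower is better: prefer clean, curtailed, already-busy, cool sites."""
    return (
        1.0 * s["intensity"]              # grid carbon intensity (gCO2e/kWh)
        - 0.5 * s["curtailment_mw"]       # reward soaking up surplus renewables
        + 100 * (1 - s["utilization"])    # penalize spinning up under-utilized servers
        + 2.0 * s["ambient_c"]            # hotter sites need more cooling energy
    )

best = min(candidates, key=lambda name: placement_score(candidates[name]))
print(f"Schedule the job at {best}")
```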

Grid aware computing

Climate Action Tech have proposed "grid-aware" computing, rather than carbon-aware or carbon-intelligent computing, as a way to more intelligently target pockets of low demand rather than low carbon intensity. This may ameliorate some of the negative externalities associated with carbon-aware computing.

They point out three principles for more sophisticated space/time shifting:

  • Run compute when demand is low, using curtailed green electricity in stable grids.
  • Run compute on additive electricity.
  • Demand-shape computing electricity use so it stays within agreed resource use boundaries.

They are not the only people with concerns about carbon awareness - once I started digging, I found others in the green software space who have voiced similar concerns, for example on the Carbon Aware SDK GitHub repository and on Adrian Cockcroft's blog, where he explains some tragedy-of-the-commons scenarios in more detail.

Summary

These are my own takes on carbon-aware computing, although they do seem to be aligned with some others in the space. I'm also aware there are people who are optimistic about carbon awareness. I want them to be right - I'd love for carbon awareness to be something simple and effective - but I'm not sure that it really is in its current form. My tentative conclusion is that today's version of carbon awareness (focused on grid intensity) has some nontrivial negative externalities and is probably oversimplified. Grid-aware computing and carbon-intelligent computing both seem like substantial steps in the right direction. They are broadly similar in that they incorporate a more sophisticated set of signals to determine when and where to do work, and may be better able to reduce carbon emissions.