Measuring the carbon footprint of your Python applications

With environmental targets an ever-growing concern, building applications that are as efficient as possible is more important than ever. Data is the key to understanding the environmental impact your applications have.

This data can influence design and architectural choices throughout the software development lifecycle. For example, it can help answer questions such as whether moving a component from a platform like Amazon EC2 to a serverless offering like AWS Lambda is worthwhile.

With this in mind, how do you begin to measure the carbon footprint of your software and start to put it into a more readable format? In the case of Python applications, hopefully this post will help.

What is CodeCarbon?

CodeCarbon is an open-source Python package that was created to compare the carbon footprint of machine-learning models, but it can also be used for more general applications. It uses the infrastructure hosting the code, the location of the host and the execution time to estimate the carbon dioxide equivalent (CO2e) emitted by a single execution.

The package does this by monitoring the application’s energy consumption, which it then multiplies by the carbon intensity of the electricity used. Every location in the world (even down to individual data centres) uses a different mix of energy sources, each with a different carbon intensity. This varying energy mix means that the statistics need to be localised, as some locations get more of their electricity from fossil fuels, and some fossil fuels have a higher carbon intensity than others.
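As a rough illustration of that multiplication, the sketch below estimates the emissions of the same workload in two locations. The function name and both carbon-intensity figures are made up for the example; they are neither CodeCarbon's API nor real grid data.

```python
def estimate_emissions_kg(energy_kwh: float, intensity_g_per_kwh: float) -> float:
    """Estimate emissions in kg CO2e as energy used times carbon intensity."""
    return energy_kwh * intensity_g_per_kwh / 1000  # grams -> kilograms


# Hypothetical carbon intensities in grams of CO2e per kWh (illustrative only).
LOW_CARBON_GRID = 50
FOSSIL_HEAVY_GRID = 700

energy_kwh = 0.5  # the same workload, run in two different locations

print(estimate_emissions_kg(energy_kwh, LOW_CARBON_GRID))    # 0.025
print(estimate_emissions_kg(energy_kwh, FOSSIL_HEAVY_GRID))  # 0.35
```

With these assumed figures, identical code produces fourteen times the emissions on the fossil-heavy grid, which is why the location of the host matters as much as the energy used.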

All this information builds a better picture of the software's environmental impact and the resources it uses, which allows you to optimise it to reduce both.

How to install CodeCarbon

You can install CodeCarbon from the Python Package Index (PyPI), either by including the package name in a requirements file that is used in a setup script or by running this command:

pip install codecarbon

How to monitor emissions

CodeCarbon’s implementation may feel familiar, working a bit like a timer or progress bar. There are three different ways to use CodeCarbon:

As an object
As a decorator
As a context manager

Each of these can be combined to give more granular figures as well as a broad overview.

Adding the instrumentation does come with an overhead. A test using the object method added around 1.2 seconds to the execution time and used an additional 0.000043 kWh of energy. The overhead is small, and the benefits of gathering this data outweigh it.
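To put those two figures on a more familiar scale, the short calculation below converts the extra energy into joules and an average power draw. It assumes the extra energy was used evenly across the extra 1.2 seconds, which is a simplification.

```python
JOULES_PER_KWH = 3_600_000  # 1 kWh = 3.6 million joules

overhead_kwh = 0.000043  # additional energy reported in the test
overhead_seconds = 1.2   # additional execution time reported in the test

overhead_joules = overhead_kwh * JOULES_PER_KWH
average_power_watts = overhead_joules / overhead_seconds

print(f"{overhead_joules:.1f} J, about {average_power_watts:.0f} W on average")
```

This works out at roughly 155 joules, or an average draw of about 129 watts over the added time.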

As an object

The object implementation is likely to be the most commonly used. If you’re looking to use CodeCarbon on a function as a service (FaaS) like AWS Lambda, then the object can be initialised outside of the handler function to be re-used in subsequent invocations. The example below shows how the object implementation can be used in this way.

As a decorator

The second option is to use a decorator. This is useful if you only wish to track the emissions of a limited number of functions, or of the application as a whole. It is unlikely to be suitable for monitoring an entire application in a FaaS context, because the output file it writes might be lost.

As a context manager

Finally, we can use a context manager to wrap the code we want to monitor. The context manager works similarly to the object method but is a little cleaner. The drawback is that, in a FaaS context, the tracker is not reused on subsequent invocations.

How to make your emissions data more visible

Each of these implementations outputs a file (emissions.csv by default). The file contains data including the emissions in kilograms (kg), the energy consumed in kilowatt-hours (kWh) and the country hosting the infrastructure.

Once this file is available, we can extract this data to make it more visible and track it over time. The file lets us see whether changes to our business logic have positive or negative effects on our carbon footprint. It also tells us how large that impact is.
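As a sketch of that extraction step, the function below reads the most recent run using only the standard library. It assumes a CSV file in which new runs are appended as rows and which has columns named `timestamp`, `emissions`, `energy_consumed` and `country_name`. Check these names against the file your CodeCarbon version actually produces.

```python
import csv
from pathlib import Path


def read_latest_run(path="emissions.csv"):
    """Return the most recent row of a CodeCarbon-style CSV file as a dict.

    Assumes the columns `timestamp`, `emissions` (kg CO2e),
    `energy_consumed` (kWh) and `country_name` are present.
    """
    with Path(path).open(newline="") as handle:
        rows = list(csv.DictReader(handle))
    if not rows:
        raise ValueError(f"{path} contains no runs")
    latest = rows[-1]  # assumes runs are appended, so the last row is newest
    return {
        "timestamp": latest["timestamp"],
        "emissions_kg": float(latest["emissions"]),
        "energy_kwh": float(latest["energy_consumed"]),
        "country": latest["country_name"],
    }
```

Returning a small dictionary with explicit units in the key names keeps later steps, such as publishing metrics, independent of the CSV layout.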

The snippet below shows an example of how to improve the visibility of this data by submitting it as a custom metric using Amazon CloudWatch.
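The original embed is not reproduced here, so this is a sketch of one way to do it rather than the author's code. The namespace, metric names and `Project` dimension are hypothetical choices, and `cloudwatch` is expected to be a boto3 CloudWatch client that the caller creates, for example with `boto3.client("cloudwatch")`.

```python
from datetime import datetime, timezone


def build_metric_data(emissions_kg, energy_kwh, project="example-project"):
    """Build the MetricData payload for CloudWatch's PutMetricData call."""
    dimensions = [{"Name": "Project", "Value": project}]
    now = datetime.now(timezone.utc)
    return [
        {
            "MetricName": "EmissionsKg",  # hypothetical metric name
            "Dimensions": dimensions,
            "Timestamp": now,
            "Value": emissions_kg,
            "Unit": "None",  # CloudWatch has no unit for mass
        },
        {
            "MetricName": "EnergyConsumedKWh",  # hypothetical metric name
            "Dimensions": dimensions,
            "Timestamp": now,
            "Value": energy_kwh,
            "Unit": "None",
        },
    ]


def publish_emissions(cloudwatch, emissions_kg, energy_kwh,
                      namespace="CarbonFootprint"):
    """Send both figures as custom metrics using the supplied client."""
    cloudwatch.put_metric_data(
        Namespace=namespace,
        MetricData=build_metric_data(emissions_kg, energy_kwh),
    )
```

Passing the client in, rather than creating it inside the function, lets a Lambda create it once outside the handler and makes the function easy to exercise with a stub in tests.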

You can even add these CloudWatch metrics to a dashboard to make them easy to access and monitor. In terms of pricing, Amazon’s free tier provides up to 3 dashboards, with each additional dashboard costing $3 per month. It’s important to note that custom metrics are not included in the free tier and are billed based on the number of custom metrics tracked and the number of API requests to log values against that metric.

We’ve shared example project code on GitHub.

How does this help my organisation?

Political and business leaders worldwide use large amounts of data from a variety of sources to make informed decisions. This data could be anything, from unemployment figures to recycling rates. The difficulty in decision making comes when there are large amounts of data but no dashboards or reports that make the insights the data contains easy to see.

Some organisations are beginning to use carbon budgeting to help achieve their environmental goals. Meanwhile, researchers and politicians have started to consider taxing businesses based on their emissions. Carbon taxation would punish organisations that have the largest carbon footprints while providing more funding for green innovation.

Greater data visibility is useful for digital leaders who want to monitor this data more closely because of carbon budgeting. Emissions figures could be used like code coverage, with the aim of keeping emissions at or below the level recorded before a change was introduced. The figures could also influence design or architectural choices, particularly with regard to proofs of concept.
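One way to treat emissions like a coverage threshold is a simple check against a stored baseline, sketched below. The function name and the 5% tolerance are hypothetical; a real pipeline would choose its own baseline and allowance.

```python
def within_budget(current_kg, baseline_kg, tolerance=0.05):
    """Return True if emissions have not grown beyond the allowed tolerance."""
    return current_kg <= baseline_kg * (1 + tolerance)


# Hypothetical figures: the baseline from the main branch and a new change.
baseline_kg = 0.0010
current_kg = 0.0012

if not within_budget(current_kg, baseline_kg):
    print("Emissions exceed the budget for this change")
```

A small tolerance allows for the run-to-run variation that measurements of this kind are likely to show.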

Both of these uses help drive down emissions. Knowing the numbers makes people more aware of the resources used, which means we don’t make decisions that cost the Earth.

Today we’ve taken a deep dive into Python applications, but this is just one of many ways an organisation can use data to reduce energy consumption. If you’re interested in assessing the environmental impact of your entire digital estate, or harnessing the potential of your data to make better decisions, we’d love to help. You can find out more about our data services, or just drop us a line.
