Software is polluting the world

Mar 18, 2020

Hello from 20 Minutes into the Future. Tonight we’ll be looking at the climate cost of big tech. It’s worse than you think.

Way back in 2011 venture capitalist Marc Andreessen famously said “software is eating the world.” Coming from a VC, “world” of course means “global financial markets.” He smartly pointed out how the emergent tech companies had better valuations / share prices / profit margins / etc. than non-tech companies.

Years earlier, in 2006, mathematician Clive Humby coined the phrase “data is the new oil.” By that he meant it’s valuable but it has to be mined, refined, and changed into something actionable before it leads to profit. If you work in tech you’re probably sick to death of hearing it.

Sadly, they’re both right, but in ways they didn’t mean. It’s more accurate to paraphrase Andreessen and say that software is polluting the world. Because data is like oil in that it powers the world and comes at terrible climate costs.

Not a subscriber yet? 20 Minutes into the Future is 100% ad free and always will be. Sign up for weekly commentary & related links to help you dig deeper into big tech behaving badly.

Data Landfills

If you’re like me and of a certain age, you’ll be forgiven for thinking all this digital stuff was supposed to help fight pollution. Sadly, all of our digital junk is still in fact junk. And like physical junk, it’s problematic.

Think about your email for a second. How long have you had your email? How many years’ worth of attachments are you hanging onto?

Right now, data centers consume about 2 percent of the world’s electricity, but that is expected to reach 8 percent by 2030. Moreover, only about 6 percent of all data ever created is in active use today, according to research from Hewlett Packard Enterprise. That means 94 percent is sitting in a vast “landfill” with a massive carbon footprint.

“It’s costing us the equivalent of maintaining the airline industry for data we don’t even use,” said Andrew Choi, a senior research analyst at Parnassus Investments, a $27 billion environmental, social and governance firm (ESG) in San Francisco.

Emphasis mine. And that’s even taking into account all the renewable energy efforts big tech likes to trumpet. That’s fucking mental.

But unlike a lot of tech problems, it’s easy to see how we can take personal action to help. While you’re at home social distancing, take some time every day to delete those old emails you don’t need. The planet will thank you.

The Carbon Cost of Training Data

AI researchers have long suspected there was a significant climate cost to their work. But until June 2019 there wasn’t much quantified data about it. That changed with research from Emma Strubell, Ananya Ganesh, and Andrew McCallum.

The team took a look at four common models — Transformer, ELMo, BERT, and GPT-2 — that have been responsible for the biggest leaps forward in natural language processing (NLP) in recent years. What’s more, they modelled the energy mix on the one used by AWS. The results were REALLY NOT GOOD™.

In a new paper, researchers at the University of Massachusetts, Amherst, performed a life cycle assessment for training several common large AI models. They found that the process can emit more than 626,000 pounds of carbon dioxide equivalent—nearly five times the lifetime emissions of the average American car (and that includes manufacture of the car itself).
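For the curious, here is the back-of-the-envelope arithmetic behind “nearly five times,” as a minimal Python sketch. The 626,000-pound figure is the rounded one quoted above. The 126,000-pound lifetime figure for an average American car (fuel included) isn’t stated in this newsletter; it’s the number I recall the paper using, so treat it as an assumption.

```python
# All figures are in pounds of CO2-equivalent.
# 626,000 is the rounded training figure quoted above.
# 126,000 is an ASSUMPTION: the average-car lifetime figure (fuel included)
# recalled from the paper, not stated in this newsletter.
TRAINING_EMISSIONS_LBS = 626_000
CAR_LIFETIME_LBS = 126_000


def car_lifetimes(emissions_lbs: float, car_lbs: float = CAR_LIFETIME_LBS) -> float:
    """Express an emissions figure as a multiple of one car's lifetime emissions."""
    return emissions_lbs / car_lbs


ratio = car_lifetimes(TRAINING_EMISSIONS_LBS)
print(f"{ratio:.2f} car lifetimes")  # prints "4.97 car lifetimes"
```

Swap in a different car figure and the multiple moves accordingly, which is why it’s worth being clear about which baseline is being used.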

Digging deeper, they highlighted that the costs are likely underestimated. Why? Because it takes several rounds of training to get a model consumer-ready. All that tweaking and tuning is wrecking the climate. And that's even if engineers don't just say fuck it and start over from square one.

What’s more, the researchers note that the figures should only be considered as baselines. “Training a single model is the minimum amount of work you can do,” says Emma Strubell, a PhD candidate at the University of Massachusetts, Amherst, and the lead author of the paper. In practice, it’s much more likely that AI researchers would develop a new model from scratch or adapt an existing model to a new data set, either of which can require many more rounds of training and tuning.

How can this get better? There are researchers at Oxford who are working on reducing the amount of data needed to train a model. Time will tell how quickly they can bring that to market. Regardless, it will likely require a cultural change in engineering teams as well. This is very hard graft but it has to happen.

Sick and tired of big tech behaving badly? 20 Minutes into the Future is about holding the bastards to account. One way we can do that is by spreading the word of their misdeeds.

Twitterbots & Climate Catastrophe Denialism

The Guardian recently reported on new research from Brown University. The team was trying to ascertain how climate misinformation was spreading on Twitter. Shocking no one who’s spent any time on that hellsite, it’s bots 🤖

On an average day during the period studied, 25% of all tweets about the climate crisis came from bots. This proportion was higher in certain topics – bots were responsible for 38% of tweets about “fake science” and 28% of all tweets about the petroleum giant Exxon.

As with other instances of 21st-century propaganda, this is overwhelmingly coming from the far right:

Conversely, tweets that could be categorized as online activism to support action on the climate crisis featured very few bots, at about 5% prevalence. The findings “suggest that bots are not just prevalent, but disproportionately so in topics that were supportive of Trump’s announcement or skeptical of climate science and action”, the analysis states.

As we’ve discussed before, this is problematic because of the availability heuristic. Which, put very simply, means the more we see something, the more likely we are to believe it, or at least believe other people believe it. That’s especially dangerous here, where the case is closed on climate science and it’s just a bunch of alt-right nihilists and petrol death-cultists arguing in bad faith against it.

What’s more, these bot posts can be amplified via promoted tweets. That is, the groups responsible for the bots can pay to advertise them across the platform. How might Twitter fix this? They could put restrictions in place similar to the ones they use for tobacco products. Such a simple policy change, and it could have a solid impact.

The solutions to big tech problems aren’t always about computing power. All too often they’re about human willpower.

Dig deeper with these stories from across the web:

The environmental cost of keeping mail and files online keeps rising (Japan Times)
Training a single AI model can emit as much carbon as five cars in their lifetimes (MIT Technology Review)
Revealed: quarter of all tweets about climate crisis produced by bots (The Guardian)

Thanks for reading 20 Minutes into the Future. Have a friend or colleague who'd like the newsletter? Invite them to sign up.

Good night and good future,
Daniel

20 Minutes into the Future is a critical look at how technology is shaping our lives today. And what actions we can take for a better tomorrow. If you're not already a subscriber and found this newsletter worth your while then please sign up.

My name is Daniel Harvey and I write 20 Minutes into the Future. I’m a product designer and have written for Fast Company, Huffington Post, The Drum, & more. If you're pissed about the current state of tech and want to see us do better then you’ve found a kindred spirit.

You can email me at daniel.harvey@gmail.com or follow me on Twitter @dancharvey.
