Product Engineering // Blog

Spreadsheets are sabotaging your business

People writing spreadsheets are terrible and they continue to make mistakes. A study in the UK in 2011 found that 94% of spreadsheets contain errors when deployed.

This isn’t just one or two errors in the file: 5.2% of cells in unaudited spreadsheets contain errors. Billions of dollars have been lost in some cases and there are plenty more examples in the public domain. With so many errors going on, an obvious safety measure might be to have important sheets double-checked, yet the same study showed that only 13% of these UK workers had their sheets audited internally. Just think of how draining it would be to audit a spreadsheet anyway.

Humans don’t make this amount of errors in other domains, not even programming, where errors were costing just the US economy $59.5 billion dollars annually all the way back in 2002. This figure presumably omitted spreadsheets from the figure and perhaps you’d like to guess what the worldwide figure was for pure “software bugs” in 2016 … (answer − please be seated).

Software engineers study for years and their purpose in life is often to solve problems and to enable empowerment and creation of society and the future. In discussing the software bugs, my point is that creating software is inherently difficult. People fiddling with spreadsheets are technically programmers − Excel is Turing Complete, meaning that anything that’s computable can be computed in Excel (as long as the sheet is sufficiently large).

Old Nokia phone with a spreadsheet on the screen.

‘That was the thing at the time. That was the new technology at the time. It looks a little dated now, I can see that.’ − Nelly

The fundamental problem with spreadsheets is that they encourage the notion that users aren’t actually programing. The whole user interface is built around ancient processes that were being used in the time of hieroglyphics.

But the whole premise of the ancient spreadsheets was that a human could easily process what was happening between the columns, since the operations were addition and multiplication, not compound interest or VLOOKUP.

Man fiddling with a mess of cables.

Whoops

Hiding the details of calculations between cells means that these calculations are magic and they can’t be properly inspected. Real code is read far more often than it’s written, yet for spreadsheets, there’s a design assumption that the code never really needs to be read after it’s originally written perfectly on the first attempt.

People rely on lazy comments and over-the-top decoration around the place to infer what’s happening. Accountants are not just writing a nicely formatted report; they’re writing an absolutely awful program that breaks all good programming practices. 

Algorithms by complexity − leftpad; quicksort; git merge; self-driving car; Google search back-end; sprawling Excel spreadsheet built up over 20 years by a church group in Nebraska to coordinate their scheduling.

© xkcd

The problems with spreadsheets are already problematic, but what makes them even worse is their lack of traceability. Have you ever received an email with spreadsheet_v2.xls attached? Oh but Jenny was also working on changes, so now she has her own version 2 that’s different to Bruce’s version 2.

One approach is for you all to fiddle together with something in the cloud. What real software engineers do though is to rely on something like Git, where the file is stored in the cloud and changes are submitted with descriptions. You can (technically) see the difference between copies of a cloud-hosted spreadsheet, but they come with no description, nor the ability to know when a particular cell was last changed.

Spreadsheets also suffer from the fact that they’re keeping data and program instructions in the same place. An algorithmic workflow should not be tightly coupled to its data − it should not be regarded as a fundamentally new spreadsheet just because some input numbers were changed; nor should it matter too much if some colors and fonts were changed.

In this paradigm, not only are employees allowed to write terrible code, they’re allowed to write code that is only run once rather than continuously. Maybe they’ll reuse last-year’s spreadsheet and change some numbers for this year’s results, but that’s only going to happen if the same person is still working for the company. Otherwise, the new person is going to realize that the previous spreadsheet was a confusing mess and they’ll start from scratch in their own way.

If I were writing a business report in a real programming language, I might make a claim for instance that we should pursue options in New Zealand 🇳🇿, since Amazon isn’t doing business there. I could implement error checks around this assumption by having a bot that periodically visits amazon.co.nz to see what’s happening there. 

Are you like Gary? Gary has a problem. The only time he gets visibility into how his business is performing is at the end of the month, or worse, at the end of the quarter. It's like trying to drive your car with a broken widndshield − you can't really see where you're going.

A company advertising their ability to vaguely replace spreadsheets.

Imagine writing a marketplace-tracking bot in Microsoft Excel − it is not at all feasible. When anyone claims that spreadsheets can’t do complex tasks though, Excel-lovers love jumping to the defence [interrupting], “but, but, VBA!!”

Well, VBA is esoteric and outdated, making it the most hated language amongst real programmers. Programmers are constantly having to learn new languages every few years (making traditional education unfeasible) − we don’t stick with paradigms from 1991. When new algorithmic breakthroughs are announced, they’re inherently difficult to code, so you’ll need to hope that it’s available in your programming language of choice.

For Excel users, that’s VBA or C# (for plugins). It’s not worth going into the why, but the fact is that people who write open-source software typically can’t be bothered making it available for spreadsheets. This isn’t just bad for availability, but also scrutability, which we’ll revisit in a bit.

When Excel users cite their love of VBA and plugins, they fail to mention the staggering number of security intrusions that have been injected by macros. Why are you granting super-user permissions (free reign over your computer) to a program that’s only meant to be doing a few calculations in your spreadsheet? In its list of 35 strategies to mitigate cyber attack, the Australian Signals Directorate includes as #5 that people need to disable Microsoft Office macros.

Open-source software is available for everybody to scrutinise, so that bugs (leading to possible hacking) can be spotted ahead of time. Even when the code is open-source, there have been disasters like Heartbleed, where a bug that was unnoticed for 3 years exposed 17% of the Internet’s servers to attack. Code that can’t be publicly scrutinised, like Microsoft Excel itself, has been targeted by “the most menacing malware in history”, when Iranian nuclear enrichment plants were sabotaged.

As a side note, neither data, nor algorithms (often written with Greek) are suited to spreadsheets. Why is everything 2D? Algorithms are 1-dimensional and data might have any number of dimensions. The 2D layouts and the sheet-switching are major contributors to all of the bugs in spreadsheets. When we have to add letters together and scroll for 9 hours to get to the bottom of the sheet, isn’t that enough of a sign that something is seriously messed up?

 i want GIF

The Solution

It’s probably evident by now that I’m going to advocate giving up spreadsheets completely, so you’ll unfortunately have to stop throwing “Microsoft Excel” on your résumé. Funding the Singularity is one possible solution to fixing the problems of spreadsheets, but if only it could come sooner. For the moment, there is actually an exciting solution − Jupyter Notebooks!

Jupyter notebooks that look fancy

 

Jupyter notebooks use the easiest programming language to learn, giving access to the state of the art in terms of natural-language processing, computer vision, web scraping and even monetary transactions, enabling you to automatically order a hitman to avenge your death.

Performing calculations in a first-class programming language rather than a leaky abstraction not only allows the avoidance of mistakes, but the ability to scale your business. Many companies unfortunately have to waste money buying powerful computers so that pen-pushers can run all their business calculations on their laptop.

Using Jupyter notebooks, we can offload the processing power to the cloud if we like; even massive calculations like machine learning. Can you imagine your staff trying to get this going, then turning their computers off and on again when it isn’t finished after two minutes?

Not only can IPython / Jupyter notebooks decouple you from computing power; they can decouple you from data visualisation. You can see your data however you like by leveraging plot.ly.

Making data-processing and reporting scalable is inevitably a core component of making a business that’s scalable. If you’re running a business, why not structure it to eventually run the whole industry? If you want to create and capture lasting value, build a monopoly!

Maybe this article isn’t enough − if you need extra help in revising your business to stop making silly mistakes that might sink your company; or if you’d like help restructuring your business for scalable computing, then we can chat about designing new business models and conquering your entire industry. Contact us.

Weekly Industry Newsletter

Top Posts

  • What is a Platform?

    Read more

  • Platform vs. Linear: Business Models 101

    Read more

  • Amazon Business: Understanding the Threat to B2B Distributors

    Read more

  • Bloomberg: B2B Distributors Have Opportunity for Marketplace

    Read more

  • White Paper: How B2B Distributors Take On Amazon Business

    Read more

Read articles by