You cannot work in close proximity to technical people, particularly those who build systems, for long without hearing the term “technical debt” bandied around.
Technical debt is what you are adding to every time you choose an easy or quick solution now, rather than looking at longer-term strategies. It is the technical expression of ‘failing to plan is planning to fail.’ And it has consequences.
If a technical system is held together with metaphorical prayers and duct tape, it will often be too fragile to maintain effectively (i.e. trying to update or patch it is likely to cause an outage or just break it irreparably). Given that the earliest systems in an organisation are also often the most important to its operation, the most critical systems are usually the ones carrying the most technical debt.
Technical debt is often thought of as just adding friction to systems, with possibly the occasional outage, but it is much more insidious and damaging when we consider that most technical debt also involves what I like to call security debt. If you can’t update a critical system to maintain it effectively, it will be vulnerable to cybersecurity threats. Known vulnerabilities are a key factor in most cybersecurity incidents, and they persist because systems go unmaintained.
This is a big enough problem on its own, but then we look at the world of Operational Technology (OT) and Industrial Control Systems (ICS), and things get worse quickly. OT refers to technologies used to manage physical systems, often but not always industrial (building HVAC, access control systems, lifts, etc., are also OT systems). ICS is a subset, specifically the systems that monitor, manage, and control industrial processes.
OT, ICS, and CNI
The most important OT processes are those in Critical National Infrastructure (CNI), everything from power plants to water treatment facilities. These systems were often automated before security was the major concern it is now, were connected to the internet to allow remote or centralised monitoring, and carry enough technical debt that they are often impossible to maintain.
Manufacturers for some of these systems no longer exist, and their expense means that they are certainly not refreshed every few years as other IT systems should be.
The first known and most famous attack against ICS involved the Stuxnet malware, uncovered in 2010. Stuxnet is still one of the most sophisticated cyberweapons developed to date and has been repurposed to carry out other attacks after the one that (rightly) made it famous.
To keep the story short, the Stuxnet malware was developed to compromise Microsoft Windows machines to gain an initial foothold on a network, after which it would seek out the controllers which automated gas centrifuges for separating nuclear material. Estimates are that Stuxnet ruined roughly one-fifth of Iran’s nuclear centrifuges and set back the national nuclear programme for several years.
Stuxnet was developed to be subtle; it did not simply cause centrifuges to fail but introduced random variances in their operations which caused them to fail faster. It’s estimated that it was a year after release before it was discovered, and the discovery was more luck than planning.
It isn’t only malware that we need to worry about affecting CNI systems. The Colonial Pipeline attack did not target any ICS (it was down to a compromised password, and the shutdown was precautionary), and the attackers showed no signs of attempting to breach those critical systems. Had they genuinely aimed to cause chaos rather than deploy off-the-shelf ransomware, though, the attack would not have been detected as quickly and could potentially have affected the pipeline’s ICS. Given the control those systems had, the damage could have been much more significant.
In another recent case, an attack against a Florida water treatment plant was down to old software being used (that technical debt again). The attackers raised the amount of sodium hydroxide (used in small quantities to lower the acidity of water, and capable of causing chemical burns in large concentrations) by a factor of about 100. Fortunately, the change was detected, and further safety measures were in place. But in this case, we are seeing attackers deliberately and maliciously trying to cause damage. Whether they wanted to send a message, knowing that other measures would prevent the attack from succeeding, or were genuinely attempting to poison the water supply is unknown, and only limited information has been shared.
So far, only two attacks are confirmed to have destroyed equipment (though a recent hospital ransomware attack has been held directly responsible for a loss of life). The Stuxnet attack damaged nuclear processing centrifuges in a very careful way. A second attack, some years later, occurred in Germany, where a steel mill was compromised. The attackers managed to disrupt the control systems of a blast furnace to such a degree that it resulted in ‘massive’ damage, believed to be through overheating and removing the ability for the furnace to be shut down.
More seriously, or at least more impactfully, the Ukraine power grid was deliberately targeted shortly afterwards with a strain of malware named ‘BlackEnergy’, resulting in over 200 000 customers losing power. A year later, the same happened again in a different attack using more sophisticated malware known as CrashOverride. Both attacks could have been much more serious: the attackers chose only to cut the power, and not to reconnect it out of phase, which would have been catastrophic.
Another attack in 2017 was the first deliberately targeted at safety systems designed to enforce emergency shutdowns when human life is at risk. The attack was initially believed to be a malfunction in the equipment until the security team sent in to investigate determined it was due to malware and was part of an effort to develop the capacity to cause physical harm.
Realistically, there aren’t any easy answers to this technical debt problem. The effort and expense needed to update these systems are beyond what the organisations that own them are willing or able to afford. The alternative solution of fully isolating the facilities requires re-engineering processes that are highly dependent on interconnection and effective communications.
Whenever designing a new system, it is vitally important to consider carefully the technical decisions being taken. The risks of not doing so are becoming more visible every day.
Cyber Security Fundamentals: Security and Technical Debt Collection
By James Bore
James Bore is an independent cybersecurity consultant, speaker, and author with over a decade of experience in the domain. He has worked to secure national mobile networks, financial institutions, start-ups, and one of the largest attractions companies in the world, among others. If you would like to get in touch for help with any of the above, please reach out at james@bores.com