Technical Debt: The End of Service Challenge

How many of us are familiar with the phrase technical debt? The term is largely associated with software development, but it can also be applied to IT infrastructure.

So, what exactly does it mean?

It’s the practice of maintaining older versions of technology in the belief that doing so is more cost-effective than upgrading. Recent announcements from IBM had me considering the concept once again, and I think it’s a valuable exercise to explore the implications.

First up was calling time on IBM i 7.3. This version is due to go out of support in September, but IBM will no doubt offer a Service Extension for those who believe they cannot upgrade.

How much does IBM i 7.3 Service Extension cost and what does it deliver?

You can expect to pay at least double the current charges, which places you in technical debt. However, the important thing to understand is what this really delivers. IBM have defined Service Extension as:

  • Basic usage and problem rediscovery support for specified out-of-service IBM i product releases
  • Coverage as specified in “Type of Service Extension”
  • Voice and electronic remote technical support
  • Support is provided during normal business hours in the country where your product is licensed and/or your contract registered for non-business critical problems. For business critical (severity 1), support is provided 24 hours a day, every day of the year.

All looks good, but the second bullet point is the most important. When you drill down into “Type of Service Extension” you are typically going to find the phrase “Usage and Known Defect”. What this means is that if a known problem arises on your system, you can apply the fix that has already been written and tested. But what if the issue you have has not been previously identified? In this instance IBM will not investigate or write a new fix, but will encourage you to upgrade to a later version of the OS where no doubt the issue has been resolved.

At first glance this may seem unfair, as an application that is critical enough to warrant taking out Service Extension should receive the focus you require. And therein lies the problem of technical debt. If the application is that important, I would argue that upgrading to the latest release is a major priority. We can’t keep putting off upgrades because they appear difficult or introduce an element of downtime you don’t think the business can support. Trying to force through an upgrade to fix an issue will pose a greater risk than a planned project.

The second announcement was the End of Service for POWER8 hardware. Support for the scale out systems ends in April 2024, and for the scale up systems in October 2024. In much the same way as the operating system example above, IBM will continue to support the hardware through a Service Extension.

How much does POWER8 Service Extension cost and what does it deliver?

Costs are not as easy to predict as in the OS example and rely on a number of factors, such as the availability of parts. We expect the cost to be roughly 3 times the current maintenance charges in the first year, and for this to increase year on year until IBM finally pull the plug on the hardware and end all support.
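To get a feel for how quickly this adds up, here is a rough sketch in Python. The 3 times multiplier comes from the estimate above; the 25% year-on-year increase, the three-year horizon and the function name are purely assumptions for illustration, not IBM pricing, so swap in your own figures.

```python
def service_extension_costs(base_annual, first_year_multiplier=3.0,
                            yearly_increase=0.25, years=3):
    """Project yearly Service Extension charges.

    base_annual:           current annual maintenance charge
    first_year_multiplier: roughly 3x, per the estimate in this article
    yearly_increase:       ASSUMED growth rate (0.25 = 25%), not an IBM figure
    years:                 ASSUMED number of years spent on Service Extension
    """
    costs = []
    multiplier = first_year_multiplier
    for _ in range(years):
        costs.append(round(base_annual * multiplier, 2))
        # The charge is expected to rise each year; the rate is a guess.
        multiplier *= 1 + yearly_increase
    return costs


def extra_spend(base_annual, costs):
    """Total paid above what standard maintenance would have cost."""
    return round(sum(costs) - base_annual * len(costs), 2)


if __name__ == "__main__":
    projected = service_extension_costs(10_000)
    print(projected)
    print(extra_spend(10_000, projected))
```

Setting first_year_multiplier to 2.0 gives a similar rough picture for the IBM i 7.3 Service Extension discussed earlier.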

So, what does it deliver? We all know the IBM Power systems to be pretty fault tolerant, but the older a system becomes the more likely we are to see component failure. If you Google ‘mean time before failure’ you’ll find plenty of examples of this. In this example it is easier to define what the support no longer provides, and then you can see if it is fit for purpose (thank you Craig Cannon for the following definition).

Service Extension will no longer provide:

  • Parts availability based on supplies available. No new parts to be built/procured by IBM and/or suppliers
  • Preventive service
  • Support for newly reported defects or previously reported or known defects for which no updates, patches, or fixes were created
  • Development of any new machine code updates, patches, or fixes (including those designed to address security)

Ok then, so parts are no longer to be built or procured by IBM. You will have to rely on reconditioned parts being available. Maybe not an issue, as you would expect there to be plenty available. However, you are probably going to have to compromise on any kind of Service Level you have currently. Not all parts will be available in all geographies, you can guarantee that.  So, unless you have a fully fault tolerant system, mirrored at bus level for example, you could spend a lot of time crossing your fingers waiting for a part to arrive. Not the best situation to be in.

In short, there’s no more preventive service, support for new defects or development of machine code. If you install firmware updates on your system as a form of scheduled maintenance, which you should be doing (it’s done as a matter of course on Intel infrastructure, so why not Power?), you will no longer be able to do this. As before, the chance of a new defect arising is pretty slim. But it’s still an exposure that you are subjecting yourself to, and you are paying 3 times the maintenance charge for the privilege.

Imagine suffering either of the above.

What if a failed motherboard takes almost a week to replace? (Not unusual.) Or a defect in the microcode on a RAID controller cripples performance? (Has been known to happen.) And you then realise an upgrade is the only way to resolve these issues. In this instance, you will likely have to start the 3-month upgrade process. And yes, it will take this long at the very least. Generally, you will wait 5-6 weeks for an order to be signed off and for the kit to arrive. Next, you’ll have to plan a migration to new hardware and possibly a new release of the OS. All while everyone asks, ‘are we done yet?’
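To put dates on that, here is a quick back-of-the-envelope sketch in Python. It takes the upper end of the 5-6 week order estimate and assumes the remaining 7 weeks of a roughly 3-month process go on planning and migration. That split, the example date and the function name are assumptions for illustration only.

```python
from datetime import date, timedelta


def forced_upgrade_timeline(failure_date, order_weeks=6, migration_weeks=7):
    """Estimate key dates for an unplanned upgrade.

    order_weeks:     sign-off plus delivery, 5-6 weeks per this article
    migration_weeks: ASSUMED remainder of the roughly 3-month process
    """
    kit_arrival = failure_date + timedelta(weeks=order_weeks)
    go_live = kit_arrival + timedelta(weeks=migration_weeks)
    return kit_arrival, go_live


if __name__ == "__main__":
    # Hypothetical failure date, chosen only as an example.
    arrival, live = forced_upgrade_timeline(date(2024, 5, 1))
    print("Kit arrives:", arrival.isoformat())
    print("Earliest go-live:", live.isoformat())
```

With these assumptions, a failure on 1 May means the kit lands around 12 June and the earliest go-live is the end of July, with the business exposed for the whole of that window.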

I understand there are many issues that will prevent organisations from keeping up to date with hardware and OS releases. But if we take a step back and recognise the importance of the system to the business, then I am sure we can get the buy-in required to ensure application availability is maintained. After all, in most cases these systems are a key component in the running of the business and provide applications and central databases. Without these, many production lines, banking applications, retail outlets and logistics companies (to name a few) will fail.