Why Firms Need to View Operational Resilience as a Strategic Imperative

Drafted by Ben Saunders: OpRes Founder

Roughly an 11-minute read

Introduction

Early last week, I was speaking with a leading cloud service provider and one of their Subject Matter Experts. We were discussing all things operational resilience and both agreed that the recent Operational Resilience policy updates are good news for both customers and firms themselves. Regulation can often be seen as a burden for firms to implement, taking significant time and financial investment to ensure that compliance obligations are met. However, in the realm of operational resilience, we here at OpRes believe that firms need to view operational resilience as a strategic imperative, so that they can stand out from their competitors in an increasingly digital-first sector.

Indeed, as digital channels continue to redefine how the financial services sector works, new vulnerabilities are prompting an increased urgency to achieve operational resilience for firms of all shapes and sizes. In the last 5 years alone, there has been a surge in digital-only finance propositions which rely heavily on the infrastructure and capabilities of a trusted set of third-party service providers powered by cloud computing. This brings new challenges for firms in respect of the relationships they hold with third parties, the means by which they interact with customers, and the changing vectors of cybersecurity attacks.

In addition, ongoing disruptions to critical financial infrastructure, across multiple channels, have resulted in increased scrutiny from governments, regulators, the media, and, perhaps most importantly, frustrated customers. I, for one, checked out of my traditional bricks-and-mortar bank around 20 months ago because of constant service disruptions to online channels!

With the likes of PS21/3 coming to the fore, firms across financial services will invariably be embarking on sizable transformation programs to uplift, modernise, or in some cases replace their IT infrastructure. However, as they do so, firms need to ensure that they drive tight alignment of people, process, and technology changes in order to embed operational resilience in the day-to-day thinking of their organisation. In doing so, operational resilience, much like security, should become a first-class citizen in all firms.

Over the course of this blog, we will discuss:

  1. Why operational resilience is emerging as a key theme for firms of all shapes and sizes. 

  2. What firms need to do to achieve their operational resilience targets.

  3. New & growing challenges firms face in being operationally resilient for their customers.

  4. Why deeper testing of scenarios is required to ensure operational resilience.

Why is operational resilience emerging as a strategic imperative for firms?

There are three key factors that are increasing firms’ focus in respect of operational resilience. They are:

  1. Regulatory drivers.

  2. Changes in customer behaviour and demands.

  3. Internal pressures and risk mitigation.

Let's break each of these down in more detail. 

1. Increasing Regulatory Focus - Combining Financial & Operational Resilience

We have spoken at length about the growing regulatory focus on ensuring that all firms continue to deliver services to customers and prevent intolerable harm in the midst of service disruptions. Regulatory bodies across the globe are establishing their own interpretations of what operational resilience is, resulting in firms needing to take different measures to comply with policies when they operate across borders.

Because of the increasingly connected, always-on nature of society, what would have been deemed a minor operational “blip” a few years ago is now amplified many times over by both mainstream and social media channels. In March 2021, the FCA and PRA published their joint, finalised policies regarding operational resilience for the firms they regulate, noting that “Operational disruptions and the unavailability of important business services have the potential to cause wide-reaching harm to consumers and risk to market integrity, threaten the viability of firms and cause instability in the financial system”.

Firms have until March 2022 to map their important business services and conduct scenario testing to validate the interventions they must make to increase their operational resilience capabilities. Beyond 2022, there is a three-year period in which firms must ensure they operate within their newly baselined impact tolerances and plug any major gaps in their armour identified by the extensive mapping and testing exercise.


2. Increasing Digitization of Customer Facing Products

The onset of the Coronavirus pandemic dealt a hammer blow to industries across multiple sectors, and financial services was no exception. Bank branches and ATMs were among the main financial casualties of Coronavirus as lockdown rules forced the industry to adapt at light speed. Over the last year, regulators and governments have stepped forward to ensure physical cash use is protected, but if current trends continue, bank branches could become a thing of the past.

In insurance, meanwhile, the dominant channel for customers purchasing new personal insurance policies is often pricing aggregators, websites, or 3rd party API integrations. Insurance firms we speak with are reporting a steady decline in new business acquired through brokers and agents.

In the face of this shift in customer demands, there is a need for firms to deliver exceptional experiences to their customers, all whilst maintaining operational resilience. This is amidst competition from FinTechs who are not burdened with years of technical debt built up from mergers and acquisitions. As an example of the fine balancing act firms face, between July 2019 and August 2020 the FCA was notified of 88 outages to digital banking channels by U.K. current account providers.

These changing customer preferences are driving firms to provide more connected and increasingly digitized processes, creating new expectations and new risks, and in turn increasing the focus on operational resilience across all firms.

3. Mitigate Risks and Protect Brand Reputation

No firm wants to go down and experience a service disruption that could cause intolerable harm to customers and, in turn, long-lasting reputational damage with loyal customers and potentially other firms operating across financial markets. The change in customer demands we spoke of previously now requires firms to provide access to services 24/7, 365 days a year. In the face of these demands, firms are launching new, digital-by-default propositions that require continual oversight and monitoring. These propositions are often underpinned by process automation and modern engineering practices such as DevOps & continuous deployment to deliver change at speed.

Whilst firms require a greater velocity of change, this often brings a growing reliance on third parties and higher-level services delivered by public cloud service providers. This is creating new interaction models with partners and suppliers and, to a certain extent, is blurring the lines of responsibility with the evolution of “Shared Responsibility Models”.

Invariably, the reach of social media means that when things do go wrong, pretty much everyone hears about it, and quickly! Indeed, customers are usually the first to notify firms of service disruptions which, if not acted upon swiftly, can result in awkward public relations efforts to restore brand confidence.

 

How can firms achieve Operational Resilience?

Regulatory scrutiny is increasing and industry commentators are stressing the importance of operational resilience to firms. Yet the breadth of services offered by organisations, often across borders, can make it difficult for material risk takers and board members to implement effective oversight controls. As such, there are some bold steps firms need to take in order to achieve their operational resilience targets. Namely:

  1. Treat resilience as a first-class citizen in their digital transformation efforts. 

  2. Tackle system upgrades head-on. 

  3. Implement sound protection, response and learning frameworks. 

  4. Ensure board-level visibility of disruptions.

Let's expand on each of these points in more detail.

1. Building Operational Resilience into Your Firm's Digital Transformation Efforts

Time to market has become increasingly important with the onset of digitally centric financial propositions. Not a day goes by without an announcement regarding the launch of a new neo-banking proposition, or a FinTech securing millions of dollars in Series B funding to support their scaling needs. Many of these services are often brought together through interconnected, cloud-based architectures to support agile delivery at a high velocity. However, when embarking on building new products or instigating new partnerships, firms must ensure that optimal levels of vetting have been conducted in order to identify potential risks to themselves, their customers, and the wider market.

As such, having a sound understanding of how important business services are pieced together is critical for firms. Mapping the technology that these business services touch is equally imperative, to ensure firms can deliver a resilient customer experience. Furthermore, building a deep understanding of 3rd & 4th party dependencies is crucial when building service restoration processes and setting impact tolerances for digital propositions.

It is all well and good setting a service level agreement of 99.99% and a Recovery Time Objective of 2 minutes. However, if one of your 3rd party suppliers relies on a 4th party supplier who is not contracted to meet those availability or service restoration targets, then intolerable harm could be experienced by customers sooner rather than later. Having all the latest and greatest telemetry in place can certainly help, but firms need to ensure that their expectations are matched across each stage of their supply chain in order to meet their digital transformation and operational resilience aspirations.
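To make the supply chain point concrete, here is a minimal Python sketch of how availability compounds across a chain of dependencies. It assumes that every link must be up for the service to work and that failures are independent. The supplier names and availability figures are invented purely for illustration and are not drawn from any real contract.

```python
# Illustrative sketch only: names and figures below are assumptions.
MINUTES_PER_YEAR = 365 * 24 * 60


def composite_availability(availabilities):
    """End-to-end availability of a serial chain of dependencies.

    Assumes every link must be available and that failures are independent,
    so the individual availabilities multiply together.
    """
    result = 1.0
    for availability in availabilities:
        result *= availability
    return result


def annual_downtime_minutes(availability):
    """Expected minutes of downtime per year at a given availability."""
    return (1.0 - availability) * MINUTES_PER_YEAR


# Hypothetical supply chain: the firm and its 3rd party target 99.99%,
# but the 4th party is only contracted to 99.5%.
chain = {
    "firm_platform": 0.9999,
    "third_party_saas": 0.9999,
    "fourth_party_hosting": 0.995,
}

end_to_end = composite_availability(chain.values())

print(f"Target downtime at 99.99%: {annual_downtime_minutes(0.9999):.1f} min/year")
print(f"End-to-end availability:   {end_to_end:.4%}")
print(f"End-to-end downtime:       {annual_downtime_minutes(end_to_end):.1f} min/year")
```

A 99.99% target allows roughly 53 minutes of downtime a year. Under these assumptions, the end-to-end figure can never be better than the weakest link in the chain, which is why a 4th party's contractual terms matter as much as the firm's own.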


2. Tackle System Upgrades Head-On 

Upgrades are hard! Especially when they involve millions of customer records and many more millions in financial accounting data. Meanwhile, the volume and frequency of cyber attacks by malicious parties are only increasing. As such, firms can no longer opt to delay upgrading or replacing heritage systems that carry technical debt and security risks.

Indeed, there has been a spate of news headlines about failed banking migrations in recent years, and this has increased caution across firms. However, transitioning to new systems that comprise highly available, multi-region services with modern cloud-based architectures should positively impact operational resilience over time. That said, we recognise that this comes at a significant cost for firms and is often a once-in-a-generation exercise!


3. Implement a Protection, Response and Learning-Based Resilience Framework

The growing adoption of cloud-based services relies heavily upon modern engineering practices to ensure secure and highly available architectures. This often results in the implementation of practices like Continuous Delivery and DevSecOps, and capabilities such as Infrastructure as Code, to establish scalable and hardened propositions. With this, firms often implement modern observability strategies and apply site reliability concepts to establish real-time monitoring and alerting capabilities.

As firms optimise their digital propositions, this gives customers increased opportunities to interact with services in real-time. In turn, when disruptions are experienced, firms must have well-documented service restoration procedures. In addition, they must also have tried and tested run-books that clearly define responsibilities and accountabilities for each stage of a major incident's resolution.

Embedding continuous learning procedures can also enable firms to ask “How do we prevent this from happening again?” and with the right levels of performance and monitoring insights, they will also have the capacity to identify triggers for service disruptions. In short, firms need to expect that failures will happen and prepare for these disruptions accordingly. 

4. Ensure There is Board Level Visibility of Disruptions

Upon reconfirming how their business services are architected, firms should be able to tease out new system and business performance metrics, as well as customer preferences and consumption habits across their respective journeys. This may improve a firm's capacity to record and track key resilience indicators which can be used for board-level reporting. However, ensuring this information can be communicated, broken down, and understood easily is a key requirement in the midst of a major incident.

Financial organisations are often heavily siloed, and this can stifle the distribution of key information to material risk takers and board members. It is therefore imperative that firms have established clear and concise communication channels so that senior stakeholders are informed about service disruptions in a timely manner.

In addition, providing information that covers the following data points is crucial to ensuring board members are suitably informed about an outage. Key data points might include:

  • When did the incident start?

  • What systems and channels are impacted?

  • What is the severity and impact of the disruption on customers?

  • What does this disruption mean for the business?

  • What does this disruption mean for the market?

  • Who is owning the resolution process?

  • What is the planned pathway to resolution?

  • When did normal service get restored? 

  • How did normal service get restored?

  • What caused the disruption and how will we protect against it in the future?
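One way to keep these data points consistent from one incident to the next is to capture them in a simple structured record. The Python sketch below is illustrative only: the `IncidentSummary` class, its field names, and the example values are all hypothetical, chosen to mirror the questions above rather than any particular incident management tool.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class IncidentSummary:
    """Hypothetical board-level incident record mirroring the questions above."""

    started_at: datetime              # When did the incident start?
    impacted_systems: List[str]       # What systems and channels are impacted?
    customer_impact: str              # Severity and impact on customers
    business_impact: str              # What it means for the business
    market_impact: str                # What it means for the market
    resolution_owner: str             # Who is owning the resolution process?
    resolution_plan: str              # Planned pathway to resolution
    restored_at: Optional[datetime] = None    # When was normal service restored?
    restoration_method: Optional[str] = None  # How was normal service restored?
    root_cause: Optional[str] = None          # Cause and future protection

    def duration(self, now: datetime) -> timedelta:
        """Elapsed time of the disruption, up to restoration or up to `now`."""
        end = self.restored_at or now
        return end - self.started_at

    def open_questions(self) -> List[str]:
        """Names of the post-incident fields the board is still waiting on."""
        pending = {
            "restored_at": self.restored_at,
            "restoration_method": self.restoration_method,
            "root_cause": self.root_cause,
        }
        return [name for name, value in pending.items() if value is None]


# Example values are invented for illustration.
incident = IncidentSummary(
    started_at=datetime(2021, 3, 1, 9, 0),
    impacted_systems=["mobile app", "online banking"],
    customer_impact="Customers cannot log in to view balances",
    business_impact="Increased call centre volumes",
    market_impact="None identified so far",
    resolution_owner="Head of Digital Channels",
    resolution_plan="Fail over to secondary region",
)

print(incident.duration(now=datetime(2021, 3, 1, 9, 45)))
print(incident.open_questions())
```

The record separates what is known while the incident is live from what can only be answered afterwards, so a board briefing can show at a glance which questions remain open.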

New & Growing Challenges Firms Face, To Be Resilient for Their Customers

Earlier we discussed the complexity firms face when balancing the velocity of change with customer demands and regulatory needs. Furthermore, there are additional challenges that firms face when aiming to provide high levels of operational resilience to their customers. Three of these new and growing challenges are:

  1. The need to set impact tolerances.

  2. Managing 3rd Party Suppliers.

  3. Data Security.

That said, these challenges do bring new opportunities for firms. Let’s unpack each of these areas in more detail. 

1. The Need to Set Impact Tolerances

In their joint policy statement, the PRA and FCA stressed the need for firms to set impact tolerances for their important business services by March 2022. Namely, if an important business service is disrupted, how long can the firm tolerate the outage before intolerable harm is caused to customers and the wider financial market? Mapping business services end to end is a detailed and lengthy process, but it will certainly help board members and material risk-takers to consider their comfort level with the firm's resilience posture.
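As a rough illustration of how a time-based impact tolerance might be tracked during an outage, consider the minimal Python sketch below. The service names and tolerance values are hypothetical, and in practice a tolerance may be expressed in more than one metric; this sketch only covers elapsed time.

```python
from datetime import datetime, timedelta

# Hypothetical impact tolerances per important business service.
IMPACT_TOLERANCES = {
    "online_payments": timedelta(hours=2),
    "mortgage_applications": timedelta(hours=24),
}


def time_until_breach(service: str, outage_start: datetime, now: datetime) -> timedelta:
    """Time remaining before the outage exceeds the service's impact tolerance.

    A zero or negative result means the tolerance has already been breached.
    """
    tolerance = IMPACT_TOLERANCES[service]
    return tolerance - (now - outage_start)


def has_breached(service: str, outage_start: datetime, now: datetime) -> bool:
    """True once the outage has lasted as long as, or longer than, the tolerance."""
    return time_until_breach(service, outage_start, now) <= timedelta(0)


outage_start = datetime(2021, 3, 29, 9, 0)
now = datetime(2021, 3, 29, 10, 30)

print(time_until_breach("online_payments", outage_start, now))
print(has_breached("online_payments", outage_start, now))
```

Expressing tolerances as data rather than burying them in documents makes it easier to surface "time until breach" alongside incident updates for board members and material risk-takers.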

In mapping business services and setting impact tolerances, firms also have a real opportunity to hit the “Reset & Modernise” button. Historically, firms have tended to focus on individual assets or systems rather than the end-to-end customer experience. By flipping their focus to a horizontal, customer-centric perspective firms can start to ask whether the systems they have in place are architected sufficiently to deliver the best service to their customers. In addition, this surfaces opportunities to also ask:

  • When do we see the biggest spike in customer transactions and across which channels?

  • Does it make sense to simplify and streamline our channels for customers?

  • Do we need to explore new channels?

  • Are we spending too much money on certain aspects of the business service? 

  • Can those funds be reallocated for future investment initiatives? 

  • Are our suppliers meeting their service obligations?

  • Can we get these services cheaper elsewhere and pass on those savings to our customers?

2. Managing 3rd Party Suppliers

A business service can only be as strong as the weakest link in its technology supply chain. As such, there is an increasing need for firms to ensure they have the right level of governance and oversight of their third-party suppliers. Guidelines published by the FCA and the EBA have provided firms with a set of best practices to follow. Furthermore, as the financial sector consumes greater volumes of Software as a Service (SaaS), this introduces further risks around 4th party suppliers, which the 3rd party may be heavily reliant upon but with which the firm has very little, if any, interaction.

In addition, this growing reliance on 3rd parties introduces new concepts such as shared responsibility models. It is imperative that firms clearly document and demonstrate a clear understanding of what the firm will do, and what aspects belong to these third-party providers, in the event of an outage. This is particularly so as the lines of demarcation alter when different consumption models of cloud-based resources are applied across SaaS, IaaS and PaaS.

3. Data Security

Financial services firms, and banks in particular, face one of the most daunting tasks when it comes to the security of their customers' data. The emergence of regulatory policies such as GDPR makes this an increasingly burdensome task. However, it is a necessity for firms to maintain the integrity of their customers' data, and it is not uncommon for firms to apply a Recovery Point Objective (RPO) of 0 minutes for their important business services.

In the event of mission-critical data sets becoming compromised or corrupted, the impact could be seismic, not just on customers and the firm but also on the global economy.

Finally, data sprawl, replication, and migration are very real challenges for firms, particularly as 3rd and 4th party suppliers become more prevalent in their technology stacks. In the instance of migration, the intended destination is one of increased operational resilience. However, the journey can create more headaches for firms, as system failures or the loss of key customer data can often be experienced as part of modernisation efforts.

Increased Testing of Extreme, But Likely Scenarios

In a previous blog, we explained what an important business service is. We also covered 9 questions firms need to ask themselves as they build their respective scenario testing strategies. This is an area that has become of greater interest to regulators. As a result, more demands are being placed on firms to demonstrate that sufficient levels of testing exist at each stage of their software development life cycle.

However, this also expands into “What If?” analysis across their supply chains, their workforces (e.g. a global pandemic), and their technology systems. 

At the very least, firms need to grasp and report on:

  • What testing is being performed?

  • What is being tested and the frequency of the testing?

  • The coverage of the testing and the type of testing being executed.

  • What the testing has revealed and how any unexpected outcomes were identified.

  • The firm's plan to respond to, mitigate or nullify any unexpected outcomes from the testing.

 

In Closing:

It is evident that there is a long road ahead for firms as they build a deeper understanding of their operational resilience. Indeed, it will take time and significant effort to address operational resilience challenges for firms of varying scale & complexity. However, in a sector that is increasingly digital by default, experiencing downtime for a matter of seconds could have severe consequences for firms, customers, and the wider financial system.

Firms need to act swiftly by building common terminologies, frameworks, and responsibilities across the organisation so that a coherent operational resilience strategy can be formulated. This means securing board-level support and making operational resilience a strategic imperative for the organisation.
