INFRA-14 – Fault Tolerance

What is this control really about?

The Fault Tolerance control confirms your organization’s ability to continue operating despite failures or malfunctions. The focus of this control is on your production environment and critical storage systems: are they built for redundancy and high availability?

Fault tolerance can include:

  • Failover, High Availability – to prevent requests from being sent to non-operable servers. This requires effective failure detection.
  • Load Balancing – to distribute requests across multiple servers.
  • Reducing Overload of Individual Nodes – an important effect of good load balancing, but not a guaranteed one. For example, strict round-robin balancing can still send requests to a system that is already overloaded.
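The difference between strict round-robin and a health-aware, load-aware choice can be sketched in a few lines. This is only an illustrative model, not how any particular load balancer is implemented: the `Server` class, its `healthy` and `active_requests` fields, and both selection functions are hypothetical names introduced here.

```python
from dataclasses import dataclass
from itertools import cycle
from typing import Iterator, List


@dataclass
class Server:
    name: str
    healthy: bool = True       # outcome of the latest health check (assumed to exist)
    active_requests: int = 0   # current load on this node


def round_robin(servers: List[Server]) -> Iterator[Server]:
    """Strict round-robin: rotate through all servers, ignoring health and load."""
    return cycle(servers)


def pick_least_loaded(servers: List[Server]) -> Server:
    """Skip servers that failed their health check, then prefer the least busy one."""
    candidates = [s for s in servers if s.healthy]
    if not candidates:
        raise RuntimeError("no healthy servers available")
    return min(candidates, key=lambda s: s.active_requests)


servers = [
    Server("a", active_requests=2),
    Server("b", healthy=False),        # failed node
    Server("c", active_requests=50),   # overloaded node
]

rr = round_robin(servers)
rr_choices = [next(rr).name for _ in range(3)]   # visits every node, including "b" and "c"
smart_choice = pick_least_loaded(servers).name   # avoids the failed and the overloaded node
```

In this toy setup, round-robin still routes traffic to the failed node and the overloaded node, while the health-aware selection does not. Managed load balancers typically expose comparable behavior through health check and routing algorithm settings rather than code.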

There is no requirement regarding the method or type of mechanism used.

Some organizations have a high availability or redundancy mechanism in place instead, and that is also fine. You are free to edit the TrustCloud control to describe your specific methodology.

As long as your organization has a mechanism to continue to function as usual despite failures, this control can be met.

Available tools in the marketplace

The following listing is “crowdsourced” from our customer base or from external research. TrustCloud does not recommend any specific tool below, because we have not used them ourselves.

  • Azure Load Balancer
  • AWS Elastic Load Balancing
  • GCP Cloud Load Balancing

Available templates

  • No templates for this section

What is required to implement this control?

Define your strategy for fault tolerance. Are you focusing on redundancy, high availability, or fault tolerance? Though similar, they differ:

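Redundancy: duplicate components, for example two servers with duplicated or mirrored data, so that one can take over if the other fails

High Availability: servers achieve maximum uptime by removing all single points of failure

Fault Tolerance: the system keeps operating, possibly with limited functionality, in the event of a failure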
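To make the distinction more concrete, here is a minimal sketch of a read path that combines the ideas: it tries redundant replicas in turn (failover) and, if all of them are down, falls back to possibly outdated cached data (limited functionality). All names (`read_with_fallback`, `failing_primary`, `mirror`, the `cache` dictionary) are hypothetical and exist only for illustration.

```python
from typing import Callable, Dict, List

Reader = Callable[[str], str]


def read_with_fallback(key: str, replicas: List[Reader], cache: Dict[str, str]) -> str:
    """Try each redundant replica in order; degrade to cached data if all fail."""
    for read in replicas:
        try:
            return read(key)            # first replica that answers wins
        except ConnectionError:
            continue                    # this replica is down, try the next one
    if key in cache:
        return cache[key] + " (stale)"  # limited functionality: possibly outdated data
    raise RuntimeError(f"no replica or cached value available for {key!r}")


def failing_primary(key: str) -> str:
    raise ConnectionError("primary is down")   # simulated outage


def mirror(key: str) -> str:
    return {"plan": "gold"}[key]                # mirrored copy of the data


cache = {"plan": "silver"}

# The mirror takes over when the primary fails (redundancy plus failover).
served_by_mirror = read_with_fallback("plan", [failing_primary, mirror], cache)

# With no working replica, the caller still gets a degraded answer.
served_degraded = read_with_fallback("plan", [failing_primary], cache)
```

Which of these behaviors you need is a design decision: a mirrored replica keeps answers current, whereas a cached fallback keeps the service responding at the cost of freshness.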

Once you have identified your strategy, acquire the tooling needed to implement it. If you are using a cloud provider’s offering, many configuration guides are available.

What evidence is the auditor looking for?

  • Screenshot of the failover, redundancy or high-availability configuration

An example of what an artifact can look like

  1. Screenshot of the failover, redundancy or high-availability configuration

This example is for AWS. In your own environment, take a screenshot showing that some type of fault tolerance mechanism is enabled and how it is configured.

[Image: Fault Tolerance control]
