TRELLIS should be robust at three distinct levels:
Within each application component, all possible errors should be detected, reported and corrected without causing the solution to fail. Every request for data access should be contained so that a failure affects only that request: the request fails and returns an appropriate error to the requestor, without bringing down the entire solution or impacting other requests or users.
Loose coupling and isolation of functionality within the solution improve overall quality, fault detection and correction.
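The containment described above can be illustrated with a minimal Python sketch. It is not taken from TRELLIS itself: the names `Result`, `contained` and `read_record` are hypothetical, and a plain dictionary stands in for whatever data store a real component would use. The idea is that each request runs inside a wrapper that turns any exception into an error result for that request only.

```python
import logging
from dataclasses import dataclass
from typing import Any, Callable, Optional

logger = logging.getLogger("trellis.requests")  # hypothetical logger name


@dataclass
class Result:
    """Outcome of one request: either a value or an error message."""
    ok: bool
    value: Any = None
    error: Optional[str] = None


def contained(handler: Callable[..., Any]) -> Callable[..., Result]:
    """Wrap a handler so that any exception becomes an error Result."""
    def wrapper(*args: Any, **kwargs: Any) -> Result:
        try:
            return Result(ok=True, value=handler(*args, **kwargs))
        except Exception as exc:
            # Detect and report the error, then contain it to this request.
            logger.exception("request failed: %s", handler.__name__)
            return Result(ok=False, error=f"{type(exc).__name__}: {exc}")
    return wrapper


@contained
def read_record(store: dict, key: str) -> Any:
    """Hypothetical data access request; raises KeyError if key is absent."""
    return store[key]


store = {"a": 1}
# The second request fails, but the first and third are unaffected.
results = [read_record(store, k) for k in ("a", "missing", "a")]
```

In this sketch the failing request returns an error to its caller while the requests before and after it complete normally, which is the behavior the requirement asks for.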
Each physical component of the solution should be deployed in a redundant configuration with multiple paths and devices configured in parallel. Where possible, equipment and servers should be virtualized across multiple hosts. Physical and virtual servers should be load balanced, clustered or replicated.
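As a rough illustration of how a caller can benefit from components deployed in parallel, the Python sketch below tries redundant replicas in order and uses the first one that responds. The names `call_with_failover`, `AllReplicasFailed`, `primary` and `secondary` are assumptions made for this example; in practice this role is usually played by a load balancer or cluster manager rather than application code.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


class AllReplicasFailed(Exception):
    """Raised when no redundant replica could serve the call."""


def call_with_failover(replicas: Sequence[Callable[[], T]]) -> T:
    """Try each replica in order and return the first successful result."""
    errors: List[Exception] = []
    for replica in replicas:
        try:
            return replica()
        except Exception as exc:
            # Record the failure and fall through to the next replica.
            errors.append(exc)
    raise AllReplicasFailed(f"{len(errors)} replica(s) failed")


def primary() -> str:
    """Hypothetical replica that simulates an unreachable host."""
    raise ConnectionError("primary host unreachable")


def secondary() -> str:
    """Hypothetical replica that is healthy."""
    return "served by secondary"


answer = call_with_failover([primary, secondary])
```

The call succeeds even though the first replica is down; only when every replica fails does the caller see an error.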
TRELLIS solutions should have a documented Disaster Recovery Plan (DRP) to be invoked in the event of a widespread component failure (e.g., the loss of a data center). The DRP should cover: