Optimal Design of Fault-Tolerant Distributed Systems Based on a Recursive Algorithm

Hoang Pham, Shambhu J. Upadhyaya

Research output: Contribution to journal › Article › peer-review

4 Scopus citations

Abstract

This paper addresses the optimal design (in terms of the number of processors) of a distributed system based on a recursive algorithm for fault tolerance (RAFT). The reliability and performance of the system using RAFT are determined as a function of the reliability of individual processors and the number of fault modes in a processor. Also discussed is how to determine the design policies when the objective is to minimize the average system cost, given the cost of each processor and the cost of system failure. Several numerical examples illustrate the results.

Reader Aids
Purpose: Widen state of the art
Special math needed for explanations: Probability theory
Special math needed to use results: None
Results useful to: Reliability analysts

Original language: English (US)
Pages (from-to): 375-379
Number of pages: 5
Journal: IEEE Transactions on Reliability
Volume: 40
Issue number: 3
State: Published - Aug 1991
Externally published: Yes

All Science Journal Classification (ASJC) codes

  • Safety, Risk, Reliability and Quality
  • Electrical and Electronic Engineering

Keywords

  • Distributed system
  • Fault tolerance
  • Optimization
  • System reliability
