What’s Wrong with the Fault-Tree Linking Approach for Complex PRA Models?

By Steven A. Epstein and Donald Wakefield

Executive Summary

Analysts building complex probabilistic risk assessment (PRA) models for nuclear power plants can choose between two approaches: Fault Tree Linking (FTL) and Large Event Tree Linking (ETL). The choice usually depends on history (which method was used by the analysts who first developed the specific PRA model) or economics (e.g., which method is used by the majority of PRAs in a utility merger). However, users of either method should be aware of the problems and limitations of their approach. In particular, the current generation of PRA analysts and users needs to address problems that a previous generation did not face, because current models are growing more complex and are routinely applied to a far wider range of applications.

The authors of this article clearly have a bias toward, as well as extensive experience with, the ETL method. A number of tools have been built into the RISKMAN® code, which implements the ETL method, to solve problems that still remain in FTL. This paper discusses a number of key problems that illustrate “What’s Wrong with the FTL Approach?” It summarizes each problem, estimates its significance, and provides references where some of these problems have been examined in greater detail. Note that for some of these problems the quantitative level of significance can be estimated, while for others it is unknowable with the current approach.

Since these problems may impact important risk-informed decisions, the PRA analyst is obligated to examine these problems and identify the extent to which they restrict his or her model.

1986 – USA Today Article

The fallout from the explosion at the Chernobyl nuclear power plant has extended all the way to San Diego.

That’s where Steven Epstein works as a computer programmer for Management Analysis Co., a consulting firm for the nuclear power industry. There, the 38-year-old former rabbinical student designed a computer program that simulates nuclear plants in trouble.

Scans of the original article are reproduced below:

RISKMAN – 20 Years After

RISKMAN®, Celebrating 20+ Years of Excellence!
Mr. Donald Wakefield
Mr. Steven Epstein
Dr. Yongjie Xiong
Mr. Kamyar Nouri
ABS Consulting, 300 Commerce Drive, Suite 200, Irvine, CA 92602-1300, USA

Abstract: RISKMAN® is a PC-based, general purpose, integrated tool for quantitative risk analysis. Initiated with software programs first developed for mainframes, and with development supported by a users’ group spanning three continents, the PC version of RISKMAN® now celebrates more than 20 years of risk-based applications. While mostly used in the nuclear power industry and related government organizations, RISKMAN® is also used in the offshore oil industry, the marine industry, aerospace, and specialty applications such as assessing the risks associated with the excavation and destruction of abandoned chemical weapons.

Quantification of Fault Tree Models for Initiating Events

Donald J. Wakefield and Steven A. Epstein
ABSG Consulting Inc.
Irvine, CA, USA

INTRODUCTION

It has become common practice throughout the industry for PSA analysts to model multi-train support systems using fault trees. Accepted modeling and quantification procedures for doing so, however, are not available. The following is a list of possible issues in such system initiator models and their quantification:

  1. The desire to minimize the differences between the system fault tree constructed to evaluate the conditional failure probability of a system in response to some other initiator and the fault tree for the same system used to compute initiating event failure frequencies.
  2. The difficulties in constructing a fault tree to account for failure combinations within the system failure occurrence fault tree when separate events are used to represent the event occurrence rate versus the component unavailability (i.e., alternate mission times) for the same component failure mode.
  3. The appropriate accounting of all initial operating configurations of the system (i.e., alignments), especially how the initial alignment changes the equipment assumed to be normally operating. For example, when the normally operating pumps are rotated, multiple initial system alignments are often needed for time-averaged models.
  4. The proper identification of the normally operating failure modes from the complete list of basic events appearing in the system fault tree, when not all basic events that represent failures to operate involve normally operating equipment. For example, a standby pump must first start in response to the failure of other operating equipment, and it too may then fail to operate after successfully starting.
  5. The potential for excessive truncation of low probability system failure combinations during fault tree logic reduction.
  6. The appropriate use of different mission times for occurrence rates and component unavailabilities for the same component failure mode; e.g., pump failures to run.
  7. The need to account for the different restoration times of failed components when considering other component failures in the same system failure combination.
  8. The degree to which the fault tree quantification adequately approximates the Markov model solutions which account for repair assuming constant repair rates.
  9. The need to incorporate the importance of basic events leading to system failure occurrence frequencies so that their contribution to the core damage frequency can be determined.
  10. The difficulty in computing and combining basic event importance measures when the same component failure mode may involve different mission times; e.g., for failures per year and for conditional failure probabilities prior to restoration of the first equipment failed.
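Issue 8 can be made concrete with a small comparison. The sketch below is not the authors' method; it assumes a hypothetical system of two identical, normally operating trains, each with a constant failure rate `lam` and a constant repair rate `mu`, where the system fails only when both trains are down at once. The function names and numerical values are illustrative assumptions. The Markov result is the reciprocal of the mean time to first system failure for this two-train model, and the fault tree estimate multiplies one train's occurrence rate by the other train's chance of failing during the mean restoration time.

```python
def markov_failure_frequency(lam: float, mu: float) -> float:
    """System failure frequency from a three-state Markov model.

    States: 0 trains failed, 1 train failed, 2 trains failed (system failure).
    Transitions: 0 -> 1 at rate 2*lam, 1 -> 0 at rate mu, 1 -> 2 at rate lam.
    Mean time to reach state 2 from state 0 is (3*lam + mu) / (2*lam**2),
    so the frequency of system failure is its reciprocal.
    """
    return 2.0 * lam**2 / (3.0 * lam + mu)


def fault_tree_failure_frequency(lam: float, mu: float) -> float:
    """Cutset-style approximation for the same hypothetical system.

    Two symmetric cutsets: one train fails (occurrence rate lam) and the
    other fails within the mean restoration time 1/mu (probability lam/mu).
    """
    return 2.0 * lam * (lam / mu)


# Illustrative values only (assumptions, not data from the paper).
lam = 1.0e-4       # failures per hour for each train
mu = 1.0 / 24.0    # repairs per hour (24-hour mean restoration time)

markov = markov_failure_frequency(lam, mu)
fault_tree = fault_tree_failure_frequency(lam, mu)
overestimate = fault_tree / markov   # algebraically equal to 1 + 3*lam/mu

print(f"Markov:     {markov:.4e} per hour")
print(f"Fault tree: {fault_tree:.4e} per hour")
print(f"Ratio:      {overestimate:.4f}")
```

With these assumed values the fault tree estimate exceeds the Markov result by about 0.7 percent. The gap grows as the failure rate approaches the repair rate, which is when the adequacy of the approximation deserves a closer look.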

Many of the above issues are not applicable to system initiators that involve single train systems in which all components are normally operating. The resulting single element cutsets for single train, normally operating systems are easily quantified using standard fault tree techniques by replacing the event unavailabilities with event occurrence rates, as suggested in Reference 1. Similarly, support systems that involve redundancy but in which only one train is normally operating can also be easily modeled and quantified using the same technique of replacing the event unavailabilities with event occurrence frequencies for the normally operating failure modes. In this case, occurrence frequencies are substituted for the basic events representing the normally operating train, and after Boolean reduction exactly one such occurrence frequency appears in each minimal cutset.
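The substitution described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the procedure from Reference 1: the event names, rates, and unavailabilities are hypothetical, the cutsets are taken as already minimal, and the total uses the rare-event approximation (a simple sum over cutsets). Each cutset must contain exactly one normally operating failure mode, which is quantified as an occurrence frequency; every other event keeps its unavailability.

```python
from math import prod

HOURS_PER_YEAR = 8760.0


def cutset_frequency(cutset, operating_rate_per_hour, unavailability):
    """Frequency (per year) of one minimal cutset.

    Events listed in operating_rate_per_hour are normally operating failure
    modes and are quantified as occurrence frequencies. All other events are
    quantified with their unavailabilities.
    """
    operating = [e for e in cutset if e in operating_rate_per_hour]
    if len(operating) != 1:
        raise ValueError(
            f"expected exactly one normally operating event, got {operating}"
        )
    frequency = operating_rate_per_hour[operating[0]] * HOURS_PER_YEAR
    others = [unavailability[e] for e in cutset if e not in operating_rate_per_hour]
    return frequency * prod(others)  # prod([]) is 1 for single element cutsets


# Hypothetical two-pump system: pump A runs, pump B is in standby.
operating_rate_per_hour = {"PUMP_A_RUN": 1.0e-5}
unavailability = {
    "PUMP_B_START": 1.0e-3,   # standby pump fails to start on demand
    "PUMP_B_RUN": 2.4e-4,     # standby pump fails to run before A is restored
}
cutsets = [
    ("PUMP_A_RUN", "PUMP_B_START"),
    ("PUMP_A_RUN", "PUMP_B_RUN"),
]

# Rare-event approximation: sum the cutset frequencies.
initiator_frequency = sum(
    cutset_frequency(c, operating_rate_per_hour, unavailability) for c in cutsets
)
print(f"Initiating event frequency: {initiator_frequency:.3e} per year")
```

With these assumed inputs the result is roughly 1.1e-4 per year. The explicit check for exactly one operating event per cutset mirrors the condition stated in the text; a system with multiple normally operating trains would violate it, which is the more general case the listed issues concern.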

The issues enumerated above concern the more general problem: a system with multiple normally operating trains. The proposed approach that follows addresses each of the issues listed above.