Rethinking Lab Reliability
April 01, 2002
Appeared in R&D: Lab Design Handbook
The nation's deadly encounter with anthrax last fall was a compelling admonition about the need for greater public health resources and defenses against bioterrorism. The incident brought with it a subtle warning of greater and perhaps more immediate relevance for public and private laboratories everywhere.
Within the storm of news coverage accompanying the mounting anthrax scare came this quietly disturbing item, as reported Oct. 14, 2001, in the New York Times: "The laboratories of the federal Centers for Disease Control in Atlanta were shut down for 16 hours on Wednesday night and Thursday morning when a cable short-circuited, causing a power failure, just as scientists were trying to verify an anthrax infection in a New York patient, an employee of NBC News."
Research laboratories are often among the most carefully engineered and meticulously managed facilities of any kind. They are classified and codified to meet exacting standards for cleanliness, pressurization, safety, and good practices. The most sophisticated laboratory is built with a keen understanding of the processes and controls needed to protect the integrity of the science and the viability of the work produced within its walls. Yet the events of last year suggest that we need to think more broadly about what constitutes failsafe lab functioning.
Critical laboratories must be made more reliable, assured of uninterruptible operations in the face of unforeseen events. We've already seen that, in its worst case, lab downtime can threaten human life and public safety. In any case, unplanned downtime can threaten the livelihood of a research enterprise or institution.
Stated as such, the problem may seem daunting. The solution is not. It simply requires a different way of thinking about laboratory infrastructure. It requires a system of defining and designing reliability much as we already do for safety, cleanliness, and other laboratory practices. Fortunately, we can get a jump on the problem by looking at, and learning from, other industries.
Predicting the Preventable
Imagine that a power outage shuts down a cold box, or lab refrigerator, storing a new compound that has taken five years and $5 million in research costs to produce. How many minutes of warming can you withstand before incurring an irretrievable loss?
Now assume that a power or mechanical outage could compromise the lab pressurization systems needed to contain a deadly biohazard. What training, maintenance, and backup systems do you have in place to guarantee that it could never happen?
Worst-case scenarios easily come to mind, but the point is not to needlessly conjure up imaginary catastrophes. The point is to prepare for what we know might happen, by methodically predicting, assessing, and designing against the preventable loss, efficiently and affordably.
We use benchmarks to guide this process. Just as there are benchmarks for cleanliness or biosafety levels in the laboratory realm, there are benchmarks established in other industries for reliability tolerance. These tools, born amid the extreme data-dependence of the emergent online financial world, have new significance and usefulness for lab design and operation.
How much downtime can you afford? Building a more reliable lab begins with this deceptively simple question--deceptive, because few companies could be expected to answer it unaided. Some amount of downtime is unavoidable, and planned downtime is even desirable to allow necessary equipment maintenance, upgrades, and replacement. While a facility manager may naively aim for 100% "uptime" in lab operation, that target is too expensive and aggressive to attain through engineering and backup equipment. How close to 100% can you reasonably and affordably reach? How close do you need to be?
Those answers lie in the processes of reliability auditing and benchmarking, services in which our firm specializes. Benchmarking takes the industry standards for reliability, which define degrees of near-perfect uptime, and translates them into the seconds, minutes, or hours of acceptable downtime per year. As shown in the accompanying table, targets can range from 99.9999% reliability (amounting to less than 1 min of downtime per year) to 99% (allowing almost 90 hr of downtime per year).
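The arithmetic behind those figures is simple enough to check by hand or in a few lines of code. The sketch below is illustrative only: it assumes a 365-day year of continuous operation, and the function names and the intermediate reliability levels in the loop are ours, not part of any published benchmark.

```python
# Minimal sketch: convert a reliability target (percent uptime) into
# allowable downtime per year. Assumes a 365-day year of 24-hour operation.
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def annual_downtime_seconds(reliability_percent: float) -> float:
    """Return the seconds of downtime per year implied by an uptime percentage."""
    if not 0.0 <= reliability_percent <= 100.0:
        raise ValueError("reliability_percent must be between 0 and 100")
    return SECONDS_PER_YEAR * (100.0 - reliability_percent) / 100.0


def describe(seconds: float) -> str:
    """Format a duration in the largest convenient unit (s, min, or hr)."""
    if seconds < 60:
        return f"{seconds:.1f} s"
    if seconds < 3600:
        return f"{seconds / 60:.1f} min"
    return f"{seconds / 3600:.1f} hr"


if __name__ == "__main__":
    # The two endpoints come from the article; the levels between them
    # are hypothetical, added only to show how the scale behaves.
    for target in (99.9999, 99.999, 99.99, 99.9, 99.0):
        print(f"{target}% uptime -> {describe(annual_downtime_seconds(target))} per year")
```

Under these assumptions, 99.9999% works out to roughly 31.5 seconds of downtime per year and 99% to 87.6 hours, consistent with the range cited above.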
Your degree of downtime tolerance depends on the costs--in capital, time, and human safety--of those precious minutes, and varies entirely by the type of lab and its work product. It's easy to assume that the Centers for Disease Control, for instance, no longer finds 16 hr of downtime in its diagnostic labs to be acceptable, although another lab well might.
Once you've reasoned out, and set, your target reliability, your lab design can be fitted to that target by integrating system redundancies; backup and uninterruptible power supplies; information technology safeguards; security; fire and life safety equipment; and (most often overlooked) procedures for operations, maintenance, monitoring, controls and disaster recovery. Any of these measures might already be incorporated ad hoc into a sophisticated laboratory facility, but only a systematic approach can build reliability into a calculable business asset--one that offsets the hidden risks of serious business interruption that are otherwise present.
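To see why redundancy figures so prominently among these measures, consider a rough calculation. The sketch below uses the textbook formula for components in parallel, where the system stays up as long as any one unit is running. It assumes that units fail independently of one another, which real installations sharing a cable, a room, or a maintenance crew may not satisfy, so treat the results as an upper bound. The function names and sample numbers are hypothetical.

```python
# Minimal sketch: availability of N redundant units in parallel, assuming
# independent failures (an idealization; shared causes reduce the benefit).


def parallel_availability(unit_availability: float, units: int) -> float:
    """Availability of a system that is up if at least one of `units` is up."""
    if not 0.0 <= unit_availability <= 1.0:
        raise ValueError("unit_availability must be between 0 and 1")
    if units < 1:
        raise ValueError("units must be at least 1")
    # The system fails only when every unit is down at the same time.
    return 1.0 - (1.0 - unit_availability) ** units


def units_needed(unit_availability: float, target: float, max_units: int = 10) -> int:
    """Smallest number of redundant units that meets `target`, up to `max_units`."""
    for n in range(1, max_units + 1):
        if parallel_availability(unit_availability, n) >= target:
            return n
    raise ValueError("target not reachable within max_units")


if __name__ == "__main__":
    # Hypothetical example: a power source that is available 99% of the time.
    for n in (1, 2, 3):
        print(f"{n} unit(s): {parallel_availability(0.99, n):.6f}")
    print("units for a 99.999% target:", units_needed(0.99, 0.99999))
```

The calculation covers equipment only. It says nothing about the procedures for operations, maintenance, and monitoring, which no amount of duplicate hardware can replace.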
All labs are designed to specific criteria, and sophisticated laboratories are designed to a hierarchy of objectives. Those objectives would likely be, in priority sequence: ensuring human safety; exceeding codes or guidelines; ensuring comfort in use; and conserving energy. To those objectives, which are themselves the defining characteristics of the leading-edge laboratory, our firm is now adding a fifth: reliability.
The lab of the future must be engineered as reliable, or the lessons of today will be lost.
Donald Procz is senior VP and a leader of the Science and Technology practice for Syska Hennessy Group (www.syska.com). With primary locations in New York and Los Angeles, the firm provides consulting, engineering, technology, and construction services from 11 offices nationwide, serving an international roster of clients.