Introduction to Functional Safety

Author: HIMA Paul Hildebrandt GmbH

Published: 01.11.2023


Introduction

Today’s industrialized cities are convenient and safe to live in because of advanced appliances, machinery, and systems. Millions of citizens can co-exist comfortably in a technologically-enabled world that has been simplified by equipment and devices, both large and small. Whether the task involves preparing breakfast, travelling through busy cities, or occupying buildings, it can be easy to take for granted the role of modern technology in making our lives safer.

How many of us stop to consider what would happen if the office shredding machine, public train, or building elevator malfunctioned dangerously? When we stop to think about it, we realize that if this equipment did not work safely and optimally, we would notice any adverse effects very quickly.

Everyday equipment, if not fitted with the right safeguards, can be harmful to people, property, and our environment. Even the most common appliance, such as a toaster, can pose a danger when it fails. As we navigate our cities in a safe manner, we should not forget that for the most part, systems and devices all around us are safe by design.

Equipment can cause serious harm to humans if it malfunctions or if it is not operated correctly. To avoid endangering people, these items must contain built-in safety systems to protect us. Today, almost every type of machine or piece of equipment is designed and manufactured with safety in mind, considering what is known as Functional Safety.

In this introduction, we will delve deeper into what Functional Safety is and learn how its principles are used to keep us safe. We will discuss key terms that are used to arrive at a commonly-accepted understanding of Functional Safety. This introduction uses practical examples from everyday life to show how widespread and essential Functional Safety principles are to modern societies.

1 RAMS

RAMS is an acronym that stands for reliability, availability, maintainability, and safety. RAMS is related to dependability and is a central term in many different application areas, including the automotive, railway, factory, and process industries.

It is applied to equipment and machinery as a systematic principle to make sure that every component functions well to make the equipment work as intended. As our understanding of Functional Safety increases, there are reminders everywhere that the field relies on universal principles and terms to ensure common practice. We will discuss these terms in more detail in future sections.

For now, we will provide an explanation of the RAMS principle in detail as its acronym defines important quality features of systems used in many application areas. For the layperson, there appears to not be much that differentiates the terms in the RAMS principle. However, there are key differences that are important to know.

1.1 Reliability

Reliability is defined as the system’s ability to perform a specific function within a specific amount of time under specific conditions.

Just by looking at the definition, it is clear that several important terms are used: system, specific function, time, and conditions. Specific in this context means that there are specifications. In practice, such specifications are written and managed according to subject area rules. For instance, product specifics are defined as part of product standards. A specific function has thus been defined in a corresponding process. The same is true for the required time and other boundary conditions.

Example: Turning on the light

The system is composed of the switch, the wires, and the lamp. It operates reliably if the lamp starts to glow within 10 milliseconds after pressing the switch. The glow would be the specified function and the boundary conditions would include the power supply and the switch-on delay of the lamp.

Figure 1: If a light bulb that is plugged into a power source illuminates when the switch is pressed, it is considered to be reliable. 
Source:   Adobe Stock
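
As a minimal sketch of this definition, the light example can be written as a check against its specification. The names SwitchTrial, performed_as_specified, and observed_reliability are illustrative assumptions, and leaving out trials that fall outside the boundary conditions is only one possible reading of "under specific conditions".

```python
from dataclasses import dataclass
from typing import Iterable, Optional

MAX_GLOW_DELAY_MS = 10.0  # specified time limit taken from the example


@dataclass(frozen=True)
class SwitchTrial:
    """One press of the switch (hypothetical record type)."""
    power_supply_ok: bool           # boundary condition: power was present
    glow_delay_ms: Optional[float]  # None means the lamp never lit


def performed_as_specified(trial: SwitchTrial) -> Optional[bool]:
    """Return True/False inside the boundary conditions, None outside them."""
    if not trial.power_supply_ok:
        return None  # outside the specified conditions, so not judged
    return (trial.glow_delay_ms is not None
            and trial.glow_delay_ms <= MAX_GLOW_DELAY_MS)


def observed_reliability(trials: Iterable[SwitchTrial]) -> float:
    """Fraction of judged trials in which the lamp lit within the time limit."""
    judged = [r for r in map(performed_as_specified, trials) if r is not None]
    if not judged:
        raise ValueError("no trial was run inside the boundary conditions")
    return sum(judged) / len(judged)
```

A trial without power is not counted as a failure of the lamp system here, because the specified boundary conditions were not met.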

1.2 Availability

Availability is the probability that the system operates properly whenever its use is called upon. An essential concept in this definition is the notion of probability. As the system exists in the real world, no absolute quantities are given. That’s why it is also true that the availability of a real system is always less than 100%. Two other important concepts related to availability are a system's proper working order and the boundary conditions. In the realm of Functional Safety, availability is crucial because the profitability of a machine or an entire production plant depends on it.

Example:    Driving a car

We cannot drive a car on an empty tank. The same is true if the tank is full, but the ignition key is missing.

Figure 2:  Fuel is a necessary condition for a car to be available whenever its use is called upon. 
Source:    Adobe Stock
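
One common way to put a number on availability, which this introduction does not itself prescribe, is the steady-state ratio of uptime to total time. The sketch below assumes that a mean time between failures (MTBF) and a mean time to repair (MTTR) are known. The function name and the figures used with it are hypothetical.

```python
def steady_state_availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Estimate availability as uptime / (uptime + downtime).

    mtbf_hours: assumed mean operating time between failures.
    mttr_hours: assumed mean time needed to restore the system.
    """
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("MTBF must be positive and MTTR must not be negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)
```

For example, a machine that runs 990 hours between failures and needs 10 hours per repair would be available about 99% of the time under this model. Any non-zero repair time keeps the result below 100%, which matches the statement above about real systems.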

1.3 Maintainability

Maintainability is a system’s ability to be repaired or maintained. If we think about consumer goods, it could be tempting to assume that they are already designed to just survive the warranty period, so maintainability is not a concern. In the industrial environment, however, the maintainability of a system is far more significant. In fact, not only is the overall sustainability of a system critical, but so are considerations about the costs associated with a potential system failure and the risks of a dangerous accident.

Example:    Driving a car

One obvious way to avoid damaged brakes is to periodically replace the brake pads. The car’s manual will normally recommend the frequency at which brake pads must be changed. If a driver does not perform this maintenance action, they will likely have to replace the brake pads and the brake discs in the end. This means that regular repairs have an impact on the variable cost of running a machine.

Figure 3: The ability to replace worn brake pads at recommended intervals is a crucial part of a car’s maintainability
Source:   Adobe Stock

1.4 Safety

Safety is a system property that indicates the extent to which a system, in the event of failure, can cause harm to people, the environment, or other property.

The definition of safety focuses on preventing harm to people, the environment, or other assets. We will discuss this more in the next section. A system's safety describes the probability that, in the event of a failure, people will be injured.

Example:    Riding an elevator
Imagine an elevator in a regular building. If the doors to the elevator open and the elevator is not there, a distracted person could fall into the shaft. This clearly presents a hazard, meaning the system is unsafe.

Figure 4: An open elevator shaft presents an unsafe situation for regular users.
Source:  Adobe Stock

From the definitions listed above, we learn that Functional Safety is perfectly applicable to the RAMS principle. If engineers and product developers integrate reliability, availability, maintainability, and safety into every design, implementation, and operational element, then there is likely to be reduced risk of danger.

Consider this: the aviation industry uses RAMS extensively to ensure that flight systems are reliable and safe. The principle applies to every part of the plane, not least to the engine, avionics, and flight control systems. Original equipment manufacturers such as aircraft manufacturers must apply RAMS, as must every downstream supplier making constituent parts.

One good example of a RAMS-designed product in aviation is the flight control system. Naturally, this system is mission-critical to the plane and cannot go down, which is why modern aircraft have two or more independent flight control systems that can be interchanged if one system fails. Each system comes complete with its own sensors and processors and is completely independent of the other system.

We have already seen that risk is a major theme in Functional Safety. Reducing it to tolerable levels is paramount. This is why so much effort is put into ensuring that there are universal practices to monitor and measure risk.

Functional Safety experts use various methods to monitor and assess risks. This risk assessment begins at the earliest design stages of a product or system. In some cases, these methods are used to conduct post-mortems of why systems have failed. There are different methods used to analyze and assess hazards and risks. Some of these include:

  • Hazard and operability analysis (HAZOP) is the method most used in the process industry
  • Failure mode and effects analysis (FMEA) is used in a number of industries including automotive, aerospace, medical, mechanical, and electrical engineering
  • Probabilistic risk assessment (PRA) is a methodology for evaluating risks associated with complex technical systems, employed in multiple industries including healthcare
  • Layer of protection analysis (LOPA) is a quantitative instrument for analyzing and assessing risks in scenarios with major adverse consequences. It is commonly used in the process industry
  • Fault tree analysis (FTA) is used primarily in safety and reliability engineering to understand how the system can fail, determine the best path to reduce risk, and calculate event probabilities
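
To give a flavour of the calculations involved, the sketch below shows how a fault tree can combine basic event probabilities through AND and OR gates. It assumes statistically independent events. The function names, the example tree, and all probabilities are made-up illustrations and do not come from any standard.

```python
from math import prod
from typing import Iterable, List


def _checked(probabilities: Iterable[float]) -> List[float]:
    """Validate that every input is a probability in [0, 1]."""
    values = list(probabilities)
    if not values or any(not 0.0 <= p <= 1.0 for p in values):
        raise ValueError("need at least one probability in [0, 1]")
    return values


def and_gate(probabilities: Iterable[float]) -> float:
    """Output event occurs only if ALL independent input events occur."""
    return prod(_checked(probabilities))


def or_gate(probabilities: Iterable[float]) -> float:
    """Output event occurs if ANY independent input event occurs."""
    return 1.0 - prod(1.0 - p for p in _checked(probabilities))


# Hypothetical tree: the top event occurs if both redundant sensors fail,
# or if the single controller fails.
both_sensors_fail = and_gate([0.01, 0.01])
top_event = or_gate([both_sensors_fail, 0.001])
```

In this invented example the single controller dominates the result, which is the kind of insight a fault tree is meant to expose.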

These methods often use advanced analytics and calculations and are an area that requires deep technical knowledge and expertise. We will not discuss these methods in this introduction as they will be covered in future sections. For now, we will provide an important definition of what Functional Safety actually is.

2 What is Functional Safety?

Functional Safety is a field of application concerned with the idea that systems and devices must operate safely and reliably. Every item has an intended purpose, whether that is a coffee machine designed to brew a hot beverage or a train to transport passengers across a great distance. These machines work as intended most of the time, but there is always a risk that if they fail, they can cause harm to people, property, or the environment.

Functional Safety is the field of knowledge that systematizes the principles required to ensure that equipment is designed and manufactured to operate in the manner it is intended to, without posing a risk of damage or injury to the user.

Safety, as a broad field of application, covers many areas, of which Functional Safety is only one. There are many different types of safety, including electrical safety, fire safety, and occupational safety, among others.

Figure 5: Some of the main safety areas, one of which is Functional Safety
Source:   HIMA Paul Hildebrandt GmbH

Functional Safety is concerned with making sure that machinery and devices work correctly in response to commands they receive, and crucially, what fail-safes are activated should they malfunction. Every piece of equipment has the potential to be a hazard through incorrect deployment or through a fault. For example, defective electronics in any home appliance can pose serious risks to users, ranging from electrical shocks to fires and other hazards.

Functional Safety designers and engineers work hard to identify potentially dangerous conditions and reasons for malfunction within systems and equipment. They search for situations or events that could result in an accident, and they implement measures to contain them. Their work is mainly concerned with incorporating safety measures in equipment from the design stage, so that the equipment works better during operation.

The process areas in which most safety incidents occur are shown in the following graphic:

Figure 6:  Safety incident analysis.
Source:    Based on 34 investigated incidents in the UK Health and Safety Executive (GB): Out of Control? Why control systems go wrong and how to prevent failure? 1995 (2nd edition 2003, source: © Health & Safety Executive HSE – UK)

What would happen if Functional Safety was not prioritized?

Imagine someone enters an elevator and the doors close before they are fully through, and the person is crushed. What if someone opens a microwave oven door while it is still in operation and is exposed to dangerous microwave energy?

Through a safety mechanism, the elevator doors retract when a person has not yet crossed through, and a microwave oven switches off if the door is opened during operation.

Figure 7: Safety mechanisms such as sensors on an elevator door allow people to safely cross through.
Source:   Adobe Stock

These safety guardrails are intentionally inserted by designers who are adhering to Functional Safety principles. By following the rules of Functional Safety, we have the assurance that every regulated item of equipment contains risk reduction measures required to operate safely under a range of conditions.

3 Examples of Functional Safety in industries

Elements of Functional Safety are useful in ensuring that everyday products such as televisions, computers, and even heavy machinery operate safely and reliably. In the transport sector, Functional Safety minimizes the risk and impact of accidents and other incidents. Signaling systems in our modern cars, trains and aircraft keep things running safely.

Functional Safety is at play in other areas such as building access and control. Consider the fire alarm systems, elevators, and security systems that we barely pay attention to. These features are integrated into buildings to make them safer for building occupants and property.

There are yet more applications for Functional Safety. Think of medical devices such as pacemakers, insulin pumps, and implantable devices. They all require Functional Safety mechanisms to ensure that they operate as designed and do not endanger or harm patients.

Figure 8: Pacemakers require safety mechanisms to ensure that batteries and electrical wiring operate safely so as not to endanger patients.
Source:     Adobe Stock

Up to now, we have been discussing these terms as if they were unconnected and separate from each other. This is not the case, as Functional Safety is a highly systematized area whose detailed rules and conventions are based on the foundational terms we have introduced, such as safety, risk, harm, and hazard.

4 Functional Safety standards and regulations

In the earlier section of this introduction, we mentioned that regulators are involved in defining key terms and how they are applied. Who are these regulators? For many years, Functional Safety role-players across the world have been striving to develop harmonized terminology and standards for an increasingly globalized society.

While the history of Functional Safety is much more detailed than we have room to discuss here, we can trace today’s standards to the development of electricity in the 1800s. Electricity slowly came into everyday use over the course of the 19th century because of the efforts of many pioneering scientists like Nikola Tesla and Thomas Edison, whose innovations advanced electrical science towards mainstream acceptance. By the 1880s, scientists started to realize that a lack of common terminology, measurements, and ratings was hampering the advancement of electrical science.

To address this, the IEC (International Electrotechnical Commission) was founded in London in June 1906. It became the leading organization for technical standards related to electrical systems. Today, the IEC is a global, not-for-profit membership organization whose work underpins quality infrastructure and international trade in electrical and electronic goods. 

The IEC brings together more than 170 countries and provides a global, neutral, and independent standardization platform to experts worldwide. In 1998, the IEC published the first parts of the standard IEC 61508, which covers the Functional Safety of electrical/electronic/programmable electronic safety-related systems.

Figure 9: Any E/E/PE safety-related system, such as a revolving door, or any other application, is covered by IEC 61508
Source:     Adobe Stock

This standard contained guidelines, rules, and requirements intended to ensure that safety-critical systems and processes around the world operate correctly and reliably. After the launch of IEC 61508, the standard made its way into the safety-critical and safety-related digital systems mainstream over the next few years. 

The standard covers the entire lifecycle of a safety-related system, from concept and design to decommissioning. Today, these widely accepted standards define safety requirements for products and systems, as well as safety-related processes and procedures that need to be followed to develop and maintain them.

4.1 What systems does IEC 61508 cover?

The IEC 61508 code is a standard that applies to safety-related systems that incorporate electrical and/or electronic and/or programmable electronic (E/E/PE) devices.

The code seeks to cover possible hazards caused by the failure of the safety functions carried out by the E/E/PE safety-related systems. Crucially, these hazards are normally distinct from hazards arising from the equipment itself, such as electric shock.

Programmable electronic safety-related systems can be highly complex and typically incorporate programmable controllers, programmable logic controllers, microprocessors, and application-specific integrated circuits. For designers and engineers to adhere to IEC 61508 correctly, the entire E/E/PE safety-related system, from its sensors through its control logic and communication systems to its actuators, must abide by the code, both as individual elements and as a whole.

Around the world, any E/E/PE safety-related system, regardless of the application, is covered by IEC 61508. The range of E/E/PE safety-related systems to which IEC 61508 can be applied is vast. Almost every industry uses it. The standard has gained more adoption around the world mainly because previous safety standards were theoretical in nature and their rules were not positioned in terms of real-world performance. An emphasis on quantitative risk reduction, lifecycle considerations, and general practices makes this standard different from its predecessors.

While IEC 61508 is the overarching standard, there are other industry-specific standards with discrete areas of application that are used in different industries and contexts to ensure the safety of products and processes.

New entrants into the study of Functional Safety should know that compliance with Functional Safety standards is often a legal requirement for many industries. Failure to comply could result in serious consequences, including legal liability. Here's an overview of some of the major standards for specific sectors:

4.1.1 Overarching standard

The IEC 61508 is the main standard that provides a framework for the development of safety-related systems across all industries. That is why it is also referred to as the umbrella standard. A large part of its aim is to ensure that safety is paramount throughout the system and that safety risks are identified, assessed, and mitigated at every stage. Compliance with the standard is often required by law and is essential for ensuring the safety of people and the environment. The standard is widely used in industries such as process control, nuclear power, railways, and automotive.

4.1.2 Automotive sector

ISO 26262 is a standard based on IEC 61508. It specifies the requirements for the Functional Safety of electrical and electronic systems in passenger cars, trucks, buses, and motorcycles. It is intended to provide a common framework for the development of safety-related systems that are integrated into these vehicles.

4.1.3 Process industry 

IEC 61511 applies to this area. This standard provides guidance for the development of safety instrumented systems (SIS) used in the process industry to ensure the safe operation of industrial processes.

4.1.4 Machinery industry

The Functional Safety standards that govern this industry are IEC 62061 and ISO 13849. These standards provide guidance for the design and development of safety-related control systems used in machinery applications.

4.1.5 Nuclear industry

IEC 60880, IEC 62138 and IEC 61513 are relevant application-specific standards used in the nuclear industry. These standards provide guidance for the development of instrumentation, control and electrical power systems of nuclear facilities.

4.1.6 Avionics industry

For the avionics industry, the primary Functional Safety standard is the DO-178B/DO-178C. These standards provide guidance for the development of safety-critical software used in avionics systems, including flight control systems, navigation systems, and communication systems.

Figure 10: The avionics industry is covered by DO-178B/DO-178C, although electrical sub-components must abide by IEC 61508 standards.
Source:      Adobe Stock

4.1.7 Railway industry

For the railway industry, the major Functional Safety standard is the EN 50126 standard. EN 50126 is part of a series of standards, known as the Railway RAMS standards, which also includes EN 50128 for software and EN 50129 for safety-related electronic systems.

4.1.8 Medical industry

Relevant application-specific standards used in the medical industry are IEC 60601 and IEC 62304. IEC 60601 provides a framework for the design, development, and testing of medical devices to ensure they are safe and effective for use by patients and healthcare professionals. 
IEC 62304 is the international standard that covers the entire software development process and is intended to ensure that medical device software is safe, reliable, effective, and complies with regulatory requirements.

4.1.9 Robotics industry

In the robotics industry, ISO 10218 is the international application-specific standard that specifies general and specific safety requirements for industrial robots and robotic systems, including equipment integrated into the robot.

It is worth remembering that many of these standards apply to multiple industries, and some industries also have additional sector-specific standards or regulations that are in force. 

5 Key terms

As we grapple with the depth and complexity of Functional Safety, there are some key terms and concepts that are important to understand to get a fuller understanding of the field and be conversant in the area.

5.1 What is a fault?

Faults can come in many forms but are largely split into two types: systematic faults and random faults.

5.1.1 Systematic faults

Faults are called systematic when their cause and consequences can be anticipated. They arise from the way methods or processes are applied, and the resulting failure shows up in a deterministic way. Such a failure is referred to as a systematic failure.

Systematic faults could be:

  • Production faults
  • Planning faults
  • Wiring faults
  • Engineering faults
  • Material faults

5.1.2 Random faults

Faults are called random when it is not possible to predict when they will occur. They are typically physical faults related to hardware that can happen suddenly and are brought on by excessive stress. The consequent failure is called a random failure.

Examples of random failures are:

  • Contact failure
  • Soldered joint failure
  • PCB/semi-conductor failure
  • Relay stiction
  • Resistor/capacitor degradation

The reasons for random faults can be traced to normal wear and tear of material, component ageing, or external influences like adverse operating conditions. Even if a random fault can be traced back to a systematic reason, the major point is the unpredictability, hence the use of the term random.

5.2 What is harm?

What danger can a fault pose? Malfunctions caused by faults can cause harm. Harm is defined as physical damage or injury, something that causes someone or something to be hurt or broken. Harm must have a source, and there must also be a cause. The source is called a hazard: a state from which harm sometimes occurs.

We can imagine a hazard as a state, or an event, that contains a significant set of the causal factors of an accident. These causes do not always have to result in an accident for the state to count as a hazard.

5.3 What is a hazard?

A hazard is a substance, a situation, or an item that can cause injury or damage to people, assets, or the environment. Other definitions for hazards in Functional Safety are as follows:

Potential source of harm.
Note - The term includes danger to persons arising within a short time scale (for example, fire and explosion) and also those that have a long-term effect on a person’s health (for example, release of a toxic substance).
Source: IEC 61508-4:2010

An inherent physical or chemical property that has the potential to cause damage to persons, possessions, or the environment.
Source: American Institute of Chemical Engineers (AIChE)

5.4 What is safety?

Safety, in the context of Functional Safety, is commonly known as freedom from unacceptable risk. The term unacceptable risk automatically throws up some questions. When is risk unacceptable and who gets to decide what is the correct answer?

In some Functional Safety literature, we also find a slightly different definition of safety, namely freedom from risk that is not tolerable. For our purposes, the terms unacceptable and not tolerable can be used interchangeably, as both acknowledge that systems and machinery cannot be entirely free of risk. No equipment is ever risk-free, but to achieve acceptable safety, we must control well-understood risks and keep their occurrence and severity to acceptable levels.

At this stage, we may notice that the idea of acceptability keeps coming up. It is the right of every society to decide if its public amenities are safe based on its own internal standards. For example, in the United States, electric kettles are not as widely used as they are in a country like the UK. As a result, the safety standards for electric kettles in the US are not as strict as they are in the UK. Electric kettles in the US do not have to meet the same strict regulations for automatic shut-off features that are required in the UK. Additionally, some electric kettles sold in the US may not have a grounded plug, which increases the risk of electric shock.

By contrast, the UK has some of the most stringent safety standards in the world for these common home appliances. We can see from this example that two countries have different assessments on how to make the same appliance safe. Therefore, we can see that safety is a changeable concept depending on who is looking at it.

This difference in safety standards between the US and the UK can have tangible consequences. Imagine if a UK citizen travels to the US and purchases an electric kettle without realizing that the appliance doesn't have the same safety features as a kettle purchased in their home country. This could increase their risk of injury from electric shock or fire through incorrect deployment.

Figure 11: Electric kettles in different countries are subject to different safety standards
Source:     Adobe Stock

What we take from this example is, firstly, that safety is a movable concept and, secondly, that it is the job of someone or some entity to decide what acceptable risk is. Usually, it is the job of regulators to declare applicable safety standards for equipment as small as electric kettles and as large as industrial chemical plants. Moving forward, if we are to discuss safety, then we must ask: safety from what?

5.5 What are risks?

In many scenarios, risk is characterized by its size. Risk is not treated the same in every industry, however, with different industrial sectors having their own classifications. The size of a risk is commonly defined as the combination of the probability of occurrence of harm and the severity of that harm.

Figure 12: Risk calculation
Source:     HIMA Paul Hildebrandt GmbH
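
The combination of probability and severity can be sketched in a few lines. The example below assumes the simplest possible combination, a plain product, and an arbitrary severity scale. Both are illustrative assumptions, as are the names Scenario and rank_by_risk. Individual sectors use their own classifications.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Scenario:
    """A hypothetical harmful scenario used only for illustration."""
    name: str
    probability_per_year: float  # assumed probability of occurrence of harm
    severity: float              # assumed severity on an arbitrary scale

    @property
    def risk(self) -> float:
        # Simplest combination of the two factors: their product.
        return self.probability_per_year * self.severity


def rank_by_risk(scenarios: Iterable[Scenario]) -> List[Scenario]:
    """Order scenarios from highest to lowest risk."""
    return sorted(scenarios, key=lambda s: s.risk, reverse=True)
```

With this model, a rare event with very severe consequences can rank above a frequent event with minor consequences, which is why both factors have to be considered together.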

Not all risks are the same. The refrigerator in a home could malfunction and could lead to an electrical fire, which could cause serious damage or injury. While these domestic risks are serious enough, it is reasonable to imagine that risks in a large nuclear plant or a chemical manufacturing plant could be much higher to people and the environment (in fact, major accidents have happened in such plants, causing significant harm to people and the environment). 

This brings us to the idea of tolerable risk: what level of risk can we tolerate, and who decides on that level?

Figure 13: Approach to risk reduction: onion model.
Source:     HIMA Paul Hildebrandt GmbH

5.5.1 When is a risk tolerable?

In the previous section, we identified that sometimes machinery or systems do not function as intended, and the result could be a negative one for people and property close by. If people are to enjoy safety in everyday life, their exposure to risk must be limited. Since risk cannot be avoided entirely, two questions follow: what is a tolerable amount of risk, and how is this decided?

Regulators go to great lengths to define exactly what they mean by each of these words, but for the purposes of this introduction, we will present the most accepted middle-ground definitions.

There is no such thing as zero risk when dealing with most systems and equipment. Rather, the goal of Functional Safety is to reduce risk to a tolerable level and to cut down its damaging impact. Functional Safety is concerned with understanding risk to calculate how likely it is that an adverse event will happen and how much harm it could cause.

Functional Safety is a set of concepts, which, when deployed, are intended to protect people, materials, and the environment. People who work in this area are chiefly concerned with avoiding systematic faults that can happen in equipment and controlling random faults, with the goal of reducing risk to tolerable levels.

Figure 14: The process of determining risk reduction
Source:      Based on IEC 61511-3:2016, HIMA Paul Hildebrandt GmbH
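
The idea of reducing risk to a tolerable level can be sketched numerically. The example below assumes that risk is expressed as an event frequency per year, that the tolerable frequency has already been set by the responsible authority, and that protection layers fail independently of each other. All function names and numbers are hypothetical.

```python
from math import prod
from typing import Iterable


def required_risk_reduction(unmitigated_per_year: float,
                            tolerable_per_year: float) -> float:
    """Factor by which the unmitigated frequency must be reduced (at least 1)."""
    if unmitigated_per_year <= 0 or tolerable_per_year <= 0:
        raise ValueError("frequencies must be positive")
    return max(1.0, unmitigated_per_year / tolerable_per_year)


def achieved_risk_reduction(layer_failure_probabilities: Iterable[float]) -> float:
    """Combined reduction factor of independent protection layers.

    Each value is the assumed probability that a layer fails when needed.
    """
    values = list(layer_failure_probabilities)
    if any(not 0.0 < p <= 1.0 for p in values):
        raise ValueError("each failure probability must be in (0, 1]")
    return 1.0 / prod(values)  # no layers gives a factor of 1.0


def is_tolerable(unmitigated_per_year: float, tolerable_per_year: float,
                 layer_failure_probabilities: Iterable[float]) -> bool:
    """Check whether the layers reduce the risk to the tolerable level."""
    needed = required_risk_reduction(unmitigated_per_year, tolerable_per_year)
    return achieved_risk_reduction(layer_failure_probabilities) >= needed
```

For instance, an event expected once every ten years against a tolerable frequency of once every ten thousand years needs a reduction factor of about 1,000. Under these assumptions, a single layer that fails one time in ten is not enough.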

What level of risk can we tolerate and who decides on the level? Tolerable risk is not the same in every country. For example, the World Health Organization estimates that the number of road deaths per year in Europe is 7 per 100,000. By contrast, there are an estimated 27 deaths per 100,000 in Africa.

Figure 15: Estimated road traffic death rate per 100,000.
Source:      World Health Organization, 2018.

The fact that these values remain roughly the same per continent year after year implies that each society believes these risk levels are acceptable. So, the idea of tolerable risk is a movable idea. For this reason, the safety of systems can be engineered to very low levels or very high levels. It simply depends on the risk tolerance stance of the authorities in control.

5.6 Safety lifecycle

A safety lifecycle is a set of activities that are carried out during the development and operation of a safety-critical system to ensure that it operates safely and reliably. It considers the entire lifetime of the system so that every aspect of risk that might come up is taken into account. It forces users and developers to pay attention not only to the present moment but also to the future. A safety lifecycle starts with the risk analysis and continues throughout the entire service life of a piece of machinery up to decommissioning. As depicted in Figure 16, the safety lifecycle of a manufacturing plant could involve the following phases:

  • Risk analysis
  • Specification
  • Planning and implementation
  • Installation and start-up
  • Modification after start-up

Figure 16 also reveals that thorough safety management, accurate technical requirements, and continuous qualification of personnel are critical to maintaining a high level of safety and ensuring that the plant operates in compliance with industry standards and regulations. Within these clusters, measures are defined and implemented in all phases of the lifecycle.

This example of a safety lifecycle is based on an investigation conducted by the UK Health & Safety Executive (HSE) and includes fewer steps than the safety lifecycle presented in IEC 61508 or EN 61511. However, the procedures applied are generally the same.

Figure 16: Example of a safety lifecycle.
Source:     HIMA Paul Hildebrandt GmbH

5.6.1 Risk analysis

Risk analysis is crucial because it helps to identify and assess potential hazards and risks associated with the operation of the plant as well as potential causes of faults, thus allowing risk to be reduced to a tolerable level. Risk analysis involves evaluating the likelihood and consequences of various events, such as equipment failures or human errors, that could lead to accidents or harm to people, the environment, or the plant itself. This phase lays the foundation for establishing safety requirements.

The results of the risk analysis determine where risk reduction measures are needed and influence the selection of suitable equipment and the qualification of personnel. As Figure 17 shows, the severity of harm can generally be mitigated by active layers (which react when an event occurs), passive layers (which limit harm simply by being present), and automatic layers (such as an SIS with rapid detection and response).

Figure 17: Risk reduction considerations 
Source:     HIMA Paul Hildebrandt GmbH
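The idea of weighing likelihood against consequence can be illustrated with a small risk-matrix sketch. The category names, the 1-to-4 scores, the example events, and the thresholds below are assumptions chosen for illustration only; they are not taken from IEC 61511 or from any HIMA method, and a real risk analysis would use the calibrated matrix or risk graph agreed for the plant.

```python
# Illustrative risk matrix: all categories, scores and thresholds are
# hypothetical and chosen only to show the principle.
LIKELIHOOD = {"rare": 1, "unlikely": 2, "possible": 3, "frequent": 4}
SEVERITY = {"minor": 1, "serious": 2, "major": 3, "catastrophic": 4}


def risk_score(likelihood: str, severity: str) -> int:
    """Combine likelihood and severity into a single score (1 to 16)."""
    return LIKELIHOOD[likelihood] * SEVERITY[severity]


def risk_class(score: int) -> str:
    """Map a score to a coarse risk class (assumed thresholds)."""
    if score <= 3:
        return "tolerable"
    if score <= 8:
        return "reduce where practicable"
    return "intolerable without risk reduction"


if __name__ == "__main__":
    # Hypothetical events, for demonstration only.
    for event, lik, sev in [
        ("pump seal leak", "possible", "minor"),
        ("tank overfill", "unlikely", "major"),
        ("vessel rupture", "possible", "catastrophic"),
    ]:
        score = risk_score(lik, sev)
        print(f"{event}: score {score} -> {risk_class(score)}")
```

Events that land in the highest class are the ones for which safety requirements would be derived in the specification phase.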

5.6.2 Specification

Based on the findings of the risk analysis, safety requirements of the equipment are identified and translated into detailed and accurate functional and performance requirements for the system. This includes developing detailed technical specifications for safety systems and equipment, as well as outlining the criteria that these systems must meet to ensure the safety of the plant and its operators. The requirements must be clear, concise, and verifiable.

5.6.3 Planning and implementation

In this phase, detailed plans are developed for implementing safety measures and systems. This involves, for instance, designing safety protocols, selecting appropriate safety technologies, and creating a roadmap for integrating them into the plant's infrastructure. Proper planning is essential to ensure that safety systems function as intended.

In practice, this phase includes the design of any hardware and accompanying architecture as well as considerations relating to technologies such as E/E/PES devices, mechanical constructions, or additions to buildings.

5.6.4 Installation and start-up

The safety systems and equipment specified and planned in the previous phases can now be physically installed and integrated into the plant. Rigorous testing and validation are conducted to ensure that the safety measures operate correctly under various conditions. The plant is brought online gradually, and operators are trained to use the safety systems effectively.

Verification is carried out during the earlier phases, whereas validation is a major focus of this phase. Verification checks whether the product is being developed according to its specifications, while validation confirms that the right product is being developed to meet the user's needs, including the safety requirements.

5.6.5 Modifications after start-up

The safety lifecycle is not a static process; it continues even after the plant is operational. Modifications and upgrades are often necessary to adapt to changing conditions, technologies, or regulations. This phase involves evaluating the existing safety systems, identifying areas for improvement or modification, and implementing changes to enhance the overall safety of the plant.

5.7 Safety integrity and SIL

Safety integrity is chiefly concerned with ensuring that safety-related systems carry out their safety functions reliably, to a predictable level defined by the applicable standards.

A Safety Integrity Level (SIL) is a numerical value that is reached by making a probability-based assessment of the likelihood that a safety-related system will fail. Here, safety engineers are not simply calculating how safe a piece of equipment is, but how safe the safety system is.

A high SIL number indicates that a high risk is expected, with major adverse consequences in case of a fault. The general requirement of any safety-conscious regulator is therefore a low probability of failure: the safety-related system is expected to perform its safety function reliably. Equipment and systems are normally assigned a SIL rating between 1 and 4. A SIL 4 system has been calculated to be highly unlikely to fail. You would expect this rating for a nuclear plant. At the opposite end of the scale, a SIL 1 system has a relatively higher chance of failure, but the risks and consequences have been assessed as low. This would be a normal rating for a toaster.
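To make the link between failure probability and SIL concrete, the sketch below maps an average probability of failure on demand (PFDavg) to a SIL, using the low-demand-mode bands commonly quoted from IEC 61508. The function names are hypothetical, and the snippet is an illustration under that assumption rather than a substitute for the standard, which places further requirements on a safety function beyond this single number.

```python
# PFDavg bands for low-demand mode as commonly quoted from IEC 61508:
# (lower bound inclusive, upper bound exclusive, SIL)
SIL_BANDS = [
    (1e-5, 1e-4, 4),
    (1e-4, 1e-3, 3),
    (1e-3, 1e-2, 2),
    (1e-2, 1e-1, 1),
]


def sil_from_pfd(pfd_avg: float):
    """Return the SIL whose band contains pfd_avg, or None if outside all bands."""
    for lower, upper, sil in SIL_BANDS:
        if lower <= pfd_avg < upper:
            return sil
    return None


def risk_reduction_factor(pfd_avg: float) -> float:
    """The risk reduction factor is the reciprocal of PFDavg."""
    return 1.0 / pfd_avg


if __name__ == "__main__":
    for pfd in (5e-2, 5e-3, 5e-4, 5e-5):
        print(f"PFDavg={pfd:g}: SIL {sil_from_pfd(pfd)}, "
              f"risk reduction factor about {risk_reduction_factor(pfd):.0f}")
```

Each step up in SIL corresponds to a tenfold lower probability that the safety function fails when it is demanded.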

As the SIL rating increases, the system becomes much more complex and costly to implement, so it does not make practical sense to make every item SIL 4. Also, if a system requires SIL 4 just to be safe, engineers must consider changing the sub-processes to make them safer.

As the cost of adverse consequences in SIL 4 settings is very high, it is often not in the interests of plant owners to carry such a heavy burden of risk. This is why plant owners actively drive processes to reduce risk.

Note that the same item could be rated at SIL 1 or SIL 4 in different contexts. Consider an everyday piece of equipment like a set of automated doors. In a public building it could be rated as a SIL 1 item. However, in a nuclear plant, it will likely be rated as a SIL 4 item. This is a remnant of SIL classification methods that originated in the 1960s, when it was the convention to classify items per function. Today, we strive to gain a more precise understanding of probability and consequence in individual situations. As we have said, individual items are now rarely designated with their own SIL. This is why the exact same automated door in a nuclear plant could be operating at SIL 4 in terms of the amount of risk in the overall environment.

Figure 18: Large plant installations may require SIL 4
Source:     Adobe Stock

5.8 Safety management system

A safety management system is concerned with taking care of the safety of a system throughout its lifecycle. Its scope extends from design through implementation to operation. The main goal is to ensure that the risks posed to people or property are minimized.

Safety management systems are required by the various IEC standards as well as by legal requirements such as the Seveso Directive in Europe. A typical safety management system is made up of a formal set of processes outlining the approach taken by a firm to manage the safety of its employees and its assets. These could include procedures, policies, and activities that the organization carries out to make sure that risks remain within tolerable levels. Safety management systems are integral to risk-based decision-making for organizations.

Many activities go into safety management; key among them are identifying hazards, assessing risks, and defining safety requirements. As we have mentioned, it also includes safety planning as well as verification, validation, and monitoring.

5.9 Safety instrumented function and safety instrumented system

In simple terms, a safety instrumented function (SIF) is a safety measure designed to prevent or mitigate a hazardous event and is necessary to achieve Functional Safety. It can be either a safety-related protective function or a safety-related control function that is intended to bring a process to a safe state when predetermined conditions are met. One common example of a SIF is the high-level trip in a tank. Imagine a large storage tank that holds a liquid, such as oil. If the tank were to overfill, it could lead to environmental spills, equipment damage, or safety hazards. So, when the liquid reaches the defined high level, the system trips. This is an example of how a SIF works.

A safety instrumented system (SIS) is a control system that implements one or more SIFs and is meant to protect people and the environment from hazards. An SIS must respond immediately to prevent accidents and, where they cannot be prevented, to reduce their severity.

Building on the tank example, the SIS comprises all the components that carry out the SIF, in this case the high-level trip. Our imaginary plant will likely be fitted with sensors that constantly monitor the liquid level in the tank. Once the high level is reached and the trip is triggered, a logic solver (the ‘brain’ of the system) evaluates the sensor signals and could command an actuator to activate a pump to drain excess liquid from the tank. Together, these components form the SIS for the tank. This example is illustrated in Figure 19.

Figure 19: Example of an SIS architecture and SIF including several different modules
Source: HIMA Paul Hildebrandt GmbH
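The tank example can be sketched as a chain of sensor reading, logic solver, and final element. Everything below is a simplified assumption for illustration: the class names, the 90 % trip point, the latching behaviour, and the drain pump as the final element are hypothetical and do not describe a HIMA product or a certified design.

```python
from dataclasses import dataclass

HIGH_LEVEL_TRIP_PERCENT = 90.0  # hypothetical trip point


@dataclass
class FinalElement:
    """Stands in for the actuator, here a drain pump (assumed)."""
    drain_pump_running: bool = False

    def go_to_safe_state(self) -> None:
        self.drain_pump_running = True


class HighLevelTrip:
    """Minimal logic solver for one SIF: trip on high level and latch."""

    def __init__(self, final_element: FinalElement) -> None:
        self.final_element = final_element
        self.tripped = False

    def evaluate(self, level_percent: float) -> bool:
        """Process one sensor reading; return True once the SIF has tripped."""
        if level_percent >= HIGH_LEVEL_TRIP_PERCENT:
            self.tripped = True
        if self.tripped:
            # Latched: stays in the safe state until a deliberate reset.
            self.final_element.go_to_safe_state()
        return self.tripped

    def reset(self) -> None:
        """Manual reset; in this sketch it simply clears the latch."""
        self.tripped = False
        self.final_element.drain_pump_running = False


if __name__ == "__main__":
    pump = FinalElement()
    sif = HighLevelTrip(pump)
    for reading in (72.0, 88.5, 91.2, 85.0):
        print(reading, sif.evaluate(reading), pump.drain_pump_running)
```

The latch reflects a common design choice: once a trip has occurred, the function does not release by itself when the level falls again, but waits for a deliberate reset.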

An SIS typically includes safety interlock systems, which prevent the unsafe operation of machinery or equipment by ensuring that certain conditions are met before the equipment can operate. Emergency shutdown (ESD) systems are another example of an SIS and are designed to automatically shut down a process when a hazardous condition is detected. For example, an ESD system could continuously monitor gas pressure readings in a plant and shut the plant down if required.
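An interlock differs from a trip in that it blocks an action until the required conditions are met. A minimal sketch of this idea follows, assuming three hypothetical permissive conditions for starting a machine; the condition names and function names are illustrative only.

```python
def start_permitted(guard_closed: bool, pressure_ok: bool, estop_released: bool) -> bool:
    """Interlock: allow the start only if every permissive condition is met."""
    return guard_closed and pressure_ok and estop_released


def missing_permissives(conditions: dict) -> list:
    """List the names of the conditions that currently block the start."""
    return [name for name, ok in conditions.items() if not ok]


if __name__ == "__main__":
    state = {"guard_closed": True, "pressure_ok": False, "estop_released": True}
    print("start permitted:", start_permitted(**state))
    print("blocked by:", missing_permissives(state))
```

Reporting which condition blocks the start helps operators correct the cause rather than attempt to bypass the interlock.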

The SIF and SIS concepts are important in Functional Safety as they enable engineers to identify the critical safety functions in a system and design systems that ensure those functions are performed correctly. By properly designing, implementing, and testing SISs, engineers can reduce the likelihood of catastrophic events and increase the overall safety of industrial systems.

Conclusion

Functional Safety is an important field in today’s ever-advancing societies. For people and property to remain safe, it is crucial that equipment is designed and manufactured to operate in the manner it is intended to, without posing a risk of damage or injury to the user. Functional Safety practitioners must pay close attention to all the requirements of this area of study to make sure they produce the safest products possible.

This introductory chapter was intended to provide an entry into the area of Functional Safety, which is a broad and complex field. Future chapters will delve deeper into other key areas, providing even greater detail into the workings of this important field.

Sources

List of references

[1]   “Functional safety essential to overall safety | IEC.” https://www.iec.ch/basecamp/functional-safety-essential-overall-safety

[2]   “IEC 60050 - International Electrotechnical Vocabulary - Welcome,” IEC - International Electrotechnical Commission. https://www.electropedia.org/

[3]   “Out of control: Why control systems go wrong and how to prevent failure - HSG238.” https://www.hse.gov.uk/pubns/books/hsg238.htm

[4]   “Safety in the future | IEC.” https://www.iec.ch/basecamp/safety-future

[5]   D. Craig and Amec, “SIL and Functional Safety – some lessons we still have to learn,” IChemE, journal-article, 2008. [Online]. Available: https://www.icheme.org/media/8971/xxiv-poster-06.pdf 

[6]   L. Nara, American Institute of Chemical Engineers, and European Process Safety Centre, CHEF Manual. 2018. [Online]. Available: https://www.aiche.org/sites/default/files/docs/book-pages/chef_manual_v1.1.pdf 

[7]   P. B. Ladkin, Causalis Limited, and University of Bielefeld, “An overview of IEC 61508 on E/E/PE Functional Safety,” 2008. [Online]. Available: https://www.eic2.com/pdf/IEC61508FunctionalSafety.pdf 

List of standards

[8]   Functional safety - Safety instrumented systems for the process industry sector - Part 1: Framework, definitions, system, hardware and application programming requirements, IEC 61511-1:2016.

[9]   Functional safety - Safety instrumented systems for the process industry sector - Part 3: Guidance for the determination of the required safety integrity levels, IEC 61511-3:2016.

[10]   Functional safety of electrical/electronic/programmable electronic safety-related systems - Part 4: Definitions and abbreviations, IEC 61508-4:2010.

[11]   Railway Applications - The Specification and Demonstration of Reliability, Availability, Maintainability and Safety (RAMS) - Part 1: Generic RAMS Process, DIN EN 50126-1:2018-10.
