Calculating Mean Time Between Failure

Track Mean Time Between Failure from your EAM system with our FREE Spreadsheet

Mean Time Between Failure Basics

The purpose of this blog is to introduce the concept of calculating Mean Time Between Failure (MTBF) and offer our FREE Excel Spreadsheet to calculate MTBF directly from EAM solutions such as SAP and Maximo. When it comes to estimating how often an asset fails, there are many layers of sophistication, from preparing data to calculating values and interpreting results. The level of rigor to apply in an MTBF calculation is driven by the specific use case, an organization’s maturity, and the required level of accuracy in the results. We are just scratching the surface of this topic in this blog, so please reach out to us here if you would like to learn about more sophisticated approaches to managing equipment failure rates.

Want to skip the MTBF intro and start getting reliability insights right away with our FREE MTBF Calculator Spreadsheet?  Scroll to the bottom and download!

Managing the Mean Time Between Failures (MTBF) of your equipment is one of the most basic yet effective approaches to measure performance, identify bad actors and drive reliability improvement programs.  Typically represented in years, months, or days, MTBF measures the average length of time that an asset has operated without interruption.  The metric provides key insights to help drive tactics to ensure assets are operating to their fullest potential at the optimal cost profile and should be part of any maintenance and reliability professional’s toolkit.

Unfortunately, the MTBF calculation is not always well understood and while the mathematical equation is simple, many still struggle to gain visibility to their equipment failure rates.  We see the MTBF measure as a cornerstone to driving equipment reliability improvements and are often asked to help organizations determine the best approach to calculate the failure rate of their equipment.  To help customers gain visibility to their failure rates, we offer several approaches from simple to complex depending on input data quality, required accuracy and level of calculation sophistication.  This blog introduces our simplest approach (which is also FREE) to quickly calculate equipment MTBF utilizing a Microsoft Excel template.

Data Inputs to Calculate MTBF

The most common source of data for calculating MTBF is work order data from a computerized maintenance management system (CMMS) or Enterprise Asset Management (EAM) system. Work orders are typically used to plan repair activities, order replacement parts and source labor, so they usually provide a good record of when a failure has occurred. While work orders might be the most common way to determine when an asset has failed, they can present calculation challenges depending on the completeness of the information supplied during the order closure process. The subject of work order data quality is, and will most likely forever be, debated amongst maintenance and reliability professionals. Some will take the position that the data must be perfect to be used in an MTBF calculation, which unfortunately prevents them from using this very valuable measurement of equipment performance. “Perfect is the enemy of good” is an aphorism that describes this engineering perspective: an insistence on perfection often prevents implementation of good improvements. The good news is that our experience in leveraging work order data for reliability measurements over the last 20 years tells us that many of you now have work order data that is appropriate for the use case most reliability programs need today: an estimate of the rate of failure to make better reliability and maintenance investment decisions.
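
To make the idea concrete, here is a minimal Python sketch of how exported work orders might be reduced to a failure count per asset. The field names (asset_id, order_type, start) and the BREAKDOWN and PM codes are assumptions for illustration only; your EAM export will use its own columns and order types.

```python
from collections import Counter

# Hypothetical work order export. The field names and order-type codes are
# assumptions for illustration, not the schema of SAP, Maximo, or any EAM.
work_orders = [
    {"asset_id": "P-101", "order_type": "BREAKDOWN", "start": "2021-03-14"},
    {"asset_id": "P-101", "order_type": "PM", "start": "2021-06-01"},
    {"asset_id": "P-101", "order_type": "BREAKDOWN", "start": "2022-09-02"},
    {"asset_id": "P-102", "order_type": "PM", "start": "2022-01-10"},
    {"asset_id": "P-102", "order_type": "BREAKDOWN", "start": "2023-02-20"},
]


def count_failures(orders, failure_types=("BREAKDOWN",)):
    """Count, per asset, the work orders whose type marks an unplanned failure."""
    return Counter(
        o["asset_id"] for o in orders if o["order_type"] in failure_types
    )


failures = count_failures(work_orders)
print(dict(failures))  # {'P-101': 2, 'P-102': 1}
```

Preventive maintenance orders are deliberately excluded here, since planned work is not a failure. Imperfect records will still produce a usable estimate, which is the point of the paragraph above.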

Other technological advancements, such as the rapid expansion of sensors and monitoring systems, are making it more common to automatically identify and document equipment failures with greater accuracy.  For equipment that is already monitored by a control system or process historians, you may already have direct access to the running state of the machine which can help you determine when an asset has failed in addition to automatically calculating run times which are key inputs to MTBF calculations.  For assets that are not actively monitored, it is now possible to install affordable and non-invasive sensors which can monitor conditions (such as energy usage) to help automatically detect when an asset is not running and document each time the asset starts or stops.
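
As a rough sketch of how start and stop events could be turned into running time, the Python below sums the hours between each start and the following stop. The event log format and the RUN and STOP labels are hypothetical; a real historian or sensor platform will expose its own tags and timestamps.

```python
from datetime import datetime

# Hypothetical state-change log: (timestamp, state), assumed to be sorted
# chronologically. The format and labels are illustrative assumptions.
events = [
    ("2024-01-01 00:00", "RUN"),
    ("2024-01-01 08:00", "STOP"),
    ("2024-01-01 10:00", "RUN"),
    ("2024-01-02 00:00", "STOP"),
]


def running_hours(events, fmt="%Y-%m-%d %H:%M"):
    """Sum the hours between each RUN event and the next STOP event.

    A run that has not yet been closed by a STOP event is ignored.
    """
    total = 0.0
    run_started = None
    for stamp, state in events:
        t = datetime.strptime(stamp, fmt)
        if state == "RUN" and run_started is None:
            run_started = t
        elif state == "STOP" and run_started is not None:
            total += (t - run_started).total_seconds() / 3600
            run_started = None
    return total


hours = running_hours(events)
print(hours)  # 22.0 (8 h plus 14 h of running inside a 24 h window)
```

The number of STOP events is not automatically the number of failures, because planned shutdowns also stop the machine. Each stop still needs to be classified before it is counted.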

MTBF Calculation Approaches

In its simplest form, MTBF is calculated by taking the total time an asset is running and dividing it by the number of failures that happened over that same period of time or:

MTBF = Running time between installation date and last failure / # of Failures

Running time:  This is the total amount of time an asset is running over a specific time period.  Note that if you have assets that are not operating 24×7, running time is not represented by calendar time.

# of Failures:  This is the total number of equipment failures or breakdowns which have occurred over a specific time period.

Here is a very simple example. Take a cooling water pump that operates continuously at a manufacturing facility. The pump does not have a standby spare, and we want to calculate its specific MTBF over a 5-year period. The asset has experienced 4 failures over the 5 years.

Running Time = 5 years

# of Failures = 4

MTBF = 1.25 Years (5 years / 4 failures)
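
The same arithmetic can be written as a few lines of Python. This is only a sketch of the simple formula above; the function name and the shift pattern in the second case are our own illustrative assumptions.

```python
def mtbf(running_time, failures):
    """Return running time divided by the number of failures.

    The result is in the same unit as running_time (years, hours, ...).
    """
    if failures <= 0:
        raise ValueError("MTBF is undefined without at least one failure")
    return running_time / failures


# Cooling water pump: runs continuously, so running time equals calendar time.
pump_mtbf_years = mtbf(running_time=5, failures=4)
print(pump_mtbf_years)  # 1.25

# Hypothetical asset that runs 16 hours a day, 5 days a week, for 5 years.
# Running time is counted in operating hours, not calendar hours.
operating_hours = 16 * 5 * 52 * 5
shift_mtbf_hours = mtbf(running_time=operating_hours, failures=4)
print(shift_mtbf_hours)  # 5200.0
```

For the second asset, using calendar hours (5 x 8,760 = 43,800) instead of operating hours would more than double the result, which is why the running time definition matters.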

For this blog, the focus is on a simple method to calculate MTBF directly from work order data.  If you have more descriptive data sets and supporting calculation tools, you may want to utilize a Weibull Distribution for your failure rate analysis.  A Weibull analysis will provide a ‘beta’ value (or shape factor) which offers additional insights on the failure pattern (infant mortality, random, wear-out) associated with the MTBF value.  If you are interested in learning more about Weibull analysis, we highly recommend this tutorial from our partners at Prelical – it’s a great introduction to utilizing this mathematical technique for estimating failure rates.
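
To illustrate how the beta value relates to a mean life, the sketch below uses the standard result that the mean of a two-parameter Weibull distribution is eta x Gamma(1 + 1/beta), where eta is the scale (characteristic life). It does not fit a distribution to data; eta and beta are assumed to come from a separate analysis, and the 0.1 tolerance used to call a pattern "random" is an arbitrary choice for illustration.

```python
import math


def weibull_mean(eta, beta):
    """Mean life of a two-parameter Weibull: eta * Gamma(1 + 1/beta)."""
    return eta * math.gamma(1.0 + 1.0 / beta)


def failure_pattern(beta, tolerance=0.1):
    """Label the failure pattern implied by the shape factor beta.

    The tolerance band around 1.0 is an illustrative assumption.
    """
    if beta < 1.0 - tolerance:
        return "infant mortality"
    if beta > 1.0 + tolerance:
        return "wear-out"
    return "random"


# With beta = 1 the Weibull reduces to the exponential case and the mean
# equals eta, so the simple MTBF and the Weibull mean agree.
print(weibull_mean(eta=1.25, beta=1.0))  # 1.25
print(failure_pattern(0.7), failure_pattern(1.0), failure_pattern(2.5))
```

Two assets with the same mean life can have very different beta values, and therefore call for different maintenance tactics.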

MTBF is calculated – Now What?

With MTBF calculated, there are many ways to use this information to make better reliability and maintenance decisions, in fact too many to list completely in this blog. Here are a few common use cases that reliability professionals pursue once they have calculated MTBF for their equipment:

Identifying poor-performing equipment. Evaluate the performance of your equipment by comparing it against similar equipment in a similar operating context. In the previous example, MTBF was estimated at 1.25 years for the cooling water pump. With this information it is now possible to benchmark its specific performance against industry references such as the Itus Asset Twin Library, the OREDA database (Oil and Gas Industry Specific) or OEM-specific performance guarantees. Once compared, it is easier to identify bad actors in your equipment population, evaluate reasons for the poor performance and establish a plan to improve the asset strategy.
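
A benchmark comparison of this kind can be sketched in a few lines of Python. Only the 1.25-year pump value comes from the example above; the other assets, their MTBF values and the 2.5-year benchmark are made-up numbers for illustration, not figures from any library or database.

```python
# Hypothetical MTBF values in years. Only P-101 (1.25) is from the example;
# the rest, and the benchmark, are invented for illustration.
fleet_mtbf = {"P-101": 1.25, "P-102": 3.4, "P-103": 2.9, "P-104": 2.0}
benchmark_years = 2.5


def bad_actors(mtbf_by_asset, benchmark):
    """Return (asset, MTBF) pairs below the benchmark, worst performer first."""
    below = [(a, m) for a, m in mtbf_by_asset.items() if m < benchmark]
    return sorted(below, key=lambda pair: pair[1])


print(bad_actors(fleet_mtbf, benchmark_years))
# [('P-101', 1.25), ('P-104', 2.0)]
```

The comparison is only meaningful when the benchmark reflects a similar equipment type and operating context.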

Measuring the effectiveness of your reliability initiatives. While MTBF is a lagging indicator, it is useful in assessing the effectiveness of a reliability and maintenance improvement program. The pump example we reviewed demonstrates how we can measure the specific performance of one asset over time. If you have used a strategy development process such as a Failure Modes and Effects Analysis (FMEA) and developed an optimized strategy or preventative maintenance plan for an equipment class (e.g., centrifugal pumps), the MTBF calculation can also be used to measure the effectiveness of that strategy. As you implement your maintenance and monitoring strategy, the equipment class should become more reliable over time and MTBF should increase.
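
One simple way to watch that trend is to calculate MTBF for the equipment class period by period. The sketch below assumes you already have, for each year, the combined running time of all assets in the class and the number of failures; the figures are hypothetical.

```python
# Hypothetical yearly totals for one equipment class:
# (year, combined running time in asset-years, number of failures).
history = [(2021, 10.0, 8), (2022, 10.0, 5), (2023, 10.0, 4)]


def mtbf_trend(history):
    """Return {year: MTBF} for every year with at least one failure."""
    return {year: run / n for year, run, n in history if n > 0}


trend = mtbf_trend(history)
values = list(trend.values())
improving = all(earlier < later for earlier, later in zip(values, values[1:]))

print(trend)      # {2021: 1.25, 2022: 2.0, 2023: 2.5}
print(improving)  # True
```

Years with zero failures are skipped rather than reported as infinite MTBF; how to treat them is a judgment call that depends on your reporting needs.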

Analyzing future operational risk. For equipment failures which directly impact production, MTBF can be a key input to evaluating future operational risk. Consider again our simple cooling water pump example, which has an MTBF of 1.25 years. If this pump is needed to run for the next 2 years to meet projected demand, it is likely to experience at least one failure during that run cycle. With a calculated MTBF value, it is possible to use solutions such as Asset Risk Analyzer to determine future operational risk, communicate potential downtime implications to management and justify investment in a reliability improvement initiative.
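
To put a rough number on that risk, a common simplification is to assume a constant failure rate (an exponential model), in which case the chance of at least one failure over a run of length t is 1 - exp(-t / MTBF). The sketch below applies that assumption to the 1.25-year MTBF calculated earlier; it is our illustration, not a description of how any particular risk tool works.

```python
import math


def prob_at_least_one_failure(run_time, mtbf):
    """Probability of one or more failures, assuming a constant failure rate."""
    if mtbf <= 0:
        raise ValueError("MTBF must be positive")
    return 1.0 - math.exp(-run_time / mtbf)


# Cooling water pump: MTBF of 1.25 years, required to run for 2 years.
p = prob_at_least_one_failure(run_time=2, mtbf=1.25)
print(f"{p:.0%}")  # 80%
```

If a Weibull analysis shows a wear-out or infant mortality pattern instead, the constant-rate assumption no longer holds and the estimate should be revisited.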

MTBF provides a practical approach to measure the current and historical failure rates for industrial equipment.  With visibility to MTBF information, maintenance and reliability professionals can make wise decisions on where to focus efforts and investments to meet business objectives.  Historically, the MTBF measurement has been considered an advanced reliability technique but is much more common today as organizations have implemented foundational asset management systems and processes.

If you are not currently utilizing MTBF to enhance your decision making, consider using our FREE Spreadsheet to get started.  Register below to get instant access to our template.

Download MTBF Calculator

Our MTBF Spreadsheet will walk you through the entire process of getting data from your EAM system, how to further classify records as breakdowns vs preventative maintenance, as well as specifics on how the calculations work and what to do with the results.  Register below and get instant access to start actively managing your equipment failure rates!