How to Address Age-Related Equipment Failure

This article contains insights from the book “Maintenance Control” by James Borowski, a maintenance professional and UpKeep customer. If you want additional help predicting and preventing equipment failures, you can download, for free, two chapters of the book that deal with equipment failure and reliability.

Age-related equipment failures are associated with time in service. Equipment that experiences this category of failure typically has surfaces in direct contact with the material being handled or the product being manufactured.

Systems tied to this failure type may be exposed to shock, mechanical vibration, a corrosive atmosphere, oxidizing chemicals or vapors, or heat evaporation that take their physical toll. These systems may also consist of consumable items or surfaces like wear plates.

Examples of specific equipment types that are typically considered to fail based on time in service include simple electromechanical systems with pins, bushings, chains, sprockets, belts and sheaves; pump rotating groups; valve seats; material handling conveyors; hopper liners; DC motor brushes; hydraulic filters; electrical components like limit switches and relays; vehicle tires; clutches; and internal combustion engines.

How to address age-related failures with maintenance

For equipment and components that fail based on time, a restoration or overhaul task is scheduled in a maintenance management system.

Perform scheduled restoration/overhaul tasks

A restoration/overhaul task is used for equipment that has worn to the point where failure is imminent. Scheduled restoration implies that the equipment can be rebuilt or overhauled before failure occurs. The restoration process is intended to return the equipment to its original “like new” condition or close to it. The result is that the equipment is reborn and given a new life.

For example, let’s assume that a production unit has a roll line consisting of a series of individually driven rolls coupled to a gearbox and electric motor. History has shown that because of the plant environment the gearboxes typically have a Useful Life of 24 months. In this application, to avoid a functional failure, a scheduled maintenance task is created to replace the boxes before an operating age of 24 months.

Useful Life is the period from initial installation to the point just before multiple failures begin to take place. For our gearbox example, the Useful Life is 24 months. But, as the curve points out, there are multiple failures of equipment before the wear-out zone is reached: one failure every two months. Likewise, most of the equipment that is taken out of service at the 24-month point still has many additional months of good service left, some as much as another year.

From our example, it can be seen that restoring or overhauling equipment on the basis of time is ineffective and wasteful. It is ineffective in preventing functional failures since many pieces of equipment will fail before the wear-out zone. It is wasteful because equipment with significant operating life will be taken out of service too early.
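Both costs can be put into numbers with a minimal sketch. The ten gearbox lifetimes below are made-up values chosen only for illustration, not data from the book, and the function name `evaluate_fixed_interval` is hypothetical. The sketch counts how many units would fail before the 24-month replacement point and how many months of remaining life are thrown away on the units that survive to it.

```python
def evaluate_fixed_interval(lifetimes_months, interval_months):
    """Summarize what a fixed replacement interval does to a group of units.

    lifetimes_months: how long each unit would run if left in service
                      (hypothetical values, for illustration only).
    interval_months:  the scheduled replacement age.
    """
    # Units that fail before the scheduled replacement: the task was ineffective.
    early_failures = [life for life in lifetimes_months if life < interval_months]
    # Units still running at the scheduled replacement: some life is wasted.
    survivors = [life for life in lifetimes_months if life >= interval_months]
    unused_life = sum(life - interval_months for life in survivors)
    return {
        "early_failures": len(early_failures),
        "replaced_on_schedule": len(survivors),
        "unused_life_months": unused_life,
    }


# Assumed lifetimes for ten gearboxes, in months (not real plant data).
lifetimes = [6, 14, 22, 25, 27, 28, 30, 31, 33, 36]
summary = evaluate_fixed_interval(lifetimes, interval_months=24)
print(summary)
```

With these assumed lifetimes, three gearboxes fail before the scheduled change-out and the other seven are replaced with a combined 42 months of service still left in them.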

Despite being inefficient and wasteful, scheduled restoration tasks continue to be used and will always be around. Arguably, they are not the most effective proactive maintenance task, but they are simple to schedule.

The important thing here is that a scheduled restoration task should be used only with equipment having a Bathtub curve failure mode…

Bathtub curve that shows how equipment fails over time

…or a Conditional Wear Curve with Wear-out Zone failure mode:

Graph of conditional wear curve with wear-out zone

No other failure mode curves apply. These are the only two failure modes with wear-out zones.

Scheduled restoration/overhaul tasks have been used for many years to improve equipment reliability, regardless of whether they are inefficient or wasteful. But keep in mind that this strategy rests on some assumptions:

  • The equipment in question does, in fact, wear out over time. The equipment has an age-related failure curve with a wear-out zone.
  • A large percentage of the equipment type must survive to the wear-out zone. There can’t be peaks or bursts of random failures along the way to the wear-out zone.
  • The equipment must be capable of being restored or overhauled. The item in question can’t be a throwaway, having no value once it has worn out.
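The three assumptions can be written down as a simple checklist. The sketch below is only an illustration: the function name is hypothetical, and the 90% cut-off standing in for "a large percentage" is an assumed value, since the article does not give a number.

```python
def scheduled_restoration_fits(has_wear_out_zone, share_reaching_wear_out,
                               can_be_restored, min_share=0.9):
    """Check the three assumptions behind a scheduled restoration task.

    share_reaching_wear_out: fraction (0 to 1) of units that survive to
                             the wear-out zone.
    min_share:               assumed cut-off for "a large percentage";
                             0.9 is illustrative, not a value from the book.
    Returns (fits, reasons), where reasons lists every failed assumption.
    """
    reasons = []
    if not has_wear_out_zone:
        reasons.append("no age-related wear-out zone")
    if share_reaching_wear_out < min_share:
        reasons.append("too many units fail before the wear-out zone")
    if not can_be_restored:
        reasons.append("item cannot be restored or overhauled")
    return (len(reasons) == 0, reasons)


# A rebuildable item where 95% of units reach the wear-out zone.
print(scheduled_restoration_fits(True, 0.95, True))
# A throwaway item: the third assumption fails.
print(scheduled_restoration_fits(True, 0.95, False))
```

Returning the list of failed assumptions, rather than a bare yes or no, makes it clear why a scheduled restoration task is a poor fit for a given item.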

Perform scheduled on-condition tasks

It is quite apparent from the Conditional Probability of Failure Curve of our example that random failures take place even with equipment scheduled to be overhauled after a time in service. In a thorough reliability program, these random failures are treated in the same way as any random failure. That is, scheduled on-condition tasks, like detailed inspections, are carried out. The inspections may be visual and act as insurance that nothing out of the ordinary is going on.

For example, cables of an overhead crane may be scheduled for change-out every 9 months. Experience may show this is the Useful Life in a certain application. Yet, the maintenance department may still thoroughly inspect every foot of the cable every two months to ensure that there are no breaks or fraying of cable strands. This is a scheduled inspection of an on-condition task. The cable stays in service on the condition that it is not frayed. When 9 months of service comes along, the cable is changed. At that point, the Wear-out Zone has been reached for the cable.
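The crane cable example combines a fixed change-out with recurring inspections, and it can be sketched in a few lines. The function names here are hypothetical; the 9-month change-out and 2-month inspection interval come from the example above.

```python
def cable_schedule(change_out_month=9, inspection_interval_months=2):
    """Return (month, task) pairs for one cable life cycle."""
    tasks = []
    month = inspection_interval_months
    # Inspections recur until the scheduled change-out is reached.
    while month < change_out_month:
        tasks.append((month, "inspect"))
        month += inspection_interval_months
    tasks.append((change_out_month, "change out"))
    return tasks


def cable_stays_in_service(months_in_service, frayed, change_out_month=9):
    """On-condition rule: the cable stays only if it is not frayed
    and has not yet reached its scheduled change-out age."""
    return (not frayed) and months_in_service < change_out_month


for month, task in cable_schedule():
    print(month, task)
```

Under these settings the cable is inspected at months 2, 4, 6, and 8 and changed out at month 9. A cable found frayed at any inspection comes out early, whatever its age.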

In many cases, scheduled inspections of equipment that has an age-related failure mode may use predictive technologies like infrared scans to detect heat, non-destructive testing to identify stress cracks, and oil analysis to determine the type and accumulation of dirt particles.

For example, assume a truck engine has a useful life of 10,000 operating hours. Beyond that point, a truck fleet can be expected to see multiple failures. Consequently, a strategy is developed that says truck engines will be taken out of service at a point not to exceed 10,000 operating hours. Yet, as we have seen, multiple engines within the fleet can be expected to fail randomly.

To prolong the life of every engine in the truck fleet, the maintenance organization does a couple of things. First, regular oil changes are scheduled, performed, and tracked for compliance. Second, a sample of the oil is analyzed after each oil change to determine if it contains any excessive wear particles. If it does, maintenance schedules the engine exchange in a timely manner so that the truck doesn’t fail in service.
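The engine strategy boils down to two checks after each oil change: the hour limit and the oil sample. Here is a minimal sketch of that decision. The function name is hypothetical, and the wear-particle limit of 100 ppm is an assumed placeholder, because the real alarm level depends on the engine and the laboratory doing the analysis.

```python
def engine_action(operating_hours, wear_particles_ppm,
                  hour_limit=10_000, particle_limit_ppm=100.0):
    """Decide what to do with a truck engine after an oil change.

    hour_limit:         scheduled exchange point from the example above.
    particle_limit_ppm: assumed alarm level for wear particles in the
                        oil sample (illustrative, not from the book).
    """
    # Time-based rule: never run past the scheduled exchange point.
    if operating_hours >= hour_limit:
        return "exchange: hour limit reached"
    # On-condition rule: excessive wear particles mean an early exchange.
    if wear_particles_ppm > particle_limit_ppm:
        return "schedule exchange: excessive wear particles"
    return "keep in service"


print(engine_action(4_200, 35.0))
print(engine_action(4_200, 180.0))
print(engine_action(10_000, 35.0))
```

The hour limit is checked first, so an engine at its scheduled exchange point is exchanged even if its oil sample looks clean.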

This is common sense and common practice. Even when equipment is expected to fail on a time basis, experience says that things can happen to bring it to its knees beforehand. This strategy only holds for equipment that has some degree of complexity. For example, wear plates and conveyor belts that can be readily observed by an operator or any interested party do not require predictive analysis to determine when failure might occur.

It is very common for items that have a history of scheduled restoration tasks, overhauls, or component exchanges to have scheduled inspections of on-condition tasks to guard against random failure.

Perform scheduled discard/throwaway tasks

A scheduled discard proactive task is assigned to equipment where a critical functional failure cannot be tolerated for any reason. A condition where a critical item must be removed from service before a failure occurs is called a safe-life limit. That is, while in service, the item must have a 100% probability of surviving to the next period. There can be no chance of the item failing. Safe-life limits are typically associated with simple pieces of equipment.

For example, a battery operating a critical sensing device like a gas analyzer or a light bulb in a panel indicating a critical condition must not be allowed to fail. Individual batteries and bulbs are tested in a laboratory and their failure points noted. Based upon the testing, the period before any failures begin is then divided by 2, 3, or sometimes 4 to provide a margin of safety. When that point is reached in service, the item (battery or bulb in this case) is taken out of service and discarded knowing full well more life is probably available from the item.
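The safe-life arithmetic is a single division, shown here as a small sketch. The 1,200-hour failure-free period is a made-up laboratory result used only for illustration, and the function name is hypothetical.

```python
def safe_life_limit(failure_free_period, safety_factor):
    """Divide the tested failure-free period by a safety factor.

    failure_free_period: time in test before any failures begin.
    safety_factor:       2, 3, or sometimes 4, per the article.
    """
    if safety_factor < 1:
        raise ValueError("safety factor must be at least 1")
    return failure_free_period / safety_factor


# Assumed test result: no battery failed before 1,200 hours.
failure_free_hours = 1200
for factor in (2, 3, 4):
    limit = safe_life_limit(failure_free_hours, factor)
    print(f"safety factor {factor}: discard at {limit:.0f} hours")
```

With the assumed 1,200-hour result, the discard point falls at 600, 400, or 300 hours depending on the safety factor chosen.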

For an item with a safe-life limit, the goal is not to collect any failure data.

A critical item can also be removed and discarded for economic consequences. This is called an economic-life limit and is treated in the same manner as the conditional probability bathtub and slowly increasing wear curves discussed in the scheduled restoration/overhaul sections. And, similar to the restoration task, the assumptions are that:

  • The equipment wears out over time, having an age-related failure mode with a wear-out zone
  • A large percentage of this equipment type must survive to the wear-out zone
  • The equipment is NOT intended to be rebuilt and is discarded or scrapped

For example, an elevator for a reheat furnace may be driven by a bronze nut/steel screw arrangement. The life of the screws and nuts may be 16 months, but they are changed at 14 months because of an economic-life limit. If any of the screw/nut drives fails early, it creates a significant economic hardship (delay costs) and must be avoided. Consequently, the screw/nut drives are changed out early even though more life remains. Upon being removed from service, the screws and nuts are scrapped.

Just like the scheduled restoration/overhaul task, this type of scheduled task is wasteful and inefficient. But for the sake of safety or to limit financial risk, it is a management decision to accept the waste.