Guide to Predictive Maintenance--Impact of Maintenance


Maintenance costs are a major part of the total operating costs of all manufacturing or production plants. Depending on the specific industry, maintenance costs can represent between 15 and 60 percent of the cost of goods produced. For example, in food-related industries, average maintenance costs represent about 15 percent of the cost of goods produced, whereas maintenance costs for iron and steel, pulp and paper, and other heavy industries represent up to 60 percent of the total production costs.

These percentages may be misleading. In most American plants, reported maintenance costs include many nonmaintenance-related expenditures. For example, many plants include modifications to existing capital systems that are driven by market-related factors, such as new products. These expenses are not truly maintenance and should be allocated to nonmaintenance cost centers. Even so, true maintenance costs are substantial, and reducing them represents a short-term improvement that can directly impact plant profitability.

Recent surveys of maintenance management effectiveness indicate that one-third--33 cents out of every dollar--of all maintenance costs is wasted as the result of unnecessary or improperly carried out maintenance. When you consider that U.S. industry spends more than $200 billion each year on maintenance of plant equipment and facilities, the impact on productivity and profit that is represented by the maintenance operation becomes clear.

The result of ineffective maintenance management represents a loss of more than $60 billion each year. Perhaps more important is the fact that ineffective maintenance management significantly affects the ability to manufacture quality products that are competitive in the world market. The losses of production time and product quality that result from poor or inadequate maintenance management have had a dramatic impact on U.S. industries' ability to compete with Japan and other countries that have implemented more advanced manufacturing and maintenance management philosophies.
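As a quick check on these figures, the arithmetic can be sketched as follows. The inputs are the numbers quoted above; the variable names are illustrative only.

```python
# Figures quoted in the text; variable names are illustrative only.
annual_spend_usd = 200e9   # "more than $200 billion each year"
wasted_fraction = 1 / 3    # "one-third ... of all maintenance costs is wasted"

wasted_usd = annual_spend_usd * wasted_fraction
print(f"Estimated waste: ${wasted_usd / 1e9:.1f} billion per year")
```

One-third of $200 billion is roughly $67 billion, which is consistent with the "more than $60 billion" loss cited above.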

The dominant reason for this ineffective management is the lack of factual data to quantify the actual need for repair or maintenance of plant machinery, equipment, and systems. Maintenance scheduling has been, and in many instances still is, predicated on statistical trend data or on the actual failure of plant equipment.

Until recently, middle- and corporate-level management have ignored the impact of the maintenance operation on product quality, production costs, and more important, on bottom-line profit. The general opinion has been "Maintenance is a necessary evil" or "Nothing can be done to improve maintenance costs." Perhaps these statements were true 10 or 20 years ago, but the development of microprocessor- or computer-based instrumentation that can be used to monitor the operating condition of plant equipment, machinery, and systems has provided the means to manage the maintenance operation. This instrumentation has provided the means to reduce or eliminate unnecessary repairs, prevent catastrophic machine failures, and reduce the negative impact of the maintenance operation on the profitability of manufacturing and production plants.

1. MAINTENANCE MANAGEMENT METHODS

To understand a predictive maintenance management program, traditional management techniques should first be considered. Industrial and process plants typically employ two types of maintenance management: run-to-failure and preventive maintenance.

1.1 Run-to-Failure Management

The logic of run-to-failure management is simple and straightforward: When a machine breaks down, fix it. The "If it ain't broke, don't fix it" method of maintaining plant machinery has been a major part of plant maintenance operations since the first manufacturing plant was built, and on the surface it sounds reasonable. A plant using run-to-failure management does not spend any money on maintenance until a machine or system fails to operate.

Run-to-failure is a reactive management technique that waits for machine or equipment failure before any maintenance action is taken; in effect, it’s a "no maintenance" approach to management. It’s also the most expensive method of maintenance management. Few plants use a true run-to-failure management philosophy. In almost all instances, plants perform basic preventive tasks (e.g., lubrication and machine adjustments), even in a run-to-failure environment.

In this type of management, however, machines and other plant equipment are not rebuilt, nor are any major repairs made until the equipment fails to operate. The major expenses associated with this type of maintenance management are high spare parts inventory cost, high overtime labor costs, high machine downtime, and low production availability.

Because no attempt is made to anticipate maintenance requirements, a plant that uses true run-to-failure management must be able to react to all possible failures within the plant. This reactive method of management forces the maintenance department to maintain extensive spare parts inventories that include spare machines or at least all major components for all critical equipment in the plant. The alternative is to rely on equipment vendors that can provide immediate delivery of all required spare parts.

Even if the latter option is possible, premiums for expedited delivery substantially increase the costs of repair parts and downtime required to correct machine failures.

To minimize the impact on production created by unexpected machine failures, maintenance personnel must also be able to react immediately to all machine failures. The net result of this reactive type of maintenance management is higher maintenance cost and lower availability of process machinery. Analysis of maintenance costs indicates that a repair performed in the reactive or run-to-failure mode will cost, on average, about three times as much as the same repair made within a scheduled or preventive mode. Scheduling the repair minimizes the repair time and associated labor costs. It also reduces the negative impact of expedited shipments and lost production.

Fig. 1 Typical bathtub curve.

1.2 Preventive Maintenance

There are many definitions of preventive maintenance, but all preventive maintenance management programs are time-driven. In other words, maintenance tasks are based on elapsed time or hours of operation. Fig. 1 illustrates an example of the statistical life of a machine-train. The mean-time-to-failure (MTTF) or bathtub curve indicates that a new machine has a high probability of failure because of installation problems during the first few weeks of operation. After this initial period, the probability of failure is relatively low for an extended period. After this normal machine life period, the probability of failure increases sharply with elapsed time. In preventive maintenance management, machine repairs or rebuilds are scheduled based on the MTTF statistic.
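The shape of the bathtub curve described above can be sketched with a simple piecewise function. This is a minimal illustration only: the function name, the linear segments, and every parameter value are assumptions chosen to show the three regions, not data from Fig. 1.

```python
def failure_probability(t_months, burn_in=1.0, wear_out=18.0,
                        base=0.02, early=0.20, wear_slope=0.05):
    """Piecewise sketch of a bathtub-shaped failure rate (per month).

    All parameter values are hypothetical, chosen only to show the shape.
    """
    if t_months < 0:
        raise ValueError("elapsed time must be non-negative")
    if t_months < burn_in:
        # Start-up period: elevated rate that decays to the base rate.
        return early - (early - base) * (t_months / burn_in)
    if t_months <= wear_out:
        # Normal machine life: low, roughly constant rate.
        return base
    # Wear-out period: rate climbs with elapsed time.
    return base + wear_slope * (t_months - wear_out)


for month in (0, 0.5, 6, 18, 24):
    print(month, round(failure_probability(month), 3))
```

A time-driven program would schedule the rebuild shortly before the assumed `wear_out` point, which is exactly where the curve for a specific machine may differ from the statistical average.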

The actual implementation of preventive maintenance varies greatly. Some programs are extremely limited and consist of only lubrication and minor adjustments.

Comprehensive preventive maintenance programs schedule repairs, lubrication, adjustments, and machine rebuilds for all critical plant machinery. The common denominator for all of these preventive maintenance programs is the scheduling guideline: time.

All preventive maintenance management programs assume that machines will degrade within a time frame typical of their particular classification. For example, a single-stage, horizontal split-case centrifugal pump will normally run 18 months before it must be rebuilt. Using preventive management techniques, the pump would be removed from service and rebuilt after 17 months of operation. The problem with this approach is that the mode of operation and system or plant-specific variables directly affect the normal operating life of machinery. The mean-time-between-failures (MTBF) is not the same for a pump that handles water and one that handles abrasive slurries.

The normal result of using MTBF statistics to schedule maintenance is either unnecessary repairs or catastrophic failure. In the example, the pump may not need to be rebuilt after 17 months, in which case the labor and material used to make the repair were wasted. The second outcome is even more costly: if the pump fails before 17 months, it must be repaired using run-to-failure techniques.

Analysis of maintenance costs has shown that repairs made in a reactive (i.e., after failure) mode normally cost three times as much as the same repairs made on a scheduled basis.
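The two outcomes of the 17-month rebuild policy can be illustrated with a short sketch. The pump lives below are invented for illustration, the scheduled repair cost is normalized to 1.0, and the function names are hypothetical; only the 17-month interval and the three-to-one cost ratio come from the text.

```python
def classify_rebuilds(actual_lives_months, rebuild_interval=17):
    """Split pumps into those rebuilt on schedule and those that fail first."""
    rebuilt_early = [m for m in actual_lives_months if m >= rebuild_interval]
    failed_first = [m for m in actual_lives_months if m < rebuild_interval]
    return rebuilt_early, failed_first


def policy_cost(actual_lives_months, scheduled_cost=1.0,
                reactive_multiplier=3.0, rebuild_interval=17):
    """Total repair cost: scheduled rebuilds plus reactive repairs at ~3x."""
    rebuilt_early, failed_first = classify_rebuilds(
        actual_lives_months, rebuild_interval)
    return (len(rebuilt_early) * scheduled_cost
            + len(failed_first) * scheduled_cost * reactive_multiplier)


# Hypothetical actual lives (months) for five pumps in different services.
lives = [12, 16, 18, 22, 30]
rebuilt_early, failed_first = classify_rebuilds(lives)

# Remaining life given up by rebuilding at 17 months regardless of condition.
unused_months = sum(m - 17 for m in rebuilt_early)

print(len(rebuilt_early), "rebuilt on schedule,", unused_months, "months unused")
print(len(failed_first), "failed before the scheduled rebuild")
print("total cost (scheduled repair = 1.0):", policy_cost(lives))
```

With these invented lives, three pumps are rebuilt with 19 months of remaining life unused, and two fail early and incur the reactive premium, for a total cost of 9.0 scheduled-repair units.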

1.3 Predictive Maintenance

Like preventive maintenance, predictive maintenance has many definitions. To some workers, predictive maintenance is monitoring the vibration of rotating machinery in an attempt to detect incipient problems and to prevent catastrophic failure. To others, it’s monitoring the infrared image of electrical switchgear, motors, and other electrical equipment to detect developing problems. The common premise of predictive maintenance is that regular monitoring of the actual mechanical condition, operating efficiency, and other indicators of the operating condition of machine-trains and process systems will provide the data required to ensure the maximum interval between repairs and minimize the number and cost of unscheduled outages created by machine-train failures.

Predictive maintenance is much more, however. It’s the means of improving productivity, product quality, and overall effectiveness of manufacturing and production plants. Predictive maintenance is not vibration monitoring or thermal imaging or lubricating oil analysis or any of the other nondestructive testing techniques that are being marketed as predictive maintenance tools.

Predictive maintenance is a philosophy or attitude that, simply stated, uses the actual operating condition of plant equipment and systems to optimize total plant operation.

A comprehensive predictive maintenance management program uses the most cost-effective tools (e.g., vibration monitoring, thermography, tribology) to obtain the actual operating condition of critical plant systems and, based on these data, schedules all maintenance activities on an as-needed basis. Including predictive maintenance in a comprehensive maintenance management program optimizes the availability of process machinery and greatly reduces the cost of maintenance. It also improves the product quality, productivity, and profitability of manufacturing and production plants.

Predictive maintenance is a condition-driven preventive maintenance program. Instead of relying on industrial or in-plant average-life statistics (i.e., mean-time-to-failure) to schedule maintenance activities, predictive maintenance uses direct monitoring of the mechanical condition, system efficiency, and other indicators to determine the actual mean-time-to-failure or loss of efficiency for each machine-train and system in the plant. At best, traditional time-driven methods provide a guideline to "normal" machine-train life spans. The final decision in preventive or run-to-failure programs on repair or rebuild schedules must be made on the basis of intuition and the personal experience of the maintenance manager.

The addition of a comprehensive predictive maintenance program can and will provide factual data on the actual mechanical condition of each machine-train and the operating efficiency of each process system. These data give the maintenance manager a factual basis for scheduling maintenance activities. A predictive maintenance program can minimize unscheduled breakdowns of all mechanical equipment in the plant and ensure that repaired equipment is in acceptable mechanical condition. The program can also identify machine-train problems before they become serious. Most mechanical problems can be minimized if they are detected and repaired early. Normal mechanical failure modes degrade at a speed directly proportional to their severity. If the problem is detected early, major repairs can usually be prevented.

Predictive maintenance using vibration signature analysis is predicated on two basic facts: (1) all common failure modes have distinct vibration frequency components that can be isolated and identified, and (2) the amplitude of each distinct vibration component will remain constant unless the operating dynamics of the machine-train change. These facts, their impact on machinery, and methods that will identify and quantify the root cause of failure modes are developed in more detail in later sections.
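The second fact suggests a simple comparison: if the amplitude of a frequency component departs from its baseline, the operating dynamics have changed. The following is a minimal sketch under stated assumptions. The function name, the 10 percent tolerance, and the sample frequencies and amplitudes are all hypothetical, and real programs apply more elaborate alert and alarm limits.

```python
def changed_components(baseline, current, tolerance=0.10):
    """Return frequency components whose amplitude moved more than
    `tolerance`, expressed as a fraction of the baseline amplitude."""
    flagged = {}
    for freq_hz, base_amp in baseline.items():
        cur_amp = current.get(freq_hz)
        if cur_amp is None or base_amp == 0:
            continue  # no matching reading, or no meaningful baseline
        relative_change = (cur_amp - base_amp) / base_amp
        if abs(relative_change) > tolerance:
            flagged[freq_hz] = relative_change
    return flagged


# Hypothetical amplitudes keyed by frequency in Hz (units are arbitrary).
baseline = {29.5: 0.10, 59.0: 0.04, 147.5: 0.02}
current = {29.5: 0.10, 59.0: 0.09, 147.5: 0.021}

flagged = changed_components(baseline, current)
for freq_hz, change in flagged.items():
    print(f"{freq_hz} Hz changed by {change:+.0%}")
```

Here only the 59.0 Hz component is flagged; the other two remain within tolerance of their baseline values.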

Predictive maintenance using process efficiency, heat loss, or other nondestructive techniques can quantify the operating efficiency of non-mechanical plant equipment or systems. These techniques used in conjunction with vibration analysis can provide maintenance managers and plant engineers with information that will enable them to achieve optimum reliability and availability from their plants.

Five nondestructive techniques are normally used for predictive maintenance management: vibration monitoring, process parameter monitoring, thermography, tribology, and visual inspection. Each technique has a unique data set that assists the maintenance manager in determining the actual need for maintenance.

How do you determine which technique or techniques are required in your plant? How do you determine the best method to implement each of the technologies? How do you separate the good from the bad? Most comprehensive predictive maintenance programs use vibration analysis as the primary tool. Because most normal plant equipment is mechanical, vibration monitoring provides the best tool for routine monitoring and identification of incipient problems; however, vibration analysis does not provide the data required on electrical equipment, areas of heat loss, condition of lubricating oil, or other parameters that should be included in your program.

1.4 Other Maintenance Improvement Methods

Over the past 10 years, a variety of management methods, such as total productive maintenance (TPM) and reliability-centered maintenance (RCM), have been developed and touted as the panacea for ineffective maintenance. Many domestic plants have partially adopted one of these quick-fix methods in an attempt to compensate for perceived maintenance shortcomings.

Total Productive Maintenance

Touted as the Japanese approach to effective maintenance management, the TPM concept was developed by Deming in the late 1950s. His concepts, as adapted by the Japanese, stress absolute adherence to the basics, such as lubrication, visual inspections, and universal use of best practices in all aspects of maintenance.

TPM is not a maintenance management program. Most of the activities associated with the Japanese management approach are directed at the production function and assume that maintenance will provide the basic tasks required to maintain critical production assets. All of the quantifiable benefits of TPM are couched in terms of capacity, product quality, and total production cost. Unfortunately, domestic advocates of TPM have tried to implement its concepts as maintenance-only activities. As a result, few of these attempts have been successful.

At the core of TPM is a new partnership among the manufacturing or production people, maintenance, engineering, and technical services to improve what is called overall equipment effectiveness (OEE). It’s a program of zero breakdowns and zero defects aimed at improving or eliminating the following six crippling shop-floor losses:

• Equipment breakdowns

• Setup and adjustment slowdowns

• Idling and short-term stoppages

• Reduced capacity

• Quality-related losses

• Startup/restart losses

A concise definition of TPM is elusive, but improving equipment effectiveness comes close. The partnership idea is what makes it work. In the Japanese model for TPM are five pillars that help define how people work together in this partnership.

Five Pillars of TPM. Total productive maintenance stresses the basics of good business practices as they relate to the maintenance function. The five fundamentals of this approach include the following:

1. Improving equipment effectiveness. In other words, looking for the six big losses, finding out what causes your equipment to be ineffective, and making improvements.

2. Involving operators in daily maintenance. This does not necessarily mean actually performing maintenance. In many successful TPM programs, operators don’t have to actively perform maintenance. They are involved in the maintenance activity--in the plan, in the program, and in the partnership--but not necessarily in the physical act of maintaining equipment.

3. Improving maintenance efficiency and effectiveness. In most TPM plans, though, the operator is directly involved in some level of maintenance. This effort involves better planning and scheduling, better preventive maintenance, predictive maintenance, reliability-centered maintenance, spare parts equipment stores, and tool locations--the collective domain of the maintenance department and the maintenance technologies.

4. Educating and training personnel. This task is perhaps the most important in the TPM approach. It involves everyone in the company: Operators are taught how to operate their machines properly and maintenance personnel to maintain them properly. Because operators will be performing some of the inspections, routine machine adjustments, and other preventive tasks, training involves teaching operators how to do those inspections and how to work with maintenance in a partnership. Also involved is training supervisors on how to supervise in a TPM-type team environment.

5. Designing and managing equipment for maintenance prevention. Equipment is costly and should be viewed as a productive asset for its entire life.

Designing equipment that is easier to operate and maintain than previous designs is a fundamental part of TPM. Suggestions from operators and maintenance technicians help engineers design, specify, and procure more effective equipment. By evaluating the costs of operating and maintaining the new equipment throughout its life cycle, long-term costs will be minimized. Low purchase prices don’t necessarily mean low life-cycle costs.

Overall equipment effectiveness (OEE) is the benchmark used for TPM programs. The OEE benchmark is established by measuring equipment performance. Measuring equipment effectiveness must go beyond just the availability or machine uptime. It must factor in all issues related to equipment performance. The formula for equipment effectiveness must look at the availability, the rate of performance, and the quality rate. This allows all departments to be involved in determining equipment effectiveness. The formula could be expressed as:

$$\text{OEE} = \text{Availability} \times \text{Performance Rate} \times \text{Quality Rate}$$

The availability is the required availability minus the downtime, divided by the required availability. Expressed as a formula, this would be:

$$\text{Availability} = \frac{\text{Required Availability} - \text{Downtime}}{\text{Required Availability}} \times 100$$

The required availability is the time production is to operate the equipment, minus the miscellaneous planned downtime, such as breaks, scheduled lapses, meetings, and the like. The downtime is the actual time the equipment is down for repairs or changeover.

This is also sometimes called breakdown downtime. The calculation gives the true availability of the equipment. This number should be used in the effectiveness formula.

The goal for most Japanese companies is greater than 90 percent.

The performance rate is the ideal or design cycle time to produce the product multiplied by the output and divided by the operating time. This will give a performance rate percentage. The formula is:

$$\text{Performance Rate} = \frac{\text{Design Cycle Time} \times \text{Output}}{\text{Operating Time}} \times 100$$

The design cycle time or production output is in a unit of production, such as parts per hour. The output is the total output for the given time period. The operating time is the availability value from the previous formula. The result is a percentage of performance. This formula is useful for spotting capacity reduction breakdowns. The goal for most Japanese companies is greater than 95 percent.

The quality rate is the production input into the process or equipment minus the volume or number of quality defects divided by the production input. The formula is:

$$\text{Quality Rate} = \frac{\text{Production Input} - \text{Quality Defects}}{\text{Production Input}} \times 100$$

The production input is the unit of product being fed into the process or production cycle. The quality defects are the amount of product that is below quality standards (not rejected; there is a difference) after the process or production cycle is finished.

The formula is useful in spotting production-quality problems, even when the customer accepts the poor-quality product. The goal for Japanese companies is higher than 99 percent.

Combining the total for the Japanese goals, it’s seen that:

$$90\% \times 95\% \times 99\% \approx 85\%$$

To be able to compete for the national TPM prize in Japan, equipment effectiveness must be greater than 85 percent. Unfortunately, equipment effectiveness in most U.S. companies barely breaks 50 percent--little wonder that there is so much room for improvement in typical equipment maintenance management programs.
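The three rates and their product can be computed directly from the definitions given above. The sketch below makes two assumptions that the text leaves open: the design cycle time is expressed as time per unit (so that cycle time multiplied by output is comparable with operating time), and the operating time is taken as the required availability minus downtime. The function names and the sample shift figures are hypothetical.

```python
def availability(required_availability, downtime):
    """(Required availability - downtime) / required availability."""
    return (required_availability - downtime) / required_availability


def performance_rate(design_cycle_time, output, operating_time):
    """(Design cycle time x output) / operating time.

    Assumes cycle time is time per unit, in the same units as operating_time.
    """
    return design_cycle_time * output / operating_time


def quality_rate(production_input, quality_defects):
    """(Production input - quality defects) / production input."""
    return (production_input - quality_defects) / production_input


def oee(avail, perf, qual):
    """Overall equipment effectiveness as a fraction (multiply by 100 for %)."""
    return avail * perf * qual


# Benchmark goals quoted in the text: 90%, 95%, and 99%.
benchmark = oee(0.90, 0.95, 0.99)

# Hypothetical shift: 480 min required, 48 min down, 1.0 min/part design
# cycle time, 400 parts produced, 8 of them below quality standards.
avail = availability(480, 48)
perf = performance_rate(1.0, 400, 480 - 48)
qual = quality_rate(400, 8)
shift_oee = oee(avail, perf, qual)

print(f"benchmark OEE: {benchmark:.3f}")
print(f"shift OEE:     {shift_oee:.3f}")
```

Multiplying the three goals gives about 0.846, which the text rounds to 85 percent. The hypothetical shift falls just short because its performance rate is below the 95 percent goal.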

Reliability-Centered Maintenance

A basic premise of RCM is that all machines must fail and have a finite useful life, but neither of these assumptions is valid. If machinery and plant systems are properly designed, installed, operated, and maintained, they won’t fail, and their useful life is almost infinite. Few, if any, catastrophic failures are random, and some outside influence, such as operator error or improper repair, causes all failures. With the exception of instantaneous failures caused by gross operator error or a totally abnormal outside influence, the operating dynamics analysis methodology can detect, isolate, and prevent system failures.

Because RCM is predicated on the belief that all machines will degrade and fail (P-F curve), most of the tasks, such as failure modes and effects analysis (FMEA) and Weibull distribution analysis, are used to anticipate when these failures will occur.

Both of the theoretical methods are based on probability tables that assume proper design, installation, operation, and maintenance of plant machinery. Neither is able to adjust for abnormal deviations in any of these categories.

When the RCM approach was first developed in the 1960s, most production engineers believed that machinery had a finite life and required periodic major rebuilding to maintain acceptable levels of reliability. In his guide Reliability-Centered Maintenance (1992), John Moubray states:

The traditional approach to scheduled maintenance programs was based on the concept that every item on a piece of complex equipment has a right age at which complete overhaul is necessary to ensure safety and operating reliability. Through the years, however, it was discovered that many types of failures could not be prevented or effectively reduced by such maintenance activities, no matter how intensively they were performed. In response to this problem, airplane designers began to develop design features that mitigated failure consequences-that is, they learned how to design airplanes that were failure tolerant. Practices such as the replication of system functions, the use of multiple engines, and the design of damage tolerant structures greatly weakened the relationship between safety and reliability, although this relationship has not been eliminated altogether.

Moubray points to two examples of successful application of RCM in the commercial aircraft industry--the Douglas DC-10 and the Boeing 747. When his guide was written, both of these aircraft were viewed as exceptionally reliable; however, history has changed this view. The DC-10 has the worst accident record of any aircraft used in commercial aviation; it has proven to be chronically unreliable. The Boeing 747 has fared better, but has had several accidents that were directly caused by reliability problems.

Not until the early 1980s did predictive maintenance technologies, such as microprocessor-based vibration analysis, provide an accurate means of early detection of incipient problems. With the advent of these new technologies, most of the founding premises of RCM disappeared. The ability to detect the slightest deviation from optimum operating condition of critical plant systems provides the means to prevent deterioration that ultimately results in failure of these systems. If prompt corrective action is taken, it effectively stops the degradation and prevents the failure that is the heart of the P-F curve.

2. OPTIMIZING PREDICTIVE MAINTENANCE

Too many of the predictive maintenance programs that have been implemented have failed to generate measurable benefits. These failures have not been caused by technology limitations, but rather by the failure to make the necessary changes in the workplace that would permit maximum utilization of these predictive tools. At a minimum, the following proactive steps can eliminate these restrictions and, as a result, help gain maximum benefits from the predictive maintenance program.

2.1 Culture Change

The first change that must take place is to change the perception that predictive technologies are exclusively a maintenance management or breakdown prevention tool.

This change must take place at the corporate level and permeate throughout the plant organization. This task may sound simple, but changing corporate attitude toward or perception of maintenance and predictive maintenance is difficult. Because most corporate-level managers have little or no knowledge or understanding of maintenance--or even the need for maintenance--convincing them that a broader use of predictive technologies is necessary is extremely difficult. In their myopic view, breakdowns and unscheduled delays are solely a maintenance issue. They cannot understand that most of these failures are the result of non-maintenance issues.

Studies of equipment reliability problems conducted over the past 30 years indicate that maintenance is responsible for about 17 percent of production interruptions and quality problems. The remaining 83 percent are totally outside of the traditional maintenance function's responsibility. Inappropriate operating practices, poor design, non-specification parts, and a myriad of other non-maintenance reasons are the primary contributors to production and product-quality problems, not maintenance.

Predictive technologies should be used as a plant or process optimization tool. In this broader scope, they are used to detect, isolate, and provide solutions for all deviations from acceptable performance that result in lost capacity, poor quality, abnormal costs, or a threat to employee safety. These technologies have the power to fill this critical role, but that power is simply not being used. To accomplish this new role, the use of predictive technologies should be shifted from the maintenance department to a reliability group that is charged with the responsibility and is accountable for plant optimization. This group must have the authority to cross all functional boundaries and to implement changes that correct problems uncovered by their evaluations.

This approach is a radical departure from the traditional organization found in most plants. As a result, resistance will be met from all levels of the organization. With the exception of those few employees who understand the absolute need for a change to better, more effective practices, most of the workforce won’t openly embrace or voluntarily accept this new functional group; however, the formation of a dedicated group of professionals that is absolutely and solely responsible for reliability improvement and optimization of all facets of plant operation is essential. It’s the only way a plant or corporation can achieve and sustain world-class performance.

Staffing this new group won’t be easy. The team must have a thorough knowledge of machine and process design, and be able to implement best practices in both operation and maintenance of all critical production systems in the plant. In addition, they must fully understand procurement and plant engineering methods that will provide best life-cycle cost for these systems. Finally, the team must understand the proper use of predictive technologies. Few plants have existing employees who have all of these fundamental requirements.

This problem can be resolved in two ways. The first approach would be to select personnel who have mastered one or more of these knowledge requirements. For example, the group might consist of the best operations, maintenance, engineering, and predictive personnel available from the current workforce. Care must be taken to ensure that each group member has a real knowledge of his or her specialty area. One common problem that plagues plants is that the superstars in the organization don’t have a real, in-depth knowledge of their perceived specialty. In other words, the best operator may in fact be the worst contributor to reliability or performance problems.

Although he or she can get more capacity through the unit than anyone else, the practices used may be the root-cause of chronic problems.

If this approach is followed, training for the reliability team must be the first priority.

Few existing personnel will have all of the knowledge and skills required by this function, especially regarding application of predictive technologies. Therefore, the company must provide sufficient training to ensure maximum return on its investment.

This training should focus on process or operating dynamics for each of the critical production systems in the plant. It should include comprehensive process design, operating envelope, operating methods, and process diagnostics training that will form the foundation for the reliability group's ability to optimize performance.

The second approach is to hire professional reliability engineers. This approach may sound easier, but it’s not, because there are very few fully qualified reliability professionals available, and they are very expensive. Most of these professionals prefer to offer their services as short-term consultants rather than become long-term employees. If you try to hire rather than staff internally, use extreme caution. Résumés may sound great, but real knowledge is hard to find. For example, we recently interviewed 150 "qualified" predictive engineers but found only 5 with the basic knowledge we required. Even then, these five candidates required extensive training before they could provide acceptable levels of performance.

2.2 Proper Use of Predictive Technologies

System components, such as pumps, gearboxes, and so on, are an integral part of the system and must operate within their design envelope before the system can meet its designed performance levels. Why, then, do most predictive programs treat these components as isolated machine-trains and not as part of an integrated system? Instead of evaluating a centrifugal pump or gearbox as part of the total machine, most predictive analysts limit technology use to simple diagnostics of the mechanical condition of that individual component. As a result, no effort is made to determine the influence of system variables, like load, speed, product, or instability, on the individual component. These variations in process variables are often the root cause of the observed mechanical problem in the pump or gearbox. Unless analysts consider these variables, they won’t be able to determine the true root cause. Instead, they will make recommendations to correct the symptom (e.g., damaged bearing, misalignment), rather than the real problem.

The converse is also true. When diagnostics are limited to individual components, system problems cannot be detected, isolated, and resolved. The system, not the individual components of that system, generates capacity, revenue, and bottom-line profit for the plant. Therefore, the system must be the primary focus of analysis.

When one thinks of predictive maintenance, vibration monitoring, thermography, or tribology usually comes to mind. These are powerful tools, but they are not the panacea for plant problems. Used individually or in combination, these three cornerstones of predictive technologies cannot provide all of the diagnostics required to achieve and sustain world-class performance levels. To gain maximum benefit from predictive technologies, the following changes are needed: Process parameters, such as flow rates, retention time, temperatures, and others, are absolute requirements in all predictive maintenance and process optimization programs. These parameters define the operating envelope of the process and are essential requirements for system operation. In many cases, these data are readily available.

On systems that use computer-based control or programmable logic controllers (PLCs), the parameters or variables that define their operating envelopes are automatically acquired and then used by the control logic to operate the system. The type and number of variables vary from system to system but are based on the actual design and mode of operation for that specific type of production system. It is a relatively simple matter to acquire these data from the Level I control system and use them as part of the predictive diagnostic logic. In most cases, these data combined with traditional predictive technologies provide all of the data an analyst needs to fully understand the system's performance.

Manually operated systems should not be ignored. Although the process data is more difficult to obtain, the reliability or predictive analyst can usually acquire enough data to permit full diagnostics of the system's performance or operating condition. Analog gauges, thermocouples, strip chart recorders, and other traditional plant instrumentation can be used. If plant instrumentation includes an analog or digital output, most microprocessor-based vibration meters can be used for direct data acquisition. These instruments can directly acquire most proportional signal outputs and automate the data acquisition and management that is required for this expanded scope of predictive technology.
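Once process parameters have been acquired, whether from the control system or from plant instrumentation, they can be checked against the operating envelope before any vibration data are interpreted. The sketch below is only an illustration: the variable names and limits are hypothetical and would have to come from the actual design data for the system being monitored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limit:
    """Acceptable range for one process variable."""
    low: float
    high: float


# Hypothetical operating envelope; real limits come from the system's design data.
ENVELOPE = {
    "flow_rate_gpm": Limit(400.0, 600.0),
    "discharge_pressure_psi": Limit(90.0, 120.0),
    "bearing_temp_f": Limit(100.0, 180.0),
}


def out_of_envelope(readings, envelope=ENVELOPE):
    """Return the names of variables that are missing or outside their limits."""
    flagged = []
    for name, limit in envelope.items():
        value = readings.get(name)
        if value is None or not (limit.low <= value <= limit.high):
            flagged.append(name)
    return flagged


# One hypothetical snapshot taken at the same time as the vibration data.
snapshot = {
    "flow_rate_gpm": 350.0,
    "discharge_pressure_psi": 101.0,
    "bearing_temp_f": 165.0,
}
print(out_of_envelope(snapshot))
```

A flagged variable does not by itself identify a fault; it tells the analyst that the vibration data were taken outside the design envelope and must be read with that in mind.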

Because most equipment used in domestic manufacturing, production, and process plants consists of electromechanical systems, our discussion begins with the best methods for this classification of equipment. Depending on the plant, these systems may range from simple machine-trains, such as drive couple pumps and electric motors, to complex continuous process lines. Regardless of the complexity, the methods that should be used are similar.

In all programs, the primary focus of the predictive maintenance program must be on the critical process systems or machine-trains that constitute the primary production activities of the plant. Although auxiliary equipment is important, the program must first address those systems on which the plant relies to produce revenue. In many cases, this approach is a radical departure from the currently used methods in traditional applications of predictive maintenance. In these programs, the focus is on simple rotating machinery and excludes the primary production processes.

Electromechanical Systems

Predictive maintenance for all electromechanical systems, regardless of their complexity, should use a combination of vibration monitoring, operating dynamics analysis, and infrared technologies. This combination is needed to ensure the ability to accurately determine the operating condition, to identify any deviation from acceptable operations, and to isolate the root-cause of these deviations.

Vibration Analysis. Single-channel vibration analysis, using microprocessor-based, portable instruments, is acceptable for routine monitoring of these critical production systems; however, the methods used must provide an accurate representation of the operating condition of the machine or system. The biggest change that must be made is in the parameters that are used to acquire vibration data.

When the first microprocessor-based vibration meter was developed in the early 1980s, the ability to acquire multiple blocks of raw data and then calculate an average vibration value was incorporated to eliminate the potential for spurious signals or bad data resulting from impacts or other transients that might distort the vibration signature. Generally, one to three blocks of data are adequate to acquire an accurate vibration signature. Today, most programs are set up to acquire 8 to 12 blocks of data from each measurement point. These data are then averaged and stored for analysis.

This methodology poses two problems. First, this approach distorts the data that will ultimately be used to determine whether corrective maintenance actions are necessary. When multiple blocks of data are used to create an average, transient events, such as impacts and periodic changes in the vibration profile, are excluded from the stored average that is the basis for analysis. As a result, the analyst is unable to evaluate the impact on operating condition that these transients may cause.
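The dilution effect can be illustrated with a simplified sketch. It assumes plain sample-by-sample averaging of time blocks and a synthetic signal with one impact in a single block; the averaging method in an actual instrument may differ, and all values here are made up for illustration.

```python
import math


def make_block(n_samples, impact_at=None, impact_height=5.0):
    """One block of a unit-amplitude sine wave; optionally add a single impact spike."""
    block = [math.sin(2 * math.pi * 4 * i / n_samples) for i in range(n_samples)]
    if impact_at is not None:
        block[impact_at] += impact_height
    return block


def average_blocks(blocks):
    """Sample-by-sample mean across all blocks (simple linear averaging)."""
    return [sum(samples) / len(blocks) for samples in zip(*blocks)]


N = 64
# Twelve blocks; only the last one contains the impact.
blocks = [make_block(N) for _ in range(11)] + [make_block(N, impact_at=4)]

peak_raw = max(abs(x) for x in blocks[-1])
peak_avg = max(abs(x) for x in average_blocks(blocks))
print(f"peak in the block with the impact: {peak_raw:.2f}")
print(f"peak in the 12-block average:      {peak_avg:.2f}")
```

In this synthetic case the impact raises the peak of its own block to 6.00, but the 12-block average shows only about 1.42, so the stored data barely hint that a transient occurred.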

The second problem is time. Each block of data, depending on the speed of the machine, requires between 5 and 60 seconds of acquisition time. As a result, the time required for data acquisition increases in direct proportion to the number of blocks. For example, a data set using 3 blocks may take 15 seconds. The same data set using 12 blocks will then take 60 seconds. The difference of 45 seconds may not sound like much until you multiply it by the 400 measurement points that are acquired in a typical day (5 labor hours per day) or 8,000 points in a typical month (100 labor hours per month).
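The arithmetic behind these labor figures can be checked with a few lines. The function name is illustrative, and the 5 seconds per block is the lower bound quoted above (3 blocks in 15 seconds); slower machines would scale the result up accordingly.

```python
SECONDS_PER_HOUR = 3600


def extra_labor_hours(points, blocks_used, blocks_needed, seconds_per_block):
    """Hours spent acquiring blocks beyond the number actually needed."""
    extra_seconds = (blocks_used - blocks_needed) * seconds_per_block * points
    return extra_seconds / SECONDS_PER_HOUR


# 400 points per day and 8,000 points per month, 12 blocks instead of 3.
print(extra_labor_hours(400, 12, 3, 5))    # 5.0 hours per day
print(extra_labor_hours(8000, 12, 3, 5))   # 100.0 hours per month
```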

Single-channel vibration instruments cannot provide all of the functions needed to evaluate the operating condition of critical production systems. Because these instruments are limited to steady-state analysis techniques, a successful predictive maintenance program must also include the ability to acquire and analyze both multichannel and transient vibration data. The ideal solution to this requirement is to include a multichannel real-time analyzer. These instruments are designed to acquire, store, and display real-time vibration data from multiple data points on the machine-train. These data provide the means for analysts to evaluate the dynamics of the machine and greatly improve their ability to detect incipient problems long before they become serious.

Real-time analyzers are expensive, and some programs in smaller plants may not be able to justify the additional $50,000 to $100,000 cost. Although not as accurate as using a real-time analyzer, these programs can purchase a multichannel, digital tape recorder that can be used for real-time data acquisition. Several eight-channel digital recorders on the market range in price from $5,000 to $10,000 and have the dynamic range needed for accurate data acquisition. The tape-recorded data can be played back through most commercially available single-channel vibration instruments for analysis. Care must be taken to ensure that each channel of data is synchronized, but this methodology can be used effectively.

Operating Dynamics Analysis. Vibration data should never be used in a vacuum. Because the dynamic forces within the monitored machine and the system that it is a part of generate the vibration profile that is acquired and stored for analysis, both the data acquisition and analysis processes must always include all of the process variables, such as incoming materials, pressures, speeds, temperatures, and so on, that define the operating envelope of the system being evaluated.

Generally, the first five to ten measurement points defined for a machine-train should be process variables. Most of the microprocessor instruments that are used for vibration analysis are actually data loggers. They are capable of either directly acquiring a variety of process inputs, such as pressure, temperature, flow, and so on, or permitting manual input by the technician. These data are essential for accurate analysis of the resultant vibration signature. Unless analysts recognize the process variations, they cannot accurately evaluate the vibration profile. A simple example of this approach is a centrifugal compressor. If the load changes from 100 percent to 50 percent between data sets, the resultant vibration is increased by a factor of four. This is caused by a change in the spring constant of the rotor system.

By design, the load on the compressor acts as a stabilizing force on the rotating element. At 100 percent load, the rotor is forced to turn at or near its true centerline. When the load is reduced to 50 percent, the stabilizing force is reduced by one-half; however, spring constant is a quadratic function, so a 50 percent reduction of the spring constant or stiffness raises the vibration amplitude to 400 percent of its full-load value, that is, by a factor of four.
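The rule of thumb in this example can be written out as a short sketch. It assumes exactly what the text states, that stiffness is proportional to load and that amplitude varies with the inverse square of stiffness; it is not a rotor-dynamics model, and the function name is hypothetical.

```python
def relative_amplitude(load_fraction):
    """Vibration amplitude relative to full load.

    Assumes stiffness is proportional to load and amplitude is proportional
    to 1 / stiffness**2, as in the compressor example in the text.
    """
    if not 0 < load_fraction <= 1:
        raise ValueError("load_fraction must be in the range (0, 1]")
    return 1.0 / load_fraction ** 2


print(relative_amplitude(1.0))   # 1.0 (full load, baseline)
print(relative_amplitude(0.5))   # 4.0 (half load, four times the baseline)
```

Under this assumption, two data sets taken at different loads cannot be compared directly; the load recorded with each data set is what allows the analyst to separate a process change from a mechanical one.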

Infrared Technologies. Measurement of heat and heat distribution is also an essential tool that should be used for all electromechanical systems. In simple machine-trains, it may be limited to infrared thermometers that are used to acquire the temperature-related process variables needed to determine the machine or system's operating envelope. In more complex systems, full infrared scanning techniques may be needed to quantify the heat distribution of the production system. In the former technique, noncontact infrared thermometers are used in conjunction with the vibration meter or data logger to acquire needed temperatures, such as those of bearings, liquids being transferred, and so on. In the latter method, fully functional infrared cameras may be needed to scan boilers, furnaces, electric motors, and a variety of other process systems where surface heat distribution indicates the system's operating condition.

The Total Package. The combination of these three technologies or methods is the minimum needed for an effective predictive maintenance program. In some instances, other techniques, such as ultrasonics, lubricating oil analysis, Meggering, and so on, may be needed to help analysts fully understand the operating dynamics of critical machines or systems within the plant. None of these technologies can provide all of the data needed for accurate evaluation of machine or system condition; however, when used in combination and further augmented with a practical knowledge of machine and system dynamics, these techniques can and will provide a predictive maintenance program that will virtually eliminate catastrophic failures and the need for corrective maintenance. These methods will also extend the useful life and minimize the life cycle cost of critical production systems.

Predictive Maintenance Is More Than Maintenance

Traditionally, predictive maintenance is used solely as a maintenance management tool. In most cases, this use is limited to preventing unscheduled downtime and/or catastrophic failures. Although this function is important, predictive maintenance can provide substantially more benefits by expanding the scope or mission of the program.

As a maintenance management tool, predictive maintenance can and should be used as a maintenance optimization tool. The program's focus should be on eliminating unnecessary downtime, both scheduled and unscheduled; eliminating unnecessary preventive and corrective maintenance tasks; extending the useful life of critical systems; and reducing the total life-cycle cost of these systems.

Plant Optimization Tool. Predictive maintenance technologies can provide even more benefit when used as a plant optimization tool. For example, these technologies can be used to establish the best production procedures and practices for all critical production systems within a plant. Few of today's plants are operating within the original design limits of their production systems. Over time, the products that these lines produce have changed. Competitive and market pressure have demanded increasingly higher production rates. As a result, the operating procedures that were appropriate for the as-designed systems are no longer valid. Predictive technologies can be used to map the actual operating conditions of these critical systems and to provide the data needed to establish valid procedures that will meet the demand for higher production rates without a corresponding increase in maintenance cost and reduced useful life.

Simply stated, these technologies permit plant personnel to quantify the cause-and-effect relationship of various modes of operation. This ability to actually measure the effect of different operating modes on the reliability and resultant maintenance costs should provide the means to make sound business decisions.

Reliability Improvement Tool. As a reliability improvement tool, predictive maintenance technologies cannot be beat. The ability to measure even slight deviations from normal operating parameters permits appropriate plant personnel (e.g., reliability engineers, maintenance planners) to plan and schedule minor adjustments that will prevent degradation of the machine or system, thereby eliminating the need for major rebuilds and associated downtime.

Predictive maintenance technologies are not limited to simple electromechanical machines. These technologies can be used effectively on almost every critical system or component within a typical plant. For example, time-domain vibration can be used to quantify the response characteristics of valves, cylinders, linear-motion machines, and complex systems, such as oscillators on continuous casters. In effect, this type of predictive maintenance can be used on any machine where timing is critical.

The same is true for thermography. In addition to its traditional use as a tool to survey roofs and building structures for leaks or heat loss, this tool can be used for a variety of reliability-related applications. It’s ideal for any system where surface temperature indicates the system's operating condition. The applications are almost endless, but few plants even attempt to use infrared as a reliability tool.

The Difference. Other than the mission or intent of how predictive maintenance is used in your plant, the real difference between the limited benefits of a traditional predictive maintenance program and the maximum benefits that these technologies could provide is the diagnostic logic used. In traditional predictive maintenance applications, analysts typically receive between 5 and 15 days of formal instruction.

This training is always limited to the particular technique (e.g., vibration, thermography) and excludes all other knowledge that might help them understand the true operating condition of the machine, equipment, or system they are attempting to analyze.

The obvious fallacy in this approach is that none of the predictive technologies can be used as stand-alone tools to accurately evaluate the operating condition of critical production systems. Therefore, analysts must use a variety of technologies to achieve anything more than simple prevention of catastrophic failures. At a minimum, analysts should have a practical knowledge of machine design, operating dynamics, and the use of at least the three major predictive technologies (i.e., vibration, thermography, and tribology). Without this minimum knowledge, they cannot be expected to provide accurate evaluations or cost-effective corrective actions.

In summary, there are two fundamental requirements of a truly successful predictive maintenance program: (1) a mission that focuses the program on total-plant optimization and (2) proper training for technicians and analysts. The mission or scope of the program must be driven by life-cycle cost, maximum reliability, and best practices from all functional organizations within the plant. If the program is properly structured, the second requirement is to give the personnel responsible for the program the tools and skills required for proper execution.

3 It Takes More Than Effective Maintenance

Plant performance requirements are basically the same for both small and large plants. Although some radical differences exist, the fundamental requirements are the same for both. Before we explore the differences, we need to understand the fundamental requirements in the following areas:

• Plant culture

• Sales and marketing

• Production

• Procurement

• Maintenance

• Information management

• Other plant functions

Plant Culture

The foremost requirement of world-class plant performance is a work environment that encourages and sustains optimum performance levels from the entire workforce. This plant culture must start with senior management and be inherent throughout the entire workforce. Without a positive work environment that encourages total employee involvement and continuous improvement, there is little chance of success.

Sales and Marketing

The sales and marketing group must provide a volume of new business that can sustain acceptable levels of production performance. Optimum equipment utilization cannot be achieved without a backlog that permits full use of the manufacturing, production, or process systems; however, volume is not the only criterion that must be satisfied by the sales and marketing group. They must also provide (1) a product mix that permits effective use of the production process, (2) order size that limits the number and frequency of setups, (3) delivery schedules that permit effective scheduling of the process, and (4) a sales price that provides a reasonable profit. The final requirement of the sales group is an accurate production forecast that permits long-range production and maintenance planning.

Production

Production management is the third criterion for acceptable plant performance. The production department must plan and schedule the production process to gain maximum use of its processes. Proper planning depends on several factors: good communication with the sales and marketing group, knowledge of unit production capabilities, adequate material control, and good equipment reliability. Production planning and effective use of production resources also depend on coordination with procurement, human resources, and maintenance functions within the plant. Unless these functions provide direct, coordinated support, the production planning function cannot achieve acceptable levels of performance from the plant.

In addition, the production department must execute the production plan effectively. Good operating procedures and practices are essential. Every manufacturing and production function must have, and use, standard operating procedures that support effective use of the production systems. These procedures must be constantly evaluated and upgraded to ensure proper use of critical plant equipment.

Equipment reliability is essential for acceptable production performance. Contrary to popular opinion, maintenance does not control equipment reliability; the production department has an equal responsibility. Operating practices and the skill level of production employees have a direct impact on equipment reliability; therefore, all facets of the production process, from planning to execution, must address this critical issue.

The final requirement of effective production is employee skills. All employees within the production group must have adequate job skills. Human resources or the training department must maintain an evaluation and training program that ensures that employee skill levels are maintained at acceptable levels.

Procurement

The procurement function must provide raw materials, production spares, and other consumables at the proper times to support effective production. In addition, these commodities must be of suitable quality and functionality to permit effective use of the process systems and finished product quality. The procurement function is critical to good performance of both production and maintenance. This group must coordinate its activities with both functions and provide acceptable levels of performance.

In addition, they must implement and maintain standard procedures and practices that ensure optimum support for both the production and maintenance functions. At a minimum, these procedures should include vendor qualification, procurement specifications based on life-cycle costs, incoming inspection, inventory control, and material control.

Maintenance

The maintenance function must ensure that all production and manufacturing equipment is kept in optimum operating condition. The normal practice of quick response to failures must be replaced with maintenance practices that sustain optimum operating condition of all plant systems. It is not enough to have the production system operate. The equipment must reliably operate at or above nameplate capacity without creating abnormal levels of product-quality problems, preventive maintenance downtime, or delays. Maintenance prevention, not quick fixes of breakdowns, should be the objective.

Maintenance planning and scheduling are essential parts of effective maintenance. Planners must develop and implement both preventive and corrective maintenance tasks that achieve maximum use of maintenance resources and the production capacity of plant systems. Good planning is not an option. Plants should adequately plan all maintenance activities, not just those performed during maintenance outages.

Standard procedures and practices are essential for effective use of maintenance resources. The practices should ensure proper interval of inspection, adjustment, or repair. In addition, these practices should ensure that each task is properly completed.

Standard maintenance procedures (SMPs) should be written so that any qualified craftsman can successfully complete the task in the minimum required time and at minimum costs.

Adherence to SMPs is also essential. The workforce must have the training and skills required to effectively complete their assigned duties. In addition, maintenance management must ensure that all maintenance employees follow standard practices and fully support continuous improvement.

Information Management

Effective use of plant resources absolutely depends on good management decisions. Therefore, viable information management is critical to good plant performance. All plants have an absolute requirement for a system that collects, compiles, and interprets data that define the effectiveness of all critical plant functions. This system must be capable of providing timely, accurate performance indices that can be used to plan, schedule, and manage the plant.

Other Plant Functions

In medium and large plants, other plant functions play a key role in plant performance. Smaller plants either do not have these functions or combine them within the production or maintenance functions. These functions include human resources, plant engineering, labor relations, cost accounting, and environmental control. Each of these departments must coordinate its activities with sales, production, and maintenance to ensure acceptable levels of plant performance.

4 Small Plants

All plants must adhere to the basics discussed, but small plants face unique constraints. Their size precludes substantial investments in labor, tools, and training that are essential to effective asset management or to support continuous improvement. Many small plants are caught in a Catch-22. They are too small to support effective planning or to implement many of the tools, such as predictive maintenance and computer-based maintenance management systems (CMMS), that are required to improve performance levels. At the same time, they must improve to survive. In addition, the return on investment (ROI) generated by traditional continuous improvement programs is generally insufficient to warrant implementing these programs.

Predictive maintenance is a classic example of this Catch-22. Because of their size, many small plants cannot justify implementing predictive maintenance. Although the program will generate similar improvements to those achieved in larger plants, the change in actual financial improvement may not justify the initial and recurring costs associated with this tool. For example, a 1 percent improvement in availability in a large plant may represent an improvement of $1 million to $100 million. The same improvement in a small plant may be $1,000 to $10,000. Large plants can afford to invest the money and labor required to achieve these goals. In small plants, the cost required to establish and maintain the predictive program may exceed the total gain.
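The comparison can be made concrete with a short sketch. The gains are the figures quoted above (the low end for a large plant and the high end for a small one); the program cost is a purely hypothetical figure chosen for illustration and should be replaced with the plant's own estimate for the same period.

```python
def net_gain(gain, program_cost):
    """Gain from improved availability minus the program cost over the same period."""
    return gain - program_cost


# Hypothetical cost of establishing and running the program (not from the text).
ASSUMED_PROGRAM_COST = 50_000

# Value of a 1 percent availability improvement, using figures quoted in the text.
cases = {"large plant": 1_000_000, "small plant": 10_000}

for plant, gain in cases.items():
    net = net_gain(gain, ASSUMED_PROGRAM_COST)
    verdict = "justified" if net > 0 else "not justified"
    print(f"{plant}: net {net:+,} dollars ({verdict})")
```

With the assumed cost, even the most favorable small-plant case loses money while the least favorable large-plant case still shows a clear gain, which is the Catch-22 described above.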

The same Catch-22 prohibits implementing formal planning, procurement, and training programs in many smaller plants. The perception is that the addition of nonrevenue-generating personnel to provide these functions would prohibit acceptable levels of financial performance. In other words, the bottom line would suffer. This view may be true to a point, but few plants can afford not to include the essentials of plant performance.

In many ways, small plants have a more difficult challenge than larger plants; however, with proper planning and implementation, small plants can improve their performance and gain enough additional market share to ensure both survival and long-term positive growth. They must exercise extreme caution and base their long-range plan on realistic goals.

Some plants attempt to implement continuous improvement programs that include too many tools. They assume that full, in-house implementation of predictive maintenance, CMMS, and other continuous improvement tools is an essential requirement of continuous improvement. This is not true. Small plants can implement a continuous improvement program that achieves the increased performance levels needed without major investments. Judicious use of continuous improvement tools, including outside support and modification of in-house organizations, will permit dramatic improvement without being offset by increased costs.

Continuous improvement tools, such as CMMS, information management systems, and the like, are available for small plants. These systems are specifically designed for this application and provide all of the functionality required to improve performance, without the high costs of larger, more complex systems. The key to successful implementation of these tools is automation. Small plants cannot afford to add personnel whose sole function is to maintain continuous improvement systems or the predictive maintenance program. Therefore, these tools must provide the data required to improve plant effectiveness without additional personnel.

5 Large Plants

Because of the benefits generated by continuous improvement programs, large plants can justify implementation; however, this should not be used as justification for implementing expensive or excessive programs. A typical tendency is to implement multiple improvement programs, such as total productive maintenance, just-in-time manufacturing, and total quality control, which are often redundant or conflict with each other. Frankly, this shotgun approach is not justified. Each of these programs adds an overhead of personnel whose sole function is program management. This increase in indirect personnel cannot be justified. Continuous improvement should be limited to a single, holistic program that integrates all plant functions into a focused, unified effort.

Large plants must exercise more discipline than their smaller counterparts. Because of their size, the responsibilities and coordination of all plant functions must be clearly defined. Planning and scheduling must be formalized, and communication within and among functions is much more difficult.

An integrated, computer-based information management system is an absolute requirement in larger plants. At a minimum, this system should include cost accounting, sales, production planning, maintenance planning, procurement, inventory control, and environmental compliance data. These data should be universally available for each plant function and configured to provide accurate, timely management and planning data. Properly implemented, this system will also provide a means to effectively communicate and coordinate the integrated functions, such as sales, production, maintenance, and procurement, into an effective unit.

Large plants must also exercise caution. The tendency is to become excessive when implementing continuous improvement programs. Features are added to the information management system, predictive maintenance program, and other tools that are not needed by the program. For example, one plant added the ability to include video clips in its CMMS. Although this added feature may have been of some value, it was not worth the $12 million additional cost.

Continuous improvement is an absolute requirement in all plants, but these programs must be implemented logically. Your program must be designed for the unique requirements of your plant. It should be designed to minimize the costs required to implement and maintain the program and to achieve the best ROI. In my 30 years as a manager and consultant, I have not found a single plant that would not benefit from a continuous improvement program; however, I have also seen thousands of plants that failed in their attempt to improve. Most of these failures were the result of either (1) restricting the program to a single function, such as maintenance or production, or (2) inflated costs generated by adding unnecessary tools. Both of these types of failures are preventable. If you approach continuous improvement in a logical, plant-specific manner, you can be successful regardless of plant size.
