Increasing Machine Reliability and Uptime--Introduction





Ask any plant manager in the world if he is interested in plant safety and he will answer in the affirmative. Ask him about his desire to produce reliably and he will probably give you the same answer. But interests and desires are not always aligned with a thoughtful and consistent implementation strategy and some of our readers will have to examine to what extent they are - or are not - in tune with Best-of-Class (BOC) practices.

Over the years, we have come to appreciate that reliability improvement and machinery uptime are virtual synonyms. To achieve uptime optimization, the machinery specification and actual design must be right.

The machine must be operated within its design envelope. It must also be maintained correctly.

This harmonizes with the various editions of the book Machinery Failure Analysis and Troubleshooting, where it’s emphasized that, to capture high reliability, plant equipment has to be free of:

• design defects
• fabrication deficiencies
• material defects
• assembly or installation flaws
• maintenance errors
• unintended operation
• operator error

Indeed, and as we shall see, these seven failure categories are implicitly recognized whenever a facility is being planned and put into service.

They are also recognized when performing failure analysis, because all failures of all machines will fit into one or more of these seven failure categories. It should be noted that the three major frames or boxes of Fig. 1 contain these categories as well.


Fig. 1. Elements contributing to machinery uptime.

But that is not the full story. Certainly a plant organization uses and manages the functional endeavors described as Specification & Design, Operation, and Maintenance. It’s easy to visualize that various subcategories exist and that these, too, must somehow be managed. But they are properly managed only by a few, and we call them the BOC performers. These leading plants are reliability-focused, whereas the "business as usual" plants are stuck in an outdated cycle of repeat failures. We chose to label the latter as repair-focused.

In essence, it’s our purpose to highlight the various issues that need to be addressed by plants that wish to achieve, optimize, and sustain machinery uptime. To that end, this guide describes what BOC companies are doing. Likewise, a bit of introspection may point out where the reader has an opportunity to improve.

Prerequisites for Capturing Future Uptime

There are important prerequisites for achieving machinery uptime. Much reliability-related work must be done - and is being done - by BOC companies before a plant is built. Reliability audits and reviews are part of this effort and must be adequately staffed. The cost of these endeavors is part of a reliability-focused project. Moreover, the cost estimates and appropriation requests for such projects are never based on the initial cost of the least expensive machinery. Instead, they are always based on data obtained from bidders that build reliability into their equipment.

Competent machinery engineers assist in the bid evaluation process. They assign value to maintenance cost avoidance and to reliability improvement features, and this value can favor Bidder A over Bidder B [1].
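
The bid evaluation described above can be illustrated with a simple life-cycle cost comparison. What follows is only a minimal sketch: the function name, the purchase prices, the yearly costs, the 20-year life, and the 8% discount rate are all hypothetical assumptions chosen for illustration and are not taken from reference [1].

```python
def life_cycle_cost(purchase_price, annual_maintenance, annual_downtime_loss,
                    years, discount_rate):
    """Return first cost plus the present value of yearly ownership costs."""
    yearly_cost = annual_maintenance + annual_downtime_loss
    present_value = sum(
        yearly_cost / (1.0 + discount_rate) ** year
        for year in range(1, years + 1)
    )
    return purchase_price + present_value


# Hypothetical bids: Bidder A costs more up front but builds in reliability,
# which shows up as lower yearly maintenance and downtime costs.
bids = {
    "Bidder A": life_cycle_cost(500_000, 20_000, 10_000, years=20, discount_rate=0.08),
    "Bidder B": life_cycle_cost(400_000, 35_000, 30_000, years=20, discount_rate=0.08),
}
best = min(bids, key=bids.get)

for name, cost in bids.items():
    print(f"{name}: life-cycle cost = {cost:,.0f}")
print("Lowest life-cycle cost:", best)
```

With these assumed figures, the bid with the higher first cost comes out ahead over the life of the machine. Real evaluations would use bidder data and plant-specific loss estimates rather than round numbers.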

Yet owners do not always go for the lowest first cost. When it’s evident that an existing plant is in trouble or in obvious need of improvement, equipment owners very often switch tactics and go for "high tech." They then procure the latest fad hardware and software. They belatedly attempt to institute crafts training and look to older retirees for instant improvement. To teach maintenance procedures or whatever other topic, they often engage teacher-trainers who once worked for companies with name recognition, preferably ones that advertise their products or prowess on TV. But while some of these teacher-trainers have sufficient familiarity with process machinery to know why the client owner experiences repeat failures, others do not. As an example, just ask some of these teacher-trainers to explain why authoritative texts consider oil slinger rings an inferior lube application method for many pumps used in process plants. Then, sit back and listen to their answers. The short-term solution entails working only with competent, field-experienced, and yet analytically trained, reliability consultants. The long-term solution is to groom one's own talent and skills.

Grooming Talent and Skills

Many managers fail to see the need to groom talent, to hire and hold on to people with the ability, motivation, and desire to learn all there is to be learned about a technical subject. They often delude themselves into believing that they can always hire a contractor to do the work, but don’t realize that few contractors are better informed or better qualified than their own, albeit often ill-prepared, employees. Managers often fail to recognize that machinery uptime optimization is ultimately achieved by talent that is deliberately groomed. This "groomed talent" includes people who are keenly interested in reading technical journals and the proceedings of technical symposia and conferences. This "groomed talent" relentlessly pursues self-training as well as outside training opportunities.

In essence, then, good managers nurture good people. Good managers challenge their technical employees to become subject-matter experts.

They encourage these employees to map out their own training plans and then facilitate implementing these plans. Good managers will see to it that these employees, from young maintenance technicians to wizened senior engineers, become valuable and appreciated contributors. They also see to it that good technical employees are respected and rewarded accordingly.

A good workforce must have rock-solid basic skills. It would be of no benefit to buy better bearings and then allow unacceptable work practices to persist. Work practices must conform to certain standards and these standards must be put in writing. Then, these standards must be transformed into checklists or similar documents that are used at the workbench or in the field location where such work is being performed.

Management's role includes allocating resources to produce the requisite standards and verifying that they are being consistently applied. The standards and checklists must become part of a culture that builds basic skills. Moreover, the standards must be adhered to with determination and consistency. They should not be compromised as an expedient to reach the limited short-term goal of "just get it running again quickly." Neither should compliance with standards be allowed to become just one more of the many temporary banner exhortations that fizzle out like so many "flavors of the month."

By far the most important organizational factor in accomplishing the long-term reliability objectives of an industrial enterprise is a total focus on employee training. While this requirement may be understood to cover all employees regardless of job function, we are here confining our discussion to a plant's reliability workforce. A good organization will map out a training plan that is the equivalent of a binding contract between employer and employee. There has to be accountability in terms of proficiency achieved through this targeted training.

But before we delve into this training-related subject, we must explore current trends and recent inclinations that largely focus on procedural issues. We must also examine sound organizational setups as they relate to achieving optimized machinery uptime.

Sound Organizational Setup Explained

Smart organizations use a dual ladder of advancement, as discussed a little later in this section. However, regardless of whether or not a dual career path approach is used, two short but straightforward definitions are in order:

1. The function of a maintenance department is to routinely maintain equipment in operable condition. It’s thus implied that this department is tasked with restoring equipment to as-designed or as-bought condition.

2. Reliability groups are involved in structured evaluations of upgrade opportunities. They perform life-cycle cost studies and develop implementation strategies whenever component upgrading makes economic sense.

For a reliability improvement group to function most effectively, its members have to be shielded from the day-to-day preventive and routine equipment repair and restoration involvement. Best Practices Plants often issue guidelines or predefine a trigger mechanism that prompts involvement by the reliability group. Examples might include equipment that fails for the third time in a given 12-month period, equipment distress that has or could have caused injury to personnel, failures that caused an aggregate loss in excess of $20,000, and so forth.
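
A predefined trigger mechanism of this kind is easy to express as a rule check. The sketch below is one possible reading of the examples just given; the `FailureEvent` record, the function name, and the choice to evaluate all three rules over the same 12-month window are assumptions made for illustration, not a description of any particular plant's procedure.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class FailureEvent:
    equipment_id: str
    when: date
    loss_usd: float
    injury_or_potential_injury: bool = False


def reliability_review_reasons(events, equipment_id, today,
                               max_failures=3, window_days=365,
                               loss_limit_usd=20_000):
    """Return the list of trigger rules met by one piece of equipment."""
    window_start = today - timedelta(days=window_days)
    recent = [e for e in events
              if e.equipment_id == equipment_id and window_start <= e.when <= today]

    reasons = []
    if len(recent) >= max_failures:
        reasons.append("repeat failures within window")
    if any(e.injury_or_potential_injury for e in recent):
        reasons.append("injury or potential injury")
    if sum(e.loss_usd for e in recent) > loss_limit_usd:
        reasons.append("aggregate loss above limit")
    return reasons


# Hypothetical failure history for a pump tagged P-101.
history = [
    FailureEvent("P-101", date(2024, 1, 10), 4_000),
    FailureEvent("P-101", date(2024, 5, 2), 5_000),
    FailureEvent("P-101", date(2024, 9, 20), 3_000),
    FailureEvent("P-202", date(2024, 6, 15), 25_000),
]
print(reliability_review_reasons(history, "P-101", date(2024, 12, 1)))
print(reliability_review_reasons(history, "P-202", date(2024, 12, 1)))
```

An empty list means the event stays with the maintenance department; any non-empty list prompts involvement by the reliability group.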

There must be a true quest for real improvement, not the quest for reciting and invoking improvement methodologies. While the quest for continuous, real, and lasting improvement is to be commended, the quest for merely invoking continuous improvement methodologies often turns into a chase after the elusive "magic bullet." All employees and all job functions must embrace the pursuit of real and lasting improvement. This collaborative effort is no different from the desire to have a reliable automobile. In addition to the fundamental design being right, the driver operator and maintenance technician must do their part if acceptable "automobile uptime" is to be achieved. However, while every job function in a reliability-focused plant must participate in this quest, the process must be defined and supervised by enlightened managers.

Regarding the quest for continuous improvement methodologies, we have seen a veritable alphabet soup of acronyms come and go since the early references to Predictive Maintenance (PdM) in the mid-1950s. An "ME" campaign (meaning Manufacturing Excellence) was among them; few people at the affected location remember it. In 1975, a campaign aimed at making "every man a manager" was instituted in some plants known to the authors; it, too, failed miserably. While striving toward self-directed workforces is a laudable goal, it requires a core of competent and well-informed people.

As of 2009, PdM has survived and TPM, TPR, and ODR/DSS are foremost among the early twenty-first-century reliability methodologies and initiatives. But the point is that while it’s OK to have one's methodologies or even advanced technology-related procedures right, it’s not OK to neglect the basics, the fundamentals of machinery reliability and optimized uptime. There will never be high reliability and optimized uptime where mechanics and technicians either lack the understanding or are not practicing the basics.

Finally, we should always recall that it’s not OK to understand or perhaps blindly follow methodologies while, at the same time, disregarding common sense. The authors disagree with the notion expressed by some that in modern industry there is no longer a place for preventive maintenance (PM). Yet, we know only too well that modern industry cannot confine its practices to PM alone. Other approaches must supplement PM and even the question "who's doing PM" must be examined.

PdM, TPM, TPR, and ODR/DSS Explained

Routine preventive maintenance has served industry since the Industrial Revolution in the late eighteenth century. And PM still has its place in the many thousands of instances where avoiding failure by prevention of defect development, i.e. PM, makes more economic sense than allowing flaws to develop to the point where they become detectable, but also irreversible. An excellent example is changing oil in an automobile.

This kind of PM is surely more cost-effective than keeping the same oil in the crankcase for many years, but analyzing it periodically for metal chips. While such periodic analyses would constitute PdM, that type of PdM makes no economic sense. Yet, properly used in an overall program of uptime optimization, PdM is indeed relevant, important, and representative of best practices.

By the mid-1950s, PdM, with its instrumentation routines aimed at spotting developing defects, came into being. PdM encompasses vibration monitoring and analysis, thermographic and ultrasonic examinations and inspections, and a host of other methods. All of these are intended to predict failure progression to the point where planned equipment shutdowns would prevent major damage and excessive downtime.

However, in order to maintain the equipment in optimal condition, new and progressive maintenance techniques needed to be established and a measure of "fine tuning" looked attractive. Such "fine tuning" involves the cooperation of equipment and process support personnel, equipment operators, and equipment suppliers. As was shown in the automobile uptime analogy, these three must again work together to eliminate equipment breakdowns, reduce scheduled downtime, and maximize asset utilization for optimum achievement of throughput and product quality.

Assuming it’s being properly implemented, Total Productive Maintenance (TPM) can provide the methods and work processes to measure and eliminate much of the non-productive time. Once TPM has been successfully implemented, a facility is considered ready to progress to Total Process Reliability (TPR). Total Process Reliability views every maintenance event as an opportunity to upgrade manufacturing processes, hardware, software, work and operating procedures, and even management and supervisory methods. On the equipment level, TPR practitioners would always (!) be in a position to answer the two all-important questions: (i) is upgrading possible and (ii) is upgrading justified by prevailing economics.

Total Productive Maintenance often involves the use of an information management system, planned maintenance activities, emphasis on preventive maintenance, assessing equipment utilization to eliminate non-essential assets (reducing numbers of equipment), operator and mechanic training, to some extent decentralizing asset responsibility, and operator ownership of equipment through basic care - a concept that leads into Operator-Driven Reliability (ODR). In turn, ODR might lead to Decision Support Systems (DSS).

Reliability-Focused Plants and Operator Involvement

We believe that process plants worldwide can be divided into those that are repair-focused and those that are decidedly reliability-focused. The former will have trouble surviving, whereas the latter will stay afloat with considerably less difficulty. Repair-focused facilities emphasize parts replacement and have neither the time nor the inclination to make systematic improvements. Rarely do they identify why the parts failed, and rarer yet do they implement the type of remedial action that makes repeat failures a thing of the past. Reliability-focused plants, on the other hand, view every repair event as an opportunity to upgrade. Whenever cost-justified, this upgrading is done by adhering more closely to smarter work processes, by following better procedures, by selecting superior components, implementing better quality controls, using more suitable tools, etc. That, then, gets at the heart of maximizing machinery uptime.

Upgrade measures are employed with considerable forethought by reliability-focused companies. These companies will first identify the feasibility of such measures and will then determine their cost-effectiveness and quantitative justification. To do this effectively and over the long haul, they will employ trained engineers. The term "trained engineers" implies that they are informed researchers and readers that use analytical methods to make sound, experience-based decisions. Companies hold on to trained, highly motivated engineers by creating and nurturing a work environment that is conducive to high employee morale. Intelligent, highly productive operators are part of this work environment.

Since even the best-trained engineers cannot go it alone, they are given competent help. With that in mind, reliability-focused companies recognize the critically important role of the equipment or process operator.

Best-of-Class companies are, therefore, poised to pursue ODR initiatives.

Operator contributions are necessary because operators are the first to notice deviations from normal operation. They, the operators, are best equipped to understand the interactions between process and equipment behavior.

Operators need training. Their responsibilities and accountabilities must be defined and "institutionalized." Institutionalizing means that their job functions and actions, their responses and the implementation steps they follow must become mandatory routines as opposed to optional routines. More than two decades ago, plants in California and Texas experimented with this concept; they called it the multi-skill approach and assigned operators certain ODR tasks.

Operator-Driven Reliability is nearly always part of a generally applied maintenance plan: a distinct group of activities that makes things happen, rather than simply suggesting what should happen. In the Handbook of Industrial Engineering, author Ralph Peters outlines a number of common-sense steps. He strongly recommends starting with an overall strategic maintenance plan like TPM or RCM (Reliability-Centered Maintenance) and asks that the interested entity include defined goals and objectives for ODR within this plan. A top-notch reliability-focused facility would understand that ODR is a deliberate process for gaining commitment from operators to:

• Keeping equipment clean and properly lubricated

• Keeping fasteners tightened

• Detecting and reporting symptoms of deterioration

• Providing early warnings of catastrophic failures

• Making minor repairs and being trained to do them

• Assisting maintenance in making selected repairs

Implementation steps along these lines include:

• Start with necessary communication between maintenance, operators, and the rest of the total operation to gain commitment and internal cooperation

• Develop list of major repairs in the future

• Utilize leadership-driven, self-managed teams, e.g. "reliability improvement teams"

• Develop written and specific team charter

• Have teams evaluate/determine the best methods for operator cleaning, lubrication, inspection, minor repairs, and level of support during repairs

• Develop written procedures for operators and include them in quality and maintenance guides

• Evaluate the current predictive and preventive maintenance procedures and include those that the operator can do as part of ODR

• Document startup, operating, and shutdown procedures along with commissioning and changeover practices

• Consider quality control and health, safety, and environment requirements

• Document operator training requirements and what maintenance groups must do to support these requirements

• Develop operator training certification to validate operator-performed tasks.

Modern process plants train their operating technicians to have a general understanding of the manufacturing processes, process safety, basic asset preservation, and even interpersonal skills.

Operator involvement in reliability efforts ensures the preservation of a plant's assets. Operator activities thus include the electronic collection of vibration and temperature data and spotting deviations from the norm.

Operator activities do not, however, encompass data analysis; data analysis is the reliability technician's task. Additional activities include routine mechanical tasks such as the replacement of gauges and sight glasses, and assisting craftspeople engaged in the verification of critical shutdown features and instruments. Also, operating personnel participate in electric motor testing and electric motor connecting/disconnecting routines.

The creation of functional departments tasked with both data capture and data analysis should be closely examined. Such departments may not be efficient; they risk involving expert analyst personnel in mundane data collection routines. It should be noted that operators are the first line of defense, the first ones to spot deviations from normal operation.

For optimum effectiveness, they should be used in that capacity, i.e. data collection should be assigned to operators.
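
The division of labor described above, with operators collecting readings and spotting deviations while analysis remains with the reliability technician, can be sketched as a simple screening step. The three-sigma band, the function name, and the sample vibration values below are assumptions made for illustration; many plants would instead use fixed alarm limits taken from their own standards.

```python
from statistics import mean, stdev


def flag_deviations(baseline, new_readings, n_sigma=3.0):
    """Return (index, value) pairs for readings outside the baseline band.

    The band is baseline mean +/- n_sigma standard deviations. Flagged
    readings are only handed on; diagnosing them is the analyst's task.
    """
    center = mean(baseline)
    spread = stdev(baseline)  # needs at least two baseline readings
    return [(i, value) for i, value in enumerate(new_readings)
            if abs(value - center) > n_sigma * spread]


# Hypothetical bearing-housing vibration velocities in mm/s.
baseline_mm_s = [2.1, 2.0, 2.2, 2.1, 1.9, 2.0, 2.1]
todays_round_mm_s = [2.0, 2.2, 3.4]

flagged = flag_deviations(baseline_mm_s, todays_round_mm_s)
print(flagged)  # [(2, 3.4)]
```

Only the third reading is passed on to the reliability technician; the first two fall within normal scatter and need no expert attention.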

Supporting the Operator

ODR must be given tangible support by virtually every one of the other job functions represented at a specific facility. This recognition should logically lead to the development of well thought-out and appropriately configured DSS.

A Decision Support System might be described as an advanced, multifaceted asset management system that aims at automating an industrial reliability maintenance decision-making process. This process integrates monitoring and diagnostic approaches that include Distributed Control Systems (DCS), Computerized Maintenance Management Systems (CMMS), internal and external websites, and the many other sources of the company's own internal knowledge. Once successfully implemented, a sound, well-developed DSS will be a powerful source of information allowing rapid and exact equipment and process diagnosis, failure analysis, corrective action mapping, and so forth. It will turn the operator into a knowledge worker who will be supported by true expert systems.

Awareness of Availability Needs and Outage and Turnaround Planning

Another prerequisite for maximizing machinery uptime is being aware of the availability needs of one's plant. If production is seasonal or not sensitive to shutdown frequency or duration (within reason, of course), it makes little economic sense to demand the maximum in machinery availability. There cannot be any one simple rule covering the many possibilities and options, and management personnel must seek input from knowledgeable reliability professionals.

As an example, a plastics extruder that must stay on line for very long periods of time without shutdown may have to be equipped with a non-lubricated coupling connecting it to its driver. Conversely, a plastics extruder employed in a process requiring its helical screw rotor to be exchanged for a different one during monthly changes to substantially different plastic products could be equipped with a less expensive gear coupling that might have to be re-greased every month.

Being aware of one's equipment availability needs is also important for intelligent planning of downtime events for inspection and repair.

Outage planning (sometimes called turnaround, also called "IRD" for inspection and repair downtime) is closely related to awareness of availability. It boggles the mind how often management neglects this issue.

It defies common sense to buy the cheapest equipment and then expect long, trouble-free operation without shutdowns. A plant that bought bare-bones machinery must expect more outages than a plant that thoroughly investigated the life-cycle cost of better machinery and then carefully specified this equipment before placing purchase orders.

There are certain ethylene plants that, in 2004, operated with 8-year outage intervals while others barely made it to 5 years. The reader will intuitively know which of the two had, at the design and inquiry stages, pre-invested in detailed machinery reliability assessment efforts. Attempts by the 5-year plant to move into the 8-year category are costly and slow.

To again use an automobile analogy, buying a certain model with a six-cylinder engine will cost less than buying it with eight cylinders, but the incremental cost of later converting from six to eight cylinders will be far greater.
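
The gap between 8-year and 5-year outage intervals can be put in rough numbers. The sketch below is illustrative only: the 30-day outage duration and the 40-year plant life are assumed values, and the function simply counts one outage at the end of every completed interval.

```python
def turnaround_burden(interval_years, outage_days, plant_life_years):
    """Return (number of turnarounds, fraction of calendar time spent in them)."""
    turnarounds = plant_life_years // interval_years
    downtime_fraction = (turnarounds * outage_days) / (plant_life_years * 365.0)
    return turnarounds, downtime_fraction


# Hypothetical: both plants need 30 days per outage over a 40-year life.
for interval in (8, 5):
    count, fraction = turnaround_burden(interval, outage_days=30, plant_life_years=40)
    print(f"{interval}-year interval: {count} turnarounds, "
          f"{fraction:.2%} of calendar time in planned outages")
```

Under these assumptions the 5-year plant goes through eight turnarounds where the 8-year plant goes through five, before any difference in unplanned outages is counted.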

Modern outage planning uses in-plant reliability data acquired over time. Without data, such planning will involve considerable guessing.

Using data from one's own operations and from similar plants and equipment elsewhere, the scope and mandate of these activities is to impart reliability, availability, and maintainability in methodical and even mathematical fashion. Needless to say, this won’t be done by default; instead, it requires management involvement and stewardship.
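
As a small example of treating in-plant data mathematically, run lengths and repair durations can be reduced to mean time between failures (MTBF), mean time to repair (MTTR), and the resulting availability, using the common steady-state relation availability = MTBF / (MTBF + MTTR). The function name and the sample figures are assumptions for illustration; a real study would draw on the plant's own failure records.

```python
def reliability_summary(run_hours, repair_hours):
    """Summarize failure history as MTBF, MTTR, and steady-state availability.

    run_hours:    operating hours accumulated before each failure
    repair_hours: downtime hours needed to restore service after each failure
    """
    if not run_hours or not repair_hours:
        raise ValueError("need at least one run length and one repair duration")
    mtbf = sum(run_hours) / len(run_hours)
    mttr = sum(repair_hours) / len(repair_hours)
    availability = mtbf / (mtbf + mttr)
    return {"mtbf_h": mtbf, "mttr_h": mttr, "availability": availability}


# Hypothetical record for one machine: three failures and three repairs.
summary = reliability_summary(run_hours=[4000, 6000, 5000],
                              repair_hours=[24, 48, 36])
print(f"MTBF {summary['mtbf_h']:.0f} h, MTTR {summary['mttr_h']:.0f} h, "
      f"availability {summary['availability']:.2%}")
```

With only a handful of failures on record, such figures are rough indicators at best, which is one reason to pool data from similar plants and equipment elsewhere.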

Insurance and Spare Parts Philosophies

In the early 2000s, a very competent reliability professional explained that his company continues to have issues with its spare parts philosophy and overall parts management. He described a situation that is very common today:

Unfortunately, what we have done to ourselves over the last 20 years is a piecemeal approach that is too frequently found wanting. The plant inevitably stays down for two days when it should only have been down for 18-24 hrs after an unplanned shutdown. I am now being further challenged by being asked to set up the spares for our new world-scale methanol plant. Surely the spares that we stock for a syngas turbine should be somewhat generic. The fact that we have three different turbine manufacturers simply means getting the relevant part numbers/serial numbers to the warehousing people to complete an administrative exercise as all the other factors, i.e. risk, production loss etc., are similar.

Each plant differs from the next one in certain respects. Although two refineries or fertilizer plants may represent identical designs, they are not likely to have identically trained or motivated staff. One plant takes perhaps greater risks in areas where operating prudence should be practiced. Some plants allow adequate time for turbine warm-up while others use the incredibly risky "full speed ahead on lukewarm" approach. Or, although professing to perform failure analysis, many plants will replace failed parts before even understanding why the part failed in the first place. In doing so, they set themselves up for repeat failures.

Some facilities employ structured and well-supervised maintenance supervision, work execution, and follow-up inspection, while others are quite remiss in allocating time and resources to these pursuits. Also, one plant may be located in a geographic area blessed with competent repair shops while the other is not. Smart plants do a considerable amount of pooling of major turbomachinery spares, i.e. several plants have access to a common spare. Moreover, some plants have found it prudent to specify and procure certain turbo equipment diaphragms made from readily repairable steel rather than difficult-to-repair cast iron. Some will only purchase steam turbine blading that represents prior art, while others will buy prototype blade contours that promise perhaps a fraction of a per cent higher energy efficiency. Certain blades falling into this category are then subjected to high operational stresses and are prone to fail prematurely.

Even well-designed turbine blades are at risk if the steam supply system is unreliable or deficient in some ways.

Needless to say, the list could go on. Any reasonable determination of recommended spare parts must include not only consideration of the above but also an analysis of prior parts consumption trends and an assessment of storage practices, to name but a few key items. It’s no secret that most users are reluctant to share their field experience and related pertinent information by publishing it. Broadcasting past mistakes, existing shortcomings, and underperformance threatens the job security of plant management. Conversely, educating others as to the details that had ensured past successful operations is frowned upon as "sharing a competitive advantage with the enemy."

The answer? Experience shows that competent consultants with lots of practical field experience should be engaged to periodically audit hydrocarbon processing (HP) and major chemical plants. That is the only logical answer to the question of spare parts stocking in a highly competitive environment. To the best of our knowledge, there is no magic computer program that can manipulate the almost endless number of variables that must be weighed and taken into consideration to determine how many spares are needed in petrochemical plants.

Reliability-Focus versus Repair-Focus

To be profitable, an industrial facility must abandon its repair focus and move toward becoming almost exclusively reliability-focused. There are many ways to reach this goal and the best path to success may depend on a facility's present state of affairs, so to speak. Here, then, is just one more reminder. Assuming you want to move toward best practices (BP) and are - pardon the suggestion - a "Room-for-Improvement" (RFI) plant, you may wish to compare your present organizational lineup and its effectiveness against BP pursued and implemented at process plant locations elsewhere.

A comparison of repair-focused plants with reliability-focused facilities is in order. It should be realized that conscientiously maintaining reliability focus is synonymous with implementing the desire to optimize machinery uptime.

• The reliability function at repair-focused facilities is not generally separated from the plant maintenance function. At repair-focused plants, traditional maintenance priorities and "fix it the way we've always done it" mentality win out more often than warranted. In contrast, reliability-focused facilities know precisely when upgrading is warranted and cost-justified. Again, they view every maintenance event as an opportunity to upgrade and are organized to respond quickly to proven opportunities.

• The reward system at repair-focused plants is often largely production-oriented and is not geared toward consistently optimizing the bottom-line life-cycle-cost (LCC) impact. At repair-focused facilities the LCC concept is not applied to upgrade options. This differs from reliability-focused facilities that are driven by the consistent pursuit of longer-term LCC considerations. Here, life-cycle costing is applied on both new and existing (worthy of being considered for upgrading) equipment.

• At repair-focused companies, reliability professionals have insufficient awareness of the details of successful reliability implementations elsewhere. The situation is different at reliability-focused facilities that provide easy access to mentors and utilize effective modes of self-teaching via mandatory(!) exposure to trade journals and related publications. Management at these BOC facilities arranges for frequent and periodic "shirt-sleeve seminars." These informal in-plant seminars are actually briefing sessions that give visibility to the reliability technicians' work effort. They disseminate technical information in single-sheet laminated format and serve to upgrade the entire workforce by slowly changing the prevailing culture.

• Lack of continuity of leadership is found at many repair-focused plants. These organizations don’t seem to retain their attention span long enough to effect a needed change from the present repair focus to the urgently needed reliability focus. The influence of both mechanical and I&E equipment reliability on justifiably coveted process reliability does not always seem to be appreciated at repair-focused plants. On the other hand, we know of no BP organization (top quartile company) that is repair-focused. Experts generally agree that successful players must be reliability-focused to survive in the coming decades.

• Some of the most successful BP organizations have seen huge advantages in randomly requiring maintenance superintendents and operations superintendents to switch jobs back and forth. There is no better way to impart appropriate knowledge and "sensitivity" to both functions.

• At repair-focused facilities, failure analysis and effective data logging are often insufficient and generally lagging behind industry practices. In contrast, BP organizations interested in machinery uptime extension involve operations, maintenance, and project/reliability personnel in joint failure analysis and failure-cause logging activities. A structured and repeatable approach is used and accountabilities are understood.

• At the typical RFI facility, the plant where there is "room for improvement," there are gaps in planning functions and process mechanical coordinator (PMC) assignments. There is also an apparent emphasis on cost and schedule that allows non-optimized equipment and process configurations to be installed and, sometimes, replicated. At RFI plants, reliability-focused installation standards are rarely invoked and responsible owner follow-up on contractor or vendor work is practiced infrequently.

• Best Practices organizations actively involve their maintenance and reliability functions in contractor follow-up. Life-cycle cost considerations are given strong weight. Also, leading BP organizations have contingency budgets that can be tapped in the event that cost-justified debugging is required. They don’t tolerate the notion that operations departments must learn to live with a constraint.

• A reliability-focused BP organization will be diligent in providing feedback to its professional workforce. The typical repair-focused company does not use this information route.

Mentoring, Resources, and Networking

Occasionally, even a repair-focused organization has both Business Improvement and Reliability Improvement teams in place. As it plans to move toward BOC status, the repair-focused plant must make an honest appraisal of the effectiveness of these teams. Their value obviously hinges on the technical strength and breadth of experience of the various team members.

At the typical repair-focused location, maintenance-technical personnel are often unfamiliar with helpful written material that could easily point them in the right direction. As an example, repair-focused companies often use only one mechanical seal supplier. Moreover, access to the manufacturer is sometimes funneled entirely through a distributor.

In contrast, BP or BOC organizations have full access to the design offices of several major mechanical seal manufacturers. They have acquired, and actively maintain, a full awareness of competing products.

They find sound and equitable means to select whichever seal configuration, material choice, etc. is necessary to meet specified profitability objectives. This is reflected in their contract with a seal alliance partner.

At repair-focused companies, a single asset may require costly maintenance work effort every year, while another, seemingly identical asset, lasts several years between shutdowns. This paradox is tackled and solved at BP organizations. They provide access to mentors whose assistance will lead to true root cause failure analysis (RCFA). The result is authoritative and immensely cost-effective definition of what is in the best interest of the company. Based on experience and analysis, this could be repeat repair, upgrading, or total replacement.
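The choice among repeat repair, upgrading, and total replacement can be framed as a life-cycle cost (LCC) comparison. The sketch below is illustrative only: the discount rate, the time horizon, the option names, and every cost figure are hypothetical placeholders rather than plant data, and the present-value arithmetic is the standard textbook form, not a method prescribed in this guide.

```python
from dataclasses import dataclass


@dataclass
class Option:
    name: str
    initial_cost: float   # one-time outlay today (hypothetical figure)
    annual_cost: float    # yearly repair plus downtime cost (hypothetical figure)


def life_cycle_cost(option: Option, years: int, discount_rate: float) -> float:
    """Initial outlay plus the present value of the recurring annual cost."""
    pv_annual = sum(
        option.annual_cost / (1.0 + discount_rate) ** year
        for year in range(1, years + 1)
    )
    return option.initial_cost + pv_annual


def rank_options(options, years=10, discount_rate=0.08):
    """Return (name, LCC) pairs sorted from lowest to highest life-cycle cost."""
    costs = [(o.name, life_cycle_cost(o, years, discount_rate)) for o in options]
    return sorted(costs, key=lambda pair: pair[1])


# All figures below are made-up placeholders for illustration.
candidates = [
    Option("repeat repair", initial_cost=0.0, annual_cost=40_000.0),
    Option("upgrade", initial_cost=60_000.0, annual_cost=12_000.0),
    Option("total replacement", initial_cost=150_000.0, annual_cost=5_000.0),
]

if __name__ == "__main__":
    for name, cost in rank_options(candidates):
        print(f"{name:18s} {cost:12,.0f}")
```

With real inputs, the annual cost would have to include the value of lost production, which often dominates the repair bill itself. The ranking is only as good as the failure data behind those inputs.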

Repair-focused plants seem to "re-invent the wheel," or use ineffective and often risky trial-and-error approaches. Reliability-focused multi-plant or international organizations make extensive use of networking.

Relatively informal, very low cost Network Newsletters use input from grass-roots contributors who gain "visibility" and "name recognition" by being eager to communicate their successes to other affiliates. A Network Chairperson communicates with plant counterparts. This job function is assigned to an in-plant specialist on a rotational basis.

Many well-intentioned companies endeavor to identify and implement the best, or most appropriate, reliability organization. Some opt to divide their staff along traditional lines into Technical Services, Operations, and Maintenance divisions, departments, or just plain work functions. They often place their reliability personnel under the Maintenance Management umbrella, but then have second thoughts when reliability professionals end up immersed in fighting the "crisis of the day," as it were.

While it has been our experience that organizational alignments are considerably less important than the technical expertise, resourcefulness, motivation, and drive of individual employees, there are obvious advantages to an intelligent lineup. What, then, is an "intelligent lineup," or sound organizational setup?

Dual Career Paths at Top Companies

Top performing companies have created two career paths for their personnel. Where two career paths exist, upward mobility and rewards or recognition by promotion are possible in either the administrative or technical ladders of advancement. This dual ladder represents perhaps the only sound and proven way to keep key technical personnel in such industries as hydrocarbon processing. Some engineers would not want to become managers, and there are not enough management openings to promote all competent engineers to such positions.

Where there are two career paths, there is income and recognition parity between such administrative and technical job functions as

Administrative side          Technical side
Group Leader                 Project Engineer
Section Supervisor           Staff Engineer
Senior Section Supervisor    Senior Staff Engineer
Department Head              Engineering Associate
Division Manager             Senior Engineering Associate
Plant Manager                Scientific Advisor
Vice President               Senior Scientific Advisor

Recognition and reward approaches have much to do with management style. There are many gradations and cultural differences that make one approach preferred over the next one. It’s not possible to either know or judge them all. Suffice it to say that a thoughtless reward and recognition system is a serious impediment to employee satisfaction.

More Keys to a Productive Reliability Workforce

An under-appreciated workforce is an unmotivated, unhappy, and inefficient workforce. Such a workforce will rarely, if ever, perform well in areas of safety and reliability. How, then, will the interdependent safety, reliability, and profitability goals be achieved? Forty years ago, world-renowned efficiency expert Dr. W. Edwards Deming provided the answer. He stipulated 14 "Points of Quality" that fully met the objectives of both employer and employee and are as true and relevant today as they have ever been. Deming had aimed his experience-based recommendations at the manufacturing industries and we transcribed his 14 points into wording that might find listening ears in the process plant reliability environment [2]. Here is our expanded recap:

1. As was brought out earlier in this guide, view every maintenance event as an opportunity to upgrade. Investigate its feasibility beforehand; be proactive.

2. Ask some serious questions when there are costly repeat failures. There needs to be a measure of accountability. Recognize, though, that people benefit from coaching, not intimidation.

3. Ask the responsible worker to certify that his or her work product meets the quality and accuracy standards stipulated in your work procedures and checklists. That presupposes that procedures and checklists exist.

4. Understand and redefine the function of your purchasing department. Support this department with component specifications for critical parts, then insist on specification compliance. "Substitutes" or non-compliant offers require review and approval by the specifying reliability professional.

5. Define and insist on daily interaction between process (operations), mechanical (maintenance), and reliability (technical and project) workforces.

6. Teach and apply RCFA from the lowest to the highest organizational levels.

7. Define, practice, teach, and encourage employee resourcefulness. Maximize input from knowledgeable vendors and be prepared to pay them for their effort and assistance. Don’t "re-invent the wheel."

8. Show personal ethics and evenhandedness that are valued and respected by your workforce.

9. Never tolerate the type of competition among staff groups that causes them to withhold critical information from each other or from affiliates.

10. Eliminate "flavor of the month" routines and meaningless slogans.

 

11. Reward productivity and relevant contributions; let it be known that time spent at the office is in itself not a meaningful indicator of employee effectiveness.

12. Encourage pride in workmanship, timeliness, dependability, and the providing of good service. Employer and employee honor their commitments.

13. Map out a program of personal and company-sponsored mandatory training.

14. Exercise leadership and provide direction and feedback.

"CARE" - Deming's Method Streamlined and Adapted to Our Time

In early 2000, a Canadian consulting company developed a training course that brings Deming's method into new focus. They concluded that companies can be energized with empathy and, using the acronym CARE, conveyed the observation that companies excel when management gives consistent evidence of

• Clear direction and support

• Adequate and appropriate training

• Recognition and reward

• Empathy.

Although mentioned last, empathy is the cornerstone of the approach. But let us first consider the other letters.

The Letter "C": Clear Direction Via Role Statement

Regarding the first letter of the acronym, "C," we believe that clear direction involves role statements and training plans. A lack of role statements for reliability professionals can lead to inefficiency and to being trapped in a cycle of "fire-fighting." Not having written role statements deprives the entire organization of a uniform understanding of roles and expectations for reliability professionals. Not having a role statement may turn the reliability professional into a maintenance technician, a person who is more involved in maintaining the status quo than in true failure avoidance through engineered component and systems upgrades. Clearly then, BP organizations use role statements as a roadmap to achieving mutually agreed-upon goals. Among other things, this allows meaningful performance appraisals.

The four CARE items represent rather fundamental principles of management. Still, while empathy forms the foundation, it alone won’t deliver full results for any organization. The drive toward certain success starts with clear direction and support. Clear direction must be put down in writing. For example, and as was alluded to earlier in this section, reliability professionals must receive this clear direction in the form of a role statement. Their role might include, but not be limited to, those mentioned below.

1. Assistance role

• Establishment of equipment failure records and stewardship of accurate data logging by others. Know where we are in comparison with BOC performers.

• Review of preventive maintenance procedures that will have been compiled by maintenance personnel.

• Review of maintenance intervals. Understand when, where, and why we deviate from BP.

2. Evaluation of new materials and recommendation of changes, as warranted by LCC studies.

3. Investigation of special, or recurring equipment problems. Example:

• Ownership of failures that occur for the third time in any 12-month period.

• Coaching others in RCFA.

• Definition of upgrade and failure avoidance options.

4. Serving as contact person for original equipment manufacturers.

• Understanding how existing equipment differs from models that are being manufactured today.

• Being able and prepared to explain if upgrading existing equipment to state-of-art status is feasible and/or cost-justified.

5. Serving as contact person for other plant groups.

• Communicate with counterparts in operations and maintenance departments.

• Participate in Service Factor Committee meetings.

6. Develop priority lists and keep them current.

• Understand basic economics of downtime. Request extension of outage duration where end-results would yield rapid payback.

• Activate resources in case of unexpected outage opportunities.

7. Identify critical spare parts.

• Arrange for incoming inspection of critical spare parts prior to placement in storage locations.

• Arrange for inspection of large parts at vendor's/manufacturer's facilities prior to authorizing shipment to plant site.

• Define conditions allowing procurement from non-OEMs.

8. Review maintenance costs and service factors.

• Compare against Best-in-Class performance.

• Recommend organizational adjustments.

• Compare cost of replacing versus repairing; recommend best value.

9. Periodically communicate important findings to local and affiliate management.

• Fulfill a networking and information-sharing function.

• Arrange for key contributors to make brief oral presentations to mid-level managers (share the credit, give visibility to others).

10. Develop training plans for self and other reliability team contributors.

The above listing represents a role statement for equipment reliability engineers. While it represents a summary that can be expanded or modified to address specific needs, it’s representative of the written "clear direction" that is being taught in the CARE program.
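The third-failure ownership rule in item 3 lends itself to a simple check against the failure log. The following is a minimal sketch, not a prescribed implementation: the record layout, the asset tags, the sample dates, and the approximation of "any 12-month period" as a rolling 365-day window are all assumptions made for illustration. The cause categories are the seven failure categories named in the introduction.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date, timedelta
from enum import Enum


class FailureCause(Enum):
    # The seven failure categories listed in the introduction.
    DESIGN_DEFECT = "design defect"
    FABRICATION_DEFICIENCY = "fabrication deficiency"
    MATERIAL_DEFECT = "material defect"
    ASSEMBLY_OR_INSTALLATION_FLAW = "assembly or installation flaw"
    MAINTENANCE_ERROR = "maintenance error"
    UNINTENDED_OPERATION = "unintended operation"
    OPERATOR_ERROR = "operator error"


@dataclass(frozen=True)
class FailureRecord:
    asset_tag: str        # hypothetical equipment identifier
    failed_on: date
    cause: FailureCause


def repeat_offenders(records, threshold=3, window_days=365):
    """Return asset tags with `threshold` or more failures inside any
    rolling window of `window_days` days (an assumed reading of the rule)."""
    by_asset = defaultdict(list)
    for rec in records:
        by_asset[rec.asset_tag].append(rec.failed_on)

    flagged = set()
    window = timedelta(days=window_days)
    for tag, dates in by_asset.items():
        dates.sort()
        # Compare each failure with the one (threshold - 1) positions later.
        for i in range(len(dates) - threshold + 1):
            if dates[i + threshold - 1] - dates[i] <= window:
                flagged.add(tag)
                break
    return sorted(flagged)


# Made-up sample log: tags, dates, and causes are placeholders.
log = [
    FailureRecord("P-101A", date(2023, 1, 10), FailureCause.MAINTENANCE_ERROR),
    FailureRecord("P-101A", date(2023, 6, 2), FailureCause.MAINTENANCE_ERROR),
    FailureRecord("P-101A", date(2023, 11, 20), FailureCause.ASSEMBLY_OR_INSTALLATION_FLAW),
    FailureRecord("P-101B", date(2022, 3, 1), FailureCause.OPERATOR_ERROR),
    FailureRecord("P-101B", date(2023, 9, 15), FailureCause.MATERIAL_DEFECT),
    FailureRecord("P-101B", date(2024, 12, 1), FailureCause.DESIGN_DEFECT),
]

if __name__ == "__main__":
    print(repeat_offenders(log))
```

In this made-up log, only the first pump has three failures inside one window, so only that asset would pass to the reliability professional for ownership and RCFA. The second pump also failed three times, but over a span of more than two years.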

The "support" element is reinforced in items 9 and 10, above. In one highly successful and profitable company, an astute plant manager organized a mid-level management "steering committee" which every week invited a different lower-level employee to make a ten-minute presentation on how they performed their work. The vibration technician explained how early detection of flaws saved the company time and money, an instrument technician demonstrated the key ingredients of an on-line instrument testing program, etc. Each reliability issue or program had a management sponsor or "champion," who saw to it that a program stayed on track, and that organizational and other obstacles were removed.

The Letter "A": Adequate and Appropriate Training

Next, there is a melding of "Clear Direction and Support" with "Adequate and Appropriate Training." How so? Well, training plans were initiated by the employee, which means he or she had to give considerable thought to long-term professional growth. The initial proposal by the employee was reviewed, supplemented, modified, often amplified, but always given top priority by management.

In addition to structured self-training, a reliability professional at BOC plants prepares "shirt-sleeve seminars" - training sessions lasting perhaps ten minutes. He rolls up his sleeves and, at the end of an assembly of personnel for safety talks, presents a reliability and uptime optimization topic to those present. At shirt-sleeve seminars, key learnings are discussed and disseminated. These key learnings include reminders that reliability principles must be consistently employed by everyone. Site management must verify continuity of this dissemination effort and endorse the application of reliability principles such as consistent use of checklists.

But training, of course, must go beyond "shirt-sleeve" seminars. Best Practices organizations encourage salaried professionals to submit their projected training plans, both long term and short term, in writing. These plans are then critically reviewed and employer requirements reconciled with an employee's developmental needs. Input from competent consultants is often enlisted. Best Practices organizations make active and consistent use of what they have learned.

Note that our earlier statements on "clear direction and support" introduced the training issue. Let us face it, we are losing the ability to apply basic mathematics and physics to equipment issues in our workplace situations. As an example, hundreds of millions of dollars are lost each year due to erroneous lubrication techniques alone. The subject is not dealt with in a pragmatic sense in the engineering colleges of industrialized nations. The connection between Bernoulli's law taught in high-school physics classes and the proper operation of constant level lubricators is lost on a new generation of computer-literate engineers. Managers chase after the "magic bullet" - salvation must be in "high tech," they think.

That is an incredibly costly misconception. We have truly neglected to understand the importance of the non-glamorous basics. We’re no longer interested in time-consuming details. We have encouraged our senior contributors to retire early. All too often, no thought is given to the consequences. Assumptions are made that one could hire contractors to do the thinking for us and not many decision-makers see the fallacy in this reasoning. It should be obvious that at times contract personnel are even less qualified, or have less incentive, to determine the LCC of different alternatives and address the root causes of repeat failures of machinery.

We have become "big picture" men, from the maintenance technician all the way up to the company CEO. We cannot be bothered by details, have no time for details, and are not rewarded for dealing with details.

But, as some outstanding performers have clearly shown, attention to detail is perhaps the most important step they took to get to the top.

They have developed and continue to insist on adequate and appropriate training. This training deals with not only concepts and principles, but hundreds of details as well.

Employees of Best-in-Class companies develop their own short- and long-range training plans. Time and money are budgeted and the training plan signed off by the employee and his or her manager. A training plan has the status of a contract. It can only be altered by mutual consent.

The training plan for a machinery-technical employee was published in Improving Machinery Reliability [1] and typically consists of four columns, as replicated below.

====

Career Years: 1 2 3 4 5 6 7

---

Knowledge of:

• Company organization

• Rotating equipment types

• Company's communication routines

• Relevant R&D studies, vendor capabilities, in-house technical files

• Pump and compressor design

• Machinery reliability appraisal techniques

• Gear design

• Major refining processes

• Machinery design audits

• Machinery piping

• Major chemical processes

• Materials handling equipment

• Hyper compressors

• Thin-film evaporators

• Plastics extruders

• Fiber processing equipment

• Patent and publication matters

---

Work Capability in:

• Interpretation of flow sheets, piping & instrument diagrams

• Elementary technical support tasks, e.g. alignment, vibration monitoring

• Essential computer calculations

• Design specification consulting & support

• Machinery performance testing

• Start-up assistance, all-fluid machines

• Company standards updates

• General technical service tasks, elementary troubleshooting

• Machine-electronic interfaces

• General troubleshooting

• "Shirt-sleeve seminars" (conduct informal training)

• Machinery quality assessment and verification

• Start-up advisory tasks

• Appraisal documentation update tasks

• Hyper compressor specifics

• Machinery design audits

• Technical publications

---

Leading Expertise in:

• Machinery optimization

• Machinery maintenance

• Machinery selection

• Machinery failure analysis

======

A career development training plan was developed along the same lines [3]. Here is the format we have seen for imparting knowledge to new, intermediate, and advanced machinery engineers.

I. NEW ENGINEER (Plant mechanical engineer hiree) Years 1 and 2, possibly years 1 through 5.

A. On-the-job training. Rotational assignments within the plant in various groups, for familiarization with different job functions. Areas to be covered should include machinery, mechanical, inspection, electrical, instrumentation, operations, maintenance, etc.

B. In-house training (Applicable to headquarters/central engineering locations)

• Plant and/or corporate standards development/revisions and updates

• Courses in the above

• Courses dealing with industry standards (API, NEMA, NPRA, etc.)

• Machinery (compressors, pumps, steam and gas turbines, gears, turboexpanders, etc.)

• Failure analysis and troubleshooting (Seven Root Cause method, "FRETT")

• Practical lubrication technology for machinery

• Machinery vibration monitoring and optimized analysis

• Predictive monitoring (lube oil analysis, valve temperature monitoring, etc.).

C. Outside training pursuits (Suggested minimum once/year, preferred frequency twice/year)

1. General vendor-type information courses. Examples:

• A major manufacturer's gas turbine maintenance seminar

• Major mechanical seal manufacturers' training courses

• A major manufacturer's compressor technology, selection, application, and maintenance seminar

• Compressor Control (Anti-Surge) and Turbomachinery Governor Control courses

• A major turbomachinery manufacturer's lube and seal oil systems maintenance course

• Coupling manufacturer's training course, etc.

2. Texas A&M University Turbomachinery Symposium

3. Texas A&M University International Pump Users Symposium

4. Professional Advancement courses in

• Machinery Failure Analysis and Prevention

• Machinery Maintenance Cost Saving Opportunities

• Compressor and Steam Turbine Technology

• Machinery for Process Plants

• Reciprocating Compressor Operation and Maintenance

• Piping Technology

• Practical Mechanical Engineering Calculation Methods.

D. Personal training (Mandatory review of tables of contents of applicable trade journals, books, conference proceedings, etc. Mandatory collection and cataloging of copies of articles that are of potential future value).

Here are some examples of trade journals that often prove useful to equipment reliability professionals:

• Hydrocarbon Processing

• Maintenance Technology

• Oil and Gas Journal

• Chemical Engineering

• Control Design

• Gas Turbine World

• Chemical Processing

• Hydraulics and Pneumatics

• Power Engineering

• Pumps and Systems

• Evolution (SKF Bearing Publication)

• Reliability

• Mechanical Engineering

• Diesel Progress

• Diesel & Gas Turbine Worldwide

• Distributed Power

• Sound and Vibration

• Lubes and Greases

• Sulzer Technical Review

• Plant Services

• World Pumps

• Compressor Tech Two

• Practicing Oil Analysis

• NASA Tech Briefs

Books to be reviewed should include guides on machinery reliability assessment (which include checklists and procedures and popular guides on pumps), Weibull analysis, reciprocating and metering pumps, electric motor guides, books dealing with gear technology, etc. We refer the reader to the Bibliography at the end of this section.

II. INTERMEDIATE ENGINEER (Plant Mechanical/Machinery Engineer), years 3 through 5, possibly 3 through 8.

A. Rotational assignment. Two-year assignment at affiliate location, possibly at Central Engineering or Company Headquarters.

• Involvement in field troubleshooting and upgrading issues

• Familiarization with equipment, work procedures, data logging practices, etc.

• Spare parts procurement practices (probability studies)

• Life-cycle costing involvement

• Maintainability and surveillability input

• Structured networking involvement (provide feedback to other groups).

B. Outside training pursuits.

• Extension of earlier exposure

• Attendance at relevant trade shows and exhibitions (provide feedback to others)

• Attendance at ASME, NPRA, STLE, and related conferences (provide feedback)

• Speaker at local ASME/STLE/Vibration Institute meetings.

C. Personal training and continuing education.

• Develop short articles for trade journals and/or similar publications

• Develop short courses (initial aim: in-plant presentations, intra-affiliate presentations)

• Advanced self-study of material on probability, statistics, automation, management of change

• Studies in applicable economics.

III. ADVANCED ENGINEER (Corporate Specialist, Core Engineering Specialist), years 9 and more, depending on exposure and achievements under II - A/B/C, above.

• International conferences (speaker/participant)

• Peer group interfaces (e.g., on discussion panels, industry standards committees, etc.)

• Develop and present technical papers at national/international engineering conferences

• Pursue book publishing opportunities (case histories, teaching tools, work procedures)

• Regular contributions to trade journals

• Development of consultant skills.

The Letter "R": Recognition and Reward

One of the most important and seemingly little known facts is that most professional employees seek different employment for reasons other than better pay. This situation is analogous to divorces. Few marriages break up because of the intense desire to find a new partner whose income exceeds that of the previous one. Most marriages break up because of lack of respect, untruthfulness, immoral or insensitive conduct, or just plain incompatibility. Most employer-employee relationships are wrecked for the same reasons.

Recognition and reward often come in the form of sincere expressions of appreciation for whatever good qualities or commendable performance are displayed by the employee. A few well-chosen words given privately are usually better than public praise. All too often, public praise generates envy in others and may make life more difficult for the recipient. Rewards in the form of Certificates of Recognition to be hung on the office wall come perilously close to being meaningless, and employers would be wise to consider how these pieces of paper are perceived. If you want to do something positive for the employee, give him or her a certificate for $300 worth of technical books, or a $200 gift certificate for dinner at an upscale restaurant, or a new floor covering, or whatever reaffirms that the employee's contributions are valued.

Several major petrochemical companies frequently reward top technical performers with a bonus of $5000 for exceptional resourcefulness, or the implementation of cost-saving measures, being "doers" instead of "talkers." There is nothing a company likes more than having its professional employees go on record with a firm, well-documented recommendation for specific action, rather than compiling lists of open-ended options for managers to consider. Top technical performers do just that: They make solidly researched recommendations, showing their effect on risk reduction and downtime avoidance, or demonstrating their production and quality improvement impact.

Empathy: The Overlooked Contributor to Asset Preservation

The last item, empathy, is by far the most important and also the most neglected. Yet, it represents the foundation of the CARE concept. Without empathy, without the ability to put oneself into the shoes of the people one manages, a manager will never know them, certainly won’t understand them, and will never bring them to their full potential as employees and people.

Empathy is an understanding so intimate that the feelings, thoughts, and motives of a fellow human being are readily comprehended by another.

You may think that this "intimate understanding" has no place at the office or on the factory floor. Think again.

Say an employee is late coming to work and the manager rebukes her before, or instead of, tactfully inquiring as to the reason for the tardiness. Assume that this employee has a sick child at home. Does the rebuke make her a more efficient or happier worker? We all know the answer to that question.

Now suppose the manager understands how empathy works, or remembers how he would like to be treated if it were his child that was sick. Suppose the manager therefore offers the employee such options as doing the work at or from her home. The most likely result of his showing empathy and compassion would be that instead of getting 80% efficiency out of the unhappy worker at the office, he gets 120% efficiency from the appreciative worker at home. All parties would benefit from empathy and compassion in the workplace.

We’re fully aware of the standing objection to empathy: "The workers will take advantage of me. I would look like a pushover, and not like the firm leader that I want to project." Let us just end the discussion by stating unequivocally that the vast majority of professional employees respond better to kindness than to harshness. Using such traits as compassion, cooperation, communication, and consideration will result in a more productive, satisfied, motivated, and loyal workforce than many managers could ever imagine.

Yes, empathy does more to retain this most valuable asset, your professional employees, than money, slogans, exhortations, and threats.

Empathy, indeed, is the foundation of the ingredients of CARE, and is the hallmark of a long-term BOC company. And so, as we move into the more purely technical topics and sections of this guide, let us never lose sight of the importance of the "people aspects" in capturing and optimizing machinery uptime.
