The Maintenance Jack-knife chart
A way to visualise asset criticality and help prioritise improvement
Have you ever wanted a single chart that provided a quick way of seeing the main reliability problems for your fleet? Perhaps a Jack-knife chart will help.
Before presenting the Jack-Knife, it is worth stepping back and thinking about some fundamental principles of how information is shared and absorbed.
I am an ex-military maintainer. I believe you could walk into the technical office of any military organisation that uses equipment and see a whiteboard showing asset status. The whiteboard enables any person in the team to gain situational awareness at a glance. Gaining rapid situational awareness is part of the military’s DNA, especially in combat situations. You can also observe similar ideas being used in hospital wards and in workplaces with Kaizen or Kanban boards. The hospital board shows all the beds, occupancies and essential information. Whiteboards are updated in real time by the staff. They are instantly accessible and easy to change and read, which makes them a very simple but very effective means of imparting information.
We can discuss other ways situational awareness can be enhanced in the maintenance domain in a future blog.
How can this idea of “grasping situational awareness at a glance” be extended in the reliability domain?
The jack-knife chart may be a candidate. This is a type of scatter chart where the x- and y-axes represent two important dimensions of criticality for an asset’s components. The x-axis shows the likelihood or frequency of failure, and the y-axis shows the impact or consequences of failure. Each component’s criticality is represented as a plotted data point on the chart. Usually, the impact axis is the ‘time to repair’ or downtime of the failed assets. The diagonal lines represent borders between high and low criticality. The horizontal and vertical lines show the borders between high and low acuteness and frequency respectively. Any component plotted in the upper right quadrant fails both often and with high impact, so it deserves attention first. Borders may show limits of acceptability for an organisation, or for standards or legislation. The axes are also logarithmic, enabling a wider range of data to be fitted. Here is a real example:
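To make the quadrant idea concrete, here is a minimal sketch that sorts components into the four regions of the chart. It assumes the borders are placed at the fleet averages (average failures per component and average time to repair), which is one possible convention; an organisation may well set its own limits. The component names and numbers are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Component:
    name: str
    failures: int           # failures in the period (x-axis)
    downtime_hours: float   # total downtime in the period

    @property
    def mttr(self) -> float:
        # Mean time to repair (y-axis)
        return self.downtime_hours / self.failures


def jackknife_quadrants(components):
    total_failures = sum(c.failures for c in components)
    total_downtime = sum(c.downtime_hours for c in components)
    # Assumed borders: fleet averages. Replace with your own limits.
    freq_limit = total_failures / len(components)
    mttr_limit = total_downtime / total_failures
    result = {}
    for c in components:
        chronic = c.failures > freq_limit   # fails often
        acute = c.mttr > mttr_limit         # takes long to repair
        if acute and chronic:
            result[c.name] = "acute and chronic"
        elif acute:
            result[c.name] = "acute"
        elif chronic:
            result[c.name] = "chronic"
        else:
            result[c.name] = "low priority"
    return result


# Hypothetical data, for illustration only
fleet = [
    Component("hydraulic pump", failures=12, downtime_hours=96.0),
    Component("drive belt", failures=20, downtime_hours=20.0),
    Component("gearbox", failures=2, downtime_hours=60.0),
    Component("filter", failures=2, downtime_hours=4.0),
]
quadrants = jackknife_quadrants(fleet)
```

With this data the hydraulic pump lands in the upper right ("acute and chronic") region, which is where improvement effort would normally start.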
We can build on the baseline Jack Knife chart to include some other display options. The chart shown below is the equivalent of the chart above. With most modern charting tools it is possible to label data points with other data or, if the charts are interactive, to reveal labels on mouseover. A couple of labels show part descriptions as an example of what may be possible.
An asset may have hundreds or even thousands of functionally significant components. We should be able to plot everything and use logarithmic scales to fit everything in a single chart. This detailed chart provides us with the overall picture of the asset. The clustering of the data may be compared between similar assets. However, we should also be able to filter down to the topmost critical components to focus where we want improvement efforts to be concentrated. The example we have used only shows a limited set of parts for illustrative purposes.
The conventional Jack Knife shown above could be extended with other dimensions or attributes of criticality. For example, in a maintenance Failure Modes and Effects Analysis (FMEA), the detectability of a failure is often included as a dimension of the Risk Priority Number (RPN) to help quantify criticality. Detectability refers to how evident a component failure is.
If a failure occurs suddenly with no symptoms or other indications, the detectability is low. A failure with low detectability is treated as more severe than one that develops gradually with many symptoms and indications. Detectability is better still when the symptoms present in a predictable sequence over the degradation time.
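As a small worked example, the RPN in an FMEA is usually the product of three scores: severity, occurrence and detection. The sketch below assumes the commonly used 1 to 10 scales, with a higher detection score meaning the failure is harder to detect. The scores themselves are made up.

```python
def risk_priority_number(severity, occurrence, detection):
    """RPN = severity x occurrence x detection (assumed 1-10 scales)."""
    for score in (severity, occurrence, detection):
        if not 1 <= score <= 10:
            raise ValueError("each score must be between 1 and 10")
    return severity * occurrence * detection


# Same severity and occurrence, different detectability (illustrative scores)
hidden_failure = risk_priority_number(severity=7, occurrence=4, detection=9)
evident_failure = risk_priority_number(severity=7, occurrence=4, detection=2)
```

The hidden failure scores 252 against 56 for the evident one, which shows how strongly poor detectability can raise the ranking of an otherwise identical failure mode.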
In a scatter chart we can show the third dimension of detectability using colour coding or the size of the individual markers in the chart. Detectability could be classified into three (or more) classes to keep the visualisation simple. In the example below, marker size designates the detectability class. The chart legend shows the three classes of detectability, using the same components as the basic chart above.
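A minimal sketch of the three-class idea follows. It assumes detectability is judged by the warning time between the first symptom and the failure, and the thresholds and marker sizes are hypothetical. The resulting record could be passed to whichever charting tool is in use.

```python
# Marker sizes per class: larger means harder to detect (illustrative values)
MARKER_SIZE = {"high": 40, "medium": 120, "low": 300}


def detectability_class(warning_hours):
    # Assumed thresholds: no warning, under a day, a day or more
    if warning_hours <= 0:
        return "low"
    if warning_hours < 24:
        return "medium"
    return "high"


def to_plot_point(name, failures, mttr_hours, warning_hours):
    cls = detectability_class(warning_hours)
    return {
        "label": name,
        "x": failures,        # frequency dimension
        "y": mttr_hours,      # impact dimension
        "detectability": cls,
        "size": MARKER_SIZE[cls],
    }


# Hypothetical component that fails with no warning
point = to_plot_point("pressure relief valve", failures=3, mttr_hours=12.0, warning_hours=0)
```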
Other attributes of criticality can be mixed with either of the frequency or impact dimensions.
- We could use the B-20 Weibull measure using the inverse of B-20 Age for the frequency of failure. In this previous blog, we discussed how B-20 may be a more informative reliability metric than MTBF or an averaged failure rate.
- We might include attributes such as Safety or operational factors, material, labour and logistics costs, as well as factors for logistic delay or lead time to the impact score.
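For the B-20 option in the list above, here is a minimal sketch of how the frequency score could be derived, assuming a two-parameter Weibull distribution has already been fitted for the part. The B-20 age is the age by which 20% of the population is expected to have failed. The parameter names `shape` and `scale` are our own.

```python
import math


def b_life(shape, scale, fraction=0.20):
    """Age by which `fraction` of parts have failed (two-parameter Weibull)."""
    return scale * (-math.log(1.0 - fraction)) ** (1.0 / shape)


def frequency_score(shape, scale):
    # Inverse of the B-20 age: a shorter B-20 life gives a higher score
    return 1.0 / b_life(shape, scale)


# Illustrative parameters: same characteristic life, different shapes
b20_random = b_life(shape=1.0, scale=1000.0)
b20_aged = b_life(shape=2.0, scale=1000.0)
```

With the same characteristic life of 1000 hours, the part with the random pattern reaches B-20 at about 223 hours and the ageing part at about 472 hours, so the first part would plot further to the right on the frequency axis.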
Other indicators that may be toggled on and off using colour coding may be:
- Colour coding based on Weibull shape, showing whether a part’s failure pattern is premature, random or aged. It is probable that, with B-20 as the major frequency dimension, parts with a premature failure pattern will score highest. Premature failures are predominantly caused by quality issues in operations and maintenance. These causes should be preventable by the operating company.
- Part obsolescence risks may be included, where the supply may become restricted as parts may no longer be available.
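The Weibull shape colour coding could be sketched as follows. A shape parameter below 1 indicates a decreasing failure rate (premature failure), a value near 1 a roughly constant rate (random) and a value above 1 an increasing rate (aged). The tolerance band around 1 and the colours are assumptions for illustration.

```python
# Illustrative colour choices per failure pattern
PATTERN_COLOUR = {"premature": "red", "random": "orange", "aged": "blue"}


def failure_pattern(shape, tolerance=0.1):
    # Assumed band: shapes within +/- tolerance of 1.0 count as random
    if shape < 1.0 - tolerance:
        return "premature"
    if shape > 1.0 + tolerance:
        return "aged"
    return "random"


colour = PATTERN_COLOUR[failure_pattern(0.6)]
```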
The chart below shows a combination of detectability as the size of the plotted point combined with the colour coding for the pattern of failure, derived from the Weibull shape parameter.
Organisations may have other attributes of equipment criticality they may want to include, with weighting factors if they consider some attributes are more important in their context than others.
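One simple way to combine several attributes with weighting factors is a weighted average. The sketch below assumes each attribute has already been scored on a common scale. The attribute names, scores and weights are hypothetical and would be set by the organisation.

```python
def weighted_impact(scores, weights):
    """Weighted average of attribute scores (all on the same scale)."""
    total_weight = sum(weights.values())
    return sum(scores[name] * weight for name, weight in weights.items()) / total_weight


# Hypothetical 1-10 scores for one component
scores = {"safety": 8, "downtime": 6, "cost": 4, "lead_time": 2}
# Hypothetical weights: this organisation rates safety highest
weights = {"safety": 4, "downtime": 3, "cost": 2, "lead_time": 1}

impact = weighted_impact(scores, weights)
```

Here the combined impact score is 6.0. Changing the weights shifts the component up or down the impact axis without touching the underlying data.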
The Jack-Knife chart may also have a strategic value; it can be used to help maintenance align with business strategy. This is illustrated on the chart below with the thick black arrows for improving cost or productivity.
- If there is strong market demand for the organisation’s output and/or commodity prices are high, then the emphasis may be put on improving the reliability and availability of the assets, moving components leftwards along the x-axis. This means primarily focusing on decreasing the likelihood or frequency of failure of the most critical parts. Eliminating defects and premature failure will pay the highest returns.
- If the market takes a downturn and profits are reduced, then cost cutting may be the primary focus, whilst maintaining quality. This means focusing on reducing the highest scoring components on the impact dimension, moving them down the y-axis.
Improving detectability may be constrained by the physics of the failure modes, or the ability to monitor the equipment (either with people or sensing systems). For components with hidden failures, doing the right maintenance involves implementing failure-finding tasks. The timeliness of these tasks is essential when they are applied to safety and protection systems. Failures may become more apparent as operator and maintainer awareness improves, and predictive maintenance may play a part where existing sensors can provide symptoms of some of the failures with low detectability. Staff awareness and knowledge of what should be reported in notifications or work-order reporting is a subject we will cover in a future blog.
Within the previous three blogs looking at maintenance metrics we have looked at timeliness of scheduled maintenance and ensuring that we not only do the right maintenance, but we do maintenance right.
Given we have these other metrics, it should be possible to calculate what the improvement opportunities are. Specifically, these include eliminating premature failure, doing the right maintenance and improving the planned to corrective maintenance ratio. We also saw measures such as scheduling timeliness and adherence to estimates. These can be quantified, and improvement opportunity lines can be plotted on the chart and ranked in data tables. For senior management an aggregated top line improvement opportunity could be shown on a dashboard.
The chart below shows an improvement opportunity for a component with the green arrowed line, which would represent the savings made by eliminating premature failure from this component. The head of the arrow shows where the current plotted position could be shifted if premature failure could be prevented. The chart could be arranged to show all or a number of the component improvement arrows.
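The arrow for a single component could be calculated along these lines. This sketch assumes the premature failures have already been identified (for example from the Weibull analysis) and that the time to repair is unchanged, so the point only moves along the frequency axis. The numbers are illustrative.

```python
def premature_failure_opportunity(failures, premature_failures, mttr_hours):
    """Start and end of the improvement arrow, plus downtime saved."""
    remaining = failures - premature_failures
    if remaining <= 0:
        # A log axis cannot plot zero failures
        raise ValueError("at least one failure must remain to plot the point")
    return {
        "start": (failures, mttr_hours),    # current plotted position
        "end": (remaining, mttr_hours),     # position without premature failures
        "downtime_saved_hours": premature_failures * mttr_hours,
    }


# Hypothetical component: 12 failures, 5 of them premature, 8 hours to repair
arrow = premature_failure_opportunity(failures=12, premature_failures=5, mttr_hours=8.0)
```

In this example the point would move from 12 failures to 7 failures at the same repair time, a saving of 40 hours of downtime over the period.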
It would be powerful if the Jack-Knife chart could be interactive, adding or subtracting different attributes from the three dimensions, and toggling flags on and off.
If the chart were linked to a data grid or table that ranked the parts by criticality, or sorted them by improvement opportunity, it would provide the future improvement worklist for the maintenance and reliability department.
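A ranked table of this kind is simple to produce once each part has its scores. The sketch below assumes each part is a record with a criticality score and an improvement opportunity in downtime hours. The field names and values are hypothetical.

```python
def improvement_worklist(parts, sort_by="downtime_saved_hours", top_n=3):
    """Return the names of the top parts, highest score first."""
    ranked = sorted(parts, key=lambda part: part[sort_by], reverse=True)
    return [part["name"] for part in ranked[:top_n]]


# Hypothetical scores, for illustration only
parts = [
    {"name": "hydraulic pump", "criticality": 96.0, "downtime_saved_hours": 40.0},
    {"name": "drive belt", "criticality": 20.0, "downtime_saved_hours": 12.0},
    {"name": "gearbox", "criticality": 60.0, "downtime_saved_hours": 0.0},
    {"name": "filter", "criticality": 4.0, "downtime_saved_hours": 2.0},
]

by_opportunity = improvement_worklist(parts)
by_criticality = improvement_worklist(parts, sort_by="criticality")
```

The two rankings differ: the gearbox is second by criticality but offers no saving from eliminating premature failure, so it drops out of the opportunity list.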
We have discussed the concept of situational awareness, taking whiteboards as an example of how staff can ‘grasp situational awareness at a glance’. The jack-knife may be a candidate for exploring asset components and their criticality visually. If the charts are made interactive, allowing different factors to be toggled, they provide a powerful means of prioritising maintenance improvement work. Have you ever used a Jack Knife or similar plot to help situational awareness or identify priorities for improvement work? We would love to hear your experiences in the comments.
In the next blog we will explore defect elimination, which is sometimes called proactive maintenance. This is where common causes of failure for many components may be prevented by improving process quality. This approach to maintenance can also be linked back to the Jack-Knife diagram.