INFLUENCE OF THE INTERACTION BETWEEN PARTS AND APPRAISERS ON THE RESULTS OF REPEATABILITY


INTRODUCTION
The success of companies on the market has for several decades depended on the quality of the products and services they provide. This quality cannot be achieved without a functioning quality management system, whose main task is to plan, manage and continuously improve all the processes within the organization. All decision-making within the scope of this "trilogy of quality" should be based on collected data or facts. In the case of manufacturing processes, these facts are the measured data of all the monitored quality parameters. An important condition for making the right decision is, in this case, a sufficient amount of quality data, which only a quality measurement system can provide. The ISO/TS 16949 standard, which sets out the requirements for quality management systems in the automotive industry, states in Clause 7.6.1 that statistical studies must be performed in order to analyse the variability of all types of measurement and test systems.

METHODOLOGY
In practice, these statistical studies are carried out according to several methods. One of them is the VDA 5 methodology, which originated in the German automotive industry. The basic principle of this methodology is the narrowing of the tolerance of a given quality characteristic by the calculated measurement uncertainty. However, this methodology is used very rarely, and measurement uncertainty is in practice evaluated only in testing and calibration laboratories. The most widely used methodology, utilized not only in the automotive industry, is Measurement System Analysis (MSA), which was created by three American carmakers: Chrysler Group LLC, Ford Motor Company and General Motors Corporation. Its main principle is the evaluation of the most important statistical properties of measurement systems, including stability, bias, linearity, repeatability and reproducibility. The most frequently performed study is the study of combined repeatability and reproducibility (GRR) of measurements. The MSA Handbook (AIAG, 2010) describes three methods for evaluating these studies:
• Range method
• Average and range method
• Analysis of variance (ANOVA)
Each of these methods has certain advantages and disadvantages that can affect the quality and informative value of the results. The range method, also referred to as the "short method", is not normally used to verify the quality of measurement systems; it serves as a quick check of whether the percentage share of combined repeatability and reproducibility in the total variation (%GRR) is satisfactory. Its major disadvantage is that it does not allow repeatability and reproducibility to be evaluated independently, which is why this work focuses only on the remaining two methods. The variability of results achieved using the average and range method is analysed in detail in previous work (Klaput & Plura, 2012). The following part of this work uses real and purposefully modified data to compare GRR studies evaluated by the average and range method (A&R) and by the ANOVA method. Based on the results of these studies, the conditions under which the two methods give the same or, on the contrary, completely different results are discussed.

Average and range method
The average and range method (A&R) is the method most commonly used in practice for assessing the repeatability and reproducibility of a measurement system. The required data are obtained by repeated measurements of product samples carried out by several appraisers. The method follows a defined procedure, which includes both numerical and graphical evaluation of repeatability (EV) and reproducibility (AV). On the basis of their values, the combined repeatability and reproducibility (GRR) can be calculated according to relation (1).

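$$GRR = \sqrt{EV^2 + AV^2} \qquad (1)$$
The percentage share of GRR in the total variation (%GRR) and the number of distinct categories (ndc) are used as the criteria of measurement system acceptability. They are calculated using relations (2) and (3):
$$\%GRR = 100 \cdot \frac{GRR}{TV} \qquad (2)$$
$$ndc = 1.41 \cdot \frac{PV}{GRR} \qquad (3)$$
where TV is the total variation and PV is the parts variation.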
A measurement system is considered fully acceptable when the %GRR value is lower than 10% and, at the same time, the ndc value is at least 5.
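The calculation described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the Excel application used in this work: the readings are invented, the function name `average_and_range` is hypothetical, and the constants K1, K2 and K3 are the values commonly tabulated in the MSA manual, which should be verified against the edition in use. Like the relations above, it works directly with EV, AV and PV, and negative estimates of AV squared are set to zero.

```python
import math

# Constants of the average and range method (assumed from the MSA tables).
K1 = {2: 0.8862, 3: 0.5908}   # keyed by number of trials
K2 = {2: 0.7071, 3: 0.5231}   # keyed by number of appraisers
K3 = {5: 0.4030, 10: 0.3146}  # keyed by number of parts


def average_and_range(data):
    """data[appraiser][part] is the list of repeated readings of that part."""
    appraisers = list(data)
    k = len(appraisers)               # number of appraisers
    n = len(data[appraisers[0]])      # number of parts
    r = len(data[appraisers[0]][0])   # number of trials

    # Repeatability (EV): average range of the repeated readings.
    ranges = [max(cell) - min(cell) for a in appraisers for cell in data[a]]
    ev = sum(ranges) / len(ranges) * K1[r]

    # Reproducibility (AV): range of the appraiser averages, corrected for EV.
    app_means = [sum(sum(cell) for cell in data[a]) / (n * r) for a in appraisers]
    x_diff = max(app_means) - min(app_means)
    av = math.sqrt(max((x_diff * K2[k]) ** 2 - ev ** 2 / (n * r), 0.0))

    grr = math.sqrt(ev ** 2 + av ** 2)  # relation (1)

    # Parts variation (PV): range of the part averages.
    part_means = [sum(sum(data[a][p]) for a in appraisers) / (k * r) for p in range(n)]
    pv = (max(part_means) - min(part_means)) * K3[n]

    tv = math.sqrt(grr ** 2 + pv ** 2)
    return {
        "EV": ev, "AV": av, "GRR": grr, "PV": pv, "TV": tv,
        "%EV": 100 * ev / tv, "%AV": 100 * av / tv,
        "%GRR": 100 * grr / tv,     # relation (2)
        "ndc": 1.41 * pv / grr,     # relation (3), before truncation
    }


# Hypothetical readings in mm: 3 appraisers x 5 parts x 2 trials.
data = {
    "A": [[10.02, 10.04], [10.10, 10.09], [9.95, 9.97], [10.20, 10.22], [10.05, 10.05]],
    "B": [[10.06, 10.05], [10.14, 10.15], [9.99, 9.97], [10.24, 10.23], [10.09, 10.07]],
    "C": [[10.01, 10.03], [10.09, 10.10], [9.94, 9.96], [10.19, 10.21], [10.04, 10.06]],
}
result = average_and_range(data)
acceptable = result["%GRR"] < 10 and result["ndc"] >= 5
print({name: round(value, 4) for name, value in result.items()}, acceptable)
```

The partial results (average range, range of appraiser averages, range of part averages) are all visible in the function body, which mirrors the transparency of the method.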

ANOVA
The latest, fourth edition of the MSA manual places increasing emphasis on evaluating repeatability and reproducibility using the analysis of variance (ANOVA). With this method, the total variation can be divided into repeatability (EV), reproducibility (AV), parts variation (PV), and the interaction between appraisers and parts (INT). A GRR study evaluated by this method provides more information than the average and range method, because it also shows how much of the total variation is caused by the interaction between the individual appraisers and parts. If this interaction is statistically significant, its value is presented separately, and the combined repeatability and reproducibility is calculated as follows:
$$GRR = \sqrt{EV^2 + AV^2 + INT^2}$$
If the interaction is not statistically significant, it is assigned to the value of repeatability. In this way the ANOVA method can provide more accurate estimates of the variances, provided that the measurement errors are normally distributed. This assumption can be assessed using appropriate graphical tools (Klaput & Plura, 2011). The disadvantage of this method lies in the more complicated calculation of the individual components of variability, so its application requires the use of a computer (Petrík & Palfy, 2011).
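A minimal sketch of this decomposition for a crossed study is given below. It is not the application used in this work. The function name `anova_grr`, the invented readings and the critical value 2.64 (assumed here for the 8 and 15 degrees of freedom of the example at the 5% level) are illustrative assumptions. In practice the critical value would be taken from F tables or statistical software. Negative variance estimates are truncated to zero, which is a common convention.

```python
import math


def anova_grr(data, f_crit):
    """Two-way crossed ANOVA; data[appraiser][part] holds repeated readings."""
    appraisers = list(data)
    k = len(appraisers)               # appraisers
    n = len(data[appraisers[0]])      # parts
    r = len(data[appraisers[0]][0])   # trials

    values = [x for a in appraisers for cell in data[a] for x in cell]
    grand = sum(values) / len(values)
    part_mean = [sum(sum(data[a][p]) for a in appraisers) / (k * r) for p in range(n)]
    app_mean = [sum(sum(cell) for cell in data[a]) / (n * r) for a in appraisers]

    # Sums of squares; the interaction is obtained by subtraction.
    ss_total = sum((x - grand) ** 2 for x in values)
    ss_part = k * r * sum((m - grand) ** 2 for m in part_mean)
    ss_app = n * r * sum((m - grand) ** 2 for m in app_mean)
    ss_err = sum((x - sum(cell) / r) ** 2
                 for a in appraisers for cell in data[a] for x in cell)
    ss_int = ss_total - ss_part - ss_app - ss_err

    df_part, df_app = n - 1, k - 1
    df_int, df_err = df_part * df_app, n * k * (r - 1)
    ms_part, ms_app = ss_part / df_part, ss_app / df_app
    ms_int, ms_err = ss_int / df_int, ss_err / df_err

    significant = ms_int / ms_err > f_crit
    if significant:
        var_ev = ms_err
        var_int = max((ms_int - ms_err) / r, 0.0)
        denom = ms_int
    else:
        # Interaction pooled into repeatability.
        var_ev = (ss_int + ss_err) / (df_int + df_err)
        var_int = 0.0
        denom = var_ev
    var_av = max((ms_app - denom) / (n * r), 0.0)
    var_pv = max((ms_part - denom) / (k * r), 0.0)
    var_grr = var_ev + var_av + var_int
    tv = math.sqrt(var_grr + var_pv)
    return {
        "significant": significant,
        "EV": math.sqrt(var_ev), "AV": math.sqrt(var_av),
        "INT": math.sqrt(var_int), "GRR": math.sqrt(var_grr),
        "PV": math.sqrt(var_pv), "TV": tv,
        "%GRR": 100 * math.sqrt(var_grr) / tv,
    }


# Hypothetical readings in mm; appraiser A reads the third part far too high.
data = {
    "A": [[10.02, 10.04], [10.10, 10.09], [10.45, 10.47], [10.20, 10.22], [10.05, 10.05]],
    "B": [[10.03, 10.02], [10.11, 10.12], [9.96, 9.94], [10.21, 10.20], [10.06, 10.04]],
    "C": [[10.01, 10.03], [10.09, 10.10], [9.94, 9.96], [10.19, 10.21], [10.04, 10.06]],
}
separate = anova_grr(data, f_crit=2.64)          # interaction tested
pooled = anova_grr(data, f_crit=float("inf"))    # interaction forced into EV
print(separate["significant"], round(separate["%GRR"], 1), round(pooled["%GRR"], 1))
```

Calling the function with an infinite critical value forces pooling, which makes it easy to see how the same interaction inflates EV when it is not reported separately.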

IMPACT OF CHANGES OF THE MEASURED VALUES ON THE RESULTS ACHIEVED BY VARIOUS METHODS
As already mentioned above, the results of GRR analysis obtained using the two methods can be very different. This difference may be caused by the occurrence of a statistically significant interaction between the measured parts and appraisers (Osma, 2011; Kazerouni, 2009). This article focuses on exploring the impact of outliers that simulate the effect of interaction between parts and appraisers. The outcomes of the repeatability and reproducibility analysis obtained by the average and range method and by the ANOVA method were compared on real data from the measurement of nut height, performed by three appraisers (Table 1; Plura, 2001).
In order to analyse the partial results of these analyses as well, an application in MS Excel 2010 was prepared for both methods. The accuracy of the results was verified using the Minitab 16 program. The obtained results are shown in the first line of Table 3. A comparison of the determined %EV, %AV, %GRR and ndc values clearly shows minimal differences between the results of the individual methods. This is mainly because the variability caused by the interaction between the parts and appraisers was evaluated by the ANOVA method as statistically insignificant (it is considered to be zero).
Table 1 Measured data of nuts height, mm (Plura, 2001).
The next stage of the solution simulates the effect of increasing variability caused by the occurrence of outliers, which stand in for the interactions between parts and appraisers, on the results obtained by both methods. For this purpose, the measured values of one or two selected parts (both measurements for each part) were successively changed for appraiser A, while the range of the repeated measurements was always maintained. The measured values of the selected parts were gradually increased or decreased by multiples of the standard deviation of repeatability, which was set to 0.025 mm. The changes of the measured values for all three cases are shown in Figure 1.
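The modification of the data can be expressed as a small helper. The sketch below is an assumption about how such a shift might be coded, not the procedure actually used. The function name `shift_part` and the readings are hypothetical, and only the value of 0.025 mm is taken from the text.

```python
import copy

SIGMA_EV = 0.025  # mm, standard deviation of repeatability used for the shifts


def shift_part(data, appraiser, part, k_sigma, sigma=SIGMA_EV):
    """Return a copy of data with all readings of one part by one appraiser
    shifted by k_sigma * sigma; the range of the repeated readings is kept."""
    shifted = copy.deepcopy(data)
    shifted[appraiser][part] = [x + k_sigma * sigma for x in data[appraiser][part]]
    return shifted


# Hypothetical readings in mm: data[appraiser][part] -> repeated measurements.
readings = {
    "A": [[10.02, 10.04], [10.10, 10.09]],
    "B": [[10.03, 10.02], [10.11, 10.12]],
}
modified = shift_part(readings, "A", 0, k_sigma=4)  # part index 0 raised by 4 sigma
print(modified["A"][0])
```

Because both repeated readings move by the same amount, the within-cell range, and therefore the repeatability estimated from ranges, is unaffected by the shift.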

Simulation 1
In the first case, parts No. 3 and No. 8, whose original measured values were close to the average of all the measurements performed by the appraiser, were selected for the changes. The measured values were increased for one part and decreased for the other, so that neither the overall average nor the range of the averages of the individual parts changed. This setting ensured the stability of the values of %EV, %AV, %GRR and ndc evaluated by the average and range method (see Table 2).
Table 2 shows the summary results of the repeatability and reproducibility analysis obtained using the average and range method and the ANOVA method, depending on the number of standard deviations by which the measured values of parts 3 and 8 were increased and decreased. Whereas the results of the average and range method remain constant, the ANOVA results change considerably, mainly owing to the occurrence of the part-appraiser interaction.
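Why opposite shifts leave the inputs of the average and range method untouched can be checked numerically. The following sketch uses invented readings for a single appraiser, and the function name `appraiser_summary` is hypothetical. It compares the appraiser average and the average range before and after two parts are shifted by the same amount in opposite directions.

```python
def appraiser_summary(cells):
    """Return (average of all readings, average range) for one appraiser."""
    values = [x for cell in cells for x in cell]
    average = sum(values) / len(values)
    average_range = sum(max(cell) - min(cell) for cell in cells) / len(cells)
    return average, average_range


# Hypothetical readings of one appraiser in mm (four parts, two trials each).
appraiser_a = [[10.02, 10.04], [10.10, 10.09], [10.05, 10.06], [10.20, 10.22]]

delta = 5 * 0.025  # shift by five standard deviations of repeatability
shifted = [list(cell) for cell in appraiser_a]
shifted[1] = [x + delta for x in shifted[1]]  # one part raised
shifted[2] = [x - delta for x in shifted[2]]  # another part lowered

before = appraiser_summary(appraiser_a)
after = appraiser_summary(shifted)
print(before, after)
```

Both quantities agree up to floating-point rounding, so EV and the range of appraiser averages are unchanged. The range of part averages also stays the same as long as the shifted parts do not become the largest or smallest part.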
Figure 1 The changes of the measured values for all three simulations.
When the values of the parts in question are changed by only one standard deviation, the interaction is still statistically insignificant and the results remain practically the same. When the changes amount to two standard deviations or more, however, the value of the interaction increases significantly. This also causes a significant increase of %GRR (see Figure 2). The behaviour of the values of repeatability (%EV) and reproducibility (%AV) is interesting. While %AV decreases with the increasing number of standard deviations and reaches zero when the values are changed by ten standard deviations, %EV decreases only slightly. With the increasing shift of the measured values, there is also a slight decrease in %PV. The increase of %GRR and the slight decrease in %PV are reflected in a significant decline in the value of ndc (see Table 2). Already at a change of 7 standard deviations, the ndc value falls below 5 and the measurement system would be classified as unacceptable.
Figure 2 Changes in GRR study for Simulation 1.
The changes of the evaluated indicators are also connected with the change of the total variation (TV), to which the percentages of the calculated indicators are related. It was calculated on the basis of the measured values, as the set of measured nuts represented the production range. While the total variation determined by the average and range method did not change, the total variation calculated using the ANOVA method increased with the growing shift of values (see Figure 3), which somewhat mitigated the changes of the evaluated indicators.
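The mitigating effect can be made explicit. Assuming, as is usual when the total variation is determined from the measured data, that TV is composed of GRR and PV, the percentage share of GRR can be written as a function of the ratio PV/GRR alone:

```latex
\[
TV = \sqrt{GRR^2 + PV^2}, \qquad
\%GRR = 100 \cdot \frac{GRR}{TV} = \frac{100}{\sqrt{1 + \left( PV / GRR \right)^2}}
\]
```

Under this assumption, a growing GRR enlarges the numerator and the denominator at the same time, so %GRR rises more slowly than GRR itself and cannot exceed 100%.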

Simulation 2
In the second case, the measured values of part No. 1, which has the highest average value of all the measured parts, were increased. These changes therefore led to a change of the overall average of all the measurements of the given appraiser, but not of the average range. The values of the final parameters for the second case are shown in Table 3 and illustrated in Figure 4. When the average and range method was applied, the growing shift of the measured values was accompanied over the entire range by a slight decrease in %EV. This decrease is associated with an increase in the total variation (see Figure 3), because the repeatability itself does not change, owing to the constant average range of the repeated measurements.
In the case of the percentage share of reproducibility (%AV), the values first decline slightly, and only with larger changes do they show the expected growth. The initial decrease is related both to the increasing total variation and to the fact that, for smaller changes, the average of all the measurements of the given appraiser does not affect the range of the averages of the individual appraisers. The course of %GRR practically copies the course of %AV.
When the ANOVA method is used, the value of %GRR increases even with the smallest shift, by one σ. In this case, however, the part-appraiser interaction itself is not yet statistically significant, and its value is therefore included in the value of repeatability, which is reflected in a slightly higher value of %EV. If the shift of both measured values of the given appraiser is 2σ or higher, the interaction is evaluated as statistically significant and its contribution is calculated independently. The increasing interaction leads to a gradual reduction of %AV and %EV, which is in line with the calculation relations of the ANOVA method (Burdick, Borror & Montgomery, 2005). Table 3 and Figure 4 clearly show that the ANOVA method is much more sensitive to the occurrence of interaction than the average and range method. Using the A&R method would, in this case, not change the evaluation of the acceptability of the measurement system, not even for a shift of 20σ. In contrast, the evaluation of GRR using the ANOVA method would, in terms of %GRR, rate the measurement system as unacceptable from a shift of 12σ.
With regard to the size of the percentage share of the interaction (%INT) or of %GRR, the values in question are lower than in Simulation 1. This is because, in this case, two measured values were changed, while in Simulation 1 four measured values were changed.
The differences in the results obtained by the individual methods also have an impact on the evaluation of the acceptability of the measurement system. With the average and range method, the measurement system would remain acceptable over the entire range of simulated changes, whereas with the ANOVA method it would become unacceptable at a shift as low as 10σ, owing to the low value of ndc.
Figure 4 Changes in GRR study results for Simulation 2.

Simulation 3
In the third case, the measured values of part No. 4, which has the smallest average value of all the measured parts, were gradually increased for the same appraiser. As in Simulation 2, the overall average of all the measurements of the given appraiser changed, but the average range of the repeated measurements did not. The determined values of the final indicators for this case are shown in Table 4 and illustrated in Figure 5.
Because the simulated interaction in this case is very similar to that in Simulation 2, one would expect the final values, or their changes depending on the size of the shift of the two measured values of the given part, to be similar, if not the same. However, when the average and range method was used, the first differences are apparent already in the course of %EV. While the percentage share of repeatability in Simulation 2 decreased over the entire range, in this case small shifts first lead to an increase of this value, which then remains constant once the shift reaches approximately 9σ.
This behaviour is clearly related to the change of the total variation, whose dependence on the shift also has the shape of a broken curve (see Figure 3). The total variation TV decreases for small shifts, because the range of the averages of the individual parts, which is used as the basis for calculating the variability between the measured parts (PV), declines. Once the average of all the measurements of the given part reaches the level of the second smallest part, the range of the part averages remains constant. A similar effect can be seen in the percentage share of reproducibility (%AV).
The initial increase of this value (up to approximately 6σ) is related, as in the case of %EV, to the decreasing total variation. The subsequent, more significant increase is caused by the fact that the average of all the measurements of the given appraiser becomes the maximum of the averages of the individual appraisers, which directly affects the range of averages.
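The broken shape of the total variation curve follows from how a range reacts when its smallest element is raised. The sketch below illustrates this with invented part averages. The function name `part_average_range` is hypothetical, and the shift is expressed directly as the change of the part average rather than of the individual readings.

```python
def part_average_range(part_means, shift):
    """Range of the part averages after the smallest average is raised by shift."""
    means = sorted(part_means)
    means[0] += shift  # only the part with the smallest average is modified
    return max(means) - min(means)


# Hypothetical part averages in mm.
part_means = [9.95, 10.03, 10.10, 10.20]

for shift in (0.0, 0.04, 0.08, 0.12):
    print(shift, round(part_average_range(part_means, shift), 3))
```

In this example the range shrinks while the raised average is still the minimum and stops changing once the second smallest part takes over as the minimum, which gives the kink described in the text.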
When the ANOVA method is applied, the values of %EV remain practically unchanged with the increasing shift of the measured values, and %AV increases only slowly. There is, however, a significant increase in the value of the interaction between parts and appraisers, the course of which is then copied by the percentage share of combined repeatability and reproducibility (%GRR). The analysis using the ANOVA method changes the evaluation of the acceptability of the measurement system in this case as well. With the average and range method, the measurement system becomes unacceptable only at a shift of 18σ, which is caused by the low value of ndc. With the ANOVA method, the system becomes unacceptable for the same reason at a shift of 9σ.

CONCLUSION
The results of the simulations show that the ANOVA method is more suitable for the analysis of repeatability and reproducibility of a measurement system. Its main advantage is the ability to detect possible interactions between parts and appraisers, which may significantly increase the variability of the measurement system. This usually makes analyses using this method more sensitive to unusual situations, such as outliers.
In studies of repeatability and reproducibility where these interactions do not occur, comparable results are achieved by the average and range method, whose indisputable advantage is that the evaluation procedure is much more transparent and a series of partial results can be analysed as well. However, this method does not allow the variability caused by the interaction between parts and appraisers to be detected.
Analyses of measurement systems based on numerical evaluation must always be complemented with appropriate graphical tools (Klaput & Plura, 2011). These make it possible to obtain a much more comprehensive picture of the quality of the evaluated measurement system and to identify the specific causes of its properties.

ACKNOWLEDGEMENT
This paper was prepared within the framework of the specific research project No. SP2012/42, carried out at the Faculty of Metallurgy and Materials Engineering, VŠB-TU Ostrava with the support of the Ministry of Education, Youth and Sports, Czech Republic.

Figure 3 Changes of total variation in the individual simulations depending on the shift of measured data.

Figure 5 Changes in GRR study results for Simulation 3.

Table 2 Results of Simulation 1 for A&R and ANOVA methods.

Table 3 Results of Simulation 2 for A&R and ANOVA methods.

Table 4 Results of Simulation 3 for A&R and ANOVA methods.