The increasing amount of process data available from machining and other manufacturing processes, together with machine learning methods, provides new possibilities for quality control and condition monitoring. Predicting the workpiece quality at an early machining stage can be used to alter current quality control strategies and could save time, cost, and resources. However, most methods are tested under controlled lab conditions, and few implementations in real manufacturing processes have been reported so far. The main reason for the slow uptake of this promising technology is the need to prove the capability of a machine learning method for quality prediction before it can be applied in serial production and supplement current quality control methods. This article introduces and compares approaches from the fields of machine learning and quality management for assessing predictions. The comparison and adaptation of the two approaches is carried out for an industrial use case at Bosch Rexroth AG, where the diameter and the roundness of bores are predicted with machine learning based on process data.
Summary: The increasing availability of process data from manufacturing processes and the accessibility of machine learning methods open up new possibilities for quality control and condition monitoring. Predicting the quality of a workpiece at an early machining stage can change existing quality control strategies and also yield savings in time, cost, and resources. Most prediction models are tested exclusively under controlled laboratory conditions, so only a few have been implemented in real manufacturing processes so far. The main reason for the slow integration of this promising technology into series production, and for its slow adoption as a supplement to existing quality control strategies, is the need to prove the capability of a machine learning method for quality prediction. This article presents one approach each from the fields of machine learning and quality management for assessing the accuracy of a quality prediction. The two approaches are implemented for an industrial use case at Bosch Rexroth AG, in which the diameter and the roundness of bores are predicted with machine learning based on process data.
Keywords: Quality prediction; prediction assessment; machine learning; manufacturing; series production
Through the integration of more and more sensors into machine tools and the accessibility of data from numerical controllers (NC) and programmable logic controllers (PLC), an increasing amount of process data is available for each workpiece [[
The use case considered in this research paper addresses the quality prediction of drilled and reamed bores of hydraulic valves at Bosch Rexroth AG in Homburg (Germany). The process data were obtained from a milling machine in series production during the machining of pre-cast valve housings made of gray cast iron. Hydraulic valves are characterized by narrow tolerances that allow seal-less fits and prevent oil leakage. Even slight quality deviations can cause high scrap rates and financial losses. Therefore, quality control close to real time at minimal or ideally no additional cost is preferred.
Graph: Figure 1 Manufacturing process of a hydraulic valve with current and future quality control strategy.
During the manufacturing process, each valve housing is first machined, then the valve is assembled, and finally an end-of-line test is carried out, as depicted in Figure 1. Sample inspection with industrial metrology is currently used to control the machining process. The sample inspection covers only a small percentage of all valves but still results in high costs. In addition, the latency between the machining of a valve and the corresponding measurement results allows no direct feedback about the machining process and risks producing scrap housings until the measurement results are known. Moreover, a considerable amount of resources, time, and money is lost if a valve is only identified as scrap during end-of-line testing. To increase the transparency of the machining process and to determine the quality of the housings close to real time, a quality prediction based on process data and machine learning methods is pursued. The aim of the data collection is to use the process data (e.g., torque, current, and speed of the machining center) already available in an NC, without requiring the integration of any additional sensors. This would also allow a fast transfer of the technology to further machining centers and reduce maintenance cost. Hence, a feasible and cost-effective in-process quality control solution for machining processes under industrial conditions could be achieved.
To establish the in-process quality prediction, a gateway and an industrial PC were integrated into the control cabinet of the machining center (GROB; G500). The gateway is required to connect the industrial PC with the drive controllers of the spindle and the z-axis as well as the PLC. The drive controllers sample the actual torque values at a high frequency and bundle them into data packages, which are sent to the gateway via the OPC UA interface. The data packages are forwarded to the industrial PC and stored in a database (MongoDB). Python is used to calculate features from the raw data and to predict the workpiece quality with machine learning methods. Depending on the chosen feature extraction strategy [[
Comprehensive studies [[
The applied machine learning methods belong to the ensemble methods [[
The diameter of a bore is characterized by its dimensional tolerance, which describes the allowed deviation from the nominal diameter. The diameters of drilled bores of a production batch are usually normally distributed because of the tool wear. In Figure 2 (a), the diameter values of the training data have the shape of a bimodal distribution. The reason for this is the uneven distribution of the collected training data, which furthermore belong to different batches. Adding the validation data to the training data set would result in a distribution curve that is more similar to a normal distribution curve.
The roundness specifies how closely the cross-section of a bore approximates a perfect circle, i.e., the difference in diameter between the outer and the inner circle which envelop the circumferential line of a bore. Thus, an ideal bore has a roundness of zero, and the roundness cannot be negative. Hence, the corresponding measured values accumulate to the right of the zero point, which leads to a folded normal distribution. In Figure 2 (b), the measured roundness values are depicted, and the maximum is observed at 0.9 µm and not at 0 µm. Thus, a systematic roundness error exists for this machining operation, resulting in a density function very similar to a normal distribution (especially for the validation data). Therefore, statistics applicable to normally distributed data are used throughout the analysis.
The main focus of this article is on determining suitable approaches to assess quality predictions in manufacturing. This can be done either with common performance metrics for evaluating predictions or by applying the measurement process capability procedures established in manufacturing.
Graph: Figure 2 Distribution of the diameter and roundness predictions as well as the training and validation data sets. [ [
To assess the prediction accuracy of a trained algorithm, interpretable performance metrics are necessary. They make it possible to evaluate the suitability of an algorithm for a use case and to compare algorithms with one another. The metrics are calculated on a test data set and require the number of predictions, the predicted values, and the actual values (ground truth). Suitable performance metrics are, for example, the mean absolute error (MAE), the maximum error (MAX), and the coefficient of determination (one minus the residual sum of squares divided by the total sum of squares) (R
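The three metrics named above, and their relation to a tolerance, can be sketched in a few lines. This is a minimal illustration in plain Python, not the authors' implementation; the function names and the sample values (deviations from nominal in µm) are assumptions made for the example.

```python
def mean_absolute_error(actual, predicted):
    """Mean of the absolute differences between ground truth and prediction."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def max_error(actual, predicted):
    """Largest absolute difference between ground truth and prediction."""
    return max(abs(a - p) for a, p in zip(actual, predicted))


def r_squared(actual, predicted):
    """One minus residual sum of squares divided by total sum of squares."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def tolerance_ratio(tolerance, error):
    """How many times an error metric fits into the tolerance."""
    return tolerance / error


# Hypothetical measured and predicted values in µm (illustration only).
measured = [10.0, 10.2, 9.9, 10.1]
predicted = [10.1, 10.1, 10.0, 10.1]

print(mean_absolute_error(measured, predicted))
print(max_error(measured, predicted))
print(r_squared(measured, predicted))
```

Setting a metric in relation to the tolerance is then a single division: for a tolerance of 10 µm and an MAE of 0.126 µm, `tolerance_ratio(10.0, 0.126)` is roughly 79, i.e. the MAE is about 1/79 of the tolerance.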
Capability studies are necessary to ensure that a measuring device can determine a quality characteristic with sufficiently small uncertainty (measurement deviation and measurement value scatter) with regard to the feature tolerance [[
The procedures used in the Bosch Group for assessing the capability of measurement and test processes (booklet 10) [[
Procedure one assesses the capability of a measuring process in terms of location and variation of the measured values within the tolerance field of the quality characteristic. The method is carried out with a standard, which is measured 50 times. All working steps between the individual measurements of the measurement series must be carried out completely. This means that the measurement standard must be removed from the clamping and re-inserted before each measurement. If there is no standard, a calibrated workpiece can be used instead. Procedure one requires quality characteristics with two-sided specification limits, i.e., with a lower and an upper specification limit (LSL and USL), so that the tolerance T is defined as the difference between USL and LSL. The reference value of the standard
Potential capability index:
Graph
Critical capability index:
Graph
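The two indices can be illustrated with a short calculation. The sketch below assumes the form commonly given for procedure one, C_g = 0.2·T / (6·s) and C_gk = (0.1·T − |x̄ − x_m|) / (3·s), where s is the standard deviation of the repeated measurements, x̄ their mean, and x_m the reference value of the standard. The function name, the sample values, and the acceptance limit of 1.33 are assumptions for illustration and should be checked against the edition of the booklet in use.

```python
import statistics


def procedure_one_indices(values, reference, tolerance):
    """Return (Cg, Cgk) for repeated measurements of one standard.

    Assumed formulas: Cg = 0.2*T / (6*s), Cgk = (0.1*T - |mean - ref|) / (3*s).
    """
    s = statistics.stdev(values)
    if s == 0:
        raise ValueError("standard deviation is zero; indices are undefined")
    mean = statistics.fmean(values)
    cg = 0.2 * tolerance / (6 * s)
    cgk = (0.1 * tolerance - abs(mean - reference)) / (3 * s)
    return cg, cgk


# Hypothetical series: 50 readings as deviations from the reference in µm.
readings = [-0.05, 0.05] * 25
cg, cgk = procedure_one_indices(readings, reference=0.0, tolerance=10.0)

ASSUMED_LIMIT = 1.33  # assumed acceptance limit for both indices
print(cg >= ASSUMED_LIMIT and cgk >= ASSUMED_LIMIT)
```

Because the mean of the hypothetical readings coincides with the reference value, the systematic term vanishes and both indices take the same value; any offset between mean and reference lowers C_gk only.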
If procedure one has been successfully completed, procedure three can be used to assess the capability of a measurement process in terms of its variation behavior using measurements of workpieces from series production. It is assumed that the operator has no influence on the measurement process. In contrast to procedure one, procedure three includes possible interactions between the measurement process and the measuring object in the capability study. This concerns the influence of the production part variation on the measurement as well as the influence of the measurement on the behavior of the production parts. For the implementation, at least 25 workpieces from series production are required, which are randomly selected and whose characteristic values are within the tolerance. The selected serial parts are measured in random order in at least two measurement series under repeatability conditions. The aim is to determine the total variation %GRR (gage repeatability and reproducibility) of the measurement process. The capability of the measurement process is finally determined based on defined limit values for %GRR:
- %GRR ≤ 10 %: measurement process is capable,
- 10 % < %GRR ≤ 30 %: measurement process is conditionally capable,
- 30 % < %GRR: measurement process is not capable.
The reference value for %GRR is the tolerance T of the measured characteristic and the GRR value is the same as the equipment variation value [[
Graph
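The classification by %GRR can be sketched as follows. The limits of 10 % and 30 % are those stated above. The calculation itself is an assumption for illustration: with no operator influence, GRR is taken as the equipment variation, estimated here as the pooled within-part standard deviation of two series, and %GRR is taken as 6·GRR / T · 100 %. The function names and sample values are hypothetical, and the estimator may differ from the one used in the booklet or in commercial software.

```python
import math


def percent_grr(series_1, series_2, tolerance):
    """Estimate %GRR from two repeated series over the same parts.

    Assumptions: GRR equals the equipment variation (no operator influence),
    estimated as the pooled within-part standard deviation; for two repeats
    the within-part variance of a part is (x1 - x2)**2 / 2.
    """
    if len(series_1) != len(series_2):
        raise ValueError("both series must cover the same parts")
    pooled_variance = sum(
        (a - b) ** 2 / 2 for a, b in zip(series_1, series_2)
    ) / len(series_1)
    equipment_variation = math.sqrt(pooled_variance)
    return 6 * equipment_variation / tolerance * 100


def classify(grr_percent):
    """Apply the limit values for %GRR listed in the text."""
    if grr_percent <= 10:
        return "capable"
    if grr_percent <= 30:
        return "conditionally capable"
    return "not capable"


# Hypothetical example: 25 parts, both series differ by 0.1 µm, T = 10 µm.
first = [0.0] * 25
second = [0.1] * 25
grr = percent_grr(first, second, tolerance=10.0)
print(round(grr, 2), classify(grr))
```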
The predictions obtained are assessed in two ways. On the one hand, the performance metrics introduced in Section 3 are calculated from the prediction results, which are common for evaluating the prediction accuracy of machine learning methods. On the other hand, the procedures used in metrology to assess the capability of a measurement device are applied (cf. Section 4).
Section 3 described performance metrics that are often used to assess prediction accuracy. They are also suitable for assessing the prediction accuracy that is achieved using machine learning methods. In addition, the performance metrics are set in relation to the tolerance of each quality characteristic. This makes it possible to determine a kind of "safety factor" which indicates whether a characteristic is still within its tolerance limits or not. The observed distributions of the measured and the predicted values for the diameter and the roundness of the bores are shown in Figure 2 (a) and (b). The blue distribution curve and histogram depict the training data set, the validation data is shown in green, and the predictions in red. It can be seen that the predicted diameters do not exceed the value range of the validation data set. The training is based on the data represented by the blue distribution curve, but the predictions lie only within the diameter range of the green curve. This is a very positive result as it indicates that the trained machine learning method correctly determines the actual diameter values (value range). Because the diameter predictions converge slightly toward the arithmetic mean of the measured values, the density curve of the predictions peaks somewhat higher than that of the actual values. The diameter predictions for the individual bores are characterized by high accuracy, so that the maximum error (MAX) and the mean absolute error (MAE) are only 0.374 µm and 0.126 µm, respectively. Comparing these values with the tolerance of the diameter (10 µm), the MAE is 1/79
The predictions for the roundness of the bores are centered in the middle of the distribution of the validation data set and do not cover the full range (Figure 2 (b)). The predictions are concentrated at approx. 0.86 µm, regardless of the measured roundness of a bore. The MAE and MAX are 0.038 µm and 0.08 µm, respectively, again allowing a high degree of certainty when determining whether a bore is within the tolerance (MAE and MAX are 1/65
In addition to the assessment of the prediction results with common performance metrics, procedures are used that are established in the manufacturing industry for assessing the capability of measuring equipment and measuring processes. By using these standardized, widely accepted procedures well known to production engineers, it is possible to express the prediction accuracy with common and comparable indices. Hence, the results from machine learning and industrial metrology can be compared with one another.
The basic principle of procedure one (cf. Section 4.1) is to measure a standard with a measuring device 50 times to determine the capability indices
Graph: Figure 3 Workflow of a capability assessment according to procedure one for measured values and its adaption for predicted values [ [
Figure 3 shows the steps for carrying out procedure one for conventional and adapted use. For the common application of procedure one, it can be assumed that there is a ready-to-use measuring device with which a standard is measured 50 times, after which the capability indices are calculated. When procedure one is carried out to prove the capability of a machine learning method for quality prediction, a distinction can be made as to whether the method is trained anew for each run or not. In practice, an ML method is trained once and then used for prediction until a new training is required. If a trained method is used to provide predictions 50 times for the same data set, 50 identical prediction results are obtained (branch "A" in Figure 3). Because a trained method returns the same prediction result for the same input data on each run, the standard deviation of the predictions is zero. Measuring a workpiece several times leads to slightly different measurement results, but this is not the case for a trained machine learning method. As a result, the repeatability of the prediction result examined with procedure one is excellent because the same result is always achieved. To calculate the capability indices, the respective numerator is divided by a multiple of the standard deviation, which is mathematically impossible in this case because the standard deviation is zero. If the standard deviation were not exactly zero but close to zero, the quotient would become arbitrarily large and therefore always lie above the limit values of the capability indices, which would formally count as a positive proof of capability.
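The zero-spread problem of branch "A" can be reproduced with a small sketch. The "trained model" below is a stand-in with made-up coefficients and feature names, and the index uses the assumed form C_g = 0.2·T / (6·s); none of this is taken from the use case itself.

```python
import statistics


def predict(features):
    """Stand-in for a trained model: a fixed, deterministic function.

    The coefficients and feature names are hypothetical.
    """
    return 10.0 + 0.002 * features["torque"] - 0.001 * features["speed"]


def potential_capability(tolerance, s):
    """Assumed form of the potential capability index: 0.2*T / (6*s)."""
    return 0.2 * tolerance / (6 * s)


sample = {"torque": 12.5, "speed": 8.0}

# Predicting the same input 50 times yields 50 identical values.
predictions = [predict(sample) for _ in range(50)]
spread = statistics.stdev(predictions)

try:
    potential_capability(10.0, spread)
    index_defined = True
except ZeroDivisionError:
    # With zero spread the index cannot be calculated at all.
    index_defined = False

print(spread, index_defined)
```

The failure is not a sign of poor prediction quality; it only shows that repeating a deterministic prediction carries no information about uncertainty, which is why the procedure needs to be adapted.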
To be able to assess the capability of a machine learning method for quality prediction with procedure one, an adaptation of procedure one must therefore be made. Such an adaptation or modification is quite common but requires that it is documented and critically questioned [[
Graph: Figure 4 Workflow of a capability assessment according to procedure three for measured values and its adaption for predicted values [ [
If a measuring process is classified as capable according to procedure one, then procedure three must be used to check its capability for measuring workpieces from series production. For procedure three, at least 25 workpieces from series production are measured (predicted) twice and then the capability index is determined. The sequence of the two procedures is basically identical, as depicted in Figure 4. Procedure three must also be adapted because identical predictions are obtained for a data set if the same trained method is used for each run. The adaptation again consists in carrying out the training of the method for each run (branch "D" in Figure 4). The predictions obtained in this way have a certain spread, which can be understood as analogous to the uncertainty of a measuring device and enables the use of the formulas to determine the capability. When performing the two series of measurements, the workpieces must be measured in a different order for each series and must therefore be clamped and unclamped twice. The influences of the measuring equipment and the handling of the workpieces on the measurement result do not occur in ML-based quality prediction. These uncertainties can only be taken into account by retraining the method (similar to the approach described for procedure one), since they occur when measuring the quality characteristics of the workpieces in the training data set and are now expressed by the scatter of the predicted values. The predictions do not have to be filtered before the capability indices are calculated, because procedure three only requires a minimum of 25 predictions. Thus, all predictions of the two series are used to determine the capability.
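The effect of retraining for each run (branch "D") can be sketched with a toy model. The example below is not the method used in the article: it fits a straight line on a bootstrap resample of synthetic training data, so that the random seed of the training plays the role of the retraining. All data, names, and parameters are hypothetical.

```python
import random
import statistics


def make_training_data(n=40, seed=0):
    """Create synthetic (feature, diameter deviation) pairs with noise."""
    rng = random.Random(seed)
    xs = [i / 10 for i in range(n)]
    ys = [0.05 * x + rng.gauss(0.0, 0.02) for x in xs]
    return xs, ys


def train(xs, ys, seed):
    """Fit a least-squares line on a bootstrap resample of the data."""
    rng = random.Random(seed)
    idx = [rng.randrange(len(xs)) for _ in xs]
    bx = [xs[i] for i in idx]
    by = [ys[i] for i in idx]
    mean_x, mean_y = statistics.fmean(bx), statistics.fmean(by)
    sxx = sum((x - mean_x) ** 2 for x in bx)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(bx, by))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


xs, ys = make_training_data()
parts = [0.15 * i for i in range(25)]  # features of 25 hypothetical parts

model = train(xs, ys, seed=1)
run_1 = [model(x) for x in parts]
run_2 = [model(x) for x in parts]  # same trained model: identical series

retrained = train(xs, ys, seed=2)
run_3 = [retrained(x) for x in parts]  # retrained model: series with spread

max_delta = max(abs(a - b) for a, b in zip(run_1, run_3))
print(run_1 == run_2, max_delta > 0)
```

Only the pair of series from two separately trained models (`run_1` and `run_3`) shows a spread that can be fed into a %GRR calculation; the pair from a single trained model (`run_1` and `run_2`) has none.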
The software Q-DAS solara.MP from Hexagon is used for the assessment of the prediction series and the calculation of the capability indices. The assessment of the capability of a machine learning method for quality prediction is carried out for each quality characteristic. In this chapter the results and obtained values for the capability indices are discussed.
Graph: Figure 5 Results of procedure one and three for the prediction of the diameter [ [
The results of procedure one and three for the prediction of the diameter are shown in Figure 5. The
The right diagram of Figure 5 shows the results of procedure three. The green horizontal line in Figure 5 represents the mean of the two predictions for each sample, which is set to zero to make the prediction deviations of all samples easier to compare. The blue and pink curves stand for the first and second prediction, respectively. The difference between the blue and the pink curve of a sample is the total difference between the two prediction values (delta). Each of the two predictions for a sample therefore lies half of delta above or below the green line, depending on which of the two is higher. The %GRR value of 3.98 % is well below the critical limit value of 10 %, confirming the capability of the prediction method. The spread of the results of the ML method is therefore within an acceptable range. The diameters are predicted twice for each of the 50 bores. The deviations from the mean of the two predictions are shown for each bore. The maximum deviation is only 0.1 µm and therefore significantly smaller than the limit value, which is 5 % of the bore tolerance. The resolution achievable with the machine learning method is equal to the resolution of the measuring device (coordinate measuring machine) with which the quality data of the training data set was determined. The resolution %RE is 1 % of the tolerance and thus well below the maximum allowed 5 %. Hence, as all necessary capability indices (
Graph: Figure 6 Results of procedure one and three for the prediction of the roundness [ [
For the prediction of the roundness, the machine learning method RFR is used, for which the capability of quality prediction must also be assessed. The diagrams in Figure 6 show that the method RFR can be regarded as capable of predicting the roundness in this use case. The standard deviation
In addition, the resolution %RE of the predictions (0.04 %) is far below the maximum limit value of 5 %. Furthermore, the %GRR value of 2.34 % is well below the limit value of 10 %, which confirms the capability of the prediction method. Moreover, a low spread or uncertainty of the prediction values is evident from the right diagram in Figure 6. Only for the first 10 bores do the repeated predictions deviate somewhat more from one another. Nevertheless, the roundness prediction in this use case can be regarded as capable according to procedures one and three.
The accuracy and the capability of the diameter and roundness predictions were assessed with two different approaches. First, performance metrics and charts commonly used to evaluate the prediction accuracy of machine learning methods were applied. Second, procedures established in quality management to determine the capability of measurement processes were used. Both approaches had to be adapted slightly to be used to assess the quality predictions. The performance metrics were set in relation to the tolerances of the quality characteristics to better assess the prediction accuracy. The workflows of the procedures for capability assessment for measurement equipment had to be adapted to calculate the required capability indices from predictions instead of measurements as normally applied.
The performance metrics of the first approach are simple to calculate and show the prediction accuracy but do not consider the deviation and the distribution of the data. A probability density chart is necessary to evaluate the distribution of the predicted and the actual values. These performance metrics are suitable for checking the general performance of a prediction method but are not sufficient to assess the capability of a method in terms of quality management. The procedures and the corresponding indices of the second approach, which assess the capability of measurement processes, are well known and established in quality management but are more difficult to calculate. Important information such as deviation, repeatability, and systematic prediction errors is determined and tolerances are considered, but the shapes of the distributions of the measured values (validation data) and the predicted values are not tested. A prediction method can be capable based on the indices while the distributions of measured and predicted values differ considerably. This is seen, for example, in the prediction of the roundness. Hence, the procedures, particularly one and three, are suitable for determining the capability of a prediction method in terms of quality management aspects but have to be supplemented by a comparison of the distribution curves.
In conclusion, with the advent of quality prediction a further development of quality management begins in which the existing approaches and methods are not necessarily replaced but supplemented by machine learning methods. This is evident from the nature of quality management in which almost all decisions are based on extensive data acquisition and analysis.
By Sebastian Schorr; Dirk Bähre and Andreas Schütze
Sebastian Schorr studied Industrial Engineering at RWTH Aachen University and Tsinghua University and received his Master of Science degrees in 2017. From 2018 until 2021 he did his PhD at Saarland University in cooperation with Bosch Rexroth AG. Since 2021 he has been working at Bosch Rexroth AG as an engineer for machine learning and quality management.
Dirk Bähre studied mechanical engineering and completed his PhD in the field of cutting technologies at the Technical University of Kaiserslautern in 1994. After holding management positions in research at the TU Kaiserslautern and in process development at a large automotive supplier, he has held the Chair of Production Engineering at Saarland University since 2008. Since 2021, he has been a scientific director at the Centre for Mechatronics and Automation Technology ZeMA in Saarbrücken. He researches and teaches in the field of manufacturing techniques for industrial applications. His research focus is on precise machining technologies, the analysis of machining effects on material properties, and resource efficiency as well as sustainability in production.
Andreas Schütze received his diploma in physics from RWTH Aachen in 1990 and his doctorate in Applied Physics from Justus-Liebig-Universität in Gießen in 1994 with a thesis on microsensors and sensor systems for the detection of reducing and oxidizing gases. From 1994 until 1998 he worked for VDI/VDE-IT, Teltow, Germany, mainly in the field of microsystems technology. From 1998 until 2000 he was professor for Sensors and Microsystem Technology at the University of Applied Sciences in Krefeld, Germany. Since April 2000 he has been professor for Measurement Technology in the Department Systems Engineering at Saarland University, Saarbrücken, Germany, and head of the Laboratory for Measurement Technology (LMT). His research interests include smart gas sensor systems as well as data engineering methods for industrial applications.