Optimisation of Multiple Response Processes Using Different Modelling Techniques

Purpose: This article aims to compare the impact on process optimisation with multiple responses of two different mathematical modelling methods: Ordinary Least Squares Method (OLS) and Symbolic Regression Method (SR). Methodology/Approach: Data from the literature were selected from the design of experiments for a process with multiple responses. Using these data, models were obtained that represented each response as a function of independent variables using the OLS and SR techniques. Then, the Desirability method was applied together with the Generalized Reduced Gradient (GRG) in order to obtain the process adjustment that would lead to the optimisation of the responses. Findings: The findings illustrate that the SR modelling technique yields models with superior predictive capabilities when contrasted with the OLS technique. Throughout the optimisation process, it becomes evident that the adjustments in the process diverge, even though the desirability function's value exhibits negligible variation.


INTRODUCTION
The evolution of society paralleled the rapid strides of the Industrial Revolution. Within this transformative context, two key concepts emerged, shedding light on the shifts within the sector: technology, which encompasses theories pertaining to the means of production, and technique, representing the practical application of these technologies (Soori et al., 2023). It has become evident that there is a growing demand for premium products as consumers increasingly seek out high-quality offerings (Niu et al., 2020). To ensure the production of these goods remains economically feasible for companies, an ongoing commitment to enhancing technology and the application of techniques in these processes is imperative.
Mathematical modelling plays a crucial role in the field of process optimisation. In fact, the effectiveness of process optimisation is intrinsically linked to the quality of the models obtained. In other words, an optimisation process can only achieve satisfactory results if the mathematical model representing it is of high quality (Ota, 2015).
Mathematical modelling finds utility in a wide spectrum of fields, encompassing biology, medicine, chemistry, and the social sciences. The significance of mathematical models becomes evident in diverse applications, such as water resource management (Keeler et al., 2012), the planning of glass packaging production (Toledo et al., 2016), the production of organic compounds through biological processes (Ascencio et al., 2021), and forecasting the evolution of the Covid-19 pandemic (Gleeson et al., 2021). This resource not only offers a means to portray reality but also facilitates the design, analysis, and, most crucially, manipulation of scenarios.
It is challenging to provide an optimisation model that accurately represents a process, especially when considering multiple responses. This is because processes rarely have a single quality characteristic, and these characteristics often have conflicting process adjustments (Kuriger and Grant, 2011).
Various techniques can be proposed for the development of mathematical models, and Symbolic Regression stands out as a procedure with significant potential for utilisation (Gomes et al., 2019).
Symbolic regression involves the manipulation of mathematical expressions to determine which functions accurately represent a given dataset. It incorporates evolutionary computation and is considered the most suitable approach for model discovery (Liu et al., 2019). Genetic Programming, introduced by Koza (1992), plays a pivotal role in Symbolic Regression. It belongs to the family of Genetic Algorithms and seeks the optimal solution for a problem by optimising the value of a fitness function through the manipulation of an initial population of solutions via genetic operators.
As a form of artificial intelligence, Genetic Programming possesses the capability to identify the best mathematical model based on predefined conditions. This technology is increasingly relevant for industries striving to embrace Industry 4.0 (Frank et al., 2019).
The goal for a response can be of three types: STB (smaller the better), where the minimum is best; LTB (larger the better), where the maximum is best; and NTB (nominal the best), where the aim is to get as close as possible to a predefined target value. In a manufacturing environment, most products have more than one quality characteristic to be optimised, and these are often correlated, so a process design that favours reaching the objective of one characteristic can be completely unfavourable to another, giving rise to the well-known Multiple Response Optimisation Problems (Han et al., 2019).
The objective of this study is to analyse the performance of Symbolic Regression as an alternative for the mathematical modelling of problems with multiple responses, optimising the models obtained by the Desirability Method using the Generalized Reduced Gradient (GRG) to search for the optimal point.

Symbolic Regression (SR)
Symbolic Regression (SR) involves the manipulation of mathematical expressions to derive mathematical models that effectively capture the characteristics of a given dataset (Searson, 2015).
SR is an Artificial Intelligence (AI) technique situated within the broader domain of Evolutionary Computing, serving as an extension of the Genetic Algorithm (GA) initially conceptualised by Koza (1992). The foundation of SR draws inspiration from the principles of genetics and evolution at the population level. It employs evolutionary mechanisms, including crossover and mutation, within an initial population to optimise the alignment between the population of candidate programs and a specified objective function, whether oriented towards maximisation or minimisation (Ojha et al., 2017).
The conventional approach to regression typically assumes that a mathematical model describing a phenomenon adheres to a standardised format, often in the form of a first or second-degree polynomial. In this method, the central objective is to ascertain the optimal coefficients for this model, enabling it to closely align with the observed data. To apply this approach, a systematic exploration of various functions is necessary until a fitting solution is identified. As a result, the outcome of such an analysis heavily depends on the expertise of those responsible for selecting the functions to be examined. Even among experienced practitioners, it is common practice to restrict testing to linear and quadratic functions, overlooking the potential for more complex models to deliver superior results (Kommenda et al., 2019).
SR distinguishes itself from genetic algorithms in its approach to modelling: it derives the syntax (structure) of the model, whereas other algorithms optimise only the parameters (Sandoval et al., 2022). This characteristic is a result of the representation employed in SR. It utilises a tree-based representation composed of terminals and functions, which can vary according to the problem domain. These functions can take various forms, such as standard arithmetic operations, standard programming operations, standard mathematical functions, logical functions, and more (Koza, 1992). Figure 1 illustrates this representation.
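The tree representation described above can be illustrated with a minimal sketch. The node encoding (nested tuples), the function set, and the example model are all assumptions chosen for illustration; they are not taken from the software used in this study.

```python
import math
import operator

# Illustrative function set (internal nodes) with the arity of each function.
FUNCTIONS = {
    "add": (operator.add, 2),
    "sub": (operator.sub, 2),
    "mul": (operator.mul, 2),
    "sin": (math.sin, 1),
}


def evaluate(node, variables):
    """Recursively evaluate an expression tree.

    A node is either a number (constant terminal), a string naming an
    independent variable (variable terminal), or a tuple of the form
    (function_name, child_1, ..., child_n).
    """
    if isinstance(node, (int, float)):
        return float(node)
    if isinstance(node, str):
        return float(variables[node])
    name, *children = node
    func, arity = FUNCTIONS[name]
    if len(children) != arity:
        raise ValueError(f"{name} expects {arity} argument(s)")
    return func(*(evaluate(child, variables) for child in children))


# Hypothetical candidate model: y = 2.5 * x1 + (x2 - 1)
tree = ("add", ("mul", 2.5, "x1"), ("sub", "x2", 1))
value = evaluate(tree, {"x1": 2.0, "x2": 4.0})
```

In such a representation, crossover would swap subtrees between two candidate trees and mutation would replace a subtree or terminal, which is how the structure of the model, and not only its coefficients, can change during the search.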

Age-Layered Population Structure
Genetic programming based on the Age-Layered Population Structure (ALPS) introduces an innovative metaheuristic strategy designed to counter premature convergence. It accomplishes this by running multiple instances of a search algorithm in parallel. Initially, ALPS was integrated with a generational Evolutionary Algorithm (EA), resulting in ALPS-EA, which exhibited significantly superior performance when compared to a standard EA (Hornby, 2009).
ALPS employs a hierarchical population structure that aims to boost genetic diversity and optimise the exploration of novel mathematical models. This strategy relies on the concept of genotypic age, a measure that reflects how long an individual's genetic makeup has been evolving within the population. The fundamental principle of ALPS involves segregating the population into multiple age layers, preventing younger and less proficient individuals from being swiftly outcompeted by older, more adept counterparts. Following this principle, competition primarily occurs among individuals with similar genotypic ages, all while maintaining an overarching selection pressure to enhance overall fitness (Hornby, 2006).
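The age-layering idea can be sketched as follows. This is only an illustration under stated assumptions: the exponential age-limit scheme, the age gap of 10, the number of layers, and the rule that offspring take the age of their oldest parent plus one are choices made here for the example and are not specified in this text.

```python
def age_limits(age_gap, n_layers):
    """Maximum genotypic age admitted by each layer.

    Assumes an exponential scheme (age_gap * 1, 2, 4, ...). The last layer
    is unbounded so that the oldest individuals always have a layer.
    """
    limits = [age_gap * 2 ** i for i in range(n_layers - 1)]
    limits.append(float("inf"))
    return limits


def layer_for(age, limits):
    """Index of the youngest layer whose age limit still admits `age`."""
    for index, limit in enumerate(limits):
        if age <= limit:
            return index
    raise ValueError("limits must end with an unbounded layer")


def offspring_age(parent_ages):
    """Assumed rule: offspring take the age of the oldest parent plus one."""
    return max(parent_ages) + 1


limits = age_limits(age_gap=10, n_layers=4)
young_layer = layer_for(5, limits)   # a recently created individual
old_layer = layer_for(25, limits)    # an individual that has evolved longer
```

Because selection takes place mainly within a layer, a newly created individual in the youngest layer does not have to compete directly with individuals that have been refined over many generations.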
In a study conducted by Patnaik et al. (2018), the performance of ALPS was compared with nonlinear multiple regression for modelling vehicle traffic in India.
The results obtained indicated that the models produced by ALPS exhibited superior predictive power, particularly in heterogeneous traffic conditions.

Evaluating the Performance of Mathematical Models
The coefficient of determination (R²) is often used to determine the adequacy of a model (Montgomery, 2017). Mohammadzadeh et al. (2016) used R² to evaluate the performance of equations that predict the compression index of soils modelled via Multi-gene Genetic Programming.
R² is a measure of the amount of variability in the data explained by the model. The underlying correlation coefficient R can take on values from -1 to 1, where 1 represents an entirely positive correlation and -1 an entirely negative correlation, so R² lies between 0 and 1 and the best model is the one with the highest R² (Zhu et al., 2019). However, this statistic can be considered inefficient for validating a model because adding estimated parameters will increase R² and consequently increase the variability embedded in the model (Fukuda et al., 2018). Yang (2005) proposes using the adjusted R² coefficient (R²adj) as a more efficient way to measure the performance of mathematical models, since this coefficient decreases in value when non-significant parameters are added to the model. The quality of a model can also be assessed by the AIC statistic, which considers that there is a model that best represents the relationship between dependent and independent variables (Tesfamichael and Ndlovu, 2018). The p-value analysis shows how statistically significant the estimated parameters are. However, to avoid a Type I error, i.e., wrongly rejecting the null hypothesis in the p-value analysis, Akaike's Information Criterion becomes an alternative for measuring the quality of models when the study has more than one candidate model (Halsey, 2019).
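As a brief illustration of the two indicators, the sketch below computes R² from observed and predicted values and then applies the usual adjustment for the number of predictors. The data values are made up for the example and do not come from the study.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(observed) / len(observed)
    ss_res = sum((y - f) ** 2 for y, f in zip(observed, predicted))
    ss_tot = sum((y - mean) ** 2 for y in observed)
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(r2, n_points, n_predictors):
    """Adjusted R²: penalises R² for the number of predictors in the model."""
    return 1.0 - (1.0 - r2) * (n_points - 1) / (n_points - n_predictors - 1)


# Hypothetical data: four observations and a one-predictor model.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]

r2 = r_squared(observed, predicted)
r2_adj = adjusted_r_squared(r2, n_points=4, n_predictors=1)
```

The adjusted value is never larger than R², and the gap widens as predictors are added without a matching reduction in the residual sum of squares.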

Akaike's Information Criterion (AIC)
The Akaike Information Criterion (AIC) is a statistical tool widely used in the analysis of statistical models. It was developed by the Japanese statistician Hirotugu Akaike in the 1970s as a measure to select the best statistical model among a set of candidate models (Akpa and Unuabonah, 2011).
The AIC is based on the principle of parsimony, which seeks to find a balance between the quality of the model's fit to the data and the number of model parameters.
In other words, the AIC penalises more complex models, which have a greater number of parameters, in favour of simpler models, which explain the data with the fewest possible parameters (Ingdal et al., 2019).
The AIC calculation takes into account the model's likelihood function and the number of estimated parameters. The smaller the AIC value, the better the fit of the model to the data. Therefore, when comparing different models, the model with the lowest AIC value is generally considered to be the best in terms of fit and generalizability (Ward, 2008). The AIC is given by Equation 2, where N is the number of points used in obtaining the model (sample size) and K is the number of estimated parameters. When more parameters are added to a model, the first term becomes smaller, while the second term becomes larger. When N is small compared to K for the largest model size in the candidate set (as a general rule, N/K < 40), it is recommended to use the Akaike Information Criterion Corrected for small samples (AICC) (Burnham et al., 2010). The AICC is given by Equation 3.
The use of AICC instead of AIC is preferred since it is more accurate for small samples and shows very similar results for large samples (Al-Rubaie et al., 2007).
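Equations 2 and 3 are not reproduced in this text. For reference, a commonly used least-squares form that is consistent with the description above (a fit term that shrinks and a penalty term that grows as parameters are added) is sketched below. It is an assumption that this is the form used in the study; RSS denotes the residual sum of squares, a symbol introduced here.

```latex
\begin{align}
\mathrm{AIC}  &= N \ln\!\left(\frac{\mathrm{RSS}}{N}\right) + 2K \\
\mathrm{AICC} &= \mathrm{AIC} + \frac{2K\,(K+1)}{N - K - 1}
\end{align}
```

The correction term vanishes as N grows relative to K, which is why AICC and AIC give very similar results for large samples.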
Determining the AICC differences (∆) allows a quick comparison and ranking of candidate models. For the i-th model, ∆i is given by Equation 4, where min AICC is the smallest AICC value among all models evaluated. The ∆ of the best model is therefore equal to zero, while the remaining models have positive values; the higher the value of ∆ for a model, the worse the quality of its fit.
As a general rule, models with ∆ ≤ 2 have substantial support, those with 2 < ∆ < 7 have considerably less support, and models with ∆ ≥ 7 have no support (Burnham et al., 2010). Gopalan et al. (2018) propose another way to interpret the AICC: normalising the relative likelihood values to obtain the Akaike weights (AW). The weights of all the models added together are equal to one; thus, the model with the highest AW is considered the most effective, because AW expresses the weight of evidence that model i is the best approximation of reality among the candidates. The calculation of AW can be performed through Equation 5.
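The ranking procedure can be sketched in a few lines. The AICC function below assumes the least-squares form N ln(RSS/N) + 2K plus the small-sample correction, and the weights use the standard normalisation exp(-∆i/2) / Σ exp(-∆r/2); both are assumptions about the exact form of the equations referenced in the text, and the AICC values in the example are invented.

```python
import math


def aicc(rss, n_points, n_params):
    """Small-sample corrected AIC, assuming the least-squares form of AIC."""
    aic = n_points * math.log(rss / n_points) + 2 * n_params
    return aic + 2 * n_params * (n_params + 1) / (n_points - n_params - 1)


def akaike_weights(aicc_values):
    """Return (deltas, weights) for a list of candidate-model AICC values."""
    best = min(aicc_values)
    deltas = [value - best for value in aicc_values]
    relative_likelihoods = [math.exp(-0.5 * delta) for delta in deltas]
    total = sum(relative_likelihoods)
    weights = [likelihood / total for likelihood in relative_likelihoods]
    return deltas, weights


# Hypothetical AICC values for three candidate models.
deltas, weights = akaike_weights([100.0, 102.0, 110.0])
```

With these invented values the first model has ∆ = 0 and receives the largest weight, while the third, with ∆ = 10, receives a weight close to zero.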

Optimisation
Optimisation does not necessarily imply the determination of optimal operating conditions, since it is practically impossible to establish the optimum point due to the unlimited number of variables that impact a process. Instead, what can be determined are conditions for improvement by selecting maximum points determined within a predetermined search space (Dehuri and Cho, 2009).

The Desirability Method
The Desirability optimisation method stands out as a robust and adaptable statistical approach widely employed across diverse fields, including science, engineering, medicine, and industry. Its primary application lies in the simultaneous optimisation of multiple responses, making it especially valuable in navigating complex problems that entail balancing often conflicting goals (Bezerra et al., 2019).
At its core, the Desirability method revolves around the integration of various responses or criteria of interest into a unified desirability function. This function is designed to accommodate goals of maximisation, minimisation, or achieving target ranges for each response. By doing so, it allows for the consolidation of diverse objectives into a singular, comprehensive measure, offering insight into the overall "desirability" of a given configuration of parameters or experimental conditions (Derringer and Suich, 1980).
The process of implementing the Desirability optimisation method can be divided into a few key steps:
1. Definition of criteria: Identify and define the responses you want to optimise. These could be process variables, product characteristics, or any other metrics relevant to the problem at hand.
2. Normalisation: To perform the aggregation of responses, it is necessary to normalise them. Normalisation is performed so that all responses can be compared on the same scale, preventing any of them from dominating the optimisation process just by their numerical scale.
3. Desirability function: Every response is linked to a desirability function, outlining the specific contribution of the variable of interest to the overall outcome. The form of this function, whether linear, quadratic, cubic, or another variant, varies based on the unique characteristics and significance of each response in addressing the problem at hand.
4. Global desirability calculation: Once all desirability functions are defined, it is possible to calculate a global desirability index for each combination of parameters or experimental conditions. This index represents how close this configuration is to achieving all desired goals.
5. Optimisation: With the desirability values calculated, the next step is to search for combinations of parameters or experimental conditions that maximise this global index. There are several optimisation techniques available, such as numerical methods or heuristic search algorithms (Gomes et al., 2017).
The Desirability optimisation method has several advantages, one of the most notable being its ability to deal with multi-objective optimisation problems, where several conflicting goals must be considered. Furthermore, it is a valuable tool for making informed decisions in research and development projects, as well as for quality control and process improvement in industrial environments (Gomes et al., 2019).
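The normalisation and aggregation steps can be illustrated with a minimal sketch. It assumes the piecewise power form commonly associated with Derringer and Suich (1980) for the individual desirabilities and a geometric mean for the global index; the bounds, targets, exponents, and response values are hypothetical, and the study itself may use a modified form.

```python
def desirability_ltb(y, low, target, weight=1.0):
    """Larger the better: 0 at or below `low`, 1 at or above `target`."""
    if y <= low:
        return 0.0
    if y >= target:
        return 1.0
    return ((y - low) / (target - low)) ** weight


def desirability_stb(y, target, high, weight=1.0):
    """Smaller the better: 1 at or below `target`, 0 at or above `high`."""
    if y <= target:
        return 1.0
    if y >= high:
        return 0.0
    return ((high - y) / (high - target)) ** weight


def desirability_ntb(y, low, target, high, weight_low=1.0, weight_high=1.0):
    """Nominal the best: 1 at `target`, falling to 0 at `low` and `high`."""
    if y <= low or y >= high:
        return 0.0
    if y <= target:
        return ((y - low) / (target - low)) ** weight_low
    return ((high - y) / (high - target)) ** weight_high


def global_desirability(values):
    """Geometric mean of the individual desirabilities."""
    product = 1.0
    for value in values:
        product *= value
    return product ** (1.0 / len(values))


# Hypothetical case: a mean response with a nominal target and a
# variability response to be minimised.
d_mean = desirability_ntb(495.0, low=490.0, target=500.0, high=510.0)
d_var = desirability_stb(12.5, target=0.0, high=50.0)
D = global_desirability([d_mean, d_var])
```

Because the global index is a geometric mean, a single response with zero desirability drives the whole index to zero, which prevents one response from being sacrificed entirely in favour of the others.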

METHOD
According to Bertrand and Fransoo (2002), this work can be classified according to the flow chart shown in Figure 3.

Figure 3 - Modified Desirability Function
The research was carried out in the following steps:
1. The experimental data presented in the work of Shin and Cho (2005) were selected;
2. From these experimental data, models describing the previously selected responses were generated using the Ordinary Least Squares (OLS) technique with the aid of Minitab® v. 20 software. The number of parameters adopted for each model was set as a function of the value of R²adj; that is, the least significant parameters were removed for as long as this indicator increased, until it was maximised;
3. From the same experimental data, models describing the previously selected responses were generated using the Symbolic Regression technique with the aid of HeuristicLab® v. 3.3.15 software and the ALPS algorithm. The computational effort was limited to 30 minutes of software execution for each model;
4. The models obtained in steps 2 and 3 were analysed by comparing their AIC values and by parametric sensitivity analysis. From these data, the models were ranked according to their predictive quality;
5. The independent variables present in the models obtained by OLS and SR were optimised using the Desirability function as the agglutination method and the Generalized Reduced Gradient (GRG) as the mathematical search method;
6. From the results obtained in the previous steps, the conclusions presented at the end of this article were drawn.
The GRG implementation was developed using the Microsoft Excel ® software.

RESULTS AND DISCUSSION
The selected data refer to a study on the effect of two independent variables on silicon wafer coating thickness (Y), measured in micrometres. The two variables studied were the moulding stage temperature, X1, measured in degrees Fahrenheit, and the injection flow rate, X2, measured in pounds per second. The experimental matrix for the case studied by Shin and Cho (2005) is presented in Table 1. The models for the mean and variance of the experiment obtained by OLS are given in Equations 6 and 7.
The models proposed using SR are shown in Equations 8 and 9. To test the homoscedasticity of the models presented in Equations 8 and 9, a normality test of the residuals of these models was performed. The results are shown in Figure 4. As depicted in Figure 4, the p-value of the test exceeded 0.05 in both instances, indicating a normal distribution of residuals at a 95% confidence level. This finding serves as evidence of the homoscedasticity of the models, signifying their unbiased behaviour.
The results of the tests to evaluate the prediction quality of the models are presented in Table 2. As shown in Table 2, the analysis of the R² coefficient shows that the models obtained by SR perform better than the models obtained by OLS. This same behaviour can be found in the analysis of R²adj, demonstrating that the SR models have a better correlation without the addition of parameters.
Analysing the AICC values, it is observed that SR is the method that best minimises the Kullback-Leibler (K-L) distance. The values of ∆ show that the models obtained via OLS have no support for the data set, and the values obtained for the AW indicator show that the models obtained by SR have greater weight, thus making a better representation of reality. From these analyses, it can be stated that genetic programming, in this case, obtains models of higher predictive quality than the models obtained by OLS.

Optimisation
From the mathematical models generated, optimisation processes were performed for the problem with multiple responses. The method consisted of using the GRG as the mathematical search method and the Desirability function as the agglutinating function. The total desirability function can be seen in Equation 18. The results obtained in the optimisation process are summarised in Table 3. As can be seen in Table 3, the Global Desirability values are very close; therefore, there is no practical difference between the two methods in terms of the desirability achieved. However, the models obtained by OLS lead to a different process adjustment from that given by the models obtained by SR; this occurs due to the change in the behaviour of the Global Desirability function, as illustrated in Figures 5 and 6. Analysing Figures 5 and 6, it can be noted that there is a change in the global optimum region (identified by the red colour in the figures). Moreover, since the statistical tests indicated that the models obtained by the SR technique have higher predictive quality, the process adjustments proposed by these models are believed to be more reliable and, therefore, better.

CONCLUSION
Symbolic regression stands out as a highly promising and versatile tool, adept at extracting polynomial models that astutely encapsulate the intricate dynamics inherent in an experimental matrix grounded in the Design of Experiments (DOE) principles. Its ability to discern complex patterns in data makes it an invaluable asset in the realm of modelling. It is important to note that the use of symbolic regression entails a considerable increase in the time and computational effort required to obtain mathematical models. However, it is worth emphasising that if the goal is optimising an industrial process with few variations over an extended period, this impact can be considered negligible. In this context, the benefits of symbolic regression in terms of accuracy and system understanding may well outweigh the temporal demands, making it a valuable choice in scenarios where process stability is paramount.

Figure 1 - Basic GP algorithm using a tree representation for individuals

Figure 2 - Steps of a Genetic Programming Algorithm

Figure 4 - Test of normality of residuals for the models represented in Equations 8 and 9

Figure 5 - Behaviour of the Desirability function using the models obtained by Ordinary Least Squares

Table 2 - Quality indicators of the mathematical models
The targets for the optimisation of each response are presented in