At the end of Part 1, parametric testing had been completed and the data stored in the historian. The next step is to obtain a record dataset for training and validating the neural network model.
Preparing the dataset for training
Parsing is the primary step in preparing the dataset for training. After the record dataset has been constructed, it should be randomly parsed into three separate record subsets: training, test, and validation (see Figure 1).
Both the training and test datasets are used in training the model. The neural network’s training algorithm uses the training subset to converge the model to the target function solution. The test subset is used to prevent over-training that could affect the robustness of the model. Finally, after the neural network model is trained, it must go through a strong validation process to give the user confidence it will perform well when in service. The validation subset proves the model can approximate the target value using data independent of the training process.
The training subset is composed of 60% to 80% of all the records. The remaining records are usually split between the test and validation subsets. Each record should be randomly chosen from the dataset and placed in one of the three subsets. (Tests have shown improved results when each subset has an equal percentage of records above and below the median of the target reference.) After a record is selected, it is removed from the original dataset so it can’t be selected again.
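The random parsing described above can be sketched in a few lines of Python. This is a minimal illustration, not the author's tool: the function name `split_records`, the 70% default, and the even split of the remainder are assumptions. Records above and below the target median are shuffled and divided separately so each subset receives a similar share of both groups.

```python
import random
import statistics

def split_records(records, target_of, train_frac=0.7, seed=None):
    """Randomly parse records into training, test, and validation subsets.

    target_of is a caller-supplied function returning a record's target value.
    """
    rng = random.Random(seed)
    median = statistics.median(target_of(r) for r in records)
    low = [r for r in records if target_of(r) <= median]
    high = [r for r in records if target_of(r) > median]

    subsets = {"training": [], "test": [], "validation": []}
    for group in (low, high):
        group = group[:]      # copy so the caller's list is untouched
        rng.shuffle(group)    # shuffle, then slice: no record is picked twice
        n_train = round(train_frac * len(group))
        n_test = (len(group) - n_train) // 2
        subsets["training"] += group[:n_train]
        subsets["test"] += group[n_train:n_train + n_test]
        subsets["validation"] += group[n_train + n_test:]
    return subsets
```

Shuffling and slicing has the same effect as removing each record from the original dataset once it has been selected.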
Training, validation, and reporting the results
Now that the subsets are created, four steps must be carried out during each training iteration, followed by three steps post-training. Each is described below.
Initialization: Before training can occur, the neurons must be initialized by randomly seeding their weight and bias values. Initially, random seeding is deemed best because the target function is unknown.
Normalization: All variables, inputs, and target references are normalized prior to being used in the model for training. Normalization scales each input to the limits of the hidden layer transfer function. The model output is converted back to engineering units by "unnormalizing" the output.
If the perceptron uses a sigmoid function, the normalization is between 0 and 1. If using a hyperbolic tangent sigmoid transfer function (tansig), normalization is between -1 and +1. The engineering unit range of each process variable is scaled to the normalizing limits.
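As a minimal sketch of this scaling, the two hypothetical helper functions below map an engineering-unit value onto the normalizing limits and back. The function names and the example ranges are assumptions for illustration; the defaults correspond to the tansig limits of -1 and +1.

```python
def normalize(value, eng_min, eng_max, lo=-1.0, hi=1.0):
    """Scale an engineering-unit value to the range [lo, hi]."""
    return lo + (value - eng_min) * (hi - lo) / (eng_max - eng_min)

def unnormalize(norm, eng_min, eng_max, lo=-1.0, hi=1.0):
    """Convert a normalized value back to engineering units."""
    return eng_min + (norm - lo) * (eng_max - eng_min) / (hi - lo)
```

For a sigmoid transfer function, the same helpers would be called with `lo=0.0` and `hi=1.0`.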
Training: Recall from Part 1 that the perceptron uses weights and bias values to compute its response. The final values are generated through an iterative back-propagation process. During training, the weight and bias values are adjusted to minimize the sum of square errors (SSE) between the target function and the model (see Figure 2).
Training algorithms that use optimization routines find the best path toward the SSE global minimum. However, occasionally, they find a local minimum and get stuck. This is due to the random seeding of the weights and bias values, the underlying function, and the path taken by the optimization algorithms. If a local minimum is encountered, the model will show poor correlation with the target and the training session must be repeated.
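The effect of random seeding on where training ends up can be illustrated with a toy example. The one-dimensional error curve below is an assumption chosen only because it has one local and one global minimum; it is a stand-in for a real SSE surface, and plain gradient descent stands in for the training algorithm.

```python
import random

def toy_error(w):
    """Toy stand-in for an SSE surface with two minima (not a real network)."""
    return w**4 - 3 * w**2 + w

def toy_gradient(w):
    """Derivative of toy_error with respect to w."""
    return 4 * w**3 - 6 * w + 1

def descend(w, rate=0.01, steps=500):
    """Plain gradient descent from the starting value w."""
    for _ in range(steps):
        w -= rate * toy_gradient(w)
    return w

def train_with_restarts(n_restarts=10, seed=0):
    """Repeat the descent from random seeds and keep the lowest-error result."""
    rng = random.Random(seed)
    finals = [descend(rng.uniform(-2.0, 2.0)) for _ in range(n_restarts)]
    return min(finals, key=toy_error)
```

Starting at `w = 2.0` the descent settles in the local minimum, while starting at `w = -2.0` it reaches the global one, which is why repeating the session with fresh random seeds is the usual remedy.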
Early stopping: Early stopping is a training cycle exit strategy intended to allow an optimum reduction in SSE from the training data while keeping the model robust and preventing overtraining.
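One common way to implement early stopping is sketched below, under the assumption that the SSE of the test subset is recorded at every iteration. The `patience` rule (stop once the test SSE has not improved for a set number of iterations) is an illustrative choice, not necessarily the criterion used by any particular training program.

```python
def early_stopping_iteration(test_sse_history, patience=5):
    """Return the iteration whose weights should be kept.

    Training exits once the test-subset SSE has gone `patience`
    iterations without improving on its best value.
    """
    best_sse = float("inf")
    best_iter = 0
    for i, sse in enumerate(test_sse_history):
        if sse < best_sse:
            best_sse, best_iter = sse, i
        elif i - best_iter >= patience:
            break  # test SSE has stopped improving: exit the training cycle
    return best_iter
```

The weights saved at the returned iteration are the ones carried forward; later iterations may still lower the training SSE, but at the cost of robustness.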
Regularization: Regularization is a technique that penalizes weights that become large during training. Although it has merit, early stopping is sometimes preferred as a means of improving robustness.
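A minimal sketch of such a penalty follows, assuming the widely used sum-of-squared-weights form. The article does not specify the penalty, so both the form and the `penalty` factor are assumptions.

```python
def sse(targets, outputs):
    """Sum of square errors between target values and model outputs."""
    return sum((t - y) ** 2 for t, y in zip(targets, outputs))

def regularized_cost(targets, outputs, weights, penalty=0.01):
    """SSE plus a penalty that grows with the squared size of the weights."""
    return sse(targets, outputs) + penalty * sum(w * w for w in weights)
```

Minimizing this cost instead of the SSE alone steers training away from solutions that rely on very large weights.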
Validation: Validation is the most important and final part of confirming that the model will perform as expected. The validation subset is independent of the training cycle and is used to generate a validation plot (see Figure 3). Upon inspection, if the model (red) is in phase and within close approximation of the target (blue), the model can be deemed good. The validation curve is the last part of the training report.
Training report: Training reports should be expected for every model. A report should include the record dataset, the neural network architecture, the final training plot, and the reduction in SSE per iteration for the training subset and for the test subset used for regularization (see Figure 4).
Training analysis: The model and the target should be in phase, have a small error, and have a high correlation (see Figure 4a). Regression curves should show a good relationship between the model and target (see Figure 5). If the model is out of phase or has a large error, the model should be retrained, starting with randomly parsing the dataset for training. A continued poor result is a clue to increase the number of neurons in the hidden layer and/or reexamine the process variable selection as described in Part 1 (dataset selection).
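The two numerical checks in this analysis, a high correlation and a small error, can be computed as sketched below. Pearson correlation and root-mean-square error are assumed here as the measures; the article does not name specific statistics, and acceptance thresholds are left to the user.

```python
import math

def correlation(model, target):
    """Pearson correlation between model output and target reference."""
    n = len(model)
    mean_m = sum(model) / n
    mean_t = sum(target) / n
    cov = sum((m - mean_m) * (t - mean_t) for m, t in zip(model, target))
    var_m = sum((m - mean_m) ** 2 for m in model)
    var_t = sum((t - mean_t) ** 2 for t in target)
    return cov / math.sqrt(var_m * var_t)

def rms_error(model, target):
    """Root-mean-square error between model output and target reference."""
    return math.sqrt(sum((m - t) ** 2 for m, t in zip(model, target)) / len(model))
```

A model that moves opposite to the target returns a correlation near -1, which is one numerical sign of the out-of-phase condition that calls for retraining.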
Implementing a neural network model
Considerations for implementing a neural network in an industrial application include:
Algorithm: Most control platforms have the instruction set to program the neural network algorithm. The number of neurons in the network is established during training, and the equation for a single neuron in the hidden layer is shown in Equation 1, with "P" being the input vector. Each of the hidden layer neurons produces an output that is an element of the vector "A" in the output layer equation shown in Equation 2.
The calculated result of Equation 2 is the model of the target variable.
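$$F_{\text{sigmoid}}\left(P \times W_{1,n}^{T} + \text{bias}_{1,n}\right) = a_n \qquad \text{(Equation 1)}$$
$$F_{\text{linear}}\left(A \times W_{2}^{T} + \text{bias}_{2}\right) = \text{Model output} \qquad \text{(Equation 2)}$$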
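Equations 1 and 2 translate directly into a short routine. The Python sketch below is illustrative only; on a control platform the same arithmetic would be written in the platform's own instruction set. The function names are assumptions, and a logistic sigmoid is assumed as the hidden layer transfer function (a tansig function could be passed in instead).

```python
import math

def sigmoid(x):
    """Logistic sigmoid transfer function, output between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))

def neuron(inputs, weights, bias, transfer):
    """One neuron: transfer(inputs x weights^T + bias)."""
    return transfer(sum(p * w for p, w in zip(inputs, weights)) + bias)

def model_output(p, w1, b1, w2, b2, hidden_transfer=sigmoid):
    """Evaluate the network for a normalized input vector p.

    w1[n] and b1[n] are the weights and bias of hidden neuron n (Equation 1);
    w2 and b2 are the output layer weights and bias (Equation 2).
    """
    a = [neuron(p, w1[n], b1[n], hidden_transfer) for n in range(len(w1))]
    return neuron(a, w2, b2, lambda x: x)  # linear output layer
```

Writing the single-neuron calculation as its own function lets the same code serve every neuron in a model and carry over to later models.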
Input quality checks: Any model depends on reliable and accurate inputs. However, neural networks will happily accommodate inaccurate, skewed, or biased inputs as long as the errors are repeatable and the model was trained with the same errors present. This implies that all inputs should be calibrated before the training dataset is acquired to establish a known baseline. During operation, the control system should validate every input into a neural network model as good quality.
Control space testing: In an operational model, a test must be performed to prove the inputs are in valid control space. Recall from Part 1 (control space) that a model has knowledge only in the control space where it was trained. An efficient method of testing is to use a set of centroid point vectors with tolerances. The centroids and their tolerances are generated from the point vector distribution.
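A centroid test of this kind might look like the sketch below. It assumes each centroid carries one tolerance per input and that a point vector is valid when it falls within tolerance of at least one centroid on every input; how the centroids and tolerances are derived from the point vector distribution is outside the sketch.

```python
def in_control_space(point, centroids, tolerances):
    """Return True if the input point vector lies within tolerance of
    at least one centroid on every input."""
    for centroid, tol in zip(centroids, tolerances):
        if all(abs(x - c) <= t for x, c, t in zip(point, centroid, tol)):
            return True
    return False
```

With centroids at (50, 10) and (80, 20) and tolerances of (5, 2) and (4, 3), the point (52, 11) is inside the control space, while (60, 11) is not.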
Alternate control/measurement strategies: Recall from Part 1 that Figure 3a (in Part 1) shows the valid control space and Figure 3b (in Part 1) shows an invalid point vector outside the control space. If the input point vector is outside the valid control space, an action plan should be in place to mitigate untrained model results. In other words, there should be an alternate control or measurement strategy. Figure 6 shows a model-based controller strategy that, if outside its control space, alarms the operator and automatically switches to an alternate control algorithm. Switching could also be left to the discretion of the operator if performance is acceptable, because the errant input's influence on the model is effectively clamped by the hidden layer transfer function.
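The switching logic can be reduced to a small decision function. The sketch below is a hypothetical simplification of the kind of strategy Figure 6 describes: the function name, the flag names, and the returned strategy labels are assumptions for illustration.

```python
def select_strategy(inputs_in_control_space, auto_switch=True,
                    operator_selects_alternate=False):
    """Return (strategy, alarm) for the current scan.

    The alarm is raised whenever the inputs leave the valid control space.
    With auto_switch enabled the alternate algorithm takes over at once;
    otherwise the choice is left to the operator.
    """
    alarm = not inputs_in_control_space
    if alarm and auto_switch:
        strategy = "alternate"
    elif operator_selects_alternate:
        strategy = "alternate"
    else:
        strategy = "model"
    return strategy, alarm
```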
Operator interface
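It is recommended that the human-machine interface (HMI) show the model output, inputs, alarms, set points, auto/manual selection, and normal/alternate strategy controls.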
Figure 7a shows a virtual instrument interface. The alarm annunciates on the HMI faceplate, showing which input caused the issue. A transfer is available in case the operator chooses an alternate strategy. A manual mode is available to hold the last value. The multiple input/multiple output (MIMO) controller shown in Figure 7b is similar, but shows the trends of the control and manipulated variables. A separate setup screen is recommended for entering constraint limits and other alarm settings.
Using neural networks
In many process control applications, the approximate function neural network architecture is simple and works well modeling industrial processes. Neural networks must be trained from datasets best obtained by parametric testing. The user selects the target process variable to be modeled and assembles the training dataset to cover its range and function. The dataset contains a set of associated inputs that correlate to the target. The inputs used during training define the control space of the neural network model.
All dataset records are time-stamped with the target reference value capture time. Datasets are tied to the equipment configuration at the time of acquisition. Proper selection of the model inputs is important to increase the model robustness, eliminate noise, and reduce cost. Calibration should be completed on all inputs prior to data acquisition to establish a baseline.
After the dataset has been acquired, it must be parsed into three subsets: training, test, and validation. Training is an iterative process that back-propagates the error between the model output and the target value, and it requires strategies for promoting robustness and validation. The training process provides the weights and bias values for each neuron. Before implementation on the control platform, a validation report should be reviewed; it should show the target reference in phase with the model with small error.
Many neural network training programs operate on desktop computers. However, virtually all control platforms have the necessary instruction set to configure a neural network model. Where possible, the neurons should be coded on the control platform as subroutines so the code can be reused on future models. In operational models, inputs should be checked for good quality and valid control space. An alternate strategy should be considered if inputs are outside of the valid control space.
The HMI should show the model output, inputs, alarms, set points, auto/manual selection, and normal/alternate strategy controls. Trends and first-out logic are valuable for operational awareness and diagnostics. A separate setup screen is recommended to enter constraints, alarms, and limits.
Jimmy W. Key, PE, CAP is president and owner of Process2Control LLC in Birmingham, Ala. He is a professional control systems engineer with more than 30 years of experience in the pulp and paper, automotive, chemical, and power industries. Key has an MS in control system engineering from Oklahoma State University. He launched Process2Control in 2013 to integrate neural network modeling with other advanced control concepts and applications specifically for the process control industry. Edited by Jack Smith, content manager, CFE Media, Control Engineering, jsmith@cfemedia.com.
Consider this
When is the proper time to stop the iterative training process?