The emissions and efficiency of modern internal combustion engines must be improved to reduce their environmental impact. Many strategies for doing so (e.g., alternative fuels, exhaust gas aftertreatment, and novel injection systems) require modified engine calibrations, which in turn involve extensive experimental data collection. A new approach to modeling and data collection is proposed to expedite the development of these technologies and to reduce their upfront cost. This work evaluates a strategy based on Gaussian process regression, artificial neural networks, and Bayesian optimization for the efficient development of machine learning models intended for engine optimization and calibration. The objective of this method is to minimize the size of the required experimental data set and thereby reduce the data collection cost of engine modeling.
This technique is demonstrated by generating engine performance models for a dual-fuel High Pressure Direct Injection (HPDI) compressed natural gas (CNG) engine. Models are generated for the emissions and performance of this pilot-ignited, direct-injection natural gas engine using only typical control inputs (e.g., speed, injection timings, and fuel and air pressures). The technique is first demonstrated on a full-factorial data set collected over a narrow operating space, and the results are then compared with those from a much coarser data set collected over a much larger space using the Box-Behnken design.
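The difference in data collection effort between the two sampling approaches can be illustrated with a minimal sketch. It assumes coded factor levels of -1, 0, and 1, three levels per factor, and the pairwise Box-Behnken construction, which matches the standard design for three to five factors. The function names, the factor counts, and the single center point are hypothetical choices for illustration and are not taken from the experiments reported here.

```python
from itertools import combinations, product


def full_factorial(n_factors, levels=(-1, 0, 1)):
    """Every combination of the coded levels: len(levels) ** n_factors runs."""
    return [list(point) for point in product(levels, repeat=n_factors)]


def box_behnken(n_factors, n_center=1):
    """Pairwise Box-Behnken construction (assumed valid for 3 to 5 factors).

    For each pair of factors, run a 2x2 factorial at the +/-1 levels while
    holding all other factors at their center level (0), then append the
    requested number of center points.
    """
    runs = []
    for i, j in combinations(range(n_factors), 2):
        for a, b in product((-1, 1), repeat=2):
            point = [0] * n_factors
            point[i], point[j] = a, b
            runs.append(point)
    runs.extend([0] * n_factors for _ in range(n_center))
    return runs


if __name__ == "__main__":
    # Compare the number of required test points as factors are added.
    for k in (3, 4, 5):
        print(k, len(full_factorial(k)), len(box_behnken(k)))
```

Under these assumptions the full-factorial run count grows as 3^k, while the Box-Behnken count grows only with the number of factor pairs, which is why it can cover a larger operating space at a much coarser resolution.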
Ten sets of neural network and Gaussian process regression models were generated for each engine output. The aggregated results demonstrate that the machine learning models perform well on the full-factorial data set, with correlation coefficients generally over 0.8 and normalized root mean square errors generally under 10%, whereas the response surface model is unable to characterize the outputs due to the size of the data set. Although performance degrades with the coarser Box-Behnken data set, the machine learning methods still show strong results for certain outputs: models for NOₓ, CO₂, O₂, Peak Cylinder Pressure, EQR, and Gross Indicated Power have R² greater than 0.8 and normalized root mean square errors less than 20%. In general, Gaussian process regression yields better performance than the neural network models, with less variation across repeated tests. With further study, this method could enable the rapid evaluation and implementation of technologies and fuels for emission reduction.
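For reference, the two reported metrics can be computed as in the sketch below. The normalization used for the root mean square error is not specified in this abstract; the sketch assumes normalization by the range of the measured values, which is one common convention (normalizing by the mean is another). The function names and the sample values are hypothetical.

```python
import math


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def nrmse(y_true, y_pred):
    """RMSE divided by the range of the measured values (assumed convention)."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse) / (max(y_true) - min(y_true))


if __name__ == "__main__":
    # Hypothetical measured and predicted values, for illustration only.
    measured = [1.0, 2.0, 3.0, 4.0]
    predicted = [1.0, 2.0, 3.0, 5.0]
    print(r_squared(measured, predicted), nrmse(measured, predicted))
```

Multiplying the `nrmse` result by 100 gives a percentage comparable in form to the thresholds quoted above, subject to the normalization assumption.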