| Name | Description | Size | Format |
|---|---|---|---|
| | | 2.01 MB | Adobe PDF |
Abstract(s)
In chemometrics, multiblock data analysis is widely used to explore or fuse data
from multiple sources. Commonly used methods for multiblock predictive analysis are extensions of
latent space modelling approaches. Recently, however, deep learning (DL) approaches such as convolutional
neural networks (CNNs) have outperformed traditional single-block latent space modelling
approaches such as partial least squares (PLS) regression. CNN-based DL modelling can
also handle multiblock data simultaneously, but this had not been explored before this
study. Hence, this study presents, for the first time, the concept of parallel-input CNN-based DL modelling
for multiblock predictive chemometric analysis. The parallel-input CNN uses
separate convolutional layers for each data block to extract key features, which are then combined and
passed to a regression module composed of fully connected layers. The method was tested on a large, real
visible and near-infrared (Vis-NIR) data set for dry matter prediction in mango fruit. To obtain
multiblock data, the visible (Vis) and near-infrared (NIR) parts were treated as two separate blocks.
The performance of the parallel-input CNN was compared with traditional single-block CNN-based
DL modelling, as well as with a commonly used multiblock chemometric approach, sequentially
orthogonalized partial least squares (SO-PLS) regression. The results showed that the proposed
parallel-input CNN-based deep multiblock analysis outperformed both single-block CNN-based DL modelling and
SO-PLS regression. The root mean squared error of prediction (RMSEP) obtained with deep multiblock
analysis was 0.818%, which is 4% and 20% lower in relative terms than single-block CNNs and SO-PLS regression,
respectively. Furthermore, the deep multiblock approach attained an RMSE ~3% lower than the best
known result on the mango data set used for this study. The deep multiblock analysis approach based on
parallel-input CNNs could be a useful tool for fusing data from multiple sources.
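
The parallel-input idea described in the abstract can be illustrated with a minimal forward-pass sketch in plain Python. It is not the study's implementation: the abstract does not give the architecture details, so the number of filters, kernel size, stride, hidden-layer width, the Vis/NIR split index, and all function names below are hypothetical choices for illustration. The weights are random and untrained. The sketch only shows how each block passes through its own convolutional branch before the features are concatenated and sent to fully connected layers.

```python
import random


def conv1d(signal, kernel, stride=1):
    """Valid 1-D cross-correlation of a list of floats with a kernel."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(0, len(signal) - k + 1, stride)
    ]


def relu(values):
    return [max(0.0, v) for v in values]


def conv_branch(block, kernels, stride):
    """Block-specific branch: apply each filter, then ReLU, then flatten."""
    features = []
    for kernel in kernels:
        features.extend(relu(conv1d(block, kernel, stride)))
    return features


def dense(inputs, weights, biases, activation=None):
    """Fully connected layer; `weights` holds one row per output unit."""
    out = [sum(w * x for w, x in zip(row, inputs)) + b
           for row, b in zip(weights, biases)]
    return activation(out) if activation else out


def init_matrix(rng, rows, cols, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def build_model(n_vis, n_nir, n_filters=2, kernel_size=5, stride=2, hidden=8, seed=0):
    """Create random (untrained) weights; all sizes are illustrative assumptions."""
    rng = random.Random(seed)

    def n_out(n):  # output length of one filter for an input of length n
        return (n - kernel_size) // stride + 1

    n_features = n_filters * (n_out(n_vis) + n_out(n_nir))
    return {
        "stride": stride,
        "vis_kernels": init_matrix(rng, n_filters, kernel_size),
        "nir_kernels": init_matrix(rng, n_filters, kernel_size),
        "w1": init_matrix(rng, hidden, n_features),
        "b1": [0.0] * hidden,
        "w2": init_matrix(rng, 1, hidden),
        "b2": [0.0],
        "n_features": n_features,
    }


def forward(model, vis_block, nir_block):
    # Each block has its own convolutional branch (the "parallel inputs").
    vis_feat = conv_branch(vis_block, model["vis_kernels"], model["stride"])
    nir_feat = conv_branch(nir_block, model["nir_kernels"], model["stride"])
    fused = vis_feat + nir_feat  # concatenate the block-wise features
    # Regression module: fully connected layers on the fused features.
    hidden = dense(fused, model["w1"], model["b1"], activation=relu)
    return dense(hidden, model["w2"], model["b2"])[0]


# Toy spectrum of 50 synthetic points; the split index 20 is an arbitrary
# stand-in for the Vis/NIR boundary, not the one used in the study.
data_rng = random.Random(1)
spectrum = [data_rng.random() for _ in range(50)]
vis_block, nir_block = spectrum[:20], spectrum[20:]

model = build_model(len(vis_block), len(nir_block))
prediction = forward(model, vis_block, nir_block)
```

A single-block CNN, by contrast, would pass the whole spectrum through one shared set of filters. Giving each block its own branch lets the filters specialise to that spectral region before fusion. In practice such a model would be built and trained with a deep learning framework rather than hand-written loops.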
Keywords
Data fusion; Artificial intelligence; Spectroscopy; Chemistry
Publisher
Elsevier
