Search Results
- Deep chemometrics: validation and transfer of a global deep near-infrared fruit model to use it on a new portable instrument
  Publication. Mishra, Puneet; Passos, Dário
  Recently, a large near-infrared spectroscopy data set for mango fruit quality assessment was made available online. Based on that data, a deep learning (DL) model outperformed all major chemometrics and machine learning approaches. However, in earlier studies, model validation was limited to a test set from the same data set, measured with the same instrument on samples of similar origin. From a DL perspective, once a model is trained it is expected to generalise well when applied to a new batch of data. Hence, this study aims to validate the generalisability of the earlier DL model for dry matter (DM) prediction in mango on a different test set, measured in a local laboratory setting with a different instrument. First, the performance of the old DL model is presented. Then, a new DL model is built to cover the seasonal variability related to the fruit harvest season. Finally, a DL model transfer method is applied to use the model on a new instrument. Direct application of the old DL model led to a higher error than the PLS model. However, the performance of the DL model improved drastically when it was tuned to cover the seasonal variability. The updated DL model performed better than either implementing a new PLS model or updating the existing PLS model. A final root-mean-square error of prediction (RMSEP) of 0.518% was reached. This result supports the view that, when large data sets are available, DL modelling can outperform chemometrics approaches.
- Deep learning for near-infrared spectral data modelling: Hypes and benefits
  Publication. Mishra, Puneet; Passos, Dário; Marini, Federico; Xu, Junli; Amigo, Jose M.; Gowen, Aoife A.; Jansen, Jeroen J.; Biancolillo, Alessandra; Roger, Jean Michel; Rutledge, Douglas N.; Nordon, Alison
  Deep learning (DL) is emerging as a new tool to model spectral data acquired in analytical experiments. Although applications are flourishing, there is also much interest currently observed in the scientific community on the use of DL for spectral data modelling. This paper provides a critical and comprehensive review of the major benefits, and potential pitfalls, of current DL techniques used for spectral data modelling. Although this work focuses on DL for the modelling of near-infrared (NIR) spectral data in chemometric tasks, many of the findings can be extended to other spectral techniques. Finally, empirical guidelines on best practice for the use of DL for the modelling of spectral data are provided.
- Near-Earth heliospheric magnetic field intensity since 1750: 2. Cosmogenic radionuclide reconstructions
  Publication. Owens, M. J.; Cliver, E.; McCracken, K. G.; Beer, J.; Barnard, L.; Lockwood, M.; Rouillard, A.; Passos, Dário; Riley, P.; Usoskin, I.; Wang, Y-M.
  This is Part 2 of a study of the near-Earth heliospheric magnetic field strength, B, since 1750. Part 1 produced composite estimates of B from geomagnetic and sunspot data over the period 1750-2013. Sunspot-based reconstructions can be extended back to 1610, but the paleocosmic ray (PCR) record is the only data set capable of providing a record of solar activity on millennial timescales. The process for converting Be-10 concentrations measured in ice cores to B is more complex than with geomagnetic and sunspot data, and the uncertainties in B derived from cosmogenic nuclides (about 20% for any individual year) are much larger. Within this level of uncertainty, we find reasonable overall agreement between PCR-based B and the geomagnetic- and sunspot number-based series. This agreement was enhanced by excising low values in PCR-based B attributed to high-energy solar proton events. Other discordant intervals, with as yet unspecified causes, remain included in our analysis. Comparison of 3-year averages centered on sunspot minimum yields reasonable agreement between the three estimates, providing a means to investigate the long-term changes in the heliospheric magnetic field into the past even without a means to remove solar proton events from the records.
- A deep learning approach to improving spectral analysis of fruit quality under interseason variation
  Publication. Yang, Jie; Luo, Xuan; Zhang, Xiaolei; Passos, Dário; Xie, Lijuan; Rao, Xiuqin; Xu, Huirong; Ting, K.C.; Lin, Tao; Ying, Yibin
  Model updating for developed calibrations is critical for robust spectral analysis in fruit quality control. Existing methods usually need a sufficient number of samples for model recalibration and are mainly designed for conventional linear models. This study proposes a model fine-tuning approach to update nonlinear deep learning models using limited sample sizes for fruit detection under interseason variation. This approach achieves RMSEs of 0.407, 1.035, and 0.642 for predicting soluble solid content (%) or dry matter content (%) in the Cuiguan pear, Rocha pear, and mango datasets, respectively. The proposed approach reduces test RMSE by at least 9.2%, 17.5%, and 11.6% on the three datasets compared with conventional model updating methods, including the global model, recalibration, and slope/bias correction. The model fine-tuning approach shows improved reliability across updating sample sizes ranging from 5% to 20% of the new season's samples. Using cumulative data from multiple previous seasons further improves performance. This study potentially facilitates implementing high-performance deep learning approaches in on-site applications of fruit quality control.
- A tutorial on automatic hyperparameter tuning of deep spectral modelling for regression and classification tasks
  Publication. Passos, Dário; Mishra, Puneet
  Deep spectral modelling for regression and classification is gaining popularity in the chemometrics domain. A major topic in the deep learning (DL) modelling of spectral data is the choice and optimization of the deep neural network architecture suitable for the specific task of spectral modelling. Although several recent research articles in the chemometric domain show advanced approaches to deep spectral modelling, there is currently a lack of hands-on tutorial articles that supply the non-expert user with practical tools to learn and implement advanced DL optimization methodologies aimed at spectral data. Hence, this tutorial article aims at reducing the gap between the non-expert user of DL in the chemometric community and the implementation of DL models for daily usage. This tutorial supplies a quick introduction to state-of-the-art deep spectral modelling and related DL concepts and presents a set of methodologies aimed at optimizing DL hyperparameters. To this end, this tutorial shows two practical examples of how to implement and optimize two DL models for spectral regression and classification tasks. The models are implemented in Python and TensorFlow and the complete code is supplied in the form of two complementary notebooks.
- Realizing transfer learning for updating deep learning models of spectral data to be used in new scenarios
  Publication. Mishra, Puneet; Passos, Dário
  This study presents the concept of transfer learning (TL) to the chemometrics community for updating DL models related to spectral data, particularly when a pre-trained DL model needs to be used in a scenario having unseen variability. This is the typical situation where classical chemometrics models require some form of re-calibration or update. In TL, the network architecture and weights from the pre-trained DL model are complemented by adding extra fully connected (FC) layers when dealing with the new data. Such extra FC layers are expected to learn the variability of the new scenario and adjust the output of the main architecture. Three approaches to TL were compared. In the first, the weights from the initial model were left untrained and only the newly added FC layers could be retrained. In the second, the weights from the initial model could be retrained alongside the new FC layers. In the third, the weights from the initial model could be retrained with no extra FC layers added. TL was demonstrated on two real cases related to near-infrared spectroscopy, i.e., mango fruit analysis and melamine production monitoring. In the mango case, the model needs to be updated to cover a new seasonal variability for dry matter prediction, while in the melamine case the model needs to be updated for a change in the recipe of the production material. The results showed that the proposed TL approaches successfully updated the DL models to new scenarios for both the mango and melamine cases presented. TL performed better when the weights from the old model were retrained. Furthermore, TL outperformed three recent benchmark approaches to model updating. TL has the potential to make DL models widely usable, sharable, and scalable.
- A synergistic use of chemometrics and deep learning improved the predictive performance of near-infrared spectroscopy models for dry matter prediction in mango fruit
  Publication. Mishra, Puneet; Passos, Dário
  This study provides an innovative approach to improve deep learning (DL) models for spectral data processing with the use of chemometrics knowledge. The technique proposes pre-filtering the outliers using Hotelling's T² and Q statistics obtained with partial least-squares (PLS) analysis, and spectral data augmentation in the variable domain, to improve the predictive performance of DL models built on spectral data. The data augmentation is carried out by stacking the same data pre-processed with several pre-processing techniques such as standard normal variate, 1st derivatives, 2nd derivatives and their combinations. The performance of the approach is demonstrated on a real near-infrared (NIR) data set related to dry matter (DM) prediction in mango fruit. The data set consisted of a total of 11,961 spectra and reference DM measurements. The results showed that removing the outliers and augmenting the spectral data improved the predictive performance of DL models. Furthermore, this approach attained the lowest root mean squared error of prediction (RMSEP) on the mango data set, i.e., 0.79% compared to the best known RMSEP of 0.84%. Further, by removing outliers from the test set, the RMSEP decreased to 0.75%. Several chemometrics approaches can complement DL models and should be widely explored in conjunction.
- Characteristics of magnetic solar-like cycles in a 3D MHD simulation of solar convection
  Publication. Passos, D.; Charbonneau, P.
  We analyse the statistical properties of the stable magnetic cycle unfolding in an extended 3D magnetohydrodynamic simulation of solar convection produced with the EULAG-MHD code. The millennium simulation spans over 1650 years, in the course of which forty polarity reversals take place on a regular ~40 yr cadence, remaining well-synchronized across solar hemispheres. In order to characterize this cycle and facilitate its comparison with measures typically used to represent solar activity, we build two proxies for the magnetic field in the simulation, mimicking the solar toroidal field and the polar radial field. Several quantities that characterize the cycle are measured (period, amplitudes, etc.) and correlations between them are computed. These are then compared with their observational analogs. From the typical Gnevyshev-Ohl pattern to hints of Gleissberg modulation, the simulated cycles share many of the characteristics of their observational analogs, even though the simulation lacks poloidal field regeneration through active region decay, a mechanism nowadays often considered an essential component of the solar dynamo. Some significant discrepancies are also identified, most notably the in-phase variation of the simulated poloidal and toroidal large-scale magnetic components, and the low degree of hemispheric coupling at the level of hemispheric cycle amplitudes. Possible causes underlying these discrepancies are discussed.
- Non-destructive soluble solids content determination for ‘Rocha’ Pear Based on VIS-SWNIR spectroscopy under ‘Real World’ sorting facility conditions
  Publication. Passos, Dário; Rodrigues, Daniela; Cavaco, Ana M.; Antunes, Maria Dulce; Guerra, Rui Manuel Farinha das Neves
  In this paper we report a method to determine the soluble solids content (SSC) of 'Rocha' pear (Pyrus communis L. cv. Rocha) based on short-wave NIR reflectance spectra (500-1100 nm) measured in conditions similar to those found in packinghouse fruit sorting facilities. We obtained 3300 reflectance spectra from pears acquired from different lots and producers, with diverse storage times and ripening stages. The macroscopic properties of the pears, such as size, temperature and SSC, were measured under controlled laboratory conditions. For the spectral analysis, we implemented a computational pipeline that incorporates multiple pre-processing techniques including a feature selection procedure, various multivariate regression models and three different validation strategies. This benchmark allowed us to find the best model/preprocessing procedure for SSC prediction from our data. Of the calibration models tested, Support Vector Machines provided the best prediction metrics, with an RMSEP of around 0.82 °Brix and 1.09 °Brix for the internal and external validation strategies, respectively. The latter validation was implemented to assess the prediction accuracy of this calibration method under more 'real world-like' conditions. We also show that incorporating information about fruit temperature and size into the calibration models improves SSC predictability. Our results indicate that the methodology presented here could be implemented in existing packinghouse facilities for single-fruit SSC characterization.
- Near-Earth heliospheric magnetic field intensity since 1750: 1. Sunspot and geomagnetic reconstructions
  Publication. Owens, M. J.; Cliver, E.; McCracken, K. G.; Beer, J.; Barnard, L.; Lockwood, M.; Rouillard, A.; Passos, Dário; Riley, P.; Usoskin, I.; Wang, Y-M.
  We present two separate time series of the near-Earth heliospheric magnetic field strength (B) based on geomagnetic data and sunspot number (SSN). The geomagnetic-based B series from 1845 to 2013 is a weighted composite of two series that employ the interdiurnal variability index; this series is highly correlated with in situ spacecraft measurements of B (correlation coefficient, r = 0.94; mean square error, MSE = 0.16 nT²). The SSN-based estimate of B, from 1750 to 2013, is a weighted composite of eight time series derived from two separate reconstruction methods applied to four different SSN time series, allowing determination of the uncertainty from both the underlying sunspot records and the B reconstruction methods. The SSN-based composite is highly correlated with direct spacecraft measurements of B and with the composite geomagnetic B time series from 1845 to 2013 (r = 0.91; MSE = 0.24 nT²), demonstrating that B can be accurately reconstructed by both geomagnetic and sunspot-based methods. The composite sunspot and geomagnetic B time series, with uncertainties, are provided as supporting information.
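
As a rough illustration of the variable-domain augmentation described in the mango dry matter entry above (stacking the same spectrum after several pre-processing steps), the idea might be sketched as follows. This is not the authors' code: the function names `snv`, `diff` and `augment` are hypothetical, and plain finite differences stand in for whatever derivative filter the study actually used.

```python
from statistics import mean, stdev


def snv(spectrum):
    """Standard normal variate: centre and scale one spectrum by its own mean and std."""
    m = mean(spectrum)
    s = stdev(spectrum)  # sample standard deviation; an assumption, some implementations use the population form
    return [(v - m) / s for v in spectrum]


def diff(spectrum):
    """First-order finite difference, with the last value repeated to keep the length."""
    d = [b - a for a, b in zip(spectrum, spectrum[1:])]
    return d + [d[-1]]


def augment(spectrum):
    """Concatenate raw and pre-processed copies of one spectrum along the variable axis."""
    scaled = snv(spectrum)
    d1 = diff(spectrum)   # stand-in for a 1st derivative
    d2 = diff(d1)         # stand-in for a 2nd derivative
    return list(spectrum) + scaled + d1 + d2 + diff(scaled)


if __name__ == "__main__":
    toy_spectrum = [1.0, 2.0, 4.0, 7.0]  # made-up values, not real NIR data
    stacked = augment(toy_spectrum)
    print(len(toy_spectrum), "->", len(stacked), "variables")
```

Each input spectrum becomes five times longer, so a model trained on the stacked vector sees the raw signal and its pre-processed views side by side.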