A novel technique to retrieve chlorophyll measurements from particulate beam-attenuation coefficients using deep learning has been published this week in Optics Express by scientists at Plymouth Marine Laboratory.
Particulate beam-attenuation coefficient data measure the loss of energy from a beam of light travelling through seawater due to the presence of suspended particles. They are much easier to collect than the current ‘gold standard’ for measuring chlorophyll, high-performance liquid chromatography (HPLC). The approach therefore provides a way of collecting significantly more data, which can be used to improve satellite products. The work was developed with expertise from the NERC Earth Observation Data Acquisition and Analysis Service (NEODAAS) and used the hardware provided by the MAssive GPU cluster for Earth Observation (MAGEO) to dramatically accelerate the research.
Deep neural networks, algorithms inspired by the human brain, take significant computational resources to train. During initial development the team used a single GPU, which took roughly 12 hours to train each neural network. MAGEO allowed the team to accelerate this training, resulting in nearly 1000 trained networks overall. Thanks to the highly parallel environment and the computing power of each individual GPU, what would have taken 16 months on a single GPU took 10 days on MAGEO.
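The 16-month figure follows from the numbers quoted above, and the arithmetic can be checked with a short sketch. The 30.4-day average month is an assumption made here for the conversion, and "nearly 1000" is rounded up to 1000.

```python
# Figures quoted in the article.
HOURS_PER_NETWORK = 12   # roughly, on a single GPU
NETWORKS_TRAINED = 1000  # "nearly 1000", rounded up
MAGEO_DAYS = 10          # wall-clock time on MAGEO

# Assumption: an average month of 30.4 days.
DAYS_PER_MONTH = 30.4

# Training every network one after another on a single GPU.
single_gpu_days = HOURS_PER_NETWORK * NETWORKS_TRAINED / 24
single_gpu_months = single_gpu_days / DAYS_PER_MONTH

# Ratio of sequential single-GPU time to the time taken on MAGEO.
speedup = single_gpu_days / MAGEO_DAYS

print(f"Single GPU: {single_gpu_days:.0f} days (~{single_gpu_months:.1f} months)")
print(f"Approximate speed-up on MAGEO: {speedup:.0f}x")
```

The speed-up here is only the ratio of the two quoted durations, so it combines the effect of running in parallel with the effect of faster individual GPUs.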
Figure 1 (above): Bi-dimensional histograms of (A) the relationship between predicted chlorophyll and true chlorophyll and (B) the corresponding relative residuals. δ represents the bias of the relationship (the median of the relative residuals) and σ shows its precision (the robust standard deviation). Horizontal dashed lines mark the ±50% residuals.
Figure 2 (above): The relationship between particulate beam-attenuation coefficient data and chlorophyll-a concentrations at different wavelengths. The figure shows the shape of the spectra at different chlorophyll concentrations, which the neural network exploits to predict chlorophyll.
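As an illustration of the two statistics in the Figure 1 caption, the sketch below computes a median bias and a robust standard deviation from relative residuals. Two things are assumptions rather than details from the study: the relative residual is taken as (predicted - true) / true, and the robust standard deviation is taken as 1.4826 times the median absolute deviation, which is one common definition. The sample values are made up.

```python
from statistics import median


def relative_residuals(predicted, true):
    """Relative residuals, assumed here to be (predicted - true) / true."""
    return [(p - t) / t for p, t in zip(predicted, true)]


def bias_and_precision(residuals):
    """Return (delta, sigma): median bias and a robust standard deviation.

    Assumption: sigma = 1.4826 * MAD, a common robust estimator that
    matches the ordinary standard deviation for normally distributed data.
    """
    delta = median(residuals)
    mad = median(abs(r - delta) for r in residuals)
    return delta, 1.4826 * mad


# Made-up chlorophyll values, for illustration only.
true_chl = [0.1, 0.2, 0.5, 1.0, 2.0]
predicted_chl = [0.11, 0.19, 0.5, 0.9, 2.4]

residuals = relative_residuals(predicted_chl, true_chl)
delta, sigma = bias_and_precision(residuals)

# Share of points inside the +/-50% band marked by the dashed lines.
within_band = sum(abs(r) <= 0.5 for r in residuals) / len(residuals)

print(f"bias = {delta:+.1%}, precision = {sigma:.1%}, within 50% = {within_band:.0%}")
```

Median-based statistics are less sensitive to a handful of extreme residuals than a mean and ordinary standard deviation would be.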
To train the neural network successfully a large dataset was required. The team quality-controlled 378,022 sample points for the research, and the size of this dataset allowed extensive validation to demonstrate the accuracy of the neural network. The dataset can easily be extended as future data collections are performed.
Thanks to MAGEO, an ensemble of neural networks was produced. Using the ensemble improved the chlorophyll predictions and simultaneously provided estimates of their uncertainties. These uncertainties are key in deciding how much a prediction can be trusted.
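The general idea of an ensemble can be sketched as follows: every member predicts from the same input, the predictions are averaged, and their spread serves as an uncertainty estimate. This is a minimal illustration and not the study's implementation. The toy members, the use of the mean and sample standard deviation, and the input values are all hypothetical.

```python
from statistics import mean, stdev


def ensemble_predict(members, spectrum):
    """Combine member predictions into an estimate and an uncertainty.

    Assumption: the estimate is the mean of the member predictions and
    the uncertainty is their sample standard deviation.
    """
    predictions = [member(spectrum) for member in members]
    return mean(predictions), stdev(predictions)


def make_toy_member(scale):
    """Hypothetical stand-in for one trained network."""
    def member(spectrum):
        return scale * sum(spectrum) / len(spectrum)
    return member


# Three toy members that disagree slightly with one another.
members = [make_toy_member(s) for s in (0.9, 1.0, 1.1)]

# Made-up beam-attenuation values at three wavelengths.
spectrum = [0.30, 0.25, 0.20]

estimate, uncertainty = ensemble_predict(members, spectrum)
print(f"estimate = {estimate:.3f}, uncertainty = {uncertainty:.3f}")
```

When the members agree closely the spread is small, and when they disagree it grows, flagging predictions that deserve less trust.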
The results presented in the study show that the trained neural network can predict chlorophyll to a very high accuracy with minimal bias (-3% for the validation data). Because of the abundance of data, the team were also able to create a wholly independent dataset using data from the Tara Oceans Expedition. This showed that the neural network could still achieve very high accuracy (-2% bias) in regions and from cruises on which it had not been trained.
This study enables the development of additional methods to create in-situ chlorophyll datasets, which are crucial for the Earth Observation community. It also shows the potential of MAGEO to help develop ‘micro-artificial intelligence’ for environmental intelligence. The chlorophyll dataset and code used for the paper have been made publicly available, allowing other researchers to build on and improve the technique.
S. Graban, G. Dall'Olmo, S. Goult, and R. Sauzède (2020) "