Abstract: The biological investigation of a population’s shape diversity using digital images typically
relies on geometric morphometrics, an approach based on user-defined landmarks.
In contrast to this traditional approach, progress in deep learning has led to numerous applications
ranging from specimen identification to object detection. However, these models tend to become black
boxes, which limits the use of recent deep learning models in biological applications. Progress in
explainable artificial intelligence aims to overcome this limitation. This study compares
the explanatory power of unsupervised machine learning models to traditional landmark-based
approaches for population structure investigation. We apply convolutional autoencoders as well
as Gaussian process latent variable models to two Nile tilapia datasets to investigate the latent
structure using consensus clustering. The explanatory factors of the machine learning models were
extracted and compared to generalized Procrustes analysis. Hypotheses based on the Bayes factor are
formulated to test the unambiguity of population diversity unveiled by the machine learning models.
The findings show that it is possible to obtain biologically meaningful results relying on unsupervised
machine learning. Furthermore, we show that the machine learning models unveil latent structures
close to the true population clusters. We found that 80% of the true population clusters relying on
the convolutional autoencoder are significantly different from the remaining clusters. Similarly, 60% of
the true population clusters relying on the Gaussian process latent variable model are significantly
different. We conclude that the machine learning models outperform generalized Procrustes analysis,
for which only 16% of the population clusters were found to be significantly different. However, the applied
machine learning models still have limited biological explainability. We recommend further in-depth
investigations to unveil the explanatory factors in the models used.
Keywords: generalized Procrustes analysis; machine learning; convolutional autoencoder; Gaussian
process latent variable models
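
For readers unfamiliar with the landmark-based baseline named in the abstract, the following is a minimal sketch of generalized Procrustes analysis for 2D landmarks. It is not the authors' implementation. It assumes each specimen is a list of landmarks encoded as complex numbers (x + yj), handles rotation only (no reflection), and all function names and the example triangle are hypothetical.

```python
import cmath
import math


def normalize(shape):
    """Center a shape on its centroid and scale it to unit centroid size."""
    centroid = sum(shape) / len(shape)
    centered = [z - centroid for z in shape]
    size = math.sqrt(sum(abs(z) ** 2 for z in centered))
    return [z / size for z in centered]


def rotate_onto(shape, reference):
    """Rotate a normalized shape to minimize its squared distance to reference."""
    # For complex landmarks, the optimal rotation is the unit phase of the
    # inner product between the shape and the reference.
    inner = sum(z.conjugate() * r for z, r in zip(shape, reference))
    rotation = inner / abs(inner)
    return [rotation * z for z in shape]


def generalized_procrustes(shapes, max_iter=100, tol=1e-12):
    """Align all shapes to an iteratively re-estimated mean shape."""
    aligned = [normalize(s) for s in shapes]
    mean = aligned[0]  # assumption: the first specimen seeds the mean
    for _ in range(max_iter):
        aligned = [rotate_onto(s, mean) for s in aligned]
        new_mean = normalize([sum(col) / len(col) for col in zip(*aligned)])
        change = sum(abs(a - b) ** 2 for a, b in zip(new_mean, mean))
        mean = new_mean
        if change < tol:
            break
    return aligned, mean


# Hypothetical example: a triangle and a rotated, scaled, shifted copy of it.
triangle = [0 + 0j, 2 + 0j, 1 + 3j]
transform = 2.5 * cmath.exp(1j * 0.7)
copy = [transform * z + (4 - 1j) for z in triangle]
aligned, mean_shape = generalized_procrustes([triangle, copy])
```

After alignment, the two specimens in this example coincide, because they differ only in position, scale, and rotation, which are exactly the components that Procrustes superimposition removes before shape variation is analyzed.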