Mineral identification from spectral data relies on the presence and strength of specific absorption features in designated spectral windows. Much of each spectrum therefore carries only background information that is not used for diagnostic analysis, and as a result standard similarity metrics cannot accurately measure how alike two spectra are. One way to overcome this limitation is to assign a numeric value to the presence and strength of each absorption feature of interest. The spectral summary parameters [Viviano-Beck et al. (2014)] are one such method: each parameter attempts to capture a particular spectral feature (such as a band depth or a doublet) as a single number. Because many features of interest occur in similar spectral windows, the summary parameters can also produce “false positives”, i.e. high scores for spectra with a different feature than the one of interest. In addition, because these parameters are hand-crafted (designed by experts for a specific absorption feature), they are adversely affected by noise and distortion artifacts.
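To make the idea of a hand-crafted summary parameter concrete, the sketch below computes a generic band depth: one minus the ratio of the reflectance at the band center to a linear continuum interpolated between two shoulder wavelengths. The function name, the nearest-channel lookup and the example spectrum are illustrative assumptions, not the definitions used in the CRISM parameter set.

```python
def band_depth(wavelengths, reflectance, short, center, long):
    """Generic band depth for one absorption feature (illustrative sketch).

    `short` and `long` are the shoulder wavelengths and `center` is the
    band-center wavelength; each is snapped to the nearest available channel.
    """
    def nearest(target):
        # Index of the channel whose wavelength is closest to `target`.
        return min(range(len(wavelengths)),
                   key=lambda i: abs(wavelengths[i] - target))

    i_s, i_c, i_l = nearest(short), nearest(center), nearest(long)
    lam_s, lam_c, lam_l = wavelengths[i_s], wavelengths[i_c], wavelengths[i_l]
    if lam_l == lam_s:
        raise ValueError("shoulder wavelengths must differ")

    # Linear continuum between the two shoulders, evaluated at the center.
    b = (lam_c - lam_s) / (lam_l - lam_s)
    a = 1.0 - b
    continuum = a * reflectance[i_s] + b * reflectance[i_l]

    # 0 means no absorption; larger values mean a deeper band.
    return 1.0 - reflectance[i_c] / continuum


# Made-up spectrum with a dip centered at 1.90 (hypothetical values).
wl = [1.80, 1.85, 1.90, 1.95, 2.00]
refl = [0.50, 0.48, 0.40, 0.48, 0.50]
depth = band_depth(wl, refl, short=1.80, center=1.90, long=2.00)
print(f"band depth: {depth:.3f}")
```

Because the value depends on only three channels, noise in any one of them shifts the score directly, which is one way to see why such parameters are sensitive to noise.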
We propose a new unsupervised model based on Generative Adversarial Networks (GANs) to learn features that are diagnostic in terms of discriminating the various mineral spectra present in the CRISM image database. GANs are a class of unsupervised machine learning algorithms that attempt to learn the distribution of a known dataset and then allow a user to sample from this distribution. The model consists of two neural networks, the generator and the discriminator, engaged in a zero-sum game. The generator accepts a latent variable (a low-dimensional vector drawn from a known distribution) as input and attempts to generate samples that the discriminator cannot distinguish from real samples in the dataset. The discriminator, in turn, attempts to distinguish real-world samples from samples produced by the generator. Training alternates between two steps. First, the discriminator is trained to differentiate between generated samples and real-world samples. Next, the generator is chained with the discriminator and trained to generate samples that the discriminator would classify as real; the weights of the discriminator are frozen during this step. These steps are alternated until convergence. At convergence, the generator should produce samples that not only the discriminator but even experts find hard to differentiate from real-world data. For spectra, this means the generator will have “learnt” to generate various mineral spectra, with the appropriate absorption features in the correct spectral windows. Since the training signal to the generator is based on the output of the discriminator, the only way the generator can “learn” to place these absorption features appropriately is if the discriminator is able to learn a representation for such features.
We propose to use the representation learnt by the discriminator (which has knowledge of the diagnostic features) for discrimination and identification of minerals using the spectra.
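The alternating training scheme described above can be sketched on a one-dimensional toy problem. This is only an illustration under strong assumptions: the generator and discriminator here are single linear units with hand-derived gradients, the "real" data are draws from a Gaussian, and all names and hyperparameters are hypothetical. It is not the network used for the CRISM spectra and makes no claim about convergence.

```python
import math
import random


def sigmoid(s):
    # Numerically stable logistic function.
    if s >= 0:
        return 1.0 / (1.0 + math.exp(-s))
    e = math.exp(s)
    return e / (1.0 + e)


def train_toy_gan(steps=200, batch=32, lr=0.05, seed=0):
    rng = random.Random(seed)
    a, b = 1.0, 0.0   # generator:     G(z) = a*z + b
    w, c = 0.1, 0.0   # discriminator: D(x) = sigmoid(w*x + c)

    for _ in range(steps):
        # Step 1: update the discriminator; the generator is held fixed.
        # Loss: -[log D(x_real) + log(1 - D(G(z)))]
        gw = gc = 0.0
        for _ in range(batch):
            x_real = rng.gauss(3.0, 0.5)          # toy "real" sample
            x_fake = a * rng.gauss(0.0, 1.0) + b  # generated sample
            d_real = sigmoid(w * x_real + c)
            d_fake = sigmoid(w * x_fake + c)
            gw += -(1.0 - d_real) * x_real + d_fake * x_fake
            gc += -(1.0 - d_real) + d_fake
        w -= lr * gw / batch
        c -= lr * gc / batch

        # Step 2: update the generator through the discriminator,
        # whose weights (w, c) are frozen during this step.
        # Loss: -log D(G(z))
        ga = gb = 0.0
        for _ in range(batch):
            z = rng.gauss(0.0, 1.0)
            d_fake = sigmoid(w * (a * z + b) + c)
            ga += -(1.0 - d_fake) * w * z
            gb += -(1.0 - d_fake) * w
        a -= lr * ga / batch
        b -= lr * gb / batch

    return (a, b), (w, c)


generator_params, discriminator_params = train_toy_gan()
print("generator (a, b):", generator_params)
print("discriminator (w, c):", discriminator_params)
```

Note that the generator's gradient in step 2 is scaled by the discriminator weight `w`: the generator only receives a useful signal to the extent that the discriminator has learnt something about the data, which mirrors the argument for reusing the discriminator's representation.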
To generate the maps, we pass "exemplar" spectra (spectra we wish to search the database for) through the model to obtain their representations. We do the same with test spectra (i.e. spectra from the images to be mapped). Test spectra whose representations are very close to that of an exemplar (a SAD of less than 0.05) are considered identifications (high-confidence detections); these spectra have the same diagnostic absorption features as the exemplar. Spectra that are somewhat farther away but still relatively close (a SAD between 0.05 and 0.15) are referred to as guesses (intermediate-confidence detections). The guesses generally have the same absorption features as the exemplar, but there is some scope for confusion with other exemplars that have absorptions in similar regions. The complete mapping scheme is shown in the figure above.
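The thresholding step can be sketched as follows. This is a minimal illustration, assuming that SAD denotes the spectral angle between two representation vectors (the arccosine of their cosine similarity, in radians) and that each test spectrum is assigned to its nearest exemplar; both are assumptions made here, as are the function names and the made-up representation vectors. Only the 0.05 and 0.15 thresholds come from the text.

```python
import math


def spectral_angle(u, v):
    """Angle in radians between two vectors (assumed meaning of SAD)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("zero-length representation")
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.acos(cos)


def classify(test_rep, exemplar_reps, ident_thresh=0.05, guess_thresh=0.15):
    """Return (nearest exemplar name, label, distance) for one test spectrum."""
    best_name, best_dist = None, float("inf")
    for name, rep in exemplar_reps.items():
        dist = spectral_angle(test_rep, rep)
        if dist < best_dist:
            best_name, best_dist = name, dist

    if best_dist < ident_thresh:
        label = "identification"   # high-confidence detection
    elif best_dist <= guess_thresh:
        label = "guess"            # intermediate-confidence detection
    else:
        label = "none"             # too far from every exemplar
    return best_name, label, best_dist


# Hypothetical representations; real ones would come from the discriminator.
exemplars = {"exemplar_A": [1.0, 0.0, 0.2], "exemplar_B": [0.1, 1.0, 0.0]}
for rep in ([1.0, 0.02, 0.2], [1.0, 0.12, 0.2], [0.5, 0.5, 0.5]):
    print(classify(rep, exemplars))
```

The returned distance doubles as the per-pixel similarity score that indicates the confidence of each detection.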
The mapping results for CRISM image FRT000093BE are shown in the figure below. The color of each pixel in the GAN-based mineral map indicates that the pixel has absorption features similar to the exemplar named in the accompanying legend. Boldly colored pixels represent identifications, while lighter coloration indicates a guess. Over a variety of images, the model has proven successful in identifying spectra accurately.
Advantages of the Proposed method
- All detections are pixel based.
- Each detection is associated with a similarity score that indicates the confidence of the specific detection.
- The model can map any spectrum from the CRISM image database, e.g. the MICA library spectra [Viviano-Beck et al.].
- The learnt features are far more resistant to noise than techniques like the spectral summary parameters.
- Getting the representations and detections from this pipeline is very fast.
Limitations of the Proposed method
- The method does not perform unmixing.
- The method can only detect spectra present in the exemplar library.
- In the case of intra-class spectral variability, multiple exemplars of the same class may be needed to detect all variants of the class. As mentioned above, it is helpful to think of the method as a spectrum mapping technique rather than a mineral mapping technique.
For a more complete analysis and discussion, please refer to our preprint.
A. M. Saranathan and M. Parente
Pre-print available. 15 pages. [PDF]
A. M. Saranathan and M. Parente
50th Lunar and Planetary Science Conference (LPSC), The Woodlands, TX, Mar. 2019, Abstract no. 2698. [PDF]