Data Set Information
|
DATA_SET_NAME |
MSL MARS CHEMCAM LIBS SPECTRA 4/5 RDR V1.0
|
DATA_SET_ID |
MSL-M-CHEMCAM-LIBS-4/5-RDR-V1.0
|
NSSDC_DATA_SET_ID |
|
DATA_SET_TERSE_DESCRIPTION |
|
DATA_SET_DESCRIPTION |
Data Set Overview : The ChemCam LIBS RDR data set contains calibrated spectra and higher level products derived from raw data collected by the ChemCam Laser Induced Breakdown Spectrometer on the Mars Science Laboratory rover. Standard derived products include summed calibrated spectra (RDR), Intermediate Clean Calibrated Spectra (CCS) and Multivariate Prediction of Oxide Composition (MOC) tables. Also included are the Passive (PSV) data files, and Trace Element Concentration (TEC) and Trace Element Area (TEA) tables. It is to be noted that the average and median spectra included in the RDR and CCS files are obtained by applying the whole processing to the average or median of the raw spectra (after removal of the first 5 shots). As such, they may be different from the average or median of the processed single shot spectra. The MOC results are based on the average of the CCS. Shot to shot MOC results can be requested from the ChemCam team. The PSV results are based on the average of spectra that were acquired in a particular sequence. For some passive sequences multiple spectra were acquired at the same location and then averaged to represent the passive spectrum at that location. Converting the PSV files (DN) to radiance nominally requires a background subtraction, an estimate of which is obtained by a fixed set of pixels with virtually no photon response. For example, detector background DN levels can be computed for each detector from average DN values on lines 1905-1920 (VNIR), 2237-2241 (VIO) and 4385-4395 (UV). Subsequent to this subtraction, the instrument response (photons/DN; see file GAIN_MARS.TAB), spectrometer angular field of view (0.65 mrad), aperture size (108.4 mm diameter), known distance to target (m), spectral bin size, and collection integration time (msec) can be used to compute target radiance [WIENSETAL2013, JOHNSONETAL2015]. The TEC and TEA results are based on procedures that fit several trace element emission lines. 
The procedure is described in the document TRACE_ELEMENT_PEAK_FITTING.PDF.
WAVE_CAL_COEFFS.TAB, WAVE_CAL_COEFFS_500.TAB and WAVE_CAL_COEFFS_1418.TAB provide the parameters for the ChemCam spectrometer wavelength calibration. The parameter files differ in the sols and spectral ranges to which they apply: WAVE_CAL_COEFFS.TAB was used to calibrate ChemCam data from sols 0-499; WAVE_CAL_COEFFS_500.TAB was used to calibrate data after sol 500 for the UV and VNIR spectral ranges and from sols 500-1418 for the VIO spectral range; and WAVE_CAL_COEFFS_1418.TAB was used to calibrate data after sol 1418 for the VIO spectral range only. Each row of WAVE_CAL_COEFFS.TAB (or WAVE_CAL_COEFFS_500.TAB or WAVE_CAL_COEFFS_1418.TAB) corresponds to an EDR pixel. The 4 columns of WAVE_CAL_COEFFS.TAB (or WAVE_CAL_COEFFS_500.TAB) give the default wavelength of the EDR pixel in the first column (COLUMN1) and the calibration model parameters in the remaining columns (COLUMN2, COLUMN3, COLUMN4). COLUMN1, since it gives the default wavelengths, is identical to the data in CCAM_DEFAULT_WAVE.TAB.
Let 'i' be the row number in the EDR or WAVE_CAL_COEFFS.TAB (or WAVE_CAL_COEFFS_500.TAB or WAVE_CAL_COEFFS_1418.TAB), 'DN1[i]' be the EDR DN values, 'PIXEL1[i]' be the default, i.e. original, pixel positions with PIXEL1[i] = i, and 'WAVELENGTH1[i]' be the default wavelengths corresponding to the PIXEL1 default pixel positions. The wavelength calibration algorithm works by first calculating a calibrated pixel position ('PIXEL2[i]') from the default pixel positions ('PIXEL1[i]') and the calibration model parameters. It then uses an interpolation to give either the wavelength-calibrated DN ('DN2[i]') at the default pixel positions with wavelengths WAVELENGTH1[i], or the calibrated wavelengths 'WAVELENGTH2[i]' for the original PIXEL1[i] pixel positions and original DN1[i] EDR DN values. Standard ChemCam data processing, as well as the passive spectra in .PSV files, uses the wavelength-calibrated DN ('DN2[i]') values.
For applications requiring the original EDR DN values to be preserved, it may be necessary to generate the calibrated wavelengths ('WAVELENGTH2[i]').
For WAVE_CAL_COEFFS_500.TAB (sols 500-1418 in VIO, all sols after sol 500 for UV and VNIR), the calibrated pixel position is:
PIXEL2[i] = PIXEL1[i] + COLUMN2[i] + TEMPERATURE * COLUMN3[i] + SOL * COLUMN4[i]
where SOL refers to the sol number in the EDR header and TEMPERATURE refers to the BU_SPEC_A temperature, in units of degrees C, in the EDR header.
For WAVE_CAL_COEFFS_1418.TAB (sols 1418 up to now in the VIO range only), the calibrated pixel position is:
PIXEL2[i] = PIXEL1[i] + COLUMN2[i] + TEMPERATURE * COLUMN3[i] + TEMPERATURE * TEMPERATURE * COLUMN4[i]
where TEMPERATURE refers to the BU_SPEC_A temperature, in units of degrees C, in the EDR header.
DN2[i] is found by interpolating DN1[i] values at positions PIXEL2[i] onto positions PIXEL1[i]. WAVELENGTH2[i] is found by interpolating WAVELENGTH1[i] values at positions PIXEL1[i] onto positions PIXEL2[i]. Standard ChemCam data processing uses a four point cubic spline interpolation, but linear interpolation can be used with negligible loss of accuracy.
Note: Values of '16383.00000' suggest saturation of the detector and are unlikely to represent scientifically useful data from the target.
Processing : The Committee On Data Management and Computation (CODMAC) data level numbering system is used to describe the processing level of the EDR data product. ChemCam LIBS data products are considered CODMAC Level 4/5 (equivalent to NASA level 1B/level 2) products. With MSL Release 9 a new calibration was applied to ChemCam LIBS CCS and RDR data products, and previously delivered products from sols 500 to 801 were reprocessed. This new ChemCam recalibration was generated with a new LIBS geological database. The resulting spectra were processed using methods similar to those described by Wiens et al.
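As an aside on the wavelength calibration described above, the two formulas for PIXEL2 and the two interpolation directions can be sketched in a few lines of Python. This is a minimal illustration, not the ChemCam team's code: it uses linear interpolation (which the text says loses negligible accuracy relative to the standard cubic spline), assumes 0-based row numbers for PIXEL1[i] = i, assumes PIXEL2 is strictly increasing, and clamps values outside the sampled range to the end samples. The function names are hypothetical.

```python
from bisect import bisect_right

def interp_linear(x_new, xp, fp):
    """Linearly interpolate samples (xp, fp) at x_new; xp must be increasing.
    Points outside the range of xp are clamped to the end samples (assumption)."""
    out = []
    for x in x_new:
        if x <= xp[0]:
            out.append(fp[0])
        elif x >= xp[-1]:
            out.append(fp[-1])
        else:
            j = bisect_right(xp, x) - 1
            t = (x - xp[j]) / (xp[j + 1] - xp[j])
            out.append(fp[j] + t * (fp[j + 1] - fp[j]))
    return out

def pixel2_coeffs_500(col2, col3, col4, temperature_c, sol):
    """PIXEL2[i] = PIXEL1[i] + COLUMN2[i] + TEMPERATURE*COLUMN3[i] + SOL*COLUMN4[i]."""
    return [i + c2 + temperature_c * c3 + sol * c4
            for i, (c2, c3, c4) in enumerate(zip(col2, col3, col4))]

def pixel2_coeffs_1418(col2, col3, col4, temperature_c):
    """PIXEL2[i] = PIXEL1[i] + COLUMN2[i] + T*COLUMN3[i] + T*T*COLUMN4[i]."""
    return [i + c2 + temperature_c * c3 + temperature_c * temperature_c * c4
            for i, (c2, c3, c4) in enumerate(zip(col2, col3, col4))]

def calibrate(dn1, wavelength1, pixel2):
    """Return (DN2, WAVELENGTH2) from the EDR values and calibrated pixel positions."""
    pixel1 = list(range(len(dn1)))  # PIXEL1[i] = i (0-based here, an assumption)
    # DN1 sampled at positions PIXEL2, interpolated onto positions PIXEL1.
    dn2 = interp_linear(pixel1, pixel2, dn1)
    # WAVELENGTH1 sampled at positions PIXEL1, interpolated onto positions PIXEL2.
    wavelength2 = interp_linear(pixel2, pixel1, wavelength1)
    return dn2, wavelength2
```

Use DN2 with the default wavelengths WAVELENGTH1, or keep the original DN1 and pair it with WAVELENGTH2; the two outputs are alternatives, not meant to be combined.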
The resulting spectra were used to generate Partial Least Squares (PLS) and Independent Component Analysis (ICA) models from which quantitative elemental compositions of the ChemCam Mars spectra were determined. A weighted average of these two models was used to generate a final composition for all Mars samples analyzed with ChemCam. This weighted average was developed to produce geologically realistic results from the ChemCam Mars observations.
This document briefly outlines the methods used to generate the mean oxide compositions (MOC). Additional details for the major element calibration are available in: Clegg et al. (2017), Recalibration of the Mars Science Laboratory ChemCam Instrument with an Expanded Geochemical Database, Spectrochimica Acta Part B: Atomic Spectroscopy, 129, 64-85. https://doi.org/10.1016/j.sab.2016.12.003. Details for the MnO calibration are available in: Gasda et al. (2021), Quantification of manganese for ChemCam Mars and laboratory spectra using a multivariate model, Spectrochimica Acta Part B: Atomic Spectroscopy, 181, 106223. https://doi.org/10.1016/j.sab.2021.106223.
ChemCam processed data products are generated by the ChemCam team. They consist of the following: an RDR (reduced data record) spectrum produced by averaging spectra from all but the first five laser pulses at each observation point, subtracting a background, removing high-frequency noise, and removing the white-light continuum; and a clean calibrated spectrum (CCS) produced by correcting for the instrument optical response in addition to the RDR processing mentioned above [WIENSETAL2013]. Mean oxide compositions (MOC) for the major elements SiO2, TiO2, Al2O3, FeOT, MgO, CaO, Na2O, K2O and for MnO are also calculated. Major element oxide compositions are determined using a combination of 'sub-model' partial least squares (SM-PLS) and Independent Component Regression (ICR).
MnO compositions are determined using a 'double-blended' submodel approach, with PLS and least absolute shrinkage and selection operator (LASSO) regression models. The following text first describes the major element calibration, followed by the MnO calibration.
The PLS model uses as inputs a training set of standards observed in the laboratory using a lab version of the ChemCam instrument in Los Alamos. The ICA models use a mixture of the new training set for some elements and the original database for other elements. The new database was created with the LANL testbed instrument, which consists of the ChemCam engineering model mast unit (laser, telescope, RMI, and associated electronics) and a body unit (optical demultiplexer, spectrometers, electronics, and data processing unit) that was assembled from flight spare parts. The targets were observed from a distance of 1.56 m and were housed in a sealed chamber maintained at Mars atmospheric pressure with a Mars simulant gas. The targets consist of pressed powders to maintain homogeneity. Each target is observed using 50 laser shots, the corresponding spectra are averaged together, and this is repeated at five different locations on each target. The training set consists of up to 357 samples spanning a large range of compositions and comprising both igneous and sedimentary sources. Nearly all targets are natural terrestrial standards, but a few have been doped to increase the range of certain elements, such as Fe or Ti. Processing of these data included the use of a transfer matrix in which spectra from calibration targets on the rover were ratioed to spectra from replicate targets in the laboratory. This spectral transfer matrix was applied to the laboratory spectra to correct for potential differences between the lab measurements and the ChemCam measurements on Mars.
The PLS1 and ICA models represent two techniques that have both been in use since nearly the beginning of the mission.
In the case of PLS, the landed mission began with PLS2 and switched to PLS1 after one year. In the case of ICA the technique had been used for classification, but not for quantification. While both techniques used the laboratory training set, each technique used different means for matrix transformation to the Mars instrument and environment and significantly different means for selecting the parameters, whether fitting in the case of ICA or optimizing the number of components in the case of PLS. Because both techniques were well grounded in the training set while at the same time representing significantly different approaches, the combination of both techniques for the final product strengthens the overall approach.
SM-PLS uses 'sub-models' for each major element that are optimized for a limited range of sample compositions, rather than using a single 'full' PLS model to predict all compositions from 0 to 100 wt.%. By reducing the composition range of the PLS sub-models, they can 'specialize' in the spectral trends that are most relevant for that range, resulting in improved accuracy compared to a 'full' model. For SiO2, TiO2, Al2O3, FeOT, MgO, and CaO, three sub-models are used. For K2O two submodels are used, and for Na2O it was found that the submodels did not improve upon the full model, so no submodels are used. The number of components and spectral normalization for each submodel are optimized by 5-fold cross validation. Details of the submodel approach in general are available in: Anderson et al. (2017), Spectrochimica Acta Part B: Atomic Spectroscopy, 129, 49-57. http://dx.doi.org/10.1016/j.sab.2016.12.002. Refer to Clegg et al. (2017) for specifics of how submodels were used for the ChemCam calibration. The submodel prediction results are combined using simple logic and linear weighted averages.
For example, for SiO2 the logic for combining the 'low' (0-50 wt.%), 'mid' (30-70 wt.%), and 'high' (60-100 wt.%) submodels is as follows:
If the full model predicts <30 wt.%, use the low submodel.
If the full model predicts 30-40 wt.%, blend the low and mid submodels.
If the full model predicts 40-60 wt.%, use the mid submodel.
If the full model predicts 60-70 wt.%, blend the mid and high submodels.
If the full model predicts >70 wt.%, use the high submodel.
Independent component analysis (ICA) is a technique that comes from developments in blind source separation (BSS) research. The goal of ICA is to estimate source signals or loadings, assumed to be stationary, using observed signals that are independent of the unknown mixing of the source signals. ICA is a method of linear transformation in which the representation minimizes the statistical dependence of the components. This is achieved using a criterion from information entropy theory that yields statistical independence, assuming that the data follow a non-Gaussian distribution. As a result, each loading is characterized by many LIBS emission lines of a single element, e.g. Si, and a relationship between the elemental ICA scores of a given spectrum and its composition can be derived. Such relationships, or regression laws, have been derived for each element. The loadings have been verified to be very robust to any sub-sampling of the input database. The regression laws have been determined using an iterative scheme that allows efficient removal of potential outliers. The final best fit regression for each element was selected to satisfy the assumed Martian geological trends.
To retrieve the composition of a given ChemCam spectrum, the Independent Component Regression (ICR) code computes for each element its score from the respective loading given in the file named 'ICA_COMPONENTS.CSV' in the CALIB directory.
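Returning briefly to the SiO2 submodel logic listed above, the selection and blending can be sketched as follows. The thresholds come from the text; the linear ramp of the blend weight across each 10 wt.% blending interval is an assumption suggested by the phrase "linear weighted averages", and the function name is hypothetical. See Clegg et al. (2017) for the actual procedure.

```python
def blend_sio2(full, low, mid, high):
    """Combine SiO2 submodel predictions (wt.%) using the full-model prediction.

    full: prediction of the 'full' (0-100 wt.%) model, used only for selection
    low, mid, high: predictions of the low, mid and high submodels
    """
    if full < 30.0:
        return low
    if full < 40.0:
        # Assumed linear ramp: all 'low' at 30 wt.%, all 'mid' at 40 wt.%.
        w_mid = (full - 30.0) / 10.0
        return (1.0 - w_mid) * low + w_mid * mid
    if full <= 60.0:
        return mid
    if full <= 70.0:
        # Assumed linear ramp: all 'mid' at 60 wt.%, all 'high' at 70 wt.%.
        w_high = (full - 60.0) / 10.0
        return (1.0 - w_high) * mid + w_high * high
    return high
```

For instance, with a full-model prediction of 35 wt.% the result is the equal-weight mean of the low and mid submodel predictions.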
The regression law is then applied to the score to get the composition. The coefficients of the regression laws can be found in the file named 'REGRESSION.CSV', also located in the CALIB directory. A pure average of both PLS and ICA techniques might seem like the best approach. However, in viewing the results, certain elements appeared better suited to one technique or the other. This is probably most apparent for aluminum, where the ICA root mean square error of the training set was significantly higher than for PLS, and ICA appeared to under-estimate relatively high values of Al2O3. As the ICA provided some advantage in the low to middle abundance range, a weighting of 75/25 PLS1/ICA was selected. Iron was the one element where a comparison with APXS was used to suggest that the ICA over-estimated the high-abundance range. The target Square Top (sols 576-583) provided the highest FeOT abundances by APXS of the targets observed in common and which were judged to provide good comparisons based on a) relative lack of dust obscuring the APXS observation, b) relative homogeneity within the ChemCam points, and c) relatively close proximity of the measurements of the two instruments. Based on this point ICA abundances were judged to over-estimate the high-Fe abundances, and so a weighting of 75/25 PLS1/ICA was used instead of a 50/50 mix. As the ICA FeOT abundances increased to the highest values they caused another concern in that the ICA silicon values were depressed to balance the seemingly over-estimated high Fe abundances. The SiO2, which was otherwise a 50/50 average, was raised to a 75/25 mix when the weighted average of FeOT was above 30 wt % to reduce the effect on SiO2 for these values. The alkali elements have the fewest strong emission lines of any of the major elements. Potassium has two strong lines at 767 and 770 nm while sodium has a doublet (unresolved by ChemCam) at 589 nm and another at 820 nm, with a less intense peak at 569 nm. 
Because of the relative paucity of alkali lines, these lines may tend to be 'out-voted' in PLS by strong emission peaks of other elements. On the other hand, ICA focuses almost exclusively on these respective lines for the alkali abundances. The RMS errors of ICA for both the training set and the CCCTs were relatively low. Additionally, the highest PLS values tended to slightly exceed the values of rock-forming minerals, e.g., of minerals in the solid solution between albite, anorthite, and orthoclase. For these reasons the mix was weighted toward ICA: 40/60 PLS1/ICA for Na2O and 25/75 for K2O.
One other issue was noted and corrected. The PLS1 technique tends to work very well in the middle of the range of the training set, but less well at the extremities. As a result, values near zero may be over-estimated. This appeared to be the case for PLS1 models of SiO2 and Al2O3. To counter this tendency, values near zero were more strongly weighted to ICA for these two elements. For Al2O3 the proportion was ramped linearly from 0/100 PLS1/ICA when the weighted average was zero, to the normal proportions of 75/25 at 15 wt% and above. SiO2 was treated similarly, ramping to the normal proportion of 50/50 at a weighted average of 30 wt % (except if FeOT > 30 wt %, as noted above).
To estimate the accuracy of the final, combined results, we selected a test set of training samples for each element that mimics the distribution of compositions in the full database. The PLS model was re-generated with these samples left out, but with all parameters (number of components and normalization) remaining fixed. The ICA and PLS predictions for the test set samples are combined in the same manner as the predictions for unknown samples. These combined predictions for the test set could be used to calculate a single root mean squared error (RMSE) for each element.
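The PLS1/ICA weighting rules given above for Al2O3, FeOT, SiO2, Na2O and K2O can be summarized in a short sketch. The proportions and ramp end points are taken from the text. Two things are assumptions: the text ramps on "the weighted average", which depends on the weights themselves, so this sketch uses the normal-proportion average as the ramp variable; and the function and table names are hypothetical. Proportions for TiO2, MgO and CaO are not stated in this description and are therefore omitted.

```python
# Normal PLS1 share of the PLS1/ICA mix, as stated in the text.
NORMAL_PLS_WEIGHT = {"Al2O3": 0.75, "FeOT": 0.75, "SiO2": 0.50,
                     "Na2O": 0.40, "K2O": 0.25}
# Weighted-average value (wt.%) at which the near-zero ramp reaches normal.
RAMP_END = {"Al2O3": 15.0, "SiO2": 30.0}

def pls_weight(element, reference_wt, feot_wt=0.0):
    """PLS1 share of the mix for one element.

    reference_wt: value used for the near-zero ramp (assumption: the
    normal-proportion weighted average of the PLS1 and ICA predictions).
    feot_wt: weighted-average FeOT, used only for the SiO2 exception.
    """
    normal = NORMAL_PLS_WEIGHT[element]
    if element == "SiO2" and feot_wt > 30.0:
        return 0.75  # high-Fe exception described in the text
    end = RAMP_END.get(element)
    if end is not None and reference_wt < end:
        # Linear ramp from 0/100 PLS1/ICA at zero to the normal proportion.
        return normal * max(reference_wt, 0.0) / end
    return normal

def combine(element, pls, ica, feot_wt=0.0):
    """Weighted average of the PLS1 and ICA predictions (wt.%)."""
    normal = NORMAL_PLS_WEIGHT[element]
    reference = normal * pls + (1.0 - normal) * ica  # assumed ramp variable
    w = pls_weight(element, reference, feot_wt)
    return w * pls + (1.0 - w) * ica
```

For example, `combine("K2O", 4.0, 2.0)` weights the two predictions 25/75 and returns 2.5.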
However, the performance of the calibration varies as a function of target composition, so a more representative estimate of the accuracy is an RMSE that varies with predicted composition. The RMSE as a function of composition is calculated by using a moving window on the test set predictions. The RMSE vs. composition results are Gaussian-smoothed and resampled so that they can be used as a look-up table to estimate the RMSE for a prediction of any composition. The quartiles of the test sets used to calculate RMSE are provided with each output file as a guide for users. When predicted values are near or beyond the extreme ends of the test set quartiles, the values should be treated with caution.
As of data release #29, the MOC table includes additional columns pertaining to the quantification of MnO. The details of the MnO quantification are discussed in Gasda et al. (2021) but we summarize them briefly here. For MnO quantification, the data set of calibration standards used for the major elements was supplemented with additional samples with a wider range of MnO contents. This expanded data set was used to train MnO submodels using PLS, LASSO, and several other multivariate regression algorithms. A key innovation was the development of the 'double blending' approach, in which we initially blend results between a 0-10 wt% LASSO submodel and a 0-129 wt% PLS model. These models have broad enough composition ranges that the distribution model reliably assigns spectra to the correct model. The results of this initial submodel blending are then used as input for a second level of submodel blending between a lower 0-1 wt% LASSO submodel and the initial blended results. With this double blended approach, we achieve a MnO calibration that is sufficiently accurate near zero wt%, with a quantification limit of 0.05 wt% MnO, allowing us to predict most geologic samples on Mars, and that also performs well at high MnO contents.
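For reference, the moving-window RMSE used for the major elements (described earlier in this section) can be sketched as below. This is a simplified illustration only: the window half-width and the grid of window centers are hypothetical parameters not given in this description, and the Gaussian smoothing and resampling steps are omitted.

```python
import math

def moving_window_rmse(predicted, actual, centers, half_width):
    """RMSE of test set predictions as a function of predicted composition.

    predicted, actual: test set predictions and known compositions (wt.%)
    centers: window centers (wt.%) at which to evaluate the RMSE
    half_width: half-width of the moving window (wt.%), a hypothetical value
    Returns a list of (center, rmse) pairs; rmse is None for an empty window.
    """
    table = []
    for center in centers:
        squared_errors = [(p - a) ** 2
                          for p, a in zip(predicted, actual)
                          if abs(p - center) <= half_width]
        if squared_errors:
            rmse = math.sqrt(sum(squared_errors) / len(squared_errors))
        else:
            rmse = None
        table.append((center, rmse))
    return table
```

The resulting table plays the role of the look-up table mentioned in the text; an RMSE for an arbitrary prediction could then be read off by interpolation.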
The RMSE of MnO predictions as a function of predicted weight percent is calculated in a manner similar to that for the major elements, with the difference that rather than using a moving 'window' in weight percent, we instead use the nearest 40 test set predictions. For predictions >5 wt.%, we smooth the RMSE vs wt.% curve, but we do not smooth the curve for predictions below 5 wt.%. Most of the test set data are in this low concentration range, so the unsmoothed curve better captures the true variation of the accuracy. Data : Each ChemCam LIBS RDR data file is a table. Columns are variable length and are delimited with commas. Each row is terminated with a carriage return and line feed character.
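A minimal sketch of reading such a comma-delimited, CR/LF-terminated table with the Python standard library is given below. It assumes nothing about header lines or column meanings, which are defined by the product labels and not by this description; the function name and the sample values are hypothetical.

```python
import csv
import io

def read_table(text):
    """Parse comma-delimited rows terminated by CR/LF into lists of strings."""
    rows = []
    # newline="" leaves the CR/LF terminators for the csv module to handle.
    for row in csv.reader(io.StringIO(text, newline="")):
        if row:  # skip blank lines
            rows.append([field.strip() for field in row])
    return rows

# Hypothetical two-row sample; real files are described by their PDS labels.
sample = "240.811, 1.5E+10\r\n240.865, 2.0E+10\r\n"
rows = read_table(sample)
```

Fields are returned as stripped strings because the columns are variable length; convert them with `float()` only where the label identifies a numeric column.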
|
DATA_SET_RELEASE_DATE |
2013-12-13T00:00:00.000Z
|
START_TIME |
2012-08-06T12:00:00.000Z
|
STOP_TIME |
N/A (ongoing)
|
MISSION_NAME |
MARS SCIENCE LABORATORY
|
MISSION_START_DATE |
2003-10-01T12:00:00.000Z
|
MISSION_STOP_DATE |
N/A (ongoing)
|
TARGET_NAME |
MARS
|
TARGET_TYPE |
PLANET
|
INSTRUMENT_HOST_ID |
MSL
|
INSTRUMENT_NAME |
CHEMISTRY CAMERA LASER INDUCED BREAKDOWN SPECTROMETER
|
INSTRUMENT_ID |
CHEMCAM LIBS
|
INSTRUMENT_TYPE |
SPECTROMETER
|
NODE_NAME |
Geosciences
|
ARCHIVE_STATUS |
|
CONFIDENCE_LEVEL_NOTE |
Confidence Level Overview : CCS: slight issues with continuum removal. See Wiens et al., 2013. Review : This ChemCam archive has undergone a standard Planetary Data System peer review.
|
CITATION_DESCRIPTION |
Wiens, R., MSL ChemCam Laser Induced Breakdown Spectrometer derived data, MSL-M-CHEMCAM-LIBS-5-RDR-V1.0, NASA Planetary Data System, 2013.
|
ABSTRACT_TEXT |
The MSL ChemCam LIBS RDR data set contains calibrated spectra and higher level products derived from raw data collected by the ChemCam Laser Induced Breakdown Spectrometer on the Mars Science Laboratory rover. Standard derived products include summed calibrated spectra (RDR), Intermediate Clean Calibrated Spectra (CCS), and Multivariate Prediction of Oxide Composition (MOC) tables. Special products, which may be generated as needed and as resources permit, are Univariate Prediction of Elemental Composition (UEC) tables, Univariate Prediction of Oxide Composition (UOC) tables, Multivariate Prediction of Elemental Composition (MEC) tables, and Sammon's Map (RSM) tables.
|
PRODUCER_FULL_NAME |
ROGER WIENS
|
SEARCH/ACCESS DATA |
Geosciences Web Services
Geosciences Data Volume Online
MSL Analysts Notebook
|
|