Program utilities
Paleoceanographer applies MAT by using distance algorithms as similarity operators between samples. Users may choose among four commonly used distance operators:
- Euclidean distance:
$d_{ij} = \sqrt{\sum_k (p_{ik} - p_{jk})^2}$
- Manhattan (City Block) distance:
$d_{ij} = \sum_k |p_{ik} - p_{jk}|$
- Squared chord distance:
$d_{ij} = \sum_k (\sqrt{p_{ik}} - \sqrt{p_{jk}})^2$
- Squared Euclidean distance:
$d_{ij} = \sum_k (p_{ik} - p_{jk})^2$
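As an illustration only, the four distance operators can be sketched in a few lines of Python. This is not the program's own code: the function names and the `DISTANCES` lookup table are hypothetical, and the sketch assumes that each sample is a sequence of non-negative relative abundances (the values written p in the formulas), given in the same taxon order for both samples.

```python
import math

# Each sample is assumed to be a sequence of non-negative relative
# abundances, one value per taxon, in the same taxon order for both samples.

def euclidean(p_i, p_j):
    """Square root of the summed squared differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p_i, p_j)))

def manhattan(p_i, p_j):
    """Sum of absolute differences (City Block distance)."""
    return sum(abs(a - b) for a, b in zip(p_i, p_j))

def squared_chord(p_i, p_j):
    """Sum of squared differences between square-rooted abundances."""
    return sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p_i, p_j))

def squared_euclidean(p_i, p_j):
    """Sum of squared differences, without the final square root."""
    return sum((a - b) ** 2 for a, b in zip(p_i, p_j))

# Hypothetical lookup table for selecting an operator by name.
DISTANCES = {
    "euclidean": euclidean,
    "manhattan": manhattan,
    "squared_chord": squared_chord,
    "squared_euclidean": squared_euclidean,
}
```

Because the square root compresses large abundances more than small ones, the squared chord distance gives rare taxa relatively more influence than the Euclidean variants do.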
The program provides three main utilities: Autoevaluation, Parameters, and Analogs.
- Autoevaluation estimates the oceanographic variables for each sample of the calibration database, so that estimated and original values can be compared directly and the accuracy of the selected distance algorithm can be evaluated. During Autoevaluation, the program temporarily removes the core-top sample under analysis from the calibration set (leave-one-out cross-validation). With the PaleoUma calibration database, the squared chord distance gives the most accurate results. Users who supply a different calibration database are strongly advised to run Autoevaluation first, in order to identify the most accurate similarity algorithm before using Parameters or Analogs. To run Autoevaluation, a copy of the calibration database must be supplied as the sample dataset, with the columns containing oceanographic and geographic variables removed.
- Parameters estimates three sea-surface oceanographic variables for fossil samples whose assemblage composition (taxonomic data) has been quantified: annual mean temperature (SST), Seasonality, and annual mean salinity (SSS). By default, the program selects the 10 closest samples from the calibration dataset and calculates the mean of their values, weighted inversely by distance.
- Analogs identifies the modern analogs in the calibration dataset that are most similar to each fossil sample. The output also reports the distance between the fossil sample and each modern analog, which indicates their degree of similarity. By default, the program selects the three closest analogs, although this number is customizable.
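The logic of the three utilities can be sketched as follows. This is a minimal illustration under stated assumptions, not the program's implementation. The data layout (a list of dictionaries with a `"taxa"` key) and all function names are hypothetical. The weighting is assumed to be 1/distance, since the text specifies only that the mean is weighted inversely by distance. The handling of an exact match (distance zero) is likewise an assumption, made to avoid division by zero.

```python
import math

def squared_chord(p_i, p_j):
    """Squared chord distance between two abundance vectors."""
    return sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p_i, p_j))

def closest_analogs(fossil, calibration, k=3, distance=squared_chord):
    """Analogs: return the k nearest calibration samples as
    (distance, index) pairs, closest first."""
    ranked = sorted(
        (distance(fossil, row["taxa"]), idx)
        for idx, row in enumerate(calibration)
    )
    return ranked[:k]

def estimate(fossil, calibration, variable, k=10, distance=squared_chord):
    """Parameters: mean of the k closest analogs, weighted inversely by
    distance. The 1/d weighting is an assumption of this sketch."""
    analogs = closest_analogs(fossil, calibration, k, distance)
    # Assumption: identical assemblages (distance 0) are used directly,
    # because 1/d is undefined for them.
    exact = [calibration[i][variable] for d, i in analogs if d == 0.0]
    if exact:
        return sum(exact) / len(exact)
    weights = [1.0 / d for d, _ in analogs]
    values = [calibration[i][variable] for _, i in analogs]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)

def autoevaluate(calibration, variable, k=10, distance=squared_chord):
    """Autoevaluation: leave-one-out cross-validation. Each calibration
    sample is estimated from all the others; returns a list of
    (original, estimated) pairs."""
    pairs = []
    for idx, row in enumerate(calibration):
        rest = calibration[:idx] + calibration[idx + 1:]
        estimated = estimate(row["taxa"], rest, variable, k, distance)
        pairs.append((row[variable], estimated))
    return pairs

# Tiny made-up calibration set, for illustration only (not real data).
calibration = [
    {"taxa": [1.0, 0.0], "sst": 10.0},
    {"taxa": [0.0, 1.0], "sst": 20.0},
    {"taxa": [0.25, 0.75], "sst": 18.0},
]
print(closest_analogs([0.5, 0.5], calibration, k=2))
print(autoevaluate(calibration, "sst", k=2))
```

Passing a different function as `distance` and comparing the resulting (original, estimated) pairs mirrors the recommended workflow of running Autoevaluation before choosing a similarity algorithm.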