Core Analytical Features & Capabilities

A detailed technical breakdown of the functional capabilities of the XPS Analyzer toolkit, from raw data parsing to empirical quantification.

Technical Stack

Python SciPy Pandas

Functional Overview

The XPS Analyzer package provides a complete, programmatically accessible pipeline for processing X-ray Photoelectron Spectroscopy datasets. The library abstracts the repetitive numerical operations required in surface science, enforcing strict mathematical and physical constraints at every step of the analytical workflow.

The sections below describe each core function implemented in the library.

1. Automated Data Parsing & Ingestion

Spectrometer output files are often generated in proprietary, mixed-text formats containing both instrumental metadata and numerical arrays.

The data_loader module implements a custom parser that reads semicolon-delimited textual exports. It utilizes a multi-criteria detection algorithm to differentiate between broad “Survey” scans and high-resolution “Multiplex” regional scans.

Upon parsing, the module extracts instrumental parameters (e.g., pass energy, dwell time, X-ray source) and instantiates XPSSpectrum and XPSDataset Pydantic models. This ensures that downstream functions receive strictly typed numpy.ndarray objects rather than raw string representations.

from xps_analyzer.data_loader import load_single_file

# The parser automatically detects the format (Survey vs. Multiplex)
# and populates the dataset.header dictionary with instrumental metadata.
dataset = load_single_file("data/raw/BN-SET-01/BN-BS-3/BN-BS-3 MULTIPLEX.txt")

# Retrieve a specific region as an explicitly typed XPSSpectrum object
ti2p_spectrum = dataset.get_spectrum("Ti 2p")

2. Spectroscopic Energy Calibration

Due to electrostatic charging of non-conductive samples during X-ray irradiation, the entire kinetic energy spectrum often shifts, requiring post-acquisition recalibration.

The preprocessing module provides the calibrate_dataset function, which accepts a reference element and its theoretical binding energy. The algorithm:

  1. Locates the specified reference region within the dataset.
  2. Identifies the binding energy corresponding to the maximum intensity peak ($E_{obs}$).
  3. Calculates the required shift: $\Delta E = E_{ref} - E_{obs}$.
  4. Applies this uniform scalar shift to the binding_energy arrays of all spectra within the XPSDataset.

from xps_analyzer.preprocessing.calibration import calibrate_dataset

# Calibrate the entire dataset using an internal reference peak.
# By setting inplace=False (default), it returns a new deeply copied dataset.
calibrated_dataset = calibrate_dataset(
    dataset, 
    reference_element="O", 
    reference_energy=530.0  # Theoretical lattice oxygen binding energy
)
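
The four calibration steps can be sketched with plain NumPy. This is only an illustration under stated assumptions: the helper calibrate_regions and the dictionary layout are hypothetical stand-ins, not the library's calibrate_dataset implementation, and the spectra are synthetic.

```python
import numpy as np

def calibrate_regions(regions, reference_name, reference_energy):
    """Hypothetical helper: shift every binding-energy axis by one scalar.

    regions maps a region name to a (binding_energy, intensity) pair of arrays.
    Returns (shifted_regions, delta_e); the input arrays are left untouched.
    """
    ref_energy_axis, ref_intensity = regions[reference_name]   # step 1: locate reference
    e_obs = ref_energy_axis[np.argmax(ref_intensity)]          # step 2: observed peak position
    delta_e = reference_energy - e_obs                         # step 3: Delta E = E_ref - E_obs
    shifted = {
        name: (energy + delta_e, intensity.copy())             # step 4: uniform shift
        for name, (energy, intensity) in regions.items()
    }
    return shifted, delta_e

# Synthetic example: both peaks are displaced by +2.4 eV due to charging.
be_o = np.linspace(525.0, 540.0, 151)
be_ti = np.linspace(450.0, 470.0, 201)
regions = {
    "O 1s": (be_o, np.exp(-0.5 * ((be_o - 532.4) / 0.8) ** 2)),
    "Ti 2p": (be_ti, np.exp(-0.5 * ((be_ti - 461.2) / 0.9) ** 2)),
}

shifted, delta_e = calibrate_regions(regions, "O 1s", 530.0)
ti_energy, ti_intensity = shifted["Ti 2p"]
ti_peak_position = ti_energy[np.argmax(ti_intensity)]
```

Because the shift is a single scalar, the spacing between peaks in different regions is preserved; only the absolute position of the energy axis changes.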

3. Algorithmic Background Subtraction

The analysis.background module exposes three discrete mathematical models for the removal of the inelastic scattering tail:

  • Shirley Background: An iterative numerical integral evaluated against a convergence tolerance (tol=1e-5). Recommended for transition metals exhibiting sharp, step-like inelastic tails.
  • Tougaard Background: Computes the background using a universal inelastic scattering cross-section. The function accepts the empirical constants $B$, $C$, and $D$, defaulting to the universal parameters for transition metals ($B = 2866\ \mathrm{eV}^2$, $C = 1643\ \mathrm{eV}^2$).
  • Linear Background: Computes a simple secant line between the integration bounds, utilized primarily for flat spectral regions with low signal-to-noise ratios.

All background functions append their computed arrays into the XPSSpectrum.metadata dictionary, preserving the raw intensity array for auditing.
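
As a rough illustration of the Shirley and linear models listed above, the sketch below implements both with NumPy. The function names, the max_iter argument, and the assumption that the energy axis is sorted by ascending binding energy are choices made for this example and may differ from analysis.background; only the tolerance value tol=1e-5 is taken from the text.

```python
import numpy as np

def linear_background(energy, intensity):
    """Secant line through the first and last points of the region."""
    fraction = (energy - energy[0]) / (energy[-1] - energy[0])
    return intensity[0] + (intensity[-1] - intensity[0]) * fraction

def shirley_background(energy, intensity, tol=1e-5, max_iter=50):
    """Iterative Shirley background (sketch).

    Assumes energy is ascending binding energy, so the background at each
    point is proportional to the peak area accumulated at lower binding energy.
    """
    energy = np.asarray(energy, dtype=float)
    intensity = np.asarray(intensity, dtype=float)
    start, end = intensity[0], intensity[-1]
    background = linear_background(energy, intensity)   # initial guess
    for _ in range(max_iter):
        signal = intensity - background
        # Cumulative trapezoidal integral of the background-subtracted signal.
        steps = 0.5 * (signal[1:] + signal[:-1]) * np.diff(energy)
        cumulative = np.concatenate(([0.0], np.cumsum(steps)))
        if cumulative[-1] == 0.0:
            break
        updated = start + (end - start) * cumulative / cumulative[-1]
        converged = np.max(np.abs(updated - background)) < tol
        background = updated
        if converged:
            break
    return background

# Synthetic region: one Gaussian peak sitting on a step-like inelastic tail.
energy = np.linspace(452.0, 464.0, 241)
peak = 100.0 * np.exp(-0.5 * ((energy - 458.0) / 1.0) ** 2)
area_steps = 0.5 * (peak[1:] + peak[:-1]) * np.diff(energy)
cdf = np.concatenate(([0.0], np.cumsum(area_steps)))
cdf /= cdf[-1]
true_background = 10.0 + 5.0 * cdf
intensity = peak + true_background

background = shirley_background(energy, intensity)
corrected = intensity - background   # raw intensity array stays unchanged
```

The function returns a new array rather than overwriting the input, which mirrors the text's point that the raw intensity is kept for auditing.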

4. Non-Linear Peak Deconvolution

The analysis.peak_fitting module provides the mathematical engine for isolating overlapping electronic states.

It exposes functions for modeling individual profiles (fit_gaussian, fit_lorentzian, fit_voigt) and a generalized fit_multiple_peaks function for complex multiplets. The module utilizes the Levenberg-Marquardt algorithm via scipy.optimize.curve_fit to minimize the residual sum of squares ($\chi^2$).

The optimization routine handles:

  • Automatic Parameter Estimation: Derives initial guesses for peak centroids ($E_0$), amplitudes, and FWHM by analyzing the second derivative of the smoothed data.
  • Spin-Orbit Doublet Constraints: When fitting elements like Ti, Sr, or Bi, the algorithm can enforce strict theoretical constraints on the energy splitting ($\Delta E$) and the intensity ratios (e.g., 2:1 for $p$ orbitals).

The function returns a strictly typed FitResult model containing the optimized PeakParameters, the calculated residual array, and statistical goodness-of-fit metrics ($R^2$, reduced $\chi^2$).
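
One way to impose a spin-orbit constraint with scipy.optimize.curve_fit is to build it into the model function, so that the minor component has no free parameters of its own. The sketch below does this for a Gaussian doublet on synthetic data. The splitting of 5.7 eV, the equal-width assumption, and all function names are illustrative assumptions, not values or APIs taken from analysis.peak_fitting.

```python
import numpy as np
from scipy.optimize import curve_fit

SPLITTING_EV = 5.7   # assumed 2p doublet splitting, for illustration only
AREA_RATIO = 0.5     # 2p1/2 : 2p3/2 intensity ratio (1:2)

def gaussian(x, amplitude, center, fwhm):
    sigma = fwhm / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    return amplitude * np.exp(-0.5 * ((x - center) / sigma) ** 2)

def doublet(x, amplitude, center, fwhm):
    """Two Gaussians tied by a fixed splitting and a fixed intensity ratio.

    Equal widths are assumed, so the amplitude ratio equals the area ratio.
    """
    main = gaussian(x, amplitude, center, fwhm)
    minor = gaussian(x, AREA_RATIO * amplitude, center + SPLITTING_EV, fwhm)
    return main + minor

# Synthetic, background-free spectrum with known noise level.
rng = np.random.default_rng(0)
x = np.linspace(452.0, 470.0, 361)
noise_sigma = 5.0
y = doublet(x, 1000.0, 458.8, 1.2) + rng.normal(0.0, noise_sigma, x.size)

# Crude initial guess from the data maximum; without bounds, curve_fit
# defaults to the Levenberg-Marquardt method.
p0 = (y.max(), x[np.argmax(y)], 1.0)
popt, pcov = curve_fit(doublet, x, y, p0=p0)

residuals = y - doublet(x, *popt)
r_squared = 1.0 - np.sum(residuals ** 2) / np.sum((y - y.mean()) ** 2)
reduced_chi2 = np.sum((residuals / noise_sigma) ** 2) / (x.size - len(popt))
```

Tying the components inside the model reduces six free parameters to three, which tends to stabilize fits of overlapping states. Note that curve_fit switches away from Levenberg-Marquardt when parameter bounds are supplied.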

5. Empirical Atomic Quantification

The area integrated under a deconvoluted peak is not an absolute measure of atomic concentration; it must be normalized against the probability of photoemission.

The analysis.quantification module integrates databases of Relative Sensitivity Factors (RSF). It supports:

  • Scofield Theoretical Cross-Sections (1976): Tabulated for both Mg Kα and Al Kα X-ray sources across 89 elements.
  • Wagner Empirical Factors (1981): Derived experimentally, available for 18 common elements.

The calculate_atomic_concentration function takes a list of PeakParameters and computes the normalized fractional composition, providing the final quantitative output of the XPS analysis pipeline.

from xps_analyzer.analysis.quantification import (
    load_sensitivity_factors, 
    calculate_atomic_concentration
)

# Load Scofield theoretical sensitivity factors for an Mg Kα X-ray source
rsf_db = load_sensitivity_factors(source="scofield", xray_source="mg_ka")

# Compute the fractional atomic concentration from the integrated peak areas.
# ti_peak, o_peak, and sr_peak are PeakParameters objects from earlier fits.
concentrations = calculate_atomic_concentration(
    peaks=[ti_peak, o_peak, sr_peak],
    sensitivity_factors=rsf_db,
    element_names=["Ti 2p", "O 1s", "Sr 3d"]
)
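
The normalization itself is the standard relation $x_i = (A_i / S_i) / \sum_j (A_j / S_j)$, where $A_i$ is the integrated peak area and $S_i$ the sensitivity factor of element $i$. A minimal sketch follows; atomic_fractions is a hypothetical helper, and the areas and factors are placeholder numbers chosen for easy arithmetic, not real RSF values.

```python
def atomic_fractions(areas, sensitivity_factors):
    """Normalize peak areas by sensitivity factors and rescale to sum to 1."""
    normalized = {
        name: areas[name] / sensitivity_factors[name] for name in areas
    }
    total = sum(normalized.values())
    return {name: value / total for name, value in normalized.items()}

# Placeholder inputs for two unnamed elements "A" and "B".
areas = {"A": 200.0, "B": 300.0}
factors = {"A": 2.0, "B": 1.0}

fractions = atomic_fractions(areas, factors)
# A: (200 / 2) / (100 + 300) = 0.25, B: (300 / 1) / 400 = 0.75
```

The result depends only on ratios of sensitivity factors, so all factors must come from the same database and the same X-ray source.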

6. Data Serialization and Export

To interface with external statistical tools or publication pipelines, the export.exporters module provides serialization routines for both isolated XPSSpectrum objects and complete XPSDataset collections.

  • JSON Serialization: Uses a custom NumpyEncoder that converts numpy.ndarray structures into standard JSON arrays and maps NaN and Inf values to null, preserving the complete nested structure of the analysis metadata.
  • CSV/Excel Export: Flattens the hierarchical data into tabular form using pandas. When a full dataset is exported to Excel, each spectral region is written to its own worksheet, keeping the output compatible with standard laboratory workflows.
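
The NaN and Inf handling deserves a short illustration. Python's json module calls JSONEncoder.default only for objects it cannot serialize, so plain float NaN values never reach it. The sketch below therefore sanitizes the structure before encoding. The to_jsonable helper is hypothetical and shows one possible approach; the library's NumpyEncoder may work differently.

```python
import json
import math

import numpy as np

def to_jsonable(value):
    """Recursively convert NumPy objects and non-finite floats for strict JSON."""
    if isinstance(value, np.ndarray):
        return to_jsonable(value.tolist())      # arrays become nested lists
    if isinstance(value, np.generic):
        return to_jsonable(value.item())        # NumPy scalars become Python scalars
    if isinstance(value, float):
        return value if math.isfinite(value) else None   # NaN/Inf become null
    if isinstance(value, dict):
        return {str(key): to_jsonable(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_jsonable(item) for item in value]
    return value

payload = {
    "region": "Ti 2p",
    "intensity": np.array([1.0, np.nan, np.inf]),
    "metadata": {"n_points": np.int64(3)},
}

# allow_nan=False makes json.dumps raise if a non-finite float slipped through.
text = json.dumps(to_jsonable(payload), allow_nan=False)
decoded = json.loads(text)
```

Passing allow_nan=False acts as a safety net: the default behavior of json.dumps would otherwise emit the bare tokens NaN and Infinity, which strict JSON parsers reject.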