Core Analytical Features & Capabilities
A detailed technical breakdown of the functional capabilities of the XPS Analyzer toolkit, from raw data parsing to empirical quantification.
Functional Overview
The XPS Analyzer package provides a complete, programmatically accessible pipeline for processing X-ray Photoelectron Spectroscopy datasets. The library abstracts the repetitive numerical operations required in surface science, enforcing strict mathematical and physical constraints at every step of the analytical workflow.
The sections below describe the core functionality implemented in the library.
1. Automated Data Parsing & Ingestion
Spectrometer output files are often generated in proprietary, mixed-text formats containing both instrumental metadata and numerical arrays.
The data_loader module implements a custom parser that reads semicolon-delimited textual exports. It utilizes a multi-criteria detection algorithm to differentiate between broad “Survey” scans and high-resolution “Multiplex” regional scans.
Upon parsing, the module extracts instrumental parameters (e.g., pass energy, dwell time, X-ray source) and instantiates XPSSpectrum and XPSDataset Pydantic models. This ensures that downstream functions receive strictly typed numpy.ndarray objects rather than raw string representations.
from xps_analyzer.data_loader import load_single_file
# The parser automatically detects the format (Survey vs. Multiplex)
# and populates the dataset.header dictionary with instrumental metadata.
dataset = load_single_file("data/raw/BN-SET-01/BN-BS-3/BN-BS-3 MULTIPLEX.txt")
# Retrieve a specific region as an explicitly typed XPSSpectrum object
ti2p_spectrum = dataset.get_spectrum("Ti 2p")
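To make the parsing step more concrete, here is a minimal sketch of how a semicolon-delimited export could be split into metadata and numerical arrays. The file layout (`key=value` header lines followed by `energy;intensity` rows), the function names, and the single span-based detection rule are all assumptions for illustration; the library's actual parser and its multi-criteria detection are not shown in this document.

```python
import numpy as np


def parse_semicolon_block(text):
    """Split a mixed-text export into a header dict and two float arrays.

    Hypothetical layout: 'key=value' header lines, then 'energy;intensity' rows.
    """
    header, energy, intensity = {}, [], []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if "=" in line and ";" not in line:
            key, value = line.split("=", 1)
            header[key.strip()] = value.strip()
            continue
        parts = line.split(";")
        try:
            e, i = float(parts[0]), float(parts[1])
        except (ValueError, IndexError):
            continue  # skip column titles or malformed rows
        energy.append(e)
        intensity.append(i)
    return header, np.asarray(energy, dtype=float), np.asarray(intensity, dtype=float)


def guess_scan_type(energy, survey_min_span_ev=200.0):
    """Classify by energy span only (an assumed, single-criterion stand-in)."""
    span = float(energy.max() - energy.min()) if energy.size else 0.0
    return "survey" if span >= survey_min_span_ev else "multiplex"


# Made-up sample content, not taken from a real spectrometer file
sample = """pass_energy=23.5
dwell_time=0.1
Binding Energy;Intensity
464.0;120.0
458.5;980.0
455.0;150.0
"""
header, energy, intensity = parse_semicolon_block(sample)
scan_type = guess_scan_type(energy)
```

The threshold of 200 eV is an arbitrary illustrative value; a narrow regional scan falls well below it, while a survey scan typically spans many hundreds of eV.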
2. Spectroscopic Energy Calibration
Due to electrostatic charging of non-conductive samples during X-ray irradiation, the entire kinetic energy spectrum often shifts, requiring post-acquisition recalibration.
The preprocessing module provides the calibrate_dataset function, which accepts a reference element and its theoretical binding energy. The algorithm:
- Locates the specified reference region within the dataset.
- Identifies the binding energy corresponding to the maximum intensity peak ($E_{\mathrm{measured}}$).
- Calculates the required shift: $\Delta E = E_{\mathrm{ref}} - E_{\mathrm{measured}}$.
- Applies this uniform scalar shift to the binding_energy arrays of all spectra within the XPSDataset.
from xps_analyzer.preprocessing.calibration import calibrate_dataset
# Calibrate the entire dataset using an internal reference peak.
# By setting inplace=False (default), it returns a new deeply copied dataset.
calibrated_dataset = calibrate_dataset(
    dataset,
    reference_element="O",
    reference_energy=530.0,  # Theoretical lattice oxygen binding energy
)
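The calibration steps listed above can be sketched with plain NumPy arrays. This is an illustration of the arithmetic only, assuming spectra are held as `(binding_energy, intensity)` tuples in a dictionary; `compute_shift` and `apply_shift` are hypothetical helper names, not part of the library.

```python
import numpy as np


def compute_shift(binding_energy, intensity, reference_energy):
    """Shift that moves the most intense point onto the reference energy."""
    measured = binding_energy[np.argmax(intensity)]
    return reference_energy - measured


def apply_shift(spectra, shift):
    """Return new arrays with the same scalar shift added to every region."""
    return {
        name: (energy + shift, counts.copy())
        for name, (energy, counts) in spectra.items()
    }


# Made-up example data: the O 1s maximum sits 1 eV above the reference value
be_o = np.array([528.0, 529.0, 530.0, 531.0, 532.0])
int_o = np.array([10.0, 40.0, 90.0, 300.0, 60.0])
be_ti = np.array([456.0, 458.0, 460.0])
int_ti = np.array([20.0, 200.0, 30.0])
spectra = {"O 1s": (be_o, int_o), "Ti 2p": (be_ti, int_ti)}

shift = compute_shift(be_o, int_o, reference_energy=530.0)
calibrated = apply_shift(spectra, shift)
```

Because `apply_shift` builds new arrays, the original spectra stay untouched, which mirrors the copy-returning behaviour described for `inplace=False`.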
3. Algorithmic Background Subtraction
The analysis.background module exposes three discrete mathematical models for the removal of the inelastic scattering tail:
- Shirley Background: An iterative numerical integral evaluated against a convergence tolerance (tol=1e-5). Recommended for transition metals exhibiting sharp, step-like inelastic tails.
- Tougaard Background: Computes the background using a universal inelastic scattering cross-section. The function accepts the empirical constants $B$ and $C$, defaulting to the universal parameters for transition metals.
- Linear Background: Computes a simple secant line between the integration bounds, utilized primarily for flat spectral regions with low signal-to-noise ratios.
All background functions append their computed arrays into the XPSSpectrum.metadata dictionary, preserving the raw intensity array for auditing.
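As an illustration of two of these models, the sketch below implements a linear background and a textbook iterative Shirley background with NumPy. It is not the library's code: the function names, the `max_iter` parameter, the relative convergence test, and the metadata keys are assumptions, and the Tougaard model is left out for brevity.

```python
import numpy as np


def linear_background(x, y):
    """Secant line through the first and last points of the region."""
    return y[0] + (y[-1] - y[0]) * (x - x[0]) / (x[-1] - x[0])


def cumulative_trapezoid(x, y):
    """Running trapezoidal integral of y over x, starting at zero."""
    steps = 0.5 * (y[1:] + y[:-1]) * np.diff(x)
    return np.concatenate(([0.0], np.cumsum(steps)))


def shirley_background(x, y, tol=1e-5, max_iter=100):
    """Iterate until the background changes by less than tol * max|y|."""
    y_start, y_end = y[0], y[-1]
    background = np.full_like(y, y_start, dtype=float)
    scale = np.max(np.abs(y))
    for _ in range(max_iter):
        # Background at each point scales with the peak area accumulated so far
        running = cumulative_trapezoid(x, y - background)
        if running[-1] == 0.0:
            break
        updated = y_start + (y_end - y_start) * running / running[-1]
        converged = np.max(np.abs(updated - background)) <= tol * scale
        background = updated
        if converged:
            break
    return background


# Synthetic region: a Gaussian peak sitting on a step-like background
x = np.linspace(450.0, 470.0, 201)
peak = 100.0 * np.exp(-0.5 * (x - 460.0) ** 2)
area = cumulative_trapezoid(x, peak)
true_background = 5.0 + 10.0 * area / area[-1]
intensity = peak + true_background

# Keep the raw intensity untouched and store results separately
metadata = {}
metadata["background_shirley"] = shirley_background(x, intensity)
metadata["background_linear"] = linear_background(x, intensity)
```

Storing the computed arrays under separate keys, rather than overwriting `intensity`, follows the same auditing idea described above.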
4. Non-Linear Peak Deconvolution
The analysis.peak_fitting module provides the mathematical engine for isolating overlapping electronic states.
It exposes functions for modeling individual profiles (fit_gaussian, fit_lorentzian, fit_voigt) and a generalized fit_multiple_peaks function for complex multiplets. The module uses the Levenberg-Marquardt algorithm via scipy.optimize.curve_fit to minimize the residual sum of squares ($\mathrm{RSS} = \sum_i (y_i - \hat{y}_i)^2$).
The optimization routine handles:
- Automatic Parameter Estimation: Derives initial guesses for peak centroids, amplitudes, and FWHM by analyzing the second derivative of the smoothed data.
- Spin-Orbit Doublet Constraints: When fitting elements like Ti, Sr, or Bi, the algorithm can enforce strict theoretical constraints on the energy splitting ($\Delta E_{\mathrm{SO}}$) and the intensity ratios (e.g., 2:1 for $p$-orbitals).
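One way to picture the doublet constraint is to build the model so that only the major component has free parameters, with the minor component derived from a fixed splitting and intensity ratio. The sketch below does this with Gaussian profiles and `scipy.optimize.curve_fit`. The helper names, the 5.7 eV splitting, and the shared FWHM are illustrative assumptions; the initial guess is taken from the maximum of the data rather than from a second-derivative analysis.

```python
import numpy as np
from scipy.optimize import curve_fit


def gaussian(x, amplitude, center, fwhm):
    """Gaussian profile parameterized by its full width at half maximum."""
    sigma = fwhm / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    return amplitude * np.exp(-0.5 * ((x - center) / sigma) ** 2)


def make_doublet(splitting, ratio):
    """Build a model whose minor peak is tied to the major peak."""
    def doublet(x, amplitude, center, fwhm):
        major = gaussian(x, amplitude, center, fwhm)
        minor = gaussian(x, amplitude / ratio, center + splitting, fwhm)
        return major + minor
    return doublet


SPLITTING_EV = 5.7  # assumed illustrative value for a 2p doublet
RATIO = 2.0         # 2:1 intensity ratio for p-orbitals

model = make_doublet(SPLITTING_EV, RATIO)

# Synthetic noisy data generated from known parameters
x = np.linspace(450.0, 470.0, 401)
rng = np.random.default_rng(0)
y = model(x, 1000.0, 458.6, 1.2) + rng.normal(0.0, 5.0, x.size)

# Simple initial guess from the data maximum
p0 = [y.max(), x[np.argmax(y)], 1.0]
popt, pcov = curve_fit(model, x, y, p0=p0)
amplitude_fit, center_fit, fwhm_fit = popt
```

Because the splitting and ratio are baked into the model rather than fitted, the optimizer searches three parameters instead of six, which tends to make overlapping doublets far better conditioned. Without bounds, `curve_fit` defaults to the Levenberg-Marquardt method.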
Each fit returns a strictly typed FitResult model containing the optimized PeakParameters, the calculated residual array, and statistical goodness-of-fit metrics ($R^2$, reduced $\chi^2$).
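For reference, the coefficient of determination and the reduced chi-square can be computed from a fitted curve as sketched below. The function name and the Poisson-style default uncertainty (square root of the counts) are assumptions made for this example; how the library weights residuals is not specified here.

```python
import numpy as np


def goodness_of_fit(y, y_fit, n_params, sigma=None):
    """Return (residuals, R^2, reduced chi-square) for a fitted curve."""
    residuals = y - y_fit
    ss_res = float(np.sum(residuals ** 2))
    ss_tot = float(np.sum((y - np.mean(y)) ** 2))
    r_squared = 1.0 - ss_res / ss_tot
    if sigma is None:
        # Assumed counting statistics: uncertainty ~ sqrt(counts), floored at 1
        sigma = np.sqrt(np.clip(y, 1.0, None))
    dof = y.size - n_params  # degrees of freedom
    reduced_chi2 = float(np.sum((residuals / sigma) ** 2)) / dof
    return residuals, r_squared, reduced_chi2


# Tiny made-up example with unit uncertainties
y_obs = np.array([10.0, 20.0, 30.0, 40.0])
y_model = np.array([11.0, 19.0, 31.0, 39.0])
residuals, r_squared, reduced_chi2 = goodness_of_fit(
    y_obs, y_model, n_params=2, sigma=np.ones(4)
)
```

A reduced chi-square near 1 suggests the residuals are consistent with the assumed noise level, while values far above 1 usually point to a missing component or an inadequate line shape.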
5. Empirical Atomic Quantification
The area integrated under a deconvoluted peak is not an absolute measure of atomic concentration; it must be normalized against the probability of photoemission.
The analysis.quantification module integrates databases of Relative Sensitivity Factors (RSF). It supports:
- Scofield Theoretical Cross-Sections (1976): Tabulated for both Mg Kα and Al Kα X-ray sources across 89 elements.
- Wagner Empirical Factors (1981): Derived experimentally, available for 18 common elements.
The calculate_atomic_concentration function takes a list of PeakParameters and computes the normalized fractional composition, providing the final quantitative output of the XPS analysis pipeline.
from xps_analyzer.analysis.quantification import (
    load_sensitivity_factors,
    calculate_atomic_concentration,
)
# Load Scofield theoretical sensitivity factors for an Mg Kα X-ray source
rsf_db = load_sensitivity_factors(source="scofield", xray_source="mg_ka")
# Compute the fractional atomic concentration from the integrated peak areas.
# ti_peak, o_peak, and sr_peak are PeakParameters objects from earlier fits.
concentrations = calculate_atomic_concentration(
    peaks=[ti_peak, o_peak, sr_peak],
    sensitivity_factors=rsf_db,
    element_names=["Ti 2p", "O 1s", "Sr 3d"],
)
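The normalization behind this step is the standard relation in which each peak area is divided by its sensitivity factor and the results are scaled to sum to one. A minimal sketch follows; `atomic_fractions` is a hypothetical helper, and the areas and sensitivity factors are placeholder numbers chosen for easy arithmetic, not values from the Scofield or Wagner tables.

```python
def atomic_fractions(areas, sensitivity_factors):
    """Normalize peak areas by RSF and scale the result to sum to 1."""
    normalized = {
        name: areas[name] / sensitivity_factors[name] for name in areas
    }
    total = sum(normalized.values())
    return {name: value / total for name, value in normalized.items()}


# Placeholder inputs for illustration only
areas = {"Ti 2p": 1200.0, "O 1s": 2400.0, "Sr 3d": 900.0}
rsf = {"Ti 2p": 2.0, "O 1s": 1.0, "Sr 3d": 1.5}

fractions = atomic_fractions(areas, rsf)
```

With these placeholder inputs the normalized areas are 600, 2400, and 600, so oxygen accounts for two thirds of the composition and the two metals for one sixth each.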
6. Data Serialization and Export
To interface with external statistical tools or publication pipelines, the export.exporters module provides serialization routines for both isolated XPSSpectrum objects and complete XPSDataset collections.
- JSON Serialization: Uses a custom NumpyEncoder that safely casts numpy.ndarray structures into standard JSON arrays, translating NaN and Inf values to strict null equivalents and preserving the complete nested structure of the analysis metadata.
- CSV/Excel Export: Unrolls the hierarchical data into strictly typed tabular formats using pandas as the underlying engine. When exporting a full dataset to Excel, the algorithm dynamically maps each spectral region to an isolated workbook sheet, ensuring compatibility with standard laboratory workflows.
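The NaN-to-null translation deserves a note, because Python's `json` module never routes plain floats through `JSONEncoder.default`, so an encoder subclass alone would still emit non-standard `NaN` tokens. The sketch below shows one possible approach that sanitizes the structure before encoding. It is an assumption about how such a conversion could work, with hypothetical helper names, and is not the library's NumpyEncoder.

```python
import json
import math

import numpy as np


def sanitize(value):
    """Recursively convert NumPy types and replace non-finite floats with None."""
    if isinstance(value, np.ndarray):
        return [sanitize(item) for item in value.tolist()]
    if isinstance(value, np.generic):
        return sanitize(value.item())
    if isinstance(value, dict):
        return {str(key): sanitize(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [sanitize(item) for item in value]
    if isinstance(value, float) and not math.isfinite(value):
        return None
    return value


def dumps_strict(obj):
    """Serialize to strict JSON; allow_nan=False guards against leftovers."""
    return json.dumps(sanitize(obj), allow_nan=False)


# Made-up payload mixing arrays, NumPy scalars, and nested metadata
payload = {
    "intensity": np.array([1.0, np.nan, np.inf]),
    "n_points": np.int64(3),
    "metadata": {"fwhm": np.float64(1.2)},
}
text = dumps_strict(payload)
decoded = json.loads(text)
```

Passing `allow_nan=False` makes the encoder raise instead of silently writing invalid JSON if any non-finite value slips through.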