XPS Analyzer: Python Scientific Toolkit
A mathematically rigorous, type-safe Python package for X-ray Photoelectron Spectroscopy (XPS) analysis, featuring Pydantic validation.
Package Overview
XPS Analyzer is an open-source Python library designed for the automated processing of X-ray Photoelectron Spectroscopy (XPS) data. The toolkit provides a programmatic pipeline for raw data ingestion, inelastic background subtraction, non-linear peak deconvolution, and atomic quantification using empirical sensitivity factors.
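The quantification step mentioned above is commonly expressed as each peak area divided by its relative sensitivity factor, normalized over all elements. The sketch below illustrates that standard relation only; the function name `atomic_percentages`, the element labels, and every number are hypothetical placeholders, not the package's API, measured data, or tabulated factors.

```python
def atomic_percentages(areas, sensitivity_factors):
    """Return atomic percentages from peak areas and relative sensitivity factors."""
    # Scale each area by its sensitivity factor so elements become comparable
    normalized = {name: areas[name] / sensitivity_factors[name] for name in areas}
    total = sum(normalized.values())
    return {name: 100.0 * value / total for name, value in normalized.items()}


# Illustrative placeholder inputs (not measured data or reference factors)
areas = {"A": 1000.0, "B": 500.0}
sensitivity_factors = {"A": 2.0, "B": 0.5}

percentages = atomic_percentages(areas, sensitivity_factors)
for name, value in percentages.items():
    print(f"{name}: {value:.1f} at.%")
```

The result depends directly on which sensitivity factor table is used, so the same areas can give different compositions with a different empirical database.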
Design Principles
The library is built upon three foundational software engineering principles to ensure data integrity in scientific computing:
- Immutability: Operations that transform numerical arrays (such as background subtraction or energy calibration) return deep copies by default (model_copy(deep=True)). This prevents accidental mutation of raw experimental data.
- Runtime Validation: The core data structures inherit from Pydantic v2 base models. Custom validators are enforced at runtime, ensuring that arrays for binding energy and intensity possess identical dimensions and contain no NaN values.
- Algorithmic Transparency: Analytical methods (e.g., the Shirley integral or the Levenberg-Marquardt optimization) are explicitly documented and expose their internal tolerances to the user, contrasting with the opaque nature of proprietary software.
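The first two principles can be illustrated with a minimal Pydantic v2 sketch. The class `SpectrumSketch` and its field names are hypothetical stand-ins, not the package's actual models, and plain lists are used instead of NumPy arrays to keep the example short.

```python
import math

from pydantic import BaseModel, ValidationError, model_validator


class SpectrumSketch(BaseModel):
    """Hypothetical stand-in for a validated spectrum model."""

    binding_energy: list[float]
    intensity: list[float]

    @model_validator(mode="after")
    def check_arrays(self) -> "SpectrumSketch":
        # Both axes must describe the same number of points
        if len(self.binding_energy) != len(self.intensity):
            raise ValueError("binding_energy and intensity must have the same length")
        # Pydantic accepts NaN floats by default, so reject them explicitly
        if any(math.isnan(v) for v in self.binding_energy + self.intensity):
            raise ValueError("arrays must not contain NaN")
        return self


raw = SpectrumSketch(binding_energy=[130.0, 131.0, 132.0], intensity=[10.0, 12.0, 11.0])

# A deep copy owns its own lists, so editing it leaves the raw data untouched
processed = raw.model_copy(deep=True)
processed.intensity[0] = 0.0

try:
    SpectrumSketch(binding_energy=[130.0, 131.0], intensity=[10.0])
    mismatch_rejected = False
except ValidationError:
    mismatch_rejected = True
```

With a shallow copy, `processed.intensity` and `raw.intensity` would be the same list object, which is the kind of silent mutation the deep-copy default is meant to rule out.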
Documentation Structure
The technical documentation for XPS Analyzer is divided into four domain-specific modules:
- Physical Chemistry Foundations: Details the quantum mechanical principles underpinning the software, including the photoelectric effect, chemical shift, and spin-orbit coupling.
- Mathematical Methods: Documents the numerical algorithms utilized for background modeling, lineshape convolution, and least-squares optimization.
- Software Architecture: Outlines the object-oriented design, the Pydantic data model hierarchy, and the integration of empirical databases.
- Analytical Features: Provides a technical breakdown of functional capabilities, from data ingestion to empirical atomic quantification.
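As a small illustration of how the spin-orbit and lineshape topics listed above meet in practice, the sketch below models a d-level doublet. It is not the package's implementation: it uses a pseudo-Voigt (a weighted sum of a Gaussian and a Lorentzian) as a simpler stand-in for a true Voigt convolution, and the names `pseudo_voigt`, `d_doublet`, and all numeric values are hypothetical. The only physics it encodes is the 3:2 area ratio between the d5/2 and d3/2 components that follows from the 2j + 1 degeneracy.

```python
import numpy as np


def pseudo_voigt(x, center, fwhm, area, eta=0.3):
    """Area-normalized mix of a Lorentzian (weight eta) and a Gaussian (weight 1 - eta)."""
    sigma = fwhm / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    gauss = np.exp(-0.5 * ((x - center) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))
    hwhm = fwhm / 2.0
    lorentz = (hwhm / np.pi) / ((x - center) ** 2 + hwhm**2)
    return area * (eta * lorentz + (1.0 - eta) * gauss)


def d_doublet(x, center_52, splitting, fwhm, area_52, eta=0.3):
    """d5/2 + d3/2 pair; the d3/2 area is fixed at 2/3 of the d5/2 area."""
    main = pseudo_voigt(x, center_52, fwhm, area_52, eta)
    # The d3/2 component sits at higher binding energy than d5/2
    minor = pseudo_voigt(x, center_52 + splitting, fwhm, area_52 * 2.0 / 3.0, eta)
    return main + minor


# Illustrative placeholder values, not reference data for any element
x = np.linspace(120.0, 145.0, 5001)
y = d_doublet(x, center_52=132.0, splitting=1.8, fwhm=1.0, area_52=3.0, eta=0.0)
```

Tying the second component to the first in this way removes free parameters from the fit, which is one common reason to constrain a doublet rather than fit two independent peaks.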
Basic Usage
The API is designed for modular integration into data science pipelines or Jupyter Notebooks. The following example loads a strontium Sr 3d doublet spectrum, subtracts its Shirley background, and fits the two spin-orbit components.
from xps_analyzer.data_loader import load_single_file
from xps_analyzer.analysis.background import shirley_background
from xps_analyzer.analysis.peak_fitting import fit_multiple_peaks
# 1. Ingest proprietary text format into validated Pydantic models
dataset = load_single_file("data/raw/BN-SET-01/BN-BS-3/BN-BS-3 MULTIPLEX.txt")
sr3d_spectrum = dataset.get_spectrum("Sr 3d")
# 2. Compute and subtract the Shirley inelastic background
# Returns a deep copy of the spectrum containing the modified intensity array
sr3d_nobg = shirley_background(sr3d_spectrum, max_iter=100, tol=1e-5, inplace=False)
# 3. Perform Levenberg-Marquardt optimization for the spin-orbit doublet
# The algorithm fits two Voigt profiles corresponding to the 3d5/2 and 3d3/2 states
fit_result = fit_multiple_peaks(
    sr3d_nobg,
    n_peaks=2,
    shape="voigt",
    auto_estimate=True,
)
# Output optimization metrics and structural parameters
print(f"Convergence State: {fit_result.success}")
print(f"Goodness of Fit (R²): {fit_result.r_squared:.4f}")
print(f"Primary Peak Position: {fit_result.peaks[0].position:.2f} eV")
Ongoing Development
The project is currently in Phase 2, which focuses on extending the interactive capabilities of the package.
A preliminary Graphical User Interface (GUI) has been built using Streamlit, enabling users to upload datasets and visually inspect spectral regions. Future development iterations will replace static Matplotlib renders with Plotly graphs, allowing researchers to interactively adjust the bounds of the Shirley background and constrain the optimization parameters of the Levenberg-Marquardt algorithm in real time.