This article provides a comprehensive exploration of Hierarchical Nonnegative Matrix Factorization (HNMF), a powerful unsupervised machine learning technique for discerning multi-level structures within complex scientific data. Tailored for researchers and scientists, we cover foundational principles, methodological implementations across domains like electron microscopy and hyperspectral imaging, and strategies for optimizing performance and mitigating artifacts. The content further addresses critical validation and comparative analysis with other dimensionality reduction techniques, synthesizing key takeaways and future directions for applying HNMF to accelerate discovery in materials science and biomedical research.
Modern materials characterization techniques, particularly four-dimensional scanning transmission electron microscopy (4D-STEM), generate extremely large datasets that capture complex, hierarchical material structures [1] [2]. These datasets contain information spanning multiple scales, from atomic arrangements to mesoscale precipitates and domain structures. Conventional flat dimensionality reduction techniques, such as principal component analysis (PCA) and basic nonnegative matrix factorization (NMF), prove insufficient for extracting these nested hierarchical relationships [2]. They often produce mathematically valid but physically implausible components with negative intensities or unrealistic downward-convex peaks that violate fundamental electron microscopy physics [2].
Hierarchical nonnegative matrix factorization (HNMF) addresses these limitations by recursively applying NMF to discover overarching topics encompassing lower-level features [3]. This approach mirrors the natural hierarchical organization found in material systems, where atomic-scale patterns form nanoscale precipitates, which subsequently organize into larger microstructural features. By framing hierarchical NMF as a neural network with backpropagation optimization, researchers can learn meaningful hierarchical structures that illustrate how fine-grained topics relate to coarse-grained themes [3]. This capability is particularly valuable for understanding complex material phenomena where different levels of structural organization directly influence material properties and performance.
Traditional flat analysis methods suffer from significant limitations when applied to complex materials data:
Conventional NMF produces sparse components that may not align with physical reality [2]. When experimental results consist of continuous intensity profiles with additional sharp peaks, as is common in scientific measurements, primitive NMF introduces downward-convex peaks (unnatural intensity drops) that represent known artifacts rather than true physical signals [2]. These mathematical artifacts misrepresent the actual material structure and can lead to incorrect interpretations.
In hyperspectral imaging and 4D-STEM, endmember variability presents a significant challenge [4]. The linear mixing model assumes a single spectrum fully characterizes each material class, but in reality, spectral signatures vary due to illumination conditions, intrinsic variability, and measurement artifacts [4]. Flat decomposition methods cannot adequately represent this variability, leading to oversimplified representations that lose critical information about material heterogeneity.
Complex materials exhibit relevant features at multiple scales simultaneously. Metallic glasses containing nanometer-sized crystalline precipitates exemplify this challenge, requiring analysis methods that can detect and classify features across spatial resolutions [1]. Conventional approaches analyze each scale separately, missing important cross-scale relationships that determine material behavior.
Table 1: Comparison of Analysis Methods for Complex Materials Data
| Method | Strengths | Limitations | Typical Applications |
|---|---|---|---|
| Principal Component Analysis (PCA) | Efficient dimensionality reduction; Established algorithms | Negative intensities physically impossible in electron microscopy; Limited interpretability | Initial data exploration; Noise reduction |
| Basic Nonnegative Matrix Factorization (NMF) | Nonnegative components; Parts-based representation | Artifacts with downward-convex peaks; Cannot represent hierarchies | Document clustering; Simple spectral unmixing |
| Hierarchical NMF (HNMF) | Multi-scale representation; Physically interpretable components; Captures nested structures | Higher computational complexity; More parameters to tune | Metallic glass precipitates; Microbial communities; Topic hierarchies |
Hierarchical NMF recursively applies matrix factorization to create layered representations of complex data. Given a nonnegative data matrix X ∈ ℝ₊^(N×M), single-layer NMF approximates it as the product of two nonnegative matrices, X ≈ AS, where A ∈ ℝ₊^(N×K) contains basis components and S ∈ ℝ₊^(K×M) contains coefficients [3]. Hierarchical NMF extends this by recursively factorizing the coefficient matrix:
X ≈ A^(1)S^(1),  S^(1) ≈ A^(2)S^(2),  …,  S^(L−1) ≈ A^(L)S^(L)
The resulting approximation is X ≈ A^(1)A^(2)⋯A^(L)S^(L), with the final basis matrix given by A = A^(1)A^(2)⋯A^(L) [3]. This cascaded structure naturally represents hierarchical relationships in material systems.
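This recursion is easy to prototype with off-the-shelf solvers. The following is a minimal Python sketch (not an optimized implementation) that greedily factorizes layer by layer using scikit-learn's NMF; the toy data, ranks, and iteration counts are illustrative only:

```python
import numpy as np
from sklearn.decomposition import NMF

def hierarchical_nmf(X, ranks, random_state=0):
    """Recursively factorize X ~ A(1)A(2)...A(L) S(L) for the given layer ranks."""
    A_layers, S = [], X
    for k in ranks:
        model = NMF(n_components=k, init="nndsvd", max_iter=500,
                    random_state=random_state)
        A = model.fit_transform(S)   # basis matrix A(l) for this layer
        S = model.components_        # coefficients become the next layer's input
        A_layers.append(A)
    return A_layers, S

# Toy data: 50 nonnegative features x 100 samples, two-layer hierarchy (8 -> 3).
X = np.abs(np.random.default_rng(0).normal(size=(50, 100)))
A_layers, S_final = hierarchical_nmf(X, ranks=[8, 3])
A_eff = np.linalg.multi_dot(A_layers)         # effective basis A(1)A(2)
print(np.linalg.norm(X - A_eff @ S_final))    # overall reconstruction error
```

Note that this greedy, layer-wise scheme is precisely what joint optimization approaches such as Neural NMF aim to improve upon [3].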
Incorporating domain knowledge is crucial for physically meaningful factorization. For electron microscopy data, constraints include spatial resolution preservation and continuous intensity features without downward-convex peaks [2]. These constraints eliminate physically implausible components that violate the fundamental principle that detected electron counts cannot be negative. The integration of domain-specific knowledge effectively mitigates artifacts found in conventional machine learning techniques that rely solely on mathematical constraints [2].
Neural NMF implements hierarchical factorization using a multi-layer architecture similar to neural networks [3]. This approach enables joint, backpropagation-style optimization of all layer factors rather than greedy layer-by-layer fitting, reducing the accumulated approximation error [3]. The neural framework also allows for a supervised extension where label information guides the factorization process, improving the separation of relevant material features [3].
Objective: Detect and classify nanometer-sized crystalline precipitates embedded in amorphous metallic glass (ZrCuAl) using 4D-STEM data [1].
Materials and Equipment: 4D-STEM instrumentation, a ZrCuAl metallic glass specimen, and an HNMF implementation with electron-microscopy constraints (e.g., DigitalMicrograph custom scripts; see Table 2) [1] [2].
Procedure: Acquire the 4D-STEM dataset and reshape it into a nonnegative matrix; apply constrained NMF with spatial-resolution and intensity-profile constraints; then hierarchically cluster the resulting components by diffraction similarity to classify precipitates [1] [2].
Expected Outcomes: Successful decomposition will yield interpretable diffractions and maps that reveal precipitate structures not achievable with PCA or primitive NMF [1].
Objective: Analyze microbial metagenomic data to discover underlying community structures and their associations with environmental factors [5].
Materials and Equipment: Microbial abundance (OTU count) data with accompanying clinical or environmental covariates, and the BALSAMICO software package [5].
Procedure: Assemble the abundance and covariate matrices; fit the hierarchical Bayesian NMF via variational inference; then examine the posterior distributions to associate latent communities with covariates [5].
Expected Outcomes: Accurate detection of bacterial communities related to specific conditions (e.g., colorectal cancer) with estimation of uncertainty through credible intervals [5].
Diagram 1: Hierarchical NMF Workflow for Materials Data. This workflow illustrates the sequential factorization process from raw 4D-STEM data to hierarchical structure identification.
Diagram 2: Constraint Integration in Hierarchical NMF. The diagram shows how mathematical, physical, and domain-specific constraints are integrated to produce physically meaningful factorizations.
Table 2: Essential Research Reagents and Computational Tools for Hierarchical NMF
| Item | Function | Application Examples |
|---|---|---|
| 4D-STEM Instrumentation | Acquires four-dimensional scanning transmission electron microscopy data | Metallic glass precipitate analysis [1] [2] |
| Hyperspectral Imaging Systems | Captures spatial and spectral information simultaneously | Mineral identification; Phase mapping [4] |
| DigitalMicrograph with Custom Scripts | Implements HNMF algorithms with domain-specific constraints | Electron microscopy data analysis [2] |
| BALSAMICO Software Package | Bayesian latent semantic analysis of microbial communities | Microbial community analysis with environmental factors [5] |
| Neural NMF Framework | Implements hierarchical NMF as neural network with backpropagation | Multi-layer topic modeling; Hierarchical feature extraction [3] |
| scikit-learn NMF Implementation | Provides standard NMF algorithms (multiplicative updates, ALS) | Baseline comparisons; Preliminary analysis [2] |
Hierarchical nonnegative matrix factorization represents a paradigm shift in the analysis of complex materials data by moving beyond flat structures to embrace the multi-scale nature of material systems. By incorporating domain-specific constraints and leveraging recursive factorization, HNMF enables researchers to extract physically meaningful hierarchical structures from 4D-STEM, hyperspectral imaging, and other advanced characterization techniques. The experimental protocols and tools outlined in this application note provide a foundation for implementing hierarchical analysis approaches that can reveal previously hidden relationships in complex materials data, ultimately accelerating materials discovery and development.
Nonnegative Matrix Factorization (NMF) is a cornerstone unsupervised learning technique for parts-based representation and dimensionality reduction across diverse scientific domains. In its fundamental form, NMF factorizes a given non-negative data matrix X into two lower-rank, non-negative matrices W (basis matrix) and H (coefficient matrix) such that X ≈ WH [2] [3]. The nonnegativity constraint fosters intuitive, additive combinations of features, enabling a more interpretable parts-based representation compared to other matrix factorization methods like Principal Component Analysis (PCA) [3]. This mathematical framework finds extensive application in topic modeling, feature extraction, and hyperspectral imaging, particularly within materials science research where interpreting underlying physical components is paramount [2] [3].
Hierarchical NMF (HNMF) extends this core concept by recursively applying factorization to learn latent topics or features at multiple levels of granularity [3]. This multi-layer approach captures overarching themes encompassing lower-level features, thereby illuminating the hierarchical structure inherent in many complex datasets [6] [3]. Unlike standard hierarchical clustering, which forcibly imposes structure, HNMF naturally discovers these relationships, avoiding Procrustean behavior [7]. Within materials research, this capability is invaluable for deciphering complex structure-property relationships, such as identifying hierarchical microstructural features from electron microscopy data or linking multi-scale material characteristics to performance metrics [2].
The NMF optimization problem aims to find non-negative matrices W ∈ ℝ₊^{m×k} and H ∈ ℝ₊^{k×n} that minimize the reconstruction error for a given data matrix X ∈ ℝ₊^{m×n}. Formally [3]:

min_{W≥0, H≥0} ‖X − WH‖_F²
The rank k is chosen such that (n+m)k < nm, ensuring a lower-dimensional representation [7]. The solution provides a parts-based decomposition where the columns of W represent fundamental components (e.g., topics in text, spectral profiles in microscopy), and H contains the coefficients to reconstruct each data point via additive combinations of these components [3].
Two primary algorithmic approaches dominate NMF implementations:
Table 1: Core NMF Optimization Algorithms
| Algorithm | Update Rules | Key Characteristics | Common Implementations |
|---|---|---|---|
| Multiplicative Update (MU) | W ← W ⊛ (XHᵀ) ⊘ (WHHᵀ); H ← H ⊛ (WᵀX) ⊘ (WᵀWH) | Element-wise operations (⊛, ⊘); maintains nonnegativity automatically; part of the majorization-minimization framework [2] | MATLAB ('mult' in nnmf); scikit-learn ('mu' in NMF) [2] |
| Alternating Least Squares (ALS) | W ← [(XHᵀ)(HHᵀ)⁻¹]₊; H ← [(WᵀW)⁻¹(WᵀX)]₊ | Solves nonnegative least squares alternately; projection [·]₊ = max{0, ·} enforces nonnegativity; more flexible for constraints [2] | MATLAB ('als' in nnmf); scikit-learn ('cd' in NMF) [2] |
The convergence of these algorithms is typically monitored using cost functions based on the Frobenius norm ‖X − WH‖_F² or the Kullback-Leibler divergence [2] [7]. A critical challenge with primitive NMF is its tendency to produce sparse components that may not align with physical reality, sometimes generating implausible artifacts like downward-convex peaks in continuous intensity profiles [2].
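The MU rules in Table 1 translate directly into NumPy. The following is a minimal sketch, not an optimized implementation; the small constant `eps`, added to guard against division by zero, is an implementation detail not discussed above:

```python
import numpy as np

def nmf_mu(X, k, n_iter=200, eps=1e-10, seed=0):
    """Multiplicative-update NMF minimizing ||X - WH||_F^2."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    W = rng.random((m, k)) + eps
    H = rng.random((k, n)) + eps
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)   # H <- H * (W^T X) / (W^T W H)
        W *= (X @ H.T) / (W @ H @ H.T + eps)   # W <- W * (X H^T) / (W H H^T)
    return W, H
```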
Neural NMF represents a significant advancement by framing hierarchical factorization within a neural network architecture. This approach recursively applies NMF across multiple layers to discover relationships between topics at different granularity levels [6] [3]. The forward propagation process can be represented as:

X ≈ A^(1)S^(1),  S^(ℓ−1) ≈ A^(ℓ)S^(ℓ) for ℓ = 2, …, L

where each layer ℓ progressively captures more abstract representations. A key innovation of Neural NMF is its derivation of a backpropagation-style optimization scheme that jointly learns all layer parameters, substantially reducing approximation error compared to sequential HNMF application [3]. This method has demonstrated superior performance in learning interpretable hierarchical topic structures on document datasets (20 Newsgroups) and biomedical data (MyLymeData symptoms), outperforming other HNMF methods in both reconstruction accuracy and classification performance [3].
Deep-NMF extends the hierarchical concept further through multiple layers of decomposition to learn non-linear parts-based representations [8]. Unlike semi-NMF frameworks that allow negative values, Deep-NMF maintains strict nonnegativity, preserving the intuitive parts-based interpretation crucial for scientific applications [8]. This approach has shown particular promise in multi-view clustering, where it simultaneously decomposes data from multiple sources or perspectives through a shared hierarchical structure. The Deep-NMF framework can be enhanced with manifold learning techniques that preserve the intrinsic geometric structure of data across all layers, ensuring that local neighborhood relationships are maintained in the learned representations [8].
The ODD-NMF framework incorporates several crucial constraints, including orthogonality constraints that separate view-shared from view-specific information, to enhance multi-view learning [8].
This combination effectively learns both view-shared and view-specific information, producing more meaningful clusters in complex datasets such as multi-view text and image collections [8].
Modern scientific instruments like 4D-Scanning Transmission Electron Microscopy (4D-STEM) generate extremely large datasets requiring specialized NMF approaches [2]. The following protocol outlines the application of domain-aware constrained NMF for materials characterization:
Table 2: Protocol for 4D-STEM Data Analysis via Constrained NMF
| Step | Procedure | Parameters | Domain-Specific Constraints |
|---|---|---|---|
| 1. Data Preparation | Transform 4D data I₄D(x,y,u,v) to matrix X; reshape 2D diffractions I₂D(u,v) to 1D column vectors | n_xy = n_x n_y, n_uv = n_u n_v [2] | Maintain spatial relationships between real and reciprocal spaces |
| 2. Initialization | Initialize W and H with non-negative values; set number of components n_k (n_k ≪ n_xy) | W ∈ ℝ₊^{n_uv×n_k}, H ∈ ℝ₊^{n_k×n_xy} [2] | Incorporate physical prior knowledge where available |
| 3. Constrained Factorization | Apply MU or ALS updates with embedded constraints | Monitor convergence via ‖X − WH‖_F² [2] | Enforce: spatial smoothness in H maps; continuous intensity profiles in W diffractions; no downward-convex peaks [2] |
| 4. Component Interpretation | Transform W columns to 2D diffractions w_k(u,v); reshape H rows to 2D maps h_k(x,y) | k = 0, 1, …, n_k−1 [2] | Relate components to physical structures: crystalline precipitates; amorphous phases; defect regions |
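The reshaping and factorization steps of Table 2 can be prototyped in a few lines of Python. This sketch uses scikit-learn's unconstrained NMF with reduced, illustrative array sizes; the domain-specific constraint step of Table 2 is not enforced here (a smoothing-based constraint sketch appears later in this document):

```python
import numpy as np
from sklearn.decomposition import NMF

# Small illustrative stack; real 4D-STEM data is far larger (e.g., 128x128 pixels).
nx, ny, nu, nv = 20, 20, 16, 16
I4d = np.abs(np.random.default_rng(0).normal(size=(nx, ny, nu, nv)))

# Step 1: reshape so rows index reciprocal space and columns index probe positions.
X = I4d.reshape(nx * ny, nu * nv).T            # shape (n_uv, n_xy)

# Steps 2-3: factorize X ~ W H with n_k components (n_k << n_xy).
n_k = 4
model = NMF(n_components=n_k, init="nndsvd", max_iter=300)
W = model.fit_transform(X)                     # (n_uv, n_k) basis diffractions
H = model.components_                          # (n_k, n_xy) coefficient maps

# Step 4: reshape factors back into 2D objects for interpretation.
diffractions = W.T.reshape(n_k, nu, nv)        # w_k(u, v)
maps = H.reshape(n_k, nx, ny)                  # h_k(x, y)
```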
Table 3: Essential Computational Tools for HNMF Research
| Tool Name | Environment/Language | Primary Function | Application Context |
|---|---|---|---|
| scikit-learn | Python | NMF implementations ('mu', 'cd'); model evaluation utilities [2] | General machine learning pipeline integration |
| HyperSpy | Python | Multi-dimensional data analysis; signal processing for microscopy [2] | Electron microscopy data analysis |
| DigitalMicrograph | Gatan Inc. proprietary | MU/ALS NMF scripts [2] | In-situ STEM data processing |
| BALSAMICO | R | Bayesian NMF with clinical covariate integration [5] | Microbial community analysis with environmental factors |
| ODD-NMF | MATLAB | Deep multi-view clustering with orthogonal constraints [8] | Multi-view data integration |
The application of constrained NMF to 4D-STEM data has demonstrated remarkable success in extracting physically interpretable components from complex nanoscale phenomena [2]. In a notable study analyzing ZrCuAl metallic glass, domain-constrained NMF successfully identified and classified nanometer-sized crystalline precipitates embedded within the amorphous matrix by decomposing both simulated and experimental data into interpretable diffractions and maps [2]. This approach overcame critical limitations of PCA and primitive NMF, which produced physically implausible results with negative intensities or artifact-laden components. The integration of domain knowledge, specifically spatial resolution constraints and continuous intensity profile characteristics, proved essential for generating scientifically meaningful decompositions [2].
HNMF methods have shown significant utility in biomedical domains, particularly for analyzing complex, high-dimensional biological data. The BALSAMICO framework exemplifies this application, employing a hierarchical Bayesian NMF approach to model microbial communities and their associations with clinical factors [5]. This method effectively identified bacteria related to colorectal cancer by decomposing microbial abundance data while incorporating clinical covariates, demonstrating how hierarchical factorization can reveal relationships between microbial community structures and disease states [5].
Determining the optimal number of components (k) remains a fundamental challenge in NMF applications. Several methods, including PCA-based criteria and Brunet's approach, have been evaluated for estimating k in synthetic and empirical data [7].
Research indicates that when underlying components are orthogonal, PCA-based methods and Brunet's approach achieve highest accuracy [7]. However, normalization techniques can unpredictably affect rank estimation, suggesting that unnormalized data may provide more reliable component number estimates [7].
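While dedicated estimators exist, a pragmatic complement in practice is to scan reconstruction error against candidate ranks and look for an elbow. The following is a minimal sketch with arbitrary toy data (this scan is not one of the estimators benchmarked in [7]):

```python
import numpy as np
from sklearn.decomposition import NMF

def scan_rank(X, ranks):
    """Frobenius reconstruction error for a range of candidate ranks."""
    errors = []
    for k in ranks:
        model = NMF(n_components=k, init="nndsvd", max_iter=300, random_state=0)
        W = model.fit_transform(X)
        errors.append(np.linalg.norm(X - W @ model.components_))
    return errors

X = np.abs(np.random.default_rng(1).normal(size=(60, 80)))
print(scan_rank(X, ranks=range(2, 10)))  # inspect the error curve for an elbow
```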
A key advantage of hierarchical NMF methods is their ability to illustrate relationships between topics learned at different granularity levels without requiring multiple separate NMF runs [3]. This hierarchical representation immediately reveals how finer-grained topics relate to broader thematic categories, providing valuable insights into the latent structure of complex datasets. In materials research, this capability enables multi-scale characterization, linking atomic-scale features to microstructural domains and ultimately to macroscopic material properties.
Hierarchical NMF Framework: Illustrating the multi-layer decomposition process where each layer factorizes the coefficient matrix from the previous layer.
Domain-Constrained NMF Workflow: Demonstrating the integration of physical constraints into the NMF process for scientifically interpretable results in materials characterization.
Nonnegative Matrix Factorization (NMF) is a powerful unsupervised learning technique for parts-based representation and dimensionality reduction. By decomposing a non-negative data matrix into two lower-dimensional, non-negative factor matrices, NMF provides intuitive and interpretable latent features. Recent advancements have extended this core methodology into more sophisticated frameworks (Multi-layer NMF, Neural NMF, and Constrained NMF) that offer enhanced hierarchical representation, deep learning integration, and incorporation of prior knowledge, respectively. These algorithms are pivotal in modern materials research and drug development, enabling researchers to uncover complex, hierarchical patterns in high-dimensional data. This note details the key algorithms, their experimental protocols, and applications.
The table below summarizes the core architectures, optimization methods, and primary applications of three advanced NMF algorithms.
Table 1: Specification and Comparison of Key NMF Algorithms
| Algorithm Name | Core Architecture & Model | Optimization Method | Primary Applications |
|---|---|---|---|
| Multi-layer NMF [9] [3] | Cascaded decomposition: V ≈ A^(1)A^(2)…A^(L)X^(L); key parameters: number of layers (L), ranks (k^(1), k^(2), …, k^(L)) | Multiplicative update; backpropagation-style; inspired by physical chemistry (e.g., Boltzmann probability for convergence) [9] | Hierarchical topic modeling [3]; crystal orientation mapping in 4D-STEM [10]; cardiorespiratory disease clustering [9] |
| Neural NMF [3] | Framed as a neural network with L layers; key parameters: ranks per layer (k^(ℓ)), regularization parameters (μ, λ) | Alternating multiplicative updates; gradient descent via backpropagation [3] | Hierarchical multilayer topic modeling [3]; document classification (e.g., 20 Newsgroups) [3]; biomedical symptom analysis (e.g., MyLymeData) [3] |
| Constrained NMF (DSNMF) [11] | Standard NMF with added regularization terms.⢠Key Parameters: Decomposition rank (k), regularization coefficients (λ1, λ2) for pointwise and pairwise constraints | ⢠Alternating Multiplicative Updates⢠Graph and Label Regularization [11] | ⢠Multi-view data clustering [11]⢠Image clustering (e.g., COIL20, LandUse21) [11] |
This protocol outlines the application of Neural NMF for extracting hierarchical structure from 4D-STEM data for crystal orientation mapping [3] [10].
Data Preparation: Assemble the data matrix V of size (number of pixels per diffraction pattern × number of probe positions). Apply data reduction and noise reduction techniques to handle the inherent sparsity of diffraction patterns [10].

Model Configuration: Choose the number of layers L and the rank (number of components/topics) for each layer, k(1), k(2), ..., k(L). Initialize the factor matrices A(1), A(2), ..., A(L), and X(L) with non-negative values, potentially using nonnegative double singular value decomposition (NNDSVD) for stability [3] [12].

Hierarchical Factorization: Compute the cascaded decomposition V ≈ A(1) A(2) ... A(L) X(L), where the output of one layer serves as the input for the next [3].

Optimization: Minimize ||V − A(1)A(2)...A(L)X(L)||² using an alternating optimization scheme. For Neural NMF, this involves calculating gradients with respect to all factor matrices and updating them iteratively, similar to backpropagation in neural networks [3].

Model Selection: Apply the K-component loss method combined with Image Quality Assessment (IQA) metrics to evaluate the quality of the decomposition for different values of k and select the optimal number of components [10].

Interpretation: Examine the basis matrix W (the product of the A matrices) for spectral templates and the coefficient matrix H (X(L)) for their activations to generate spatial distribution maps of different crystal orientations [10].
Data Collection: Gather data matrices {X(1), X(2), ..., X(V)} from V different views, representing the same set of n data points, together with a small set of known labels for partial data.

Constraint Construction: Build a label matrix Y from the available labels. Calculate similarity matrices S_d and S_e for drugs/diseases or data points within each view [13] [11].

Objective Formulation: Solve min ||X − USV^T||² + α||U||² + β||V||² + λ( Tr(S^T L S) ) + μ( pointwise constraint term ), where L is the graph Laplacian derived from the similarity matrices (a minimal Laplacian construction is sketched after this list).

Optimization: Alternately update U, S, and V until convergence [11].

Cluster Assignment: Cluster on S or use the label matrix directly for classification to obtain the final clusters [11].
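The graph Laplacian used in the objective above can be built directly from a similarity matrix. In this minimal sketch the similarity comes from a hypothetical Gaussian kernel on pairwise distances, one of several reasonable choices not specified in the source:

```python
import numpy as np

def graph_laplacian(Sim):
    """Unnormalized graph Laplacian L = D - Sim for a symmetric similarity matrix."""
    D = np.diag(Sim.sum(axis=1))
    return D - Sim

# Hypothetical similarity matrix from a Gaussian kernel on pairwise distances.
rng = np.random.default_rng(0)
points = rng.normal(size=(30, 5))                        # 30 data points, 5 features
d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(-1)
Sim = np.exp(-d2 / d2.mean())
L = graph_laplacian(Sim)
# Tr(S^T L S) is then small when similar points receive similar coefficients.
```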
Hierarchical NMF Decomposition Workflow
Dual Constraint NMF Integration
Table 2: Essential Computational Tools and Datasets for NMF Research
| Research Reagent | Function/Purpose | Example Use Case |
|---|---|---|
| 4D-STEM Datasets [10] | Raw input data for hierarchical structure analysis in materials science. Contains spatial and diffraction information. | Identifying crystal orientations and phases in polycrystalline materials [10]. |
| Multi-view Datasets (e.g., COIL20, LandUse21) [11] | Benchmark datasets comprising the same objects from multiple views or feature sets. | Testing and validating multi-view clustering algorithms like DSNMF [11]. |
| Gold-Standard Association Datasets (e.g., Cdataset, Fdataset) [13] [14] | Curated matrices of known associations (e.g., drug-disease, virus-drug). | Training and benchmarking predictive models for computational drug repositioning [13] [14]. |
| Similarity/Networks (Drug/Disease Similarity) [13] [14] | Precomputed matrices capturing functional or semantic relationships between entities. | Incorporated as graph regularization terms in constrained NMF to guide factorization [13]. |
| NMF Software Packages (R, Python Nimfa) [15] | Open-source libraries providing optimized implementations of various NMF algorithms. | Rapid prototyping, testing, and deployment of NMF models in research [15]. |
Hierarchical Nonnegative Matrix Factorization (HNMF) represents a significant advancement in the analysis of complex materials science data. By decomposing a non-negative data matrix into multiple layers of latent features, HNMF moves beyond standard NMF to uncover intricate 'parts-of-parts' structures inherent in material systems. This hierarchical approach provides materials scientists with an unparalleled interpretability advantage, enabling the dissection of multi-scale phenomena, from atomic-scale interactions in electron microscopy data to compositional variations in microbial communities affecting material biosynthesis.
The core mathematical principle of NMF involves approximating a data matrix ( \mathbf{X} ) as the product of two lower-rank, non-negative matrices: ( \mathbf{X} \approx \mathbf{WH} ) [2]. HNMF extends this framework by imposing additional structural constraints or further factorizing the component matrices, creating a hierarchy that reveals how broader patterns are composed of finer, constituent sub-patterns. This is particularly powerful for materials research, where properties emerge from interactions across different spatial and compositional scales. Unlike methods like Principal Component Analysis (PCA) that often yield components with physically unrealistic negative intensities, HNMF ensures all decomposed components maintain non-negativity, resulting in physically plausible and directly interpretable features such as diffraction patterns, elemental maps, or community structures [2] [5].
Standard NMF algorithms aim to minimize a cost function, typically the Frobenius norm of the reconstruction error: [ D(\mathbf{X} \| \mathbf{WH}) = \frac{1}{2} \|\mathbf{X} - \mathbf{WH}\|_F^2 ] This is achieved through iterative update procedures, with two common approaches being the Multiplicative Update (MU) and Alternating Least Squares (ALS) algorithms [2]. The MU algorithm updates the matrices via elementwise operations: [ \mathbf{W} \leftarrow \mathbf{W} \circledast \mathbf{XH}^T \oslash \mathbf{WHH}^T ] [ \mathbf{H} \leftarrow \mathbf{H} \circledast \mathbf{W}^T\mathbf{X} \oslash \mathbf{W}^T\mathbf{WH} ] where ( \circledast ) and ( \oslash ) denote elementwise multiplication and division, respectively. The ALS algorithm, on the other hand, employs a projection-based approach: [ \mathbf{W} \leftarrow [(\mathbf{XH}^T)(\mathbf{HH}^T)^{-1}]_+ ] [ \mathbf{H} \leftarrow [(\mathbf{W}^T\mathbf{W})^{-1}(\mathbf{W}^T\mathbf{X})]_+ ] where ( [\cdot]_+ ) represents the nonnegativity constraint projection [2].
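The projected ALS updates above translate into a short NumPy routine. This sketch uses pseudo-inverses to guard against singular Gram matrices, a practical detail beyond the equations themselves:

```python
import numpy as np

def nmf_als(X, k, n_iter=100, seed=0):
    """Projected alternating least squares for X ~ WH."""
    rng = np.random.default_rng(seed)
    W = rng.random((X.shape[0], k))
    for _ in range(n_iter):
        # H <- [(W^T W)^{-1} (W^T X)]_+
        H = np.maximum(np.linalg.pinv(W.T @ W) @ (W.T @ X), 0)
        # W <- [(X H^T)(H H^T)^{-1}]_+
        W = np.maximum((X @ H.T) @ np.linalg.pinv(H @ H.T), 0)
    return W, H
```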
While mathematically sound, primitive NMF can produce artifacts that contradict domain knowledge, such as downward-convex peaks in continuous intensity profiles or high-frequency noise interpreted as signal [2]. This limitation is particularly problematic in materials characterization techniques like electron microscopy, where signals must adhere to specific physical constraints.
Constrained NMF addresses this by integrating domain-specific knowledge directly into the factorization process. For 4D-STEM data, this includes incorporating knowledge about spatial resolution and continuous intensity features, yielding decomposed components that are not only mathematically valid but also physically interpretable [2]. This philosophy of adding constraints forms the foundation for more complex hierarchical models.
Hierarchical NMF frameworks introduce additional layers of decomposition, often through Bayesian probabilistic models or sequential factorization. The BALSAMICO framework, for instance, models microbiome data using a hierarchical structure where the factor matrices themselves are influenced by external covariates [5]: [ \mathbf{W} \approx a_w \exp(\mathbf{XV}) ] Here, the contribution matrix ( \mathbf{W} ) is governed by clinical covariates ( \mathbf{X} ) and their coefficients ( \mathbf{V} ), creating a hierarchy where observed environmental factors influence the latent communities, which in turn explain the observed data [5]. This approach successfully detected bacteria related to colorectal cancer, demonstrating its power to uncover meaningful biological structures with direct clinical relevance.
Table 1: Comparison of NMF Variants for Materials Research
| Method | Key Features | Advantages | Limitations | Typical Applications |
|---|---|---|---|---|
| Primitive NMF | Non-negative factors, sparsity | Simple implementation, physically plausible components | May produce artifacts; no domain knowledge | Initial data exploration, basic feature extraction |
| Constrained NMF | Domain-specific constraints | Physically interpretable results, removes artifacts | Requires domain expertise to define constraints | 4D-STEM analysis, hyperspectral imaging |
| Hierarchical NMF | Multi-layer decomposition, incorporation of covariates | Reveals 'parts-of-parts' structures, models complex relationships | Computationally intensive, complex implementation | Microbial communities, multi-scale materials analysis |
Four-dimensional Scanning Transmission Electron Microscopy (4D-STEM) represents a cutting-edge characterization technique that generates massive datasets containing bimodal information from both real and reciprocal spaces [2]. Each 4D dataset ( \mathbf{I}_{4D}(x, y, u, v) ) consists of 2D electron diffractions ( \mathbf{I}_{2D}(u, v) ) acquired at varying probe positions ( (x, y) ), where ( (u, v) ) and ( (x, y) ) are reciprocal and real-space coordinates, respectively.
To apply HNMF, the 4D data must first be transformed into an appropriate matrix representation. The 2D experimental diffractions ( \mathbf{I}_{2D}(u, v) ) are reshaped into one-dimensional column vectors, forming the matrix ( \mathbf{X} ), where rows correspond to reciprocal-space coordinates and columns correspond to real-space coordinates [2]. For data points of size ( (n_x, n_y, n_u, n_v) ) and an assumed number of components ( n_k ), the matrix dimensions become ( \mathbf{X} \in \mathbb{R}^{n_{uv} \times n_{xy}} ), where ( n_{xy} = n_x n_y ) and ( n_{uv} = n_u n_v ).
The following diagram illustrates the complete HNMF workflow for analyzing 4D-STEM data from metallic glasses, from data acquisition through hierarchical decomposition:
The hierarchical decomposition begins with a constrained NMF step that incorporates electron microscopy domain knowledge:
Initialization: Initialize matrices ( \mathbf{W} ) and ( \mathbf{H} ) with non-negative random values or using smart initialization algorithms.
Multiplicative Updates with Constraints: Implement the MU algorithm while applying domain-specific constraints, such as spatial smoothness in the maps and continuous intensity profiles without downward-convex peaks in the diffractions [2] (the smoothing step is sketched after this list).
Convergence Monitoring: Iterate until the relative change in the cost function falls below a threshold (typically ( 10^{-6} )) or until a maximum number of iterations is reached.
Hierarchical Decomposition: Take the resulting matrix ( \mathbf{W}_1 ) and apply a second NMF decomposition, ( \mathbf{W}_1 \approx \mathbf{W}_2 \mathbf{H}_2 ), revealing the sub-structure within the primary components.
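One simple way to realize the spatial-smoothness constraint of step 2 is to interleave a filtering and re-projection step between updates. The sketch below assumes SciPy and follows the filter width suggested in Table 2; the exact constraint schedule is an illustrative choice:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def smooth_maps(H, nx, ny, sigma=1.5):
    """Project each coefficient map h_k(x, y) toward spatial smoothness."""
    H_maps = H.reshape(-1, nx, ny)                       # one 2D map per component
    H_maps = np.stack([gaussian_filter(h, sigma=sigma) for h in H_maps])
    return np.maximum(H_maps.reshape(H.shape), 0)        # keep nonnegativity

# Inside the MU loop, after each H update:
#   H = smooth_maps(H, nx, ny)    # sigma = 1-2 pixels, per Table 2
```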
Table 2: Key Research Reagents and Computational Tools for HNMF in Materials Science
| Reagent/Solution | Function/Application | Implementation Notes |
|---|---|---|
| 4D-STEM Dataset | Primary experimental data | Pre-process: flat-field correction, background subtraction |
| Constrained NMF Algorithm | Core decomposition engine | Implement MU or ALS with domain constraints |
| Spatial Smoothing Filter | Enforces realistic spatial continuity in maps | Gaussian kernel with σ = 1-2 pixels |
| Intensity Profile Filter | Removes unphysical intensity artifacts | Median filter with 3×3 kernel |
| Hierarchical Clustering | Classifies decomposed components | Polar coordinate transformation + cross-correlation |
| DigitalMicrograph Script | Execution environment for electron microscopists | Gatan Inc. platform with custom HNMF scripts |
Applying this HNMF protocol to ZrCuAl metallic glass data successfully decomposes both simulated and experimental 4D-STEM data into physically interpretable diffractions and maps that cannot be achieved using PCA or primitive NMF [2]. The hierarchical decomposition reveals nanometer-sized crystalline precipitates embedded within the amorphous matrix, with the 'parts-of-parts' structure showing how different precipitate types share common sub-structural motifs.
For classification, hierarchical clustering is optimized based on diffraction similarity using a combination of polar coordinate transformation and uniaxial cross-correlation [2]. This enables precise classification of precipitates according to their diffraction patterns, demonstrating HNMF's capability to detect and categorize subtle structural features that would remain hidden in conventional analysis.
For more complex material systems with external covariates or prior knowledge, a Bayesian hierarchical approach provides a powerful extension to standard HNMF. The BALSAMICO framework offers a template for such an approach, modeling the data generation process as [5]: [ \mathbf{h}_l \sim \text{Dirichlet}(\boldsymbol{\alpha}) ] [ \mathbf{B} = \exp(-\mathbf{XV}) ] [ w_{n,l} \sim \text{Gamma}(a_w, B_{n,l}) ] [ t_{n,l} \sim \text{Poisson}(w_{n,l} \tau_n) ] [ \mathbf{s}_{n,l} \sim \text{Multinomial}(t_{n,l}, \mathbf{h}_l) ] [ y_{n,k} = \sum_{l=1}^{L} s_{n,l,k} ] where ( \mathbf{X} ) represents covariates, ( \mathbf{V} ) their coefficients, ( \tau_n ) is an offset term, and ( \mathbf{S} = \{s_{n,l,k}\} ) are latent variables introduced to facilitate inference [5].
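The generative hierarchy can be made concrete by sampling from it. The NumPy sketch below draws synthetic data following the equations above; the dimensions and hyperparameters are arbitrary, τ_n is held constant across samples, and B is interpreted as a Gamma rate parameter (an assumption about the parameterization):

```python
import numpy as np

rng = np.random.default_rng(0)
N, L, K, P = 20, 3, 10, 2          # samples, latent communities, taxa, covariates

alpha = np.ones(K)
h = rng.dirichlet(alpha, size=L)   # h_l ~ Dirichlet(alpha); shape (L, K)
Xcov = rng.normal(size=(N, P))     # covariates X
V = rng.normal(size=(P, L))        # covariate coefficients
B = np.exp(-Xcov @ V)              # B = exp(-XV); shape (N, L)

a_w, tau = 2.0, 50.0               # B treated as a Gamma rate (an assumption)
w = rng.gamma(a_w, 1.0 / B)        # w_{n,l} ~ Gamma(a_w, B_{n,l})
t = rng.poisson(w * tau)           # t_{n,l} ~ Poisson(w_{n,l} * tau), constant tau
s = np.array([[rng.multinomial(t[n, l], h[l])   # s_{n,l} ~ Multinomial(t, h_l)
               for l in range(L)] for n in range(N)])
y = s.sum(axis=1)                  # y_{n,k} = sum_l s_{n,l,k}; shape (N, K)
```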
The Bayesian HNMF implementation involves the following steps:
Model Specification: Define the hierarchical structure based on domain knowledge, identifying which covariates should influence which levels of the decomposition.
Variational Inference: Implement an efficient variational Bayesian inference procedure to estimate parameters, using Laplace approximation to reduce computational cost.
Posterior Analysis: Examine the posterior distributions of the parameters to identify significant relationships between covariates and latent components.
Validation: Use synthetic data with known ground truth to validate the accuracy of parameter estimation before applying to experimental data.
The following diagram illustrates the hierarchical Bayesian structure for modeling complex relationships in material systems:
In an analysis of clinical metagenomic data, the BALSAMICO framework successfully detected bacteria related to colorectal cancer, demonstrating its power to uncover meaningful biological structures with direct clinical relevance [5]. For materials research, similar approaches can be applied to systems where microbial communities interact with material surfaces, or in the study of biomaterials where biological and material factors jointly determine performance.
Table 3: Quantitative Performance of HNMF Across Application Domains
| Application Domain | Data Type | Comparison Methods | Key HNMF Advantages | Performance Metrics |
|---|---|---|---|---|
| 4D-STEM of Metallic Glasses | Electron diffraction patterns | PCA, Primitive NMF | Eliminates negative intensities, reveals precipitate structures | Successful classification of nanometer-sized crystalline precipitates [2] |
| Microbial Communities | OTU abundance data | BioMiCo, Supervised NMF | Incorporates multiple environmental factors, handles sparse data | Accurate detection of CRC-related bacteria [5] |
| LLM Interpretability | MLP activations | Sparse Autoencoders | Better causal steering, aligns with human-interpretable concepts | Outperforms SAEs and supervised baselines on concept steering [16] |
Successful implementation of HNMF for materials research requires careful attention to several practical aspects:
Data Preprocessing: Normalize data appropriately for the specific domain. For 4D-STEM, apply flat-field correction and background subtraction. For compositional data, use appropriate transformations to handle sparsity.
Constraint Design: Collaborate with domain experts to identify appropriate constraints. Spatial smoothness, intensity continuity, and known physical boundaries are common starting points.
Model Selection: Determine the appropriate hierarchical depth through cross-validation. Deeper hierarchies offer more detailed decomposition but require more data and computational resources.
Validation Strategy: Employ multiple validation approaches, including synthetic data with known structure, experimental controls, and comparison with complementary characterization techniques.
Computational Optimization: Leverage GPU acceleration for large-scale problems, and consider variational inference methods for Bayesian approaches to reduce computational burden.
Hierarchical Nonnegative Matrix Factorization represents a powerful paradigm for extracting meaningful, interpretable patterns from complex materials science data. By revealing the 'parts-of-parts' structure inherent in multi-scale material systems, HNMF provides researchers with an unparalleled ability to connect microscopic features to macroscopic properties and performance. The protocols and application notes presented here offer a roadmap for implementing these advanced analytical techniques across diverse materials characterization domains, from electron microscopy of metallic glasses to the analysis of complex microbial communities relevant to biomaterials development. As materials research continues to generate increasingly large and complex datasets, hierarchical decomposition approaches will play an ever more critical role in unlocking the scientific insights contained within.
In the field of materials research, the analysis of spectral data from techniques such as mass spectrometry imaging (MSI) and hyperspectral imaging (HSI) is crucial for understanding material composition and properties. Dimensionality reduction methods are indispensable tools for interpreting these complex datasets. While Principal Component Analysis (PCA) and standard Non-negative Matrix Factorization (NMF) have been widely used, Hierarchical Non-negative Matrix Factorization (HNMF) has emerged as a superior approach for extracting meaningful, hierarchical information from spectral data. This application note details the comparative strengths of HNMF, provides experimental protocols for its implementation, and visualizes its advantages through structured data and workflow diagrams.
Principal Component Analysis (PCA) is an unconstrained factorization method that projects data onto orthogonal principal components. However, it does not account for the non-negative nature of spectral data, which can result in components with negative values that lack physical interpretability in contexts like spectral intensities or chemical concentrations [17].
Standard Non-negative Matrix Factorization (NMF) factorizes a data matrix ( \mathbf{X} \in \mathbb{R}_+^{M \times N} ) into two non-negative factor matrices: a spectral basis matrix ( \mathbf{W} \in \mathbb{R}_+^{M \times K} ) and a coefficient matrix ( \mathbf{H} \in \mathbb{R}_+^{K \times N} ), such that ( \mathbf{X} \approx \mathbf{WH} ). The non-negativity constraint enhances interpretability, providing a parts-based representation [17] [15]. Despite this advantage, standard NMF is a single-layer decomposition that assumes a flat structure in the data, limiting its ability to capture hierarchical relationships and making it prone to local minima and sensitivity to noise [3] [18].
Hierarchical NMF (HNMF) recursively applies NMF in multiple layers to discover overarching topics encompassing lower-level features. In a typical two-layer HNMF, the input data matrix ( \mathbf{X} ) is first factorized into ( \mathbf{W}_1 ) and ( \mathbf{H}_1 ). The coefficient matrix ( \mathbf{H}_1 ) is then further factorized into ( \mathbf{W}_2 ) and ( \mathbf{H}_2 ), resulting in the overall factorization ( \mathbf{X} \approx \mathbf{W}_1 \mathbf{W}_2 \mathbf{H}_2 ) [3]. This structure provides several key advantages over both PCA and standard NMF for spectral data analysis, which are summarized in the table below.
Table 1: Qualitative comparison of PCA, Standard NMF, and Hierarchical NMF for spectral data analysis.
| Feature | PCA | Standard NMF | Hierarchical NMF (HNMF) |
|---|---|---|---|
| Interpretability | Low (negative components) | High (additive parts) | Very High (multi-level structure) |
| Data Structure Model | Linear, global structure | Linear, flat structure | Linear, hierarchical structure |
| Noise Robustness | Moderate | Low to Moderate | High (with robust variants) |
| Handling of Mixed Pixels | Poor | Good | Excellent |
| Application Flexibility | General purpose | Domain-specific | Domain-specific with hierarchy |
The theoretical advantages of HNMF are substantiated by empirical evidence from various applications. The following table summarizes key performance metrics from recent studies.
Table 2: Quantitative performance comparison of PCA, NMF, and HNMF in different applications.
| Application Domain | Metric | PCA | Standard NMF | HNMF / Robust DNMF |
|---|---|---|---|---|
| Hyperspectral Unmixing [18] | Reconstruction Error (Synthetic Data) | - | High | Low ((\ell_{2,1})-RDNMF) |
| Mineral Identification [19] | Average Accuracy | - | - | 84.8% (Clustering-Rank1 NMF) |
| Document Classification [3] | Classification Accuracy (20 Newsgroups) | - | Lower than HNMF | Higher than HNMF (Neural NMF) |
| Mass Spectrometry Imaging [17] | Match to UMAP Distributions | Poor | Good | Best (KL-NMF recommended) |
This protocol outlines the steps for applying a two-layer HNMF to decompose a hyperspectral image dataset ( \mathbf{X} ) (with rows representing spectral bands and columns representing pixels) into endmembers and hierarchical abundances [3] [18].
Research Reagent Solutions:
Software: Python (Nimfa library, scikit-learn) or R (NMF package) with HNMF capabilities.
Procedure:
Rank Selection (k1, k2): Choose the layer ranks, for example by scanning reconstruction error against candidate ranks or from prior knowledge of the expected number of endmembers and their higher-level groupings.

Layer 1 Factorization: Factorize ( \mathbf{X} \approx \mathbf{W}_1 \mathbf{H}_1 ) with rank k1 using a standard NMF solver; the columns of ( \mathbf{W}_1 ) represent candidate endmember spectra.

Layer 2 Factorization: Factorize the coefficient matrix ( \mathbf{H}_1 \approx \mathbf{W}_2 \mathbf{H}_2 ) with rank k2 (k2 < k1) to group the fine-level endmembers into coarser classes.

Result Interpretation: Interpret ( \mathbf{W}_1 ) as fine-level endmembers, ( \mathbf{W}_1 \mathbf{W}_2 ) as coarse-level signatures, and ( \mathbf{H}_2 ) as hierarchical abundances, consistent with the overall model ( \mathbf{X} \approx \mathbf{W}_1 \mathbf{W}_2 \mathbf{H}_2 ).
Real-world spectral data is often contaminated by noise. This protocol modifies the basic HNMF using the ( \ell_{2,1} )-norm to improve robustness [18].
Procedure: Replace the Frobenius loss with the ( \ell_{2,1} )-norm of the reconstruction residual and adjust the update rules accordingly, so that entire corrupted pixels (columns) are down-weighted rather than having their errors squared [18].
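The full update derivation is beyond this note, but the central quantity, the ( \ell_{2,1} )-norm of the reconstruction residual, is simple to compute; because it sums the ℓ2 norms of residual columns rather than squaring them, whole corrupted pixels are penalized only linearly:

```python
import numpy as np

def l21_norm(R):
    """l2,1 norm: the sum over columns of each column's l2 norm."""
    return np.linalg.norm(R, axis=0).sum()

# Robust reconstruction loss for X ~ W H:
#   loss = l21_norm(X - W @ H)
# The Frobenius loss penalizes outlier pixels quadratically; the l2,1 loss
# grows only linearly in a column's magnitude, improving noise robustness.
```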
The following diagram illustrates the complete two-stage workflow for robust HNMF applied to hyperspectral unmixing, integrating both the pretraining and fine-tuning stages.
This diagram illustrates the conceptual hierarchical decomposition of data in HNMF, showing how the model reveals multi-level structure compared to standard NMF.
Hierarchical NMF represents a significant advancement over both PCA and standard NMF for the analysis of spectral data in materials research. Its capacity to model the intrinsic hierarchical structure of complex mixtures, coupled with superior interpretability and robustness to noise, makes it an indispensable tool for researchers. The provided protocols and visualizations offer a practical foundation for implementing HNMF, enabling deeper insights into material composition and accelerating discovery in fields ranging from pharmaceuticals to geology. As HNMF algorithms continue to evolve, their integration with domain-specific knowledge will further unlock the potential of spectral data analysis.
Within the domain of unsupervised machine learning, Nonnegative Matrix Factorization (NMF) serves as a pivotal tool for parts-based data representation. The objective of NMF is to approximate a given nonnegative data matrix ( \mathbf{V} ) as the product of two lower-dimensional, nonnegative factor matrices: ( \mathbf{V} \approx \mathbf{W}^T \mathbf{H} ) [20]. This constraint of nonnegativity is crucial for materials research and many scientific fields, as it yields sparse, parts-based representations that are often more physically interpretable than those from methods permitting negative values (e.g., Principal Component Analysis) [2] [21]. Among the many algorithms developed to compute NMF, the Multiplicative Update (MU) and Hierarchical Alternating Least Squares (HALS) frameworks stand out for their widespread use and distinct characteristics.
The MU algorithm, popularized by Lee and Seung, is renowned for its simplicity of implementation [20] [22]. It operates through element-wise update rules that ensure the nonnegativity of the factors without requiring explicit projection steps. For the commonly used Frobenius norm loss, the updates for ( \mathbf{W} ) and ( \mathbf{H} ) are given by: [ \mathbf{W}_{ij} \leftarrow \mathbf{W}_{ij} \frac{(\mathbf{V}\mathbf{H}^T)_{ij}}{(\mathbf{W}\mathbf{H}\mathbf{H}^T)_{ij}} \quad \text{and} \quad \mathbf{H}_{ij} \leftarrow \mathbf{H}_{ij} \frac{(\mathbf{W}^T\mathbf{V})_{ij}}{(\mathbf{W}^T\mathbf{W}\mathbf{H})_{ij}} ] These multiplicative rules can be derived from gradient descent by using adaptive learning rates that eliminate subtraction and thus prevent negative elements [22]. A key advantage of MU is its adaptability to various loss functions and regularizations, making it a versatile tool. However, a significant drawback is that its convergence can be slow for some problems, particularly when minimizing the Frobenius norm [20].
In contrast, the HALS algorithm is a block coordinate descent method that optimizes one column of ( \mathbf{W} ) and one row of ( \mathbf{H} ) at a time [21]. Instead of updating the entire matrices simultaneously, HALS solves a series of constrained subproblems. For each column ( \mathbf{w}_k ) of ( \mathbf{W} ) and each row ( \mathbf{h}_k ) of ( \mathbf{H} ), the updates are: [ \mathbf{w}_k \leftarrow \left[ \frac{ (\mathbf{V} - \sum_{j \neq k} \mathbf{w}_j \mathbf{h}_j) \mathbf{h}_k^T }{ \|\mathbf{h}_k\|_2^2 } \right]_+ \quad \text{and} \quad \mathbf{h}_k \leftarrow \left[ \frac{ \mathbf{w}_k^T (\mathbf{V} - \sum_{j \neq k} \mathbf{w}_j \mathbf{h}_j) }{ \|\mathbf{w}_k\|_2^2 } \right]_+ ] where ( [\cdot]_+ ) denotes the projection onto the nonnegative orthant. This hierarchical approach often yields faster convergence and superior numerical performance compared to MU, as it more effectively exploits the structure of the problem [23] [21].
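A direct NumPy transcription of these column/row updates is shown below (a sketch favoring clarity over speed: the residual is recomputed per component, and `eps` avoids division by zero):

```python
import numpy as np

def nmf_hals(X, k, n_iter=100, eps=1e-10, seed=0):
    """Hierarchical ALS: update one column of W and one row of H at a time."""
    rng = np.random.default_rng(seed)
    W = rng.random((X.shape[0], k))
    H = rng.random((k, X.shape[1]))
    for _ in range(n_iter):
        for j in range(k):
            R = X - W @ H + np.outer(W[:, j], H[j])   # residual excluding term j
            W[:, j] = np.maximum(R @ H[j], 0) / (H[j] @ H[j] + eps)
            H[j] = np.maximum(W[:, j] @ R, 0) / (W[:, j] @ W[:, j] + eps)
    return W, H
```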
The theoretical differences between MU and HALS translate into distinct practical performance. Empirical assessments, particularly in fields like electroencephalography (EEG) analysis, provide quantitative measures for comparing these algorithms across key metrics including estimation accuracy, stability, and computational time.
Table 1: Quantitative Comparison of NMF Algorithms on Simulated Data (SNR = 20 dB)
| Algorithm | Average Correlation Coefficient* | Stability (Iq Index) | Relative Computation Time |
|---|---|---|---|
| lraNMF_HALS | ~0.95 | ~0.90 | 1.0x (Fastest) |
| HALS | ~0.85 | ~0.75 | ~1.5x |
| lraNMF_MU | ~0.75 | ~0.65 | ~2.0x |
| NMF_MU | ~0.65 | ~0.50 | ~3.0x |
*Correlation of estimated components with ground truth. Higher is better. Data derived from assessment of NMF algorithms for EEG analysis [23].
The data in Table 1 demonstrates that HALS-based methods, particularly the low-rank approximation variant (lraNMF_HALS), comprehensively outperform MU algorithms. The lraNMF_HALS algorithm achieves the highest accuracy in recovering true underlying components, exhibits the greatest stability (as measured by the Iq index, where a higher value indicates more consistent results across multiple runs), and requires the least computational time [23]. This superior performance is attributed to HALS's more effective update strategy, which avoids the slow convergence often associated with the multiplicative updates.
Table 2: General Algorithmic Characteristics and Applicability
| Feature | Multiplicative Updates (MU) | Hierarchical ALS (HALS) |
|---|---|---|
| Core Principle | Simultaneous updates via multiplication | Cyclic, column/row-wise updates |
| Convergence Speed | Slower, especially for Frobenius loss [20] | Faster [23] [21] |
| Stability of Results | Lower, higher variability between runs [23] | Higher, more reproducible results [23] |
| Implementation Complexity | Simple, easy to code [20] | More complex, requires careful optimization |
| Best-Suited For | Quick prototyping; KL divergence loss [20] | Large-scale data; high accuracy & stability required [23] [21] |
The application of NMF in materials science, particularly for techniques like 4D-Scanning Transmission Electron Microscopy (4D-STEM), requires specific protocols to ensure the extracted components are physically interpretable [2].
Objective: To decompose a 4D-STEM dataset ( I_{4D}(x, y, u, v) ) into a set of nonnegative basis diffractions and their corresponding spatial maps to identify distinct material phases or orientations.
Materials and Data Preparation: A 4D-STEM dataset reshaped into a nonnegative matrix ( \mathbf{X} ) with diffraction pixels as rows and probe positions as columns (see Table 3 for supporting computational tools) [2].
Procedure: Initialize ( \mathbf{W} ) and ( \mathbf{H} ) with nonnegative values; run the MU or HALS updates until the Frobenius reconstruction error stabilizes; then reshape the columns of ( \mathbf{W} ) into basis diffraction patterns and the rows of ( \mathbf{H} ) into spatial maps for phase identification [2].
Objective: To perform NMF with constraints that incorporate domain-specific knowledge from electron microscopy, such as spatial smoothness in maps and specific intensity profiles in diffractions, to avoid physically implausible artifacts.
Rationale: Primitive NMF can produce components with "downward-convex peaks" or high-frequency noise that are not physically meaningful in the context of electron diffraction [2]. This protocol integrates constraints to mitigate these issues.
Procedure: As in the basic protocol, but interleave constraint steps in each iteration: smooth the spatial maps in ( \mathbf{H} ), filter unphysical high-frequency components and downward-convex peaks from the diffraction profiles in ( \mathbf{W} ), and re-project both factors onto the nonnegative orthant before the next update [2].
Figure 1: Workflow for 4D-STEM data factorization using primitive and constrained NMF.
Successfully implementing NMF in a materials research context involves both data and software resources. The following table lists key "research reagents" for computational experiments.
Table 3: Essential Computational Reagents for NMF in Materials Science
| Tool / Resource | Type | Function in Analysis | Example Platform/Library |
|---|---|---|---|
| 4D-STEM Data Acquisition | Raw Data | Provides the initial nonnegative matrix ( \mathbf{V} ) for factorization [2] | Electron Microscope |
| Data Reshaping Script | Preprocessing | Transforms 4D data array ( I_{4D} ) into 2D matrix ( \mathbf{X} ) for NMF [2] | Python (NumPy), MATLAB |
| NMF Algorithm Solver | Core Algorithm | Computes the factorization ( \mathbf{X} \approx \mathbf{W} \mathbf{H} ) using MU, HALS, or other variants [2] [23] | scikit-learn ('mu', 'cd'), MATLAB ('nnmf'), Custom HALS [23] |
| Smoothing & Constraint Functions | Post-processing/Constraint | Enforces domain knowledge (e.g., spatial smoothness) on ( \mathbf{W} ) or ( \mathbf{H} ) [2] | Custom Image Processing Filters |
| Stability Validation Script | Validation | Assesses the reliability of extracted components via multiple runs and clustering [23] | Python, MATLAB |
The choice between Multiplicative Update and Hierarchical Alternating Least Squares algorithms for Nonnegative Matrix Factorization has a direct and significant impact on the quality and efficiency of data analysis in materials research. While MU offers simplicity and ease of implementation, comprehensive benchmarks show that HALS and its variants (like lraNMF_HALS) provide superior performance in terms of convergence speed, stability, and estimation accuracy [23]. For researchers working with complex materials characterization data such as 4D-STEM, starting with a primitive NMF analysis (using either MU or HALS) and then progressing to a constrained NMF framework that incorporates domain-specific knowledge is a powerful approach to extract physically meaningful and interpretable components from high-dimensional datasets [2].
Four-Dimensional Scanning Transmission Electron Microscopy (4D-STEM) has emerged as a revolutionary technique in materials characterization, enabling the acquisition of a two-dimensional diffraction pattern at every probe position during a two-dimensional raster scan, thus generating a complex four-dimensional dataset [24]. This method captures all available information from the probe's interaction with the sample, going beyond conventional STEM imaging which integrates large portions of the diffraction pattern to generate a single intensity value per probe position, thereby discarding vast amounts of structural and electrostatic information [25]. The 4D-STEM technique allows researchers to visualize the distribution of crystalline phases, crystal orientations, and the directions of magnetic and electric fields by leveraging differences in diffraction data at specific spatial positions [24].
However, this advanced capability comes with significant computational challenges. A typical 4D-STEM dataset can contain tens of thousands of diffraction patterns, often reaching gigabyte-scale sizes (e.g., 128⁴ pixels with 4 bytes per pixel) [26]. For instance, datasets comprising 3,364 diffractions with 128×128 pixels in diffraction space are common, creating substantial computational burdens for analysis [26]. This deluge of information necessitates sophisticated statistical and machine learning approaches to extract meaningful crystallographic insights with nanometer spatial resolution, pushing the boundaries of conventional electron microscopy data analysis methods [27] [26].
Nonnegative Matrix Factorization (NMF) provides a powerful framework for decomposing complex 4D-STEM datasets into interpretable components. The core mathematical principle involves representing the experimental data matrix X as a linear combination of essential diffraction patterns S and their spatial distributions C through the equation X = SC [26]. In this formulation, X ∈ ℝ₊^{n_uv × n_xy} represents the transformed 4D-STEM data where each experimental diffraction pattern s(u,v) at position (x,y) is transformed into a one-dimensional column vector, with n_xy = n_x × n_y representing the total number of probe positions and n_uv = n_u × n_v representing the dimensionality of each diffraction pattern [26].
The matrices S ∈ ℝ₊^{n_uv × n_k} and C ∈ ℝ₊^{n_k × n_xy} contain the factorized diffraction patterns and their spatial distributions (maps), respectively, with n_k representing the number of components and n_k ≪ n_xy ensuring dimensionality reduction [26]. The non-negativity constraints on all matrices are physically meaningful for electron microscopy data since diffraction intensities and spatial concentrations cannot be negative, leading to more interpretable components compared to other factorization methods like Principal Component Analysis (PCA) where components often include unphysical negative values [27].
The hierarchical aspect of HNMF addresses a critical challenge in conventional NMF: determining the optimal number of components. Hierarchical clustering generates a nested set of clusters represented as a dendrogram, allowing researchers to explore the data structure at different levels of granularity without pre-specifying the final number of clusters [26]. Unlike conventional clustering that uses Euclidean distances or cosine similarity, the optimized HNMF approach employs a crystallographic similarity measure based on cross-correlation of diffraction patterns transformed into polar coordinates (r-φ space) [26].
This specialized similarity metric accounts for the physics of electron diffraction governed by Bragg's law (2dsinθ = λ), where the scattering angle θ is directly proportional to the inverse lattice constant of the material [26]. By allowing shifts only along the φ axis during cross-correlation computation, this approach automatically corrects for in-plane rotations of crystal domains, a common occurrence in real specimens that would otherwise complicate analysis using conventional distance measures [26]. The cross-correlation values range from -1 to 1, with peaks indicating perfect similarity and off-centering reflecting misalignment that can be computationally corrected.
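A minimal sketch of this similarity measure is given below; the polar remapping and FFT-based circular correlation are generic implementations assuming SciPy, with interpolation order, grid sizes, and centering as illustrative choices:

```python
import numpy as np
from scipy.ndimage import map_coordinates

def to_polar(img, n_r=64, n_phi=180):
    """Resample a diffraction pattern onto an (r, phi) grid about its center."""
    cy, cx = (np.array(img.shape) - 1) / 2
    r = np.linspace(0, min(cy, cx), n_r)
    phi = np.linspace(0, 2 * np.pi, n_phi, endpoint=False)
    R, PHI = np.meshgrid(r, phi, indexing="ij")
    coords = np.stack([cy + R * np.sin(PHI), cx + R * np.cos(PHI)])
    return map_coordinates(img, coords, order=1)

def rotation_invariant_similarity(p1, p2):
    """Max normalized cross-correlation over circular shifts along phi only."""
    a = (p1 - p1.mean()) / (p1.std() + 1e-10)
    b = (p2 - p2.mean()) / (p2.std() + 1e-10)
    corr = np.fft.ifft(np.fft.fft(a, axis=1) *
                       np.conj(np.fft.fft(b, axis=1)), axis=1).real
    return corr.sum(axis=0).max() / a.size   # ~1 for identical, rotated patterns
```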
Recent advances have incorporated domain-specific constraints inherent to electron microscopy to further enhance the interpretability and physical relevance of the factorization results. These constraints include spatial resolution limits and continuous intensity features without downward-convex peaks, which reflect physical knowledge about how materials scatter electrons [1]. This constrained NMF approach has demonstrated superior performance in decomposing both simulated and actual experimental 4D-STEM data into interpretable diffractions and maps that cannot be achieved using PCA and primitive NMF methods [1].
The domain knowledge integration helps mitigate common artifacts found in conventional machine learning techniques that rely solely on mathematical constraints without incorporating physical understanding of electron microscopy principles [1]. By embedding these domain-specific constraints directly into the factorization algorithm, researchers can obtain results that are not only mathematically sound but also physically meaningful, bridging the gap between pure data analysis and materials science interpretation.
The acquisition of high-quality 4D-STEM data requires careful optimization of experimental parameters to ensure sufficient signal-to-noise ratio while minimizing electron dose damage to sensitive specimens. The following table summarizes key acquisition parameters used in published studies applying HNMF to 4D-STEM data:
Table 1: Typical 4D-STEM Data Acquisition Parameters
| Parameter | Specification | Application Context |
|---|---|---|
| Accelerating Voltage | 200 kV | High-resolution imaging of titanium oxide nanosheets [27] |
| Spatial Scan Dimensions | 58Ã58 pixels (3,364 diffractions) | Metallic glass analysis [26] |
| Diffraction Pattern Dimensions | 128Ã128 pixels | Standard for balance of resolution & file size [26] |
| Detector Type | Fast, high-sensitivity camera (CCD/CMOS) | Ptychography and orientation mapping [24] |
| Convergence Angle | Small (non-overlapping disks) to large (overlapping disks) | Application-dependent [24] |
The implementation of HNMF for 4D-STEM data follows a multi-step procedure that combines alternating least-squares NMF with hierarchical clustering:
Data Preprocessing: The 4D data I₄D(x, y, u, v) is transformed into a 2D matrix X where each column represents an unraveled diffraction pattern [26].
Dimensionality Reduction via NMF: Factorize X ≈ SC using alternating least-squares updates, comparing reconstruction errors (MSE) across candidate component numbers to estimate the dimensionality [26].
Hierarchical Clustering: Group the factorized components into a dendrogram using the crystallographic cross-correlation similarity in polar coordinates, so the data structure can be examined at multiple levels of granularity [26].
HNMF Workflow for 4D-STEM Data Analysis
The successful application of HNMF to 4D-STEM data analysis requires both physical specimens and computational tools, as detailed in the following table:
Table 2: Essential Research Reagents and Computational Solutions
| Category | Specific Examples | Function/Purpose |
|---|---|---|
| Specimen Systems | Titanium oxide nanosheets, Zr-Cu-Al metallic glass, Mn-Zn ferrite | Model systems for method validation [27] [26] |
| Software Platforms | DigitalMicrograph (Gatan) with custom scripts, Python/scikit-learn | Implementation of NMF and clustering algorithms [26] |
| Computational Methods | Alternating Least Squares (ALS) NMF, Cross-correlation similarity | Core factorization and similarity measurement [26] |
| Detector Systems | Fast pixelated detectors (CCD/CMOS), Timepix3 event-based detectors | Data acquisition with high temporal resolution [28] |
| Preprocessing Tools | Polar coordinate transformation, Intensity normalization | Data preparation for crystallographic analysis [26] |
In a foundational study, NMF was applied to 4D-STEM data acquired from titanium oxide nanosheets with overlapping domains [27]. The experimental data contained diffraction patterns from both pristine Ti₀.₈₇O₂ and topotactically reduced domains, which were successfully factorized into interpretable components using the HNMF approach [27]. The analysis revealed that NMF provided lower Mean Square Errors (MSEs) compared to PCA for up to 9 components, demonstrating its superior capability for identifying a small number of essential components from complex 4D-STEM data [27]. This case study established HNMF as a valid approach for mining useful crystallographic information from big data obtained using 4D-STEM, particularly for systems with overlapping structural domains.
The combination of 4D-STEM and optimized unsupervised machine learning enabled comprehensive bimodal analysis of a high-pressure-annealed metallic glass, Zr-Cu-Al [26]. This investigation revealed an amorphous matrix and crystalline precipitates with an average diameter of approximately 7 nm, which were challenging to detect using conventional STEM techniques [26]. The HNMF approach successfully decomposed the complex dataset into physically meaningful diffraction patterns and spatial maps, demonstrating the power of this method for analyzing nanostructures that deviate from perfect crystallinity and would be difficult to characterize using traditional methods.
Recent advances have applied randomized NMF (RNMF) with QB decomposition preprocessing to map complex battery interfaces, specifically between amorphous Li₁₀GeP₂S₁₂ (LGPS) and crystalline Li(Ni,Co,Mn)O₂ (NMC) [29]. This approach addressed the significant computational challenges of analyzing large 4D-STEM datasets, achieving scaling independent of the largest data dimension (~O(nk)) instead of the conventional O(nmk) scaling of standard NMF [29]. The successful application to this technologically important material system highlights the potential of HNMF for investigating mixed crystalline-amorphous interfaces in functional materials.
HNMF Application Domains in Materials Research
The application of HNMF to 4D-STEM data faces significant computational hurdles due to the large dataset sizes involved. Conventional NMF scales as O(nmk) where n is the number of probe positions, m is the number of diffraction features, and k is the number of components, making analysis of large 4D datasets computationally intensive and often prohibitively slow [29]. For example, while PCA analysis might complete in 55 seconds, NMF analysis on the same dataset could require 44 hours [29]. To address these challenges, researchers have developed optimized approaches including randomized NMF (RNMF) using QB decomposition as a preprocessing step, which achieves scaling independent of the largest data dimension (~O(nk)) [29]. Additionally, event-driven acquisition and processing frameworks based on direct electron detectors (e.g., Timepix3 chip) offer potential for reduced memory, bandwidth, and computational requirements by working directly with sparse electron event data rather than intermediate dense representations [28].
The HNMF approach has two primary technical difficulties: first, the number of components must be assumed in advance, and second, there is a possibility of convergence to local minima rather than the global minimum of interest [27]. To mitigate these issues, researchers typically perform multiple computations with different initial values to survey the global minimum and compare MSE values across different component numbers to estimate the optimal dimensionality [26]. The hierarchical clustering component of the workflow helps address the component number uncertainty by allowing examination of the data structure at multiple levels of granularity. Additionally, the incorporation of domain-specific constraints, such as spatial resolution limits and continuous intensity features without downward-convex peaks, has shown promise in improving result interpretability and physical relevance [1]. These constrained NMF approaches successfully decompose both simulated and actual experimental data into interpretable diffractions and maps that cannot be achieved using PCA and primitive NMF methods [1].
Hierarchical Nonnegative Matrix Factorization represents a powerful analytical framework for extracting meaningful information from complex 4D-STEM datasets. By combining the intrinsic interpretability of NMF with the flexibility of hierarchical clustering and domain-specific constraints, this approach enables comprehensive bimodal analysis of material nanostructures that would be challenging using conventional methods. The continuing development of computational optimizations, such as randomized NMF and event-driven processing frameworks, promises to make these techniques more accessible for routine analysis of large 4D-STEM datasets. As instrumentation advances yield ever-larger datasets, the integration of domain knowledge with sophisticated machine learning approaches like HNMF will be crucial for unlocking the full potential of 4D-STEM for materials characterization across diverse applications from energy storage to magnetic materials.
Hyperspectral unmixing (HU) is a cornerstone analytical technique for interpreting hyperspectral images (HSIs), which are acquired across numerous contiguous wavelength bands, providing detailed spatial and spectral information about a scene [30] [31]. A fundamental challenge in analyzing these images arises from the limited spatial resolution of the imaging sensors. This often results in a single pixel capturing a mixture of the spectral signatures of multiple distinct materials present in the sensor's field of view [32]. Hyperspectral unmixing is the inverse process designed to resolve this mixture by decomposing each pixel's spectrum into a set of fundamental spectral signatures, known as endmembers (which ideally correspond to pure materials), and their corresponding fractional abundances (which represent the proportion of each material within the pixel) [32] [30]. The most foundational approach to this problem is the Linear Mixing Model (LMM), which assumes that a pixel's spectrum is a linear combination of endmember spectra, weighted by their abundances, and that photons interact with only one material before reaching the sensor [32] [31].
However, a significant limitation of classical LMM is the assumption that a single spectral signature is perfectly representative of an entire material class. In reality, endmember variability is a pervasive phenomenon caused by variable illumination, environmental conditions, atmospheric effects, temporal changes, and intrinsic differences in the material itself (e.g., grain size or chemical composition) [32] [33]. Ignoring this intra-class spectral variation introduces errors and propagates inaccuracies throughout the analysis, limiting the reliability of the identified material phases and their abundances. Therefore, advancing unmixing techniques to explicitly account for spectral variability is crucial for improving the accuracy of material identification and quantification in complex scenarios, such as planetary surface mapping or the analysis of sophisticated material systems in a research setting [32] [31]. This document details protocols and application notes for tackling these challenges, with a specific focus on integration with hierarchical and constrained nonnegative matrix factorization techniques.
The choice of a mixing model is the primary factor determining the approach to unmixing and the interpretation of the results. The two broad categories are linear and nonlinear models.
Under the Linear Mixing Model, the observed spectrum of a pixel, x, is given by:
x = S a + e
where S is the matrix containing the endmember spectra, a is the vector of fractional abundances, and e is a noise term [32]. The abundances are typically constrained to be non-negative and sum to one, representing the fractional area covered by each material [32] [31].

Spectral variability refers to the change in the spectral signature of a single material class. Endmember variability is a specific term for this phenomenon within the unmixing context [32]. The principal causes include variable illumination and viewing geometry, atmospheric and environmental conditions, temporal changes, and intrinsic variability of the material itself (e.g., grain size or chemical composition) [32] [33].
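For illustration, the abundances a for a single pixel can be estimated by non-negative least squares, with the sum-to-one constraint enforced through a heavily weighted augmentation row (a standard fully constrained least-squares trick). The function name unmix_pixel and the weight delta below are assumptions.

```python
# Solve x ≈ S a subject to a >= 0 and sum(a) ≈ 1 under the Linear Mixing Model.
import numpy as np
from scipy.optimize import nnls

def unmix_pixel(S, x, delta=1e3):
    n_bands, n_endmembers = S.shape
    S_aug = np.vstack([S, delta * np.ones((1, n_endmembers))])  # sum-to-one row
    x_aug = np.append(x, delta)
    a, _ = nnls(S_aug, x_aug)     # non-negative least squares
    return a                      # fractional abundances, approximately summing to 1
```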
Nonnegative Matrix Factorization (NMF) is a powerful unsupervised learning tool for hyperspectral unmixing. Given a nonnegative data matrix X (where columns are pixel spectra), NMF factorizes it into two low-rank nonnegative matrices: W (the basis vectors, or endmembers) and H (the coefficients, or abundances) such that X ≈ WH [3] [2]. Its "parts-based" representation and inherent non-negativity align perfectly with the physical constraints of hyperspectral unmixing.
Table 1: Key Matrix Factorization Frameworks for Hyperspectral Unmixing
| Framework | Key Formulation | Advantages for Unmixing |
|---|---|---|
| NMF [3] [2] | X ≈ WH | Provides a "parts-based" representation; naturally enforces non-negativity on endmembers and abundances. |
| Hierarchical NMF (HNMF) [3] | Recursively applies NMF in layers: X ≈ W₁H₁, H₁ ≈ W₂H₂, … | Reveals hierarchical topic structure; allows exploration of materials at multiple levels of granularity. |
| Neural NMF [3] | Frames HNMF as a neural network with backpropagation optimization. | Improves reconstruction error and hierarchical structure interpretability over traditional HNMF. |
| Constrained NMF [2] | X ≈ WH, with domain-specific constraints applied during optimization. | Incorporates physical knowledge (e.g., spatial smoothness, specific intensity profiles) to yield more interpretable components. |
Several methodological approaches have been developed to address the critical challenge of endmember variability, which can be categorized as follows [32]:
This protocol is adapted from the work on multiplatform image fusion using Multivariate Curve Resolution-Alternating Least Squares (MCR-ALS), a flexible linear unmixing method [34]. MCR-ALS is particularly suited for fusion scenarios as it can handle multiset structures and incorporate diverse constraints.
Step-by-Step Workflow:
Sample Preparation & Image Acquisition:
Data Preprocessing & Fusion:
MCR-ALS Analysis with Constraints:
Iteratively optimize the concentration (C) and spectral (S) matrices to minimize ||X - CS|| under the applied constraints (e.g., non-negativity).

Model Validation & Interpretation:
The following diagram illustrates the logical workflow and data flow for this MCR-ALS based fusion protocol.
This protocol applies a modern Constrained NMF approach, incorporating domain-specific knowledge to analyze 4D-Scanning Transmission Electron Microscopy (4D-STEM) data, a type of hyperspectral data [2]. The principles are directly applicable to spectral unmixing where physical realism is paramount.
Step-by-Step Workflow:
Data Transformation:
Transform the 4D dataset I_4D(x, y, u, v) into a 2D matrix X. Each 2D diffraction pattern I_2D(u, v) at a real-space position (x, y) is unfolded into a column of X. Thus, rows of X correspond to reciprocal-space coordinates (u, v) and columns correspond to real-space positions (x, y) [2].

Algorithm Selection & Constraint Definition:
Spatial resolution constraint: apply a smoothing (low-pass) filter to the rows of H during each iteration to enforce smoothness and suppress high-frequency noise, respecting the spatial resolution of the instrument.
Intensity profile constraint: for the columns of W, apply a filter or penalty that prevents unphysical "downward-convex" peaks in continuous intensity backgrounds. This ensures the extracted spectral components represent realistic signal profiles.

Constrained NMF Optimization:
Run the multiplicative-update algorithm for W and H, integrating the defined constraints within each iteration (see the sketch after this workflow):

W ← W ⊙ (XHᵀ) ⊘ (WHHᵀ)
H ← H ⊙ (WᵀX) ⊘ (WᵀWH)

Apply the spatial smoothing constraint to H and the intensity profile constraint to W after each multiplicative step.

Component Analysis & Clustering:
The k-th column of W is an interpreted diffraction pattern (endmember), and the k-th row of H is its corresponding abundance map.

The following diagram illustrates the core computational process of the Constrained NMF protocol.
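As a complement to the workflow, the following is a minimal sketch of the constrained update loop from the optimization step above; the Gaussian blur stands in for the instrument's spatial-resolution constraint on H, the handling of W is deliberately simplified, and all parameter values are assumptions.

```python
# Multiplicative NMF updates with a spatial smoothing constraint applied to H.
import numpy as np
from scipy.ndimage import gaussian_filter

def constrained_nmf(X, n_k, map_shape, sigma=1.0, n_iter=300, eps=1e-9):
    rng = np.random.default_rng(0)
    W = rng.random((X.shape[0], n_k))
    H = rng.random((n_k, X.shape[1]))
    for _ in range(n_iter):
        W *= (X @ H.T) / (W @ H @ H.T + eps)   # standard MU step for W
        H *= (W.T @ X) / (W.T @ W @ H + eps)   # standard MU step for H
        # Constraint: smooth each weighting map in real space (low-pass filter)
        for k in range(n_k):
            H[k] = gaussian_filter(H[k].reshape(map_shape), sigma).ravel()
    return W, H   # columns of W: diffraction components; rows of H: abundance maps
```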
Table 2: Key Research Reagent Solutions for Hyperspectral Unmixing Experiments
| Item / Solution | Function / Role in the Protocol |
|---|---|
| Calcium Fluoride (CaF₂) Slides & Coverslips | Provides a non-absorbing substrate and window for transmission-mode spectroscopic imaging, especially in IR and SR-FTIR [34]. |
| Agarose Embedding Medium | Used to support and maintain the structural integrity of delicate biological tissues (e.g., rice leaves) during cryosectioning [34]. |
| Spectral Library / Dictionary | A collection of known pure material spectra (e.g., USGS, JPL). Used for endmember identification, initialization, or as a constraint during unmixing [31]. |
| MCR-ALS Software Suite | A computational environment (e.g., in MATLAB, Python) implementing the MCR-ALS algorithm with flexible constraint application, crucial for the fusion protocol [34]. |
| Constrained NMF Algorithm Scripts | Custom or library scripts (e.g., for DigitalMicrograph, Python) that implement NMF with domain-specific constraints for 4D-STEM/HSI analysis [2]. |
| Spatial Co-registration Tool | Software tool for aligning multiple hyperspectral images from different platforms to a common spatial grid, a prerequisite for image fusion [34]. |
Hyperspectral unmixing has evolved significantly from rigid linear models to flexible frameworks that account for the real-world complexity of spectral variability. The integration of domain-specific knowledge into factorization methods like Constrained NMF and MCR-ALS is pivotal for extracting chemically and physically meaningful endmembers and abundance maps from complex data. Furthermore, the emergence of hierarchical and multi-layer factorization models, such as Neural NMF, offers a promising avenue for discovering and representing the intrinsic hierarchical structure of material phases in a system. For researchers in materials science, these advanced protocols provide a robust toolkit for moving beyond simple identification towards a comprehensive, quantitative, and interpretable analysis of material composition and distribution.
Hierarchical Nonnegative Matrix Factorization (hNMF) represents a powerful advancement in the computational analysis of complex materials science data. Building upon the standard NMF framework, which approximates a non-negative data matrix V as the product of two lower-dimensional, non-negative matrices W (basis components) and H (coefficients) such that V ≈ WH, hierarchical methods introduce multiple layers of decomposition to separate mixed signals at different spatial or spectral scales [35]. This technical approach is particularly valuable for materials characterization, where researchers often encounter measurements containing both sharp, localized features (sparse targets) and broad, distributed signals (diffuse backgrounds). The Two-Hierarchical NMF (thNMF) methodology specifically addresses this common analytical challenge by implementing a cascaded factorization process that sequentially isolates these fundamentally different signal types.
In materials research, the ability to distinguish sparse targets from diffuse backgrounds has profound implications across multiple domains. For battery research, it enables the separation of isolated degradation particles from homogeneous electrode matrices. In catalyst studies, it helps distinguish active catalytic sites from support material signatures. For polymer composites, it can separate filler particle signals from the bulk polymer background. The thNMF framework provides a mathematically rigorous, computationally efficient approach to this pervasive analytical problem, offering significant advantages over traditional spectral unmixing or background subtraction techniques [35] [36].
The Two-Hierarchical NMF algorithm extends standard NMF through a two-stage decomposition process. In the first stage, the original data matrix V is factorized to separate dominant background components:
V ≈ W₁H₁ + E₁
where W₁ represents the basis vectors corresponding to diffuse background patterns, H₁ contains their coefficients, and E₁ is the residual matrix after background subtraction. The innovation of thNMF lies in its treatment of this residual. Rather than treating E₁ as noise, the algorithm performs a second factorization specifically designed to capture sparse targets:
E₁ ≈ W₂H₂
where W₂ represents the basis vectors for sparse target components and H₂ contains their sparse coefficients. The complete thNMF model thus becomes:
V ≈ W₁H₁ + W₂H₂
This hierarchical approach explicitly models the different statistical characteristics of background and target components. The method incorporates specialized regularization terms tailored to each component type: smoothness or graph regularization for background components to capture their diffuse nature [35], and sparsity constraints (L₁ regularization) for target components to enforce their localized character.
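One compact way to prototype this two-stage decomposition is to chain two scikit-learn NMF fits, with an L1 penalty on the second stage; the ranks, the penalty weight, and the clipping of the residual to non-negative values are simplifying assumptions rather than the published algorithm.

```python
# Two-stage factorization in the spirit of thNMF: V ≈ W1 H1 + W2 H2.
import numpy as np
from sklearn.decomposition import NMF

def thnmf(V, k_bg=3, k_tg=2, l1_target=0.1):
    # Stage 1: low-rank model of the diffuse background
    bg = NMF(n_components=k_bg, init="nndsvd", max_iter=500)
    W1, H1 = bg.fit_transform(V), bg.components_
    # Stage 2: sparse targets extracted from the non-negative residual
    E1 = np.clip(V - W1 @ H1, 0, None)
    tg = NMF(n_components=k_tg, init="nndsvdar", max_iter=500,
             l1_ratio=1.0, alpha_W=l1_target, alpha_H=l1_target)
    W2, H2 = tg.fit_transform(E1), tg.components_
    return (W1, H1), (W2, H2)
```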
thNMF incorporates several critical innovations that distinguish it from conventional NMF approaches:
Structure-Preserving Factorization: Unlike standard NMF which treats all components equally, thNMF preserves the intrinsic structural properties of both background and target components through tailored regularization schemes [35].
Weighted Label Constraints: Drawing inspiration from semi-supervised NMF methods, thNMF can incorporate prior knowledge about specific components through weighted label matrices that preserve both label information and data magnitude [35].
Graph Regularization: For materials with spatial organization, thNMF can integrate graph Laplacian regularization to preserve local neighborhood relationships in the factorizations, maintaining the topological structure of the data [36].
The convergence of thNMF is guaranteed through an iterative update algorithm that alternately optimizes the background and target factors while maintaining non-negativity constraints. The optimization procedure minimizes a joint cost function combining reconstruction error, background smoothness penalties, and target sparsity penalties.
The successful application of thNMF requires careful implementation across three phases: data preprocessing, model optimization, and result interpretation. Below is a comprehensive protocol for implementing thNMF in materials characterization workflows.
Data Formatting: Convert raw analytical data (spectral images, diffraction patterns, etc.) into a non-negative data matrix V of dimensions m × n, where m represents features (wavelengths, scattering angles) and n represents samples or spatial positions.
Background Assessment: Perform initial exploratory data analysis to characterize background dominance using:
Normalization: Apply appropriate normalization based on data type:
Parameter Initialization:
Iterative Optimization:
Hyperparameter Tuning:
Application Context: Identifying rare precious metal clusters on high-surface-area support materials using spectroscopic imaging techniques.
Table 1: Key Parameters for Catalyst Characterization
| Parameter | Recommended Setting | Rationale |
|---|---|---|
| Background Rank (k₁) | 2-4 | Captures support material and bulk phase variations |
| Target Rank (k₂) | 1-2 | Corresponds to different metal cluster types |
| Sparsity Regularization (λ₂) | 0.1-0.5 | Enforces localized nature of metal clusters |
| Convergence Threshold | 1e-6 | Balances accuracy and computation time |
Step-by-Step Workflow:
Application Context: Isolating heterogeneous degradation products from intact electrode matrix in cycled battery materials.
Table 2: Optimization Parameters for Battery Materials Analysis
| Parameter | Recommended Setting | Rationale |
|---|---|---|
| Background Rank (k₁) | 3-5 | Represents primary electrode phases and electrolytes |
| Target Rank (k₂) | 2-3 | Captures different degradation species |
| Smoothness Regularization (λ₁) | 0.05-0.2 | Maintains homogeneity of bulk phases |
| Number of Iterations | 1000-5000 | Ensures convergence for complex degradation patterns |
Step-by-Step Workflow:
Successful implementation of thNMF requires both computational resources and analytical materials tailored to specific applications in materials research.
Table 3: Essential Research Reagent Solutions for thNMF Applications
| Reagent/Tool | Function | Application Examples |
|---|---|---|
| High-Purity Solvents (Electronic Grade) | Sample preparation and processing | N-Methylformamide for precursor synthesis [37] [38] |
| Reference Standard Materials | Method validation and calibration | Certified reference materials for quantitative analysis |
| Specialized Software Libraries | Algorithm implementation | Python with scikit-learn, MANTA, or custom NMF toolboxes [39] |
| High-Resolution Characterization | Ground truth validation | TEM, SEM, XRD for structural correlation |
The MANTA Python library provides an excellent foundation for implementing thNMF, offering integrated pipelines for corpus-specific tokenization and advanced term weighting schemes that can be adapted for materials science applications [39]. For large-scale spectral imaging data, specialized implementations using projective NMF methods can significantly reduce computational overhead while maintaining analytical precision.
The analytical workflow for thNMF follows a systematic process from data acquisition to scientific insight, with multiple validation checkpoints to ensure robust results.
thNMF Analytical Workflow
Successful interpretation of thNMF results requires careful analysis of both the background and target components:
Background Component Analysis:
Target Component Analysis:
Quantitative Metrics:
The effectiveness of thNMF can be evaluated through systematic comparison with alternative methodologies across multiple performance dimensions.
Table 4: Method Performance Comparison for Materials Characterization
| Method | Background Separation | Target Sparsity | Computational Efficiency | Interpretability |
|---|---|---|---|---|
| thNMF | Excellent [35] | Excellent [35] | Moderate | High |
| Standard NMF | Poor | Moderate | High | Moderate |
| PCA/ICA | Moderate | Poor | High | Low |
| Deep Learning | Good | Good | Low | Low |
The superior performance of thNMF stems from its explicit modeling of the different statistical characteristics of background and target components. Unlike standard NMF which applies identical constraints to all components, thNMF's hierarchical approach with tailored regularization enables more physically meaningful factorizations specifically optimized for the sparse target separation problem [35].
Two-Hierarchical NMF represents a significant advancement in computational materials characterization, providing researchers with a powerful tool for separating sparse targets from diffuse backgrounds. The method's mathematical foundation in structured matrix factorization, combined with its adaptability to various analytical techniques, makes it particularly valuable for investigating heterogeneous materials systems where rare features dictate functional properties.
Future developments in thNMF will likely focus on several frontiers. The integration with multi-modal data fusion approaches will enable more comprehensive materials characterization by simultaneously analyzing complementary datasets [36]. Advances in real-time implementation will open opportunities for adaptive experimental control, where thNMF results guide subsequent measurement strategies. Finally, the incorporation of physics-based constraints into the factorization framework will enhance the physical interpretability of results, bridging the gap between computational pattern recognition and fundamental materials science principles.
Nonnegative Matrix Factorization (NMF) is a powerful unsupervised learning technique that decomposes a non-negative data matrix X into the product of two lower-rank, non-negative matrices: a basis matrix W and a coefficient matrix H, such that X ≈ WH [40]. This constraint of non-negativity leads to parts-based representations that are often more interpretable than those provided by other factorization methods [41]. In materials research, this property is particularly valuable as it aligns with physical realities where measurements such as pixel intensities, chemical concentrations, or spectral signals cannot be negative [2].
Hierarchical NMF (HNMF) extends this fundamental concept by building multi-layer architectures that progressively extract features at different levels of abstraction. Unlike "shallow" NMF models which may not accurately capture complex underlying physiology or material structures, deep HNMF architectures can learn hierarchical coordination patterns through multiple layers of decomposition [42]. This approach naturally generalizes matrix-based counterparts to tensor formulations, enabling the analysis of complex, multi-modal data structures common in modern materials characterization [43]. The hierarchical organization allows researchers to visualize topics and features at various levels of granularity while illustrating their hierarchical relationships, making it especially suitable for analyzing intricate material systems with features spanning multiple scales [43] [44].
The standard NMF problem can be formulated as an optimization task: given a non-negative matrix X ∈ ℝ₊^{m×n}, we seek non-negative matrices W ∈ ℝ₊^{m×k} and H ∈ ℝ₊^{k×n} that minimize the reconstruction error [2]:

min_{W ≥ 0, H ≥ 0} ||X - WH||_F²
The rank k is typically chosen to be much smaller than min(m, n), resulting in a compressed, parts-based representation of the original data [40]. Two common algorithms for solving this optimization are Multiplicative Update (MU) and Alternating Least Squares (ALS). The MU algorithm employs iterative update rules that provably do not increase the reconstruction error [2]:

H ← H ⊙ (WᵀX) ⊘ (WᵀWH)
W ← W ⊙ (XHᵀ) ⊘ (WHHᵀ)

where ⊙ denotes element-wise multiplication and ⊘ denotes element-wise division [40]. The ALS approach alternately solves for W and H while projecting the solutions onto the non-negative orthant [2].
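A minimal sketch of the projected-ALS variant, assuming unconstrained least-squares solves followed by clipping to the non-negative orthant:

```python
# Projected alternating least squares for X ≈ WH with W, H >= 0.
import numpy as np

def nmf_als(X, k, n_iter=200, seed=0):
    rng = np.random.default_rng(seed)
    W = rng.random((X.shape[0], k))
    for _ in range(n_iter):
        H = np.linalg.lstsq(W, X, rcond=None)[0]       # solve W H ≈ X for H
        np.clip(H, 0, None, out=H)                     # project onto H >= 0
        Wt = np.linalg.lstsq(H.T, X.T, rcond=None)[0]  # solve Hᵀ Wᵀ ≈ Xᵀ for Wᵀ
        W = np.clip(Wt.T, 0, None)                     # project onto W >= 0
    return W, H
```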
To enhance the physical interpretability and performance of NMF in materials science applications, several regularization strategies can be incorporated:
Table 1: Comparison of NMF Regularization Techniques for Materials Research
| Regularization Type | Mathematical Formulation | Materials Science Application | Effect on Results |
|---|---|---|---|
| Sparsity | Addition of L1 penalty term to cost function | Identification of fundamental structural units | Increases interpretability; reduces size variability |
| Graph Regularization | Incorporation of Laplacian smoothness term | Preservation of manifold geometry in spectral data | Discovers intrinsic geometric structure; improves clustering |
| Domain Constraints | Application of physical constraints during optimization | Enforcement of resolution limits in electron microscopy | Eliminates physically impossible high-frequency components |
Deep NMF architectures extend the concept of standard NMF by implementing multiple layers of decomposition. In a typical two-layer HNMF, the input matrix X is first factorized into W₁ and H₁, after which the coefficient matrix H₁ is further factorized into W₂ and H₂, resulting in the overall factorization X ≈ W₁W₂H₂ [42]. This approach can be generalized to multiple layers, creating a hierarchical structure that progressively extracts features at different levels of abstraction.
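A bare-bones sketch of this layer-wise scheme chains scikit-learn NMF fits so that each layer factorizes the previous layer's coefficient matrix; the two-level ranks are illustrative placeholders.

```python
# Hierarchical (deep) NMF: X ≈ W1 H1, H1 ≈ W2 H2, hence X ≈ W1 W2 H2.
from sklearn.decomposition import NMF

def deep_nmf(X, ranks=(8, 3)):
    Ws, H = [], X
    for k in ranks:
        layer = NMF(n_components=k, init="nndsvda", max_iter=500)
        Ws.append(layer.fit_transform(H))   # basis W_l for this layer
        H = layer.components_               # coefficients feed the next layer
    return Ws, H                            # X ≈ Ws[0] @ Ws[1] @ ... @ H
```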
A recent innovation in deep NMF networks incorporates local feature interactions through subsequent 1×1 convolutional layers following NMF modules. This architecture more closely emulates cortical hyper-columns in biological systems and has demonstrated performance exceeding that of pure convolutional neural networks of similar size on benchmark datasets [45]. The 1×1 convolutional layers enable local interactions that include inhibition, an important property missing in earlier NMF networks that only included positive connections [45].
For complex, multi-modal data structures encountered in materials characterization, hierarchical nonnegative tensor decomposition (HNTD) provides a natural generalization of matrix-based approaches [43]. Tensors can represent higher-dimensional data, such as 4D-STEM (Scanning Transmission Electron Microscopy) datasets I₄D(x, y, u, v) where (x, y) are real-space coordinates and (u, v) are reciprocal-space coordinates [2]. The hierarchical decomposition of such tensor data enables the identification of patterns and relationships across multiple dimensions simultaneously, providing a more comprehensive analysis of material structure-property relationships.
Diagram 1: Deep NMF hierarchical architecture showing progressive feature extraction through multiple layers (W1, W2, W3) from input data (X) to final features (H).
Objective: To decompose 4D-STEM data into interpretable diffractions and maps for classification of nanometer-sized crystalline precipitates embedded in amorphous metallic glass [2].
Step-by-Step Procedure:
Data Preparation: Transform the 4D data I₄D(x, y, u, v) into a matrix X ∈ ℝ₊^{n_uv × n_xy} by reshaping the 2D experimental diffractions I₂D(u, v) into 1D column vectors, where rows represent reciprocal-space coordinates and columns represent real-space coordinates.
Initialization: Initialize W and H with non-negative random values. Alternatively, use smart initialization algorithms like Non-negative Double Singular Value Decomposition (NNDSVD) for faster convergence.
Constrained Factorization: Apply NMF with domain-specific constraints, such as spatial smoothing of the weighting maps in H and suppression of downward-convex artifact peaks in the components of W [2].
Iterative Optimization: Execute multiplicative update rules (Equations 3-4) or alternating least squares (Equations 5-6) until convergence criteria are met (max iterations or minimal improvement threshold).
Component Analysis: For each of the n_k components, reshape the columns of W back to 2D diffractions w_k(u, v) and the rows of H to 2D maps h_k(x, y).
Hierarchical Clustering: Apply spectral clustering to weighting maps based on diffraction similarity, combining polar coordinate transformation and uniaxial cross-correlation.
Expected Outcomes: Successful protocol execution will yield physically interpretable diffractions and maps that reveal crystalline precipitates in amorphous matrices, enabling classification according to their diffraction patterns.
Objective: To identify common and subject-specific functional units in material systems with complex, hierarchical structures [42].
Step-by-Step Procedure:
Motion Quantification: Extract displacement fields or other relevant motion quantities from time-series characterization data (e.g., in situ TEM or XRD).
Deep Joint Sparse NMF: Implement a deep graph-regularized sparse NMF framework:
Cross-Sample Integration: Jointly factorize data from multiple samples or regions while preserving shared and unique features through common and subject-specific weighting maps.
Spectral Clustering: Apply spectral clustering to both common and subject-specific weighting maps to determine functional units at different hierarchical levels.
Validation: Compare identified functional units with known material phases or structures and evaluate clustering performance using metrics such as silhouette score or domain-specific validation measures.
Table 2: Key Parameters for Deep NMF in Materials Classification
| Parameter | Recommended Setting | Impact on Results | Optimization Method |
|---|---|---|---|
| Decomposition Rank (k) | 5-20 components | Higher values capture more detail but may overfit; lower values may miss key features | Elbow method in reconstruction error plot |
| Sparsity Regularization | λ = 0.01-0.1 | Increases interpretability but may oversimplify complex structures | Cross-validation with domain knowledge |
| Graph Regularization | α = 0.001-0.01 | Preserves data manifold structure | Sensitivity analysis with clustering metrics |
| Convergence Tolerance | 1e-6 | Balances computational cost with solution quality | Fixed based on computational resources |
| Maximum Iterations | 1000-5000 | Ensures convergence without excessive computation time | Monitoring of reconstruction error |
Table 3: Essential Computational Tools for Deep NMF Implementation
| Tool Name | Function | Application Context |
|---|---|---|
| Scikit-learn NMF | Basic NMF implementation with MU and CD algorithms | Initial prototyping and standard applications |
| HyperSpy | Multi-dimensional data analysis | Electron microscopy data (4D-STEM, EELS) |
| DigitalMicrograph Scripts | Custom NMF with domain constraints | 4D-STEM processing with instrument-specific knowledge |
| MATLAB NMF Function | ALS and MU algorithms with custom regularizations | Algorithm development and comparative studies |
| SAS Viya NMF Procedure | Large-scale factorization with APG and CAPG methods | Industrial-scale materials data analysis |
Diagram 2: Integrated workflow for materials classification using deep NMF, showing the pathway from raw data to structure-property relationships.
Validation Metrics:
Deep NMF architectures represent a significant advancement over traditional matrix factorization approaches for materials classification tasks. By leveraging hierarchical feature extraction and incorporating domain-specific constraints, these methods enable the identification of interpretable, physically meaningful patterns in complex materials data. The protocols outlined in this document provide a roadmap for implementing these techniques in practice, from data preprocessing through validation. As demonstrated in applications ranging from 4D-STEM analysis to functional unit identification, deep NMF offers a powerful framework for uncovering structure-property relationships across multiple scales in hierarchical material systems.
In the domain of materials research, hierarchical nonnegative matrix factorization (HNMF) has emerged as a powerful tool for deciphering complex, multi-modal data, such as that obtained from powder diffraction or multi-parametric imaging [43] [3]. However, the practical application of HNMF is often hampered by the challenge of ill-convergence, where algorithms converge to poor local minima or degenerate solutions that lack physical interpretability [46] [47]. This ill-convergence stems fundamentally from the non-convex nature of the NMF optimization problem, making the final solution highly dependent on the starting point, or initialization, of the factor matrices [46] [48]. Within the context of a hierarchical model, where factorizations are performed recursively across multiple layers, the propagation of error from a poor initialization in an early layer can be particularly detrimental to the entire structure [3]. Therefore, a systematic approach combining robust initialization and diligent convergence monitoring is not merely beneficial but essential for extracting meaningful, reproducible latent topics, such as distinct material phases or chemical components, from experimental data. This application note provides a detailed protocol to combat ill-convergence, ensuring that HNMF realizes its full potential in materials science applications.
The initialization of factor matrices W and H is the first and one of the most critical steps in any HNMF procedure. A well-chosen initialization strategy accelerates convergence and significantly increases the likelihood of the algorithm finding a solution that is both mathematically sound and physically interpretable.
Ill-convergence in NMF manifests in several ways, including prohibitively slow convergence rates, convergence to local minima with high reconstruction error, and the emergence of degenerative solutions [47]. In a materials context, a degenerative solution might correspond to a factorization that fails to separate distinct chemical phases or assigns non-physical, mixed signatures to components. The susceptibility to ill-convergence is exacerbated in hierarchical models because the factorization at each layer serves as the input to the next. An error introduced at a lower layer is therefore propagated and potentially amplified through the hierarchy, compromising the entire multi-resolution analysis [3]. The non-convexity of the problem means that random initialization, while simple, offers no guarantee of quality or reproducibility, often necessitating multiple runs to secure a satisfactory result [46] [48].
A variety of initialization strategies have been developed to move beyond simple random starts. The following table summarizes the primary categories and their characteristics, with particular emphasis on their applicability to materials research.
Table 1: Classification of NMF Initialization Methods for Materials Research
| Method Category | Examples | Key Principle | Advantages | Disadvantages for Materials Data |
|---|---|---|---|---|
| Randomization-Based | Random Averages [46] | Construct initial factors by averaging random columns of the data matrix X. | Low computational cost; simple to implement. | Lack of reproducibility; may not capture true data structure. |
| Low-Rank Approximation | NNDSVD [48] | Uses singular value decomposition, setting negative values to zero. | Provides a good analytic starting point; faster convergence. | May introduce artifacts due to forced non-negativity. |
| Clustering-Based | Fuzzy C-Means (FCM) [48] | Uses clustering algorithms to identify initial prototype sources. | Provides realistic source estimates. | Computationally expensive; may require its own initialization. |
| Geometric/Convexity-Based | Successive Projection Algorithm (SPA) [48] | Selects pure variables/endmembers based on successive orthogonal projections. | Fast, reproducible, and aligns well with the convex geometry of separable NMF problems. | Assumes near-separability, which may not always hold perfectly. |
For materials data, where the latent components often correspond to physically distinct entities (e.g., pure chemical phases), geometric methods like the Successive Projection Algorithm (SPA) are particularly powerful. SPA identifies columns of the data matrix X that are located at the vertices of the convex hull of the data, which are natural candidates for pure components [48]. When used as an initialization for HNMF, SPA can seed the algorithm with chemically plausible starting points, leading to more interpretable hierarchical topics and improved convergence rates.
Selecting an initialization strategy is only half the battle; diligent monitoring is required to diagnose and combat ill-convergence during the optimization process.
Convergence should be assessed using multiple, complementary metrics to gain a holistic view of the algorithm's behavior.
Reconstruction Error: Track the cost function (e.g., ||X - WH||_F or the Kullback-Leibler divergence) over iterations. A steady, monotonic decrease is expected for many algorithms. Stagnation indicates convergence, while oscillations or a sudden plateau can signal numerical instability or convergence to a poor minimum.
Factor Stationarity: Monitor the relative change in the factor matrices between iterations (e.g., ||W_{i+1} - W_i||_F / ||W_i||_F). When this change falls below a predefined tolerance, the solution can be considered stationary.
Sparsity: The sparsity of the H matrix can be monitored, as overly dense solutions may indicate component mixing.
Orthogonality: The orthogonality of W or H can be quantified and tracked [49].

The following workflow diagram outlines a recommended procedure for integrating these metrics into a robust monitoring protocol.
Diagram 1: Workflow for monitoring HNMF convergence, integrating quantitative metrics and qualitative assessment.
This protocol details the steps for initializing a single layer of HNMF using SPA, which can be applied recursively at each layer of the hierarchy.
Objective: To generate a robust, reproducible initialization for the factor matrices W and H that reduces the risk of ill-convergence.
Materials: A non-negative data matrix X (size m x n) and a target rank r.
1. Normalization: Normalize each column of X to have unit ℓ₂-norm. This ensures the geometric selection is based on direction rather than magnitude.
2. First Selection: Select the column of X with the largest ℓ₂-norm. This is the first "purest" candidate or endmember. Initialize the set of selected indices S = {i₁} and the residual matrix R = X.
3. Successive Projections: For k = 2 to r:
i. Compute the orthogonal projector P = I - R_{:,S} R_{:,S}^†, where † denotes the pseudo-inverse.
ii. Project all columns of R onto the space orthogonal to the current set: R_proj = P R.
iii. Find the column index i_k of R_proj with the largest ℓ₂-norm.
iv. Add i_k to the set S.
4. Initialization: a. Set W₀ = X_{:,S}, i.e., the columns of X corresponding to the indices in S. b. Solve for the initial H₀ using non-negative least squares: H₀ = argmin_{H≥0} ||X - W₀H||_F².

This procedure yields an initial pair (W₀, H₀) that approximates the data with a set of pure, extreme vectors, providing an excellent starting point for subsequent HNMF iterations [48].
This protocol describes how to implement a HALS algorithm, known for its fast convergence [49], within an HNMF framework, integrated with the monitoring workflow from Diagram 1.
Objective: To solve the HNMF optimization problem at a given layer while actively monitoring for signs of ill-convergence.
Materials: Data matrix X, initial matrices W₀ and H₀ (e.g., from Protocol 1), convergence tolerance tol, maximum iterations max_iter.
1. Initialization: Set W = W₀, H = H₀, and the iteration counter t = 0.
2. Iterate: While t < max_iter and convergence is not reached:
a. Update H: For each column j of H, update using a projected gradient step or a column-wise least squares update, ensuring non-negativity [49].
b. Update W: For each column j of W (each topic), update similarly, ensuring non-negativity.
c. Normalization: Normalize the columns of W to unit norm and adjust H accordingly to maintain the product WH.
d. Convergence Diagnostics (See Diagram 1):
i. Compute Cost: Calculate the current cost f(W, H) = 0.5 * ||X - WH||_F^2.
ii. Check Stationarity: Compute ΔW = ||W_{t+1} - W_t||_F / ||W_t||_F.
iii. Check Sparsity/Orthogonality: Compute the sparsity of H or the orthogonality of W if desired.
iv. Check Criteria: If the relative decrease in cost and ΔW are both below tol, proceed to the qualitative check. Otherwise, continue iterating.
3. Qualitative Check: Visually inspect the components in W. Do they represent coherent, physically plausible components (e.g., a clean diffraction pattern)? If not, convergence may be ill-founded, and the algorithm should be re-initialized.
4. Layer Handoff: Accept W and H for the current layer. Use the product H as the data matrix X for the next layer in the HNMF and repeat Protocols 1 and 2. (A code sketch of this monitored loop follows Table 2.)

Table 2: Key Computational "Reagents" for Robust HNMF
| Item Name | Type/Function | Application in HNMF Protocol |
|---|---|---|
| Successive Projection Algorithm (SPA) | Geometric initialization method. | Protocol 1: Used to generate chemically plausible, reproducible initial factors W₀ and H₀. |
| Hierarchical Alternating Least Squares (HALS) | Efficient NMF optimization algorithm. | Protocol 2: The core solver for the NMF problem at each hierarchical layer, known for fast convergence [49]. |
| Frobenius Norm Metric | Quantitative measure of reconstruction fidelity. | Protocol 2: The primary cost function monitored to ensure the model accurately approximates the raw data. |
| Stationarity Monitor | Quantitative measure of solution stability. | Protocol 2: Tracks the change in factor matrices between iterations to confirm convergence. |
| Sparsity Metric | Quantitative measure of component purity. | Protocol 2: Aids in diagnosing component mixing; higher sparsity in H often indicates cleaner separation. |
| Normalized Data Matrix X | Preprocessed input data. | Protocol 1: Essential pre-conditioning for SPA and other geometric methods to function correctly. |
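The monitored loop referenced in Protocol 2 might be sketched as follows; the column-wise HALS updates and the stopping logic follow the steps above, while the tolerance values and normalization details are assumptions.

```python
# HALS-style NMF updates with cost and stationarity monitoring (Protocol 2 sketch).
import numpy as np

def hals_monitored(X, W, H, tol=1e-5, max_iter=500, eps=1e-12):
    prev_cost = np.inf
    for t in range(max_iter):
        W_old = W.copy()
        XHt, HHt = X @ H.T, H @ H.T
        for j in range(W.shape[1]):           # update one column of W at a time
            resid = XHt[:, j] - W @ HHt[:, j] + W[:, j] * HHt[j, j]
            W[:, j] = np.clip(resid / (HHt[j, j] + eps), 0, None)
        WtX, WtW = W.T @ X, W.T @ W
        for j in range(H.shape[0]):           # update one row of H at a time
            resid = WtX[j] - WtW[j] @ H + WtW[j, j] * H[j]
            H[j] = np.clip(resid / (WtW[j, j] + eps), 0, None)
        norms = np.linalg.norm(W, axis=0) + eps
        W /= norms                            # normalize columns of W ...
        H *= norms[:, None]                   # ... and push the scale into H
        cost = 0.5 * np.linalg.norm(X - W @ H) ** 2
        dW = np.linalg.norm(W - W_old) / (np.linalg.norm(W_old) + eps)
        if abs(prev_cost - cost) / (prev_cost + eps) < tol and dW < tol:
            break                             # stationary: hand off to next layer
        prev_cost = cost
    return W, H
```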
Nonnegative matrix factorization (NMF) has established itself as a powerful unsupervised learning tool for deciphering complex, high-dimensional data across various scientific domains, from hyperspectral imaging to microbiology and materials characterization [51] [5]. The standard NMF model approximates a non-negative data matrix X as the product of two lower-rank non-negative matrices: X ≈ WH, where W represents basis components (e.g., endmember spectra, microbial communities, or diffraction patterns) and H contains their corresponding coefficients or abundances [2] [5].
However, the conventional NMF framework suffers from a critical limitation: its purely mathematical formulation often yields solutions that are mathematically sound but physically implausible. In materials research, this manifests as components with negative intensities, unrealistic sparse representations, or downward-convex peaks in spectral data, artifacts that violate fundamental physical principles of scientific measurement [2]. For instance, in electron microscopy, the number of detected electrons cannot be negative, yet principal component analysis (PCA) and primitive NMF often produce negative intensities [2].
Hierarchical NMF (HNMF) architectures provide a structured framework for integrating domain knowledge through multi-layer decomposition, enabling more nuanced constraint application at different representation levels [52]. This article details protocols for implementing spatial and spectral constraints within HNMF to ensure physically meaningful results in materials discovery and characterization.
The core challenge in materials data analysis lies in reconciling mathematical models with physical reality. Primitive NMF algorithms, while enforcing non-negativity, frequently generate components that contradict instrumental and physical constraints [2]. Two prevalent issues include downward-convex artifact peaks introduced into continuous intensity profiles and overly sparse components that misrepresent the true mixing of signals [2].
These limitations stem from the fundamentally ill-posed nature of NMF, where infinitely many factorizations can approximate the original data with similar accuracy. Domain knowledge provides the essential regularization needed to steer solutions toward physical plausibility.
Table 1: Categories of Domain Knowledge for Constraining NMF in Materials Science
| Constraint Category | Physical Basis | Representation in HNMF |
|---|---|---|
| Spectral Constraints | Spectral continuity, known emission profiles, non-negative intensities | Constraints on the W matrix (basis spectra) |
| Spatial Constraints | Spatial smoothness, localized features, structural homogeneity | Constraints on the H matrix (abundances/concentrations) |
| Optical Priors | Point spread function, Seidel aberrations, optical transfer function | Guided PSF modeling for spatially variant blur correction [53] |
| Compositional Constraints | Abundance sum-to-one, material conservation | Equality constraints (e.g., ANC and ASC) [4] |
Deep bidirectional hierarchical NMF architectures significantly enhance the representation of complex materials data by capturing multi-level manifolds that single-layer NMF cannot adequately describe [52]. In a typical hierarchical framework:
X ≈ W₁W₂⋯W_LH_L
where each layer progressively refines the data representation. This multi-layer approach enables differential constraint application across hierarchical levels, with stricter physical constraints often applied to shallow layers and more relaxed constraints at deeper layers [52].
Figure 1: Hierarchical NMF framework with domain knowledge integration at multiple levels. Shallow layers typically employ stricter physical constraints, while deeper layers capture fine structure with relaxed constraints.
Four-dimensional scanning transmission electron microscopy (4D-STEM) generates complex datasets where each spatial position contains a full diffraction pattern, creating data cubes I₄D(x, y, u, v) that challenge conventional analysis methods [2].
Experimental Workflow:
Data Reformation: Transform the 4D data I₄D(x, y, u, v) to a matrix X ∈ ℝ₊^{n_uv × n_xy}, where rows represent reciprocal space coordinates and columns represent real-space coordinates [2].
Constraint Definition:
Hierarchical Decomposition:
Component Interpretation:
Validation Metrics:
Aerial imaging systems suffer from spatially variant aberrations due to atmospheric turbulence, thermal deformation, and platform vibrations, resulting in position-dependent point spread functions (PSFs) that conventional deconvolution methods cannot handle [53].
Implementation Steps:
PSF Modeling:
Patch-wise Deconvolution:
Plug-and-Play (PnP) Regularization:
Performance Metrics:
Table 2: Quantitative Results of Spatially-Variant Aberration Correction in Aerial Imaging
| Imaging Modality | NIMA Improvement | HyperIQA Improvement | Computational Efficiency |
|---|---|---|---|
| Visible-light cameras | 7.49% | 14.15% | 3 iterations (was 8) |
| Infrared cameras | 29.58% | 17.53% | 3 iterations (was 8) |
Hyperspectral imaging of materials and terrestrial surfaces must account for endmember variability: spectral signature changes due to illumination, intrinsic variability, and environmental factors [4].
Procedure:
Prototypal Endmember Extraction:
Hierarchical Sparsity Constraints:
Deep Bidirectional Optimization:
Validation Approach:
Table 3: Essential Research Reagents and Computational Tools for Constrained HNMF
| Tool/Reagent | Function/Application | Implementation Considerations |
|---|---|---|
| Optical Prior Database | PSF modeling for aberration correction [53] | Seidel coefficients, lens parameters, environmental conditions |
| Spectral Library | Endmember variability representation [4] | Prototypal endmembers, extremal pixels, canonical spectra |
| Spatial Smoothing Filters | Enforcement of spatial resolution limits [2] | Gaussian kernels, total variation regularization |
| Multiplicative Update (MU) Algorithm | Standard NMF solver with nonnegativity preservation [2] | Implemented in scikit-learn ('mu') and MATLAB ('mult') |
| Alternating Least Squares (ALS) | Flexible constraint integration via projection [2] | Implemented in scikit-learn ('cd') and MATLAB ('als') |
| Plug-and-Play (PnP) Priors | Deep denoiser integration as regularization [53] | Pretrained network modules for specific artifact types |
| Reweighting Denoising Regularizer | Noise filtering guidance for shallow NMF layers [52] | Prevents over-denoising while maintaining signal fidelity |
Figure 2: Complete workflow for integrating domain knowledge into hierarchical NMF analysis, featuring iterative refinement based on physical plausibility assessment.
Integrating spatial and spectral constraints within hierarchical NMF frameworks represents a paradigm shift in materials data analysis, moving from mathematically convenient solutions to physically plausible interpretations. The protocols outlined herein provide actionable methodologies for incorporating domain knowledge at multiple levels of the factorization process, effectively addressing the pervasive issue of physically implausible results. As materials characterization techniques continue to generate increasingly complex datasets, these constrained HNMF approaches will become essential tools for extracting meaningful scientific insights from the data deluge.
Future developments in this field will likely focus on adaptive constraint mechanisms that automatically adjust to data characteristics, more sophisticated bidirectional hierarchical architectures, and tighter integration of physical simulation models with data-driven factorization approaches.
Hierarchical Nonnegative Matrix Factorization (HNMF) has emerged as a powerful tool for unraveling complex, multi-scale structures in materials science data. By recursively applying NMF to learn latent topics at different levels of granularity, HNMF provides a parts-based representation that enhances the interpretability of underlying material components [3]. A critical and long-standing challenge in constructing these models is rank selection: the choice of factorization rank at each hierarchical layer [54]. This decision profoundly impacts the model's ability to capture meaningful material features without overfitting noise or oversimplifying the signal. The rank selection problem is further complicated in unsupervised learning settings common in materials research, where ground truth labels are often unavailable [54]. This article addresses these dilemmas by presenting a structured framework of strategies and protocols for determining optimal hierarchical components, specifically tailored for materials research applications.
The factorization rank, typically denoted as k or r, determines the number of components or latent topics extracted at each layer of the HNMF architecture. Selecting a rank that is too high risks modeling experimental noise, while choosing one that is too low oversimplifies the material's intrinsic structure and can miss critical features [55]. In hierarchical implementations, where data is sequentially decomposed as X ≈ W⁽¹⁾W⁽²⁾⋯W⁽ᴸ⁾H⁽ᴸ⁾, the rank at each layer defines the resolution of features discovered [3] [9]. The non-increasing error property of NMF, whereby the reconstruction error generally decreases with increasing rank, further complicates selection, as there is no intrinsic minimum to indicate the optimal value [55] [56].
The following table summarizes the primary rank selection methods, their core principles, and key performance characteristics as validated across multiple studies.
Table 1: Comparative Analysis of Rank Selection Strategies
| Method | Core Principle | Key Metrics | Advantages | Limitations |
|---|---|---|---|---|
| Cophenetic Correlation (ccc) | Measures clustering stability over multiple NMF runs with random initializations [55]. | Cophenetic correlation coefficient [55] [57]. | Exploits stochastic nature of NMF; no training required [55]. | Performance degrades when underlying clusters are non-orthogonal [55]. |
| Elbow & UIK Method | Identifies the "knee point" where adding more components yields diminishing returns in error reduction [57]. | Residual Sum of Squares (RSS) curvature; first inflection point [57]. | Fast, computationally efficient, free from prior rank input [57]. | Can be subjective without automated knee-point detection (e.g., UIK) [57]. |
| Concordance | Assesses stability of NMF solutions relative to a reference decomposition (e.g., NNDSVD) [55]. | Ratio of concordance to approximation error [55]. | Outperforms ccc for broader matrix classes, including non-orthogonal clusters [55]. | Requires definition of a stable reference solution. |
| Image Quality Assessment (IQA) | Balances reconstruction fidelity against model complexity using image quality metrics [10]. | K-component loss combined with IQA metrics [10]. | Directly optimizes for human-interpretable feature quality in imaging data. | Primarily suited for image-based datasets (e.g., 4D-STEM). |
| NMF-Merge | Performs factorization at a higher initial rank, then iteratively merges similar components [56]. | Reconstruction error (SED) and solution consistency post-merging [56]. | Helps escape poor local minima; yields more consistent solutions [56]. | Introduces additional computational steps for merging. |
The following section provides detailed, step-by-step protocols for implementing two of the most robust rank selection strategies in a materials research context.
This protocol is designed for determining the optimal rank of a single NMF layer using the UIK method, which automates the identification of the "elbow" in a scree plot [57].
Table 2: Research Reagents and Computational Tools for UIK Protocol
| Item Name | Function/Description | Example/Notes |
|---|---|---|
| Data Matrix (V) | The target non-negative material data matrix for decomposition. | Dimensions: m features × n samples (e.g., spectra, diffraction patterns) [57]. |
| NMF Algorithm | The computational engine for performing the matrix factorization. | e.g., Multiplicative Update (MU), Alternating Least Squares (ALS) [2] [57]. |
| UIK Function | Automatically detects the knee point in the RSS vs. rank plot. | Implementation available in R package inflection [57]. |
| RSS Calculator | Computes the Residual Sum of Squares for each NMF model. | RSS = ||V - WH||²_F [57]. |
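A hedged Python sketch of this protocol follows. Since the uik() function cited in the table belongs to the R package inflection, a simple second-difference (maximum-curvature) heuristic stands in for it here, and the data matrix is a random placeholder.

```python
# RSS scan over candidate ranks with a simple discrete knee detector.
import numpy as np
from sklearn.decomposition import NMF

def rank_scan_rss(V, ranks):
    rss = []
    for r in ranks:
        model = NMF(n_components=r, init="nndsvd", max_iter=500,
                    random_state=0)
        W = model.fit_transform(V)
        H = model.components_
        rss.append(np.linalg.norm(V - W @ H, "fro") ** 2)  # RSS = ||V - WH||_F^2
    return np.asarray(rss)

def knee_point(ranks, rss):
    # Largest positive second difference of the RSS curve ~ the "elbow";
    # a stand-in for inflection::uik(x, y), not a reimplementation of it.
    second_diff = np.diff(rss, n=2)
    return ranks[int(np.argmax(second_diff)) + 1]

V = np.abs(np.random.rand(300, 120))       # placeholder data matrix
ranks = list(range(2, 15))
rss = rank_scan_rss(V, ranks)
print("optimal rank r* ~", knee_point(ranks, rss))
```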
Procedure: run the chosen NMF algorithm over a range of candidate ranks, compute the RSS of each fitted model, and pass the rank values (x) and corresponding RSS values (y) to the uik(x, y) function to compute the optimal rank r* [57].

This protocol outlines a decision-making strategy for determining the number of clusters (k) in HNMF applied to 4D-Scanning Transmission Electron Microscopy (4D-STEM) data, integrating image quality assessment [10].
Data Preprocessing:
Evaluation Phase (Level One):
Decision Phase (Level Two):
Spatial Analysis:
The following diagram illustrates the logical workflow for hierarchical rank selection, integrating the strategies and protocols discussed.
HNMF Rank Selection Workflow
Beyond the established methods, several advanced techniques offer promising avenues for tackling rank selection dilemmas.
α-divergence offers flexibility, and the chemical catalyst-inspired bounding factor aims to improve robustness and clustering accuracy [9].

In materials research, hierarchical nonnegative matrix factorization (NMF) serves as a powerful unsupervised learning tool for decomposing complex, high-dimensional spectral data, such as four-dimensional scanning transmission electron microscopy (4D-STEM), into interpretable components representing material phases, chemical compositions, or structural features [2] [1]. However, real-world materials data is invariably afflicted by data sparsity (incomplete measurements) and noise (from detectors or environmental interference), which can severely degrade the quality and physical interpretability of the factorization. This application note details advanced regularization techniques and robust cost functions, contextualized within a hierarchical NMF framework, to mitigate these challenges effectively. We provide structured comparisons, experimental protocols, and implementation tools specifically tailored for materials scientists and drug development professionals working with spectroscopic and hyperspectral imaging data.
Regularization techniques introduce additional constraints or penalty terms to the NMF objective function, guiding the optimization toward solutions that are not only accurate but also exhibit desirable properties such as sparsity, smoothness, or geometric consistency. These properties are crucial for extracting physically meaningful patterns from noisy materials data.
Table 1: Regularization Techniques for Hierarchical NMF in Materials Science
| Technique | Mathematical Formulation | Primary Effect | Typical Application in Materials Research |
|---|---|---|---|
| Sparsity (L1-Norm) Regularization [12] [58] | ( D = \frac{1}{2} \| \mathbf{X} - \mathbf{WH} \|_F^2 + \lambda ( \| \mathbf{W} \|_1 + \| \mathbf{H} \|_1 ) ) | Promotes localized, part-based representations by forcing many elements in W and/or H to zero. | Identifying discrete chemical phases or distinct spectral signatures from hyperspectral images. |
| Archetypal Regularization [59] [60] | Constrains factors to be sparse and represent data points as convex combinations of "pure" archetypes. | Enhances geometric interpretability and robustness, ensuring recovered archetypes are close to underlying data extremes. | Determining end-member compositions in phase diagrams or pure component spectra in mixtures. |
| Graph Regularization [58] | ( D = \frac{1}{2} \| \mathbf{X} - \mathbf{WH} \|_F^2 + \alpha \,\text{Tr}(\mathbf{H} \mathbf{L}_c \mathbf{H}^T) + \beta \,\text{Tr}(\mathbf{W}^T \mathbf{L}_f \mathbf{W}) ) | Preserves the intrinsic geometric structure (manifold) of both the data space (via ( \mathbf{L}_c )) and the feature space (via ( \mathbf{L}_f )). | Mapping smooth concentration gradients or spatial domains in thin-film or composite material analysis. |
| Domain-Specific Constraints [2] | Incorporates prior knowledge (e.g., spatial smoothness, forbidden intensity profiles) directly into the update rules. | Yields physically plausible components that avoid artifacts like downward-convex peaks or high-frequency noise. | Decomposing 4D-STEM data into interpretable diffraction patterns and spatial maps. |
Objective: Decompose a hyperspectral image (e.g., EELS, EDX) into chemically distinct components while preserving the spatial continuity of phases.
Materials/Software: Python (scikit-learn, Nimfa), Raw spectral data matrix (X), Pre-computed spatial affinity matrix.
Procedure:
1. Normalize the data matrix X to a standard scale (e.g., 0-1).
2. Construct the graphs:
   a. Spatial Graph (L_c): For each pixel, connect it to its 4 or 8 immediate neighbors. The weight of the edge can be 1 (binary) or based on the spectral similarity of the pixels.
   b. Feature Graph (L_f): (Optional) Construct a k-nearest neighbor graph in the spectral feature space to enforce smoothness in the spectral profiles.
3. Initialize W and H using Non-Negative Double Singular Value Decomposition (NNDSVD) for faster convergence [12].
4. Run the dual graph-regularized factorization, then evaluate the spatial smoothness of the coefficient maps in H and the sparsity of the spectral bases in W. Compare the reconstruction error against unregularized NMF. A sketch of the regularized updates follows.
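The following sketch shows one plausible multiplicative-update implementation of the dual graph-regularized objective from Table 1. The adjacency matrices A_c and A_f, the regularization weights, and the iteration count are all illustrative assumptions.

```python
# Multiplicative-update sketch for dual graph-regularized NMF.
# A_c (n x n) and A_f (m x m) are adjacency matrices of the spatial
# (data) and spectral (feature) graphs described in the procedure.
import numpy as np

def dual_graph_nmf(X, r, A_c, A_f, alpha=0.1, beta=0.1,
                   n_iter=300, eps=1e-9, seed=0):
    m, n = X.shape
    D_c = np.diag(A_c.sum(axis=1))        # degree matrices; L = D - A
    D_f = np.diag(A_f.sum(axis=1))
    rng = np.random.default_rng(seed)
    W = rng.random((m, r))
    H = rng.random((r, n))
    for _ in range(n_iter):
        # Graph terms split into attractive (A) and repulsive (D) parts.
        H *= (W.T @ X + alpha * H @ A_c) / (W.T @ W @ H + alpha * H @ D_c + eps)
        W *= (X @ H.T + beta * A_f @ W) / (W @ H @ H.T + beta * D_f @ W + eps)
    return W, H
```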
Traditional NMF based on the Frobenius norm (least squares) is highly sensitive to outliers and non-Gaussian, heavy-tailed noise common in experimental materials data. Robust cost functions replace the Frobenius norm to diminish the influence of outliers.
Table 2: Robust Cost Functions for Noisy Materials Data
| Cost Function | Formulation | Robustness Profile | Advantages in Materials Context |
|---|---|---|---|
| Maximum Correntropy Criterion (MCC) [12] | ( D = - \sum_{i,j} \exp\left(-\frac{(X_{ij} - (WH)_{ij})^2}{2\sigma^2}\right) + \lambda \| \mathbf{H} \|_1 ) | Highly robust to heavy-tailed (impulsive) noise and outliers. | Ideal for vibration analysis in fault detection and data with sporadic, high-intensity noise spikes. |
| β-Divergence [12] | ( D = \sum_{i,j} \frac{X_{ij}^\beta}{\beta(\beta-1)} + \frac{(WH)_{ij}^\beta}{\beta} - \frac{X_{ij}(WH)_{ij}^{\beta-1}}{\beta-1} ) | Tuned via β parameter (β=1: KL-divergence; β=2: Euclidean). | Offers a flexible family of divergences. β<1 often improves robustness to shot noise in electron microscopy. |
| L2,1-Norm [58] | ( D = \| \mathbf{X} - \mathbf{WH} \|_{2,1} ) | Robust to sample-specific outliers (entire corrupted columns in X). | Useful when entire spectra or measurements might be corrupted, ensuring they have minimal impact. |
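As a concrete example of the β-divergence entry above, scikit-learn's multiplicative-update solver accepts an arbitrary beta_loss value. The sketch below uses β = 0.5 purely for illustration; a small positive offset is added to the placeholder data as a precaution, since values of β below 1 can behave poorly on exact zeros.

```python
# Robust factorization with a beta-divergence via scikit-learn's MU solver.
# beta_loss < 1 downweights large residuals, which can help with
# shot-noise-like outliers.
import numpy as np
from sklearn.decomposition import NMF

X = np.abs(np.random.rand(500, 200)) + 1e-6   # strictly positive placeholder

model = NMF(n_components=8, solver="mu", beta_loss=0.5,
            init="nndsvda", max_iter=1000, random_state=0)
W = model.fit_transform(X)
H = model.components_
print("reconstruction error:", model.reconstruction_err_)
```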
Objective: Identify a fault-related frequency band from a noisy vibration signal spectrogram in the presence of heavy-tailed, non-cyclic impulsive noise [12].
Materials/Software: Vibration sensor data, Signal processing toolbox for spectrogram computation.
Procedure:
1. Compute the spectrogram S of the raw vibration signal y(t) to obtain a time-frequency matrix X.
2. Set the algorithm parameters: factorization rank (r, number of components), correntropy kernel bandwidth (σ_k), and sparsity coefficient (λ).
3. Initialize W and H using NNDSVD.
4. Run the sparse NMF-MCC algorithm to factorize X into basis W (frequency patterns) and coefficients H (time activations).
5. Post-process the factors:
   a. Select the component in W that best corresponds to the fault frequency using an indicator like the Envelope Spectrum Indicator (ENVSI).
   b. Reconstruct the signal using only the selected component.
   c. Analyze the envelope spectrum of the reconstructed signal to confirm the presence of the fault frequency and its harmonics.
A Python sketch of this pipeline follows.
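A sketch of the pipeline under stated assumptions: the signal, sampling rate, and fault frequency are synthetic placeholders, standard NMF stands in for the sparse NMF-MCC algorithm, and because ENVSI is not implemented here, component selection uses a simple envelope-spectrum energy proxy in a band around the assumed fault frequency.

```python
# Spectrogram -> NMF -> component selection (steps 1-5, simplified).
import numpy as np
from scipy.signal import spectrogram, hilbert
from sklearn.decomposition import NMF

fs, f_fault = 20_000, 107.0                  # illustrative values
y = np.random.randn(fs * 2)                  # placeholder vibration signal

f, t, X = spectrogram(y, fs=fs, nperseg=256, noverlap=192)
model = NMF(n_components=4, init="nndsvda", solver="mu",
            max_iter=500, random_state=0)
W = model.fit_transform(X)                   # frequency patterns (bases)
H = model.components_                        # time activations

# Proxy for ENVSI: pick the component whose activation envelope has the
# most spectral energy near the assumed fault frequency.
scores = []
for k in range(H.shape[0]):
    env = np.abs(hilbert(H[k]))              # envelope of the activation
    spec = np.abs(np.fft.rfft(env - env.mean()))
    freqs = np.fft.rfftfreq(env.size, d=t[1] - t[0])
    band = (freqs > 0.8 * f_fault) & (freqs < 1.2 * f_fault)
    scores.append(spec[band].sum() if band.any() else 0.0)
print("selected component:", int(np.argmax(scores)))
```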
Table 3: Essential Computational Tools for Robust Hierarchical NMF
| Item | Function/Description | Example Use Case |
|---|---|---|
| Sparse NMF-MCC Algorithm [12] | An NMF variant using the Maximum Correntropy Criterion and L1-norm sparsity for robustness to impulsive noise. | Isolating faint cyclic spectral signatures from bearing vibration data contaminated with heavy-tailed noise. |
| Dual Graph-Regularized NMF [58] | An NMF model that incorporates graphs for both data and feature manifolds to preserve geometric structures. | Unmixing hyperspectral images of material composites where components form smooth spatial gradients. |
| Constrained NMF (ALS Solver) [2] | An Alternating Least Squares (ALS) NMF solver that allows projection-based incorporation of domain-specific constraints. | Processing 4D-STEM data to enforce smooth, non-negative, and physically plausible diffraction patterns. |
| NNDSVD Initialization [12] | A nonnegative double singular value decomposition method to initialize factor matrices, leading to faster convergence. | Providing a stable and accurate starting point for any NMF algorithm, reducing overall computation time. |
| Envelope Spectrum Indicator (ENVSI) [12] | A metric to automatically identify the NMF-derived component that best captures a fault-related frequency band. | Automated analysis pipeline for selecting the most diagnostically relevant filter from multiple NMF components. |
Nonnegative Matrix Factorization (NMF) is a cornerstone dimensionality reduction technique for materials research, capable of extracting meaningful, parts-based representations from high-dimensional experimental data. Hierarchical NMF (HNMF) extends this capability by performing sequential factorizations to uncover latent hierarchical structures within materials data, a property crucial for understanding complex material systems. The performance and interpretability of HNMF depend critically on the selected optimization algorithm. Within the specific context of materials research, where data from techniques like 4D-STEM (4D Scanning Transmission Electron Microscopy) is prevalent, the choice between Multiplicative Update (MU), Alternating Least Squares (ALS), and Gradient-Based Optimizers involves distinct trade-offs between computational efficiency, solution quality, and the ability to incorporate physical constraints. This guide provides a structured comparison and detailed protocols to aid researchers in selecting and implementing the most appropriate optimizer for their HNMF workflows.
The table below summarizes the core characteristics, advantages, and limitations of the three primary optimizer classes for HNMF in materials science.
Table 1: Comparative Analysis of NMF Optimizers for Materials Research
| Optimizer | Mathematical Foundation | Key Advantages | Key Limitations | Ideal Use Cases in Materials Science |
|---|---|---|---|---|
| Multiplicative Update (MU) | Majorization-Minimization framework; iterative element-wise updates [2]. | • Guaranteed non-negativity without projections [2]. • Simple implementation. • Well-established and widely available in tools like scikit-learn and HyperSpy [2]. | • Slower convergence rate [61]. • Can get stuck in poor local minima. • Less flexible for adding custom domain constraints [2]. | • Initial exploratory data analysis. • Datasets of moderate size where simplicity is prioritized over speed. |
| Alternating Least Squares (ALS) | Solves non-negative least squares subproblems for each matrix, alternatingly; uses projection [·]+ to enforce non-negativity [2]. | • Faster convergence than MU in many cases [61]. • High flexibility for incorporating domain-specific constraints via the projection step [2]. | • Projection step is mathematically less rigorous than MU's inherent non-negativity [2]. • Requires careful monitoring for convergence. | • Constrained NMF (e.g., spatial smoothing in 4D-STEM maps, enforcing continuous intensity profiles in diffractions) [2]. • Larger datasets requiring faster processing. |
| Gradient-Based Optimizers | Uses gradient descent with adaptive learning rates, momentum, and other accelerations; framed as a neural network for backpropagation in "Neural NMF" [3]. | • Potential for highest convergence speed with modern acceleration techniques (e.g., extrapolation) [61]. • Discovers better hierarchical structure in deep/hierarchical models [3]. • Enables end-to-end training of complex network architectures. | • Complex implementation and hyperparameter tuning (e.g., learning rate) [62]. • Risk of instability or divergence without proper tuning [62]. | • Deep/Hierarchical NMF architectures like Neural NMF [3]. • Very large-scale datasets (e.g., from high-throughput microscopy). |
The following diagram illustrates the logical decision process for selecting an optimizer and outlines the core iterative workflow shared by all three methods.
Diagram 1: Optimizer Selection Logic
Diagram 2: Core NMF Optimization Workflow
This protocol is designed for decomposing 4D-STEM datasets into constituent diffractions (W) and their spatial maps (H).
Objective: To factorize a 4D-STEM data matrix X into non-negative matrices W (basis diffractions) and H (coefficient maps) using the MU algorithm.
Materials and Reagents:
Procedure:
1. Reshape the 4D dataset I4D(x, y, u, v) into a 2D matrix X of dimensions (n_u * n_v) x (n_x * n_y) [2].
2. Initialize W and H with non-negative values.
3. Iterate until convergence:
   - Update H: H <- H * (W^T @ X) / (W^T @ W @ H + ε)
   - Update W: W <- W * (X @ H^T) / (W @ H @ H^T + ε)
   (where @ denotes matrix multiplication, * and / are element-wise operations, and ε is a small constant to avoid division by zero).
4. Reshape the columns of W back into 2D diffraction patterns and the rows of H into 2D spatial maps for analysis. A runnable sketch of this loop follows.
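A runnable NumPy version of the update loop above; the rank, iteration count, and matrix sizes are illustrative.

```python
# Multiplicative-update NMF loop matching the procedure above.
import numpy as np

def nmf_mu(X, r, n_iter=500, eps=1e-9, seed=0):
    rng = np.random.default_rng(seed)
    n_features, n_samples = X.shape
    W = rng.random((n_features, r))
    H = rng.random((r, n_samples))
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)   # update H (step 3)
        W *= (X @ H.T) / (W @ H @ H.T + eps)   # update W (step 3)
    return W, H

# Example: a flattened 4D-STEM stack would have shape
# (n_u * n_v, n_x * n_y); random data stands in here.
X = np.abs(np.random.rand(64 * 64, 32 * 32))
W, H = nmf_mu(X, r=5)
print(np.linalg.norm(X - W @ H, "fro"))
```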
This protocol uses ALS to enforce physical constraints, such as smoothness in spatial maps, which is often necessary for meaningful materials science interpretation.
Objective: To perform HNMF with ALS, incorporating domain-specific constraints to obtain physically realistic factorizations.
Materials and Reagents:
Constraint (projection) functions that map W and/or H onto a constrained space (e.g., a low-pass filter for smoothness) [2].

Procedure:
1. Initialize W and H.
2. Update W by solving the least-squares subproblem and projecting: W <- [ (X @ H^T) @ inv(H @ H^T) ]_+. Then, apply a constraint function to W (e.g., apply Gaussian filtering to each reshaped basis diffraction to suppress high-frequency noise).
3. Update H analogously: H <- [ inv(W^T @ W) @ (W^T @ X) ]_+. Then, apply a constraint function to H (e.g., apply a smoothing filter to each reshaped coefficient map to reflect the spatial resolution of the microscope).
4. Alternate steps 2-3 until convergence.
5. For hierarchical factorization, take the H matrix from the first layer and use it as the new X matrix for factorization in the next layer, repeating the ALS procedure to uncover deeper hierarchical structure [63] [3]. A NumPy sketch of one constrained update cycle follows.
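Below is a minimal NumPy sketch of one such constrained ALS loop. It substitutes the pseudoinverse for a plain inverse for numerical stability, and uses SciPy's Gaussian filter as the example smoothing constraint; shapes and filter widths are illustrative assumptions.

```python
# Projected-ALS sketch with smoothing constraints on both factors.
import numpy as np
from scipy.ndimage import gaussian_filter

def constrained_als(X, r, pattern_shape, map_shape, n_iter=100, seed=0):
    """pattern_shape: (n_u, n_v) of a diffraction pattern;
    map_shape: (n_x, n_y) of a spatial map. Both are assumptions."""
    rng = np.random.default_rng(seed)
    W = rng.random((X.shape[0], r))
    H = rng.random((r, X.shape[1]))
    for _ in range(n_iter):
        # Least-squares update for W, projected to >= 0, then smoothed.
        W = np.maximum(X @ H.T @ np.linalg.pinv(H @ H.T), 0)
        for k in range(r):
            pattern = W[:, k].reshape(pattern_shape)
            W[:, k] = gaussian_filter(pattern, sigma=1.0).ravel()
        # Least-squares update for H, projected to >= 0, then smoothed.
        H = np.maximum(np.linalg.pinv(W.T @ W) @ W.T @ X, 0)
        for k in range(r):
            amap = H[k].reshape(map_shape)
            H[k] = gaussian_filter(amap, sigma=0.8).ravel()
    return W, H

# Usage: W, H = constrained_als(X, r=5, pattern_shape=(64, 64),
#                               map_shape=(32, 32))
```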
Troubleshooting: verify that the projection [·]+ correctly enforces non-negativity after the unconstrained least-squares calculation.

This protocol frames HNMF as a neural network trained with gradient-based optimizers, enabling the discovery of complex hierarchical topic structures.
Objective: To implement "Neural NMF," a hierarchical NMF model trained with backpropagation, for tasks requiring deep feature extraction.
Materials and Reagents:
Procedure:
1. Define the network with L layers; the model's forward pass is X ≈ W1 @ W2 @ ... @ WL @ HL [63] [3].
2. Define the reconstruction loss L = ½ ||X - W1 W2 ... HL||²_F [63].
3. Use backpropagation to compute the gradients of the loss with respect to W1 ... WL and HL [3].
4. Update all factors with a gradient-based optimizer such as Adam:
   - m_t = β1 * m_{t-1} + (1 - β1) * g_t (1st moment estimate)
   - v_t = β2 * v_{t-1} + (1 - β2) * g_t² (2nd moment estimate)
   - θ_t = θ_{t-1} - η * m_t / (√v_t + ε) (parameter update) [62]
A PyTorch sketch of this training loop follows.
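A hedged PyTorch sketch of a two-layer training loop of this kind; the dimensions, ranks, learning rate, and clamp-based projection are illustrative choices rather than the reference implementation of [3].

```python
# Two-layer "Neural NMF" sketch: factors are free parameters trained
# with Adam and clamped after each step to stay nonnegative.
import torch

torch.manual_seed(0)
m, n, r1, r2 = 200, 400, 16, 4                  # illustrative sizes/ranks
X = torch.rand(m, n)

W1 = torch.rand(m, r1, requires_grad=True)
W2 = torch.rand(r1, r2, requires_grad=True)
H2 = torch.rand(r2, n, requires_grad=True)
opt = torch.optim.Adam([W1, W2, H2], lr=1e-2)

for step in range(2000):
    opt.zero_grad()
    loss = 0.5 * torch.linalg.norm(X - W1 @ W2 @ H2, "fro") ** 2
    loss.backward()
    opt.step()
    with torch.no_grad():                       # projection: enforce >= 0
        for p in (W1, W2, H2):
            p.clamp_(min=0)

print(loss.item())
```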
Troubleshooting:
- If training becomes unstable or diverges, reduce the learning rate (η) [62].
- Enforce non-negativity after each update step (e.g., via torch.clamp(min=0) or a projected optimizer).

Table 2: Key Computational Tools and Environments for HNMF
| Tool/Environment | Function | Example Use Case |
|---|---|---|
| Scikit-learn (Python) | Provides ready-to-use implementations of standard NMF with MU and ALS solvers [2]. | Rapid prototyping and baseline analysis of material datasets. |
| HyperSpy (Python) | An open-source toolkit specifically designed for multidimensional spectral data analysis, including NMF [2]. | Decomposing 4D-STEM, EELS, and other hyperspectral data. |
| DigitalMicrograph Scripting | Allows integration of custom NMF algorithms (MU/ALS) directly into the Gatan Microscopy Suite [2]. | In-situ analysis and decomposition of data during TEM/STEM acquisition. |
| PyTorch / TensorFlow | Deep learning frameworks that enable the creation and training of custom Neural NMF architectures with gradient-based optimizers [3]. | Building complex, deep hierarchical NMF models for advanced feature extraction. |
| Constrained NMF Scripts | Custom scripts (e.g., for DigitalMicrograph) that implement ALS with domain-specific projection steps [2]. | Enforcing physical constraints (smoothness, positivity) during factorization. |
In the context of materials research, Hierarchical Nonnegative Matrix Factorization (HNMF) has emerged as a powerful unsupervised machine learning technique for extracting latent structures from complex, high-dimensional data. HNMF recursively applies NMF to discover overarching topics encompassing lower-level features, enabling researchers to identify hierarchically organized patterns in materials characterization data, such as that generated by electron microscopy and hyperspectral imaging [6]. The value of any factorization, however, depends critically on the rigorous quantification of its performance across multiple dimensions. For materials scientists applying HNMF, three quantitative metrics form the essential triad for evaluation: reconstruction error (fidelity to original data), topic coherence (interpretability of components), and cluster purity (separation of distinct materials phases or properties). These metrics provide the mathematical foundation for validating whether the discovered hierarchical structure corresponds to physically meaningful phenomena rather than analytical artifacts, ultimately determining the reliability of insights gained about material behavior, composition, and performance.
The evaluation of HNMF models relies on mathematical formulations that quantify different aspects of factorization quality. These metrics provide complementary insights into model performance, with optimal HNMF implementations achieving an appropriate balance across all three dimensions.
Table 1: Core Quantitative Metrics for HNMF Evaluation
| Metric | Mathematical Definition | Interpretation in Materials Science |
|---|---|---|
| Reconstruction Error | ( D(\mathbf{X} \,\|\, \mathbf{WH}) = \frac{1}{2} \| \mathbf{X} - \mathbf{WH} \|_F^2 ) [2] | Fidelity of the factorized approximation to original experimental data |
| Topic Coherence | ( C(t) = \sum_{i<j} \log \frac{D(w_i, w_j) + 1}{D(w_j)} ) [64] | Semantic consistency of features within discovered components |
| Cluster Purity | ( P = \frac{1}{N} \sum_{k} \max_j \lvert w_k \cap c_j \rvert ) [65] | Effectiveness in separating distinct material phases or properties |
In practice, these metrics often involve trade-offs that must be managed based on research objectives. Minimizing reconstruction error ensures the factorization faithfully represents the original data, a crucial consideration when HNMF is applied to quantitative materials characterization techniques like 4D-STEM [2]. However, excessively minimizing reconstruction error may lead to overfitting, reducing the interpretability of results. Topic coherence measures the semantic consistency of features within discovered components, which translates to the physical meaningfulness of extracted patterns in materials data [64]. Cluster purity quantifies how effectively HNMF separates distinct material phases or properties, with higher purity indicating cleaner separation of physically distinct categories [65]. The optimal balance depends on the specific application: materials discovery may prioritize topic coherence, while quantitative phase analysis may emphasize reconstruction accuracy.
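The following sketch shows one plausible NumPy implementation of the three metrics from Table 1. The purity routine assumes integer-coded ground-truth labels, and the coherence routine assumes a boolean sample-by-feature co-occurrence matrix; both are stand-ins for application-specific inputs.

```python
# Sketches of the HNMF evaluation triad.
import numpy as np

def reconstruction_error(X, W, H):
    # D(X || WH) = 0.5 * ||X - WH||_F^2
    return 0.5 * np.linalg.norm(X - W @ H, "fro") ** 2

def cluster_purity(cluster_ids, true_labels):
    # P = (1/N) * sum_k max_j |w_k intersect c_j|
    # true_labels: nonnegative integer class codes.
    total = 0
    for k in np.unique(cluster_ids):
        members = true_labels[cluster_ids == k]
        total += np.bincount(members).max()
    return total / len(true_labels)

def umass_coherence(top_idx, X_binary):
    # UMass-style coherence over the top features of one component;
    # X_binary: boolean (samples x features) co-occurrence matrix.
    score = 0.0
    for i in range(1, len(top_idx)):
        for j in range(i):
            d_ij = np.sum(X_binary[:, top_idx[i]] & X_binary[:, top_idx[j]])
            d_j = np.sum(X_binary[:, top_idx[j]])
            score += np.log((d_ij + 1) / max(d_j, 1))
    return score
```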
Objective: Quantify the fidelity of HNMF approximation to original materials data.
Workflow:
Critical Parameters:
Objective: Measure the semantic interpretability and physical meaningfulness of HNMF-derived components.
Workflow:
Critical Parameters:
Objective: Quantify the separation quality of distinct materials classes or phases.
Workflow:
Critical Parameters:
Table 2: Essential Computational Tools for HNMF Evaluation
| Tool Category | Specific Implementation | Function in HNMF Evaluation |
|---|---|---|
| Programming Frameworks | Python (scikit-learn, Nimfa) [15], R (NMF package) [15] | Core factorization algorithms and metric implementations |
| Domain-Specific Libraries | HyperSpy [2], DigitalMicrograph [2] | Specialized processing for materials characterization data |
| Visualization Tools | Matplotlib, Graphviz | Results presentation and workflow documentation |
| Constrained NMF Implementations | Custom ALS/MU algorithms [2] | Domain-aware factorization with physical constraints |
Materials research applications often benefit from constraining HNMF using domain knowledge, which subsequently improves all three evaluation metrics. For electron microscopy data, incorporating point spread function constraints and continuous intensity profile characteristics has been shown to significantly enhance both reconstruction accuracy and topic coherence by eliminating physically implausible artifacts [2]. These domain-specific constraints reject solutions with downward-convex peaks or high-frequency noise that, while mathematically plausible, violate known physical principles of electron detection. Implementation requires modifying standard alternating least squares (ALS) or multiplicative update (MU) algorithms to incorporate:
- Projection steps that smooth spatial maps in accordance with the instrument's point spread function [2].
- Rejection of components exhibiting downward-convex peaks in otherwise continuous intensity profiles [2].
- Suppression of high-frequency noise beyond the physical resolution limit [2].
A key advantage of HNMF over standard NMF is its ability to capture latent structure at multiple scales, necessitating a multi-level evaluation approach.
Multi-Scale Evaluation Protocol:
In 4D-STEM analysis of metallic glasses, constrained HNMF with specialized evaluation metrics has successfully identified and classified nanometer-sized crystalline precipitates embedded in amorphous matrices [2]. The optimization approach emphasized:
For hyperspectral unmixing of geological samples, HNMF with group-sparsity constraints has demonstrated superior performance in extracting and clustering prototypal endmember spectra while estimating material abundances [4]. Evaluation emphasized:
In tracking progressive structural alterations in materials under environmental stress, longitudinal HNMF has identified altered trajectories of feature evolution [66]. Evaluation strategies included:
Robust evaluation of Hierarchical NMF through the triad of reconstruction error, topic coherence, and cluster purity provides the mathematical foundation for reliable materials discovery and characterization. The protocols outlined herein establish standardized methodologies for quantifying HNMF performance across diverse materials research applications, from nanoscale phase identification to spectral unmixing. By implementing these comprehensive evaluation frameworks, materials scientists can ensure their factorizations yield not just mathematically sound but physically meaningful hierarchical representations of complex materials phenomena, ultimately accelerating the design and optimization of novel materials systems.
Within materials research, advanced characterization techniques like four-dimensional scanning transmission electron microscopy (4D-STEM) generate extremely large datasets that require sophisticated machine learning for analysis [2]. Hierarchical nonnegative matrix factorization (NMF) has emerged as a powerful tool for extracting physically meaningful components from such data, but validating these algorithms requires reliable ground-truth data that is often scarce or privacy-sensitive [2] [67]. This protocol establishes methods for benchmarking hierarchical NMF algorithms using privacy-preserving synthetic data that incorporates domain-specific constraints inherent to materials science, enabling rigorous algorithm validation while addressing data scarcity and confidentiality concerns [2] [67].
In 4D-STEM analysis, primitive NMF factorizes a data matrix X into lower-rank matrices W (diffraction basis) and H (coefficient maps) through the approximation X â WH, where all elements are nonnegative [2]. Hierarchical NMF extends this approach by incorporating domain knowledge including spatial resolution constraints and continuous intensity profiles without downward-convex peaks, which are physically implausible in electron microscopy signals [2]. This integration of materials-specific constraints enables extraction of interpretable diffractions and maps that conventional machine learning techniques cannot achieve, making it particularly valuable for detecting and classifying nanometer-sized crystalline precipitates in amorphous metallic glasses [2].
Synthetic data has emerged as an essential solution for ML validation challenges, with Gartner forecasting that by 2030, synthetic data will be more widely used for AI training than real-world datasets [68]. For hierarchical NMF benchmarking, synthetic data provides:
The proposed method utilizes Additive Case-based Reasoning (AddCBR) as a model-aligned interpretable baseline for benchmarking additive feature attribution methods in hierarchical NMF [67]. AddCBR generates synthetic data that retains original feature behavior through:
Table 1: Essential Research Reagents for Hierarchical NMF Benchmarking
| Reagent Solution | Function | Implementation Notes |
|---|---|---|
| Domain-Constrained NMF Solver | Factorizes data with physical constraints | Implements spatial resolution and non-negative intensity constraints [2] |
| AddCBR Generator | Produces synthetic ground-truth data | Creates privacy-preserving data retaining original feature behavior [67] |
| CQV Analyzer | Quantifies attribution consistency | Measures stability of feature attribution outputs [67] |
| Hierarchical Clustering Module | Classifies diffraction patterns | Combines polar coordinate transformation with uniaxial cross-correlation [2] |
| Contrast Validation Tool | Ensures visualization accessibility | Verifies WCAG 2.2 AA compliance (≥4.5:1 contrast ratio) [69] |
Table 2: Metrics for Hierarchical NMF Benchmarking
| Metric Category | Specific Metrics | Target Values | Measurement Purpose |
|---|---|---|---|
| Factorization Accuracy | Reconstruction Error, Component Purity | Error ≤5% vs. ground truth | Quantifies decomposition precision [2] |
| Attribution Consistency | Coefficient of Quartile Variation (CQV) | CQV ≤0.25 | Measures feature attribution stability [67] |
| Physical Plausibility | Peak Convexity, Spatial Continuity | Zero downward-convex peaks | Validates domain constraint adherence [2] |
| Computational Efficiency | Convergence Iterations, Processing Time | ≤1000 iterations | Assesses algorithmic performance [2] |
| Privacy Preservation | k-anonymity, Feature Correlation | Correlation ≥0.9 with original | Evaluates synthetic data utility [67] |
Synthetic Data Generation
Hierarchical NMF Execution
Component Extraction and Analysis
Quantitative Evaluation
Validation and Reporting
Successful implementation should yield hierarchical NMF decompositions with:
While synthetic data provides controlled ground truth, final validation must use hold-out real experimental data to ensure real-world performance [68]. The hierarchical NMF should maintain:
This protocol establishes a comprehensive framework for benchmarking hierarchical NMF algorithms using privacy-preserving synthetic data with materials-specific constraints. By integrating AddCBR for ground-truth generation and rigorous quantitative metrics, researchers can objectively validate decomposition performance while addressing data scarcity and confidentiality challenges. The approach enables more reliable extraction of physical insights from complex materials characterization data, advancing materials discovery and development.
In the field of materials science, researchers increasingly encounter high-dimensional datasets derived from techniques such as powder diffraction, spectroscopy, and microscopic imaging. Dimensionality reduction is a fundamental machine learning technique that simplifies these complex datasets by reducing the number of input variables or features, thereby enhancing computational efficiency and mitigating the "curse of dimensionality" that plagues high-dimensional data analysis [70]. This simplification is crucial for extracting meaningful patterns and structural information from materials data while reducing the risk of overfitting in predictive models.
Hierarchical Non-negative Matrix Factorization (HNMF) represents an advanced development in the family of matrix factorization techniques, building upon the foundation of standard Non-negative Matrix Factorization (NMF). While flat NMF factorizes a single data matrix into two lower-dimensional non-negative matrices, HNMF introduces a multi-layer or hierarchical structure that enables more sophisticated representation learning. This article provides a comprehensive comparative analysis of HNMF against three established techniques: Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA), and flat NMF, with specific application to materials research challenges.
PCA is a linear dimensionality reduction technique that identifies the directions of maximum variance in high-dimensional data. The algorithm works by transforming the original variables into a new set of orthogonal components called principal components, which are ordered by the amount of variance they explain from highest to lowest [70]. The mathematical foundation of PCA involves eigen decomposition of the covariance matrix or singular value decomposition (SVD) of the data matrix [71].
The PCA algorithm follows these key steps: (1) standardization of the original variables to have zero mean and unit variance; (2) computation of the covariance matrix to understand how variables deviate from the mean and relate to each other; (3) calculation of eigenvectors and eigenvalues from the covariance matrix; (4) sorting eigenvectors by their corresponding eigenvalues in descending order; and (5) projection of the original data onto the selected principal components [70]. For materials researchers, PCA serves as a valuable tool for exploratory data analysis, noise reduction, and visualization of high-dimensional materials data.
LDA is a supervised dimensionality reduction technique that finds a linear combination of features that maximally separates different classes in the dataset while simultaneously minimizing the within-class scatter [72]. Unlike PCA, which is unsupervised and focuses on variance preservation, LDA explicitly utilizes class label information to identify the most discriminative features. This makes LDA particularly valuable in materials classification problems, such as identifying material phases or categorizing spectral signatures.
The LDA algorithm seeks to maximize the ratio of between-class variance to within-class variance in the projected space. The resulting linear discriminants provide directions that optimally separate predefined classes, making LDA especially effective in scenarios where distinct class boundaries exist [72].
NMF is a multivariate analysis technique that factorizes a non-negative data matrix V into two lower-dimensional non-negative matrices W (basis matrix) and H (coefficient matrix), such that V â WH [70] [50]. The non-negativity constraint distinguishes NMF from other factorization methods and often results in parts-based representations that are more interpretable, particularly for materials data that inherently exhibit non-negative properties (e.g., spectral intensities, concentrations).
The standard NMF optimization problem can be formulated as minimizing the reconstruction error between V and WH, typically measured using Euclidean distance or divergence measures, subject to non-negativity constraints on W and H [50]. This technique has demonstrated significant potential in finding physically plausible structural signals from materials characterization data, such as diffraction patterns collected during in situ chemical reactions [50].
HNMF extends standard NMF by introducing a hierarchical or multi-layer structure to the factorization process. Instead of a single decomposition V â WH, HNMF performs sequential factorizations, potentially revealing nested structures within the data. This approach is particularly valuable for materials data that exhibit natural hierarchical organization, such as multi-scale materials characterization from atomic to microstructural levels.
The hierarchical architecture allows HNMF to capture complex, layered patterns that may be obscured in flat decompositions. In materials research, this capability enables researchers to simultaneously model phenomena occurring at different scales, from atomic arrangements to mesoscopic domain structures.
Table 1: Fundamental Characteristics of Dimensionality Reduction Techniques
| Technique | Matrix Structure | Constraints | Optimization Objective | Interpretability |
|---|---|---|---|---|
| PCA | Orthogonal components | None | Maximize variance | Global components, mixed signs |
| LDA | Linear discriminants | Class separation | Maximize between-class vs within-class variance | Discriminative features, mixed signs |
| Flat NMF | Two low-rank matrices | Non-negativity | Minimize reconstruction error | Parts-based, additive components |
| HNMF | Multiple layered matrices | Non-negativity, hierarchical sparsity | Multi-level reconstruction error minimization | Multi-scale, hierarchical parts |
Each technique exhibits distinct mathematical properties that determine its suitability for specific materials analysis tasks. PCA components are orthogonal and ordered by variance explanation, which facilitates dimensionality reduction but may not align with physically meaningful directions in materials data [70]. LDA components maximize class separability, making them ideal for classification tasks but requiring labeled data [72]. Flat NMF produces additive combinations of non-negative basis vectors, often corresponding to physically meaningful building blocks such as pure component spectra or diffraction patterns [50]. HNMF extends this concept to multiple levels of abstraction, potentially capturing hierarchical materials organization.
Table 2: Computational Characteristics and Application Suitability
| Technique | Computational Complexity | Robustness to Noise | Handling of Non-linearity | Recommended Materials Applications |
|---|---|---|---|---|
| PCA | O(N·d² + d³) | Moderate | Linear only | Initial data exploration, noise filtering |
| LDA | O(max(N,d)·d²) | Moderate | Linear only | Classification of known material phases |
| Flat NMF | NP-hard (approximations used) | Moderate to high (with sparsity) | Linear only | Extraction of pure component patterns |
| HNMF | Higher than flat NMF | High (with hierarchical sparsity) | Limited non-linearity via hierarchy | Multi-scale materials analysis |
The computational requirements vary significantly across techniques. PCA and LDA have well-defined computational complexity and efficient implementations [72]. Exact NMF is NP-hard, necessitating approximate algorithms, while HNMF introduces additional computational demands due to its hierarchical structure [72]. In practice, the choice of algorithm involves trade-offs between computational efficiency, interpretability, and alignment with the inherent structure of materials data.
Objective: To identify distinct crystalline phases and their relative concentrations from temperature-dependent powder diffraction patterns of mixed materials.
Materials and Reagents:
Procedure:
Technical Notes: For systems with thermal expansion, consider stretched NMF variants that accommodate signal stretching along the independent variable axis [50]. The hierarchical structure of HNMF can separately capture phase composition changes (higher level) and lattice parameter variations (lower level).
Objective: To extract hierarchical features from microstructural characterization data (e.g., SEM, TEM, or optical microscopy) spanning multiple length scales.
Materials and Reagents:
Procedure:
Technical Notes: The non-negativity constraint in HNMF aligns well with image data (pixel intensities). Consider incorporating spatial constraints to enhance physical interpretability of the factorization.
Objective: To systematically evaluate the performance of HNMF against PCA, LDA, and flat NMF on benchmark materials datasets.
Materials and Reagents:
Procedure:
Technical Notes: For fair comparison, ensure consistent preprocessing and similar effective dimensionality across methods. The evaluation should consider both quantitative metrics and qualitative assessment of component physical meaningfulness.
Figure 1: Workflow relationships between major dimensionality reduction techniques, highlighting their distinctive approaches to processing high-dimensional materials data.
Figure 2: Detailed workflow for Hierarchical NMF analysis of materials data, showing the sequential factorization process that enables multi-scale representation learning.
Table 3: Essential Resources for Dimensionality Reduction in Materials Research
| Resource Category | Specific Tools/Solutions | Function/Purpose | Implementation Notes |
|---|---|---|---|
| Data Acquisition | Powder diffractometer, SEM/TEM, Spectrometers | Generate high-dimensional materials characterization data | Ensure appropriate resolution and signal-to-noise ratio |
| Preprocessing Tools | Background correction algorithms, Noise filters, Normalization routines | Prepare raw data for dimensionality reduction | Critical for meaningful factorization results |
| Computational Libraries | Scikit-learn (PCA, LDA, NMF), TensorFlow/PyTorch (HNMF), Custom HNMF implementations | Implement core dimensionality reduction algorithms | HNMF may require custom implementation or extension of existing NMF libraries |
| Validation Databases | Crystallographic databases (ICSD), Spectral libraries, Reference microstructures | Validate extracted components against known materials | Essential for establishing physical meaningfulness |
| Visualization Tools | Matplotlib, Plotly, Paraview, Custom visualization pipelines | Interpret and communicate results | Particularly important for hierarchical results in HNMF |
Hierarchical Non-negative Matrix Factorization represents a powerful advancement in the dimensionality reduction toolkit for materials researchers, offering unique capabilities for multi-scale analysis of complex materials data. While established techniques like PCA, LDA, and flat NMF each have distinct strengths for specific applications, HNMF provides a flexible framework for capturing hierarchical structures inherent in many materials systems.
The comparative analysis presented in this work highlights how technique selection should be guided by specific research objectives: PCA for initial exploration and noise reduction, LDA for classification tasks with labeled data, flat NMF for parts-based decomposition of non-negative data, and HNMF for complex, multi-scale materials characterization. As materials research continues to generate increasingly sophisticated and high-dimensional datasets, the development of specialized dimensionality reduction techniques like HNMF will play a crucial role in extracting physically meaningful insights and accelerating materials discovery and optimization.
Future research directions include the development of more efficient algorithms for HNMF computation, integration of domain knowledge as constraints during factorization, and hybrid approaches that combine the strengths of multiple techniques. Additionally, as interpretable machine learning gains importance in materials science, the transparent, parts-based representations offered by HNMF and related techniques will become increasingly valuable for establishing reliable structure-property relationships.
This application note provides a detailed evaluation of Hierarchical Nonnegative Matrix Factorization (HNMF) applied to Four-Dimensional Scanning Transmission Electron Microscopy (4D-STEM) data of metallic glasses. We demonstrate that a domain-specific constrained NMF approach successfully decomposes complex 4D-STEM datasets from Zr-based metallic glasses into interpretable components, enabling the detection and classification of nanometer-sized crystalline precipitates within an amorphous matrix. The method significantly outperforms conventional techniques like Principal Component Analysis (PCA) and primitive NMF by incorporating physical constraints inherent to electron microscopy, such as spatial resolution and continuous intensity features. This protocol offers materials researchers a robust framework for extracting meaningful insights from large, multidimensional material characterization datasets.
The analysis of metallic glasses, such as ZrCuAl alloys, requires understanding their medium-range order (MRO) and its correlation with material properties like glass-forming ability and mechanical behavior [73]. Modern 4D-STEM techniques generate extremely large datasets, creating a pressing need for optimized machine learning methods that can reduce dimensionality and extract physically meaningful patterns [1]. Hierarchical Nonnegative Matrix Factorization (HNMF) has emerged as a powerful tool for this purpose, offering a parts-based representation that is particularly suited for interpreting complex material structures.
HNMF extends standard NMF by recursively decomposing data into multiple levels of latent topics or features, creating a hierarchical structure that reveals multiscale patterns without manual tuning [74]. This capability is crucial for materials science, where phenomena operate across different spatial scales. When applied to scientific literature networks, HNMF has successfully identified hidden associations between materials and research themes like superconductivity and energy storage [74]. This same principle applies directly to 4D-STEM data, where hierarchical patterns in diffraction space correspond to meaningful material structures.
Standard Nonnegative Matrix Factorization (NMF) approximates a nonnegative data matrix ( \textbf{X} \in \mathbb{R}_{+}^{q \times n} ) as the product of two nonnegative factor matrices: a basis matrix ( \textbf{Z} \in \mathbb{R}_{+}^{q \times r} ) and a coefficient matrix ( \textbf{H} \in \mathbb{R}_{+}^{r \times n} ), satisfying ( \textbf{X} \approx \textbf{Z}\textbf{H} ) [75]. The HNMF framework extends this through recursive decomposition: at each level, the coefficient matrix ( \textbf{H}^{(l)} ) is treated as the data matrix for the next factorization, ( \textbf{H}^{(l)} \approx \textbf{Z}^{(l+1)}\textbf{H}^{(l+1)} ), so that successive layers capture progressively coarser latent structure.
This hierarchical approach enables the discovery of multiscale patterns in 4D-STEM data, from atomic-scale arrangements to micrometer-scale morphological features.
For 4D-STEM data analysis, a novel constrained NMF approach incorporates physical knowledge inherent to electron microscopy:
These constraints differentiate the method from conventional NMF and ensure that the factorization results align with physical principles of electron scattering and diffraction.
Table 1: Comparative performance of NMF algorithms on 4D-STEM data analysis tasks
| Algorithm | Interpretability | Spatial Coherence | Component Accuracy | Noise Robustness |
|---|---|---|---|---|
| Constrained HNMF | High | High | High | High |
| Standard NMF | Medium | Low | Medium | Medium |
| PCA | Low | Low | Low | Low |
Table 2: Application results of constrained HNMF on ZrCuAl metallic glass
| Analysis Target | Detected Features | Size Range | Classification Accuracy |
|---|---|---|---|
| Crystalline Precipitates | Successful detection | Nanometer scale | High |
| Amorphous Matrix | Successful decomposition | N/A | High |
| MRO Structures | Identified and classified | Medium-range | Medium-High |
The constrained NMF approach successfully decomposed both simulated and experimental 4D-STEM data into interpretable components that could not be achieved using PCA or primitive NMF methods [1]. Specifically, for ZrCuAl metallic glass, the technique enabled:
Diagram 1: HNMF workflow for 4D-STEM data analysis
Purpose: Acquire high-quality 4D-STEM data from metallic glass samples suitable for HNMF analysis.
Materials:
Procedure:
Purpose: Implement domain-specific constrained HNMF for 4D-STEM data decomposition.
Materials:
Procedure:
Constraint Implementation:
HNMFk Execution:
Hierarchical Clustering:
Validation:
Purpose: Relate HNMF components to physical structures in metallic glasses.
Materials:
Procedure:
Spatial Distribution:
MRO Quantification:
Cross-Validation:
Table 3: Essential research reagents and computational tools
| Tool/Reagent | Function/Purpose | Specifications/Alternatives |
|---|---|---|
| 4D-STEM System | Acquisition of nanodiffraction patterns at each spatial position | Probe size: 1-2 nm, Scan resolution: 128×128 to 512×512 |
| Constrained HNMF Algorithm | Decomposition of 4D datasets into interpretable components | Custom Python implementation with spatial and intensity constraints |
| Hierarchical Clustering | Grouping similar diffraction patterns for MRO analysis | Polar coordinate transformation + uniaxial cross-correlation |
| Zr-based Metallic Glasses | Model system for studying amorphous structures and MRO | ZrCu, ZrCuAl, ZrCuNiTiAl compositions |
| Hypergraph Regularization | Preserving complex data geometry during factorization | k-nearest neighbors hyperedge construction |
Diagram 2: Data processing and analysis logic
This case study demonstrates that domain-specific constrained HNMF provides a powerful framework for analyzing complex 4D-STEM data from metallic glasses. By incorporating physical constraints inherent to electron microscopy, the method successfully decomposes data into interpretable components that reveal nanoscale precipitates and medium-range order structures. The hierarchical approach enables multiscale analysis of material structures, from atomic arrangements to micrometer-scale morphological features. The provided protocols offer researchers a comprehensive guide for implementing these techniques, advancing the characterization of complex materials through integrated computational and experimental approaches.
Hierarchical Nonnegative Matrix Factorization (HNMF) represents a powerful advancement in the dimensional reduction toolkit for materials science, enabling the discovery of multi-layer, interpretable patterns within complex scientific datasets. As an extension of standard NMF, which approximates a nonnegative data matrix X as the product of two lower-rank, nonnegative matrices W (components) and H (coefficients or abundances) such that X ≈ WH, HNMF imposes a hierarchical structure onto this factorization [64] [4]. This hierarchy is crucial for materials research, where data often exhibits inherent multi-scale characteristics, from atomic-level interactions in a transmission electron microscope to macroscopic morphological features. The core strength of HNMF lies in its ability to provide a parts-based representation that is inherently more interpretable than real-valued alternatives like Principal Component Analysis (PCA), which often produces components with physically implausible negative intensities [2] [76]. Validating the meaning of the resulting multi-layer topics or components is therefore not a mere supplementary step, but a fundamental requirement to ensure that the extracted patterns are physically meaningful, reproducible, and scientifically actionable.
The challenge in applying HNMF to materials characterization, such as in the analysis of 4D-STEM or hyperspectral imaging data, is that a mathematically sound factorization does not guarantee physically interpretable results. Primitive NMF algorithms can converge to local minima, producing unstable components that vary between runs, or generate artifacts such as downward-convex peaks in continuous intensity profiles, which contradict known physical models [23] [2]. Consequently, a robust validation framework is essential to bridge the gap between abstract numerical output and domain-specific scientific insight. This document outlines application notes and detailed protocols for such validation, focusing on stability analysis, domain-constraint integration, and multi-modal correlation to empower researchers in materials science and drug development to confidently interpret their hierarchical decompositions.
A foundational principle for validating HNMF components is assessing their stabilityâthe repeatability of results across multiple runs of the algorithm with random initializations. A stable component is one that reliably reoccurs, suggesting it captures a true underlying signal in the data rather than numerical noise or an artifact of a particular initialization.
Experimental Protocol:
This method was successfully applied to EEG data, demonstrating that the HALS-based low-rank NMF algorithm (lraNMF_HALS) produced significantly more stable components than other NMF variants [23]. The protocol translates directly to materials data, such as spectral maps from 4D-STEM, where stable components represent reproducible physical or chemical signatures.
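A minimal sketch of such a stability analysis, assuming scikit-learn's NMF: components from repeated randomly initialized runs are column-normalized and greedily matched to a reference run by cosine similarity, yielding a per-component stability score as a simple proxy for the ( I_q )-style index described above.

```python
# Run-to-run stability of NMF components via greedy cosine matching.
import numpy as np
from sklearn.decomposition import NMF

def stability_analysis(X, r, n_runs=20):
    runs = []
    for seed in range(n_runs):
        model = NMF(n_components=r, init="random", max_iter=500,
                    random_state=seed)
        W = model.fit_transform(X)
        # Normalize columns so dot products are cosine similarities.
        runs.append(W / (np.linalg.norm(W, axis=0, keepdims=True) + 1e-12))
    ref = runs[0]                           # first run as reference
    sims = np.zeros((n_runs, r))
    for i, W in enumerate(runs):
        S = ref.T @ W                       # (r x r) similarity matrix
        for k in range(r):                  # greedy one-to-one matching
            j = int(np.argmax(S[k]))
            sims[i, k] = S[k, j]
            S[:, j] = -1                    # exclude matched column
    return sims.mean(axis=0)                # per-component stability

X = np.abs(np.random.rand(300, 150))        # placeholder data
print(stability_analysis(X, r=5))
```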
Validation is significantly strengthened by integrating domain knowledge directly into the factorization process itself. This approach ensures that the extracted components are not only mathematically sound but also physically plausible.
Experimental Protocol for Constrained HNMF:
This methodology was demonstrated in 4D-STEM analysis, where constrained NMF successfully decomposed data into interpretable diffractions and maps that were unattainable using PCA or primitive NMF [2].
The most compelling validation for an HNMF component is its correspondence with known properties measured by a complementary, established technique.
Experimental Protocol:
Table 1: Summary of Core HNMF Validation Methods
| Validation Method | Key Objective | Quantitative Metrics | Primary Application Context |
|---|---|---|---|
| Stability Analysis | Assess reproducibility and robustness of components across runs | ( I_q ) index, cluster density [23] | General-purpose; essential for any HNMF application |
| Domain Constraints | Ensure components adhere to known physical or biological models | Model fit error, absence of artifact peaks [2] | 4D-STEM, Hyperspectral Imaging, Scientific Instrumentation |
| Multi-Modal Correlation | Ground components in verified, external measurements | Pearson correlation, spatial overlap coefficients [76] | Spatial Transcriptomics, Correlative Microscopy |
In many applications, not all patterns of variation are spatially correlated. The Nonnegative Spatial Factorization Hybrid (NSFH) model provides a sophisticated framework to quantify the degree to which a component is driven by spatial structure, offering a powerful validation metric [76].
Experimental Protocol for NSFH:
A cutting-edge validation approach involves integrating HNMF outputs into a Knowledge Graph (KG), enabling semantic reasoning and validation against established knowledge.
Experimental Protocol:
Table 2: Advanced HNMF Validation Frameworks
| Framework | Core Principle | Validation Output | Suitability |
|---|---|---|---|
| Hybrid (NSFH) Model | Partitions variation into spatial and nonspatial sources | Spatial Importance Score per feature/component [76] | Data with mixed spatial/non-spatial variance |
| Knowledge Graph Integration | Embeds components in a network of known relationships | Semantic consistency within the graph structure [64] | Unstructured/semi-structured data (text, patents) |
| Archetypal/Conical Hull Analysis | Models variability via data extremals (prototypes) | Physically meaningful prototypal endmembers [4] | Hyperspectral data with endmember variability |
This protocol is adapted from applications in 4D-STEM and hyperspectral unmixing [2] [4].
I. Data Preprocessing
II. Hierarchical NMF with Spatial and Spectral Constraints
III. Post-processing and Validation
Diagram 1: Constrained HNMF validation workflow for 4D-STEM.
Table 3: Essential Research Reagents and Computational Tools for HNMF Validation
| Item Name | Function / Role | Technical Specifications / Examples |
|---|---|---|
| 4D-STEM Dataset | Primary input data for HNMF decomposition of materials structure. | Bimodal data: real-space (x,y) probe positions & reciprocal-space (u,v) diffraction patterns [2]. |
| Spatial Transcriptomics Data | Input data for HNMF in biological context; validates component-cell type links. | RNA read counts matrix with associated spatial (x,y) coordinates for each cell/spot [76]. |
| Constrained HNMF Algorithm | Core computational engine for extracting physically plausible components. | HALS or ALS-based solver with projection steps for spatial smoothness & spectral continuity [2]. |
| Stability Analysis Script | Quantifies robustness of components across multiple runs. | Script for hierarchical clustering of components & calculation of ( I_q ) stability index [23]. |
| Knowledge Graph Framework | Enables semantic validation of topics against known relationships. | Neo4j graph database populated with entities and HNMFk-derived latent topics [64]. |
| Complementary Characterization Data | Provides ground-truth for multi-modal validation of HNMF components. | EDS elemental maps, SEM images, fluorescence microscopy images [2] [76]. |
Diagram 2: Addressing NMF artifacts with domain-specific constraints.
Hierarchical Nonnegative Matrix Factorization emerges as a uniquely powerful tool for deconvoluting the complex, multi-scale structures inherent in modern scientific datasets, from 4D-STEM to hyperspectral imagery. By moving beyond flat factorizations, HNMF provides a more nuanced and interpretable model that aligns with the hierarchical nature of many material systems and biological communities. The key to its successful application lies in the thoughtful integration of domain-specific constraints to guide the factorization toward physically meaningful results and robust validation against established methods. Future directions point toward deeper integration with deep learning architectures, the development of online HNMF methods for streaming data, and expanded applications in dynamic process tracking and personalized medicine, ultimately offering a robust framework to accelerate discovery and innovation across research domains.