1. Human Inspired Technology Research Center, University of Padua (Padova), Italy
2. Department of Information Engineering, University of Padua (Padova), Italy
* Equal contribution
Proceedings of the 39th Annual Conference on Neural Information Processing Systems (NeurIPS), San Diego, California, 2025.
@inproceedings{SinigagliaSartorCecconSusto2025,
author = {Alberto Sinigaglia and Davide Sartor and Marina Ceccon and Gian Antonio Susto},
title = {Simple and Effective Specialized Representations for Fair Classifiers},
booktitle = {Advances in Neural Information Processing Systems (NeurIPS)},
year = {2025},
}
Fair classification is a critical challenge that has gained increasing importance due to international regulations and its growing use in high-stakes decision-making settings. Existing methods often rely on adversarial learning or distribution matching across sensitive groups; however, adversarial learning can be unstable, and distribution matching can be computationally intensive. To address these limitations, we propose a novel approach based on the characteristic function distance. Our method ensures that the learned representation contains minimal sensitive information while maintaining high effectiveness for downstream tasks. By utilizing characteristic functions, we achieve a more stable and efficient solution compared to traditional methods. Additionally, we introduce a simple relaxation of the objective function that guarantees fairness in common classification models with no performance degradation. Experimental results on benchmark datasets demonstrate that our approach consistently matches or surpasses existing methods in both fairness and predictive accuracy. Moreover, our method maintains robustness and computational efficiency, making it a practical solution for real-world applications.

Overview of the framework: the encoder maps inputs $X$ to representations $Z$ that retain task-relevant structure while minimizing sensitive information via (i) CF matching (FmCF) or (ii) sufficient-statistics alignment (FmSS).
For each group $s \in \mathcal{S}$, we penalize the Characteristic Function Distance (CFD) between $\mathbb{P}(Z\mid S{=}s)$ and a target (e.g., standard Normal):
$$\operatorname{CFD}_{\mathbb{P}_T}^2\big(\mathbb{P}(Z\mid s), \mathbb{P}(\mathcal{N})\big)\;=\; \mathbb{E}_{T}\big[\,\lvert \varphi_{Z\mid s}(T) - e^{-\|T\|^2/2}\rvert^2\,\big].$$
Monte Carlo draws $T\!\sim\!\mathbb{P}_T$ make the objective differentiable and easy to use with any encoder.
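The Monte Carlo estimate of the CFD penalty can be sketched in a few lines of PyTorch. The function name, the number of sampled frequencies, and the choice of $\mathbb{P}_T = \mathcal{N}(0, \sigma^2 I)$ are illustrative assumptions, not the paper's exact configuration:

```python
import torch

def cfd_penalty(z, num_freqs=64, sigma=1.0):
    """Monte Carlo CFD^2 between the empirical distribution of z and N(0, I).

    z: (n, d) batch of representations for one sensitive group.
    """
    n, d = z.shape
    t = torch.randn(num_freqs, d) * sigma  # frequencies T ~ N(0, sigma^2 I)
    proj = z @ t.T                         # (n, num_freqs) inner products <z_i, t_j>
    # Empirical characteristic function: mean over the batch of e^{i <z, t>}.
    ecf_real = torch.cos(proj).mean(dim=0)
    ecf_imag = torch.sin(proj).mean(dim=0)
    # CF of the standard normal at t is exp(-||t||^2 / 2), purely real.
    target = torch.exp(-0.5 * (t ** 2).sum(dim=1))
    # Squared modulus of the complex difference, averaged over frequencies.
    return ((ecf_real - target) ** 2 + ecf_imag ** 2).mean()
```

Because the penalty is an average of smooth functions of `z`, gradients flow through it to any encoder; summing it over groups yields the FmCF objective term.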
For classification, matching the first two moments of $\mathbb{P}(Z\mid S)$ is sufficient to limit a logistic adversary. We minimize the KL to $\mathcal{N}(0,1)$ per group:
$$\mathrm{KL}\big(\mathcal{N}(\mu_s,\sigma_s^2)\,\|\,\mathcal{N}(0,1)\big) = \tfrac{1}{2}\,\big(\sigma_s^2 + \mu_s^2 - 1 - \log \sigma_s^2\big).$$
This yields provable guarantees for logistic downstream models while remaining lightweight and stable in practice.
Matching group-wise means and variances makes $Z$ uninformative for a logistic adversary (coefficients trend to zero as moments align), delivering post-hoc fairness certificates for that model family.
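The per-group KL penalty above admits a closed form, so the FmSS term needs no sampling. A minimal PyTorch sketch follows; the function name and the variance clamp are illustrative choices:

```python
import torch

def fmss_penalty(z, s):
    """Sum over groups and dimensions of KL( N(mu_s, sigma_s^2) || N(0, 1) ).

    z: (n, d) representations; s: (n,) sensitive-group labels.
    """
    penalty = z.new_zeros(())
    for group in s.unique():
        zg = z[s == group]
        mu = zg.mean(dim=0)
        var = zg.var(dim=0, unbiased=False).clamp_min(1e-8)  # guard log(0)
        # Closed-form KL between univariate Gaussians, per dimension:
        # 0.5 * (sigma^2 + mu^2 - 1 - log sigma^2)
        kl = 0.5 * (var + mu ** 2 - 1.0 - torch.log(var))
        penalty = penalty + kl.sum()
    return penalty
```

The penalty is zero exactly when every group's representation has zero mean and unit variance per dimension, i.e. when the first two moments are matched across groups.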
Unlike several competing approaches, the trained classifier does not require the sensitive attribute at inference. The penalties add little overhead and plug into standard PyTorch training loops.
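To illustrate how such a penalty plugs into an ordinary training loop, here is a hypothetical end-to-end sketch: the architecture, penalty weight, and synthetic data are placeholders, and the moment penalty is inlined for self-containment:

```python
import torch
from torch import nn

torch.manual_seed(0)
encoder = nn.Sequential(nn.Linear(5, 8), nn.ReLU(), nn.Linear(8, 4))
classifier = nn.Linear(4, 2)
opt = torch.optim.Adam([*encoder.parameters(), *classifier.parameters()], lr=1e-2)

x = torch.randn(256, 5)                  # synthetic features
y = torch.randint(0, 2, (256,))          # task labels
s = torch.randint(0, 2, (256,))          # sensitive attribute, used only in training

def moment_penalty(z, s):
    """Group-wise KL( N(mu_s, sigma_s^2) || N(0, 1) ), summed over groups/dims."""
    total = z.new_zeros(())
    for g in s.unique():
        zg = z[s == g]
        mu, var = zg.mean(0), zg.var(0, unbiased=False).clamp_min(1e-8)
        total = total + (0.5 * (var + mu ** 2 - 1.0 - var.log())).sum()
    return total

for step in range(100):
    z = encoder(x)
    loss = nn.functional.cross_entropy(classifier(z), y) + 0.1 * moment_penalty(z, s)
    opt.zero_grad()
    loss.backward()
    opt.step()

# At inference the sensitive attribute is not needed:
logits = classifier(encoder(x))
```

The fairness term is just one extra scalar added to the task loss, which is what keeps the overhead small relative to adversarial training.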