DESCRIPTION:
However, the deployment of these systems also raises serious ethical and technical concerns. Research has shown that generative models can encode and amplify societal biases present in their training data, leading to unfair performance across demographic groups (e.g., gender, race, body type, or age). In editing scenarios, this may manifest as disproportionate errors, inconsistent realism, or stereotypical representations for certain groups. Furthermore, maintaining fidelity, i.e., ensuring that the edited output remains consistent with the original input outside the modified regions, remains a key challenge. Diffusion models, by design, regenerate entire images from noise, often unintentionally altering unedited regions and compromising visual integrity. Balancing fairness and fidelity within a stochastic generative process is thus both a scientific and an ethical frontier for AI research.
This PhD will systematically investigate bias, fairness, and fidelity in diffusion-based image and video generation models, particularly within editing tasks. It will develop new frameworks for evaluating, understanding, and mitigating bias while preserving high fidelity in generative outcomes.
Assigned mission
Research Objectives:
The overarching aim of this research is to develop a principled framework for understanding, evaluating, and mitigating bias in diffusion-based image and video generation while maintaining high fidelity in editing outcomes. The project begins with a systematic characterization of bias in existing diffusion models. It will analyze how these models perform across different groups defined by age, gender, skin tone, and body morphology, with particular attention to editing quality and consistency. Biases beyond human attributes will also be analyzed for general scenes. This involves both quantitative and qualitative analyses, comparing perceptual realism, structural accuracy, and fairness metrics across diverse datasets.
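As a concrete illustration of such a group-wise audit, the minimal sketch below compares a per-sample editing-quality score across demographic groups and reports the worst-case disparity. The score distribution, group labels, and the disparity measure are illustrative assumptions, not a fixed design of the project.

```python
# Minimal sketch of a group-wise bias audit (illustrative assumptions:
# a per-sample quality score, e.g. an LPIPS-like value where lower is
# better, and an integer demographic group label per sample).
import numpy as np

def audit_by_group(scores: np.ndarray, groups: np.ndarray) -> dict:
    """Return the mean quality score per group plus the worst-case gap."""
    per_group = {g: scores[groups == g].mean() for g in np.unique(groups)}
    gap = max(per_group.values()) - min(per_group.values())
    return {"per_group": per_group, "gap": gap}

# Example with synthetic scores for edits spanning four groups.
rng = np.random.default_rng(0)
scores = rng.normal(loc=0.15, scale=0.05, size=400)  # lower = better
groups = rng.integers(0, 4, size=400)                # group ids 0..3
print(audit_by_group(scores, groups))
```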
Main activities
Methodology
To achieve this, the research will first investigate robust evaluation metrics for fairness and fidelity in generative editing tasks. While traditional measures such as Fréchet Inception Distance (FID) and perceptual similarity scores (CLIP-based or LPIPS) are valuable, they do not capture demographic disparities or context preservation. Therefore, new composite metrics will be developed that integrate demographic parity, perceptual consistency, and semantic coherence. These metrics will form the basis for a systematic bias audit of existing diffusion models in editing tasks such as virtual try-on and face retouching.
The second stage of the research will focus on fidelity analysis, emphasizing the preservation of unedited regions. This will include developing new metrics that account for context-specific deviations, measuring how global visual properties, such as lighting or color tone, shift during editing. User studies and psychophysical evaluations will complement quantitative measures, ensuring that technical fidelity aligns with human-perceived consistency.
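To make the composite-metric idea concrete, the sketch below combines three illustrative components: a masked fidelity proxy computed only over unedited pixels (a cheap stand-in for LPIPS), a semantic-coherence score (standing in for a CLIP similarity), and the group disparity gap from the audit above. The linear form and the weights are assumptions for illustration, not the project's final metric.

```python
# Sketch of a composite fairness-fidelity score (all weights and the
# linear combination are illustrative assumptions).
import torch

def masked_fidelity_proxy(orig: torch.Tensor, edited: torch.Tensor,
                          mask: torch.Tensor) -> torch.Tensor:
    """Mean absolute change restricted to unedited pixels (mask == 0);
    a stand-in for a perceptual metric such as LPIPS."""
    keep = (mask == 0).float().expand_as(orig)
    return (keep * (orig - edited).abs()).sum() / keep.sum().clamp(min=1)

def composite_score(fidelity: float, semantic: float, group_gap: float,
                    w: tuple = (0.4, 0.4, 0.2)) -> float:
    """Higher is better: reward semantics, penalize drift and unfairness."""
    return w[0] * (1 - fidelity) + w[1] * semantic - w[2] * group_gap

# Usage: an edit confined to the mask leaves the fidelity proxy near zero.
orig = torch.rand(3, 64, 64); edited = orig.clone()
mask = torch.zeros(1, 64, 64); mask[:, 20:40, 20:40] = 1
edited[:, 20:40, 20:40] = torch.rand(3, 20, 20)      # edit inside mask only
fid = masked_fidelity_proxy(orig, edited, mask).item()
print(composite_score(fidelity=fid, semantic=0.3, group_gap=0.05))
```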
The final and most substantial component will involve bias mitigation and fidelity enhancement. Several methodological strategies will be explored. One approach will modify conditioning mechanisms to ensure equitable generative quality across demographics by learning balanced feature representations. Another will involve re-weighting training data or applying adversarial fairness constraints that penalize demographic performance gaps. At the same time, novel diffusion control mechanisms, such as mask-preserving denoising schedules and attention modulation, will be developed to maintain high fidelity during editing. The project will explore whether fairness and fidelity objectives can be co-optimized through a unified loss function or a multi-objective training regime, potentially establishing a new paradigm for fair generative editing. The study will begin with static image models and later extend to video diffusion models, which introduce additional challenges of temporal coherence and fairness over time.
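One established way to realize a mask-preserving denoising schedule, in the spirit of RePaint-style inpainting, is to overwrite the unedited region after every reverse step with a correspondingly noised copy of the original, so context is preserved by construction. The sketch below shows the idea; `denoise_step` and `add_noise` are placeholders for a real scheduler's methods (one reverse-diffusion step and forward noising), not a specific library API.

```python
# Sketch of mask-preserving denoising: after each reverse step, pixels
# (or latents) outside the edit mask are replaced with a noised copy of
# the original, so unedited regions cannot drift.
import torch

def mask_preserving_denoise(x_T, original, mask, timesteps,
                            denoise_step, add_noise):
    """mask == 1 marks the region being edited; mask == 0 is preserved."""
    x = x_T
    for t in timesteps:                       # t runs from high to low noise
        x = denoise_step(x, t)                # one reverse-diffusion step
        noised_orig = add_noise(original, t)  # original at matching noise level
        x = mask * x + (1 - mask) * noised_orig
    return x
```

The same blending applies in latent space for latent-diffusion models, with the mask downsampled to the latent resolution.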
Temporal fidelity (preserving motion and lighting consistency) and temporal fairness (maintaining equal generative performance across demographics over consecutive frames) will both be evaluated. Throughout the research, standard diffusion architectures such as Stable Diffusion and ControlNet will be used as baselines. The outcomes will include a benchmark dataset and an open-source evaluation toolkit for fairness and fidelity in generative editing, enabling broader community use and transparency. In addressing the core research questions (how fairness and fidelity can be quantitatively assessed, and how both can be improved without sacrificing visual realism), this project will contribute a comprehensive understanding of ethical and technical reliability in generative AI systems.
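The temporal criteria could be prototyped along the following lines: a temporal-fidelity proxy measuring frame-to-frame drift in unedited regions, and a temporal-fairness gap comparing that proxy across demographic groups. Tensor layouts and the drift proxy are illustrative assumptions.

```python
# Sketch of temporal fidelity/fairness evaluation (illustrative: a video
# is a (T, C, H, W) tensor, each video has a group label, and drift in
# unedited regions proxies temporal fidelity).
import torch

def temporal_fidelity(frames: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
    """frames: (T, C, H, W); mask: (1, H, W) with 1 = edited region."""
    keep = (1 - mask).unsqueeze(0)                   # unedited pixels
    drift = (frames[1:] - frames[:-1]).abs() * keep  # frame-to-frame change
    return drift.mean()

def temporal_fairness_gap(videos, masks, groups) -> float:
    """Worst-case gap in mean temporal fidelity across demographic groups."""
    scores: dict = {}
    for v, m, g in zip(videos, masks, groups):
        scores.setdefault(g, []).append(temporal_fidelity(v, m).item())
    means = {g: sum(s) / len(s) for g, s in scores.items()}
    return max(means.values()) - min(means.values())
```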
References
* Ho, J., Jain, A., & Abbeel, P. (2020). Denoising Diffusion Probabilistic Models. NeurIPS.
* Rombach, R., et al. (2022). High-Resolution Image Synthesis with Latent Diffusion Models. CVPR.
* Meng, C., et al. (2021). SDEdit: Guided Image Synthesis and Editing with Stochastic Differential Equations. ICLR.
Job code: PhD student (m/f)
Education level: Bac+8
Part-time / Full-time: Full-time
Contract type: Permanent contract (CDI)
Skills: Training Data, Artificial Intelligence, Computer Vision, Digital Rendering, ControlNet, Java APIs for Integrated Networks, Machine Learning, Open Source Technology, Large Language Models, Stable Diffusion, Ethics, Reliability, Stochastic Differential Equations, Research, Algorithms, Modulation, Editing, Composite Materials, Photo Manipulation, Lighting Installation, Robotics Design and Implementation, Semantics, Studies and Statistics, Schedule Management, Morphology, Metrics, User Modeling
Email:
stephane.lathuiliere@inria.fr
Phone:
0139635511
Advertiser type: Direct employer