DESCRIPTION:
The widespread deployment of Large Language Models (LLMs) has given rise to diverse adaptation paradigms that accommodate varying computational and infrastructural constraints. In addition to traditional full fine-tuning, emerging methods such as prompt-tuning, parameter-efficient tuning like LoRA (Low-Rank Adaptation), and in-context learning allow models to be adapted with reduced computational resources or without access to model weights. These approaches open up new possibilities for collaborative learning in privacy-sensitive contexts, where multiple clients aim to improve LLM performance without exposing their raw data.
This PhD thesis will focus on designing privacy-preserving collaborative learning strategies for LLMs, starting in a homogeneous setting, where all participants rely on the same adaptation paradigm. This initial step will build a foundation for tackling the more ambitious and impactful goal of heterogeneous collaboration, where clients operate under different adaptation regimes due to diverse privacy, computational, or architectural constraints. A central challenge of the project is to reconcile contributions from such heterogeneous clients in a unified learning process while ensuring rigorous privacy guarantees, most notably through differential privacy (DP), which provides strong theoretical protections against data leakage. The thesis will also address the trade-offs between model utility and privacy risk and propose novel mechanisms specifically tailored to this multi-paradigm collaborative learning scenario.
The PhD candidate will be based at Inria Lille, within the MAGNET research team, and will be co-supervised by Marc Tommasi, Dr. Raouf Kerkouche (Inria Lille) and Dr. Cédric Gouy-Pailler (CEA Saclay). The research will benefit from a stimulating scientific environment, combining Inria's strong expertise in machine learning and artificial intelligence with the applied research focus of the CEA. This thesis is part of the REDEEM project, funded by the PEPR IA initiative (France 2030). It offers a highly interdisciplinary environment bridging machine learning, natural language processing, and privacy-enhancing technologies, with opportunities for national and international collaboration.
Assigned mission
The person hired will carry out original research toward a PhD on "Privacy-Preserving Collaborative Learning of Large Language Models Across Heterogeneous Learning Paradigms." The research will involve designing novel collaborative protocols, formalizing privacy guarantees, and evaluating the impact of different learning paradigms on performance and privacy.
Main activities
The candidate will get acquainted with the state of the art on privacy-preserving collaborative learning and adaptation of Large Language Models (LLMs), perform original research in close interaction with the thesis supervisors and other collaborators, and design collaborative learning strategies across heterogeneous adaptation paradigms such as full fine-tuning, prompt-tuning, and in-context learning. A key part of the work will involve developing and analyzing privacy-preserving mechanisms, particularly those based on differential privacy, which offers strong theoretical guarantees against data leakage. The candidate will evaluate these mechanisms in terms of their utility-privacy trade-offs, write scientific articles detailing the results, and disseminate the work through top-tier international conferences and leading peer-reviewed journals in the areas of machine learning, privacy, and natural language processing.
Job code: Photo Model (m/f)
Current professional field: Sales Promotion Service Employees
Education level: Bac+5 (Master's level)
Part-time / Full-time: Full-time
Contract type: Permanent contract (CDI)
Skills: Artificial Intelligence, Collaborative Learning, Computer Programming, Information Leakage Prevention, Python (Programming Language), Machine Learning, Natural Language Processing, Raw Data, TensorFlow, PyTorch, Large Language Models, Information Technology, English, Success-Oriented, Applied Mathematics, Applied Research, Multidisciplinary Approach, Paradigms, Scientific Documentation, Coaching
Email:
Marc.Tommasi@inria.fr
Phone:
0139635511
Advertiser type: Direct employer