DESCRIPTION :
The goal of this project is to address key issues in Large Language Models (LLMs), particularly cultural biases stemming from Western-centric training data. These models often underperform or exhibit prejudice in non-English and especially South American contexts due to limited resources for bias detection. We propose to define sociologically grounded notions of social bias that can be computationally identified and measured. This involves data collection, annotation, and adapting existing datasets. Once biases are defined, we will detect them through model behavior analysis and apply cutting-edge fact-editing techniques to adjust the model's internal weights, mitigating harmful stereotypes while enhancing culturally relevant knowledge. Our focus is on general, multilingual methods, with a key application to Latin American languages and cultural contexts.
Is regular travel foreseen for this post ?
Regular travel to Chile is planned and fully funded.
Computing resources
Access to high-level Inria HPC clusters is granted. Access to the Jean Zay and Adastra HPC clusters will be sought.
Assignments :
With the help of the project team, and especially Djamé Seddah, the recruited person will be expected to develop techniques that facilitate the exploration and mitigation of socio-cultural biases in current Large Language Models. An important part of the project will be devoted to developing new benchmarks that assess how sensitive models are to culturally based biases.
Collaborations :
The recruited person will work in close liaison with (i) other members of the Almanach team involved in the SaLM project (https://salmproject.github.io/) who work on the adjacent areas of bias detection and neutralization, and (ii) members of Universidad de Chile (led by Valentin Barrière) and Inria Chile (led by Luis Marti, Nayat Sanchez Pi).
* Establishing the state of the art of techniques relevant to the project
* Developing software
* Conducting experiments
* Preparing reports and publications
* Releasing the resulting software and resources as open source
Job code : Socio-cultural worker (m/f)
Current professional field : Social workers and socio-educational workers
Part time / Full time : Full time
Contract type : Fixed-term contract (CDD)
Skills : Training Data, Python (Programming Language), Open Source Technology, Software Design and Development, Stemming, Large Language Models, Deep Learning, Latin, English, Spanish, French, Adaptability, Enthusiasm, Team Spirit, Self-Motivation, Editing, Data Collection, Experimentation, Postdoctoral Research, Campaign Organization, Data Science, Annotations, Multilingualism, Publishing / Editing
Email :
webmaster@inria.fr
Phone :
0139635511
Advertiser type : Direct employer