DESCRIPTION:
An increasing number of applications rely on complex inference tasks based on machine learning (ML). Currently, two options exist to run such tasks: either served directly by the end device (e.g., smartphones, IoT equipment, smart vehicles) or offloaded to a remote cloud. Both options may be unsatisfactory for many applications: local models may have inadequate accuracy, while the cloud may fail to meet delay constraints. In [SSCN+24], researchers from the Inria NEO and Nokia AIRL teams presented the novel idea of inference delivery networks (IDNs), networks of computing nodes that coordinate to satisfy ML inference requests while achieving the best trade-off between latency and accuracy. IDNs bridge the dichotomy between device and cloud execution by integrating inference delivery at the various tiers of the infrastructure continuum (access, edge, regional data center, cloud). Nodes with heterogeneous capabilities can store a set of monolithic machine-learning models with different computational/memory requirements and different accuracy, and inference requests can be forwarded to other nodes if the local answer is not considered accurate enough.
Research goal
Given an AI model's placement in an IDN, we will study inference delivery strategies to be implemented at each node. For example, a simple inference delivery strategy is to provide the inference from the local AI model if it seems accurate enough, or to forward the input to a more accurate model at a different node if the inference quality improvement (e.g., in terms of accuracy) compensates for the additional delay or resource consumption. Besides this serve-locally-or-forward policy, we will investigate more complex inference delivery strategies, which may allow inferences from models at different clients to be combined. To this purpose, we will rely on ensemble learning approaches [MS22] like bagging [Bre96] or boosting [Sch99], adapting them to the distinct characteristics of IDNs. For example, in an IDN, models may or may not be trained jointly, may be trained on different datasets, and may have different architectures, ruling out some ensemble learning techniques. Moreover, queries to remote models incur a cost, which leads us to prefer ensemble learning techniques that do not require joint evaluation of all available models.
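The serve-locally-or-forward policy described above can be sketched as follows. This is a minimal illustration, not the optimal policy of [SSCN+24]; the `Node` structure, confidence proxy, thresholds, and delay budget are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    """One computing node in the IDN hierarchy (hypothetical model)."""
    name: str
    accuracy: float                    # expected accuracy of the local model
    inference_delay: float             # seconds to run the local model
    parent: Optional["Node"] = None    # more accurate upstream node, if any
    forward_delay: float = 0.0         # network delay to reach the parent

def serve_or_forward(node: Node, confidence: float,
                     conf_threshold: float = 0.8,
                     delay_budget: float = 1.0) -> str:
    """Return the name of the node that serves the request.

    Serve locally if the local prediction looks confident enough;
    otherwise forward upstream when the accuracy gain justifies the
    extra delay and the total delay fits the budget."""
    if confidence >= conf_threshold or node.parent is None:
        return node.name  # local answer deemed accurate enough
    total_delay = (node.inference_delay + node.forward_delay
                   + node.parent.inference_delay)
    accuracy_gain = node.parent.accuracy - node.accuracy
    if total_delay <= delay_budget and accuracy_gain > 0:
        # Parent serves; pass full confidence so it answers directly.
        return serve_or_forward(node.parent, confidence=1.0)
    return node.name  # forwarding not worthwhile: serve locally

# Illustrative three-tier continuum: device -> edge -> cloud.
cloud = Node("cloud", accuracy=0.95, inference_delay=0.05)
edge = Node("edge", accuracy=0.85, inference_delay=0.02,
            parent=cloud, forward_delay=0.10)
device = Node("device", accuracy=0.70, inference_delay=0.01,
              parent=edge, forward_delay=0.03)
```

With these illustrative numbers, a confident local prediction is served on the device, while a low-confidence one is forwarded to the edge node.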
In an IDN, models could be jointly trained on local datasets using federated learning algorithms [KMA+21]. We will study how the selected inference delivery strategy may require changes to such algorithms to account for the statistical heterogeneity induced by the delivery strategy itself. For example, nodes with more sophisticated models will receive inference requests for difficult samples from nodes with simpler and less accurate models, leading to a change in the data distribution seen at inference with respect to that of the local dataset. Some preliminary results about training early-exit networks in this context are in [KSR+24].
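The distribution shift mentioned above can be made concrete with a small simulation. This is an assumed toy setup (not the experimental protocol of [KSR+24]): sample difficulty is uniform, and local confidence is modeled as a simple decreasing function of difficulty.

```python
import random

random.seed(0)

def local_confidence(difficulty: float) -> float:
    """Toy proxy: the local model is less confident on harder samples."""
    return 1.0 - difficulty

# Difficulty of each request arriving at a simple (e.g., on-device) node.
samples = [random.random() for _ in range(10_000)]

# The simple node forwards only the samples it is not confident about,
# so the upstream node sees exclusively the hard part of the distribution.
forwarded = [d for d in samples if local_confidence(d) < 0.8]

mean_all = sum(samples) / len(samples)
mean_forwarded = sum(forwarded) / len(forwarded)
# mean_forwarded is well above mean_all: the data distribution seen at the
# upstream node during inference no longer matches its local training data.
```

A federated training procedure that ignores this shift would optimize the upstream model for the wrong input distribution, which is the mismatch the proposed research aims to address.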
References
[Bre96] Leo Breiman. Bagging predictors. Machine Learning, 24(2):123-140, August 1996.
[KMA+21] Peter Kairouz et al. Advances and Open Problems in Federated Learning. Foundations and Trends® in Machine Learning, 14(1-2):1-210, 2021.
[KSR+24] Caelin Kaplan, Tareq Si Salem, Angelo Rodio, Chuan Xu, and Giovanni Neglia. Federated learning for cooperative inference systems: The case of early exit networks, 2024.
[MS22] Ibomoiye Domor Mienye and Yanxia Sun. A Survey of Ensemble Learning: Concepts, Algorithms, Applications, and Prospects. IEEE Access, 10:99129-99149, 2022.
[Sch99] Robert E. Schapire. A brief introduction to boosting. In Proceedings of the 16th International Joint Conference on Artificial Intelligence - Volume 2, IJCAI'99, pages 1401-1406, San Francisco, CA, USA, July 1999. Morgan Kaufmann Publishers Inc.
[SSCN+24] T. Si Salem, G. Castellano, G. Neglia, F. Pianese, and A. Araldo. Toward Inference Delivery Networks: Distributing Machine Learning With Optimality Guarantees. IEEE/ACM Transactions on Networking, 32(1):859-873, February 2024.
Main activities
Research.
If interested, the selected candidate may be involved in student supervision (Master's and PhD level) and teaching activities.
Job title: Chargé de Recherches (m/f)
Current professional domain: Scientists
Education level: Bac+8
Part-time / Full-time: Full-time
Contract type: Permanent contract (CDI)
Skills: Artificial Intelligence, Computer Programming, Data Centers, Data Distribution Services, Machine Learning, TensorFlow, PyTorch, Computer Technologies, Adaptability, Networking, Strategic Thinking, Team Spirit, Research, Algorithms, Applied Mathematics, Teaching, Infrastructure Management, Mathematics, Patents, Postdoctoral Research, Quality Management, Resource Consumption Accounting, Studies and Statistics, Demonstration Skills, Bagging
Email: webmaster@inria.fr
Phone: 0139635511
Advertiser type: Direct employer