Talk title: Towards LLMs that Understand, Remember, and Adapt to Users
Abstract: Large language models (LLMs) are trained on the collective knowledge of the internet, but they are used to serve billions of individual users. Recent years have witnessed increasing interest in adapting population-level LLMs to accommodate the diverse goals, preferences, and contexts of individual users. To build personalised LLMs, we need models that can understand users, maintain memory over time, learn from sparse and heterogeneous personal data, and align with what each user values. In this talk, I will illustrate how each of these challenges can be addressed through examples from our group’s recent work. I will conclude by discussing open problems and potential directions for personalised LLM research.
Bio: Yulan He is a Professor in Natural Language Processing at King’s College London and a Turing AI Fellow. Her research focuses on improving the robustness of Large Language Model (LLM) reasoning, agentic AI, long-context QA, model interpretability, safety alignment, AI for education, health, and science. She has received several prizes and awards for her research. Her five-year Turing AI Fellowship project, “Event-Centric Framework for Natural Language Understanding”, was awarded Best Research Project (Research Excellence) at the RAi UK AI & Robotics Awards 2026. She also received a Best Research Paper Award at the same awards ceremony. In addition, she is the recipient of a SWSA Ten-Year Award and a CIKM Test-of-Time Award, and was named an inaugural Highly Ranked Scholar by ScholarGPS.
Talk title: Evaluating for LLM uncertainty and bias in healthcare
Abstract: In this talk, I discuss two of our recent papers (EACL 2026 and ACL 2026) and how they combine uncertainty quantification and bias evaluation for LLMs in healthcare. First, we explore uncertainty quantification methods for LLMs tailored to clinical applications. We look at how different uncertainty quantification methods fare when applied to different clinical specialties (e.g. cardiology vs. speech-language pathology) and to different types of clinical concepts (e.g. diagnoses vs. procedures). Second, we investigate how a patient's sexual orientation and religious affiliation distort uncertainty signals and model accuracy. We show that these identity markers cause a "calibration crisis" in LLMs, with harms to calibration that compound non-additively, a significant risk to LLM fairness and safety.
Bio: Iacer Calixto has a background in Computer Science, Natural Language Processing, and Machine Learning, and obtained his PhD from Dublin City University (2017) on the topic of integrating visual information in machine translation. He was a Marie-Curie Postdoctoral Fellow visiting the New York University Courant Institute of Mathematical Sciences, and currently leads the NLP4Health Lab in the Department of Medical Informatics of the University of Amsterdam. His lab, which includes 6 PhD students and 2 postdocs, tackles the methodological bottlenecks that must be overcome to take NLP methods from bench to clinical practice: how to guarantee patient privacy, how to quantify the uncertainty of large language models (LLMs), how to integrate data from multiple modalities (e.g., structured data, medical images, time series, text), and how to make NLP models interpretable and explainable. Iacer also focuses on real-world problems across a number of high-impact clinical specialties, such as cardiology and intensive care medicine. He holds an NWO AiNED Fellowship (2024-2029) and is involved in the EU projects DataTools4Heart (whose goal is to develop a toolbox for clinicians, researchers, and data scientists) and Medispeech (whose goal is to automate the reporting of doctor-patient consultations using LLMs).
Talk title: Responsibly Building Multilingual Language Models for Hundreds of Languages
Abstract: Large Language Models (LLMs) have transformed artificial intelligence, yet their development remains heavily skewed toward high-resource languages, leaving many global communities underserved. Multilingual LLMs aim to bridge this gap but face steep hurdles, including severe data scarcity, biased tokenization systems, and a lack of representative evaluation benchmarks. This presentation explores these critical challenges and introduces practical, technically innovative solutions designed to build fairer AI systems. The discussion details new insights into how multilingual models transfer knowledge across diverse languages, alongside architectural enhancements that boost cross-lingual reasoning. To tackle data and evaluation constraints, the research introduces a novel contrastive learning approach for low-resource language identification, a parity-aware tokenization algorithm to eliminate script biases, and a comprehensive evaluation benchmark spanning 44 languages. Ultimately, these contributions provide the theoretical frameworks and practical tools necessary to advance equitable AI technologies that leave no language behind.
Bio: Negar Foroutan is a research scientist at Google Research, Zurich, and a recent PhD graduate from EPFL. Her research broadly encompasses NLP and machine learning, with a passionate focus on improving the multilingual capabilities of LLMs, especially in low-resource settings. She is actively involved in the full pipeline of training multilingual LLMs, including pretraining data construction, data mixtures, language-aware tokenization, and robust evaluation. When she isn't training models, you can find her hiking, keeping up with politics, or annoying her friends with unsolicited etymology facts :)
Talk title: Compositional approaches in modelling language and reasoning
Abstract: Neural approaches to modelling language and concepts have proven quite effective, with a proliferation of large models trained on correspondingly massive datasets. However, these models still fail on some tasks that humans, and symbolic approaches, can easily solve. Large neural models are also, to a certain extent, black boxes, particularly those that are proprietary. There is therefore a need to integrate compositional and neural approaches, firstly to potentially improve the performance of large neural models, and secondly to analyze and explain the representations that these systems use. In this talk I will present results showing that large neural models can fail at tasks that humans are able to do, and discuss alternative, theory-based approaches that have the potential to perform more strongly. I will give applications in language, reasoning, and vision. Finally, I will present some future directions in understanding the types of reasoning or symbol manipulation that large neural models may be performing.
Talk title: TBA
Abstract: TBA
Bio: TBA
Talk title: TBA
Abstract: TBA
Bio: TBA
Talk title: TBA
Abstract: TBA
Bio: TBA