Talk title: Rethinking Evaluation for the Future of Language Models
Abstract: Evaluation plays a central, but often underestimated, role in how large language models are developed and understood. This talk critically examines current evaluation practices, highlighting how they shape perceptions of model progress while often overlooking key challenges in robustness, multilingual performance, and real-world reliability. By reflecting on these gaps, I make the case for treating evaluation as a guiding force rather than a final checkpoint in building more capable, inclusive, and trustworthy LLMs.
Bio: Marzieh Fadaee is a staff research scientist at Cohere Labs (formerly Cohere For AI) whose work centers on multilingual language models, data-efficient learning, and robust evaluation methods. Her research aims to improve language technologies for diverse and low-resource languages, with a focus on understanding how data selection, representation, and model design influence generalization and fairness. She has published extensively on these topics and has contributed to the development of widely used multilingual benchmarks and datasets. In addition to her research, she is involved in mentoring early-career researchers and actively engages in community-driven science initiatives.
Talk title: A blueprint for adaptive and tokenizer-free foundation models
Abstract: Foundation models (FMs) process information as a sequence of internal representations; however, the length of this sequence is fixed and entirely determined by tokenization. This essentially decouples representation granularity from information content, which exacerbates the deployment costs of FMs and narrows their “horizons” over long sequences. What if, instead, we could free FMs from tokenizers by modelling bytes directly, while making them faster than current tokenizer-bound FMs? To achieve this goal, I will show how to: 1) learn tokenization end-to-end, by dynamically pooling representations in internal layers and progressively learning abstractions from raw data; 2) compress the KV cache (memory) of Transformers adaptively during generation without loss of performance; 3) predict multiple bytes per time step in an efficient yet expressive way; 4) retrofit existing tokenizer-bound FMs into byte-level FMs through cross-tokenizer distillation. By blending these ingredients, we may soon witness the emergence of new, more efficient architectures for foundation models.
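For readers curious what idea (1) in the abstract above might look like concretely, below is a minimal, hypothetical sketch, not the speaker's implementation: a learned boundary predictor marks segment ends over raw bytes, and byte embeddings within each segment are mean-pooled, so the internal sequence length adapts to content rather than to a fixed tokenizer. The class name, dimensions, and PyTorch-based design are illustrative assumptions; a trainable version would replace the hard threshold with a differentiable relaxation.

```python
# Illustrative toy sketch (not the speaker's method) of dynamic pooling over
# byte-level representations: predict segment boundaries, then mean-pool the
# bytes inside each segment to shorten the internal sequence adaptively.
import torch
import torch.nn as nn


class DynamicBytePooler(nn.Module):
    def __init__(self, d_model: int = 64, vocab_size: int = 256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)   # one embedding per byte value
        self.boundary_head = nn.Linear(d_model, 1)       # predicts P(segment boundary)

    def forward(self, byte_ids: torch.Tensor, threshold: float = 0.5) -> torch.Tensor:
        # byte_ids: (seq_len,) tensor of raw byte values in [0, 255]
        x = self.embed(byte_ids)                                        # (seq_len, d_model)
        boundary_prob = torch.sigmoid(self.boundary_head(x)).squeeze(-1)

        # Hard segmentation for illustration only; end-to-end training would
        # need a differentiable relaxation (e.g. a straight-through estimator).
        is_boundary = boundary_prob > threshold
        segment_id = torch.cumsum(is_boundary.long(), dim=0)
        segment_id = segment_id - segment_id.min()                      # segments start at 0

        # Mean-pool byte representations within each segment.
        num_segments = int(segment_id.max().item()) + 1
        pooled = torch.zeros(num_segments, x.size(-1))
        counts = torch.zeros(num_segments, 1)
        pooled.index_add_(0, segment_id, x)
        counts.index_add_(0, segment_id, torch.ones(x.size(0), 1))
        return pooled / counts.clamp(min=1)                             # (num_segments, d_model)


if __name__ == "__main__":
    model = DynamicBytePooler()
    raw = torch.tensor(list("hello world".encode("utf-8")), dtype=torch.long)
    shortened = model(raw)
    print(raw.shape, "->", shortened.shape)  # the internal sequence is shorter than the byte input
```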
Bio: Edoardo M. Ponti is an assistant professor in Natural Language Processing at the University of Edinburgh and a visiting professor at NVIDIA. Previously, he was a visiting postdoctoral scholar at Stanford University and a postdoctoral fellow at Mila Montreal. In 2021, he obtained a PhD in computational linguistics from the University of Cambridge, St John’s College. His main research foci are efficient architectures for foundation models (tokenizer-free and adaptive memory), modular deep learning, and grounded typology. His research earned him a Google Research Faculty Award and 3 best paper awards. He is a board member and co-founder of SIGTYP and a scholar of the European Lab for Learning and Intelligent Systems (ELLIS). He is also a (terrible) violinist and an aspiring practitioner of heroic viticulture.
Talk title: TBA
Abstract: TBA
Bio: TBA
Talk title: TBA
Abstract: TBA
Bio: TBA
Talk title: TBA
Abstract: TBA
Bio: TBA
Talk title: TBA
Abstract: TBA
Bio: TBA