Duke Computer Science Colloquium

Discovering Hidden Capabilities and Limits in Large Language Models

March 4
Speaker(s): Peter West

Lunch

Lunch will be served at 11:45 AM.

Abstract 

As massive language models (LMs) like GPT-4 dominate natural language processing and AI, extreme scale has become a clear and frequent theme for success. However, increasing model size is inherently at odds with the interests of a diverse user base and of the open research community. The largest models are typically closed to the public, extremely energy-intensive, and difficult to study in a systematic and reproducible way.

In this talk, I will discuss my vision for effective natural language processing beyond scale alone, focusing on more efficient methods that work with compact language models to unlock hidden capabilities. I will begin with inference-time algorithms that operate on top of existing models to enable new functionality. Next, I will describe a method for distilling valuable knowledge from extreme-scale models into compact LMs. Finally, I will discuss my work toward understanding the limits that even extreme-scale language models suffer from, with a particular focus on how such models differ from human intuitions.

Speaker Bio

Peter West is a PhD student in the Paul G. Allen School of Computer Science & Engineering at the University of Washington, working with Yejin Choi. His research focuses on natural language processing and language models, particularly the capabilities and limits of both compact and extreme-scale models. His work has received multiple awards, including a best methods paper award at NAACL 2022 and outstanding paper awards at ACL and EMNLP in 2023, and has been supported in part by an NSERC PGS-D fellowship. Previously, Peter received a BSc in computer science from the University of British Columbia.

Contact

Carlo Tomasi