Thursdays, 10:00am–12:30pm in 32-D461
Hadas Kotek
Course Description
This course explores the abilities and limitations of language models, focusing on state-of-the-art tools such as GPT-4, Gemini, and LLaMA. LLMs possess impressive language abilities, but they also occasionally fail in unpredictable ways. Our goal in this class will be to map the abilities and limitations of these models, focusing on complex reasoning and language abilities. We will attempt to discover systematicity in the models’ failures and to understand how those failures relate, on the one hand, to how the prompt is formulated and to what we believe the training data and model architecture to be, and, on the other hand, to how humans perform on the same tasks and how children acquire this knowledge.

We will additionally consider the various costs associated with the deployment and use of LLMs, be they due to privacy breaches, environmental costs, security risks, copyright abuses, or the entrenchment and amplification of biases and stereotypes at scale. Along the way, we will investigate the development of language technologies and their capacities over time, as well as the state-of-the-art linguistic theories that explain the phenomena of interest. We’ll ask ourselves whether it is reasonable to conclude that LLMs approach complex language reasoning in a way similar to humans, and what this means for how we should understand what LLMs actually do (and how humans can and should interact with them).