Going beyond pretraining: recent advances, applications and future directions for test-time compute and RL
While large language models have traditionally relied on massive pretraining compute budgets, a paradigm shift is emerging that prioritizes test-time computation and reinforcement learning. This talk explores how techniques such as chain-of-thought reasoning, self-consistency, and iterative refinement unlock new capabilities by investing compute during inference rather than during training alone. We examine recent breakthroughs, including RL with verifiable rewards and process reward models, which enable reasoning models to achieve superhuman performance on tasks such as advanced mathematics and to develop agentic capabilities. The talk also covers practical concerns for RL and test-time compute, including RL training efficiency and the design of domain-specific reward functions. Finally, we outline future research directions, including hybrid architectures that combine reasoning and non-reasoning tasks, and the potential for separating memory and intelligence in LLMs.
Speaker’s profile
Bill Cai is a Senior Applied Scientist in the Generative AI Innovation Center at Amazon Web Services. His most recent research focuses on model optimization and fine-tuning, including efficient deployment on hardware such as NVIDIA and AWS Neuron devices, and on optimizing model performance across multiple modalities for domain-specific and language-specific applications. He has recently published NLP, CV, and ML research at NAACL, ACM MM, the ACM Web Conference, IEEE IoTJ, a CVPR workshop, and NeurIPS workshops. Bill holds a Master’s degree from the Massachusetts Institute of Technology and a Bachelor’s degree from the University of Chicago.