I am a Master's student at the University of Waterloo, supervised by Prof. Sirisha Rambhatla. My research focuses on safe post-training adaptation of large language models (LLMs), spanning tamper-resistance evaluation and parameter-efficient fine-tuning (PEFT) methods.
Training large language models (LLMs) is highly resource-intensive due to their massive parameter counts and the overhead of optimizer states. While recent work has aimed to reduce memory consumption, such efforts often entail trade-offs among memory efficiency, training time, and model performance. Yet true democratization of LLMs requires simultaneous progress across all three dimensions. To this end, we propose SubTrack++, which leverages Grassmannian gradient subspace tracking combined with projection-aware optimizers, enabling Adam's internal statistics to adapt to subspace changes. In addition, we employ recovery scaling, a technique that restores information lost through low-rank projections, further improving model performance. Our method achieves state-of-the-art convergence by exploiting Grassmannian geometry, reducing training wall-time by up to 65% compared to the best-performing baseline, LDAdam, while preserving its reduced memory footprint.
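To give a feel for the ingredients named above, here is a minimal NumPy sketch of Adam run on low-rank gradient coordinates with a Grassmannian tracking step and a projection-aware moment rotation. Everything here is an illustrative assumption on my part, not the published SubTrack++ algorithm: the function names (`grassmann_track`, `subspace_adam_step`), the QR retraction, the moment-rotation rule, and the recovery-scaling formula are all stand-ins chosen to make the idea concrete.

```python
# Illustrative sketch only: names and exact update rules below are
# assumptions, not the published SubTrack++ algorithm.
import numpy as np

def qr_retract(U):
    """Retract back onto the Grassmannian via QR (orthonormal columns)."""
    Q, _ = np.linalg.qr(U)
    return Q

def grassmann_track(U, G, step=0.1):
    """One Riemannian step tracking the dominant gradient subspace.

    Descending ||G - U U^T G||_F^2 over the Grassmannian gives the
    direction (I - U U^T) G G^T U, applied with a QR retraction.
    """
    residual = G - U @ (U.T @ G)       # component of G outside span(U)
    direction = residual @ (G.T @ U)   # (I - U U^T) G G^T U
    return qr_retract(U + step * direction)

def subspace_adam_step(W, G, U, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """Adam on low-rank gradient coordinates with projection-aware moments.

    When the tracked basis rotates from U to U_new, the first moment is
    realigned via U_new^T U; the second moment is elementwise, so the same
    rotation is only a crude approximation.
    """
    U_new = grassmann_track(U, G)
    rot = U_new.T @ U                  # r x r change of basis
    m = rot @ m                        # realign Adam's first moment
    v = np.abs(rot) @ v                # rough realignment of second moment

    R = U_new.T @ G                    # project gradient into the subspace
    m = b1 * m + (1 - b1) * R
    v = b2 * v + (1 - b2) * R**2
    m_hat = m / (1 - b1**t)
    v_hat = v / (1 - b2**t)
    update = U_new @ (m_hat / (np.sqrt(v_hat) + eps))  # lift to full space

    # "Recovery scaling" (one plausible reading): rescale so the low-rank
    # update keeps the gradient magnitude the projection discarded.
    scale = np.linalg.norm(G) / (np.linalg.norm(U_new @ R) + eps)
    W -= lr * scale * update
    return W, U_new, m, v

# Toy usage: a 64x32 weight matrix with a rank-4 tracked subspace.
rng = np.random.default_rng(0)
W = rng.standard_normal((64, 32))
U = qr_retract(rng.standard_normal((64, 4)))
m = np.zeros((4, 32))
v = np.zeros((4, 32))
for t in range(1, 6):
    G = rng.standard_normal((64, 32))  # stand-in for a real gradient
    W, U, m, v = subspace_adam_step(W, G, U, m, v, t)
```

The memory saving comes from keeping Adam's moments at the subspace size (r x m) rather than the full weight size (n x m); the tracking step is what keeps that subspace useful as the gradients drift during training.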
Featured in UWaterloo News for our work on SubTrack++, a training technique that accelerates large language model pre-training by up to 65% while maintaining state-of-the-art accuracy, helping to democratize AI by reducing computational costs.
Supervised by Prof. Sirisha Rambhatla
Built a platform for analyzing flight data from the Pipistrel Velis Electro, the world's first type-certified electric plane. The platform leverages ML to optimize flight schedules based on weather forecasts and to predict battery consumption under Canadian conditions.
With Meenakshi Andoorveedu, Peter Twarecki, Joanna Yang, Vikram Bhatt