One project my group worked on is the cost-efficient use of large language models (LLMs). Recently, there has been a proliferation of LLMs from multiple providers, each with different inference accuracy, monetary cost, and latency. If a user has a finite monetary budget and wants to use LLMs to solve a series of problems, how should they allocate their dollars across these diverse LLMs? We propose a reinforcement learning approach, TREACLE, that learns which LLM to choose for each query, intelligently trading off accuracy against cost. This work appeared at NeurIPS 2024.
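To give a flavor of the problem, here is a toy sketch of budget-aware LLM selection. This is not the TREACLE algorithm itself (which learns its policy via reinforcement learning); the model names, prices, accuracy estimates, and the greedy rule below are all hypothetical placeholders.

```python
# Toy sketch of budget-aware LLM selection (NOT the actual TREACLE method).
# Model names, costs, and accuracy estimates are made-up placeholders.
from dataclasses import dataclass

@dataclass
class LLM:
    name: str
    est_accuracy: float  # estimated probability of answering correctly
    cost: float          # dollars per query

MODELS = [
    LLM("small-model", est_accuracy=0.60, cost=0.001),
    LLM("medium-model", est_accuracy=0.75, cost=0.010),
    LLM("large-model", est_accuracy=0.90, cost=0.050),
]

def choose_llm(budget_left: float, questions_left: int) -> LLM:
    """Greedy stand-in for a learned policy: pick the most accurate model
    whose cost still leaves enough budget to answer every remaining
    question with the cheapest model."""
    cheapest = min(m.cost for m in MODELS)
    affordable = [
        m for m in MODELS
        if budget_left - m.cost >= (questions_left - 1) * cheapest
    ]
    if not affordable:  # budget exhausted: fall back to the cheapest model
        return min(MODELS, key=lambda m: m.cost)
    return max(affordable, key=lambda m: m.est_accuracy)

# Example: answer 10 questions with a $0.10 total budget.
budget, n = 0.10, 10
for i in range(n):
    model = choose_llm(budget, n - i)
    budget -= model.cost
    print(f"question {i + 1}: {model.name} (${budget:.3f} left)")
```

Even this crude rule shows the core tension: spending on an expensive, accurate model early means settling for cheaper, weaker models later, and a learned policy can make that trade-off adaptively.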
I started off as an undergraduate studying biomedical engineering.
I have a diploma in piano performance.
I love traditional Chinese lion dance and used to perform all over New York City while I was in university.