Stable Offline Value Function Learning with Bisimulation-based Representations
[arxiv] Brahma S. Pavse, Yudong Chen, Qiaomin Xie, Josiah P. Hanna
I am a third-year Computer Science PhD candidate at the University of Wisconsin-Madison.
I am broadly interested in representation learning and offline policy evaluation (OPE) of reinforcement learning policies. OPE is the important problem of evaluating a policy's performance without taking the risk of deploying it. Within this setting, I study how AI agents can learn suitable representations for stable and accurate offline policy evaluation.
I am fortunate to be advised by Josiah Hanna, and I also closely collaborate with Qiaomin Xie and Yudong Chen. I have also interned at Sony AI.
Previously, I completed my BS and MS in Computer Science at the University of Texas at Austin, where I was fortunate to be advised by Peter Stone. I also worked as a software engineer at Salesforce and SAS Institute.
Feel free to shoot me an email if you want to chat!