Deep Value Benchmark

Co-authored a paper that was recognized as a Spotlight (top ~3.2% of submissions) at NeurIPS 2025!

Collaborated with researchers from the University of Michigan School of Information and the University of Washington to develop an AI alignment benchmark that measures how well LLMs learn fundamental human values as opposed to surface-level preferences.

Developed supervised fine-tuning and few-shot learning experiments to evaluate state-of-the-art models on the Deep Value Benchmark. Engineered prompts to generate 12,000 examples, grounded in foundational psychology research, for use in these experiments.

Paper Link