Deep Value Benchmark

Collaborating with researchers from the University of Michigan School of Information and the University of Washington to develop an AI alignment benchmark that measures how well LLMs learn fundamental human values as opposed to surface-level preferences.

Developed supervised fine-tuning and few-shot learning experiments to evaluate state-of-the-art models on the Deep Value Benchmark. Engineered prompts to generate 12,000 examples, grounded in foundational psychology research, for use in these experiments.

The paper is currently under review for the NeurIPS 2025 Conference Proceedings.