I am an incoming PhD student in Computer Science at Purdue University, advised by Prof. Jason Wu. I study how people and AI systems interact, how language models reason, and how we can evaluate these systems in ways that reflect human values. I am currently completing my B.S. in Computer Science at the University of Notre Dame, graduating in May 2026.

I have been fortunate to be mentored by Prof. Toby Jia-Jun Li (SaNDwich Lab, University of Notre Dame), Prof. Xiangliang Zhang (MINE Lab, University of Notre Dame), and Prof. Qiaozhu Mei (FORESEER Group, University of Michigan) across projects spanning web agent evaluation, value alignment, reasoning calibration, and benchmark analysis.

Research Interests

I am broadly interested in Human-AI Interaction (HAI), especially:

  1. Human-AI collaboration and evaluation - Designing systems that help people work effectively with AI agents, and developing evaluation methods that capture interaction quality beyond simple task success
  2. LLM systems and human-centered implementation - Developing language model systems that bring AI capabilities into real-world settings in ways that are useful, usable, and grounded in human needs
  3. LLM reasoning and interpretability - Understanding, probing, and calibrating the reasoning behavior of language models
  4. Evaluation methodology and benchmarking - Creating rigorous and reproducible evaluation frameworks for comparing AI systems and benchmarks

Publications

The Behavioral Fabric of LLM-Powered GUI Agents: Human Values and Interaction Outcomes
Gebreegziabher, S., Yang, Y., Chiang, C., Yoo, H., Chen, C., Do, H. J., Ashktorab, Z., Geyer, W., Gomez-Zara, D., Li, T. J.-J.
IUI 2026, pp. 909-927.
[PDF]

News