Posts by Collection

portfolio

Benchmark Harmonization and Model Similarity Analysis

Harmonizing major AI evaluation benchmarks (LiveBench, HELM, LMSYS Arena) and developing model similarity maps at FORESEER Lab, University of Michigan.

Meaning-Removed Steering Vectors for LLM Reasoning

Developing novel techniques for calibrating large language model reasoning through sentence-level hidden-state interventions at MINE Lab, University of Notre Dame.

Trajectory-Level Web Agent Evaluation

Developing comprehensive evaluation frameworks for web agents that assess both action sequences and value alignment at SaNDwich Lab (IBM–Notre Dame collaboration).

ML Verification Benchmarks and Reproducibility

Improving machine learning verification benchmarks and addressing numerical reproducibility challenges with the alpha-beta-CROWN verification tool at UIUC.

Dan (Hojun) Yoo

talks