Marius Binner

Position

PhD Candidate, Artificial Intelligence

Affiliation

Research groups

Short info

Working on technical AI safety, currently mechanistic interpretability of agentic frontier models. Frontier AI systems can autonomously complete increasingly long-horizon tasks: https://arxiv.org/abs/2503.14499

It would be nice to have ways to ensure they're not dangerous.