Tom-Felix Thormann

Publications

Philosophy of AI

Thormann, T.F. (Under Review). Defining Strategic AI Deception Through Generalized Propositional Attitudes. Download PDF

Dung, L., Thormann, T.F. (Under Review). Mask or Mind? Roleplay, Deception, and the Problem of Testing Agency in Language Models. Download PDF


Other Philosophy

Thormann, T.F., Michelini, M. (Under Review). An Evolutionary Model of Morality as Externalization. Download PDF


More Technical Work

Thormann, T.F., Gilles, M. (Under Review). Probing the Limits of the Lie Detector Approach to LLM Deception. https://doi.org/10.48550/arXiv.2603.10003. Download PDF