Dissecting the Ullman Variations with a SCALPEL: Why do LLMs fail at Trivial Alterations to the False Belief Task?
Why might LLMs succeed on traditional false belief tasks yet fail when trivial variations are introduced? Do they rely exclusively on spurious correlations learned from their training data, lacking any genuine ToM capabilities? Or are there other explanations?
