Guy Leroy, Stephanie Milani, Evelyn Zuniga*, Jaroslaw Rzepecki, Raluca Georgescu, Ida Momennejad, Dave Bignell, Mingfei Sun, Ali Shaw, Gavin Costello, Mikhail Jacob, Sam Devlin, Katja Hofmann
Abstract: The goal of this paper is to understand how people assess human-likeness in human- and AI-generated behavior. To this end, we present a qualitative study of hundreds of crowd-sourced assessments of the human-likeness of behavior in a 3D video game navigation task. In particular, we focus on an AI agent that has passed a Turing Test in the sense that, at a quantitative level, human judges were unable to reliably distinguish videos of a human navigating from videos of the AI agent navigating. Our insights shed light on the characteristics that people consider human-like. Understanding these characteristics is a key first step toward improving AI agents in the future.