Towards Social Artificial Intelligence: Nonverbal Social Signal Prediction in A Triadic Interaction

Hanbyul Joo, Tomas Simon, Mina Cikara, Yaser Sheikh

Carnegie Mellon University

Harvard University

Abstract

We present a new research task and a dataset for understanding human social interactions via computational methods, with the ultimate goal of endowing machines with the ability to encode and decode the broad channel of social signals that humans use. This research direction is essential for building machines that genuinely communicate with humans, which we call Social Artificial Intelligence. We first formulate the “social signal prediction” problem as a data-driven way to model the dynamics of social signals exchanged among interacting individuals. We then present a new 3D motion capture dataset for exploring this problem, in which a broad spectrum of social signals (3D body, face, and hand motions) is captured in a triadic social interaction scenario. Finally, we present baseline approaches for predicting the speaking status, social formation, and body gestures of interacting individuals within the defined social signal prediction framework.

Publication

Towards Social Artificial Intelligence: Nonverbal Social Signal Prediction in A Triadic Interaction
Hanbyul Joo, Tomas Simon, Mina Cikara, Yaser Sheikh
In CVPR 2019 (Oral)
[arxiv]

Dataset

The raw data (videos and skeletons) can be downloaded via our Panoptic Studio toolbox using the script get_haggling.sh.

The processed data and code used in the paper (with speaking status annotations) will be available soon.