The CLAff Diplomacy dataset

This was one of the most interesting and challenging datasets I explored. The original dataset was collected by a brilliant team of researchers over at CMU. If you want to see how awesome they are, check out their video describing the data.

Here’s the description provided by Jordan Boyd-Graber:

Machine learning techniques to detect deception in online communications requires training and evaluation data. However, there is a dearth of data either because of uncertain gold labels or privacy concerns; we create a new, large deception-centered dataset in the online game of Diplomacy. We gathered 17,289 messages from 12 games (each of which took over a month) involving 84 players, the majority of which were unique users. This data was collected with a custom-made bot that allowed us to collect messages and annotations. The user pool was created from scratch: we varied participant demographics across gender, age, nationality, and past game experience. Some of our participants included the former president of the Diplomacy players’ association, several top ranked players in the world, a board game shop owner, and scientists. We create machine learning models to detect lies using linguistic, context, and power-dynamic features. Our best model had similar lie detection accuracy to humans. Paper:…

Diplomacy, taken from Wikipedia

So yeah, we got this data and began to explore it. We were mainly surprised by how limited the recent literature on online cooperation was. The headlining papers at CL conferences have focused on detecting multimodal affect using deep learning networks. This challenge (our challenge) of detecting the fine nuances of negotiation in online teams was last properly touched in the 1990s, when semantic frames were still a thing and natural language generation was called “machinery for Artificial Language” 😀 Thank you, Candace L. Sidner, for a fabulous paper on the topic, which really kickstarted our own conceptualization.

Ultimately, we only released a teaser for the Shared Task, but further investigation is ongoing, I promise! There’s a lot to be learned from how complete strangers please, flatter, and persuade each other to gain ground, wage wars, and forge alliances. Here’s a sample of the kind of things they say:

Examples from the CLAff Diplomacy dataset

Please check out the dataset here. If you’re still in time, do register for the Shared Task @ AAAI so that you can get access to the test set. The original dataset is described here and here.
