Context-Aware Language Modeling for Goal-Oriented Dialogue Systems
Paper | Code

CALM's training teaches the model to pay closer attention to the dialogue task context, which yields coherent and successful outputs.
Summary

Dialogue Task Relabeling

Context-Aware Finetuning

Model-Based Rollout Planning

The standard language modeling objective naturally learns to model both our own utterances (the policy) and the other person's utterances (the dynamics). We use this insight to improve CALM's task success rate by executing a simple model-based planning procedure within the language model.

We sample full rollouts of the dialogue and then re-rank them with a reward model to determine the agent's response.
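
The sample-then-re-rank step can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not CALM's actual implementation: `plan_response`, `sample_rollout`, `reward_fn`, and the toy helpers are hypothetical names, the rollout sampler and reward model are stand-in stubs rather than a real language model, and returning the first utterance of the highest-scoring rollout is an assumed reading of "determine the agent's response".

```python
import itertools
from typing import Callable, List, Sequence, Tuple

Rollout = List[str]  # utterances continuing the dialogue; first one is the agent's


def plan_response(
    history: Sequence[str],
    sample_rollout: Callable[[Sequence[str]], Rollout],
    reward_fn: Callable[[Sequence[str], Rollout], float],
    num_rollouts: int = 8,
) -> Tuple[str, float]:
    """Sample full rollouts, score each one, and return the opening agent
    utterance of the best rollout together with its score.

    Assumption: the chosen response is the first utterance of the
    highest-scoring rollout. Ties keep the earliest sample.
    """
    if num_rollouts < 1:
        raise ValueError("num_rollouts must be at least 1")

    best_response = None
    best_score = float("-inf")
    for _ in range(num_rollouts):
        rollout = sample_rollout(history)
        if not rollout:
            continue  # an empty rollout offers no candidate response
        score = reward_fn(history, rollout)
        if score > best_score:
            best_response, best_score = rollout[0], score

    if best_response is None:
        raise ValueError("every sampled rollout was empty")
    return best_response, best_score


def make_cycling_sampler(candidates: Sequence[Rollout]) -> Callable[[Sequence[str]], Rollout]:
    """Hypothetical stand-in for sampling from the language model:
    cycles through a fixed list of canned rollouts."""
    cycle = itertools.cycle(candidates)
    return lambda history: list(next(cycle))


def toy_reward(history: Sequence[str], rollout: Rollout) -> float:
    """Hypothetical stand-in for a reward model: 1.0 if the rollout
    ends with a completed booking, otherwise 0.0."""
    return 1.0 if any("booked" in utterance for utterance in rollout) else 0.0
```

In this sketch the language model plays both roles inside `sample_rollout`: it generates the agent's utterances (the policy) and the other speaker's replies (the dynamics), so each rollout is a complete imagined continuation that the reward function can score.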

Results

A comparison of CALM against baselines, with all models evaluated in a simulated setting. We present results for both standard greedy decoding (greedy) and model-based planning (planning).

Example Dialogues

We present a set of example dialogues produced by CALM (in green) in the simulated evaluation. Despite being an end-to-end model, CALM produces coherent and sensible outputs.
