Goal: Build an interpretable dialogue system and analyse how attention shapes generation quality over training.
In this project, I developed a sequence-to-sequence (Seq2Seq) dialogue system using an attention-based encoder-decoder architecture. The model was trained to generate appropriate responses in a conversational setting, with a particular focus on interpreting how the attention mechanism evolves during training.
I evaluated model outputs at different training stages (5, 50, and 100 epochs) and visualised attention weight matrices to examine how the decoder attends to input tokens. I also compared greedy and beam search decoding strategies to assess response quality and diversity.
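As a rough illustration of how such an attention matrix can be rendered, here is a minimal Matplotlib sketch; the tokens and the `attn_weights` array are hypothetical stand-ins for the model's actual outputs, not data from the project.

```python
import numpy as np
import matplotlib.pyplot as plt

def plot_attention(attn_weights, src_tokens, tgt_tokens):
    """Render a decoder-over-encoder attention matrix as a heatmap.

    attn_weights: array of shape (len(tgt_tokens), len(src_tokens));
    each row is the decoder's attention distribution at one output step.
    """
    fig, ax = plt.subplots()
    im = ax.imshow(attn_weights, cmap="viridis", aspect="auto")
    ax.set_xticks(range(len(src_tokens)), labels=src_tokens, rotation=45)
    ax.set_yticks(range(len(tgt_tokens)), labels=tgt_tokens)
    ax.set_xlabel("input tokens")
    ax.set_ylabel("generated tokens")
    fig.colorbar(im, ax=ax, label="attention weight")
    fig.tight_layout()
    plt.show()

# Hypothetical example: each row sums to 1 (softmax over input positions).
src = ["how", "are", "you", "?"]
tgt = ["i", "am", "fine", "."]
plot_attention(np.random.dirichlet(np.ones(len(src)), size=len(tgt)), src, tgt)
```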
Key Highlights:
- Implemented Bahdanau-style attention in a Seq2Seq framework (see the attention sketch after this list)
- Trained on an open-domain dialogue dataset
- Compared greedy vs. beam search decoding (see the decoding sketch after this list)
- Visualised attention to interpret learning progression
- Discussed overfitting, sequence length, and vocabulary limits
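For reference, the Bahdanau-style (additive) attention mentioned above scores each encoder position with a small feed-forward network over the decoder state and the encoder outputs. The PyTorch sketch below shows the general shape of such a module; the tensor layout and names are illustrative assumptions, not the project's exact code.

```python
import torch
import torch.nn as nn

class BahdanauAttention(nn.Module):
    """Additive attention: score(s, h) = v^T tanh(W_s s + W_h h)."""

    def __init__(self, hidden_dim):
        super().__init__()
        self.W_s = nn.Linear(hidden_dim, hidden_dim)   # projects decoder state
        self.W_h = nn.Linear(hidden_dim, hidden_dim)   # projects encoder outputs
        self.v = nn.Linear(hidden_dim, 1, bias=False)

    def forward(self, dec_state, enc_outputs):
        # dec_state: (batch, hidden); enc_outputs: (batch, src_len, hidden)
        scores = self.v(torch.tanh(
            self.W_s(dec_state).unsqueeze(1) + self.W_h(enc_outputs)
        )).squeeze(-1)                                 # (batch, src_len)
        weights = torch.softmax(scores, dim=-1)
        context = torch.bmm(weights.unsqueeze(1), enc_outputs).squeeze(1)
        return context, weights                        # context: (batch, hidden)
```

The `weights` tensor returned at each decoder step is the kind of row that the heatmap sketch above visualises.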
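The greedy-versus-beam comparison can likewise be sketched compactly. In the hypothetical snippet below, `step_fn` stands in for one decoder step (taking the tokens generated so far plus a decoder state and returning log-probabilities over the vocabulary); it is an assumed interface, not the project's actual API.

```python
import torch

def greedy_decode(step_fn, state, max_len, eos_id):
    """Pick the single most likely token at every step."""
    tokens = []
    for _ in range(max_len):
        log_probs, state = step_fn(tokens, state)      # log_probs: (vocab,)
        tok = int(log_probs.argmax())
        tokens.append(tok)
        if tok == eos_id:
            break
    return tokens

def beam_decode(step_fn, state, max_len, eos_id, beam=3):
    """Keep the `beam` highest-scoring partial hypotheses at every step."""
    beams = [([], state, 0.0)]                         # (tokens, state, log-prob)
    for _ in range(max_len):
        candidates = []
        for tokens, st, score in beams:
            if tokens and tokens[-1] == eos_id:        # finished: carry forward
                candidates.append((tokens, st, score))
                continue
            log_probs, new_st = step_fn(tokens, st)
            top = torch.topk(log_probs, beam)
            for lp, tok in zip(top.values, top.indices):
                candidates.append((tokens + [int(tok)], new_st, score + float(lp)))
        beams = sorted(candidates, key=lambda c: c[2], reverse=True)[:beam]
        if all(t and t[-1] == eos_id for t, _, _ in beams):
            break
    return beams[0][0]                                 # best-scoring hypothesis
```

Greedy decoding commits to the locally best token at each step, while beam search keeps several hypotheses alive, which typically trades safe, repetitive replies against more varied ones.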
Tech stack: PyTorch, NumPy, Matplotlib, Seq2Seq architecture
