Goal: Build an interpretable dialogue system and analyse how attention shapes generation quality over training.
In this project, I developed a sequence-to-sequence (Seq2Seq) dialogue system using an attention-based encoder-decoder architecture. The model was trained to generate appropriate responses in a conversational setting, with a particular focus on interpreting how the attention mechanism evolves during training.
I evaluated model outputs at different training stages (5, 50, and 100 epochs) and visualised attention weight matrices to examine how the decoder attends to input tokens. I also compared greedy and beam search decoding strategies to assess response quality and diversity.
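As a rough illustration of how such an attention matrix can be rendered, here is a minimal Matplotlib sketch; the tokens and the `attn_weights` array are hypothetical stand-ins for the model's actual outputs, not data from the project.

```python
import numpy as np
import matplotlib.pyplot as plt

def plot_attention(attn_weights, src_tokens, tgt_tokens):
    """Render a decoder-over-encoder attention matrix as a heatmap.

    attn_weights: array of shape (len(tgt_tokens), len(src_tokens));
    each row is the decoder's attention distribution at one output step.
    """
    fig, ax = plt.subplots()
    im = ax.imshow(attn_weights, cmap="viridis", aspect="auto")
    ax.set_xticks(range(len(src_tokens)), labels=src_tokens, rotation=45)
    ax.set_yticks(range(len(tgt_tokens)), labels=tgt_tokens)
    ax.set_xlabel("input tokens")
    ax.set_ylabel("generated tokens")
    fig.colorbar(im, ax=ax, label="attention weight")
    fig.tight_layout()
    plt.show()

# Hypothetical example: each row sums to 1 (softmax over input positions).
src = ["how", "are", "you", "?"]
tgt = ["i", "am", "fine", "."]
plot_attention(np.random.dirichlet(np.ones(len(src)), size=len(tgt)), src, tgt)
```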
Key Highlights:
- Implemented Bahdanau-style attention in a Seq2Seq framework (see the attention sketch after this list)
- Trained on an open-domain dialogue dataset
- Compared greedy vs. beam search decoding (see the decoding sketch after this list)
- Visualised attention to interpret learning progression
- Discussed overfitting, sequence length, and vocabulary limits
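For reference, the Bahdanau-style (additive) attention mentioned above scores each encoder position with a small feed-forward network over the decoder state and the encoder outputs. The PyTorch sketch below shows the general shape of such a module; the tensor layout and names are illustrative assumptions, not the project's exact code.

```python
import torch
import torch.nn as nn

class BahdanauAttention(nn.Module):
    """Additive attention: score(s, h) = v^T tanh(W_s s + W_h h)."""

    def __init__(self, hidden_dim):
        super().__init__()
        self.W_s = nn.Linear(hidden_dim, hidden_dim)   # projects decoder state
        self.W_h = nn.Linear(hidden_dim, hidden_dim)   # projects encoder outputs
        self.v = nn.Linear(hidden_dim, 1, bias=False)

    def forward(self, dec_state, enc_outputs):
        # dec_state: (batch, hidden); enc_outputs: (batch, src_len, hidden)
        scores = self.v(torch.tanh(
            self.W_s(dec_state).unsqueeze(1) + self.W_h(enc_outputs)
        )).squeeze(-1)                                 # (batch, src_len)
        weights = torch.softmax(scores, dim=-1)
        context = torch.bmm(weights.unsqueeze(1), enc_outputs).squeeze(1)
        return context, weights                        # context: (batch, hidden)
```

The `weights` tensor returned at each decoder step is the kind of row that the heatmap sketch above visualises.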
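The greedy-versus-beam comparison can likewise be sketched compactly. In the hypothetical snippet below, `step_fn` stands in for one decoder step (taking the tokens generated so far plus a decoder state and returning log-probabilities over the vocabulary); it is an assumed interface, not the project's actual API.

```python
import torch

def greedy_decode(step_fn, state, max_len, eos_id):
    """Pick the single most likely token at every step."""
    tokens = []
    for _ in range(max_len):
        log_probs, state = step_fn(tokens, state)      # log_probs: (vocab,)
        tok = int(log_probs.argmax())
        tokens.append(tok)
        if tok == eos_id:
            break
    return tokens

def beam_decode(step_fn, state, max_len, eos_id, beam=3):
    """Keep the `beam` highest-scoring partial hypotheses at every step."""
    beams = [([], state, 0.0)]                         # (tokens, state, log-prob)
    for _ in range(max_len):
        candidates = []
        for tokens, st, score in beams:
            if tokens and tokens[-1] == eos_id:        # finished: carry forward
                candidates.append((tokens, st, score))
                continue
            log_probs, new_st = step_fn(tokens, st)
            top = torch.topk(log_probs, beam)
            for lp, tok in zip(top.values, top.indices):
                candidates.append((tokens + [int(tok)], new_st, score + float(lp)))
        beams = sorted(candidates, key=lambda c: c[2], reverse=True)[:beam]
        if all(t and t[-1] == eos_id for t, _, _ in beams):
            break
    return beams[0][0]                                 # best-scoring hypothesis
```

Greedy decoding commits to the locally best token at each step, while beam search keeps several hypotheses alive, which typically trades safe, repetitive replies against more varied ones.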
Tech stack: PyTorch, NumPy, Matplotlib, Seq2Seq architecture
