Natural Language Processing
Course Code: Y2A1
ECTS Credits: 15.0
Course Description
This course introduces students to core and advanced techniques in Natural Language Processing (NLP), with a strong emphasis on practical application across different domains. Students learn to process both written and spoken language using real-world data. Key tasks include emotion classification, speech-to-text transcription, machine translation, semantic representation, and feature extraction.
Students gain hands-on experience with both traditional models (e.g., Logistic Regression, Naive Bayes) and deep learning architectures (e.g., RNNs, LSTMs, Transformers). They apply widely used NLP libraries and frameworks such as Hugging Face Transformers, spaCy, NLTK, and Gensim. Feature engineering techniques include POS tagging, TF-IDF, sentiment scoring, and embedding-based representations. The course also covers model evaluation using F1-score, Word Error Rate (WER), and error analysis, as well as explainability methods such as Gradient × Input and Layer-wise Relevance Propagation (LRP). Students explore prompt engineering strategies, such as zero-shot and few-shot prompting, to apply large language models to downstream tasks, and complete the course by designing end-to-end NLP pipelines.
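As an illustration of the speech-to-text evaluation mentioned above, the following is a minimal sketch of Word Error Rate: the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and a model transcript, divided by the number of reference words. The function name `word_error_rate` and the example sentences are hypothetical and not taken from the course materials; the sketch also assumes simple whitespace tokenization with no text normalization.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one substitution ("sat" -> "sit") and one deletion ("the").
wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
print(f"WER: {wer:.2f}")
```

Because insertions are counted, WER can exceed 1.0 when the hypothesis is much longer than the reference.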
Course Content
- Text Classification
  - Logistic Regression, Naive Bayes, LSTM, RNN, Transformers (e.g., BERT, DistilBERT)
- Speech-to-Text
  - Automatic transcription using Whisper and AssemblyAI
  - Evaluation with Word Error Rate (WER)
- Machine Translation
  - Neural machine translation using pretrained models (e.g., MarianMT)
  - Round-trip translation and quality assessment
- Feature Engineering
  - Part-of-Speech tagging, TF-IDF, sentiment analysis
  - Pretrained and custom-trained word embeddings (Word2Vec, GloVe)
- Prompt Engineering
  - Zero-shot and few-shot prompting for classification tasks
- Explainable AI for NLP
  - Gradient × Input
  - Layer-wise Relevance Propagation (LRP)
- Evaluation & Error Analysis
  - Precision, recall, F1-score, confusion matrices
  - Qualitative and quantitative assessment of model behavior
- End-to-End NLP Pipelines
  - Modular workflows combining transcription, translation, feature extraction, classification, and explainability
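For the evaluation topics listed above, a minimal sketch of per-class precision, recall, and F1-score for a multi-class classifier may help fix the definitions. The function names and the emotion labels are hypothetical examples, not course data; in practice a library such as scikit-learn would typically be used instead of hand-written code.

```python
from collections import Counter


def per_class_scores(y_true, y_pred):
    """Return {label: (precision, recall, f1)} from two parallel label lists."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for true_label, pred_label in zip(y_true, y_pred):
        if true_label == pred_label:
            tp[true_label] += 1
        else:
            fp[pred_label] += 1  # predicted this class, but it was wrong
            fn[true_label] += 1  # missed an instance of the true class

    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        total = precision + recall
        f1 = 2 * precision * recall / total if total else 0.0
        scores[label] = (precision, recall, f1)
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    scores = per_class_scores(y_true, y_pred)
    return sum(f1 for _, _, f1 in scores.values()) / len(scores)


# Hypothetical emotion-classification labels, for illustration only.
y_true = ["joy", "joy", "anger", "sadness", "anger", "joy"]
y_pred = ["joy", "anger", "anger", "sadness", "joy", "joy"]
for label, (p, r, f1) in per_class_scores(y_true, y_pred).items():
    print(f"{label}: precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
print(f"macro F1: {macro_f1(y_true, y_pred):.2f}")
```

Macro averaging weights every class equally, which makes weak performance on rare classes visible; other averaging schemes (micro, weighted) answer different questions.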
Prerequisites
- Completion of all Year 1 courses in the Applied Data Science & AI programme.
Recommended Reading
- Speech and Language Processing (3rd Ed.) by D. Jurafsky, J.H. Martin
- Natural Language Processing with Python by S. Bird, E. Klein, E. Loper