Natural Language Processing

Course Code: Y2A1
ECTS Credits: 15.0


Course Description

This course introduces students to core and advanced techniques in Natural Language Processing (NLP), with a strong emphasis on practical application across different domains. Students learn to process both written and spoken language using real-world data. Key tasks include emotion classification, speech-to-text transcription, machine translation, semantic representation, and feature extraction.

Students gain hands-on experience with both traditional models (e.g., Logistic Regression, Naive Bayes) and deep learning architectures (e.g., RNN, LSTM, Transformers). They apply widely used NLP libraries and frameworks such as Hugging Face Transformers, spaCy, NLTK, and Gensim. Feature engineering techniques include POS tagging, TF-IDF, sentiment scoring, and embedding-based representations. The course also covers model evaluation using F1-score, Word Error Rate (WER), and error analysis, as well as explainability methods such as Gradient × Input and Layer-wise Relevance Propagation (LRP). Students explore prompt engineering strategies for adapting large language models to downstream tasks without retraining them, and complete the course by designing end-to-end NLP pipelines.


Course Content

  • Text Classification
    • Logistic Regression, Naive Bayes, LSTM, RNN, Transformers (e.g., BERT, DistilBERT)
  • Speech-to-Text
    • Automatic transcription using Whisper and AssemblyAI
    • Evaluation with Word Error Rate (WER)
  • Machine Translation
    • Neural machine translation using pretrained models (e.g., MarianMT)
    • Round-trip translation and quality assessment
  • Feature Engineering
    • Part-of-Speech tagging, TF-IDF, sentiment analysis
    • Pretrained and custom-trained word embeddings (Word2Vec, GloVe)
  • Prompt Engineering
    • Zero-shot and few-shot prompting for classification tasks
  • Explainable AI for NLP
    • Gradient × Input
    • Layer-wise Relevance Propagation (LRP)
  • Evaluation & Error Analysis
    • Precision, recall, F1-score, confusion matrices
    • Qualitative and quantitative assessment of model behavior
  • End-to-End NLP Pipelines
    • Modular workflows combining transcription, translation, feature extraction, classification, and explainability
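
As an illustration of the Word Error Rate (WER) metric listed under Speech-to-Text, the sketch below computes WER as the word-level edit distance between a reference transcript and a model transcript, divided by the number of reference words. It is a minimal example only: the function name word_error_rate and the sample sentences are assumptions made for illustration, not course material, and it splits on whitespace without the text normalisation (casing, punctuation) that a real evaluation would usually apply first.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words processed
    # so far and the first j hypothesis words (classic Levenshtein rows).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical transcripts: one substitution (sat -> sit), one deletion (the).
    wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
    print(f"WER = {wer:.3f}")
```

Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference, so it is best read as an error count per reference word rather than a percentage of words wrong.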

Prerequisites

  • Completion of all Year 1 courses in the Applied Data Science & AI programme.


Course Coordinator(s)