Data Engineering II

Course Code: Y2D2
ECTS Credits: 5.0


Course Description

In this course, students deepen their understanding of the data pipelines and infrastructure needed to deploy machine learning solutions. The focus is on scalable data ingestion, transformation, and storage techniques that support robust, production-grade ML workflows.

Students implement pipelines using cloud services and build modular, maintainable systems that enable automated retraining and seamless integration with ML models.


Course Content

  • Data Ingestion and Preprocessing Pipelines
  • Storing and Versioning Datasets in the Cloud
  • Azure ML Pipelines and Job Scheduling
  • Automated Model Training Workflows
  • Secure Access and Environment Management
  • Data Handling for Real-Time and Batch Scenarios

Prerequisites

  • This course builds on the machine learning models that students developed earlier for natural language processing (Y2A1) or computer vision (Y2B1). In this course, those models are prepared for production deployment.

  • Completion of all Year 1 courses in the Applied Data Science & AI programme.



Course Coordinator(s)