Architecture of ML Systems Summer 2025
(VL/UE, 41078 Architecture of Machine Learning Systems)

AMLS is a 6 ECTS module, applicable to the master's programs in computer science, computer engineering, information systems management, and electrical engineering, as well as the study areas data and software engineering, cognitive systems, and distributed systems and networks. Machine learning (ML) applications profoundly transform our lives and many domains such as health care, finance, media, transportation, production, and information technology itself. In a narrow sense, ML systems are the software systems underpinning these ML applications. In a broad sense, however, ML systems comprise the entire system, from ML applications, through the compiler/runtime stack, to the underlying heterogeneous hardware devices.

This module covers the architecture and essential concepts of modern machine learning (ML) systems for both local and large-scale machine learning. These architectures include systems for data-parallel execution, parameter servers, ML lifecycle systems, and the integration of ML into database systems. The covered topics combine a microscopic view of internal compilation, execution, and data management techniques with a macroscopic view of end-to-end ML pipelines. Each topic is covered in a separate 90-120min lecture, and the topics also reflect the lecture calendar.


Lectures

In detail, the course covers the following topics, which also reflect the course calendar. All slides will be made available prior to the individual lectures, which take place Thursday, 4pm-6pm in H 0107 and virtually via Zoom (call-in link). Furthermore, we offer weekly office hours, which take place Tuesday, 4pm-5.30pm in B 119 and virtually via Zoom (call-in: office hour, starting Apr 22).

A: Overview and ML System Internals

  • 01 Introduction and Overview [Apr 17]
  • 02 Languages, Architectures, and System Landscape [Apr 24]
  • 03 Size Inference, Rewrites, and Operator Selection [May 02]
  • 04 Operator Fusion and Runtime Adaptation [May 08]
  • 05 Data- and Task-Parallel Execution [May 15]
  • 06 Parameter Servers [May 22]
  • 07 LLM Training and Inference [Jun 05]
  • 08 Hybrid Execution and HW Accelerators [Jun 12]
  • 09 Caching, Partitioning, Indexing and Compression [Jun 19]

B: ML Lifecycle Systems

  • 10 Data Acquisition, Cleaning, and Preparation [Jun 26]
  • 11 Model Selection and Management [Jul 03]
  • 12 Model Debugging, Fairness, and Explainability [Jul 10]
  • 13 Model Serving Systems and Techniques [Jul 17]


Project / Exercises

The lectures are accompanied by mandatory programming projects (to the extent of 3 ECTS, i.e., roughly 90 working hours), preferably in Apache SystemDS (an open-source ML system for the end-to-end data science lifecycle) or DAPHNE (an open and extensible system infrastructure for integrated data analysis pipelines).
A list of project proposals and details on alternative exercises (programming contest or ML pipeline) are available here:

The submission deadline for all projects and exercises is Tuesday, Jul 15, 11.59pm.


Organization

  • Lecturer: Univ.-Prof. Dr.-Ing. Matthias Boehm, DAMS
  • Teaching Assistant: M.Sc. Sebastian Baunsgaard, DAMS
  • Final written exams: Jul 18, 4pm (pdf); Aug 10, 2pm (pdf); Oct 12, 1pm
  • Grading: 100% final exam, project as prerequisite (up to 5 extra points)