Architecture of ML Systems SS2023
(VL/UE, 41078 Architecture of Machine Learning Systems)

AMLS is a 6 ECTS module, applicable to the master's programs in computer science, computer engineering, information systems management, and electrical engineering, as well as the study areas data and software engineering, cognitive systems, and distributed systems and networks. Machine learning (ML) applications profoundly transform our lives and many domains, such as health care, finance, media, transportation, production, and information technology itself. In a narrow sense, ML systems are the software systems underpinning these ML applications. In a broad sense, however, ML systems comprise the entire stack, from ML applications through the compiler/runtime stack down to the underlying heterogeneous hardware devices.

This module covers the architecture and essential concepts of modern machine learning (ML) systems for both local and large-scale machine learning. These architectures include systems for data-parallel execution, parameter servers, ML lifecycle systems, and the integration of ML into database systems. The covered topics combine a microscopic view of internal compilation, execution, and data management techniques with a macroscopic view of end-to-end ML pipelines. The individual topics are listed under Lectures and also reflect the lecture calendar, with a separate 90-120 min lecture per topic.


Lectures

In detail, the course covers the following topics, which also reflect the course calendar. All slides will be made available prior to the individual lectures, which take place Thursdays, 4pm-6pm in A 053 and virtually via Zoom (call-in: first lecture, other lectures). We also offer weekly office hours, which take place Tuesdays, 3pm-4:30pm in TEL 0811 and virtually via Zoom (call-in: office hour, starting May 09).

A: Overview and ML System Internals

  • 01 Introduction and Overview [Apr 20, pdf, pptx, mp4]
  • 02 Languages, Architectures, and System Landscape [Apr 27, pdf, pptx, mp4]
  • 03 Size Inference, Rewrites, and Operator Selection [May 04, pdf, pptx, mp4]
  • 04 Operator Fusion and Runtime Adaptation [May 11, pdf, pptx, mp4]
  • 05 Data- and Task-Parallel Execution [May 25, pdf, pptx, mp4]
  • 06 Parameter Servers [May 31 (virtual only), pdf, pptx, mp4]
  • 07 Hybrid Execution and HW Accelerators [Jun 08, pdf, pptx, mp4]
  • 08 Caching, Partitioning, Indexing and Compression [Jun 14 (virtual only), pdf, pptx, mp4]

B: ML Lifecycle Systems

  • 09 Data Acquisition, Cleaning, and Preparation [Jun 22 (virtual only), pdf, pptx, mp4]
  • 10 Model Selection and Management [Jun 29 (virtual only), pdf, pptx, mp4]
  • 11 Model Debugging, Fairness, and Explainability [Jul 06, pdf, pptx, mp4]
  • 12 Model Serving Systems and Techniques [Jul 13, pdf, pptx, mp4]


Project / Exercises

The lectures are accompanied by mandatory programming projects (to the extent of 3 ECTS, i.e., roughly 80 working hours), preferably in Apache SystemDS (an open source ML system for the end-to-end data science lifecycle) or DAPHNE (an open and extensible system infrastructure for integrated data analysis pipelines).
A list of project proposals and details on alternative exercises (programming contest or ML pipeline) are available here:


Organization

  • Lecturer: Univ.-Prof. Dr.-Ing. Matthias Boehm, DAMS
  • Teaching Assistant: M.Sc. Sebastian Baunsgaard, DAMS
  • Final oral exam: ~Jul 20, 2023
  • Grading: 100% final exam, project as prerequisite