Architecture of ML Systems SS2024
(VL/UE, 41078 Architecture of Machine Learning Systems)

AMLS is a 6 ECTS module applicable to the master's programs in computer science, computer engineering, information systems management, and electrical engineering, as well as to the study areas data and software engineering, cognitive systems, and distributed systems and networks. Machine learning (ML) applications profoundly transform our lives and many domains, such as health care, finance, media, transportation, production, and information technology itself. In a narrow sense, ML systems are the software systems underpinning these ML applications. In a broad sense, however, ML systems comprise the entire system, from ML applications through the compiler/runtime stack to the underlying heterogeneous hardware devices.

This module covers the architecture and essential concepts of modern machine learning (ML) systems for both local and large-scale machine learning. These architectures include systems for data-parallel execution, parameter servers, ML lifecycle systems, and the integration of ML into database systems. The covered topics provide both a microscopic view of internal compilation, execution, and data management techniques and a macroscopic view of end-to-end ML pipelines. Each topic is covered in a separate 90-120 minute lecture.


Lectures

In detail, the course covers the following topics, which also reflect the course calendar. All slides will be made available prior to the individual lectures, which take place on Thursdays, 4pm-6pm, in H 0107 and virtually via Zoom (call-in link). We also offer weekly office hours, which take place on Tuesdays, 4pm-5.30pm, in TEL 0811 and virtually via Zoom (call-in: office hour, starting Apr 23).

A: Overview and ML System Internals

  • 01 Introduction and Overview [Apr 18, pdf, pptx, mp4]
  • 02 Languages, Architectures, and System Landscape [Apr 25, pdf, pptx, mp4]
  • 03 Size Inference, Rewrites, and Operator Selection [May 02, pdf, pptx, mp4]
  • 04 Operator Fusion and Runtime Adaptation [May 16, pdf, pptx, mp4]
  • 05 Data- and Task-Parallel Execution [May 21 (in H 0111), pdf, pptx, mp4]
  • 06 Parameter Servers [May 30, pdf, pptx, mp4]
  • 07 Hybrid Execution and HW Accelerators [Jun 06, pdf, pptx, mp4]
  • 08 Caching, Partitioning, Indexing and Compression [Jun 13 (virtual only), pdf, pptx, mp4]

B: ML Lifecycle Systems

  • 09 Data Acquisition, Cleaning, and Preparation [Jun 20, pdf, pptx, mp4]
  • 10 Model Selection and Management [Jun 27 (virtual only), pdf, pptx, mp4]
  • 11 Model Debugging, Fairness, and Explainability [Jul 04, pdf, pptx, mp4]
  • 12 Model Serving Systems and Techniques [Jul 11, pdf, pptx, mp4]
  • Q&A and Exam Preparation [Jul 11, pdf, pptx, mp4]


Project / Exercises

The lectures are accompanied by mandatory programming projects (to the extent of 3 ECTS, i.e., roughly 80 working hours), preferably in Apache SystemDS (an open source ML system for the end-to-end data science lifecycle) or DAPHNE (an open and extensible system infrastructure for integrated data analysis pipelines).
A list of project proposals and details on alternative exercises (programming contest or ML pipeline) are available here:

The submission deadline for all projects and exercises is Monday Jul 08, 11.59pm.


Organization

  • Lecturer: Univ.-Prof. Dr.-Ing. Matthias Boehm, DAMS
  • Teaching Assistant: M.Sc. Sebastian Baunsgaard, DAMS
  • Final written exams: Jul 18, 4pm (pdf); Aug 10, 2pm (pdf); Oct 12, 1pm
  • Grading: 100% final exam, project as prerequisite (up to 5 extra points)