Matthias Boehm is a BMK-endowed professor for data management at Graz University of Technology, Austria, and a research area manager for data management at the co-located Know-Center GmbH, Austria. Prior to joining TU Graz in 2018, he was a research staff member at IBM Research - Almaden, CA, USA, with a major focus on compilation and runtime techniques for declarative, large-scale machine learning in Apache SystemML. Matthias received his Ph.D. from Dresden University of Technology, Germany, in 2011 with a dissertation on cost-based optimization of integration flows. His previous research also includes systems support for time series forecasting as well as in-memory indexing and query processing. Matthias is a recipient of the 2016 VLDB Best Paper Award, a 2016 SIGMOD Research Highlight Award, and a 2016 IBM Pat Goldberg Memorial Best Paper Award.
Current Projects: Apache SystemDS (an open-source ML system for the end-to-end data science lifecycle), ExDRa (exploratory data science and federated ML over raw data, w/ Siemens, DFKI, and TU Berlin), and DAPHNE (an open and extensible system infrastructure for integrated data analysis pipelines, w/ AVL, DLR, ETH Zurich, HPI Potsdam, ICCS, Infineon, Intel, ITU Copenhagen, KAI, TU Dresden, Uni Maribor, and Uni Basel).
The DAMSLab (data management for data science laboratory) is a cross-organizational research group uniting the data management group of TU Graz and the research area data management of the co-located Know-Center.
We're looking for motivated PhD, master, and bachelor students to join our team. Our research focuses on building ML systems and tools for simplifying the data science lifecycle – from data integration over model training to deployment and scoring – via high-level language abstractions and specialized compiler and runtime techniques. If you're interested, please contact me directly via email.
Open Bachelor/Master Thesis Topics
This publication list covers the last six years. For a full list, see DBLP and Google Scholar.
- Prithviraj Sen, Marina Danilevsky, Yunyao Li, Siddhartha Brahma, Matthias Boehm, Laura Chiticariu, Rajasekar Krishnamurthy: Learning Explainable Linguistic Expressions with Neural Inductive Logic Programming for Sentence Classification. EMNLP 2020.
- Matthias Boehm: Technical Perspective: Declarative Recursive Computation on an RDBMS. SIGMOD Record 2020 49(1). [paper]
- Matthias Boehm, Iulian Antonov, Sebastian Baunsgaard, Mark Dokter, Robert Ginthör, Kevin Innerebner, Florijan Klezin, Stefanie Lindstaedt, Arnab Phani, Benjamin Rath, Berthold Reinwald, Shafaq Siddiqi, Sebastian Benjamin Wrede: SystemDS: A Declarative Machine Learning System for the End-to-End Data Science Lifecycle. CIDR 2020. [paper, slides]
- Johanna Sommer, Matthias Boehm, Alexandre V. Evfimievski, Berthold Reinwald, Peter J. Haas: MNC: Structure-Exploiting Sparsity Estimation for Matrix Expressions. SIGMOD 2019. [paper, slides, poster]
- Ahmed Elgohary, Matthias Boehm, Peter J. Haas, Frederick R. Reiss, Berthold Reinwald: Compressed Linear Algebra for Large-Scale Machine Learning. Commun. ACM 2019 62(5). [paper, link]
- Matthias Boehm, Arun Kumar, Jun Yang: Data Management in Machine Learning Systems. Synthesis Lectures on Data Management 11 (1), Morgan & Claypool Publishers 2019. [book]
- Matthias Boehm, Alexandre V. Evfimievski, Berthold Reinwald: Efficient Data-Parallel Cumulative Aggregates for Large-Scale Machine Learning. BTW 2019. [paper, slides]
- Matthias Boehm, Berthold Reinwald, Dylan Hutchison, Prithviraj Sen, Alexandre V. Evfimievski, Niketan Pansare: On Optimizing Operator Fusion Plans for Large-Scale Machine Learning in SystemML. PVLDB 2018 11(12). [paper, slides, poster]
- Ahmed Elgohary, Matthias Boehm, Peter J. Haas, Frederick R. Reiss, Berthold Reinwald: Compressed Linear Algebra for Large-Scale Machine Learning. VLDB Journal 2018 27(5). [paper, link]
- Matthias Boehm: Apache SystemML – Declarative Large-Scale Machine Learning. Encyclopedia of Big Data Technologies 2018. [paper]
- Niketan Pansare, Michael Dusenberry, Nakul Jindal, Matthias Boehm, Berthold Reinwald, Prithviraj Sen: Deep Learning with Apache SystemML. SysML 2018. [paper]
- Arun Kumar, Matthias Boehm, Jun Yang: Data Management in Machine Learning: Challenges, Techniques, and Systems. SIGMOD 2017. [paper, slides, video]
- Ahmed Elgohary, Matthias Boehm, Peter J. Haas, Frederick R. Reiss, Berthold Reinwald: Scaling Machine Learning via Compressed Linear Algebra. SIGMOD Record 2017 46(1). [paper]
- Tarek Elgamal, Shangyu Luo, Matthias Boehm, Alexandre V. Evfimievski, Shirish Tatikonda, Berthold Reinwald, Prithviraj Sen: SPOOF: Sum-Product Optimization and Operator Fusion for Large-Scale Machine Learning. CIDR 2017. [paper, slides]
- Ahmed Elgohary, Matthias Boehm, Peter J. Haas, Frederick R. Reiss, Berthold Reinwald: Compressed Linear Algebra for Large-Scale Machine Learning. PVLDB 2016 9(12). [paper, slides, poster]
- Matthias Boehm, Michael Dusenberry, Deron Eriksson, Alexandre V. Evfimievski, Faraz Makari Manshadi, Niketan Pansare, Berthold Reinwald, Frederick Reiss, Prithviraj Sen, Arvind Surve, Shirish Tatikonda: SystemML: Declarative Machine Learning on Spark. PVLDB 2016 9(13). [paper, slides]
- Matthias Boehm, Alexandre V. Evfimievski, Niketan Pansare, Berthold Reinwald: Declarative Machine Learning - A Classification of Basic Properties and Types. CoRR 2016 abs/1605.05826. [paper]
- Arash Ashari, Shirish Tatikonda, Matthias Boehm, Berthold Reinwald, Keith Campbell, John Keenleyside, P. Sadayappan: On Optimizing Machine Learning Workloads via Kernel Fusion. PPOPP 2015. [paper]
- Botong Huang, Matthias Boehm, Yuanyuan Tian, Berthold Reinwald, Shirish Tatikonda, Frederick R. Reiss: Resource Elasticity for Large-Scale Machine Learning. SIGMOD 2015. [paper, slides, poster]
- Matthias Boehm: Costing Generated Runtime Execution Plans for Large-Scale Machine Learning Programs. CoRR 2015 abs/1503.06384. [paper]
- Matthias Boehm, Douglas R. Burdick, Alexandre V. Evfimievski, Berthold Reinwald, Frederick R. Reiss, Prithviraj Sen, Shirish Tatikonda, Yuanyuan Tian: SystemML's Optimizer: Plan Generation for Large-Scale Machine Learning Programs. IEEE Data Eng. Bull. 2014 37(3). [paper]
- Matthias Boehm, Dirk Habich, Wolfgang Lehner: On-Demand Re-Optimization of Integration Flows. Inf. Syst. 2014 45. [paper]
- Peter D. Kirchner, Matthias Boehm, Berthold Reinwald, Daby M. Sow, J. Michael Schmidt, Deepak S. Turaga, Alain Biem: Large Scale Discriminative Metric Learning. IPDPS Workshop ParLearning 2014. [paper, slides]
- Matthias Boehm, Shirish Tatikonda, Berthold Reinwald, Prithviraj Sen, Yuanyuan Tian, Douglas Burdick, Shivakumar Vaithyanathan: Hybrid Parallelization Strategies for Large-Scale Machine Learning in SystemML. PVLDB 2014 7(7). [paper, slides, poster]
This list summarizes PC memberships and review activities, likewise covering the last six years.
- Workshop Co-Chair DEEM 2021, Track Chair (Data Science) BTW 2021, GI working group Data Science (since 03/2020)
- Program Committee PVLDB 2022
- Program Committee SIGMOD 2021, SIGMOD 2021 Industry, CIDR 2021, ICDE 2021 Demo, PVLDB PhD 2021
- Program Committee SIGMOD 2020, PVLDB 2020, ICDE 2020, CIDR 2020, DEEM 2020, PVLDB PhD 2020; Journal Reviewer VLDBJ 2020
- Program Committee SIGMOD 2019, PVLDB 2019, ICDE 2019, EDBT 2019, DEEM 2019, AIDB 2019; Journal Reviewer TKDE 2019
- Program Committee PVLDB 2018, EDBT 2018 Industry, DEEM 2018, WebDB 2018, EBDVF 2018
- Program Committee ICDE 2017 Demo, DEEM 2017; Journal Reviewer SIGMOD Record 2017/18
- Journal Reviewer TKDE 2016/17, ACM Computing Surveys 2016, IBM Journal R&D 2016; External Reviewer CIKM 2016
- Program Committee SSDBM 2015; Journal Reviewer Information Systems 2015; External Reviewer SIGMOD Record 2015
Our research group is grateful for funding support from BMK, the BMK/FFG program "ICT of the Future", TU Graz, AVL LIST, Infineon Technologies Austria, Magna Steyr Fahrzeugtechnik, voestalpine Stahl Donawitz, and Know-Center.