that Matter

News and updates

  • 22.02.2019 I obtained the BKO, i.e., the University Teaching Qualification for Dutch universities (in Dutch: Basiskwalificatie Onderwijs).
  • 18.01.2019 NWO has awarded me a TOP grant (module 2) for my research proposal titled Human-Guided Data Science by Interactive Model Selection. Read the news article.
  • 01.09.2018 Our paper titled Identifying flight delay patterns using diverse subgroup discovery, with Hugo Proença, Ruben Klijn, and Thomas Bäck, got accepted at SSCI 2018. Congratulations Hugo!
  • 30.08.2018 Our paper titled Predicting outcome of endovascular treatment for acute ischemic stroke: potential value of machine learning algorithms, with Hine van Os and colleagues, got accepted at Frontiers in Neurology. Congratulations Hine!
  • 01.08.2018 I am now general chair of the IDA Council, the steering committee of the IDA symposium series. First up: IDA 2018 in Den Bosch, October 24-26!
  • 01.05.2018 Our research proposal titled Data Science for State-of-the-Art Blood Banking (BloodStarT), with Mart Janssen, Aske Plaat, Marian van Kraaij, and Katja van den Hurk, was granted by Sanquin.
  • 24.03.2018 Our paper titled Multi-Fidelity Surrogate Model Approach to Optimization, with Sander van Rijn, Sebastian Schmitt, Markus Olhofer, and Thomas Bäck, got accepted at GECCO 2018. Congratulations Sander!
  • 01.02.2018 Our special issue on Interactive Data Exploration and Analytics (IDEA), co-edited with Polo Chau, Jilles Vreeken, Dafna Shahaf, and Christos Faloutsos, was published in TKDD.
  • 31.08.2017 New website for my research group: Explanatory Data Analysis.
  • 18.07.2017 Our research proposal titled Dementia back in the heart of the community, a consortium effort for which we will conduct the data science component, was granted by ZonMW.

I am an assistant professor and group leader of the Explanatory Data Analysis group at the Leiden Institute of Advanced Computer Science (LIACS), the computer science institute of Leiden University. My primary research interest is exploratory data mining: how can we enable domain experts to explore and analyse their data, to discover structure and, ultimately, novel knowledge?

For this it is important that methods and results are explainable to domain experts, who may not be data scientists. My signature approach is to define and identify patterns that matter, i.e., succinct descriptions that characterise relevant structure present in the data. Which patterns matter depends strongly on the data and task at hand; defining the problem is therefore one of the key challenges of exploratory data mining. Information-theoretic concepts such as the Minimum Description Length (MDL) principle have proven very useful to this end. I am also interested in interactive data mining, i.e., involving humans in the loop. Finally, I am interested in fundamental data mining research for real-world applications, both in science (e.g., life sciences, social sciences) and industry (e.g., manufacturing and engineering, aviation), as this is the best way to show that the theory works in practice.

I am affiliated with the Leiden Centre of Data Science (LCDS) and the university-wide Data Science Research Programme (DSRP). Broadly speaking, my research can be situated in the fields of data mining, machine learning, data science, and artificial intelligence (AI).



Current and upcoming
  • Teacher of Statistics '18-'19 (BSc Computer Science).
  • Teacher of Honours Class Data Science '18-'19 (BSc), together with Arno Knobbe and guest lecturers.
  • Committee member at the PhD defence of Sanjar Karaev (with promotors Gerhard Weikum and Pauli Miettinen). Saarland University, Saarbrücken, Germany, 10 July.
  • Invited talk at Saarland University, Saarbrücken, Germany, 10 July.
  • Track Chair of the ASCI/IPA/SIKS research schools track of ICT.OPEN 2019 in Hilversum, 19-20 March.
  • Co-organiser of the symposium titled Fairness and Transparency, towards responsible data science, Leiden University, 5 March.
  • Teacher of Information Theoretic Data Mining '18-'19 (MSc Computer Science & Data Science).
  • Committee member at the PhD defence of Ricardo Cachucho (with promotors Joost Kok and Arno Knobbe). Leiden University, Leiden, 10 December.
  • Committee member at the PhD defence of Sergey Paramonov, with whom I previously collaborated, together with Luc De Raedt. KU Leuven, Leuven, 29 October.
  • Advisory Chair of IDA 2018 in Den Bosch, 24-26 October.
  • Participant at Dagstuhl Seminar 18401, Automating Data Science, in Wadern, Germany, 30 September - 5 October.
  • Participant at Honda Research Institute's EGN Symposium 2018, in Offenbach, Germany, 26-27 September.


Selected recent publications

Gawehns, D, Veiga, G & van Leeuwen, M Focus on dynamics: a proof of principle in exploratory data mining of face-to-face interactions. In: Proceedings of the 5th International Conference on Computational Social Science (IC2S2), 2019. (Poster presentation)
Proença, HM, Klijn, R, Bäck, T & van Leeuwen, M Identifying flight delay patterns using diverse subgroup discovery. In: Proceedings of the Symposium Series on Computational Intelligence (SSCI'18), IEEE, 2018.
van Rijn, S, van Leeuwen, M, Schmitt, S, Olhofer, M & Bäck, T Multi-Fidelity Surrogate Model Approach to Optimization. In: Proceedings of the Genetic and Evolutionary Computation Conference (GECCO'18), ACM, 2018.
van Os, H, Ramos, L, Hilbert, A, van Leeuwen, M, van Walderveen, M, Kruyt, N, Dippel, D, Steyerberg, E, van der Schaaf, I, Lingsma, H, Schonewille, W, Majoie, C, Olabarriaga, S, Zwinderman, K, Venema, E, Marquering, H & Wermer, M Predicting outcome of endovascular treatment for acute ischemic stroke: potential value of machine learning algorithms. Frontiers in Neurology vol.9(784), Frontiers, 2018.
van Leeuwen, M, Chau, DH, Vreeken, J, Shahaf, D & Faloutsos, C Editorial: TKDD Special Issue on Interactive Data Exploration and Analytics. Transactions on Knowledge Discovery from Data vol.12(1), ACM, 2018.
Ukkonen, A, Dzyuba, V & van Leeuwen, M Explaining Deviating Subsets through Explanation Networks. In: Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECMLPKDD'17), Springer, 2017.
Dzyuba, V & van Leeuwen, M Learning what matters – Sampling interesting patterns. In: Proceedings of the Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD'17), pp 534-546, Springer, 2017.
Paramonov, S, van Leeuwen, M & De Raedt, L Relational Data Factorization. Machine Learning vol.106(12), pp 1867-1904, Springer, 2017.
Dzyuba, V, van Leeuwen, M & De Raedt, L Flexible constrained sampling with guarantees for pattern mining. Data Mining and Knowledge Discovery vol.31(5), pp 1266-1293, Springer, 2017. (ECMLPKDD'17 Special Issue)
Le Van, T, Nijssen, S, van Leeuwen, M & De Raedt, L Semiring Rank Matrix Factorisation. Transactions on Knowledge and Data Engineering vol.29(8), pp 1737-1750, IEEE, 2017.