Patterns that Matter

News and updates

  • 01.03.2020 I have been promoted to associate professor at Leiden University.
  • 13.02.2020 Our paper titled Discovering Subjectively Interesting Multigraph Patterns, with Sarang Kapoor and Dhish Saxena, got accepted for publication in Machine Learning. Congratulations Sarang!
  • 20.01.2020 Our paper titled Vouw: Geometric Pattern Mining using the MDL Principle, with Micky Faas, got accepted at IDA 2020. Congratulations Micky!
  • 20.01.2020 Our paper titled Widening for MDL-based Retail Signature Discovery, with Clément Gautrais, Peggy Cellier, and Alexandre Termier, got accepted at IDA 2020. Congratulations Clément!
  • 07.11.2019 NWO has awarded a TTW Perspectief grant to our 5.7M€ research programme titled Integration of Data-drIven and model-based enGIneering in fuTure industriAL Technology With value chaIn optimizatioN (DIGITAL TWIN).
  • 25.10.2019 Our paper titled Interpretable multiclass classification by MDL-based rule lists, with Hugo Proença, got accepted for publication in Information Sciences. Congratulations Hugo!
  • 01.09.2019 As of today I am programme manager of Leiden University's master's programme in Computer Science.
  • 19.07.2019 Our paper titled Challenges and Limitations in Clustering Blood Donor Hemoglobin Trajectories, with Marieke Vinkenoog and Mart Janssen, got accepted at AALTD 2019. Congratulations Marieke!
  • 22.02.2019 I obtained the BKO, i.e., the University Teaching Qualification for Dutch universities (in Dutch: Basiskwalificatie Onderwijs).
  • 18.01.2019 NWO has awarded me a TOP grant (module 2) for my research proposal titled Human-Guided Data Science by Interactive Model Selection. Read the news article.

I am associate professor and group leader of the Explanatory Data Analysis group at the Leiden Institute of Advanced Computer Science (LIACS), the computer science institute of Leiden University. My primary research interest is exploratory data mining: how can we enable domain experts to explore and analyse their data, to discover structure and—ultimately—novel knowledge?

For this, it is important that methods and results are explainable to domain experts, who may not be data scientists. My signature approach is to define and identify patterns that matter, i.e., succinct descriptions that characterise relevant structure present in the data. Which patterns matter strongly depends on the data and task at hand, hence defining the problem is one of the key challenges of exploratory data mining. Information-theoretic concepts such as the Minimum Description Length (MDL) principle have proven very useful to this end. I am also interested in interactive data mining, i.e., involving humans in the loop. Finally, I am interested in fundamental data mining research for real-world applications, both in science (e.g., life sciences, social sciences) and industry (e.g., manufacturing and engineering, aviation), as this is the best way to show that the theory works in practice.

I am affiliated with the Leiden Centre of Data Science (LCDS) and the university-wide Data Science Research Programme (DSRP). Broadly speaking, my research can be situated in the fields of data mining, machine learning, data science, and artificial intelligence (AI).



Activities

Current and upcoming
  • Invited talk at Data Science Center Eindhoven in Eindhoven, the Netherlands, 5 March.
  • Teacher of Information Theoretic Data Mining '19-'20 (MSc Computer Science).
Recent
  • Keynote at Workshop on Computational Intelligence & Its Applications, in Greater-Noida, India, 15 November 2019.
  • I participated in the CaStleD 2019 workshop on Computation and Statistics in Data science, in Bertinoro, Italy, 30 September - 4 October 2019.
  • I attended ECML PKDD 2019 in Würzburg, Germany, 16 - 20 September 2019.
  • Invited talk at SciCAR 2019, Scientific Computer Assisted Reporting conference, in Dortmund, Germany, 10 September 2019.
  • Committee member at the PhD defence of Sanjar Karaev (with promotors Gerhard Weikum and Pauli Miettinen). Saarland University, Saarbrücken, Germany, 10 July 2019.
  • Invited talk at MPI Informatics and Saarland University, Saarbrücken, Germany, 10 July 2019.


Selected recent publications

2020
Faas, M & van Leeuwen, M Vouw: Geometric Pattern Mining using the MDL Principle. In: Proceedings of the Eighteenth International Symposium on Intelligent Data Analysis (IDA 2020), Springer, 2020.
Gautrais, C, Cellier, P, van Leeuwen, M & Termier, A Widening for MDL-based Retail Signature Discovery. In: Proceedings of the Eighteenth International Symposium on Intelligent Data Analysis (IDA 2020), Springer, 2020.
Kapoor, S, Saxena, DK & van Leeuwen, M Discovering Subjectively Interesting Multigraph Patterns. Machine Learning, pp 1-28, Springer, 2020.
Proença, HM & van Leeuwen, M Interpretable multiclass classification by MDL-based rule lists. Information Sciences vol.512, pp 1372-1393, Elsevier, 2020.
2019
Vinkenoog, M, Janssen, M & van Leeuwen, M Challenges and Limitations in Clustering Blood Donor Hemoglobin Trajectories. In: Proceedings of 4th Workshop on Advanced Analytics and Learning on Temporal Data at ECMLPKDD 2019, Springer, 2019.
Gawehns, D, Veiga, G & van Leeuwen, M Focus on dynamics: a proof of principle in exploratory data mining of face-to-face interactions. In: Proceedings of the 5th International Conference on Computational Social Science (IC2S2), 2019. (Poster presentation)
van Leeuwen, M, Chau, DH, Vreeken, J, Shahaf, D & Faloutsos, C Addendum to the Special Issue on Interactive Data Exploration and Analytics (TKDD, Vol. 12, Iss. 1): Introduction by the Guest Editors. Transactions on Knowledge Discovery from Data vol.13(1), ACM, 2019.
2018
Proença, HM, Klijn, R, Bäck, T & van Leeuwen, M Identifying flight delay patterns using diverse subgroup discovery. In: Proceedings of the Symposium Series on Computational Intelligence (SSCI'18), IEEE, 2018.
van Rijn, S, van Leeuwen, M, Schmitt, S, Olhofer, M & Bäck, T Multi-Fidelity Surrogate Model Approach to Optimization. In: Proceedings of the Genetic and Evolutionary Computation Conference (GECCO'18), ACM, 2018.
van Os, H, Ramos, L, Hilbert, A, van Leeuwen, M, van Walderveen, M, Kruyt, N, Dippel, D, Steyerberg, E, van der Schaaf, I, Lingsma, H, Schonewille, W, Majoie, C, Olabarriaga, S, Zwinderman, K, Venema, E, Marquering, H & Wermer, M Predicting outcome of endovascular treatment for acute ischemic stroke: potential value of machine learning algorithms. Frontiers in Neurology vol.9(784), Frontiers, 2018.
van Leeuwen, M, Chau, DH, Vreeken, J, Shahaf, D & Faloutsos, C Editorial: TKDD Special Issue on Interactive Data Exploration and Analytics. Transactions on Knowledge Discovery from Data vol.12(1), ACM, 2018.