Patterns that Matter

News and updates

  • 31.08.2017 New website for my research group: Explanatory Data Analysis.
  • 18.07.2017 Our research proposal titled Dementia back in the heart of the community, a consortium effort for which we will conduct the data science component, was granted by ZonMW.
  • 22.06.2017 Our paper titled Explaining Deviating Subsets through Explanation Networks, with Antti Ukkonen and Vladimir Dzyuba, got accepted at ECML-PKDD 2017.
  • 09.06.2017 Vladimir Dzyuba successfully defended his PhD thesis titled Mine, Interact, Learn, Repeat: Interactive Pattern-Based Data Exploration, supervised by Luc De Raedt and me. Congratulations Dr. Vladimir!
  • 01.06.2017 Our paper titled Relational Data Factorization, with Sergey Paramonov and Luc De Raedt, got accepted in ML. Congratulations Sergey!
  • 17.03.2017 IDEA 2017, our (full-day) workshop on Interactive Data Exploration and Analytics, got accepted at KDD 2017!
  • 10.03.2017 Our paper titled Semiring Rank Matrix Factorisation, with Thanh Le Van, Siegfried Nijssen, and Luc De Raedt, got accepted in TKDE. Congratulations Thanh!
  • 07.02.2017 Our paper titled Flexible constrained sampling with guarantees for pattern mining, with Vladimir Dzyuba and Luc De Raedt, has been accepted for publication in DAMI and presentation at ECML-PKDD 2017. Congratulations Vladimir!
  • 16.01.2017 Our paper titled Learning what matters – Sampling interesting patterns, with Vladimir Dzyuba, got accepted at PAKDD 2017. Congratulations Vladimir!
  • 11.01.2017 Thanh Le Van successfully defended his PhD thesis titled Rank Matrix Factorisation and its Applications. Congratulations Dr. Thanh!
  • 23.12.2016 We are looking for 14 PhD students for Leiden University's Data Science research project!
  • 01.11.2016 Our paper titled Towards Data Driven Process Control in Manufacturing Car Body Parts, with Bas van Stein, Hao Wang, Stephan Purr, Sebastian Kreissl, Josef Meinhardt, and Thomas Bäck, got accepted at IEEE CSCI-ISBD 2016.
  • 11.10.2016 Our paper titled Local Subspace-Based Outlier Detection using Global Neighbourhoods, with Bas van Stein and Thomas Bäck, got accepted at IEEE BigData 2016. Congratulations Bas!

I am an assistant professor and group leader of the Explanatory Data Analysis group at the Leiden Institute of Advanced Computer Science (LIACS). LIACS is the computer science institute of Leiden University, where I also participate in the Leiden Data Science research programme. My main interest is exploratory data mining: how can we enable domain experts to explore and analyse their data, to discover structure that leads to novel knowledge?

For this, it is very important that all methods and results are explainable to domain experts, who may not be data scientists. The approach I take is to define and identify patterns that matter, i.e., succinct descriptions that characterise relevant structure in the data. Which patterns matter strongly depends on the data and task at hand; hence, defining the problem is one of the key challenges of exploratory data mining. I often use pattern-based modelling techniques, for which information-theoretic concepts such as the Minimum Description Length (MDL) principle have proven very useful. I am also interested in interactive data mining, i.e., involving humans in the loop.

Finally, I find it very interesting to do fundamental data mining for real-world applications, both in science (e.g., life sciences, social sciences) and industry (e.g., manufacturing and engineering, aviation). There is no better way to show the potential of exploratory data mining than by demonstrating that patterns matter.



Activities

Current, upcoming, and recent
  • Publicity Chair of IDA 2017.
  • Horizon Talk: "Towards Explanatory Data Analysis". IDA 2017, London.
  • Co-chair of IDEA 2017, the workshop on Interactive Data Exploration and Analytics at KDD 2017.
  • Teacher of Statistics '16-'17 (BSc Computer Science).
  • PhD defence of Vladimir Dzyuba, whom I supervised together with Luc De Raedt. KU Leuven, Leuven, 9 June 2017.
  • Invited talk: "Data Mining by Compression". KU Leuven, Leuven, 9 June 2017.
  • Opponent in the PhD defence of the dissertation of Andreas Henelius, titled "Exploring classifier attribute interactions and time series using constrained randomisations". Aalto University, Helsinki, 5 May 2017.
  • Invited talk: "Mine, Interact, Learn, Repeat". Aalto University, Helsinki, 4 May 2017.
  • Teacher of Honours Class Data Science '16-'17 (BSc), together with Arno Knobbe and guest lecturers.
  • Publicity Co-Chair of SDM 2017.
  • Guest lecture: "MDL for Pattern Mining". UCLouvain, Louvain-la-Neuve, 27 April 2017.
  • Guest editor of TKDD special issue on Interactive Data Exploration and Analytics.
  • Teacher of Methodology and Research Approach '16-'17 (Post Experience MSc ICT in Business), together with Prof. Mirjam van Reisen.
  • Training: "Statistics". Ministry of Infrastructure and the Environment, The Hague, the Netherlands, 24 October.


Selected recent publications

In press
Paramonov, S, van Leeuwen, M & De Raedt, L Relational Data Factorization. Machine Learning, Springer.
2017
Ukkonen, A, Dzyuba, V & van Leeuwen, M Explaining Deviating Subsets through Explanation Networks. In: Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECMLPKDD'17), Springer, 2017.
Dzyuba, V & van Leeuwen, M Learning what matters – Sampling interesting patterns. In: Proceedings of the Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD'17), pp 534-546, Springer, 2017.
Dzyuba, V, van Leeuwen, M & De Raedt, L Flexible constrained sampling with guarantees for pattern mining. Data Mining and Knowledge Discovery vol.31(5), pp 1266-1293, Springer, 2017. (ECMLPKDD'17 Special Issue)
Le Van, T, Nijssen, S, van Leeuwen, M & De Raedt, L Semiring Rank Matrix Factorisation. Transactions on Knowledge and Data Engineering vol.29(8), pp 1737-1750, IEEE, 2017.
2016
van Stein, B, van Leeuwen, M, Wang, H, Purr, S, Kreissl, S, Meinhardt, J & Bäck, T Towards Data Driven Process Control in Manufacturing Car Body Parts. In: Proceedings of IEEE International Conference on Computational Science and Computational Intelligence (IEEE CSCI-ISBD'16), IEEE, 2016.
van Rijn, S, Wang, H, van Leeuwen, M & Bäck, T Evolving the Structure of Evolution Strategies. In: Proceedings of IEEE Symposium Series on Computational Intelligence (IEEE SSCI'16), IEEE, 2016.
van Stein, B, van Leeuwen, M & Bäck, T Local Subspace-Based Outlier Detection using Global Neighbourhoods. In: Proceedings of IEEE International Conference on Big Data (IEEE BigData'16), IEEE, 2016.
van Leeuwen, M & Ukkonen, A Expect the Unexpected - On the Significance of Subgroups. In: Proceedings of Discovery Science (DS'16), pp 51-66, Springer, 2016.
Le Van, T, van Leeuwen, M, Fierro, AC, De Maeyer, D, Van den Eynden, J, Verbeke, L, De Raedt, L, Marchal, K & Nijssen, S Simultaneous discovery of cancer subtypes and subtype features by molecular data integration. Bioinformatics vol.32(17), pp 445-454, Oxford University Press, 2016.
Copmans, D, Meinl, T, Dietz, C, van Leeuwen, M, Ortmann, J, Berthold, M & de Witte, PAM A KNIME-based Analysis of the Zebrafish Photomotor Response Clusters the Phenotypes of 14 Classes of Neuroactive Molecules. Journal of Biomolecular Screening vol.21(5), pp 427-436, SAGE Publishing, 2016.
van Leeuwen, M, De Bie, T, Spyropoulou, E & Mesnage, C Subjective Interestingness of Subgraph Patterns. Machine Learning vol.105(1), pp 41-75, Springer, 2016.