Patterns that Matter

News and updates

  • 17.03.2017 IDEA 2017, our (full-day) workshop on Interactive Data Exploration and Analytics, got accepted at KDD 2017!
  • 10.03.2017 Our paper titled Semiring Rank Matrix Factorisation, with Thanh Le Van, Siegfried Nijssen, and Luc De Raedt, got accepted in TKDE. Congratulations Thanh!
  • 07.02.2017 Our paper titled Flexible constrained sampling with guarantees for pattern mining, with Vladimir Dzyuba and Luc De Raedt, has been accepted for publication in DAMI and presentation at ECML-PKDD 2017. Congratulations Vladimir!
  • 16.01.2017 Our paper titled Learning what matters – Sampling interesting patterns, with Vladimir Dzyuba, got accepted at PAKDD 2017. Congratulations Vladimir!
  • 11.01.2017 Thanh Le Van successfully defended his PhD thesis titled Rank Matrix Factorisation and its Applications. Congratulations Dr. Thanh!
  • 23.12.2016 We are looking for 14 PhD students for Leiden University's Data Science research project!
  • 01.11.2016 Our paper titled Towards Data Driven Process Control in Manufacturing Car Body Parts, with Bas van Stein, Hao Wang, Stephan Purr, Sebastian Kreissl, Josef Meinhardt, and Thomas Bäck, got accepted at IEEE CSCI-ISBD 2016.
  • 11.10.2016 Our paper titled Local Subspace-Based Outlier Detection using Global Neighbourhoods, with Bas van Stein and Thomas Bäck, got accepted at IEEE BigData 2016. Congratulations Bas!
  • 27.09.2016 Our paper titled Evolving the Structure of Evolution Strategies, with Sander van Rijn, Hao Wang, and Thomas Bäck, got accepted at IEEE SSCI 2016. Congratulations Sander!
  • 01.09.2016 NEW JOB! I am now an assistant professor of Data Science at the Leiden Institute of Advanced Computer Science (LIACS).
  • 28.06.2016 Our paper titled Expect the Unexpected - On the Significance of Subgroups, with Antti Ukkonen, got accepted at DS 2016. Update: Slides now available!
  • 01.06.2016 Hugo Proença has started as a PhD student in the SAPPAO project, in collaboration with IIT Roorkee and GE Aviation. He will work on pattern mining for flight data. Welcome Hugo!
  • 30.05.2016 Our paper titled Simultaneous discovery of cancer subtypes and subtype features by molecular data integration, with Thanh Le Van et al., got accepted at Bioinformatics. Congratulations Thanh!
  • 01.05.2016 Sander van Rijn has started as a PhD student in the DAMIOSO project, in collaboration with Honda Research. He will work on simulation data mining. Welcome Sander!

I am an assistant professor of Data Mining at the Leiden Institute of Advanced Computer Science (LIACS) at Leiden University, where I participate in the Leiden Data Science research programme. My main interest is exploratory data mining: how can we enable domain experts to explore and analyse their data, to discover structure and ultimately novel knowledge?

The approach I take is to define and identify patterns that matter, i.e., succinct descriptions that characterise relevant structure present in the data. Which patterns matter depends strongly on the data and task at hand, hence defining the problem is one of the key challenges of exploratory data mining. I often use pattern-based modelling techniques, for which information-theoretic concepts such as the Minimum Description Length (MDL) principle have proven very useful. I am also interested in interactive data mining, i.e., involving humans in the loop.

Finally, I find it very interesting to do fundamental data mining for real-world applications, both in science (e.g., life sciences, social sciences) and industry (e.g., manufacturing and engineering, aviation). There is no better way to show the potential of exploratory data mining than by demonstrating that patterns matter.



Activities

Current and upcoming
  • PhD defence of Vladimir Dzyuba, whom I supervised together with Luc De Raedt. KU Leuven, Leuven, 9 June.
  • Teacher of Statistics '16-'17 (BSc Computer Science).
  • Co-chair of IDEA 2017, the workshop on Interactive Data Exploration and Analytics at KDD 2017.
  • Publicity Chair of IDA 2017.
  • Teacher of Information Theoretic Data Mining '17-'18 (MSc Computer Science).
Recent
  • Opponent in the PhD defence of the dissertation of Andreas Henelius, titled "Exploring classifier attribute interactions and time series using constrained randomisations". Aalto University, Helsinki, 5 May 2017.
  • Invited talk: "Mine, Interact, Learn, Repeat". Aalto University, Helsinki, 4 May 2017.
  • Teacher of Honours Class Data Science '16-'17 (BSc), together with Arno Knobbe and guest lecturers.
  • Publicity Co-Chair of SDM 2017.
  • Guest lecture: "MDL for Pattern Mining". UCLouvain, Louvain-la-Neuve, 27 April 2017.
  • Guest editor of TKDD special issue on Interactive Data Exploration and Analytics.
  • Teacher of Methodology and Research Approach '16-'17 (Post Experience MSc ICT in Business), together with Prof. Mirjam van Reisen.
  • Training: "Statistics". Ministry of Infrastructure and the Environment, October 24, The Hague, the Netherlands.
  • Invited talk: "Expect the Unexpected - On the Significance of Subgroups". SSDM workshop @ ECML PKDD 2016. September 19, Riva del Garda, Italy.
  • Workshop and Tutorial Co-Chair of ECML PKDD 2016.
  • Invited talk: "Big Data: Hit or Hype?". ECMA Congress 2016. September 15, Antibes, France.
  • Co-Chair of IDEA 2016, workshop on Interactive Data Exploration and Analytics at KDD 2016.
  • Invited talk: "Big Data: Hype of Hit?". Verenigingscongres De Nederlandse Associatie (DNA). June 3, Noordwijkerhout.


Selected recent publications

In press
 
Le Van, T., Nijssen, S., van Leeuwen, M. & De Raedt, L. Semiring Rank Matrix Factorisation. In: TKDE, vol.?(?), ?.
 
Dzyuba, V., van Leeuwen, M. & De Raedt, L. Flexible constrained sampling with guarantees for pattern mining. In: Data Mining and Knowledge Discovery, special issue ECML PKDD'17, vol.?(?), 2017.
2017
 
Dzyuba, V. & van Leeuwen, M. Learning what matters – Sampling interesting patterns. In: Proceedings of the Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD'17), 2017.
2016
 
van Stein, B., van Leeuwen, M., Wang, H., Purr, S., Kreissl, S., Meinhardt, J. & Bäck, T. Towards Data Driven Process Control in Manufacturing Car Body Parts. In: Proceedings of IEEE International Conference on Computational Science and Computational Intelligence (IEEE CSCI-ISBD'16), 2016.
 
van Stein, B., van Leeuwen, M. & Bäck, T. Local Subspace-Based Outlier Detection using Global Neighbourhoods. In: Proceedings of IEEE International Conference on Big Data (IEEE BigData'16), 2016.
 
van Rijn, S., Wang, H., van Leeuwen, M. & Bäck, T. Evolving the Structure of Evolution Strategies. In: Proceedings of IEEE Symposium Series on Computational Intelligence (IEEE SSCI'16), 2016.
Le Van, T., van Leeuwen, M., Fierro, A.C., De Maeyer, D., Van den Eynden, J., Verbeke, L., De Raedt, L., Marchal, K. & Nijssen, S. Simultaneous discovery of cancer subtypes and subtype features by molecular data integration. In: Bioinformatics, vol.32(17), 2016.
van Leeuwen, M., De Bie, T., Spyropoulou, E. & Mesnage, C. Subjective Interestingness of Subgraph Patterns. In: Machine Learning, vol.105(1), 2016.
Chau, D.H., Vreeken, J., van Leeuwen, M., Shahaf, D. & Faloutsos, C. (eds) Proceedings of the ACM SIGKDD Workshop on Interactive Data Exploration and Analytics (IDEA 2016), 2016.
van Leeuwen, M. & Ukkonen, A. Expect the Unexpected - On the Significance of Subgroups. In: Proceedings of Discovery Science (DS'16), 2016.
van Leeuwen, M. & Galbrun, E. Association Discovery in Two-View Data (extended abstract). In: TKDE Poster Track of ICDE 2016, 2016.
Copmans, D., Meinl, T., Dietz, C., van Leeuwen, M., Ortmann, J., Berthold, M.R. & de Witte, P.A.M. A KNIME-based Analysis of the Zebrafish Photomotor Response Clusters the Phenotypes of 14 Classes of Neuroactive Molecules. In: Journal of Biomolecular Screening, vol.21(5), 2016.
2015
van Leeuwen, M. & Galbrun, E. Association Discovery in Two-View Data. In: Transactions on Knowledge and Data Engineering, vol.27(12), 2015.
Fromont, E., De Bie, T. & van Leeuwen, M. (eds) Advances in Intelligent Data Analysis XIV (proceedings of IDA 2015), LNCS 9385, Springer, 2015.
Aksehirli, E., Nijssen, S., van Leeuwen, M. & Goethals, B. Finding Subspace Clusters using Ranked Neighborhoods. In: Workshop proceedings of ICDM 2015 (HDM workshop), 2015.
Chau, D.H., Vreeken, J., van Leeuwen, M., Shahaf, D. & Faloutsos, C. (eds) Proceedings of the ACM SIGKDD Workshop on Interactive Data Exploration and Analytics (IDEA 2015), 2015.
van Leeuwen, M. & Cardinaels, L. VIPER - Visual Pattern Explorer. Demo paper at: ECML PKDD 2015, 2015.
Paramonov, S., van Leeuwen, M., Denecker, M. & De Raedt, L. An exercise in declarative modeling for relational query mining. In: Proceedings of the 25th International Conference On Inductive Logic Programming (ILP'15), 2015.
Le Van, T., van Leeuwen, M., Nijssen, S. & De Raedt, L. Rank Matrix Factorisation. In: Proceedings of the 19th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD'15), 2015.
van Leeuwen, M. & Ukkonen, A. Same bang, fewer bucks: efficient discovery of the cost-influence skyline. In: Proceedings of the SIAM Conference on Data Mining 2015 (SDM'15), 2015.