Here's a blog which highlights some of the more memorable events during my daily routine... Events include accepted or rejected papers (ACCEPT/REJECT), master's thesis defenses by students I supervised (MBI), research presentations of papers I (co-)authored (TALK), grant awards and rejections (ACCEPT/REJECT), and important research interest statements, among others.

Dr. Yigit Ozkan: Cybersecurity Maturity Assessment and Standardisation

posted Jul 12, 2022, 1:07 AM by Marco Spruit

Today Bilge Yigit Ozkan defended her dissertation Cybersecurity Maturity Assessment and Standardisation at Utrecht's Academiegebouw. Her opponents had many interesting and tough questions for her, but she stood her ground firmly. In short, she did really well👍

From the summary: This dissertation investigates cybersecurity maturity assessment and cybersecurity standardisation to improve organisations' cybersecurity. We state our research objective as follows: To support the improvement of organisations' cybersecurity by means of maturity assessment and standardisation. To guide our research project, we pose our main research question as “How can we integrate cybersecurity maturity assessment and cybersecurity standardisation to provide tailored support for organisations in their cybersecurity improvement efforts?”.

My personal Top 3 highlights of Bilge's dissertation are (1) the ETSI Technical Report on Cybersecurity Essentials for SMEs (CH6), (2) the working ASMAS prototype (CH7), and (3) the Elsevier impact Journal of Intellectual Capital publication on adaptive maturity modelling.

TALKS: Inaugural lecture on 1 April 2022 16:00 CEST

posted Mar 30, 2022, 12:34 AM by Marco Spruit   [ updated Apr 14, 2022, 1:09 AM ]

Marco Spruit Oratie 1 april 2022 Universiteit Leiden

On 1 April 2022 at 16:00 CEST I delivered my inaugural lecture on the acceptance of the position of professor of Advanced Data Science in Population Health at Leiden University. With this public 45-minute lecture in Dutch, titled Translational Data Science in Population Health, I officially accepted my appointment as full professor at both the Leiden University Medical Centre and the Faculty of Science of Leiden University, holding the chair Advanced Data Science in Population Health.

I introduced translational data science as an independent discipline at Leiden University and in the Dutch scientific landscape. I explained the why, how and what. The overarching storyline ran from conceptual science policy to unruly implementation in daily practice. You could also have attended virtually through the livestream! Afterwards, all attendees could take home a hard copy of my bilingual booklet👍

UPDATE: Translational Data Science in Population Health. My inaugural lecture on the acceptance of the position of professor of Advanced Data Science in Population Health on 1 April at Leiden University is now available here: [NL] [EN].

TALKS: Artificial Intelligence for Medication Reviews

posted Nov 4, 2021, 5:54 AM by Marco Spruit

Today I gave an invited talk on "AI en farmacie in balans?" titled The STRIP Assistant Decade - Artificial Intelligence for Medication Reviews, at the Nederlandse Vereniging van ZiekenhuisApothekers (NVZA) Jaarcongres 2021 in De Fabrique in Maarssen. Finally a crowd (of around 50 people) to talk to again👍 It is an everlasting story about the long and bumpy road of a great AI idea towards actual innovation in daily care. Some luck and lots of perseverance required...


posted Jul 15, 2021, 8:33 AM by Marco Spruit   [ updated Jul 15, 2021, 8:44 AM ]

As a trivial challenge, you could try to locate my name on this excellent milestone publication:
  • Blum, Sallevelt, Spinewine, O'Mahony, Moutzouri, Feller, Baumgartner, Roumet, Jungo, Schwab, Bretagne, Beglinger, Aubert, Wilting, Thevelin, Murphy, Huibers, Drenth-van Maanen, Boland, Crowley, Eichenberger, Meulendijk, Jennings, Adam, Roos, Gleeson, Shen, Marien, Meinders, Baretella, Netzer, Montmollin, Fournier, Mouzon, O'Mahony, Aujesky, Mavridis, Byrne, Jansen, Schwenkglenks, Spruit, Dalleur, Knol, Trelle, Rodondi (2021). Optimizing Therapy to Prevent Avoidable Hospital Admissions in Multimorbid Older Adults (OPERAM): Cluster Randomised Controlled Trial. BMJ, 374(n1585).
Never been happier with a 41st co-author position😁 It marks the culmination of a decade of hard work on the STRIP Assistant (STRIPA), our Clinical Decision Support System to facilitate medication reviews for polypharmacy patients. Other STRIPA studies include OPTICA and STRIMP. The British Medical Journal (BMJ) is an absolute top journal with a whopping impact factor of 39.98!

It even comes with a nice explainer video👍 Have a look:

TALKS: SAILS Lunch Time Seminar

posted Jun 22, 2021, 1:41 AM by Marco Spruit   [ updated Jun 22, 2021, 1:56 AM ]

On Monday 21 June 2021, I gave a talk at Leiden University's SAILS Lunch Seminar on Natural Language Processing for Translational Data Science in Mental Healthcare. First, I positioned the research domain of Translational Data Science in the context of the COVIDA research programme on Dutch NLP for healthcare. Then, I presented our prognostic study on inpatient violence risk assessment, which applies natural language processing techniques to clinical notes in patients’ electronic health records (Menger et al., 2019). Finally, I discussed follow-up work in which we try to better understand the performance of the best-performing RNN model using, among others, LDA as a text representation method, which reminded us once more of the lingering issue of data quality in EHRs.

Dr. Lefebvre: Research Data Management for Open Science

posted Mar 15, 2021, 10:24 AM by Marco Spruit   [ updated Mar 15, 2021, 10:25 AM ]

Today Armel Lefebvre defended his dissertation Research Data Management for Open Science. Unfortunately, this took place in completely online, COVID19-proof fashion. Nevertheless, Armel passionately, competently and confidently defended his PhD research! Incidentally, Armel's dissertation is the first PhD thesis in which I am credited in the role of promotor (instead of being listed as co-promotor).

From the back cover: "This dissertation maps out the challenges in the current practices in science regarding reproducibility and data sharing in research. First, we identify the main stakeholders in the context of open science in Dutch academia. Next, we analyze research practices in the aspects of reproducibility and data management in both the actual laboratory context and scientific publications. We discuss particularly the threats that laboratories would face in the future without the assistance of proper research data management strategies. Finally, we focus on the future of scholarly communication and discuss how research object technology and open science readiness can contribute to open and more reproducible scientific practices."

I bet that in the coming years we will hear much more in the scholarly discourse about the concepts which Armel introduces in this work, especially Laboratory Forensics and Open Science Readiness... Stay tuned!

VACANCY: Assistant Professor Data Science in Population Health (Tenure Track) in Leiden

posted Dec 27, 2020, 12:35 PM by Marco Spruit   [ updated Dec 30, 2020, 1:25 PM ]

What you do

This unique tenure track position offers the best of both worlds: 50% of your work will be performed from the Campus The Hague of the LUMC, and the other 50% from the Leiden Institute of Advanced Computer Science (LIACS) within the Faculty of Science of Leiden University. This means that you will be a strategic linking pin in various collaborations at the junction of data science and natural language processing in the broad area of population health. This position is embedded within the recently launched Population Health Living Lab (PHLL) The Hague, which allows you to contribute to a sustainable and robust realization of the most extensive population dataset within the Netherlands, and consequently to perform novel multidisciplinary data analyses. As assistant professor, you are expected to contribute to at least one of the overarching research themes on our Translational Data Science research agenda. Regarding teaching, you are expected to contribute around 50% of your appointment to LUMC’s Population Health Management (PHM) master’s program and LIACS’s curricula, which includes co-developing, co-teaching, and coordinating the data science courses and the track itself, as well as thesis supervision.


  • You position yourself as an interorganizational linking pin in the Medical Delta ecosystem at the junction of Data Science initiatives in the broad area of Population Health
  • You contribute to the further development of the Population Health Living Lab (PHLL) ecosystem with respect to research related to data engineering and translational data science
  • You contribute to the Population Health Management master’s program by co-developing, co-teaching and coordinating data science courses as well as the track itself

What we ask

You’re an expert in either the research theme of Data Engineering/Information Science or (Big) Data Analytics/Machine Learning, and knowledgeable in the other one. Similarly, you are an expert in utilizing statistical methods and machine learning techniques on real data. You are conscientious and creative, and you have experience at the postdoctoral level with a strong publication record and a proven track record in teaching. Furthermore, you are experienced in raising research funds. You are passionate about investigating and utilizing data science technologies, focusing on state-of-the-art application-oriented research in Explainable AI, AutoML, Big sensors/wearables data, speech recognition, natural language processing, affective computing, etc. You are skilled in Python development, using libraries such as SciKit-Learn, HuggingFace, PySyft, and Streamlit. Lastly, you are communicatively skilled and you work well collaboratively.

More information?

Hello Leiden!

posted Dec 1, 2020, 6:45 AM by Marco Spruit   [ updated Jan 14, 2021, 11:52 AM ]

Today is my first day as Professor of Advanced Data Science in Population Health at the Public Health & Primary Care (PHEG) department of the Leiden University Medical Centre (LUMC) and the Leiden Institute of Advanced Computer Science (LIACS) of the Faculty of Science (W&N)! Apart from being a great milestone in itself, here is my TOP-3 of Unique Selling Points why I am particularly excited:
  1. It is a formal DUAL APPOINTMENT, meaning that I am appointed at both LUMC and LIACS. This makes me the official linking pin for the many upcoming collaborations at the junction of data science and natural language processing in healthcare.
  2. In Leiden, my new colleagues have developed over the last years the LARGEST POPULATION DATASET in the Netherlands, with access to anonymised health records of 500K+ patients, using the Central Bureau of Statistics (CBS) as its Trusted Third Party. Pure gold!
  3. My primary affiliation is within a MULTIDISCIPLINARY setting on the campus The Hague: the Population Health Living Lab (PHLL). This is a so-called QUADRUPLE HELIX fieldlab, where Academia, Industry, Citizens, and Government all collaborate.
I'd like to thank everyone at Utrecht University for the many inspiring informal encounters, personal development programmes and research collaborations that I have had with many of you throughout these... 13 years. I have learned a bunch and it was a lot of fun!

But from now on, it is... Hello Leiden!

PS: I find it truly amazing to discover that my announcement on LinkedIn has been read over 13,000 times already after just one week!

Dr. Tawfik: Text Mining for Precision Medicine

posted Nov 25, 2020, 6:37 AM by Marco Spruit   [ updated Nov 25, 2020, 6:38 AM ]

Yesterday Noha Tawfik defended her dissertation Text Mining for Precision Medicine: Natural Language Processing, Machine Learning and Information Extraction for Knowledge Discovery in the Health Domain. In extreme COVID19 style, there were merely 8 of us (including the audience) in the Senate Hall of the UU Academiegebouw. Nevertheless, Noha admirably, competently and passionately defended her PhD research!

In Noha's first research phase, she mainly employed Information Extraction to automate the identification and analysis of Genome-Wide Association Studies, given a particular disease, to investigate the relation between different phenotypic traits and Single Nucleotide Polymorphisms known to be associated with that disease. In the second research phase, Noha expanded upon the previous work by applying Machine Learning algorithms to the problem of detecting contradictions between two statements extracted from abstracts of published articles, interpreting contradictory findings as a likely Precision Medicine finding. In the third and final phase of her research, Noha brought her contradiction detection research into conformance with Natural Language Inference (NLI) best practices, and participated in the 2019 ACL "Medical Natural Language Inference" challenge, where she battled successfully against entire teams from various top universities.

All in all a truly excellent achievement in 4 years' time with no fewer than 7 peer-reviewed publications!

Dr. Omta: a hybrid PhD defense

posted Oct 16, 2020, 1:07 AM by Marco Spruit   [ updated Oct 16, 2020, 1:09 AM ]

On Wednesday 14 October, Wienand Omta successfully defended his dissertation Knowledge Discovery in High Content Screening in Corona-proof hybrid style in the Academiegebouw, for which I was the co-promotor. His work in Big Data Analytics within the domain of High Content Screening (HCS), a technology that allows life scientists to analyze the effect of bioactive molecules on cellular phenotypes, is perhaps now more important than ever before, as HCS technology is widely used in drug discovery projects, academia and the pharmaceutical industry, for example to search for a potential COVID19 vaccine. Not only does Wienand's dissertation include various impact journal publications, such as the one on Combining Supervised and Unsupervised Machine Learning Methods for Phenotypic Functional Genomics Screening; ever since 2012 he has also worked on the HC StratoMineR platform, for which his spin-off company Core Life Analytics recently secured a 1 M EUR Series A investment. The future is bright!
