Spotlight Interview – Featuring Katrine Meldgård

What is your background, and what motivated you to take on the role of Data Manager for the ReproUnion Biobank & Infertility Cohort (RUBIC) last year?

I hold an MSc in Engineering in Bioinformatics – a field I have been interested in since high school due to the intersection between natural sciences and technical solutions. Throughout my studies and early work experience, I have worked both independently and in collaborative settings, and I strongly prefer the latter. I find it highly motivating to work in environments where different areas of expertise come together to enable projects with greater scope and impact.

My background is in method development and technical solutions, which fits very well with clinical research, where clinicians play a crucial role in data collection through their clinical expertise. Having previously experienced this kind of collaboration as a student assistant, I was keen to continue working in a similar setting after graduating. The position as Data Manager in RUBIC felt like a natural next step – and an opportunity to contribute to research that can make a meaningful difference.

What does data management mean in the context of RUBIC – and why is it so important?

Collaboration is a key pillar of RUBIC – across Denmark and Sweden, between multiple fertility clinics, and between male and female study inclusions. Each site and group contributes to data collection, and it would be unrealistic for everyone to keep track of all aspects while also managing their other responsibilities.

This is where data management becomes essential. One of my main tasks is to maintain an overview across sites by comparing and merging Danish and Swedish datasets, performing quality checks, and linking clinical data with the biobank. This work needs to be done systematically so that everyone using the data can trust its quality and consistency. Centralised data management not only saves time, but also ensures that shared definitions and datasets are used consistently across publications.

You mentioned how RUBIC combines clinical data, biobank samples and cross-border datasets. What are the main challenges in harmonising all this information?

As with most large datasets, the main challenges are related to human error – which is both expected and manageable. Much of the data is collected by people directly involved in RUBIC. That makes it relatively accessible, as they know the data well and can help resolve any issues that arise.

The more challenging part is clinical data collected outside the core RUBIC inclusion, as it originates from the standard healthcare system. In these cases, we sometimes rely on input from collaborators outside the project to correct or clarify the data.

How does strong data management accelerate research and publications in practice?

A key benefit is the time saved on data preparation and quality control. While individual studies may have specific inclusion criteria, a set of basic quality checks is always necessary. Performing these steps centrally is far more efficient and ensures that nothing is overlooked.

By reducing the time researchers need to spend on data integrity and technical preparation, they can focus more fully on their research questions and on interpreting the results. In that sense, strong data management directly supports both faster progress and higher-quality research outputs.

What has surprised you most about working with RUBIC and ReproUnion so far?

I wasn’t entirely sure what to expect when I started, but I have been positively surprised by the strength of the collaborative environment. It has been inspiring to see the wide range of competences involved in ReproUnion and how actively people engage across disciplines and institutions.