Kees van Bochove is the founder of The Hyve, a 40-person international company dedicated to the support and facilitation of open source, open standards, and open data in biomedical research. He studied Computer Science at the University of Utrecht and Bioinformatics at VU University Amsterdam and Tufts University in Boston. Kees is active in many open-source software development communities such as i2b2/tranSMART, cBioPortal, OHDSI, RADAR, and Galaxy. Through his many years of experience in open-source software and standards development in biomedical informatics, Kees has gained a deep understanding of all aspects of collaborative open-source development and open data science, including open-source community building and governance, software quality and sustainability requirements, and data workflows.
Kees first became involved in OHDSI via the IMI EMIF project starting in 2013 and has been building a team around OMOP/OHDSI through the EMIF collaboration, working with, among others, Janssen and ErasmusMC, as well as by providing OMOP mapping and OHDSI installation and support services to several pharmaceutical companies. Kees was also involved in planning the first OHDSI Europe meeting in March 2018, and he hosted a follow-up workshop to introduce OHDSI to a number of European national health-data initiatives (from, among others, the Netherlands, Germany, Switzerland, and Denmark) in May 2018. The Hyve is also leading WP4 in the IMI EHDEN project, which is one of the largest components of this important project, laying the technical groundwork and building data conversion and quality management tools for further developing the European and global OHDSI community.
Can you discuss how you became a part of the OHDSI network, and what inspires you about this community?
My first encounter with OHDSI was via the IMI EMIF project in 2014. However, it really came to life for me when I joined the OHDSI community meeting in Washington in 2016 – and I have been at all of them since, both the US and of course the EU community meetings! What I really like about it is the enormous energy and the true multidisciplinary focus on advancing medical research. If I’m at an OHDSI meeting, of course I’m representing The Hyve and projects we participate in, but I don’t feel like I’m put in a box, unlike other meetings where you are branded as a ‘vendor’ — there’s a genuine interest in helping out each other and what you can bring to the table. The same goes for an OHDSI study-a-thon — you can be in a call for a study team, and you don’t even notice that it’s made up of people from all sorts of backgrounds (epidemiology, medicine, data science, computer science, etc.) and types of organizations (hospitals, academics, industry…). We all focus on obtaining those medical insights and evidence.
What can you tell people about The Hyve, why you felt passionate enough to start this company, and how it is involved in OHDSI?
At The Hyve, our business model is that we make ourselves useful in multiple biomedical open-source communities, partly funded by grant projects such as EMIF and EHDEN, and also via commercial projects for pharma companies and hospitals. As founder, I’m involved in many of them in some way: i2b2/tranSMART, GA4GH, CDISC, FHIR, RADAR-BASE, cBioPortal, OpenTargets, ELIXIR, NIH, etc. However, I have to say that the OHDSI meetings are quite unique in their focus, and OHDSI is the only biomedical open-source community that I’ve been in so far that involves singing on stage! OHDSI has really grown on us and it’s the largest part of what we do at The Hyve, together with our focus on FAIR data.
You clearly have a passion for open science. How did it begin, and why do you think it is so critical (especially with the current COVID-19 situation we face)?
My background is in computer science and I actually started out with a focus on open-source software, in particular in the biomedical domain. I think it’s really well aligned with the idea of science, in which you constantly build on the work of fellow scientists around the world. (Also, in bioinformatics, there is a long tradition of using open-source software.) However, I quickly discovered that open-source software is just a means to an end and that the key was to take the underlying values of open-source software communities to science as a whole. To really build on each other’s work, it’s not enough to peer review a paper now and then, nor is it enough to have GitHub repositories for your code. You have to collaborate on all levels (on data, on methodology, on software, on the language and terms you use), exactly like OHDSI does! I would argue that modern science is broken, especially in the disciplines we are involved in, such as medicine and biology: it has lately been warped into a paper-publishing industry. The COVID-19 crisis in a way is a blessing in disguise for science: not only can science re-establish its value to society, but the crisis has also led to unprecedented openness and collaboration. It’s a strange world to be in as The Hyve. Normally, we need to explain and defend open science and open-source principles, and we often have to wait for months to turn around data use agreements or MSAs with open-source publishing clauses. Now, suddenly everyone is using preprints, sharing data, and running open events. Keeping up with all the hackathons, study-a-thons and open-data initiatives for COVID-19 alone is more than you can handle in a day job!
Can you discuss your role in the development of EHDEN, and how you feel about the first year of the project?
We were involved in the IMI EMIF project, which in many ways was a precursor to EHDEN, from the start of that project in 2013. In fact, it was one of the defining projects for The Hyve, as we had just started as a company back then! I’m still incredibly grateful to Janssen, at that time through Bart Vannieuwenhuyse, for involving us in that project. However, EMIF was a project that had a lot of moving parts, and in which we explored many different routes: harmonizing registry data with tranSMART, building an OMOP-based network with the data custodians, thinking about the code of practice and ways of collaboration, trying out different software architectures, etc. I was personally mostly involved in discussions about the architecture, which I felt was more about aligning different scientific disciplines, such as epidemiology and bioinformatics (which are quite different!), than about software and infrastructure. However, the good news is that during the 5 years of EMIF we learned an incredible amount, and, thanks also to the leadership of Peter Rijnbeek and Nigel Hughes, in EHDEN we could build on all that expertise and devise a plan that would let us hit the ground running. Nowadays, one of the main aspects I’m focusing on is implementing the FAIR principles in EHDEN and also in OHDSI in general. I’ve written a bit more about that in the open science chapter in the Book of OHDSI.
What are some of EHDEN’s key goals for 2020, and how excited are you about the potential of this program?
In 2019, we had a lot of success already with our EHDEN Data Partner call, which yielded 20 new data partners for the network, and the SME call, in which we trained and certified 11 companies for OMOP mapping. Going into 2020, we expected to focus mostly on building out the infrastructure so that we would be able to run our first network studies next year. However, COVID-19 has drastically changed and in a way accelerated that — many EHDEN partners and the consortium leadership contributed to the OHDSI COVID-19 study-a-thon, and we just launched a Rapid Collaboration Call for databases that need help mapping their COVID-19 patient-level data to OMOP. Under pressure, a lot of things become fluid, and we certainly have witnessed that in the last month!
You are one of the community’s most active members on Twitter. In your opinion, how can social media play a positive role in OHDSI and/or open science?
You could argue that the OHDSI community in a way is a well-kept secret, and that it is ahead of its time in some respects. There’s a lot of talk in the pharma industry at large of using Real World Evidence, and also an increasing awareness among clinical researchers of the value of observational research next to randomized controlled trials. But if you put it in perspective, RWD is still small compared to RCTs, and even within that, only relatively few people are aware of how far OHDSI already is in putting these ideas into practice on a global scale. One explanation may be that most people in the OHDSI community are so busy focusing on advancing medical evidence generation that they have no time to do marketing, which is certainly better than the other way around! And marketing, or making people aware of what’s happening in OHDSI, is certainly one aspect of social media like Twitter. However, there’s also another aspect, and that is that the scientific discourse increasingly happens over, or is at least assisted and augmented by, social media, especially Twitter. If you strategically follow some people, for example, Eric Topol right now about COVID-19 policymaking in the U.S., you get a curated feed of news and links to articles and preprints that is much more effective than just searching for papers, and which sometimes also includes quite interesting discussions. The way I use social media like Twitter is similar: I tweet about what is happening in open-source biomedical communities such as OHDSI, and my followers are interested in staying abreast of that. I really appreciate the OHDSI Twitter and LinkedIn accounts, because for a growing group of scientists and policymakers, these are important sources of information. And with the Book of OHDSI, we have a lot of great content; social media and also webinars such as the EHDEN webinar series can enhance and highlight that.
What are some of your hobbies, and what is one interesting thing that most community members might not know about you?
My main hobby is singing classical music, so traditionally around this time I would have performed the Matthäus-Passion as a singer or soloist a few times! I suspect there may be more singers out there in the community, so perhaps there is interest in an ‘OHDSI choir.’ With one consortium, RADAR-CNS, we came really close to having a performing choir to lighten up the yearly meetings, but it proved too hard to keep attendance steady over the years. Anyone reading this with the same idea knows who to contact now :-).